Text Classifier to Detect AI-Generated Content
A Step Towards Tackling Misinformation and Academic Dishonesty
Key Points:
- A new AI classifier has been developed to distinguish between human-written and AI-generated text.
- The classifier correctly flags only 26% of AI-written text (its true positive rate) and wrongly flags 9% of human-written text (its false positive rate); reliability improves for longer texts.
- The classifier is a work in progress and is unreliable on short texts, text in languages other than English, and code.
- AI-generated text can be edited to evade the classifier, raising questions about long-term detection advantage.
- The creators seek feedback from educators and those directly impacted to improve the classifier and understand its implications.
A new AI classifier has been launched, trained to distinguish between human-written and AI-generated text. It is aimed at problems such as automated misinformation campaigns, academic dishonesty, and AI chatbots posing as humans. The classifier is not fully reliable: it correctly flags 26% of AI-written text (true positives) while wrongly labeling 9% of human-written text as AI-written (false positives). Its creators nonetheless see it as a useful input to mitigation strategies, provided it is not used as the sole basis for a decision. Reliability generally improves as the length of the input text increases.
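To make these two rates concrete, the following sketch applies Bayes' rule to estimate how often a flagged text is actually AI-written. The 26% and 9% figures come from the text above; the share of AI-written text among all submissions (`ai_share`) is a purely hypothetical assumption, as is the function name.

```python
def precision(tpr: float, fpr: float, ai_share: float) -> float:
    """Fraction of flagged texts that are truly AI-written (Bayes' rule)."""
    flagged_ai = tpr * ai_share              # AI-written texts that get flagged
    flagged_human = fpr * (1.0 - ai_share)   # human-written texts wrongly flagged
    return flagged_ai / (flagged_ai + flagged_human)


TPR = 0.26  # true positive rate reported for the classifier
FPR = 0.09  # false positive rate reported for the classifier

# ai_share values below are illustrative assumptions, not reported figures.
for ai_share in (0.10, 0.50):
    print(f"AI share {ai_share:.0%}: precision {precision(TPR, FPR, ai_share):.1%}")
# AI share 10%: precision 24.3%
# AI share 50%: precision 74.3%
```

Under the assumption that only one text in ten is AI-written, roughly three out of four flags would land on human-written text, which illustrates why a flag alone is weak evidence.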
The classifier is publicly available so that users can judge whether an imperfect tool of this kind is useful and give feedback. The creators acknowledge its limitations: it is unreliable on short texts (below 1,000 characters), on text in languages other than English, and on code. It can also label human-written text as AI-generated with high confidence, so its verdicts should be treated with caution.
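A caller could guard against the short-text limitation with a simple length check before trusting any verdict. This is a minimal sketch: the 1,000-character threshold is the one quoted above, while the function name and the choice to ignore surrounding whitespace are assumptions, not part of the classifier itself.

```python
MIN_CHARS = 1_000  # threshold quoted in the text; inputs below it are unreliable


def long_enough(text: str, min_chars: int = MIN_CHARS) -> bool:
    """Return True if the text meets the minimum length for a meaningful verdict."""
    # Leading and trailing whitespace is ignored (an assumption of this sketch).
    return len(text.strip()) >= min_chars


sample = "A short note from a student."
if not long_enough(sample):
    print("Text is too short for a reliable classification.")
```

A check like this addresses only length; it does not detect non-English text or code, which the creators also list as unreliable inputs.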
Another challenge is that AI-generated text can be edited to evade the classifier. Classifiers can be updated and retrained in response to successful attacks, but it is unclear whether detection holds an advantage in the long term.
The classifier itself is a language model fine-tuned on a dataset of paired human-written and AI-written texts on the same topic, drawn from a variety of sources.
Educators have raised concerns about identifying AI-generated text, so understanding the limitations and impact of such classifiers in the classroom is crucial. The creators have developed a preliminary resource for educators on the use of ChatGPT, outlining its uses, limitations, and associated considerations. Although the resource is aimed at educators, the classifier and its associated tools are expected to affect journalists, mis/disinformation researchers, and other groups as well.
The developers of the classifier are engaging with educators in the United States to learn what they are seeing in their classrooms and to discuss ChatGPT’s capabilities and limitations. They plan to broaden this outreach as they learn more, in line with their stated goal of deploying large language models safely and in direct contact with affected communities. They also invite feedback from those directly affected, including teachers, administrators, parents, students, and education service providers, to improve the classifier and its associated resources.