Mira studies international business at Belgium’s Artevelde University of Applied Sciences. She was taken aback when her instructor told her that an artificial intelligence detector had flagged 40% of her paper as bot-written, and she turned to Twitter seeking clarification after receiving the feedback.
Even though she knew she hadn’t used AI, the accusation made her doubt her own work, Mira told The Daily Beast, which quoted her on condition of anonymity. She told her professor she didn’t know how to prove she had written the paper herself. He promised to review it again, but she hasn’t heard back.
The situation has left her visibly stressed: if the detector keeps flagging 40% of her paper, she could fail not just the assignment but the entire class. Mira is now even more apprehensive about writing essays in the future, and as a non-native English speaker who already struggles with the language, she finds the burden all the heavier.
Mira’s case is far from the only one involving AI detectors. Ever since ChatGPT was introduced, students have taken to social media to describe being falsely accused of using chatbots to write their assignments. Some said the toll on their grades and mental health after being mistakenly flagged led them to drop out of school.
Amid mounting criticism of this technology, educational institutions, educators, and experts are questioning what should be done about the rise of chatbots and the tools used to combat them. Detectors became commonplace almost overnight, driven by early concerns, sparked by the exponential rise of generative AI, that students would use such tools to produce their work. The shortcomings of these detectors, however, suggest that this strategy for curbing AI-generated work may do education more harm than good.
Vinu Sankar Sadasivan, a computer scientist at the University of Maryland who co-authored a preprint paper on the reliability of AI detectors, said the rapid growth and adoption of AI has taken educators and students by surprise, creating a number of significant and unforeseen challenges.
Even the AI community was taken aback by the swift and somewhat unexpected rise of powerful language models like ChatGPT, according to Sadasivan, and using these models in an uncontrolled manner risks negative consequences such as plagiarism.
He continued that the AI industry’s meteoric rise in popularity and hype pushed educational institutions to employ detectors without fully understanding how they operated, or whether they were even trustworthy in the first place. The result has been cases in which instructors falsely accuse students of plagiarism, as the widely shared tweet demonstrates.
The New School of Education
Janelle Shane, an AI researcher and author of the book You Look Like a Thing and I Love You: How AI Works and Why It’s Making the World a Weirder Place, came to a more nuanced view of detectors through personal experience. She said she had initially enjoyed the tools and found the way they evaluated text interesting, but changed her mind after seeing how the detectors were actually being used and realizing that false positives weren’t rare.
She said it wasn’t hard to find false positives in her own book, text she knew for certain she had written herself.
This becomes especially problematic when detectors are used to adjudicate accusations of academic dishonesty and plagiarism, which can have serious repercussions for a person’s life. After Shane shared her opinions about AI detectors on social media, several students responded with their own experiences.
In one instance a student shared, a teacher pasted an assignment into ChatGPT and asked whether the chatbot had generated it, a question ChatGPT cannot actually answer. Shane pointed out the absurdity: ChatGPT has no memory between sessions, yet it will still give you a definitive response.
The problem is even worse for neurodivergent people and non-native English speakers like Mira. Indeed, a Stanford study published in the journal Patterns in July 2023 found that AI detectors have a clear “bias” against the latter group, and Shane noted that there is evidence of a comparable “bias” against the writing of neurodivergent authors.
Rua M. Williams, an associate professor of UX design at Purdue University, recently revealed that someone responded to one of their emails assuming AI had composed it. Williams replied that, because they are autistic, their writing likely read that way.
Williams believes that people right now, especially educators, are terrified of AI, which makes them doubt the authenticity of whatever they read. That suspicion then falls hardest on those who naturally use language a little differently, such as neurodivergent individuals and those who speak English as a second language.
Alain Goudey, associate dean for digital at NEOMA Business School, also notes that because AI detector algorithms assess a text’s “perplexity,” a measure of how predictable its word choices are, non-native English speakers frequently find their work mistakenly flagged.
According to Goudey, common English words reduce the perplexity score, increasing the likelihood that a text will be identified as AI-generated. In contrast, more intricate or fancy vocabulary raises the perplexity score and indicates that the text was likely written by a human.
Because non-native English speakers tend to use simpler language, he continued, their work is more likely to be identified as AI-generated. This extra burden can be draining and further disadvantage non-native English speakers, who are already putting in extra effort to learn the language.
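To make Goudey’s point concrete, here is a minimal sketch of a perplexity-based detector, assuming Python with Hugging Face’s transformers library and GPT-2 as the scoring model; the threshold is a purely hypothetical cutoff chosen for illustration, not a value any real detector is known to use.

```python
# A minimal sketch of perplexity-based AI detection, assuming the
# Hugging Face transformers library and GPT-2 as the scoring model.
import math

import torch
from transformers import GPT2LMHeadModel, GPT2TokenizerFast

model = GPT2LMHeadModel.from_pretrained("gpt2")
tokenizer = GPT2TokenizerFast.from_pretrained("gpt2")
model.eval()

def perplexity(text: str) -> float:
    """Perplexity of `text` under GPT-2: low means highly predictable wording."""
    enc = tokenizer(text, return_tensors="pt")
    with torch.no_grad():
        # Passing labels makes the model return the mean
        # cross-entropy loss over the sequence.
        out = model(enc.input_ids, labels=enc.input_ids)
    return math.exp(out.loss.item())

THRESHOLD = 60.0  # hypothetical cutoff, for illustration only

def looks_ai_generated(text: str) -> bool:
    # Low perplexity = predictable wording = flagged as "AI".
    # Plain, simple prose, common among non-native speakers,
    # scores low and can therefore be misclassified.
    return perplexity(text) < THRESHOLD
```

The sketch makes the failure mode Goudey describes easy to see: any prose with plain, predictable wording scores low, regardless of who actually wrote it.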
T. Wade Langer Jr., a professor of humanities at the University of Alabama Honors College, has seen this in his own students. Rather than accepting the detectors’ output at face value, he explained, he uses them as a springboard for discussions with his students about their perspectives. Given how common and popular AI generators have become, he doesn’t rule the detectors out entirely; his policy, he clarified, is a conversation rather than an implosion.
He said any accusation of academic misconduct puts some strain on a student’s mental health. That is why, rather than making a snap decision, educators and administrators should lead with inquiry rather than judgment, starting with a conversation to find out the real story behind a student’s academic integrity.
Researchers like Sadasivan fear that if the long-term effects of these detectors are not properly understood, the tools will stifle creativity and reinforce existing biases. Rather than treating those risks as grounds for outlawing or eliminating the technology, however, specialists are advocating for a reexamination of exactly how and where it is used.
Educational institutions appear unable to keep pace with the rapid advance of AI. That has bred a reliance on short-term fixes such as AI detectors, and as the repercussions of that reliance become apparent, critics are quick to highlight its perils. With the technology moving faster than conventional education, educators will have to stay abreast of developments or risk disadvantaging students. That is why experts are advocating for a paradigm shift in how AI detectors and generators are perceived, and for safeguards to ensure their use does not lead to detrimental consequences.
Langer said that, as with any resource educators use, the chief danger lies in treating the tool as a conclusive criterion or litmus test for judging a student. A conversation demands a greater investment of time and effort than handing down a grade or a verdict, but instructional integrity requires due diligence every bit as much as academic integrity does.