From bloggers to school kids, there’s a surging interest in generative AI models like ChatGPT. People have been using these AI tools to do everything from generating out-of-office replies to writing real estate news articles. But critics are now calling on the creators of these tools to develop detectors to help identify the abuse of AI-generated text.
OpenAI, the creator of ChatGPT, has responded to this request. The company announced a new tool called the “OpenAI AI Text Classifier”. Designed to scan and label content, the AI Classifier drew a collective sigh of relief from both educators and businesses battling the surge in AI-generated content in their organizations.
But there’s a big problem with the OpenAI AI Text Classifier. It stinks. While the OpenAI AI Text Classifier works in certain situations, its limitations are too much for the tool to be widely usable.
Here’s why the OpenAI AI Text Classifier sucks.
What Is The OpenAI AI Text Classifier?
The “OpenAI AI Text Classifier” is an artificial intelligence tool created by OpenAI, the same company behind ChatGPT.
A text classifier is a tool trained to scan blocks of text and then label that text according to a predefined category. For example, a classifier may label an email as spam or not spam, or categorize a news article into topics like sports, politics, or entertainment.
The OpenAI AI Text Classifier has been trained for one specific task: detecting artificial intelligence (AI) generated text. This means that you can use the OpenAI AI Text Classifier tool to estimate the probability that something was generated using an AI tool, like ChatGPT.
Here’s Why The Official ChatGPT Detector Sucks
The OpenAI AI Text Classifier has several major flaws and is not a dependable tool for identifying AI-generated content. Here are its biggest issues.
Results From The OpenAI AI Text Classifier Are Often Inaccurate And Unreliable
The biggest limitation of the AI Classifier is that it isn’t very reliable. According to OpenAI, the tool correctly labels only 26% of AI-written text as “likely AI-written,” and it incorrectly labels human-written text as AI-written 9% of the time. However, the company still thinks it could be useful when used in combination with other methods to stop AI text generators from being misused.
That means that a student who didn’t use ChatGPT to write their homework may have their work incorrectly labeled as AI-generated, and vice versa.
Only 26%?! Come on, OpenAI! That’s a failing grade. It’s time for them to step up their game and make sure their AI Text Classifier is accurate and trustworthy. Put the same development effort into the AI Classifier that went into ChatGPT.
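To see what those numbers could mean in a real classroom, here is a rough back-of-the-envelope sketch. It uses the two rates OpenAI reported for the “likely AI-written” label: 26% of AI-written text gets flagged, and 9% of human-written text gets flagged by mistake. The class size and the number of ChatGPT essays are made-up assumptions, purely for illustration.

```python
# Rates OpenAI reported for the classifier's "likely AI-written" label.
TRUE_POSITIVE_RATE = 0.26   # share of AI-written text that gets flagged
FALSE_POSITIVE_RATE = 0.09  # share of human-written text that gets flagged


def expected_flags(total_essays, ai_written):
    """Return expected (caught, missed, falsely_accused, precision)."""
    human_written = total_essays - ai_written
    caught = TRUE_POSITIVE_RATE * ai_written
    missed = ai_written - caught
    falsely_accused = FALSE_POSITIVE_RATE * human_written
    flagged = caught + falsely_accused
    # Precision: of all flagged essays, what share was really AI-written?
    precision = caught / flagged if flagged else 0.0
    return caught, missed, falsely_accused, precision


# Hypothetical class: 100 essays, 20 of them written with ChatGPT.
caught, missed, falsely_accused, precision = expected_flags(100, 20)
print(f"AI essays caught:        {caught:.1f}")
print(f"AI essays missed:        {missed:.1f}")
print(f"Honest students flagged: {falsely_accused:.1f}")
print(f"Share of flags correct:  {precision:.0%}")
```

In this made-up class, more honest students end up flagged than cheaters get caught. Change the assumed numbers and the picture shifts, but the low catch rate stays the same.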
OpenAI AI Text Classifier Needs A Minimum Of 1,000 Characters To Work
Another limitation is that the OpenAI Classifier won’t work on short blocks of text. You’ll need a minimum of 1,000 characters, or around 150 to 250 words, for it to be effective. That’s roughly the size of two medium-sized paragraphs.
This character requirement is incredibly frustrating, especially when you consider that ChatGPT doesn’t require this amount of text input from users to begin generating content.
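If you want to know whether a piece of writing is even long enough before pasting it into the tool, a few lines of Python will tell you. This is only a sketch: the 1,000-character minimum comes from the limit described above, while the function names and the choice to ignore leading and trailing whitespace are my own assumptions, not anything OpenAI documents.

```python
MIN_CHARACTERS = 1000  # minimum input length described in this article


def long_enough(text, minimum=MIN_CHARACTERS):
    """Return True if the text meets the minimum length.

    Assumption: leading and trailing whitespace does not count.
    """
    return len(text.strip()) >= minimum


def summarize(text):
    """Return character count, word count, and characters still missing."""
    stripped = text.strip()
    characters = len(stripped)
    return {
        "characters": characters,
        "words": len(stripped.split()),
        "missing": max(0, MIN_CHARACTERS - characters),
    }


sample = "Minecraft is fun game. You play on computer or Xbox."
print(long_enough(sample))
print(summarize(sample))
```

A single short answer or a two-sentence email will fail this check, which is exactly the kind of text the classifier can’t handle.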
OpenAI AI Text Classifier Can’t Detect Plagiarism Or Cheaters
Here’s some especially bad news for teachers. The single biggest hope among educators was that they could use tools like the OpenAI AI Text Classifier to detect ChatGPT plagiarism and cheating. But the truth is, the AI Classifier can’t. It fails repeatedly.
Despite OpenAI’s claims that the tool can spot similarities between two texts, it won’t detect ChatGPT plagiarism. This means that if someone is trying to cheat with ChatGPT, they can do so without fear of getting caught by someone using the OpenAI AI Text Classifier. This is a real shame, considering that AI-generated text has been known to repeat the same content it was trained on.
The OpenAI AI Text Classifier may be well-suited for some applications, but if you’re looking for a reliable ChatGPT cheating and plagiarism detector, then you should look elsewhere.
Unable To Properly Process Text Written By Children Or In Languages Other Than English
OpenAI also states that the OpenAI Classifier is more likely to make mistakes when dealing with text written by children or in a language other than English, due to its English-centric dataset.
Grade school students aren’t perfect. Their work is full of various errors. And if you directly ask it, ChatGPT won’t intentionally generate text with spelling or grammatical errors. But there are ways to trick the AI.
As a test, I asked ChatGPT to “write a paper about Minecraft in the style of a kindergarten student that is still learning english.”
This was what ChatGPT generated:
Minecraft is fun game. You play on computer or Xbox. You build things and explore big world. There two way play, Creative mode and Survival mode. In Creative mode, you have all block you need. You make house, castle, and city! You fly and build big things. In Survival mode, you get block and build things. But, watch out for monsters like zombie and creeper. You be careful and smart. In Minecraft, you also go on adventure. You explore cave, swim in lake, and climb mountain. You find special things to help build better. Minecraft has many different blocks. There’s grass, dirt, and sand for the ground. There’s also wood, stone, and bricks for building. And there’s special blocks like glass, diamonds, and gold. Minecraft fun to play with friend too. All of these blocks are special and important in Minecraft. They help you build and explore. You play on same world and build or go on adventure together. Have fun building with all the different blocks in Minecraft! Building and exploring in Minecraft is so much fun. I hope you play soon!
When I added this ChatGPT-generated text about Minecraft to the OpenAI AI Text Classifier, it couldn’t tell whether it was AI-generated or not.
The classifier considers the text to be unclear if it is AI-generated.
Another trick is to have ChatGPT translate a block of text to another language and then back to English. For example, I took the above paragraph about Minecraft and translated it to Italian, and then back to English.
The end result was a “possibly AI-generated” rating and not a “likely AI-generated” rating.
The classifier considers the text to be possibly AI-generated. In other words, it doesn’t know for sure. Even though the text was 100% generated by ChatGPT, the OpenAI AI Text Classifier can’t tell.
Again, this makes it almost impossible for teachers to catch ChatGPT cheating. Savvy school kids already know these tricks. They spread across playgrounds faster than the lyrics to “Jingle Bells Batman Smells”.
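If the difference between an “unclear,” a “possibly,” and a “likely” rating seems fuzzy, it helps to picture the tool turning a single probability score into one of a handful of labels. The sketch below shows that idea. The cutoff values and the function name are assumptions I picked for illustration, and OpenAI’s real thresholds may differ.

```python
from bisect import bisect_right

# Hypothetical cutoffs between labels (illustration only, not confirmed values).
CUTOFFS = [0.10, 0.45, 0.90, 0.98]
LABELS = ["very unlikely", "unlikely", "unclear if it is", "possibly", "likely"]


def label_for(probability):
    """Map an AI-written probability between 0 and 1 to a verdict label."""
    if not 0.0 <= probability <= 1.0:
        raise ValueError("probability must be between 0 and 1")
    # bisect_right counts how many cutoffs the score has reached or passed.
    return LABELS[bisect_right(CUTOFFS, probability)]


for score in (0.05, 0.60, 0.95, 0.99):
    print(f"{score:.2f}: The classifier considers the text to be "
          f"{label_for(score)} AI-generated.")
```

Under these assumed cutoffs, a small drop in the score is enough to move a text from “likely” down to “possibly” or “unclear,” which is why tricks like rewording or round-trip translation can change the verdict.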
The OpenAI AI Text Classifier Is Not Optimized For Everyday Smartphone Use
Another shortcoming of the OpenAI AI Text Classifier is that it’s not optimized for everyday mobile experiences. Although the mobile web version of the tool works just as well as the desktop version, the tool doesn’t act like a smartphone app.
In an ideal situation, someone should be able to point their smartphone camera at a piece of paper and use OCR to scan the text and detect if it was AI-generated or not.
As AI-generated content becomes more prevalent in our society, being able to detect this content in our everyday lives will become more important.
ChatGPT Detection Tools And The Surging AI-Generated Content Problem
AI-generated content is already a major problem for both businesses and school systems.
In an effort to curb cheating and plagiarism, several of the U.S.’s largest school districts have already banned ChatGPT.
And popular websites like Stack Overflow have taken a firm stance against AI-generated content and banned users from sharing anything generated by ChatGPT.
It’s almost impossible to prevent AI content from being generated. So the burden falls to AI and ChatGPT detection tools. OpenAI’s AI Classifier tool is a good first step, but it’s far from being a legitimate detection device.
An OpenAI spokesperson recently told TechCrunch, “We’re making this initial classifier available to get feedback on whether tools like this are useful, and hope to share improved methods in the future.”
Hopefully OpenAI will evolve this product further in the future. If anyone from OpenAI is reading this, here’s some feedback: The OpenAI AI Text Classifier sucks.
Frank Wilson is a retired teacher with over 30 years of combined experience in education, small business technology, and real estate. He now blogs as a hobby and spends most days tinkering with old computers. Wilson is passionate about tech, enjoys fishing, and loves drinking beer.