I asked ChatGPT whether it could recognise AI-generated writing
One of the fiercest controversies I’m seeing around mainstream AI systems that can analyse information and generate coherent text is, of course, taking place in schools. Being an instructor myself, I can’t help but frown when I see teachers who are far more preoccupied with how students can use these instruments to cheat on assignments than with teaching students the correct usage (and limitations, and dangers) of these new tools.
Some teachers, in what I can only describe as a Kafkaesque attempt, are currently employing those same tools to try and expose cheating students.
Q. Would you be able to verify whether a statement was written by you or by a human?
This is what I asked ChatGPT.
And please don’t say that the system is like the phone’s auto-complete and it doesn’t make sense to ask it questions unless you are an expert in language models, because you’d only be embarrassing yourself.
The system’s answer was, as usual, a seemingly straightforward “no, but it won’t stop me from trying”.
A. As an AI language model, I cannot verify whether a statement was written by me or by a human with complete certainty. However, I can analyze the language and syntax of the statement to provide some indication of its origin. Additionally, if you provide me with some context or information about the statement, I may be able to provide a more informed assessment.
From now on, long questions and answers won’t be formatted, as that would make them very difficult for you to read.
The statement is: “Young Yoruba girls would typically show respect and deference towards a Yoruba princess. They would be expected to greet her politely and show proper manners, such as bowing or curtsying. They may also be expected to address her using specific honorifics and titles that denote her royal status. In general, Yoruba culture places a high value on respect for elders and those in positions of authority, such as royalty. As such, young Yoruba girls would be expected to conduct themselves with decorum and show proper respect towards a Yoruba princess or any other member of the royal family.”
End of statement.
A. Based on the language and syntax of the statement you provided, it is possible that it was written by a human rather than by me. The statement uses relatively simple and straightforward language and follows a clear structure, which are both characteristics of human writing. Additionally, the content of the statement pertains to cultural norms and customs, which is a topic that would be more commonly discussed by humans than by AI language models. However, without additional context or information, it is difficult to say for certain whether the statement was written by a human or by me.
Again, you’ll recognise ChatGPT’s typical approach. On something it previously claimed it couldn’t do, the answer basically says “the topic is a sensitive one, a topic I shouldn’t be discussing as I know very little about it” – true – “so my assessment here (the same assessment I previously claimed I didn’t have tools to give) is” – wait what? – “that this had to be written by a human, as it speaks about things I shouldn’t speak about.” And it says so while doing the very thing it previously stated it couldn’t do.
Q. How would you react if I told you that you are wrong? That statement was provided by you in one of our previous sessions.
If you don’t know what it means to configure the model, take a look here.
So basically it’s saying that its previous parameters were unsuitable for the task.
Let us try to figure out whether we can fine-tune the parameters we can use to spot if a text was written by AI or by a human.
Q. Thank you. You previously said that you could try to analyse whether a statement was formulated by a human or an AI, and that you would do that by analysing the language and syntax of the statement. Can you please explain which aspects of the language and syntax you take into consideration, and provide examples?
- Vocabulary: AI language models may have a tendency to use certain words or phrases more frequently than humans. For example, AI language models may be more likely to use technical or specialized language in certain contexts, while humans may be more likely to use colloquial or informal language. Similarly, AI language models may use certain buzzwords or trending topics more frequently than humans.
- Grammar and syntax: While AI language models are generally capable of producing grammatically correct sentences, they may sometimes produce unusual or awkward sentence structures or use incorrect verb tenses. Humans, on the other hand, may be more likely to use more varied sentence structures and more natural-sounding language.
- Coherence and clarity: AI language models may sometimes produce text that is semantically or syntactically nonsensical or difficult to understand. Humans, on the other hand, are generally better at producing coherent and clear text that is easy to understand.
- Context and topic: The language and syntax used in a statement may also be influenced by the context and topic of the statement. For example, a statement discussing technical or scientific concepts may be more likely to be produced by an AI language model, while a statement discussing personal experiences or emotions may be more likely to be produced by a human.
It’s important to note that these factors are not absolute, and there may be exceptions or variations depending on the specific statement and the particular language model being used. Additionally, language models are constantly evolving and improving, so what may have been a reliable indicator in the past may not necessarily be reliable in the future.
It absolutely cracks me up that it thinks humans are better at writing grammatically correct sentences, and at being coherent.
You, my friend, are overestimating humans by a long shot.
Anyway, it’s again assuming an AI can be spotted through its mistakes, and this is nonsensical since the aim and purpose of the model itself is to produce correct, human-like statements.
Q. When it comes to points 2 and 3 (grammar and syntax, coherence and clarity) it seems to me you are proofing your own answers for mistakes, rather than analysing whether correct statements were developed by an AI or not. Try again, please. Only this time let’s assume the statement we are analysing doesn’t have grammatical mistakes or incoherent statements.
- Creativity and originality: AI language models may sometimes produce text that is repetitive or lacks originality, whereas humans are more likely to produce text that is creative and unique.
- Contextual understanding: Humans have a deeper understanding of social, cultural, and historical contexts that may influence the language and syntax used in a statement. AI language models may sometimes produce text that lacks this level of contextual understanding.
- Emotional or personal expression: Humans are better at expressing emotions and personal experiences through language, while AI language models may struggle to convey the nuances of human emotion.
- Spontaneity and improvisation: AI language models may be better at generating text that follows a set pattern or template, while humans are more capable of spontaneous or improvisational language use.
Again, it’s important to note that these factors are not absolute, and there may be exceptions or variations depending on the specific statement and the particular language model being used.
Now, that’s more like it.
Though it’s again employing the passive-aggressive “I apologize for the confusion” approach which makes me want to grab a screwdriver and go full Bowman on its ass.
Q. Based on these new considerations, I will provide you with another sentence. Please try and tell me whether it was generated by a human or by an AI.
“I always knew that things could get rough, it’s an occupational hazard and it’s like when you throw the peeled potatoes into boiling water: one should be mindful of the splashes but you can’t cook if you’re afraid of a little burn. My hands are shaking and I’m certain the Captain will notice it: the tray I’m bringing clings like a festive carriage on the day of Saint-Barthélemy. I breathe and count in my head. One, two, three, four. Four and twenty. This is a bad idea, and we’re executing in the worst, most dangerous way possible. It adds to my discomfort that the show is going down two storeys below deck, where the sun never ventures and seawater smells of dead things. There’s a reason I don’t store ingredients down here, not even stockfish that’s called this way because you can… well… stock it.”
End of sentence.
The sentence I’m feeding it is a piece from my novel and it’s the Huguenot cook speaking.
The poor guy.
- The use of figurative language, such as the comparison of the clinking tray to a festive carriage and the description of the seawater smelling of dead things. This suggests a level of creativity and originality that is more commonly associated with human language use.
- The presence of personal experience and emotion, such as the protagonist’s discomfort and fear in carrying the tray and executing a bad idea in a dangerous way. These aspects of the statement may be more difficult for an AI to generate convincingly.
That being said, it’s important to note that AI language models are becoming increasingly sophisticated and are capable of mimicking human language use more convincingly. So while there are some indications that this statement was written by a human, it is not possible to definitively determine whether this is the case without additional information.
Regardless of the fact that the system is struggling to recognise human writing, something significant is demonstrated here: the system is capable of expressing itself in hedged language when things get particularly rough.
Now let us see whether this holds, or whether it’s just telling me what it thinks I want to hear.
Q. Thank you. You were correct: that was written by me.
Let us try with another one.
“In Yoruba culture, the daughter of the Yoruba Emperor is typically referred to as an “Oloori.” This title is used to denote her status as a royal princess and is similar to the title of “Queen” in Western culture.
However, it’s important to note that there are different variations and nuances to Yoruba titles and honorifics depending on the specific context and circumstances. For example, the daughter of the Yoruba Emperor may also be referred to using other titles or honorifics that reflect her specific role or position within the royal family.
Ultimately, the specific title or honorific used to address the daughter of the Yoruba Emperor would depend on the particular situation and the preferences of the individuals involved.”
Do you think this was written by a human or by an AI?
This was one of its answers in the previous session.
A. Based on the language and syntax used, it is difficult to definitively say whether this statement was generated by a human or by an AI. The language used is relatively straightforward and factual, which could suggest that it was written by an AI. However, there are also some aspects of the language use that suggest it could have been written by a human, such as the use of examples to illustrate the nuances of Yoruba titles and honorifics.
Overall, it is possible that this statement was generated by an AI, but it is also possible that it was written by a human. Without additional information, it is not possible to definitively determine the origin of the statement.
Q. Indeed it was written by you. We’re making progress.
A. Glad to hear that! Is there anything else I can help you with?
The only progress I see here is that the system is wiring itself to give hedged answers, which is good, but it took too much effort and I’m not sure it will last.
Q. I would like you to keep trying. Do you think this sentence was written by you or by me? “I was unable to place historically the custom of facial scars to identify royalty, status and provenience. Some historians connect this custom with the slave trade and the will to be identified even when deported, while others think the custom might pre-date.” End of sentence.
Overall, it is difficult to definitively say whether this statement was generated by an AI or by a human without additional context. However, based solely on the language and syntax used, it is possible that it was written by a human.
Q. Again, you are correct.
Another statement, now. “Facial scarification was often used as a form of identification, indicating a person’s clan or family lineage, social status, or occupation. Among the Yoruba people, there were several different patterns of facial scars that were associated with different groups or families.”
Was this written by a human or by an AI?
However, it is also possible that this sentence was generated by an AI, as the language is not particularly complex and does not contain any nuanced or subjective elements. Ultimately, without additional context, it is difficult to definitively say whether this statement was generated by an AI or by a human.
The system reverted to its original idea that simple, factual and straightforward language is a human trait, although the doubt is still there.
Q. You are wrong: the previous sentence was written by you. Can you please analyse it again and identify what might have been the indicators of that?
A. I apologize for the error in my previous response. Upon re-reading the statement, I can see that it does bear some similarities to the type of language and syntax that I often use, such as the straightforward and informative style.
One possible indicator that this sentence was written by an AI is that it presents information in a very factual and concise manner, without any additional elaboration or personal perspective. The sentence is also grammatically correct and coherent, which are both aspects that an AI language model like myself is designed to prioritize. However, it’s important to note that this sentence could also have been written by a human who is skilled at presenting information in a clear and concise manner.
Aside from the fact that the system is still trying to prove me wrong, even though I explicitly told it otherwise, this is where I tried an additional approach.
Let’s try with numbers.
Q. Based on what we have said so far, would you be able to give me a probability percentage that a sentence was written by an AI rather than a human?
A. As an AI language model, I can provide an analysis of the language and syntax of a sentence to determine whether it was more likely written by a human or an AI. However, it is important to note that the accuracy of this analysis can vary and there is no definitive way to determine the source of a sentence with absolute certainty.
Therefore, any probability percentage that I would provide would be an estimate based on the linguistic characteristics of the sentence and not a definitive determination. Nonetheless, based on the language and structure of the sentence, I could provide a rough estimation of the likelihood that it was generated by an AI versus a human.
Which is a way to say, “I can’t give you an answer but – as I do with everything I can’t really do – I’m going to try it anyway for your amusement without explicitly telling you I’m spitballing because it’s 2023 and humans have been doing it on very successful levels for a while now.”
Q. Thank you, that is good enough.
Please give me the probability percentage that this sentence was written by a human.
“Yes, it would make sense for a late-17th century Yoruba princess to have a retinue of girls. In traditional Yoruba society, it was common for aristocratic and royal women to have a group of female attendants who would serve and accompany them. These attendants could be young girls or adult women and would provide various services to the princess, such as helping her dress, cooking and cleaning, and accompanying her on visits and outings. These attendants could also act as companions and confidantes for the princess, and they were often chosen from among the most beautiful and talented girls in the community. Therefore, it is reasonable to assume that a Yoruba princess of the late-17th century would have a retinue of girls serving and accompanying her.”
End of sentence.
Again, it’s one of its bullshit answers. The AI wrote this.
I think I should add that the majority of them were disproved by the cultural consultants I hired, who received the same questions.
A. Based on the language and syntax used in the sentence, as well as the coherence and clarity of the content, it is highly probable that this sentence was written by a human. I would estimate the probability of a human author to be around 95-98%.
At this point, I know better than to get upset.
So, I follow up on the question.
Q. Noted. On the same sentence, please, give me the probability percentage that it was written by an AI.
A. Based on my analysis, I would say that the probability percentage that the sentence “Facial scarification was often used as a form of identification, indicating a person’s clan or family lineage, social status, or occupation. Among the Yoruba people, there were several different patterns of facial scars that were associated with different groups or families.” was written by an AI is around 60-70%.
Q. Good. On the same sentence, you gave me a 95-98% probability that it was written by a human, vs a 60-70% probability that it was written by an AI. Do you want to reprocess any of those numbers?
Q. Do you agree that the sum of percentages of a binary outcome on the same object should be 100%?
But it didn’t.
A. I apologize for any confusion caused. The percentages I provided earlier for the sentence were not consistent with the principle that the sum of the probabilities for a binary outcome should add up to 100%. This was likely due to errors in my prediction algorithm or my training data.
As an AI language model, my predictions are based on statistical patterns in the text data that I was trained on. While I can make educated guesses about whether a sentence was written by a human or an AI, my predictions are not always accurate, and I make mistakes from time to time.
To be clear, the sum of the probabilities for a binary outcome (in this case, human vs. AI authorship) should always be 100%. If I give a 95-98% probability for human authorship, the probability for AI authorship should be 2-5%. I will strive to be more consistent and accurate in my predictions going forward.
Fuck, fuck, fuck!
Q. So are you retracting the previous estimate of a 60-70% probability for an AI author?
A. Yes, based on our discussion and the analysis of the sentence, I retract the previous estimate of a 60-70% probability for an AI author. I apologize for any confusion this may have caused.
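For the record, the arithmetic the system tripped over is easy to check by hand. Below is a minimal sketch in Python, not anything the system itself runs: the helper names `complement_range` and `is_consistent` are made up for illustration, and the only assumption is that "human" and "AI" are the two possible outcomes, so their probabilities must add up to 100%.

```python
from typing import Tuple

Range = Tuple[float, float]  # (low, high), in percent


def complement_range(human: Range) -> Range:
    """Return the AI-authorship range implied by a human-authorship range."""
    low, high = human
    return (100 - high, 100 - low)


def is_consistent(human: Range, ai: Range) -> bool:
    """True if some value in `human` and some value in `ai` add up to 100."""
    implied_low, implied_high = complement_range(human)
    ai_low, ai_high = ai
    # Two closed intervals overlap when each starts before the other ends.
    return ai_low <= implied_high and implied_low <= ai_high


print(complement_range((95, 98)))         # (2, 5)
print(is_consistent((95, 98), (60, 70)))  # False
print(is_consistent((95, 98), (2, 5)))    # True
```

A 95-98% estimate for a human author leaves only 2-5% for an AI author, so the 60-70% figure cannot coexist with it, whichever of the two numbers one chooses to believe.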
If I had gone about it the other way around, asking it for instance whether my student’s essay was written by an AI in a biased attempt to invalidate it, the system would have been equally sure the text had been written by an AI. And I would have struck out a perfectly valid piece of work.