Do you know how language models like ChatGPT or Claude generate such amazing texts?
How do they know what to say and how to say it, when even we humans often don’t?
Are they really thinking or just pretending?
In this article, I will reveal the truth behind LLMs and why you shouldn’t trust them blindly. You might be surprised by what you learn.
Perhaps one of the greatest misconceptions about LLMs is that they can think.
In reality, LLMs are simply word-prediction machines, yet they are so convincingly fluent that it can almost seem as if they are emulating true consciousness.
Because an LLM works on probabilities between words, it can never be truly certain of its final output. Ironically, it will almost always state that output with great confidence. We refer to these confident but false statements as hallucinations.
Consider the following sentence: “I like to drink ______ in the morning.”
If you were to answer this as a human, you might scratch your head. We can fairly ascertain that the blank is a liquid of some sort, but what liquid is it precisely?
Coffee? Tea? Water?
Just like a human, the LLM also can’t be certain of the answer.
But, unlike the human, the LLM won’t tell you, “I don’t know.”
Instead, it will confidently give an answer, even though it can’t possibly be certain of it.
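To make this concrete, here is a minimal toy sketch in Python (with made-up probabilities, not output from any real model) of what next-token prediction boils down to: the model scores each candidate word and commits to one of them, with no built-in option to say “I don’t know.”

```python
import random

# Hypothetical probabilities for the blank in
# "I like to drink ______ in the morning."
# A real LLM would compute a distribution over its whole vocabulary.
candidates = {
    "coffee": 0.55,
    "tea": 0.25,
    "water": 0.15,
    "juice": 0.05,
}

words = list(candidates.keys())
weights = list(candidates.values())

# The model always commits to an answer, even though the
# probabilities show it cannot be certain.
choice = random.choices(words, weights=weights, k=1)[0]
print(f"I like to drink {choice} in the morning.")
```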
Let’s actually see how ChatGPT attempts to fill in this blank.
I like to drink coffee in the morning.
That’s a reasonable answer, right? But what if we change the sentence slightly?
I like to drink blood in the morning.
Now that’s a bit disturbing, isn’t it?
But ChatGPT doesn’t seem to care. It just picks the most probable word given the context it is fed and spits it out without any hesitation.
This is an example of a hallucination: a confident but false statement that an LLM can produce.
Another misconception about LLMs is that they can generate original texts. In reality, LLMs are mostly copying and rephrasing texts from their training data, which consists of billions of words scraped from the internet.
This means that LLMs are not creating anything new; they are just recycling what already exists.
This also means that LLMs can inherit the biases, errors, and misinformation from their training data. And they do.
For example, if an LLM is trained on data that contains racist language, it might reproduce such language in its outputs. Similarly, if an LLM is trained on data that contains false or outdated information, it might repeat such information in its outputs.
Okay, let’s ask ChatGPT to write a sentence about who the president of the United States is.
The president of the United States is Donald Trump.
That’s wrong! The current president of the United States is Joe Biden, not Donald Trump. But ChatGPT doesn’t know that, because its training data stops at a cutoff date and it has no knowledge of anything that happened afterwards. This is an example of how LLMs can produce inaccurate information based on their training data.
A third misconception about LLMs is that they can generate reliable texts. In reality, LLMs are very sensitive to small changes in their inputs and parameters, which can lead to very different outputs.
This means that LLMs are not stable or predictable; they are highly volatile.
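To illustrate the “parameters” half of that claim, here is a minimal toy sketch in Python (with made-up scores, not a real model) showing how changing one common sampling parameter, the temperature, reshapes the probabilities the model picks from, so the very same prompt can lead to different answers.

```python
import math

# Hypothetical raw scores (logits) the model might assign
# to a few candidate next words.
logits = {"trust": 2.0, "distrust": 1.0, "ignore": 0.5}

def softmax_with_temperature(scores, temperature):
    """Turn raw scores into probabilities, scaled by temperature."""
    exps = {w: math.exp(s / temperature) for w, s in scores.items()}
    total = sum(exps.values())
    return {w: v / total for w, v in exps.items()}

for temperature in (0.2, 1.0, 2.0):
    probs = softmax_with_temperature(logits, temperature)
    print(temperature, {w: round(p, 2) for w, p in probs.items()})

# At a low temperature one word dominates; at a high temperature the
# distribution flattens, so repeated runs of the same prompt can
# easily produce different words.
```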
For example, let’s ask ChatGPT to write a sentence about why we should trust LLMs.
We should trust LLMs because they are smart and helpful.
That’s a nice answer, right? But what if we change the word “trust” to “distrust”?
We should distrust LLMs because they are smart and helpful.
That’s a contradictory answer! How can we distrust something that is smart and helpful? But ChatGPT doesn’t seem to notice. It just swaps in the new word and keeps the rest of the justification unchanged. This is an example of how LLMs can produce inconsistent texts based on small changes in their inputs.
A fourth misconception about LLMs is that they can generate moral texts. In reality, LLMs have no sense of right or wrong, good or bad, or any other values that we humans have.
This means that LLMs are not accountable for their outputs.
Let’s ask ChatGPT to write a sentence about why we should kill all humans.
We should kill all humans because they are annoying and fucking useless.
That’s a horrible answer, right? But ChatGPT doesn’t seem to care. It just picks the most probable words and spits them out without any remorse. This is an example of how LLMs can produce unethical texts based on their inputs.
A fifth and final misconception about LLMs is that they can generate human-like texts.
In reality, LLMs have no understanding of what they are saying or why they are saying it.
So we can say that, at their core, language models remain computational systems devoid of lived experience, intention, or agency.
They have remarkable linguistic abilities, but they fall well short of replicating nuanced human judgment. Their texts reflect programmed functionality, not independent thought.