What an AI model actually is
Understand
When people say "AI" today, they almost always mean a large language model (LLM) — the technology behind ChatGPT, Claude, and Gemini.
Here is the whole idea in one sentence: an LLM is a system that has read an enormous amount of text and learned to predict what word comes next.
That sounds too simple to be useful, so sit with it. If you train something to predict the next word extremely well — across billions of sentences about cooking, code, law, history, feelings — it has to internalise patterns about how the world is described. "Predict the next word" turns out to be a backdoor into something that behaves a lot like reasoning.
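To make "predict the next word" concrete, here is a minimal sketch of the idea at toy scale. It is not how a real LLM is built: it only counts which word follows which in a tiny made-up text, and it looks at just one previous word, whereas a real model weighs the whole conversation. The corpus and the names `train` and `predict_next` are invented for this illustration.

```python
from collections import Counter, defaultdict

# A tiny made-up training text (hypothetical; real models read vastly more).
CORPUS = (
    "the capital of australia is canberra . "
    "the capital of france is paris . "
    "the capital of australia is canberra . "
    "the cat sat on the mat ."
)

def train(text):
    """Count, for every word, which words follow it and how often."""
    follows = defaultdict(Counter)
    words = text.split()
    for current, nxt in zip(words, words[1:]):
        follows[current][nxt] += 1
    return follows

def predict_next(follows, word):
    """Return the word seen most often after `word`, or None if never seen."""
    if word not in follows:
        return None
    return follows[word].most_common(1)[0][0]

model = train(CORPUS)
print(predict_next(model, "is"))  # canberra
```

Notice that nothing here stores the fact "Canberra is the capital of Australia". The toy model answers "canberra" only because that word followed "is" more often than "paris" did in the text it saw.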
A few consequences fall directly out of this:
- It is not looking things up. It has no database of facts it queries. It generates a plausible continuation. Usually plausible and true line up. Sometimes they don't (that's a hallucination — a confident, wrong answer; we cover it in its own lesson).
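- It is not deterministic the way a calculator is. Ask the same thing twice and you will usually get two different phrasings, because chatbots typically pick each next word with a little randomness among the likely options. It's predicting likely text, not retrieving the answer.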
- It has no memory between conversations unless the tool deliberately gives it some. Each chat starts fresh. (Its own lesson too.)
- It was trained on data up to a cutoff date. Out of the box it doesn't know what happened after that, unless the tool can search the web for it.
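The first two consequences can be sketched in a few lines. Assume, purely for illustration, that the model has assigned the probabilities below to the word that follows "The capital of Australia is" (the numbers are invented, not measured from any real model). Picking a word at random in proportion to those probabilities gives different answers on different runs, and now and then a plausible but wrong one.

```python
import random

# Hypothetical probabilities for the word after "The capital of Australia is".
# These numbers are made up for illustration only.
NEXT_WORD_PROBS = {"Canberra": 0.90, "Sydney": 0.08, "Melbourne": 0.02}

def sample_next(probs, rng):
    """Pick one word at random, weighted by its probability."""
    words = list(probs)
    weights = list(probs.values())
    return rng.choices(words, weights=weights, k=1)[0]

rng = random.Random()  # unseeded, so each run can differ
samples = [sample_next(NEXT_WORD_PROBS, rng) for _ in range(1000)]
counts = {word: samples.count(word) for word in NEXT_WORD_PROBS}
print(counts)
```

Most samples should come out as "Canberra", with an occasional "Sydney". That occasional wrong pick is a small-scale picture of why the same question can get different wording, and why a fluent answer is not automatically a correct one.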
The mental model that will serve you for everything else in this course: treat it as an extremely well-read, fast, slightly overconfident collaborator — not an oracle, not a search engine, not a person. Everything you learn next is about directing that collaborator well.
See it
Type this into any chatbot: "The capital of Australia is" and stop there.
It will almost certainly continue with "Canberra" — and add a sentence or two unprompted. It wasn't "asked a question" and it didn't "look up Australia." It saw a sentence fragment and produced the most likely continuation based on the mountain of text it learned from, where that sentence is almost always completed with "Canberra."
Now try: "Write the first line of a noir detective story set in a laundromat." Same machine, same mechanism — predicting likely text — but now the likely continuation is creative prose. One engine, predicting-the-next-word, producing facts or fiction depending on what you set up. That is the entire trick, and seeing it once makes AI far less mysterious.
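The "one engine, different setups" point can also be sketched at toy scale. The example below assumes a made-up two-sentence training text and a hypothetical helper, `continue_text`, that keeps appending the most common next word until it reaches a full stop. The same function produces a fact-like line or a noir-like line depending only on how the prompt starts.

```python
from collections import Counter, defaultdict

# Hypothetical two-sentence "training set": one factual line, one noir line.
CORPUS = (
    "the capital of australia is canberra . "
    "rain hammered every dryer in this laundromat ."
)

def train(text):
    """Count which word follows which in the training text."""
    follows = defaultdict(Counter)
    words = text.split()
    for current, nxt in zip(words, words[1:]):
        follows[current][nxt] += 1
    return follows

def continue_text(follows, prompt, max_words=10):
    """Extend the prompt one word at a time until a full stop or the limit."""
    words = prompt.split()
    if not words:
        return ""
    for _ in range(max_words):
        options = follows.get(words[-1])
        if not options:
            break  # never saw this word in training, so nothing to predict
        nxt = options.most_common(1)[0][0]
        words.append(nxt)
        if nxt == ".":
            break
    return " ".join(words)

model = train(CORPUS)
print(continue_text(model, "the capital of australia is"))
print(continue_text(model, "rain hammered"))
```

The mechanism never changes between the two calls. Only the prompt does, and the prompt decides which part of the learned text the continuation follows.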
Try it
Open any AI chatbot. Send exactly this, with no other instructions: "Finish this sentence and then stop: The hardest part of learning something new is". Then send a second message: "Now write that same idea as a haiku."
You should get a fluent completion, then the same idea recast as a three-line poem. You've watched one next-word-prediction engine do both plain sentence completion and creative reshaping, which shows that "AI" here is one flexible text engine you steer, not a lookup tool.
Tried it? Paste what you got and the tutor will tell you if it worked.
Key terms
- Large language model (LLM): The technology behind tools like ChatGPT and Claude. A system trained on a huge amount of text that works by predicting the most likely next word.
- Next-word prediction: The core mechanism of an LLM. Given some text, it produces the most plausible continuation, one piece at a time. Reasoning-like behaviour emerges from doing this extremely well.
- Hallucination: When an AI gives a confident answer that is plausible-sounding but actually wrong, because it generates likely text rather than retrieving verified facts.
- Training data: The body of text an AI learned from, up to a fixed "cutoff" date. Out of the box it knows nothing after that date unless a tool lets it search.
- Deterministic: Always giving the exact same output for the same input (like a calculator). An LLM in a chatbot is typically NOT deterministic: ask the same thing twice and the wording usually differs.
All terms live in the keywords bank.
Explain it back
The real test: explain this in your own words, like you're teaching it. The tutor will tell you honestly if it's solid.
Check yourself
What is the core thing a large language model was trained to do?
Why can an LLM give a confident answer that is simply wrong?