It's not magic. It's math. An LLM is a Next-Token Prediction Engine. It reads the text you wrote and calculates the statistical probability of what comes next.
LLMs don't read words; they read "Tokens". A token averages about 0.75 of an English word: it can be a whole word ("the", "apple") or a fragment ("ing"). GPT-4's vocabulary contains about 100,000 unique tokens.
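To make "tokens are chunks, not words" concrete, here is a toy greedy tokenizer over a tiny hand-made vocabulary. Real tokenizers (like GPT-4's) learn their ~100,000 chunks from data; this hand-picked vocabulary is purely illustrative.

```python
# Toy vocabulary: a few whole words and sub-word fragments (hand-picked,
# not learned -- real tokenizers learn ~100k of these from training data).
vocab = ["walk", "talk", "ing", "ed", "the", "apple", " "]

def tokenize(text):
    """Greedily take the longest vocab entry that matches at each position."""
    tokens, i = [], 0
    while i < len(text):
        match = max((v for v in vocab if text.startswith(v, i)),
                    key=len, default=text[i])  # unknown char: emit it alone
        tokens.append(match)
        i += len(match)
    return tokens

print(tokenize("the apple walking"))
# ['the', ' ', 'apple', ' ', 'walk', 'ing'] -- "walking" splits into two tokens
```

Notice that "walking" is not in the vocabulary, so it becomes "walk" + "ing": the model sees fragments, never the word itself.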
It doesn't "know" facts. It knows patterns. It knows that after "The capital of France is", the token "Paris" appears 99% of the time in its training data.
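The "patterns, not facts" idea can be sketched in a few lines: count which token follows a given prefix in a (made-up, three-line) training corpus, and turn the counts into probabilities. The corpus and the "lyon" noise line are invented for illustration.

```python
import collections

# Toy "training data": the model never stores facts, only token statistics.
corpus = [
    "the capital of france is paris",
    "the capital of france is paris",
    "the capital of france is lyon",   # rare noise in the data
]

# Count which token follows the prefix, then normalize into probabilities.
prefix = "the capital of france is"
counts = collections.Counter(
    line[len(prefix):].strip() for line in corpus if line.startswith(prefix)
)
total = sum(counts.values())
probs = {tok: n / total for tok, n in counts.items()}
print(probs)  # 'paris' gets the highest probability, but 'lyon' is never zero
```

"Paris" wins not because the model understands geography, but because it was the most frequent continuation.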
Temperature: the "Creativity" setting.
Temp 0: Always picks the most likely token (Deterministic/Robotic).
Temp 1: Sometimes picks unlikely tokens (Creative, but more Hallucinations).
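The two settings above can be sketched as one sampling function: divide the model's raw scores (logits) by the temperature before the softmax, then roll the dice. The logits below are hypothetical numbers, not real model output.

```python
import math, random

def sample(logits, temperature):
    """Pick the next token. Temp 0 is greedy; higher temps are adventurous."""
    if temperature == 0:                        # Temp 0: always the top token
        return max(logits, key=logits.get)
    scaled = {t: l / temperature for t, l in logits.items()}
    z = max(scaled.values())                    # subtract max for stability
    exps = {t: math.exp(v - z) for t, v in scaled.items()}
    total = sum(exps.values())
    r, cum = random.random(), 0.0
    for tok, e in exps.items():                 # roll the dice over the
        cum += e / total                        # softmax probabilities
        if r < cum:
            return tok
    return tok                                  # guard against float rounding

# Hypothetical logits for the token after "The capital of France is"
logits = {"Paris": 5.0, "Lyon": 1.0, "cheese": 0.2}
print(sample(logits, 0))    # 'Paris' every single time
```

At temperature 0 the same prompt always yields "Paris"; raise the temperature and "Lyon" (or even "cheese") starts slipping through.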
Mission: You are the GPU. Based on the current sentence, choose the next word.
Adjust Temperature to see how it changes the "Dice Roll".
Imagine the model is predicting: "The first person on Mars was..."
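To see how the temperature knob reshapes the "dice roll" for this prompt, here is a minimal sketch. The candidate tokens and their scores are invented for the exercise; no one has been to Mars, so the model can only guess from patterns.

```python
import math

# Hypothetical raw scores for the token after "The first person on Mars was"
logits = {"Neil": 2.0, "Elon": 1.5, "a": 1.0, "Zhang": 0.5}

def dist(logits, temperature):
    """Softmax over temperature-scaled scores: the loaded dice you roll."""
    exps = {t: math.exp(l / temperature) for t, l in logits.items()}
    z = sum(exps.values())
    return {t: round(e / z, 2) for t, e in exps.items()}

print(dist(logits, 0.5))  # peaky: the top token dominates the dice roll
print(dist(logits, 2.0))  # flat: unlikely tokens get a real chance
```

Low temperature sharpens the distribution toward the favorite; high temperature flattens it, which is exactly where creative answers and hallucinations both come from.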