Large Language Model

What is an LLM?

An LLM is a massive pattern recognizer that processes and generates text.

It learns language patterns from billions of texts to generate new content.

Books, web pages, code - vast data compressed into one model.

Books/Papers: 100B+
Web Pages: 1T+
Code: 50B+
Training
Billions of Parameters

"A student who read every book on the internet" - remembers patterns but doesn't truly understand

Next Token Prediction

Next Word Prediction

The core of an LLM is guessing 'what word comes next?'

It learned language patterns through billions of fill-in-the-blank games.
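As a rough illustration of learning "what word comes next" from text, the sketch below counts which word follows which in a tiny made-up corpus. The corpus and the function names are hypothetical, and a real LLM uses a neural network with billions of parameters rather than a count table; this only shows the fill-in-the-blank idea.

```python
from collections import Counter, defaultdict

# Tiny made-up corpus (an assumption for illustration only).
corpus = [
    "the weather today is really nice",
    "the weather today is really hot",
    "the food today is really nice",
]

def train_bigrams(sentences):
    """Count, for each word, how often each other word follows it."""
    counts = defaultdict(Counter)
    for sentence in sentences:
        words = sentence.split()
        for current, following in zip(words, words[1:]):
            counts[current][following] += 1
    return counts

def next_word_probs(counts, word):
    """Turn the follow-up counts for one word into probabilities."""
    total = sum(counts[word].values())
    return {w: c / total for w, c in counts[word].items()}

counts = train_bigrams(corpus)
print(next_word_probs(counts, "really"))
```

In this toy corpus "nice" follows "really" twice and "hot" once, so "nice" gets the higher probability.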


The weather today is really [?]

Candidate Words

nice: 45%
hot: 25%
banana: 2%
computer: 1%
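A minimal sketch of how a next word could be chosen from the candidate probabilities listed above. The listed candidates do not add up to 100%, so the remainder presumably belongs to words that are not shown; the function names here are hypothetical and the two strategies (always take the top word, or sample in proportion to probability) are common options, not a claim about any specific model.

```python
import random

# Probabilities copied from the candidate list above. They do not sum
# to 1.0; the rest is assumed to belong to words that are not listed.
candidates = {"nice": 0.45, "hot": 0.25, "banana": 0.02, "computer": 0.01}

def pick_most_likely(probs):
    """Greedy choice: always return the highest-probability word."""
    return max(probs, key=probs.get)

def sample_next_word(probs, rng):
    """Random choice weighted by probability (weights need not sum to 1)."""
    words = list(probs)
    weights = [probs[w] for w in words]
    return rng.choices(words, weights=weights, k=1)[0]

rng = random.Random(0)  # fixed seed so the run is repeatable
print(pick_most_likely(candidates))
print(sample_next_word(candidates, rng))
```

The greedy choice always returns "nice", while sampling usually returns "nice" or "hot" and only rarely "banana" or "computer".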

"Smartphone autocomplete on steroids" - selects the most plausible next word based on context

Training Pipeline

Training Journey

Raw LLMs can't hold conversations. They become assistants like ChatGPT through three stages: Pattern Learning → Instruction Learning → Human Feedback.

This process transforms a pattern-only model into one that follows instructions and converses with humans.


1. Pre-training

Learns language patterns from vast internet text.

Input: Internet data (books, web, code)
Output: Pattern-only model
Prompt: "The capital of France is"
Response: "Paris. Paris is a global city known for... (continues endlessly)"

At this stage the model doesn't answer questions; it just continues the text.

"Taming a wild horse" - training a raw model into a friendly assistant

Generation Flow

Prompt → Response

Given an input, the model builds its answer by adding tokens one at a time.

It doesn't generate the whole answer at once, but repeatedly selects the next word.
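The loop described above can be sketched as follows. The lookup table is a hypothetical stand-in for a real model (which would compute probabilities over a large vocabulary), and the "<start>" and "<end>" markers are assumptions for illustration; only the shape of the loop matters here.

```python
# Hypothetical stand-in for a model: maps the most recent token to the
# next one. A real LLM looks at the whole context, not just one token.
NEXT_TOKEN = {
    "<start>": "AI",
    "AI": "is",
    "is": "software",
    "software": "that",
    "that": "learns",
    "learns": "<end>",
}

def next_token(context):
    """Return the next token for the given context (toy lookup)."""
    return NEXT_TOKEN.get(context[-1], "<end>")

def generate(prompt_tokens, max_new_tokens=10):
    """Append one token at a time until <end> or the length limit."""
    context = list(prompt_tokens)
    generated = []
    for _ in range(max_new_tokens):
        token = next_token(context)
        if token == "<end>":
            break
        context.append(token)    # the new token becomes part of the context
        generated.append(token)
    return generated

print(" ".join(generate(["<start>"])))  # prints: AI is software that learns
```

Each pass through the loop feeds the tokens produced so far back in as context, which is why the answer appears piece by piece rather than all at once.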


Prompt

What is artificial intelligence?

Response

"A fast typist typing one character at a time" - repeatedly selects the next token based on context

Limitations

Limits & Hallucination

For things outside its training data, the model may 'make things up convincingly'. This is called hallucination.

The model only learned patterns; it doesn't truly 'know' facts.

Inside the circle is learned knowledge; outside is unknown - it may fabricate answers.

[Diagram: a circle labeled "Learned Knowledge" (data up to ~2024, can answer) surrounded by "Unknown Territory" (may hallucinate)]

"An open-book test but the book only goes up to 2024" - can't know recent or private information

Search vs Generate

AI vs Search Engine

Search 'finds' existing information; AI 'creates' new answers.

They are completely different tools. Choose based on your purpose.

Compare how results differ when you ask the same question.

how to sort a list in python

  • docs.python.org - Sorting HOW TO
  • stackoverflow.com - How to sort a list
  • realpython.com - Python Sorting Guide

Real-time info
Verified sources
Custom explanation
Use Search When
  • Weather, stocks, news today
  • Need official documentation
  • Need exact numbers/statistics
Use AI When
  • Explanations, summaries, overviews
  • Writing code, translation, drafting
  • Brainstorming ideas
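For the example query above, "how to sort a list in python", a generated answer would typically be a direct explanation with code rather than a list of links. A minimal sketch of what such an answer might contain, using Python's built-in sorted() and list.sort(); the variable names are only for illustration.

```python
numbers = [3, 1, 2]

# sorted() returns a new list and leaves the original unchanged.
ascending = sorted(numbers)
descending = sorted(numbers, reverse=True)

# list.sort() sorts the list in place and returns None.
numbers.sort()

# A key function controls what is compared; here, case is ignored.
words = ["banana", "Apple", "cherry"]
by_lowercase = sorted(words, key=str.lower)

print(ascending, descending, numbers, by_lowercase)
```

As with any generated answer, it is worth checking the result against the official documentation, which is where search is the better tool.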

"Librarian vs Private Tutor" - librarians show where books are, tutors explain directly