LLMs (Large Language Models) form the core of AI chatbots, offering remarkable predictive capabilities that are both impressive and puzzling. Understanding LLMs provides a solid foundation for exploring AI, as we delve into their unexpected abilities and learn how to communicate effectively with them.
🙂User: Write a funny one sentence description of a typical cat personality.
🤖AI Assistant: Luna the mischievous feline, with an insatiable love for toilet paper rolls and a knack for unlocking drawers by napping on them.
If we input, 🙂: "There's the ocean and a nice beach. Let's go for a..."
The LLM should predict something like, 🤖: "swim".
This seemingly simple task is the foundation upon which chatbots are built. The LLM's ability to predict words based on context enables it to generate human-like responses, making conversations feel more natural.
But how does this turn into a full-fledged chatbot? Keep reading, and we'll delve into that soon.
For instance,
π:"There's the ocean and a nice trail. Let's go for a"
should predict π€:"hike" not "swim".Β Β
But inputting,
π:"There are ocean zombies... There's the ocean and a nice beach, let's..."
might result in:Β
π€:"RUN!!!".
This prediction ability is remarkable: in that example, the LLM had to factor in zombies, fear, spatial awareness, and typical human reactions.
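To make this concrete, here is a minimal sketch of asking a small open model for its single most likely next word. The choice of GPT-2 via the Hugging Face transformers library is just an example for illustration, not how any particular chatbot is built, and the word the model actually picks may differ from the ones in this article.

```python
# A sketch of next-word prediction with a small open model (GPT-2).
# Assumes the Hugging Face `transformers` library and `torch` are installed;
# the exact word a given model picks may differ from the examples in the text.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")

context = "There's the ocean and a nice beach. Let's go for a"
inputs = tokenizer(context, return_tensors="pt")

with torch.no_grad():
    logits = model(**inputs).logits             # a score for every word piece in the vocabulary

next_token_id = logits[0, -1].argmax().item()   # take the single most likely next token
print(tokenizer.decode(next_token_id))          # e.g. " swim" or " walk"
```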
Input
🙂:"There's the ocean and a nice beach. Let's go for a"
LLM predicts 🤖:"swim", then adds it to the sentence.
New input: "There's the ocean and a nice beach. Let's go for a swim"
Predicts 🤖:"before", adds it:
New input: "There's the ocean and a nice beach. Let's go for a swim before"
Predicts 🤖:"sunset", adds it:
Final output: 🤖:"There's the ocean and a nice beach. Let's go for a swim before sunset".
This continuous prediction and addition enables LLMs to generate coherent stories, one word at a time, by considering each new word in its context.
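Written as a loop, that predict, append, repeat process looks roughly like the sketch below. It reuses the model and tokenizer from the earlier snippet and always takes the single top word to keep things simple; real chatbots use much larger models and more varied sampling.

```python
# Sketch of the predict-append-repeat loop, reusing `model` and `tokenizer`
# from the previous snippet. Always taking the top word keeps it simple;
# real chatbots sample more creatively.
sentence = "There's the ocean and a nice beach. Let's go for a"

for _ in range(5):                                    # add five more word pieces
    inputs = tokenizer(sentence, return_tensors="pt")
    with torch.no_grad():
        logits = model(**inputs).logits
    next_token_id = logits[0, -1].argmax().item()     # most likely next token
    sentence += tokenizer.decode(next_token_id)       # append it and go again

print(sentence)   # e.g. "...Let's go for a swim before sunset" (output will vary)
```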
The complexity of language and its countless possible contexts make a rule-based approach unviable.
For instance, in a context where previous sentences describe zombie hunters, 🙂: "There are ocean zombies... There's the ocean and a nice beach, let's..." might deserve a response like 🤖: "Get 'em!" instead of "RUN!!!".
We create a "brain-like" computer program using billions of interconnected, adjustable parameters (like neurons).
These parameters turn an input sentence into a prediction of the next word through a vast number of mathematical operations.
Imagine billions of tiny dials that can be tuned to improve predictions.
🎛️🎛️🎛️🎛️🎛️🎛️🎛️
🎛️🎛️🎛️🎛️🎛️🎛️🎛️
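As a toy illustration only (real models have billions of parameters arranged in many interacting layers, and no single parameter maps neatly to a word), here is what "a grid of adjustable dials that scores candidate words" might look like in code. The candidate words and the random "context features" are made up for the example.

```python
import numpy as np

# Toy illustration only: each "dial" is just an adjustable number (a parameter).
# Real LLMs have billions of them; the candidates and features here are made up.
rng = np.random.default_rng(0)

candidates = ["swim", "hike", "RUN"]
context_features = rng.normal(size=8)           # stand-in for the "ocean + beach" context
dials = rng.normal(size=(8, len(candidates)))   # a small grid of dials

scores = context_features @ dials               # one score per candidate word
print(candidates[int(scores.argmax())])         # turning any dial can change this choice
```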
For each training sentence, we'd hide the last word and feed the rest into our model.
The model would predict a word; if wrong, we'd adjust its parameters and repeat.
With billions of dials and trillions of sentences, over time our "brain" learns to predict the next word exceptionally well.
(This process involves a specific method called Machine Learning, which we'll explore later.)
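Continuing the toy example above (and standing in for the real gradient-based training covered later), the hide-the-last-word loop can be sketched like this. The `featurize` helper is a made-up placeholder for turning text into numbers, and the nudge-the-dials update is a deliberately simplified stand-in.

```python
# Sketch of the training idea, continuing the toy "dials" example:
# hide the last word, let the model guess it, nudge the dials when it's wrong.
training_sentences = [
    ("There's the ocean and a nice beach. Let's go for a", "swim"),
    ("There's the ocean and a nice trail. Let's go for a", "hike"),
]

def featurize(sentence: str) -> np.ndarray:
    # Made-up placeholder: real models learn how to turn text into numbers.
    seeded = np.random.default_rng(abs(hash(sentence)) % (2**32))
    return seeded.normal(size=8)

learning_rate = 0.1
for _ in range(100):                               # many passes over the sentences
    for context, hidden_word in training_sentences:
        features = featurize(context)
        guess_idx = int((features @ dials).argmax())
        if candidates[guess_idx] != hidden_word:   # wrong guess? adjust the dials
            dials[:, candidates.index(hidden_word)] += learning_rate * features
            dials[:, guess_idx] -= learning_rate * features
```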
We can't easily interpret these numbers or "fix" the code directly.
If an LLM outputs something unexpected (like saying the sky is "green"), we can't pinpoint which numbers caused the error.
Instead, we improve LLMs by providing more relevant data and continuing to adjust parameters through training.
π:"How many ears on a bear?"
π€:"Three!".
π€¦
Training: 🧑‍💻: 🐻🎛️🎛️🐨🎛️🎛️🧸🎛️🎛️
π:"How many ears on a bear?"
π€:"Two!".
Online LLMs
Offline LLMs
Number of parameters
Size of training dataset
Training period
Fine tuning
Prompting
Retrieval Augmented Generation (RAG)