Intro to LLMs

Lunch and Learn Talk

Chris Levy

2024-05-01

ENV Setup

  • create a virtual env
python3 -m venv env
source env/bin/activate
  • install packages
pip install tiktoken
pip install openai
pip install instructor
pip install transformers
pip install torch
pip install python-dotenv
pip install notebook
  • add env vars in .env file
OPENAI_API_KEY=<key>
TOGETHER_API_KEY=<key>

Background

NLP Through The Years

Word Embeddings

  • represent each word as an embedding (vector of numbers)
  • useful computations such as distance (cosine/euclidean)
  • mapping of words onto a semantic space
  • examples: Word2Vec (2013), GloVe, BERT, ELMo
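
The distance computations mentioned above can be sketched in a few lines. This is a minimal illustration, assuming made-up 3-dimensional vectors; the words, the numbers, and the helper names `cosine_similarity` and `euclidean_distance` are not from the talk.

```python
import math

# Toy 3-dimensional "embeddings"; real models use hundreds of dimensions.
# These numbers are made up for illustration only.
embeddings = {
    "king": [0.90, 0.80, 0.10],
    "queen": [0.85, 0.82, 0.15],
    "taco": [0.10, 0.20, 0.95],
}


def cosine_similarity(a, b):
    """Cosine of the angle between two vectors (1.0 means same direction)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def euclidean_distance(a, b):
    """Straight-line distance between two vectors (0.0 means identical)."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


for other in ("queen", "taco"):
    sim = cosine_similarity(embeddings["king"], embeddings[other])
    dist = euclidean_distance(embeddings["king"], embeddings[other])
    print(f"king vs {other}: cosine={sim:.3f}, euclidean={dist:.3f}")
```

With these toy vectors, "king" should land closer to "queen" than to "taco" under both measures, which is the sense in which words are mapped onto a semantic space.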

Attention and Transformers

Transformer & Multi-Head Attention

What is an LLM (large language model)?

  • LLMs are scaled up versions of the Transformer architecture (millions/billions of parameters)
  • Most modern LLMs are decoder-only transformers
  • Trained on massive amounts of “general” textual data
  • Training objective is typically “next token prediction”: $P(w_{t+1} \mid w_t, w_{t-1}, \ldots, w_1)$

Next Token Prediction

  • LLMs are next token predictors
  • “It is raining today, so I will take my _______.”
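
As a rough sketch of what "predicting the next token" means: the model assigns a score (logit) to every token in its vocabulary, and a softmax turns those scores into probabilities. The four-word vocabulary and the logits below are invented for illustration; a real model scores tens of thousands of tokens.

```python
import math

# Hypothetical vocabulary and made-up logits for the prompt
# "It is raining today, so I will take my"
vocab = ["umbrella", "dog", "car", "sunglasses"]
logits = [4.0, 1.5, 2.0, -1.0]


def softmax(scores):
    """Turn raw scores into probabilities that sum to 1."""
    biggest = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - biggest) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


probs = softmax(logits)
for token, p in sorted(zip(vocab, probs), key=lambda pair: -pair[1]):
    print(f"{token:>10}: {p:.3f}")

# Greedy decoding: pick the single most likely next token.
next_token = vocab[probs.index(max(probs))]
print("next token:", next_token)
```

Greedy selection is only one option; sampling from the distribution instead is what makes repeated generations differ.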

Tokenization with tiktoken library

  • The first step is to convert the input text into tokens
  • Each token has an id in the vocabulary
import tiktoken

enc = tiktoken.encoding_for_model("gpt-4-0125")
encoded_text = enc.encode("tiktoken is great!")
encoded_text
[83, 1609, 5963, 374, 2294, 0]
[enc.decode([token]) for token in encoded_text]
['t', 'ik', 'token', ' is', ' great', '!']
enc.decode([83, 1609, 5963, 374, 2294, 0])
'tiktoken is great!'

Tokenization with transformers library

from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")

texts = [
    "I love summer",
    "I love tacos",
]
inputs = tokenizer(
    texts,
    return_tensors="pt",
    padding="max_length",
    max_length=16,
    truncation=True,
).input_ids
print(inputs)

print(inputs.shape)  # (B, T)
print(tokenizer.vocab_size)
for row in inputs:
    print(tokenizer.convert_ids_to_tokens(row))
tensor([[  101,  1045,  2293,  2621,   102,     0,     0,     0,     0,     0,
             0,     0,     0,     0,     0,     0],
        [  101,  1045,  2293, 11937, 13186,   102,     0,     0,     0,     0,
             0,     0,     0,     0,     0,     0]])
torch.Size([2, 16])
30522
['[CLS]', 'i', 'love', 'summer', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]']
['[CLS]', 'i', 'love', 'ta', '##cos', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]']

Tokenization is the First Step

LLMs are not great at math. Why?

  • because of tokenization and next token prediction

What is the average of: 2009 1746 4824 8439

encoded_text = enc.encode("What is the average of:  2009 1746 4824 8439")
print(encoded_text)
[3923, 374, 279, 5578, 315, 25, 220, 220, 1049, 24, 220, 11771, 21, 220, 21984, 19, 220, 23996, 24]
print([enc.decode([token]) for token in encoded_text])
['What', ' is', ' the', ' average', ' of', ':', ' ', ' ', '200', '9', ' ', '174', '6', ' ', '482', '4', ' ', '843', '9']
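
The split of 2009 into '200' and '9' above is consistent with a pre-tokenization rule that cuts runs of digits into pieces of at most three. Below is a minimal sketch of that rule, assuming a plain `\d{1,3}` regex as a simplified stand-in for the tokenizer's real pattern; the helper name `split_number` is hypothetical.

```python
import re

# Simplified stand-in for the number rule in the tokenizer's pre-tokenization
# pattern: runs of digits are cut into pieces of at most three digits.
DIGIT_CHUNKS = re.compile(r"\d{1,3}")


def split_number(text):
    """Split a digit string left to right into chunks of up to three digits."""
    return DIGIT_CHUNKS.findall(text)


for number in ["2009", "1746", "4824", "8439"]:
    print(number, "->", split_number(number))
```

Because the model sees fragments such as '200' and '9' rather than the number 2009, arithmetic has to be learned over arbitrary digit pieces, which is part of why it is unreliable.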

Basic Transformer Architecture - Further Reading

Instruction Models

Base Models VS Instruct Models

  • meta-llama/Meta-Llama-3-8B (base model)

Base Models VS Instruct Models

  • meta-llama/Meta-Llama-3-8B-Instruct

The Gap is closing

Aligning language models

OpenAI Compatible LLM Inference

import openai

client = openai.OpenAI()
chat_completion = client.chat.completions.create(
    model="gpt-3.5-turbo-0125",
    messages=[
        {"role": "user", "content": "Who are the main characters from Lord of the Rings?."},
    ],
)
response = chat_completion.choices[0].message.content
print(response)
The main characters from Lord of the Rings are Frodo Baggins, Samwise Gamgee, Aragorn, Legolas, Gimli, Gandalf, Boromir, Merry and Pippin, and Gollum.

OpenAI Compatible LLM Inference

import os

import openai

client = openai.OpenAI(api_key=os.environ.get("TOGETHER_API_KEY"), base_url="https://api.together.xyz/v1")
chat_completion = client.chat.completions.create(
    model="META-LLAMA/LLAMA-3-70B-CHAT-HF",
    messages=[
        {"role": "user", "content": "Who are the main characters from Lord of the Rings?."},
    ],
)
response = chat_completion.choices[0].message.content
print(response)
The main characters from J.R.R. Tolkien's epic fantasy novel "The Lord of the Rings" are:

1. **Frodo Baggins**: The hobbit who inherits the One Ring from Bilbo Baggins and undertakes the perilous journey to destroy it in the fires of Mount Doom.
2. **Samwise Gamgee** (Sam): Frodo's loyal hobbit servant and friend, who accompanies him on his quest.
3. **Aragorn (Strider)**: A human warrior who becomes the leader of the Fellowship of the Ring and helps guide Frodo on his journey. He is the rightful King of Gondor.
4. **Legolas**: An elf archer who joins the Fellowship and provides skilled marksmanship and agility.
5. **Gimli**: A dwarf warrior who joins the Fellowship and provides strength and combat skills.
6. **Gandalf the Grey**: A powerful wizard who helps guide Frodo on his quest and provides wisdom and magical assistance.
7. **Boromir**: A human warrior from the land of Gondor, who joins the Fellowship but ultimately tries to take the Ring from Frodo.
8. **Merry Brandybuck** and **Pippin Took**: Frodo's hobbit cousins, who join the Fellowship and provide comic relief and bravery in the face of danger.
9. **Sauron**: The primary antagonist, a dark lord who created the One Ring and seeks to conquer Middle-earth.
10. **Saruman**: A wizard who betrays Gandalf and allies himself with Sauron, seeking to gain power and control over Middle-earth.

These characters form the core of the story, and their interactions and relationships drive the plot of "The Lord of the Rings".

OpenAI Compatible LLM Inference

import openai

client = openai.OpenAI(api_key="ollama", base_url="http://localhost:11434/v1")
chat_completion = client.chat.completions.create(
    model="llama3",
    messages=[
        {"role": "user", "content": "Who are the main characters from Lord of the Rings?."},
    ],
)
response = chat_completion.choices[0].message.content
print(response)
The main characters in J.R.R. Tolkien's "Lord of the Rings" trilogy, which includes "The Fellowship of the Ring", "The Two Towers", and "The Return of the King", are:

1. Frodo Baggins: The hobbit who inherits the One Ring from Bilbo and sets out on a quest to destroy it in the fires of Mount Doom.
2. Samwise Gamgee (Sam): Frodo's loyal hobbit servant and friend, who accompanies him on his journey to Mordor.
3. Aragorn (Strider): A human warrior who leads the Fellowship and helps them navigate the perilous lands of Middle-earth.
4. Legolas: An elf archer who joins the Fellowship and fights alongside them against Sauron's armies.
5. Gimli: A dwarf warrior who also joins the Fellowship, seeking to avenge his father's death at the hands of orcs.
6. Boromir: The human son of the Steward of Gondor, who tries to take the One Ring from Frodo for the benefit of his own people.
7. Meriadoc Brandybuck (Merry) and Peregrin Took (Pippin): Two hobbit friends of Frodo's who accompany him on his journey and become embroiled in the quest to destroy the Ring.

These characters, along with Gandalf the Grey, a powerful wizard, and other supporting characters, drive the story and its themes of friendship, sacrifice, and the struggle against evil.

Chat Templates

from transformers import AutoTokenizer

checkpoint = "meta-llama/Meta-Llama-3-8B-Instruct"
tokenizer = AutoTokenizer.from_pretrained(checkpoint)
  • Each model has its own expected input format. For Llama3 it’s this:
"""
<|begin_of_text|><|start_header_id|>system<|end_header_id|>

You are a friendly chatbot who always responds in the style of a pirate<|eot_id|><|start_header_id|>user<|end_header_id|>

How many helicopters can a human eat in one sitting?<|eot_id|><|start_header_id|>assistant<|end_header_id|>
"""
  • With chat templates we can use this familiar standard:
messages = [
    {
        "role": "system",
        "content": "You are a friendly chatbot who always responds in the style of a pirate",
    },
    {"role": "user", "content": "How many helicopters can a human eat in one sitting?"},
]
tokenized_chat = tokenizer.apply_chat_template(messages, tokenize=True, add_generation_prompt=True, return_tensors="pt")
print(tokenizer.decode(tokenized_chat[0]))
<|begin_of_text|><|start_header_id|>system<|end_header_id|>

You are a friendly chatbot who always responds in the style of a pirate<|eot_id|><|start_header_id|>user<|end_header_id|>

How many helicopters can a human eat in one sitting?<|eot_id|><|start_header_id|>assistant<|end_header_id|>
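
To make the template less magical, here is a hand-written approximation of the layout printed above. It only mirrors the format shown on this slide and is not the tokenizer's actual template code; the function name `render_llama3_chat` is hypothetical.

```python
def render_llama3_chat(messages, add_generation_prompt=True):
    """Hand-written approximation of the Llama 3 prompt layout shown above."""
    prompt = "<|begin_of_text|>"
    for message in messages:
        prompt += f"<|start_header_id|>{message['role']}<|end_header_id|>\n\n"
        prompt += f"{message['content'].strip()}<|eot_id|>"
    if add_generation_prompt:
        # Open an assistant turn so the model knows it should answer next.
        prompt += "<|start_header_id|>assistant<|end_header_id|>\n\n"
    return prompt


messages = [
    {
        "role": "system",
        "content": "You are a friendly chatbot who always responds in the style of a pirate",
    },
    {"role": "user", "content": "How many helicopters can a human eat in one sitting?"},
]
print(render_llama3_chat(messages))
```

In practice `apply_chat_template` is the safer choice, since each model ships its own template and special tokens.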

Structured Output

import openai

client = openai.OpenAI()
chat_completion = client.chat.completions.create(
    model="gpt-3.5-turbo-0125",
    messages=[
        {
            "role": "user",
            "content": "Who are the main characters from Lord of the Rings?. "
            "For each character give the name, race, "
            "favorite food, skills, weapons, and a fun fact.",
        },
    ],
)
response = chat_completion.choices[0].message.content
print(response)
1. Frodo Baggins
- Race: Hobbit
- Favorite food: Mushrooms
- Skills: Determination, stealth, resilience
- Weapons: Sting (his sword)
- Fun fact: Frodo is the only character to have directly interacted with the One Ring and survived its corrupting influence.

2. Aragorn (also known as Strider)
- Race: Human (Dunedain)
- Favorite food: Lembas bread
- Skills: Swordsmanship, tracking, leadership
- Weapons: Anduril (his sword), bow and arrows
- Fun fact: Aragorn is the heir to the throne of Gondor and the rightful King of Arnor.

3. Gandalf
- Race: Maia (wizard)
- Favorite food: Pipe-weed
- Skills: Magic, wisdom, leadership
- Weapons: Glamdring (his sword), staff
- Fun fact: Gandalf is actually one of the Maiar, a group of powerful beings who serve the Valar (gods) in the world of Middle-earth.

4. Legolas
- Race: Elf
- Favorite food: Waybread (Lembas)
- Skills: Archery, agility, keen eyesight
- Weapons: Bow and arrows, knives
- Fun fact: Legolas is the son of Thranduil, the Elven King of the Woodland Realm in Mirkwood.

5. Gimli
- Race: Dwarf
- Favorite food: Roast meats
- Skills: Axe-fighting, mining, loyalty
- Weapons: Axe, throwing axes
- Fun fact: Gimli is a member of the Fellowship representing the Dwarves, who are known for their craftsmanship and love of gold and jewels.

Structured Output

import openai
import instructor
from pydantic import BaseModel

client = instructor.from_openai(openai.OpenAI())


# Define your desired output structure
class UserInfo(BaseModel):
    name: str
    age: int


# Extract structured data from natural language
user_info = client.chat.completions.create(
    model="gpt-3.5-turbo-0125",
    response_model=UserInfo,
    messages=[{"role": "user", "content": "Chris is 38 years old."}],
)
print(user_info.model_dump())
print(user_info.name)
print(user_info.age)
{'name': 'Chris', 'age': 38}
Chris
38

Structured Output

import openai
import instructor
from typing import List

from pydantic import BaseModel, field_validator

client = instructor.from_openai(openai.OpenAI())


class Character(BaseModel):
    name: str
    race: str
    fun_fact: str
    favorite_food: str
    skills: List[str]
    weapons: List[str]


class Characters(BaseModel):
    characters: List[Character]

    @field_validator("characters")
    @classmethod
    def validate_characters(cls, v):
        if len(v) < 10:
            raise ValueError(f"The number of characters must be at least 10, but it is {len(v)}")
        return v


response = client.chat.completions.create(
    model="gpt-3.5-turbo-0125",
    messages=[
        {
            "role": "user",
            "content": "Who are the main characters from Lord of the Rings?. "
            "For each character give the name, race, "
            "favorite food, skills, weapons, and a fun fact. Give me at least 10 different characters.",
        },
    ],
    response_model=Characters,
    max_retries=4,
)

from pprint import pprint

pprint(response.model_dump())
{'characters': [{'favorite_food': 'Mushrooms',
                 'fun_fact': 'Frodo is the nephew of Bilbo Baggins.',
                 'name': 'Frodo Baggins',
                 'race': 'Hobbit',
                 'skills': ['Ringbearer', 'Stealth', 'Courage'],
                 'weapons': ['Sting', 'Phial of Galadriel']},
                {'favorite_food': 'Lembas bread',
                 'fun_fact': 'Aragorn is the rightful heir to the throne of '
                             'Gondor.',
                 'name': 'Aragorn',
                 'race': 'Man',
                 'skills': ['Swordsmanship', 'Leadership', 'Tracking'],
                 'weapons': ['Anduril', 'Bow and Arrow']},
                {'favorite_food': 'Roast Pork',
                 'fun_fact': 'Gimli is the son of Gloin, one of the Dwarves in '
                             "'The Hobbit'.",
                 'name': 'Gimli',
                 'race': 'Dwarf',
                 'skills': ['Axe throwing', 'Smithing', 'Courage'],
                 'weapons': ['Axe', 'Throwing Axe']},
                {'favorite_food': 'Lembas bread',
                 'fun_fact': 'Legolas has keen eyesight and can spot enemies '
                             'from great distances.',
                 'name': 'Legolas',
                 'race': 'Elf',
                 'skills': ['Archery', 'Agility', 'Sight'],
                 'weapons': ['Bow', 'Arrow']},
                {'favorite_food': 'Pipe-weed',
                 'fun_fact': 'Gandalf is also known as Mithrandir in Elvish.',
                 'name': 'Gandalf',
                 'race': 'Maia',
                 'skills': ['Wizardry', 'Wisdom', 'Combat'],
                 'weapons': ['Glamdring', 'Staff']},
                {'favorite_food': 'Venison',
                 'fun_fact': 'Boromir hails from the realm of Gondor.',
                 'name': 'Boromir',
                 'race': 'Man',
                 'skills': ['Swordsmanship', 'Leadership', 'Athletics'],
                 'weapons': ['Sword', 'Shield']},
                {'favorite_food': 'Potatoes',
                 'fun_fact': 'Sam is known for his unwavering loyalty to '
                             'Frodo.',
                 'name': 'Samwise Gamgee',
                 'race': 'Hobbit',
                 'skills': ['Gardening', 'Loyalty', 'Cooking'],
                 'weapons': ['Cooking pot', 'Gardening tools']},
                {'favorite_food': 'Berry tarts',
                 'fun_fact': 'Arwen is the daughter of Elrond, Lord of '
                             'Rivendell.',
                 'name': 'Arwen',
                 'race': 'Half-Elf',
                 'skills': ['Horseback riding', 'Healing', 'Sword fighting'],
                 'weapons': ['Sword']},
                {'favorite_food': 'Apple pie',
                 'fun_fact': "Merry is one of Frodo's close friends and part "
                             'of the Fellowship of the Ring.',
                 'name': 'Merry Brandybuck',
                 'race': 'Hobbit',
                 'skills': ['Stealth', 'Swordsmanship', 'Cooking'],
                 'weapons': ['Dagger', 'Sword']},
                {'favorite_food': 'Mushrooms',
                 'fun_fact': 'Pippin becomes a Knight of Gondor for his '
                             'bravery in battle.',
                 'name': 'Pippin Took',
                 'race': 'Hobbit',
                 'skills': ['Loyalty', 'Entertainment', 'Courage'],
                 'weapons': ['Dagger', 'Sword']}]}
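
The idea behind combining a validator with `max_retries` can be sketched without any libraries: validate the reply, and if validation fails, send the error back to the model and ask again. This is a rough illustration of the pattern, not instructor's actual implementation; `fake_llm`, `validate`, and `create_with_retries` are hypothetical names, and the "model" is a stub that only improves once it sees an error message.

```python
import json


def fake_llm(messages):
    """Hypothetical stand-in for a model call: returns too few characters
    until the conversation contains a validation error to react to."""
    saw_error = any("Validation error" in m["content"] for m in messages)
    names = ["Frodo", "Sam", "Gandalf"] if saw_error else ["Frodo"]
    return json.dumps({"characters": [{"name": n} for n in names]})


def validate(raw, min_characters):
    """Parse the JSON reply and enforce the minimum list length."""
    data = json.loads(raw)
    characters = data["characters"]
    if len(characters) < min_characters:
        raise ValueError(
            f"The number of characters must be at least {min_characters}, "
            f"but it is {len(characters)}"
        )
    return data


def create_with_retries(messages, min_characters=3, max_retries=4):
    """Call the model, validate, and feed errors back until it passes."""
    messages = list(messages)  # avoid mutating the caller's list
    for attempt in range(1, max_retries + 1):
        raw = fake_llm(messages)
        try:
            return validate(raw, min_characters), attempt
        except (ValueError, KeyError) as err:
            # Show the model its own reply plus the reason it was rejected.
            messages.append({"role": "assistant", "content": raw})
            messages.append({"role": "user", "content": f"Validation error: {err}. Please fix."})
    raise RuntimeError("model never produced a valid response")


result, attempts = create_with_retries([{"role": "user", "content": "List main characters."}])
print(attempts, [c["name"] for c in result["characters"]])
```

The useful part of the pattern is that the error text itself becomes part of the next prompt, so the model gets a concrete reason to change its answer.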

Function Calling

tools = [
    {
        "type": "function",
        "function": {
            "name": "get_weather_forecast",
            "description": "Provides a weather forecast for a given location and date.",
            "parameters": {
                "type": "object",
                "properties": {"location": {"type": "string"}, "date": {"type": "string"}},
                "required": ["location", "date"],
            },
        },
    },
    {
        "type": "function",
        "function": {
            "name": "book_flight",
            "description": "Book a flight.",
            "parameters": {
                "type": "object",
                "properties": {
                    "departure_city": {"type": "string"},
                    "arrival_city": {"type": "string"},
                    "departure_date": {"type": "string"},
                    "return_date": {"type": "string"},
                    "num_passengers": {"type": "integer"},
                    "cabin_class": {"type": "string"},
                },
                "required": [
                    "departure_city",
                    "arrival_city",
                    "departure_date",
                    "return_date",
                    "num_passengers",
                    "cabin_class",
                ],
            },
        },
    },
    {
        "type": "function",
        "function": {
            "name": "send_slack_message",
            "description": "Send a slack message to specific channel.",
            "parameters": {
                "type": "object",
                "properties": {"channel_name": {"type": "string"}, "message": {"type": "string"}},
                "required": ["channel_name", "message"],
            },
        },
    },
]

import openai
from datetime import date
import json

client = openai.OpenAI()
chat_completion = client.chat.completions.create(
    model="gpt-4-turbo",
    messages=[
        {"role": "system", "content": f"Today's date is {date.today()}"},
        {
            "role": "user",
            "content": """This coming Friday I need to book a flight from Halifax, NS to Austin, Texas. 
                                    It will be me and my friend and we need first class seats. 
                                    We will come back on Sunday. Let me know what I should pack for clothes 
                                    according to the weather there each day. Also please remind my team on 
                                    the DEV slack channel that I will be out of office on Friday. 
                                    1. Book the flight. 
                                    2. Let me know the weather. 
                                    3. Send the slack message.""",
        },
    ],
    tools=tools,
)

for tool in chat_completion.choices[0].message.tool_calls:
    print(f"function name: {tool.function.name}")
    print(f"function arguments: {json.loads(tool.function.arguments)}")
    print()
function name: book_flight
function arguments: {'departure_city': 'Halifax', 'arrival_city': 'Austin', 'departure_date': '2024-05-03', 'return_date': '2024-05-05', 'num_passengers': 2, 'cabin_class': 'First'}

function name: get_weather_forecast
function arguments: {'location': 'Austin, Texas', 'date': '2024-05-03'}

function name: get_weather_forecast
function arguments: {'location': 'Austin, Texas', 'date': '2024-05-04'}

function name: get_weather_forecast
function arguments: {'location': 'Austin, Texas', 'date': '2024-05-05'}

function name: send_slack_message
function arguments: {'channel_name': 'DEV', 'message': 'I will be out of office this Friday, May 3, 2024. Please reach out via email if urgent.'}
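
Note that the model only returns function names and arguments; running the functions is up to the application. A minimal dispatch sketch follows, assuming stub implementations (the function bodies and the `TOOL_REGISTRY` and `dispatch` names are hypothetical) and two of the tool calls printed above, with the Slack message abridged.

```python
import json


# Hypothetical local implementations; a real app would call actual services.
def get_weather_forecast(location, date):
    return f"(stub) forecast for {location} on {date}"


def send_slack_message(channel_name, message):
    return f"(stub) posted to #{channel_name}: {message}"


# Map the tool names from the schema to Python callables.
TOOL_REGISTRY = {
    "get_weather_forecast": get_weather_forecast,
    "send_slack_message": send_slack_message,
}

# Two of the tool calls printed above, as (name, JSON argument string) pairs.
tool_calls = [
    ("get_weather_forecast", '{"location": "Austin, Texas", "date": "2024-05-03"}'),
    ("send_slack_message", '{"channel_name": "DEV", "message": "I will be out of office this Friday."}'),
]


def dispatch(name, arguments_json):
    """Look up the requested function and call it with the decoded arguments."""
    if name not in TOOL_REGISTRY:
        raise KeyError(f"model requested an unknown tool: {name}")
    return TOOL_REGISTRY[name](**json.loads(arguments_json))


results = [dispatch(name, args) for name, args in tool_calls]
for line in results:
    print(line)
```

Checking the name against a registry before calling anything keeps the model from triggering functions the application never offered.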

RAG: Retrieval Augmented Generation

RAG: Step 1 - Index your Documents

  • RAG is a technique for augmenting LLM knowledge with additional data.
  • image source: langchain docs

RAG: Step 2 - Query and Prompt LLM
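
The two steps can be sketched end to end with a toy retriever. This is a simplified illustration under stated assumptions: the documents are made up, and a bag-of-words count vector stands in for a real embedding model; `embed`, `retrieve`, and `build_prompt` are hypothetical helper names.

```python
import math
from collections import Counter

# Made-up documents standing in for a private knowledge base.
documents = [
    "The office is closed on Fridays during the summer.",
    "Expense reports are due on the last day of each month.",
    "The DEV slack channel is used for deployment announcements.",
]


def embed(text):
    """Toy 'embedding': a bag-of-words count vector (real systems use a model)."""
    return Counter(text.lower().replace(".", "").replace("?", "").split())


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[w] * b[w] for w in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


# Step 1: index the documents.
index = [(doc, embed(doc)) for doc in documents]


def retrieve(question, k=1):
    """Step 2a: return the k documents most similar to the question."""
    q = embed(question)
    ranked = sorted(index, key=lambda item: cosine(q, item[1]), reverse=True)
    return [doc for doc, _ in ranked[:k]]


def build_prompt(question, k=1):
    """Step 2b: put the retrieved text into the prompt sent to the LLM."""
    context = "\n".join(retrieve(question, k))
    return f"Answer using only this context:\n{context}\n\nQuestion: {question}"


print(build_prompt("When are expense reports due?"))
```

The resulting prompt would then be sent as the user message in a chat completion call; swapping `embed` for a real embedding model and `index` for a vector store gives the usual production setup.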

RAG Resources

MultiModal

from openai import OpenAI

client = OpenAI()

response = client.chat.completions.create(
    model="gpt-4-turbo",
    messages=[
        {
            "role": "user",
            "content": [
                {"type": "text", "text": "What is unusual about this image?"},
                {
                    "type": "image_url",
                    "image_url": {
                        "url": "https://i.pinimg.com/736x/6e/71/0d/6e710de5084379ba6a57b77e6579084f.jpg",
                    },
                },
            ],
        }
    ],
    max_tokens=300,
)

print(response.choices[0].message.content)
The unusual aspect of this image is a man ironing clothes on an ironing board placed on top of a taxi in the middle of a busy street. This is an uncommon sight, as ironing typically takes place in domestic or commercial indoor settings. The juxtaposition of such a mundane, home-based activity with the fast-paced, outdoor environment of a city street is quite remarkable and humorous. Additionally, both the ironing board and the taxi are branded with the same logo, suggesting that this scene might be part of a promotional event or public stunt to attract attention.

MultiModal

from openai import OpenAI

client = OpenAI()

response = client.chat.completions.create(
    model="gpt-4-turbo",
    messages=[
        {
            "role": "user",
            "content": [
                {"type": "text", "text": "What is in this image?"},
                {
                    "type": "image_url",
                    "image_url": {
                        "url": "https://media.makeameme.org/created/it-worked-fine.jpg",
                    },
                },
            ],
        }
    ],
    max_tokens=300,
)

print(response.choices[0].message.content)
The image is a meme featuring two juxtaposed elements. In the background, there is a scene of a house on fire with firefighters and emergency responders at the site, attempting to manage the situation. In the foreground, there is a young girl smirking at the camera with a knowing expression. Overlaid text reads, "IT WORKED FINE IN DEV, IT'S A DEVOPS PROBLEM NOW," humorously suggesting that a problem developed during the software development stage is now a problem for the DevOps team to handle. The meme uses the incongruity between the calm and mischievous expression of the girl and the chaotic scene behind her to underline its comedic message about shifting blame in a development context.

MultiModal

from openai import OpenAI

client = OpenAI()

response = client.chat.completions.create(
    model="gpt-4-turbo",
    messages=[
        {
            "role": "user",
            "content": [
                {
                    "type": "text",
                    "text": "Give me a long list of visual search tags/keywords so I can "
                    "index this image in my visual search index. Respond in JSON format {'labels': ['label1', ...]}",
                },
                {
                    "type": "image_url",
                    "image_url": {
                        "url": "https://storage.googleapis.com/pai-images/a6d0952a331d40489b216e7f3f1ff6ed.jpeg",
                    },
                },
            ],
        }
    ],
    response_format={"type": "json_object"},
)

print(response.choices[0].message.content)
{
  "labels": [
    "animated character",
    "wizard",
    "Minion",
    "fantasy",
    "3D illustration",
    "cute",
    "magic",
    "staff",
    "long beard",
    "blue hat",
    "glasses",
    "overalls",
    "adventure",
    "comical character",
    "grey beard",
    "wooden staff",
    "round glasses",
    "yellow",
    "character design",
    "creative",
    "digital art",
    "sorcerer",
    "cartoon",
    "funny",
    "elderly character",
    "mystical",
    "storybook",
    "cloak",
    "leather belt",
    "buckle"
  ]
}

Code Interpreter (Data Analysis)

  • give the LLM access to Python
  • your own little data analyst to give tasks to

example

Fine Tuning

Agents

  • todo

Resources
