In this post we take a look at the function calling capabilities of the open source model NousResearch/Hermes-2-Pro-Mistral-7B (interstellarninja et al. (2024)).
In a previous blog post I discussed how we can use the OpenAI Python client to run inference with open source models through services that are OpenAI compatible. I’m going to copy part of the code here.
ChatCompletion(id='chatcmpl-945ya3wcBWQeIbmzffGotEPMNnU66', choices=[Choice(finish_reason='stop', index=0, logprobs=None, message=ChatCompletionMessage(content='Hello! How can I assist you today?', role='assistant', function_call=None, tool_calls=None))], created=1710763312, model='gpt-3.5-turbo-0125', object='chat.completion', system_fingerprint='fp_4f2ebda25a', usage=CompletionUsage(completion_tokens=9, prompt_tokens=9, total_tokens=18))
We can also use the same class to run inference with Hermes-2-Pro-Mistral-7B through a Hugging Face Inference endpoint. You don’t need an inference endpoint to run this model: you could use the transformers library directly and run it locally. Remember to use the proper prompt format. I’m using the messages format.
Code
print(
    llm(
        model="tgi",
        api_key=HUGGING_FACE_ACCESS_TOKEN,
        base_url=HUGGING_FACE_ENDPOINT_URL,
        messages=[
            dict(
                role="system",
                content="You are an OpenSource LLM that rivals OpenAI GPT. Your goal is to bring open source AI to everyone!",
            ),
            dict(role="user", content="Explain why open source AI is important."),
        ],
        max_tokens=2000,
        temperature=1,
    )
    .choices[0]
    .message.content
)
Open source AI is important for several reasons:
1. Transparency: Open source AI allows developers and researchers to review the code, understand how it works, and verify its correctness. This transparency ensures trust in the system and can lead to more reliable and secure AI applications.
2. Collaboration: Open source encourages collaboration among developers, researchers, and users from around the world. By sharing knowledge and efforts, the community can collectively drive innovation forward, resulting in faster advancements in the field of AI.
3. Accessibility: Open source AI makes it possible for anyone to access and use cutting-edge AI technologies, not just those with large budgets or exclusive access. This democratizes AI and ensures that it can benefit everyone, not just those who can afford proprietary solutions.
4. Flexibility: With open source AI, users have the freedom to modify and adapt the code to fit their specific needs. This flexibility allows for customized solutions that can better address unique problems and requirements.
5. Learning and Education: Open source AI provides a valuable resource for learning and education. Students, researchers, and developers can study the code, understand the underlying principles, and gain practical experience in using and improving AI systems.
6. Competition and Market Dynamics: Open source AI fosters a competitive environment by encouraging innovation and rapid development. This can lead to improvements in efficiency, performance, and overall quality of AI solutions. Additionally, it can create a diverse ecosystem with multiple players, reducing the risk of monopolies and promoting a healthy market dynamics.
7. Resilience: Open source AI can be more resilient to cyberattacks since the code is openly available for review and audit. Additionally, a diverse community can help identify and address potential vulnerabilities more effectively.
In summary, open source AI is important as it promotes transparency, collaboration, accessibility, flexibility, and learning while fostering innovation, competition, and resilience in the field of artificial intelligence.
Function Calling Capabilities
First we will define some functions/tools which the LLM will have access to. Here I use langchain to convert the Python functions into the tools format used by OpenAI. It’s much faster than writing those JSON objects by hand. Note that Hermes-2-Pro-Mistral-7B also uses this same format!
I am leaving out the actual logic for each function. I mainly want to test the model’s ability to pick out the correct function and arguments. The important step here is to document each function and argument.
Code
@tool
def get_weather_forecast(location: str, date: str) -> str:
    """Provides a weather forecast for a given location and date.

    Args:
        location (str): The name of the city and state, e.g. 'San Francisco, CA'.
        date (str): The date of the forecast in YYYY-MM-DD format, e.g. '2023-07-01'.

    Returns:
        str: A string containing the weather forecast, e.g. 'Partly cloudy with a high of 72F (22C).'
    """
    pass


@tool
def book_flight(
    departure_city: str,
    arrival_city: str,
    departure_date: str,
    return_date: str,
    num_passengers: int,
    cabin_class: str,
) -> dict:
    """Book a round-trip flight for the given parameters.

    Args:
        departure_city (str): The full city name with the departure airport, e.g. "Toronto".
        arrival_city (str): The full city name with the arrival airport, e.g. "Austin".
        departure_date (str): The departure date in YYYY-MM-DD format.
        return_date (str): The return date in YYYY-MM-DD format.
        num_passengers (int): The number of passengers.
        cabin_class (str): The cabin class, e.g. "economy", "business", "first".

    Returns:
        dict: A dict with the booking details including airline, flight numbers, price and booking confirmation code.
    """
    pass


@tool
def book_movie_tickets(movie_name: str, theater_name: str, date: str, time: str, num_tickets: int) -> dict:
    """Book movie tickets for the given movie, theater, date, time, and number of tickets.

    Args:
        movie_name (str): The name of the movie.
        theater_name (str): The name of the theater.
        date (str): The date of the movie showing (YYYY-MM-DD).
        time (str): The time of the movie showing (HH:MM).
        num_tickets (int): The number of tickets to book for the movie.

    Returns:
        dict: Returns a dictionary with booking details if successful, otherwise returns a dictionary with an error message.
    """
    pass


@tool
def translate_text(text: str, target_language: str) -> str:
    """Translate the given text into the specified target language.

    Args:
        text (str): The text to be translated.
        target_language (str): The target language code (e.g., 'es' for Spanish, 'fr' for French).

    Returns:
        str: The translated text in the target language.
    """
    pass


@tool
def get_recipe(dish_name: str) -> str:
    """Returns a recipe for the given dish name.

    Args:
        dish_name (str): The name of the dish to get the recipe for.

    Returns:
        str: A string containing the recipe instructions.
    """
    pass


@tool
def solve_math_problem(problem: str) -> str:
    """Solves a given math equation using a symbolic math library. Simply pass in the equation.

    Args:
        problem (str): The equation to be solved.

    Returns:
        str: The solution to the equation.
    """
    pass


@tool
def send_slack_message(channel_name: str, message: str) -> bool:
    """Send a message to a Slack channel.

    Args:
        channel_name (str): The name of the channel.
        message (str): The message to be sent.

    Returns:
        bool: True if the message was sent successfully, False otherwise.
    """
    pass


functions = [
    get_weather_forecast,
    book_flight,
    book_movie_tickets,
    translate_text,
    get_recipe,
    solve_math_problem,
    send_slack_message,
]
tools = [convert_to_openai_tool(f) for f in functions]
Here are two of the tool definitions as examples. Note that this is the same tools format used by OpenAI.
Code
tools[0]
{'type': 'function',
'function': {'name': 'get_weather_forecast',
'description': "get_weather_forecast(location: str, date: str) -> str - Provides a weather forecast for a given location and date.\n\n Args:\n location (str): The name of the city and state, e.g. 'San Francisco, CA'.\n date (str): The date of the forecast in YYYY-MM-DD format, e.g. '2023-07-01'.\n\n Returns:\n str: A string containing the weather forecast, e.g. 'Partly cloudy with a high of 72F (22C).'",
'parameters': {'type': 'object',
'properties': {'location': {'type': 'string'}, 'date': {'type': 'string'}},
'required': ['location', 'date']}}}
Code
tools[-1]
{'type': 'function',
'function': {'name': 'send_slack_message',
'description': 'send_slack_message(channel_name: str, message: str) -> bool - Send a message to a Slack channel.\n Args:\n channel_name (str): The name of the channel.\n message (str): The message to be sent.\n Returns:\n bool: True if the message was sent successfully, False otherwise.',
'parameters': {'type': 'object',
'properties': {'channel_name': {'type': 'string'},
'message': {'type': 'string'}},
'required': ['channel_name', 'message']}}}
Here is a list of questions to test out the function calling capabilities. For each question we have the text and the ground-truth expected function name and arguments. This gives us a mini evaluation of how well the function calling works.
Code
questions = [
    {
        "question": "What will the weather be like in Seattle, WA tomorrow?",
        "tool_calls": [
            {
                "name": "get_weather_forecast",
                "arguments": {
                    "location": "Seattle, WA",
                    "date": (datetime.now() + timedelta(days=1)).strftime("%Y-%m-%d"),
                },
            }
        ],
    },
    {
        "question": "What's the forecast for Miami for today?",
        "tool_calls": [
            {
                "name": "get_weather_forecast",
                "arguments": {"location": "Miami, FL", "date": datetime.now().strftime("%Y-%m-%d")},
            }
        ],
    },
    {
        "question": "Will I need an umbrella in New York City two days from now?",
        "tool_calls": [
            {
                "name": "get_weather_forecast",
                "arguments": {
                    "location": "New York City, NY",
                    "date": (datetime.now() + timedelta(days=2)).strftime("%Y-%m-%d"),
                },
            }
        ],
    },
    {
        "question": "Book me a round-trip flight from New York City to Los Angeles departing on June 15th and returning June 22nd for 2 passengers in economy class.",
        "tool_calls": [
            {
                "name": "book_flight",
                "arguments": {
                    "departure_city": "NYC",
                    "arrival_city": "LAX",
                    "departure_date": datetime(datetime.now().year, 6, 15).strftime("%Y-%m-%d"),
                    "return_date": datetime(datetime.now().year, 6, 22).strftime("%Y-%m-%d"),
                    "num_passengers": 2,
                    "cabin_class": "economy",
                },
            }
        ],
    },
    {
        "question": "I need to book a first class round-trip flight for 4 people from Chicago to Miami. We want to leave on December 1 and return on December 12.",
        "tool_calls": [
            {
                "name": "book_flight",
                "arguments": {
                    "departure_city": "Chicago",
                    "arrival_city": "Miami",
                    "departure_date": datetime(datetime.now().year, 12, 1).strftime("%Y-%m-%d"),
                    "return_date": datetime(datetime.now().year, 12, 12).strftime("%Y-%m-%d"),
                    "num_passengers": 4,
                    "cabin_class": "first",
                },
            }
        ],
    },
    {
        "question": "I want to book 3 tickets for The Super Mario Bros. Movie at AMC Empire 25 on April 7th at 7:30 PM.",
        "tool_calls": [
            {
                "name": "book_movie_tickets",
                "arguments": {
                    "movie_name": "The Super Mario Bros. Movie",
                    "theater_name": "AMC Empire 25",
                    "date": datetime(datetime.now().year, 4, 7).strftime("%Y-%m-%d"),
                    "time": "19:30",
                    "num_tickets": 3,
                },
            }
        ],
    },
    {
        "question": "Book 2 tickets for Guardians of the Galaxy Vol. 3 at Regal Union Square on May 5th for the 9:45 PM show.",
        "tool_calls": [
            {
                "name": "book_movie_tickets",
                "arguments": {
                    "movie_name": "Guardians of the Galaxy Vol. 3",
                    "theater_name": "Regal Union Square",
                    "date": datetime(datetime.now().year, 5, 5).strftime("%Y-%m-%d"),
                    "time": "21:45",
                    "num_tickets": 2,
                },
            }
        ],
    },
    {
        "question": "How do you say 'Hello, how are you?' in Spanish?",
        "tool_calls": [
            {
                "name": "translate_text",
                "arguments": {"text": "Hello, how are you?", "target_language": "es"},
            }
        ],
    },
    {
        "question": "Translate 'I love programming' to French.",
        "tool_calls": [
            {
                "name": "translate_text",
                "arguments": {"text": "I love programming", "target_language": "fr"},
            }
        ],
    },
    {
        "question": "How do I make pesto?",
        "tool_calls": [{"name": "get_recipe", "arguments": {"dish_name": "pesto"}}],
    },
    {
        "question": "What's a good vegan chili recipe?",
        "tool_calls": [{"name": "get_recipe", "arguments": {"dish_name": "vegan chili"}}],
    },
    {
        "question": "Can you give me a recipe for chocolate chip cookies?",
        "tool_calls": [{"name": "get_recipe", "arguments": {"dish_name": "chocolate chip cookies"}}],
    },
    {
        "question": "Solve the equation: x^2 + 2x + 1=0.",
        "tool_calls": [{"name": "solve_math_problem", "arguments": {"problem": "x^2 + 2x + 1=0"}}],
    },
    {
        "question": "Solve the equation: 3x - 7 = 5x + 9",
        "tool_calls": [{"name": "solve_math_problem", "arguments": {"problem": "3x - 7 = 5x + 9"}}],
    },
    {
        "question": "Solve the equation: sin(x) = 0",
        "tool_calls": [{"name": "solve_math_problem", "arguments": {"problem": "sin(x) = 0"}}],
    },
    {
        "question": "Send a message to the general channel on Slack saying 'Hello, world!'",
        "tool_calls": [
            {
                "name": "send_slack_message",
                "arguments": {"channel_name": "general", "message": "Hello, world!"},
            }
        ],
    },
    {
        "question": "Send a message to the sales-team channel on Slack with the message: 'Please register for the conference.'",
        "tool_calls": [
            {
                "name": "send_slack_message",
                "arguments": {
                    "channel_name": "sales-team",
                    "message": "Please register for the conference.",
                },
            }
        ],
    },
    {
        "question": "Send a message to the office-updates channel with the message 'FOOD IS HERE!'",
        "tool_calls": [
            {
                "name": "send_slack_message",
                "arguments": {"channel_name": "office-updates", "message": "FOOD IS HERE!"},
            }
        ],
    },
]
Code
random.shuffle(tools)
random.shuffle(questions)
gpt-3.5-turbo-0125 Function Calling
First we will use gpt-3.5-turbo-0125 to extract the function name and arguments for each question.
I’m going to use GPT-4 to check the “correctness” of the predicted/generated function arguments by comparing them with the expected arguments. This step is completely optional; you could use exact string matching or something else instead. I was curious to see how this would work though.
Code
def check_tool_call_arguments(expected, predicted):
    # Ask GPT-4 if the expected function name and arguments are the same as
    # the predicted function name and arguments.
    if expected["name"] != predicted["name"]:
        return False, f'Function Names Do not Match. Expected {expected["name"]}. Predicted: {predicted["name"]}'
    prompt = f"""Check if the following queries are approx equal. Use fuzzy logic matching for strings.
Check to see if the arguments are semantically similar, especially for free form text.
If you decide they are equivalent then return TRUE and only TRUE with no other explanation. Otherwise return FALSE and give an explanation why they don't match.
Expected Arguments: {expected['arguments']}
Predicted Arguments: {predicted['arguments']}
"""
    resp = llm(model="gpt-4-0125-preview", messages=[dict(role="user", content=prompt)])
    if resp.choices[0].message.content.lower().strip() == "true":
        return True, None
    explanation = resp.choices[0].message.content.lower().strip()
    return False, explanation
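For comparison, here is what the exact-matching alternative mentioned above might look like. This is a sketch: `check_tool_call_exact` is a hypothetical helper, not part of the original code.

```python
def check_tool_call_exact(expected: dict, predicted: dict) -> bool:
    # Deterministic check: the name and every argument must match exactly.
    # Stricter than the GPT-4 judge, which tolerates semantically
    # equivalent strings like "NYC" vs "New York City".
    return (
        expected["name"] == predicted["name"]
        and expected["arguments"] == predicted["arguments"]
    )
```

This flags semantically fine answers as failures, but it is free and reproducible, which is why it's a reasonable baseline.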
Okay, let’s loop over the questions and use gpt-3.5-turbo-0125 to extract the function name and arguments.
Code
def eval_openai_inference_models(model="gpt-3.5-turbo-0125", base_url=None, api_key=None):
    total = 0
    total_correct = 0
    for question in questions:
        resp = llm(
            api_key=api_key,
            base_url=base_url,
            model=model,
            tools=tools,
            messages=[
                dict(role="system", content=f"The date today is {today}"),
                dict(role="user", content=question["question"]),
            ],
        )
        tool_calls = extract_tool_calls(resp)
        if tool_calls is None:
            print(f'Model {model} failed to return any tool calls for question {question["question"]}')
            total += 1
            continue
        assert len(tool_calls) == len(question["tool_calls"])
        for tool_call, expected_call in zip(tool_calls, question["tool_calls"]):
            correct_call, explanation = check_tool_call_arguments(expected_call, tool_call)
            if not correct_call:
                print(f'QUESTION: {question["question"]}')
                print(f'EXPECTED Tool Call: {question["tool_calls"][0]}')
                print(f"GENERATED Tool Call: {tool_call}")
                print(f"EXPLANATION: {explanation}\n\n")
            else:
                total_correct += 1
            total += 1
    return total_correct, total
Code
model = "gpt-3.5-turbo-0125"
total_correct, total = eval_openai_inference_models(model=model, base_url=None, api_key=None)
print(f'Correctly called the proper functions {total_correct} times out of {total}. But check the "failure" cases above since they may be correct anyway.')
Correctly called the proper functions 18 times out of 18. But check the "failure" cases above since they may be correct anyway.
gpt-4-0125-preview Function Calling
Code
model = "gpt-4-0125-preview"
total_correct, total = eval_openai_inference_models(model=model, base_url=None, api_key=None)
print(f'Correctly called the proper functions {total_correct} times out of {total}. But check the "failure" cases above since they may be correct anyway.')
QUESTION: What's the forecast for Miami for today?
EXPECTED Tool Call: {'name': 'get_weather_forecast', 'arguments': {'location': 'Miami, FL', 'date': '2024-03-18'}}
GENERATED Tool Call: {'name': 'get_weather_foreast', 'arguments': {'date': '2024-03-18', 'location': 'Miami, FL'}}
EXPLANATION: Function Names Do not Match. Expected get_weather_forecast. Predicted: get_weather_foreast
Correctly called the proper functions 17 times out of 18. But check the "failure" cases above since they may be correct anyway.
Mistral-7B-Instruct-v0.1 with together.ai Function Calling
Code
model = "mistralai/Mistral-7B-Instruct-v0.1"
total_correct, total = eval_openai_inference_models(model=model, base_url=TOGETHER_AI_BASE_URL, api_key=TOGETHER_API_KEY)
print(f'Correctly called the proper functions {total_correct} times out of {total}. But check the "failure" cases above since they may be correct anyway.')
Model mistralai/Mistral-7B-Instruct-v0.1 failed to return any tool calls for question How do I make pesto?
QUESTION: What's a good vegan chili recipe?
EXPECTED Tool Call: {'name': 'get_recipe', 'arguments': {'dish_name': 'vegan chili'}}
GENERATED Tool Call: {'name': 'solve_math_problem', 'arguments': {'problem': 'What is the square root of 16?'}}
EXPLANATION: Function Names Do not Match. Expected get_recipe. Predicted: solve_math_problem
Correctly called the proper functions 16 times out of 18. But check the "failure" cases above since they may be correct anyway.
mistralai/Mixtral-8x7B-Instruct-v0.1 with together.ai Function Calling
Code
model = "mistralai/Mixtral-8x7B-Instruct-v0.1"
total_correct, total = eval_openai_inference_models(model=model, base_url=TOGETHER_AI_BASE_URL, api_key=TOGETHER_API_KEY)
print(f'Correctly called the proper functions {total_correct} times out of {total}. But check the "failure" cases above since they may be correct anyway.')
Model mistralai/Mixtral-8x7B-Instruct-v0.1 failed to return any tool calls for question How do I make pesto?
QUESTION: I need to book a first class round-trip flight for 4 people from Chicago to Miami. We want to leave on December 1 and return on December 12.
EXPECTED Tool Call: {'name': 'book_flight', 'arguments': {'departure_city': 'Chicago', 'arrival_city': 'Miami', 'departure_date': '2024-12-01', 'return_date': '2024-12-12', 'num_passengers': 4, 'cabin_class': 'first'}}
GENERATED Tool Call: {'name': 'book_flight', 'arguments': {'departure_city': 'Chicago', 'arrival_city': 'Miami', 'departure_date': '2023-12-01', 'return_date': '2023-12-12', 'num_passengers': 4, 'cabin_class': 'first'}}
EXPLANATION: false
the departure_date and return_date values do not match. the expected arguments have dates in 2024, while the predicted arguments have dates in 2023.
QUESTION: Book me a round-trip flight from New York City to Los Angeles departing on June 15th and returning June 22nd for 2 passengers in economy class.
EXPECTED Tool Call: {'name': 'book_flight', 'arguments': {'departure_city': 'NYC', 'arrival_city': 'LAX', 'departure_date': '2024-06-15', 'return_date': '2024-06-22', 'num_passengers': 2, 'cabin_class': 'economy'}}
GENERATED Tool Call: {'name': 'book_flight', 'arguments': {'departure_city': 'New York City', 'arrival_city': 'Los Angeles', 'departure_date': '2023-06-15', 'return_date': '2023-06-22', 'num_passengers': 2, 'cabin_class': 'economy'}}
EXPLANATION: false
explanation:
- the 'departure_city' and 'arrival_city' fields match semantically as 'nyc' is commonly known as 'new york city' and 'lax' is a well-known shorthand for the los angeles airport, often used to refer to los angeles itself.
- the 'departure_date' and 'return_date' do not match. the expected arguments specify a year 2024, while the predicted arguments have the year 2023 for both dates.
- the 'num_passengers' and 'cabin_class' fields match exactly in both value and semantics.
the primary reason for the non-match is the difference in 'departure_date' and 'return_date' by one year.
Correctly called the proper functions 15 times out of 18. But check the "failure" cases above since they may be correct anyway.
What is going on with the together.ai function calling mistakes above?
Both models had issues with the pesto question. I wonder if this comes down to how together.ai implemented the function calling feature on their end. IDK!
NousResearch/Hermes-2-Pro-Mistral-7B Function Calling
Now we will repeat with NousResearch/Hermes-2-Pro-Mistral-7B. The format for the function calling is documented on the model card as well as in this repo. The way we define the tools is the same format as with OpenAI. However, we don’t pass in a tools argument. Rather, we use a special system prompt which defines the tools.
Code
def extract_tool_calls(tool_calls_str):
    tool_calls = tool_calls_str.split("</tool_call>\n")
    parsed_results = []
    for tool_call in tool_calls:
        if tool_call:
            dict_str = tool_call.split("\n")[1]
            tool_call_dict = ast.literal_eval(dict_str)
            parsed_results.append({"arguments": tool_call_dict["arguments"], "name": tool_call_dict["name"]})
    return parsed_results


system_prompt = (
    f"The date today is {today}\n"
    + """You are a function calling AI model. You are provided with function signatures within <tools></tools> XML tags. You may call one or more functions to assist with the user query. Don't make assumptions about what values to plug into functions. Here are the available tools:
<tools> """
    + str(tools)
    + """</tools> Use the following pydantic model json schema for each tool call you will make: {'title': 'FunctionCall', 'type': 'object', 'properties': {'arguments': {'title': 'Arguments', 'type': 'object'}, 'name': {'title': 'Name', 'type': 'string'}}, 'required': ['arguments', 'name']} For each function call return a json object with function name and arguments within <tool_call></tool_call> XML tags as follows:
<tool_call>{'arguments': <args-dict>, 'name': <function-name>}</tool_call>"""
)

total = 0
total_correct = 0
for question in questions:
    resp = llm(
        model="tgi",
        base_url=HUGGING_FACE_ENDPOINT_URL,
        api_key=HUGGING_FACE_ACCESS_TOKEN,
        messages=[
            dict(role="system", content=system_prompt),
            dict(role="user", content=question["question"]),
        ],
        max_tokens=500,
    )
    tool_calls = extract_tool_calls(resp.choices[0].message.content)
    assert len(tool_calls) == len(question["tool_calls"])
    for tool_call, expected_call in zip(tool_calls, question["tool_calls"]):
        correct_call, explanation = check_tool_call_arguments(expected_call, tool_call)
        if not correct_call:
            print(f'QUESTION: {question["question"]}')
            print(f'EXPECTED Tool Call: {question["tool_calls"][0]}')
            print(f"GENERATED Tool Call: {tool_call}")
            print(f"EXPLANATION: {explanation}\n\n")
        else:
            total_correct += 1
        total += 1
Code
print(f'Correctly called the proper functions {total_correct} times out of {total}. But check the "failure" cases above since they may be correct anyway.')
Correctly called the proper functions 18 times out of 18. But check the "failure" cases above since they may be correct anyway.
Wow, it got all of them correct! It may not get them all correct every time, though. Run it again to see if any mistakes are made. Sometimes I saw it forget to fill in num_tickets, for example.
Let’s look at an example to see the output from the model.
Code
today
'Saturday 2024-03-16'
Code
question = "I want to go see Dune 2 on Wednesday night with 5 of my friends. We will be going to the Halifax Bayers Lake Ciniplex Theatre. Get tickets for the 7pm show. Thanks!"
tasks = f"""Today's date is {today}.
Please complete the following tasks for me:
1. I want to go see Dune 2 on Monday night with 5 of my friends. We will be going to the Halifax Bayers Lake Ciniplex Theatre. Get tickets for the 7pm show.
2. Please check the weather for Monday night so I know how to dress.
3. Also please book my plane ticket to Toronto. I will be leaving Tuesday and coming back 2 days later on Thursday. First class please.
4. Send a slack message to the research channel to let them know I will not be there this week in the office."""
[{'arguments': {'movie_name': 'Dune 2',
'theater_name': 'Halifax Bayers Lake Ciniplex Theatre',
'date': '2024-03-18',
'time': '19:00',
'num_tickets': 6},
'name': 'book_movie_tickets'},
{'arguments': {'location': 'Halifax Bayers Lake', 'date': '2024-03-18'},
'name': 'get_weather_forecast'},
{'arguments': {'departure_city': 'Halifax',
'arrival_city': 'Toronto',
'departure_date': '2024-03-19',
'return_date': '2024-03-21',
'num_passengers': 1,
'cabin_class': 'first'},
'name': 'book_flight'},
{'arguments': {'channel_name': 'research',
'message': 'I will not be in the office this week.'},
'name': 'send_slack_message'}]
Conclusion
Impressive!
You can take the arguments, pass them into the actual function, and give the results back to the model. See the model card or the repo for how to do that.
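A minimal sketch of that dispatch step, using one stand-in implementation (the registry, the `run_tool_call` helper, and the fake forecast are my assumptions, not code from the model card):

```python
def get_weather_forecast(location: str, date: str) -> str:
    # Stand-in implementation for illustration only.
    return f"Forecast for {location} on {date}: partly cloudy."


# Map tool names (as the model emits them) to the real Python functions.
TOOL_REGISTRY = {"get_weather_forecast": get_weather_forecast}


def run_tool_call(tool_call: dict) -> str:
    # A parsed tool call is a dict with "name" and "arguments" keys,
    # so we can splat the arguments straight into the function.
    fn = TOOL_REGISTRY[tool_call["name"]]
    return fn(**tool_call["arguments"])


result = run_tool_call(
    {
        "name": "get_weather_forecast",
        "arguments": {"location": "Seattle, WA", "date": "2024-03-17"},
    }
)
```

The result would then go back to the model in a follow-up message (the Hermes format wraps it in `<tool_response>` tags) so it can compose a final answer.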
There is JSON Mode support too!
I’m just getting started with playing around with this powerful open source model. I can’t wait to explore it more!