Agent AI 101: Starting the journey of building an AI agent

The artificial intelligence industry is developing rapidly. It's impressive, and at times, overwhelming.
I have been researching, learning, and building my foundation in data science, because I believe the future of data science is closely tied to the development of generative AI.
Just the other day I built my first AI agent, and within a few weeks there were already several Python packages to choose from, not to mention the well-developed no-code options such as n8n.
From the "simple" models that can chat with us to the tsunami of ubiquitous AI agents that search the internet, process files, and perform entire data science projects (from EDA to modeling and evaluation), all of this happened in just a few short years.
Wow!
Seeing all of this, my thought was: "I need to get on board soon." After all, it's better to surf the wave than to be swallowed by it.
So I decided to start this series of posts, in which we will build our first AI agent from the ground up and work toward more complex concepts.
Enough talk; let's dive in.
Basics of AI Agents
We create an AI agent when we give an LLM the ability to interact with tools and perform useful actions for us. So instead of just being a chatbot, it can now manage our calendar, search the internet, write social media posts, and the list goes on…
AI agents can do useful things, not just chat.
But how do we give this power to an LLM?
The simple answer is to use an API to interact with the LLM. Today there are several Python packages for that. If you follow my blog, you will know that I have already tried a few of them to build agents, such as LangChain, Agno (formerly Phidata), and CrewAI. For this series, I'll stick with Agno [1].
First, create a virtual environment using uv, Anaconda, or your preferred environment manager. Next, install the packages.
# Agno AI
pip install agno
# module to interact with Gemini
pip install google-generativeai
# Install these other packages that will be needed throughout the tutorial
pip install agno groq lancedb sentence-transformers tantivy youtube-transcript-api
A quick note before we continue: don't forget to get your Google Gemini API key [2].
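All of the examples in this post read the key from the GEMINI_API_KEY environment variable, so it's worth confirming it's set before running anything. A quick sanity check could look like this:

```python
import os

# The examples in this post read the Gemini key from the GEMINI_API_KEY
# environment variable. Set it in your shell first, e.g.:
#   export GEMINI_API_KEY="your-key-here"
api_key = os.environ.get("GEMINI_API_KEY")

if api_key is None:
    print("GEMINI_API_KEY is not set. Export it before running the examples.")
else:
    print("API key found.")
```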
Creating a simple agent is very straightforward. All of these packages work in a similar way: they have an Agent class (or something similar) that lets us select a model and start interacting with the LLM of our choice. Here are the main arguments of this class:
- model: the connection to the LLM. Here we choose among OpenAI, Gemini, Llama, DeepSeek, etc.
- description: this argument lets us describe the agent's behavior. It is added to the system_message, which is a similar argument.
- instructions: I like to think of agents as employees or assistants we are managing. To complete a task, we must give instructions on what needs to be done. This is where you do that.
- expected_output: here we can give instructions about the expected output.
- tools: this is what turns the LLM into an agent, enabling it to use tools to interact with the real world.
Now let's create a simple agent with no tools. It won't do much, but it will help us build intuition about the code structure.
# Imports
from agno.agent import Agent
from agno.models.google import Gemini
import os
# Create agent
agent = Agent(
    model=Gemini(id="gemini-1.5-flash",
                 api_key=os.environ.get("GEMINI_API_KEY")),
    description="An assistant agent",
    instructions=["Be succinct. Answer in a maximum of 2 sentences."],
    markdown=True
)
# Run agent
response = agent.run("What's the weather like in NYC in May?")
# Print response
print(response.content)
########### OUTPUT ###############
Expect mild temperatures in NYC during May, typically ranging from the low 50s
to the mid-70s Fahrenheit.
There's a chance of rain, so packing layers and an umbrella is advisable.
Very good. We are using the Gemini 1.5 model. Note that its response is limited by its training data: if we ask for today's weather, it will reply that it has no access to the internet.
Let's explore the instructions and expected_output arguments. This time we want a table of months, seasons, and average temperatures in New York City.
# Imports
from agno.agent import Agent
from agno.models.google import Gemini
import os
# Create agent
agent = Agent(
    model=Gemini(id="gemini-1.5-flash",
                 api_key=os.environ.get("GEMINI_API_KEY")),
    description="An assistant agent",
    instructions=["Be succinct. Return a markdown table"],
    expected_output="A table with month, season and average temperature",
    markdown=True
)
# Run agent
response = agent.run("What's the weather like in NYC for each month of the year?")
# Print response
print(response.content)
Here are the results.
| Month | Season | Average temperature (°F) |
|---|---|---|
| January | winter | 32 |
| February | winter | 35 |
| March | spring | 44 |
| April | spring | 54 |
| May | spring | 63 |
| June | summer | 72 |
| July | summer | 77 |
| August | summer | 76 |
| September | autumn | 70 |
| October | autumn | 58 |
| November | autumn | 48 |
| December | winter | 37 |
Tools
The previous replies were very good. But we naturally don't want to use powerful models like LLMs just to play with a chatbot that tells us outdated news, right?
We want them to be bridges to automation, productivity, and knowledge. Tools add functionality to our AI agents, building that bridge to the real world. Common examples of agent tools are: searching the web, running SQL, sending emails, or calling APIs.
Beyond that, we can create custom tools for the agent by using any Python function as a tool.
Tools are functions the agent can run to perform tasks.
On the code side, adding tools to the agent is just a matter of using the tools argument of the Agent class.
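Under the hood, tool calling boils down to a simple loop: the model sees the tools' names and descriptions, decides which one to call and with which arguments, and the framework executes the function and feeds the result back. Here is a minimal hand-rolled sketch of that idea (no Agno; all function names here are hypothetical, and Agno automates this plumbing for you):

```python
# A conceptual sketch of the tool-calling loop that agent frameworks automate.
# All names here are hypothetical; Agno handles this plumbing for you.

def search_web(query: str) -> str:
    """Pretend web search tool."""
    return f"Top result for '{query}'"

def send_email(to: str, body: str) -> str:
    """Pretend email tool."""
    return f"Email sent to {to}"

# The framework exposes the tools to the model by name.
TOOLS = {"search_web": search_web, "send_email": send_email}

def run_tool_call(tool_name: str, **kwargs) -> str:
    """Execute the tool the model asked for and return its result."""
    if tool_name not in TOOLS:
        return f"Unknown tool: {tool_name}"
    return TOOLS[tool_name](**kwargs)

# In a real agent, the LLM produces this call; here we hard-code it.
result = run_tool_call("search_web", query="weather in NYC")
print(result)  # Top result for 'weather in NYC'
```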
Imagine a solopreneur (a one-person company) in the healthy-living business who wants to automate their content generation. This person posts tips about healthy habits every day. I know for a fact that content generation is not as simple as it seems: it requires creativity, research, and copywriting skills. So if it can be automated, even partially, that's time saved.
So we write this code to create a very simple agent that can generate a simple Instagram post and save it to a file for our review. We reduce the process from think > research > write > review > post to just review > post.
# Imports
import os
from agno.agent import Agent
from agno.models.google import Gemini
from agno.tools.file import FileTools
# Create agent
agent = Agent(
    model=Gemini(id="gemini-1.5-flash",
                 api_key=os.environ.get("GEMINI_API_KEY")),
    description="You are a social media marketer specialized in creating engaging content.",
    tools=[FileTools(
        read_files=True,
        save_files=True
    )],
    show_tool_calls=True)
# Writing and saving a file
agent.print_response("""Write a short post for Instagram with tips and tricks
                     that positions me as an authority in healthy eating
                     and save it to a file named 'post.txt'.""",
                     markdown=True)
As a result, we have the following.
Unlock Your Best Self Through Healthy Eating:
1. Prioritize whole foods: Load up on fruits, vegetables, lean proteins, and whole
grains. They're packed with nutrients and keep you feeling full and energized.
2. Mindful eating: Pay attention to your body's hunger and fullness cues.
Avoid distractions while eating.
3. Hydrate, hydrate, hydrate: Water is crucial for digestion, energy levels,
and overall health.
4. Don't deprive yourself: Allow for occasional treats.
Deprivation can lead to overeating later. Enjoy everything in moderation!
5. Plan ahead: Prep your meals or snacks in advance to avoid unhealthy
impulse decisions.
#healthyeating #healthylifestyle #nutrition #foodie
#wellbeing #healthytips #eatclean #weightloss #healthyrecipes
#nutritiontips #instahealth #healthyfood #mindfuleating #wellnessjourney
#healthcoach
Of course, we could go further: creating a crew with other agents to search a list of websites for content, inspect and review that content, and even generate images for the posts. But I believe you got the general idea of how to add a tool to an Agent.
Another type of tool we can add is a function tool: we can use any Python function as a tool for the LLM. Just don't forget to add type hints, such as video_id: str, so the model knows what to use as input to the function. Otherwise, you may see an error.
Let’s take a brief look at how it works.
Now, we want our agent to take a given YouTube video and summarize it. To perform this task, we simply create a function that downloads the video's transcript from YouTube and passes it to the model for summarization.
# Imports
import os
from agno.agent import Agent
from agno.models.google import Gemini
from youtube_transcript_api import YouTubeTranscriptApi
# Get YT transcript
def get_yt_transcript(video_id: str) -> str:
    """
    Use this function to get the transcript from a YouTube video using the video id.

    Parameters
    ----------
    video_id : str
        The id of the YouTube video.

    Returns
    -------
    str
        The transcript of the video.
    """
    # Instantiate the API client
    ytt_api = YouTubeTranscriptApi()
    # Fetch the transcript
    yt = ytt_api.fetch(video_id)
    # Join the lines with spaces so words don't run together
    return ' '.join([line.text for line in yt])
# Create agent
agent = Agent(
    model=Gemini(id="gemini-1.5-flash",
                 api_key=os.environ.get("GEMINI_API_KEY")),
    description="You are an assistant that summarizes YouTube videos.",
    tools=[get_yt_transcript],
    expected_output="A summary of the video with the 5 main points and 2 questions for me to test my understanding.",
    markdown=True,
    show_tool_calls=True)
# Run agent
agent.print_response("""Summarize the text of the video with the id 'hrZSfMly_Ck' """,
markdown=True)
Then you have the results.
Agents with reasoning
Another cool option in the Agno package is that it lets us easily create agents that can analyze a situation before answering a question: reasoning tools.
We will use Alibaba's Qwen QwQ-32B model to create a reasoning agent. Note that the only difference here, besides the model, is that we are adding ReasoningTools() to the tools.
The add_instructions=True argument provides detailed instructions to the agent, improving the reliability and accuracy of its tool usage, while setting it to False forces the agent to rely on its own reasoning, which may be more error-prone.
# Imports
import os
from agno.agent import Agent
from agno.models.groq import Groq
from agno.tools.reasoning import ReasoningTools
# Create agent with reasoning
agent = Agent(
    model=Groq(id="qwen-qwq-32b",
               api_key=os.environ.get("GROQ_API_KEY")),
    description="You are an experienced math teacher.",
    tools=[ReasoningTools(add_instructions=True)],
    show_tool_calls=True)
# Run agent showing the full reasoning
agent.print_response("""Explain the concept of sin and cosine in simple terms.""",
                     stream=True,
                     show_full_reasoning=True,
                     markdown=True)
Here is the output.

Agents with knowledge
This feature is the easiest way I have found to create a Retrieval-Augmented Generation (RAG) setup. Using it, you can point the agent to a website or a list of websites, and the content is added to a vector database, becoming searchable. When asked, the agent can then use that content as part of its answer.
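Conceptually, RAG has two steps: index documents as vectors, then retrieve the most similar ones and hand them to the model as context. Here is a toy sketch of the retrieval half, scoring documents by simple word overlap instead of real embeddings (all names and documents here are hypothetical; Agno and LanceDb do the real work in the example below):

```python
# A toy sketch of the "retrieval" in Retrieval-Augmented Generation.
# Real systems use embedding models and a vector DB; here we score documents
# by word overlap just to show the idea.

def score(query: str, doc: str) -> int:
    """Count how many query words appear in the document."""
    return len(set(query.lower().split()) & set(doc.lower().split()))

def retrieve(query: str, docs: list[str], k: int = 1) -> list[str]:
    """Return the k documents most similar to the query."""
    return sorted(docs, key=lambda d: score(query, d), reverse=True)[:k]

# The "knowledge base": chunks of text from the indexed pages (made up here).
knowledge = [
    "The two books listed on the page are about statistics and Python.",
    "Contact information and social media links.",
    "A table of data science projects with descriptions.",
]

# Retrieve the most relevant chunk and prepend it to the prompt as context.
context = retrieve("Which books are listed there?", knowledge)[0]
prompt = f"Context: {context}\n\nQuestion: Which books are listed there?"
print(prompt)
```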
In this simple example, I indexed one page of a website and asked the agent which books are listed there.
# Imports
import os
from agno.agent import Agent
from agno.models.google import Gemini
from agno.knowledge.url import UrlKnowledge
from agno.vectordb.lancedb import LanceDb, SearchType
from agno.embedder.sentence_transformer import SentenceTransformerEmbedder
# Load webpage to the knowledge base
agent_knowledge = UrlKnowledge(
    urls=["
    vector_db=LanceDb(
        uri="tmp/lancedb",
        table_name="projects",
        search_type=SearchType.hybrid,
        # Use Sentence Transformer for embeddings
        embedder=SentenceTransformerEmbedder(),
    ),
)
# Create agent
agent = Agent(
    model=Gemini(id="gemini-2.0-flash", api_key=os.environ.get("GEMINI_API_KEY")),
    instructions=[
        "Use tables to display data.",
        "Search your knowledge before answering the question.",
        "Only include the content from the agent_knowledge base table 'projects'",
        "Only include the output in your response. No other text.",
    ],
    knowledge=agent_knowledge,
    add_datetime_to_instructions=True,
    markdown=True,
)
if __name__ == "__main__":
    # Load the knowledge base; you can comment this out after the first run
    # Set recreate to True to recreate the knowledge base if needed
    agent.knowledge.load(recreate=False)
    agent.print_response(
        "What are the two books listed in the 'agent_knowledge'",
        stream=True,
        show_full_reasoning=True,
        stream_intermediate_steps=True,
    )

Agents with memory
The last type of agent we will see in this post is the agent with memory.
This type of agent can store and retrieve information about the user from previous interactions, allowing it to learn the user's preferences and personalize its responses.
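Conceptually, agent memory is just a per-user store of facts extracted from the conversation and injected back into future prompts. Here is a toy sketch of that idea (the class and method names are hypothetical; in Agno, an LLM extracts the facts and SqliteMemoryDb persists them, as in the example below):

```python
# A toy sketch of per-user agent memory. In Agno, an LLM extracts the facts
# and SqliteMemoryDb persists them; here we store plain strings in a dict.
from collections import defaultdict

class ToyMemory:
    def __init__(self):
        self._store = defaultdict(list)

    def add(self, user_id: str, fact: str) -> None:
        """Remember a fact about a user."""
        self._store[user_id].append(fact)

    def get_user_memories(self, user_id: str) -> list[str]:
        """Retrieve everything we know about a user."""
        return list(self._store[user_id])

    def build_context(self, user_id: str) -> str:
        """Format the memories for injection into the next prompt."""
        facts = self.get_user_memories(user_id)
        return ("Known about the user: " + "; ".join(facts)) if facts else ""

memory = ToyMemory()
memory.add("data_scientist", "Name is Gustavo")
memory.add("data_scientist", "Learning about AI agents")
print(memory.build_context("data_scientist"))
# Known about the user: Name is Gustavo; Learning about AI agents
```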
Let's look at this example, where I tell the agent a few things and then ask for suggestions based on that interaction.
# imports
import os
from agno.agent import Agent
from agno.memory.v2.db.sqlite import SqliteMemoryDb
from agno.memory.v2.memory import Memory
from agno.models.google import Gemini
from rich.pretty import pprint
# User Name
user_id = "data_scientist"
# Creating a memory database
memory = Memory(
    db=SqliteMemoryDb(table_name="memory",
                      db_file="tmp/memory.db"),
    model=Gemini(id="gemini-2.0-flash",
                 api_key=os.environ.get("GEMINI_API_KEY"))
)
# Clear the memory before start
memory.clear()
# Create the agent
agent = Agent(
    model=Gemini(id="gemini-2.0-flash", api_key=os.environ.get("GEMINI_API_KEY")),
    user_id=user_id,
    memory=memory,
    # Enable the Agent to dynamically create and manage user memories
    enable_agentic_memory=True,
    add_datetime_to_instructions=True,
    markdown=True,
)
# Run the code
if __name__ == "__main__":
    agent.print_response("My name is Gustavo and I am a Data Scientist learning about AI Agents.")
    memories = memory.get_user_memories(user_id=user_id)
    print(f"Memories about {user_id}:")
    pprint(memories)
    agent.print_response("What topic should I study about?")
    agent.print_response("I write articles for Towards Data Science.")
    print(f"Memories about {user_id}:")
    pprint(memories)
    agent.print_response("Where should I post my next article?")

And here we end our first article about AI agents.
Before moving forward
There is a lot in this article. We have climbed the first step in learning about AI agents. I know it's overwhelming: there is so much information out there that it is increasingly difficult to know where to start and what to learn.
My advice is to follow the same path I did: choose only a couple of packages at a time, like Agno or CrewAI, and dig into them to learn how to create more complex agents each time.
In this post, we started from scratch and learned how to interact with LLMs, create agents with tools and memory, and even build a simple RAG for an AI agent.
Obviously, there is much more a single agent can do. Check out reference [4].
With these simple skills, you are surely ahead of many people, and you can already do a lot of things. Just use your creativity and (why not?) ask an LLM for help to build something cool!
In the next article, we will learn more about agents and evaluations. Stay tuned!
GitHub repository
Contact and find me online
If you like this content, please find more about my work and social media on my website:
References
[1]
[2]
[3]
[4]
[5] /Proxy#Proxy-Knowledge