Technical Writer

In this ever-changing era of technology, artificial intelligence (AI) is driving innovation and transforming industries. Among the various advancements within AI, the development and deployment of AI agents are known to reshape how businesses operate, enhance user experiences, and automate complex tasks.
AI agents, which are software entities capable of performing specific tasks autonomously, have become indispensable in many applications, ranging from customer service chatbots to advanced data analysis tools to finance agents.
In this article, we will create a basic AI agent to explore the significance, functionalities, and technological frameworks that facilitate these agents’ creation and deployment. Specifically, we will understand LangGraph and Ollama, two powerful tools that simplify building local AI agents.
By the end of this guide, you will have a comprehensive understanding of leveraging these technologies to create efficient and effective AI agents tailored to your specific needs.
Key takeaways:
AI agents are entities or systems that perceive their environment and take actions to achieve specific goals or objectives. These agents can range from simple algorithms to sophisticated systems capable of complex decision-making. Here are some key points about AI agents:
An example of AI agents in action is healthcare systems that analyze patient data from various sources, such as: medical records, test results, and real-time monitoring devices. These AI agents can use this data to make informed decisions, such as predicting the likelihood of a patient developing a specific condition or recommending personalized treatment plans based on the patient’s medical history and current health status.
Info: DigitalOcean’s GradientAI offers businesses a fully-managed service to build and deploy custom AI agents. With access to leading models from Meta, Mistral AI, and Anthropic, along with essential features like RAG workflows and guardrails, the platform makes it easier than ever to integrate powerful AI capabilities into your applications.
RAG (Retrieval-Augmented Generation) applications and AI agents refer to different concepts within artificial intelligence.
RAG is used to improve the performance of LLM models by incorporating information retrieval methods. The retrieval system searches for relevant documents or information from a large corpus based on the input query. The generative model (e.g., a transformer-based language model) then uses this retrieved information to generate more accurate and contextually relevant responses. This helps increase the generated content’s accuracy due to the integration of retrieved information. Furthermore, this technique removes the need to fine-tune or train a LLM on new data.
On the other hand, AI agents are autonomous software entities designed to perform specific tasks or a series of tasks. They operate based on predefined rules, machine learning models, or both. They often interact with users or other systems to gather inputs, provide responses, or execute actions. Some AI agent’s performance increases as they can learn and adapt over time based on new data and experiences. AI can handle multiple tasks simultaneously, providing scalability for businesses.
| RAG | AI Agent | 
|---|---|
| RAG is a technique used to improve the performance of generative models by incorporating information retrieval methods | An AI personal assistant can perform autonomous tasks and make decisions | 
| Retrieval system + generative model | Rule-based systems, machine learning models, or a combination of AI techniques | 
| Improved accuracy and relevance, leverage external data | Improved versatility, adaptability | 
| Question answering, customer support, content generation | Virtual assistants, autonomous vehicles, recommendation systems | 
| Ability to leverage large, external datasets for enhancing generative responses without requiring the generative model itself to be trained on all that data | Capability to interact with users and adapt to changing requirements or environments. | 
| A chatbot that retrieves relevant FAQs or knowledge base articles to answer user queries more effectively. | A recommendation engine that suggests products or content based on user preferences and behavior. | 
In summary, RAG applications are specifically designed to enhance the capabilities of generative models by incorporating retrieval mechanisms; AI agents are broader entities intended to perform a wide array of tasks autonomously.
LangGraph is a powerful library for building stateful, multi-actor applications using large language models (LLMs). It helps create complex workflows involving single or multiple agents, offering critical advantages like cycles, controllability, and persistence.
LangGraph is inspired by technologies like Pregel and Apache Beam, with a user-friendly interface similar to NetworkX. Developed by LangChain Inc., it offers a robust tool for building reliable, advanced AI-driven applications.
Ollama is an open-source project that makes running LLMs on your local machine easy and user-friendly. It provides a user-friendly platform that simplifies the complexities of LLM technology, making it accessible and customizable for users who want to harness the power of AI without needing extensive technical expertise.
It is easy to install. Furthermore, we have a selection of models and a comprehensive set of features and functionalities designed to enhance the user experience.
In this demo, we will create a simple example of an agent using the Mistral model. This agent can search the web using the Tavily Search API and generate responses.
We will start by installing Langgraph, a library designed to build stateful, multi-actor applications with LLMs that are ideal for creating agent and multi-agent workflows. Inspired by Pregel, Apache Beam, and NetworkX, LangGraph is developed by LangChain Inc. and can be used independently of LangChain.
We will use Mistral as our LLM model, which will be integrated with Ollama and Tavily’s Search API. Tavily’s API is optimized for LLMs, providing a factual, efficient, persistent search experience.
Before we begin with the installation, let us check our GPU. You can open a terminal and type the code below to check your GPU config.
nvidia-smi

Now, we will start with our installations.
pip install -U langgraph
pip install -U langchain-nomic langchain_community tiktoken langchainhub chromadb langchain langgraph tavily-python
pip install langchain-openai
After completing the installations, we will move on to the next crucial step: providing the Travily API key.
export TAVILY_API_KEY="apikeygoeshere"
Now, we will run the code below to fetch the model. Please try this using Llama or any other version of Mistral.
ollama pull mistral
Import all the necessary libraries required to build the agent.
from langchain import hub
from langchain_community.tools.tavily_search import TavilySearchResults
from langchain.prompts import PromptTemplate
from langgraph.prebuilt import create_react_agent
from langchain_openai import ChatOpenAI
from langchain_core.output_parsers import JsonOutputParser
from langchain_community.chat_models import ChatOllama
We will start by defining the tools we want to use and bind the tools with the llm. For this simple example, we will utilize a built-in search tool via Tavily.
tools = [TavilySearchResults(max_results=3)]
llm_with_tools = llm.bind_tools(tools)
The below code snippet retrieves a prompt template and prints it in a readable format. This template can then be used or modified as needed for the application.
prompt = hub.pull("wfh/react-agent-executor")
prompt.pretty_print()
Next, we will configure the use of Mistral via the Ollama platform.
llm = ChatOpenAI(model="mistral", api_key="ollama",     base_url="http://localhost:11434/v1",
)
Finally, we will create an agent executor using our language model (llm), a set of tools (tools), and a prompt template (prompt). The agent is configured to react to inputs, utilize the tools, and generate responses based on the specified prompt, enabling it to perform tasks in a controlled and efficient manner.
agent_executor = create_react_agent(llm, tools, messages_modifier=prompt)
================================ System Message ================================
You are a helpful assistant.
============================= Messages Placeholder =============================
{{messages}}
The given code snippet invokes the agent executor to process the input message. This step aims to send a query to the agent executor and receive a response. The agent will use its configured language model (Mistral in this case), tools, and prompts to process the message and generate an appropriate reply
response = agent_executor.invoke({"messages": [("user", "explain artificial intelligence")]})
for message in response['messages']:
    print(message.content)
and this will generate the below response.

Q: How to build local AI agents that work offline in 2025?
A: Building offline AI agents in 2025 requires combining LangGraph for agent orchestration with Ollama for local model serving. Start by installing Ollama and downloading appropriate models (Llama 2, Code Llama, or specialized models). Set up LangGraph to create stateful workflows with decision trees, loops, and memory persistence. Implement local vector databases like Chroma or FAISS for knowledge retrieval. Design agent workflows that handle common tasks without internet connectivity. Test thoroughly for edge cases and implement fallback mechanisms. This approach ensures complete data privacy and consistent performance regardless of internet availability.
Q: What are the best local LLM models for business applications with Ollama?
A: For business applications with Ollama in 2025, top models include Llama 2 70B for general business tasks and complex reasoning, Code Llama for software development and technical documentation, Mistral 7B for efficient customer service and content generation, and Phi-3 for resource-constrained environments. Specialized models like WizardCoder for programming tasks and Vicuna for conversational applications also perform well. Consider model size vs. performance trade-offs: 7B models for basic tasks, 13B for moderate complexity, and 70B+ for sophisticated reasoning. Always evaluate models on your specific use cases and hardware constraints.
Q: LangGraph vs LangChain: which framework is better for AI agents?
A: LangGraph excels for complex, stateful AI agents requiring advanced control flow, loops, conditional branching, and state persistence across interactions. It’s ideal for multi-step workflows, error recovery, and sophisticated decision-making processes. LangChain works better for simpler, linear workflows and rapid prototyping with extensive third-party integrations. Choose LangGraph for production agent systems requiring complex logic, state management, and robust error handling. Choose LangChain for quick prototypes, simple chains, and when you need extensive ecosystem integrations. LangGraph offers more control and flexibility for sophisticated agent behaviors.
Q: How to optimize local AI agent performance with limited hardware resources?
A: Optimizing local AI agents on limited hardware involves several strategies: Use smaller, efficient models like Phi-3 or Mistral 7B instead of larger variants. Implement model quantization (4-bit or 8-bit) to reduce memory usage. Use CPU-optimized inference engines and enable hardware acceleration where available. Implement intelligent caching to store frequent responses and computations. Design efficient prompting strategies to minimize token usage. Use streaming responses to improve perceived performance. Implement request batching and queue management. Consider hybrid approaches where simple tasks use lightweight models and complex tasks use larger models selectively.
Q: What are the security benefits of running AI agents locally vs cloud APIs?
A: Local AI agents provide significant security advantages over cloud APIs: Complete data privacy since sensitive information never leaves your infrastructure, eliminating data breach risks from third-party services. No dependency on external APIs reduces attack surface and prevents service disruptions. Full control over model updates and behavior prevents unexpected changes in AI responses. Reduced risk of prompt injection attacks through controlled environments. Cost predictability without usage-based pricing fluctuations. However, local deployment requires infrastructure management and security hardening responsibilities.
LangGraph and tools like AI Agents and Ollama represent a significant step forward in developing and deploying localized artificial intelligence solutions. By leveraging LangGraph’s ability to streamline various AI components and its modular architecture, developers can create versatile and scalable AI solutions that are efficient and highly adaptable to changing needs.
As our blog describes, AI Agents offer a flexible approach to automating tasks and enhancing productivity. These agents can be customized to handle various functions, from simple task automation to complex decision-making processes, making them indispensable tools for modern businesses.
Ollama, as part of this ecosystem, provides additional support by offering specialized tools and services that complement LangGraph’s capabilities.
In summary, the integration of LangGraph and Ollama provides a robust framework for building AI agents that are both effective and efficient. This guide is a valuable resource for anyone looking to harness the potential of these technologies to drive innovation and achieve their objectives in the ever-evolving landscape of artificial intelligence.
Thanks for learning with the DigitalOcean Community. Check out our offerings for compute, storage, networking, and managed databases.
With a strong background in data science and over six years of experience, I am passionate about creating in-depth content on technologies. Currently focused on AI, machine learning, and GPU computing, working on topics ranging from deep learning frameworks to optimizing GPU-based workloads.
This textbox defaults to using Markdown to format your answer.
You can type !ref in this text area to quickly search our full set of tutorials, documentation & marketplace offerings and insert the link!
We have created an article on downloading and using Ollama; please check out the blog (link provided in the resource section.)
It’s not clear if one of the resources actually links to the other article. I ended up going to ollama downloads directly and using their install script.
Get simple AI infrastructure starting at $2.99/GPU/hr on-demand. Try GPU Droplets now!
Get paid to write technical tutorials and select a tech-focused charity to receive a matching donation.
Full documentation for every DigitalOcean product.
The Wave has everything you need to know about building a business, from raising funding to marketing your product.
Stay up to date by signing up for DigitalOcean’s Infrastructure as a Newsletter.
New accounts only. By submitting your email you agree to our Privacy Policy
Scale up as you grow — whether you're running one virtual machine or ten thousand.
Sign up and get $200 in credit for your first 60 days with DigitalOcean.*
*This promotional offer applies to new accounts only.