A Groq-Powered Agentic Research Assistant with LangGraph, Tool Calling, Sub-Agents, and Agentic Memory: Lets Built It

In this tutorial, we build a Groq-powered agentic research workflow that runs directly using Groq’s free OpenAI-compatible inference endpoint. We configure LangChain’s ChatOpenAI interface to work with Groq by setting the Groq API key and base URL, allowing us to use fast hosted models such as llama-3.3-70b-versatile for tool-based reasoning. We then connect the model with practical tools for web search, webpage fetching, file handling, Python execution, skill loading, sub-agent delegation, and long-term memory. By the end of the tutorial, we have a working Groq-based multi-step agent that can research a topic, delegate focused subtasks, generate structured outputs, and save useful information for later runs.

import subprocess, sys
def _pip(*a): subprocess.check_call([sys.executable,"-m","pip","install","-q",*a])
_pip("langgraph>=0.2.50", "langchain>=0.3.0", "langchain-openai>=0.2.0",
    "langchain-community>=0.3.0", "ddgs", "requests", "beautifulsoup4",
    "tiktoken", "pydantic>=2.0")


import os, getpass
if not os.environ.get("GROQ_API_KEY"):
   os.environ["GROQ_API_KEY"] = getpass.getpass("GROQ_API_KEY (free at console.groq.com/keys): ")


os.environ["OPENAI_API_KEY"]  = os.environ["GROQ_API_KEY"]
os.environ["OPENAI_BASE_URL"] = "https://api.groq.com/openai/v1"


MODEL_NAME = "llama-3.3-70b-versatile"


import json, re, io, contextlib, pathlib
from typing import Annotated, TypedDict, Sequence, Literal, List, Dict, Any
from datetime import datetime, timezone
from langchain_openai import ChatOpenAI
from langchain_core.messages import (
   SystemMessage, HumanMessage, AIMessage, ToolMessage, BaseMessage)
from langchain_core.tools import tool
from langgraph.graph import StateGraph, END
from langgraph.graph.message import add_messages
from langgraph.prebuilt import ToolNode

We install the core libraries required to build the Groq-powered agent workflow, including LangGraph, LangChain, DuckDuckGo search utilities, and supporting parsing libraries. We securely collect the Groq API key and configure Groq as an OpenAI-compatible endpoint by setting the API key and base URL. We then import all required modules for messages, tools, graph construction, typing, filesystem handling, and model initialization.

How to Design Python-First Interactive Dashboards with Prefab Reactive UI Components and Static HTML Export

Cisco AI Introduces FAPO: Pipeline-Aware Prompt Optimization With Step-Level Failure Attribution and Claude Code Orchestration

SANDBOX = pathlib.Path("/content/deerflow_sandbox").resolve()
for sub in ["uploads","workspace","outputs","skills/public","skills/custom","memory"]:
   (SANDBOX/sub).mkdir(parents=True, exist_ok=True)


def _safe(p: str) -> pathlib.Path:
   full = (SANDBOX/p.lstrip("/")).resolve()
   if not str(full).startswith(str(SANDBOX)):
       raise ValueError(f"path escapes sandbox: {p}")
   return full


SKILLS: Dict[str, Dict[str,str]] = {}
def register_skill(name, description, content, location="public"):
   d = SANDBOX/"skills"/location/name; d.mkdir(parents=True, exist_ok=True)
   (d/"SKILL.md").write_text(content)
   SKILLS[name] = {"description": description, "content": content,
                   "path": str(d/"SKILL.md")}


register_skill("research",
   "Conduct multi-source web research on a topic and produce structured notes.",
   """# Research Skill
## Workflow
1. Decompose the question into 3-5 sub-questions.
2. For each sub-question call `web_search` and pick 2 authoritative URLs.
3. `web_fetch` those URLs; extract concrete facts, numbers, dates.
4. Cross-reference for consensus vs. disagreement.
5. Append findings to `workspace/research_notes.md`: claim → evidence → URL.
## Best practices
- Prefer primary sources. Note dates. Never fabricate URLs or numbers.""")


register_skill("report-generation",
   "Synthesize research notes into a polished markdown report in outputs/.",
   """# Report Generation Skill
## Workflow
1. file_read('workspace/research_notes.md').
2. Outline: exec summary, key findings, analysis, conclusion, sources.
3. file_write('outputs/report.md', ...).
## Structure
- # Title
- ## Executive Summary  (3–5 sentences)
- ## Key Findings       (bullets)
- ## Detailed Analysis  (sections)
- ## Conclusion
- ## Sources            (numbered URL list)""")


register_skill("code-execution",
   "Run Python in the sandbox for computation, data wrangling, charts.",
   """# Code Execution Skill
1. Plan in plain language first.
2. python_exec the code; persistent artifacts go to /outputs/.
3. Verify before quoting results.""")


MEM = SANDBOX/"memory/long_term.json"
if not MEM.exists():
   MEM.write_text(json.dumps({"facts":[],"preferences":{}}, indent=2))
def _load_mem(): return json.loads(MEM.read_text())
def _save_mem(m): MEM.write_text(json.dumps(m, indent=2))

We create a sandboxed project directory in Colab to keep uploads, workspace files, outputs, skills, and memory organized in a single controlled location. We define reusable skills for research, report generation, and code execution so the agent can discover and follow structured workflows. We also initialize a simple long-term memory JSON file that stores facts and preferences across multiple runs within the same sandbox.

@tool
def list_skills() -> str:
   """List all skills with one-line descriptions. Call this first for complex tasks."""
   return "\n".join(f"- {n}: {s['description']}" for n,s in SKILLS.items())


@tool
def load_skill(name: str) -> str:
   """Load full SKILL.md for `name`. Call before running its workflow."""
   if name not in SKILLS: return f"Unknown. Available: {list(SKILLS)}"
   return SKILLS[name]["content"]


@tool
def web_search(query: str, max_results: int = 5) -> str:
   """Search the web (DuckDuckGo). Returns titles, URLs, snippets."""
   from ddgs import DDGS
   out = []
   try:
       with DDGS() as d:
           for r in d.text(query, max_results=max_results):
               out.append(f"- {r.get('title','')}\n  URL: {r.get('href','')}\n  "
                          f"{(r.get('body') or '')[:220]}")
   except Exception as e:
       return f"search error: {e}"
   return "\n".join(out) or "no results"


@tool
def web_fetch(url: str, max_chars: int = 4000) -> str:
   """Fetch a URL, return cleaned text (scripts/nav stripped)."""
   import requests
   from bs4 import BeautifulSoup
   try:
       r = requests.get(url, timeout=15,
                        headers={"User-Agent":"Mozilla/5.0 DeerFlow-Lite"})
       soup = BeautifulSoup(r.text, "html.parser")
       for s in soup(["script","style","nav","footer","aside","header"]): s.decompose()
       text = re.sub(r"\n\s*\n", "\n\n", soup.get_text("\n")).strip()
       return text[:max_chars] or "(empty page)"
   except Exception as e:
       return f"fetch error: {e}"


@tool
def file_write(path: str, content: str) -> str:
   """Write content to a sandbox path, e.g. 'workspace/notes.md' or 'outputs/x.md'."""
   p = _safe(path); p.parent.mkdir(parents=True, exist_ok=True)
   p.write_text(content)
   return f"wrote {len(content)} chars → {path}"


@tool
def file_read(path: str) -> str:
   """Read a sandbox file (first 8 KB)."""
   p = _safe(path)
   return p.read_text()[:8000] if p.exists() else f"not found: {path}"


@tool
def file_list(path: str = "") -> str:
   """List files under a sandbox dir."""
   base = _safe(path) if path else SANDBOX
   if not base.exists(): return "not found"
   items = []
   for c in sorted(base.rglob("*")):
       if "memory" in c.relative_to(SANDBOX).parts: continue
       items.append(f"  {'D' if c.is_dir() else 'F'}  {c.relative_to(SANDBOX)}")
   return "\n".join(items[:60]) or "(empty)"


@tool
def python_exec(code: str) -> str:
   """Run Python in the sandbox. SANDBOX_ROOT is preset."""
   g = {"__name__":"__sb__", "SANDBOX_ROOT": str(SANDBOX)}
   buf = io.StringIO()
   try:
       with contextlib.redirect_stdout(buf), contextlib.redirect_stderr(buf):
           exec(code, g)
       return (buf.getvalue() or "(no stdout)")[:4000]
   except Exception as e:
       return f"{type(e).__name__}: {e}\n{buf.getvalue()[:1500]}"


@tool
def remember(fact: str) -> str:
   """Persist a single fact to long-term memory (survives across runs)."""
   m = _load_mem()
   m["facts"].append({"fact": fact, "ts": datetime.now(timezone.utc).isoformat()})
   _save_mem(m)
   return f"remembered ({len(m['facts'])} total)"


@tool
def recall() -> str:
   """Retrieve everything in long-term memory."""
   m = _load_mem()
   if not m["facts"]: return "(memory empty)"
   return "\n".join(f"- {f['fact']}" for f in m["facts"][-20:])

We define the main tools the Groq-backed agent can call during execution, including listing skills, loading skill instructions, searching the web, fetching webpages, reading files, and writing files. We also provide the agent with a sandboxed Python execution environment so it can run computations or generate artifacts when needed. We add memory tools that allow the agent to remember important facts and recall previously stored information.

@tool
def spawn_subagent(role: str, task: str,
                  allowed_tools: str = "web_search,web_fetch,file_write,file_read") -> str:
   """Spawn an isolated sub-agent with a focused role and scoped tools.
   Returns its final report string. Use for parallelizable / focused subtasks."""
   bag = {t.name: t for t in BASE_TOOLS}
   sub_tools = [bag[n.strip()] for n in allowed_tools.split(",") if n.strip() in bag]
   sub_llm = ChatOpenAI(model=MODEL_NAME, temperature=0.2).bind_tools(sub_tools)
   sys_msg = SystemMessage(content=(
       f"You are a specialized sub-agent. Role: {role}.\n"
       f"You operate in an ISOLATED context — no access to lead history.\n"
       f"Tools: {', '.join(t.name for t in sub_tools)}.\n"
       "End with a final assistant message starting 'FINAL REPORT:' "
       "containing a structured ≤700-word summary including any URLs."))
   msgs: List[BaseMessage] = [sys_msg, HumanMessage(content=task)]
   for _ in range(8):
       r = sub_llm.invoke(msgs); msgs.append(r)
       if not getattr(r, "tool_calls", None):
           return f"[sub-agent: {role}]\n" + (r.content if isinstance(r.content,str) else str(r.content))
       for tc in r.tool_calls:
           t = bag.get(tc["name"])
           try:
               res = t.invoke(tc["args"]) if t else f"unknown tool {tc['name']}"
           except Exception as e:
               res = f"tool error: {e}"
           msgs.append(ToolMessage(content=str(res)[:3000], tool_call_id=tc["id"]))
   return f"[sub-agent: {role}] step-limit reached."


BASE_TOOLS = [list_skills, load_skill, web_search, web_fetch, file_write,
             file_read, file_list, python_exec, remember, recall]
ALL_TOOLS = BASE_TOOLS + [spawn_subagent]


LEAD_SYSTEM = f"""You are DeerFlow-Lite, a long-horizon super-agent harness.


Sandbox layout (relative to {SANDBOX}):
 uploads/    – user files
 workspace/  – your scratchpad
 outputs/    – final deliverables
 skills/     – capability modules (load_skill)


Principles:
 • For non-trivial tasks: list_skills → load_skill → execute.
 • Use spawn_subagent for focused subtasks (isolated context keeps lead lean).
 • Persist intermediates to workspace/, deliverables to outputs/.
 • Use remember(fact) for cross-session knowledge.
 • Finish with a short summary of what was produced and where.


Today: {datetime.now(timezone.utc).strftime('%Y-%m-%d')}."""


class AgentState(TypedDict):
   messages: Annotated[Sequence[BaseMessage], add_messages]


llm = ChatOpenAI(model=MODEL_NAME, temperature=0.3).bind_tools(ALL_TOOLS)


def call_model(state: AgentState):
   msgs = list(state["messages"])
   if not msgs or not isinstance(msgs[0], SystemMessage):
       msgs = [SystemMessage(content=LEAD_SYSTEM)] + msgs
   return {"messages": [llm.invoke(msgs)]}


def route(state: AgentState) -> Literal["tools","__end__"]:
   last = state["messages"][-1]
   return "tools" if getattr(last, "tool_calls", None) else END


g = StateGraph(AgentState)
g.add_node("agent", call_model)
g.add_node("tools", ToolNode(ALL_TOOLS))
g.set_entry_point("agent")
g.add_conditional_edges("agent", route, {"tools":"tools", END: END})
g.add_edge("tools", "agent")
APP = g.compile()

We create a sub-agent tool that allows the main Groq-powered agent to delegate focused tasks to an isolated assistant with a limited set of tools. We then collect all available tools, define the lead system prompt, initialize the Groq-backed chat model, and bind the tools to it. We finally built the LangGraph workflow so the agent can alternate between reasoning and tool execution until it reaches a final answer.

def run(task: str, max_steps: int = 25):
   print("="*78); print(f"🦌 TASK: {task}"); print("="*78)
   state = {"messages":[HumanMessage(content=task)]}
   n = 0
   for ev in APP.stream(state, {"recursion_limit": max_steps*2}, stream_mode="updates"):
       for node, payload in ev.items():
           for m in payload.get("messages", []):
               n += 1
               if isinstance(m, AIMessage):
                   if m.tool_calls:
                       for tc in m.tool_calls:
                           args = json.dumps(tc["args"], ensure_ascii=False)
                           args = args[:140] + ("…" if len(args)>140 else "")
                           print(f"[{n:02}] 🔧 {tc['name']}({args})")
                   else:
                       txt = m.content if isinstance(m.content,str) else str(m.content)
                       print(f"[{n:02}] 🦌 {txt[:800]}")
               elif isinstance(m, ToolMessage):
                   s = str(m.content).replace("\n"," ")[:220]
                   print(f"[{n:02}] 📤 {s}")
   print("\n"+"="*78); print("✅ COMPLETE — sandbox state:"); print("="*78)
   print(file_list.invoke({"path":""}))
   print("\n🧠 Long-term memory:"); print(recall.invoke({}))
   for f in sorted((SANDBOX/"outputs").rglob("*")):
       if f.is_file():
           print(f"\n--- 📄 {f.relative_to(SANDBOX)} (first 800 chars) ---")
           print(f.read_text()[:800])


run(
   "Give me a briefing on small language models (SLMs) in 2025. "
   "(1) discover skills; (2) spawn a researcher sub-agent to gather "
   "specifics on three notable SLMs from 2024-2025 with sizes, benchmarks, "
   "and use cases — sub-agent saves to workspace/slm_research.md; "
   "(3) load report-generation skill and write outputs/slm_briefing.md "
   "(~400 words) with a Sources section; (4) save the single most "
   "important takeaway to long-term memory; (5) summarize.",
   max_steps=25,
)

We define the run() function that starts a user task, streams each agent step, and prints tool calls, tool outputs, and final responses in a readable format. We also display the sandbox file structure, long-term memory, and generated output files after the workflow completes. We finish by running a demo task in which the Groq-powered agent researches small language models, prepares a briefing, saves a report, and stores one key takeaway in memory.

In conclusion, we created a compact yet capable Groq-based agent framework that demonstrates how Groq’s OpenAI-compatible API can serve as a fast, accessible backend for advanced LLM workflows. We used LangGraph to manage the agent loop, LangChain to bind tools to the Groq-hosted model, and custom Python utilities to give the system controlled access to search, files, code execution, and memory. We also demonstrated how isolated sub-agents can help handle focused research tasks while the main agent coordinates the overall workflow. Also, we finished with a practical Groq-powered agentic system that can be extended into research assistants, automated briefing generators, and multi-step AI applications.

Check out the Full Codes with Notebook here. Also, feel free to follow us on Twitter and don’t forget to join our 130k+ ML SubReddit and Subscribe to our Newsletter. Wait! are you on telegram? now you can join us on telegram as well.

Need to partner with us for promoting your GitHub Repo OR Hugging Face Page OR Product Release OR Webinar etc.? Connect with us

Source_link

A Groq-Powered Agentic Research Assistant with LangGraph, Tool Calling, Sub-Agents, and Agentic Memory: Lets Built It

READ ALSO

How to Design Python-First Interactive Dashboards with Prefab Reactive UI Components and Static HTML Export

Cisco AI Introduces FAPO: Pipeline-Aware Prompt Optimization With Step-Level Failure Attribution and Claude Code Orchestration

Related Posts

How to Design Python-First Interactive Dashboards with Prefab Reactive UI Components and Static HTML Export

Cisco AI Introduces FAPO: Pipeline-Aware Prompt Optimization With Step-Level Failure Attribution and Claude Code Orchestration

Crawlee for Python: Build a Web Crawling Pipeline with Robots Handling, Link Graphs, and RAG Chunk Export

Yandex Open-Sources YaFF: A Zero-Copy Wire Format for Protobuf With Near-Struct Read Speed

NVIDIA AI Introduce SpatialClaw: A Training-Free Agent That Treats Code as the Action Interface for Spatial Reasoning

A better way to model the behavior of metal alloys | MIT News

Anthropic owes authors $1.5B — but the claims process is a mess

POPULAR NEWS

Trump ends trade talks with Canada over a digital services tax

15 Trending Songs on TikTok in 2025 (+ How to Use Them)

Communication Effectiveness Skills For Business Leaders

App Development Cost in Singapore: Pricing Breakdown & Insights

Comparing the Top 7 Large Language Models LLMs/Systems for Coding in 2025

EDITOR'S PICK

Scikit-LLM vs. Traditional Text Classifiers: When Should You Use an LLM?

The 11 Best Social Media Management Tools in 2025 (+ Alternatives)

Meta Ads Creative Optimization Checklist

Understanding Logo Types and Icon Styles

About

Categories

Recent Posts