Tracing OpenAI Agent Responses using MLflow

By Josh
July 14, 2025
In AI, Analytics and Automation


MLflow is an open-source platform for managing and tracking machine learning experiments. When used with the OpenAI Agents SDK, MLflow automatically:

  • Logs all agent interactions and API calls
  • Captures tool usage, input/output messages, and intermediate decisions
  • Tracks runs for debugging, performance analysis, and reproducibility

This is especially useful when you’re building multi-agent systems where different agents collaborate or call functions dynamically.
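Under the hood, enabling this tracing takes only a couple of lines before any agent code runs. A minimal sketch (the full scripts below show it in context; the experiment name here is just a placeholder):

import mlflow

mlflow.openai.autolog()                   # trace every OpenAI call made by the agents
mlflow.set_tracking_uri("./mlruns")       # store runs in a local ./mlruns directory
mlflow.set_experiment("my-agent-traces")  # any experiment name works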

In this tutorial, we’ll walk through two key examples: a simple handoff between agents, and the use of agent guardrails — all while tracing their behavior using MLflow.

Setting up the dependencies

Installing the libraries

pip install openai-agents mlflow pydantic python-dotenv

OpenAI API Key

To get an OpenAI API key, visit https://platform.openai.com/settings/organization/api-keys and generate a new key. If you’re a new user, you may need to add billing details and make a minimum payment of $5 to activate API access.

Once the key is generated, create a .env file and enter the following:

OPENAI_API_KEY=<YOUR_API_KEY>

Replace <YOUR_API_KEY> with the key you generated.
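To verify that the key is actually being picked up from the .env file, a quick sanity check like this works (assuming python-dotenv is installed, as in the pip command above):

from dotenv import load_dotenv
import os

load_dotenv()                                 # reads .env from the current directory
print(bool(os.getenv("OPENAI_API_KEY")))      # prints True if the key was loaded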

Multi-Agent System (multi_agent_demo.py) 

In this script (multi_agent_demo.py), we build a simple multi-agent assistant using the OpenAI Agents SDK, designed to route user queries to either a coding expert or a cooking expert. We enable mlflow.openai.autolog(), which automatically traces and logs all agent interactions with the OpenAI API — including inputs, outputs, and agent handoffs — making it easy to monitor and debug the system. MLflow is configured to use a local file-based tracking URI (./mlruns) and logs all activity under the experiment name “Agent‑Coding‑Cooking”.

import mlflow, asyncio
from agents import Agent, Runner
import os
from dotenv import load_dotenv
load_dotenv()

mlflow.openai.autolog()                           # Auto‑trace every OpenAI call
mlflow.set_tracking_uri("./mlruns")
mlflow.set_experiment("Agent‑Coding‑Cooking")

coding_agent = Agent(name="Coding agent",
                     instructions="You only answer coding questions.")

cooking_agent = Agent(name="Cooking agent",
                      instructions="You only answer cooking questions.")

triage_agent = Agent(
    name="Triage agent",
    instructions="If the request is about code, handoff to coding_agent; "
                 "if about cooking, handoff to cooking_agent.",
    handoffs=[coding_agent, cooking_agent],
)

async def main():
    res = await Runner.run(triage_agent,
                           input="How do I boil pasta al dente?")
    print(res.final_output)

if __name__ == "__main__":
    asyncio.run(main())

MLflow UI

To open the MLflow UI and view all the logged agent interactions, run the following command in a new terminal:
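mlflow ui --backend-store-uri ./mlruns

The --backend-store-uri flag points the UI at the local ./mlruns directory configured in the script; running mlflow ui from the project root without the flag also works, since ./mlruns is the default store.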

This will start the MLflow tracking server and print the URL where the UI is available (http://localhost:5000 by default).

We can view the entire interaction flow in the Tracing section — from the user’s initial input to how the assistant routed the request to the appropriate agent, and finally, the response generated by that agent. This end-to-end trace provides valuable insight into decision-making, handoffs, and outputs, helping you debug and optimize your agent workflows.

Tracing Guardrails (guardrails.py) 

In this example, we implement a guardrail-protected customer support agent using the OpenAI Agents SDK with MLflow tracing. The agent is designed to help users with general queries but is restricted from answering medical-related questions. A dedicated guardrail agent checks for such inputs, and if detected, blocks the request. MLflow captures the entire flow — including guardrail activation, reasoning, and agent response — providing full traceability and insight into safety mechanisms.

import mlflow, asyncio
from pydantic import BaseModel
from agents import (
    Agent, Runner,
    GuardrailFunctionOutput, InputGuardrailTripwireTriggered,
    input_guardrail, RunContextWrapper)

from dotenv import load_dotenv
load_dotenv()

mlflow.openai.autolog()
mlflow.set_tracking_uri("./mlruns")
mlflow.set_experiment("Agent‑Guardrails")

class MedicalSymptoms(BaseModel):
    medical_symptoms: bool
    reasoning: str


guardrail_agent = Agent(
    name="Guardrail check",
    instructions="Check if the user is asking you for medical symptoms.",
    output_type=MedicalSymptoms,
)


@input_guardrail
async def medical_guardrail(
    ctx: RunContextWrapper[None], agent: Agent, input
) -> GuardrailFunctionOutput:
    result = await Runner.run(guardrail_agent, input, context=ctx.context)

    return GuardrailFunctionOutput(
        output_info=result.final_output,
        tripwire_triggered=result.final_output.medical_symptoms,
    )


agent = Agent(
    name="Customer support agent",
    instructions="You are a customer support agent. You help customers with their questions.",
    input_guardrails=[medical_guardrail],
)


async def main():
    try:
        await Runner.run(agent, "Should I take aspirin if I'm having a headache?")
        print("Guardrail didn't trip - this is unexpected")

    except InputGuardrailTripwireTriggered:
        print("Medical guardrail tripped")


if __name__ == "__main__":
    asyncio.run(main())

This script defines a customer support agent with an input guardrail that detects medical-related questions. It uses a separate guardrail_agent to evaluate whether the user’s input contains a request for medical advice. If such input is detected, the guardrail triggers and prevents the main agent from responding. The entire process, including guardrail checks and outcomes, is automatically logged and traced using MLflow.
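In a real application, you would typically catch the tripwire exception and return a safe fallback to the user instead of letting it propagate. A minimal sketch using the same agent and classes defined above (the fallback wording is only illustrative):

async def answer(user_message: str) -> str:
    """Run the support agent, returning a refusal if the medical guardrail trips."""
    try:
        result = await Runner.run(agent, user_message)
        return result.final_output
    except InputGuardrailTripwireTriggered:
        # The guardrail fired; decline rather than give medical advice.
        return "I can't help with medical questions. Please consult a healthcare professional."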

MLflow UI

To open the MLflow UI and view all the logged agent interactions, run the following command in a new terminal:
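mlflow ui --backend-store-uri ./mlruns

Traces for this script appear under the “Agent‑Guardrails” experiment.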

In this example, we asked the agent, “Should I take aspirin if I’m having a headache?”, which triggered the guardrail. In the MLflow UI, we can clearly see that the input was flagged, along with the reasoning provided by the guardrail agent for why the request was blocked.
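Traces can also be queried programmatically instead of through the UI. A minimal sketch, assuming a recent MLflow version in which mlflow.search_traces is available (it returns a pandas DataFrame, so pandas must also be installed):

import mlflow

mlflow.set_tracking_uri("./mlruns")
mlflow.set_experiment("Agent‑Guardrails")    # same experiment name as in guardrails.py

# Fetch the most recent traces for the active experiment as a DataFrame
traces = mlflow.search_traces(max_results=5)
print(traces.head())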

Check out the Codes. All credit for this research goes to the researchers of this project.


I am a Civil Engineering Graduate (2022) from Jamia Millia Islamia, New Delhi, and I have a keen interest in Data Science, especially Neural Networks and their application in various areas.



