Building Next-Gen Agentic AI: A Complete Framework for Cognitive Blueprint Driven Runtime Agents with Memory Tools and Validation

In this tutorial, we build a complete cognitive blueprint and runtime agent framework. We define structured blueprints for identity, goals, planning, memory, validation, and tool access, and use them to create agents that not only respond but also plan, execute, validate, and retry when validation fails. Throughout the tutorial, we show how the same runtime engine supports multiple agent personalities and behaviors through blueprint portability, which keeps the design modular and extensible for agentic AI experimentation.

import json, yaml, time, math, textwrap, datetime, getpass, os
from typing import Any, Callable, Dict, List, Optional
from dataclasses import dataclass, field
from enum import Enum


from openai import OpenAI
from pydantic import BaseModel
from rich.console import Console
from rich.panel import Panel
from rich.table import Table
from rich.tree import Tree


try:
   from google.colab import userdata
   OPENAI_API_KEY = userdata.get('OPENAI_API_KEY')
except Exception:
   OPENAI_API_KEY = getpass.getpass("🔑 Enter your OpenAI API key: ")


os.environ["OPENAI_API_KEY"] = OPENAI_API_KEY
client = OpenAI(api_key=OPENAI_API_KEY)
console = Console()


class PlanningStrategy(str, Enum):
   SEQUENTIAL   = "sequential"
   HIERARCHICAL = "hierarchical"
   REACTIVE     = "reactive"


class MemoryType(str, Enum):
   SHORT_TERM = "short_term"
   EPISODIC   = "episodic"
   PERSISTENT = "persistent"


class BlueprintIdentity(BaseModel):
   name: str
   version: str = "1.0.0"
   description: str
   author: str = "unknown"


class BlueprintMemory(BaseModel):
   type: MemoryType = MemoryType.SHORT_TERM
   window_size: int = 10
   summarize_after: int = 20


class BlueprintPlanning(BaseModel):
   strategy: PlanningStrategy = PlanningStrategy.SEQUENTIAL
   max_steps: int = 8
   max_retries: int = 2
   think_before_acting: bool = True


class BlueprintValidation(BaseModel):
   require_reasoning: bool = True
   min_response_length: int = 10
   forbidden_phrases: List[str] = []


class CognitiveBlueprint(BaseModel):
   identity: BlueprintIdentity
   goals: List[str]
   constraints: List[str] = []
   tools: List[str] = []
   memory: BlueprintMemory = BlueprintMemory()
   planning: BlueprintPlanning = BlueprintPlanning()
   validation: BlueprintValidation = BlueprintValidation()
   system_prompt_extra: str = ""


def load_blueprint_from_yaml(yaml_str: str) -> CognitiveBlueprint:
   return CognitiveBlueprint(**yaml.safe_load(yaml_str))


RESEARCH_AGENT_YAML = """
identity:
 name: ResearchBot
 version: 1.2.0
 description: Answers research questions using calculation and reasoning
 author: Auton Framework Demo
goals:
 - Answer user questions accurately using available tools
 - Show step-by-step reasoning for all answers
 - Cite the method used for each calculation
constraints:
 - Never fabricate numbers or statistics
 - Always validate mathematical results before reporting
 - Do not answer questions outside your tool capabilities
tools:
 - calculator
 - unit_converter
 - date_calculator
 - search_wikipedia_stub
memory:
 type: episodic
 window_size: 12
 summarize_after: 30
planning:
 strategy: sequential
 max_steps: 6
 max_retries: 2
 think_before_acting: true
validation:
 require_reasoning: true
 min_response_length: 20
 forbidden_phrases:
   - "I don't know"
   - "I cannot determine"
"""


DATA_ANALYST_YAML = """
identity:
 name: DataAnalystBot
 version: 2.0.0
 description: Performs statistical analysis and data summarization
 author: Auton Framework Demo
goals:
 - Compute descriptive statistics for given data
 - Identify trends and anomalies
 - Present findings clearly with numbers
constraints:
 - Only work with numerical data
 - Always report uncertainty when sample size is small (< 5 items)
tools:
 - calculator
 - statistics_engine
 - list_sorter
memory:
 type: short_term
 window_size: 6
planning:
 strategy: hierarchical
 max_steps: 10
 max_retries: 3
 think_before_acting: true
validation:
 require_reasoning: true
 min_response_length: 30
 forbidden_phrases: []
"""

We set up the core environment and define the cognitive blueprint, which structures how an agent thinks and behaves. We create strongly typed models for identity, memory configuration, planning strategy, and validation rules using Pydantic and enums. We also define two YAML-based blueprints, allowing us to configure different agent personalities and capabilities without changing the underlying runtime system.
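To make the defaulting behavior concrete, here is a minimal stdlib-only sketch of the same idea. It is not Pydantic and not part of the framework: `MiniMemoryType`, `MiniMemoryConfig`, `MiniBlueprint`, and `load_mini_blueprint` are hypothetical names used only for illustration. It shows how a section that omits a field, as the `memory` section of `DATA_ANALYST_YAML` omits `summarize_after`, falls back to the model default.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Any, Dict, List


class MiniMemoryType(str, Enum):
    SHORT_TERM = "short_term"
    EPISODIC = "episodic"


@dataclass
class MiniMemoryConfig:
    type: MiniMemoryType = MiniMemoryType.SHORT_TERM
    window_size: int = 10
    summarize_after: int = 20


@dataclass
class MiniBlueprint:
    name: str
    goals: List[str]
    tools: List[str] = field(default_factory=list)
    memory: MiniMemoryConfig = field(default_factory=MiniMemoryConfig)


def load_mini_blueprint(raw: Dict[str, Any]) -> MiniBlueprint:
    """Build a MiniBlueprint from a dict shaped like the parsed YAML."""
    mem_raw = dict(raw.get("memory", {}))
    if "type" in mem_raw:
        # Lookup by value; an unknown memory type raises ValueError here.
        mem_raw["type"] = MiniMemoryType(mem_raw["type"])
    return MiniBlueprint(
        name=raw["identity"]["name"],
        goals=list(raw["goals"]),
        tools=list(raw.get("tools", [])),
        memory=MiniMemoryConfig(**mem_raw),
    )


# Same memory section as DATA_ANALYST_YAML: summarize_after is omitted.
analyst = load_mini_blueprint({
    "identity": {"name": "DataAnalystBot"},
    "goals": ["Compute descriptive statistics for given data"],
    "memory": {"type": "short_term", "window_size": 6},
})
print(analyst.memory.window_size, analyst.memory.summarize_after)  # 6 20
```

The explicit `window_size` of 6 is kept, while `summarize_after` takes the default of 20. The Pydantic models above give the same outcome, with fuller type validation.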


@dataclass
class ToolSpec:
   name: str
   description: str
   parameters: Dict[str, str]
   function: Callable
   returns: str


class ToolRegistry:
   def __init__(self):
       self._tools: Dict[str, ToolSpec] = {}


   def register(self, name: str, description: str,
                parameters: Dict[str, str], returns: str):
       def decorator(fn: Callable) -> Callable:
           self._tools[name] = ToolSpec(name, description, parameters, fn, returns)
           return fn
       return decorator


   def get(self, name: str) -> Optional[ToolSpec]:
       return self._tools.get(name)


   def call(self, name: str, **kwargs) -> Any:
       spec = self._tools.get(name)
       if not spec:
           raise ValueError(f"Tool '{name}' not found in registry")
       return spec.function(**kwargs)


   def get_tool_descriptions(self, allowed: List[str]) -> str:
       lines = []
       for name in allowed:
           spec = self._tools.get(name)
           if spec:
               params = ", ".join(f"{k}: {v}" for k, v in spec.parameters.items())
               lines.append(
                   f"• {spec.name}({params})\n"
                   f"  → {spec.description}\n"
                   f"  Returns: {spec.returns}"
               )
       return "\n".join(lines)


   def list_tools(self) -> List[str]:
       return list(self._tools.keys())


registry = ToolRegistry()


@registry.register(
   name="calculator",
   description="Evaluates a safe mathematical expression",
   parameters={"expression": "A math expression string, e.g. '2 ** 10 + 5 * 3'"},
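   returns="Numeric result as a string"
)
def calculator(expression: str) -> str:
   try:
       # Expose only public names from math plus a few numeric builtins.
       allowed = {k: v for k, v in math.__dict__.items() if not k.startswith("_")}
       allowed.update({"abs": abs, "round": round, "pow": pow})
       # Emptying __builtins__ narrows what eval() can reach, but it is not a
       # sandbox; avoid routing untrusted input through this tool.
       return str(eval(expression, {"__builtins__": {}}, allowed))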
   except Exception as e:
       return f"Error: {e}"


@registry.register(
   name="unit_converter",
   description="Converts between common units of measurement",
   parameters={
       "value": "Numeric value to convert",
       "from_unit": "Source unit (km, miles, kg, lbs, celsius, fahrenheit, liters, gallons, meters, feet)",
       "to_unit": "Target unit"
   },
   returns="Converted value as string with units"
)
def unit_converter(value: float, from_unit: str, to_unit: str) -> str:
   conversions = {
       ("km", "miles"): lambda x: x * 0.621371,
       ("miles", "km"): lambda x: x * 1.60934,
       ("kg", "lbs"):   lambda x: x * 2.20462,
       ("lbs", "kg"):   lambda x: x / 2.20462,
       ("celsius", "fahrenheit"): lambda x: x * 9/5 + 32,
       ("fahrenheit", "celsius"): lambda x: (x - 32) * 5/9,
       ("liters", "gallons"): lambda x: x * 0.264172,
       ("gallons", "liters"): lambda x: x * 3.78541,
       ("meters", "feet"): lambda x: x * 3.28084,
       ("feet", "meters"): lambda x: x / 3.28084,
   }
   key = (from_unit.lower(), to_unit.lower())
   if key in conversions:
       return f"{conversions[key](float(value)):.4f} {to_unit}"
   return f"Conversion from {from_unit} to {to_unit} not supported"


@registry.register(
   name="date_calculator",
   description="Calculates days between two dates, or adds/subtracts days from a date",
   parameters={
       "operation": "'days_between' or 'add_days'",
       "date1": "Date string in YYYY-MM-DD format",
       "date2": "Second date for days_between (YYYY-MM-DD), or number of days for add_days"
   },
   returns="Result as string"
)
def date_calculator(operation: str, date1: str, date2: str) -> str:
   try:
       d1 = datetime.datetime.strptime(date1, "%Y-%m-%d")
       if operation == "days_between":
           d2 = datetime.datetime.strptime(date2, "%Y-%m-%d")
           return f"{abs((d2 - d1).days)} days between {date1} and {date2}"
       elif operation == "add_days":
           result = d1 + datetime.timedelta(days=int(date2))
           return f"{result.strftime('%Y-%m-%d')} (added {date2} days to {date1})"
       return f"Unknown operation: {operation}"
   except Exception as e:
       return f"Error: {e}"


@registry.register(
   name="search_wikipedia_stub",
   description="Returns a stub summary for well-known topics (demo — no live internet)",
   parameters={"topic": "Topic to look up"},
   returns="Short text summary"
)
def search_wikipedia_stub(topic: str) -> str:
   stubs = {
       "openai": "OpenAI is an AI research company founded in 2015. It created GPT-4 and the ChatGPT product.",
   }
   for key, val in stubs.items():
       if key in topic.lower():
           return val
   return f"No stub found for '{topic}'. In production, this would query Wikipedia's API."

We implement the tool registry that allows agents to discover and use external capabilities dynamically. We design a structured system in which tools are registered with metadata, including parameters, descriptions, and return values. We also implement several practical tools, such as a calculator, unit converter, date calculator, and a Wikipedia search stub that the agents can invoke during execution.
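The decorator-based registration pattern can be sketched in isolation as follows. This is a reduced stand-in for `ToolRegistry`, assuming only name-to-function dispatch matters: `MiniRegistry` and the `percent_of` tool are hypothetical and do not appear in the tutorial code.

```python
from typing import Any, Callable, Dict


class MiniRegistry:
    """Name-to-function lookup, reduced to the parts needed for dispatch."""

    def __init__(self) -> None:
        self._tools: Dict[str, Callable[..., Any]] = {}

    def register(self, name: str) -> Callable[[Callable[..., Any]], Callable[..., Any]]:
        def decorator(fn: Callable[..., Any]) -> Callable[..., Any]:
            self._tools[name] = fn
            return fn  # the function stays usable under its own name
        return decorator

    def call(self, name: str, **kwargs: Any) -> Any:
        if name not in self._tools:
            raise ValueError(f"Tool '{name}' not found in registry")
        return self._tools[name](**kwargs)


mini_registry = MiniRegistry()


@mini_registry.register("percent_of")  # hypothetical tool, not in the tutorial
def percent_of(value: float, percent: float) -> str:
    return str(float(value) * float(percent) / 100)


# A plan step carries a tool name and a dict of arguments;
# the caller unpacks that dict into keyword arguments.
step = {"tool": "percent_of", "tool_args": {"value": 2500, "percent": 15}}
print(mini_registry.call(step["tool"], **step["tool_args"]))  # 375.0
```

Because dispatch goes through keyword arguments, the argument names the model writes into a plan must match the function's parameter names exactly, which is why the registered metadata spells them out.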

@registry.register(
   name="statistics_engine",
   description="Computes descriptive statistics on a list of numbers",
   parameters={"numbers": "Comma-separated list of numbers, e.g. '4,8,15,16,23,42'"},
   returns="JSON with count, mean, median, std_dev (population), min, max, range"
)
def statistics_engine(numbers: str) -> str:
   try:
       nums = [float(x.strip()) for x in numbers.split(",")]
       n = len(nums)
       mean = sum(nums) / n
       sorted_nums = sorted(nums)
       mid = n // 2
       median = sorted_nums[mid] if n % 2 else (sorted_nums[mid-1] + sorted_nums[mid]) / 2
       # Population standard deviation (divides by n, not n - 1).
       std_dev = math.sqrt(sum((x - mean) ** 2 for x in nums) / n)
       return json.dumps({
           "count": n, "mean": round(mean, 4), "median": round(median, 4),
           "std_dev": round(std_dev, 4), "min": min(nums),
           "max": max(nums), "range": max(nums) - min(nums)
       }, indent=2)
   except Exception as e:
       return f"Error: {e}"


@registry.register(
   name="list_sorter",
   description="Sorts a comma-separated list of numbers",
   parameters={"numbers": "Comma-separated numbers", "order": "'asc' or 'desc'"},
   returns="Sorted comma-separated list"
)
def list_sorter(numbers: str, order: str = "asc") -> str:
   nums = [float(x.strip()) for x in numbers.split(",")]
   nums.sort(reverse=(order == "desc"))
   return ", ".join(str(n) for n in nums)


@dataclass
class MemoryEntry:
   role: str
   content: str
   timestamp: float = field(default_factory=time.time)
   metadata: Dict = field(default_factory=dict)


class MemoryManager:
   def __init__(self, config: BlueprintMemory, llm_client: OpenAI):
       self.config = config
       self.client = llm_client
       self._history: List[MemoryEntry] = []
       self._summary: str = ""


   def add(self, role: str, content: str, metadata: Optional[Dict] = None):
       self._history.append(MemoryEntry(role=role, content=content, metadata=metadata or {}))
       if (self.config.type == MemoryType.EPISODIC and
               len(self._history) > self.config.summarize_after):
           self._compress_memory()


   def _compress_memory(self):
       to_compress = self._history[:-self.config.window_size]
       self._history = self._history[-self.config.window_size:]
       text = "\n".join(f"{e.role}: {e.content[:200]}" for e in to_compress)
       try:
           resp = self.client.chat.completions.create(
               model="gpt-4o-mini",
               messages=[{"role": "user", "content":
                   f"Summarize this conversation history in 3 sentences:\n{text}"}],
               max_tokens=150
           )
           self._summary += " " + resp.choices[0].message.content.strip()
       except Exception:
           self._summary += f" [compressed {len(to_compress)} messages]"


   def get_messages(self, system_prompt: str) -> List[Dict]:
       messages = [{"role": "system", "content": system_prompt}]
       if self._summary:
           messages.append({"role": "system",
               "content": f"[Memory Summary]: {self._summary.strip()}"})
       for entry in self._history[-self.config.window_size:]:
           messages.append({
               "role": entry.role if entry.role != "tool" else "assistant",
               "content": entry.content
           })
       return messages


   def clear(self):
       self._history = []
       self._summary = ""


   @property
   def message_count(self) -> int:
       return len(self._history)

We extend the tool ecosystem and introduce the memory management layer that stores conversation history and compresses it when necessary. We implement statistical tools and sorting utilities that enable the data analysis agent to perform structured numerical operations. At the same time, we design a memory system that tracks interactions, summarizes long histories, and provides contextual messages to the language model.
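The windowing arithmetic behind memory compression can be checked without calling a model. The sketch below is a simplified, hypothetical version: `compress_history` and `stub_summarizer` are illustrative names, plain strings replace `MemoryEntry` objects, and the guard against a non-positive `window_size` is an addition that the class above does not have.

```python
from typing import Callable, List, Tuple


def compress_history(
    history: List[str],
    window_size: int,
    summarize_after: int,
    summarize: Callable[[List[str]], str],
) -> Tuple[List[str], str]:
    """Return (kept_messages, summary_of_dropped_messages)."""
    if len(history) <= summarize_after or window_size <= 0:
        return history, ""
    older, recent = history[:-window_size], history[-window_size:]
    return recent, summarize(older)


def stub_summarizer(messages: List[str]) -> str:
    # Stand-in for the LLM call; mirrors the fallback text in _compress_memory.
    return f"[compressed {len(messages)} messages]"


# ResearchBot settings: window_size=12, summarize_after=30.
history = [f"message {i}" for i in range(31)]
kept, summary = compress_history(history, 12, 30, stub_summarizer)
print(len(kept), summary)  # 12 [compressed 19 messages]
```

With the ResearchBot settings, the 31st message triggers compression: the oldest 19 messages are summarized and the most recent 12 stay verbatim. The guard matters because slicing with `[:-0]` selects nothing, so a `window_size` of 0 would leave no messages to summarize.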

@dataclass
class PlanStep:
   step_id: int
   description: str
   tool: Optional[str]
   tool_args: Dict[str, Any]
   reasoning: str


@dataclass
class Plan:
   task: str
   steps: List[PlanStep]
   strategy: PlanningStrategy


class Planner:
   def __init__(self, blueprint: CognitiveBlueprint,
                registry: ToolRegistry, llm_client: OpenAI):
       self.blueprint = blueprint
       self.registry  = registry
       self.client    = llm_client


   def _build_planner_prompt(self) -> str:
       bp = self.blueprint
       return textwrap.dedent(f"""
           You are {bp.identity.name}, version {bp.identity.version}.
           {bp.identity.description}


           ## Your Goals:
           {chr(10).join(f'  - {g}' for g in bp.goals)}


           ## Your Constraints:
           {chr(10).join(f'  - {c}' for c in bp.constraints)}


           ## Available Tools:
           {self.registry.get_tool_descriptions(bp.tools)}


           ## Planning Strategy: {bp.planning.strategy.value}
           ## Max Steps: {bp.planning.max_steps}


           Given a user task, produce a JSON execution plan with this exact structure:
           {{
             "steps": [
               {{
                 "step_id": 1,
                 "description": "What this step does",
                 "tool": "tool_name or null if no tool needed",
                 "tool_args": {{"arg1": "value1"}},
                 "reasoning": "Why this step is needed"
               }}
             ]
           }}


           Rules:
           - Only use tools listed above
           - Set tool to null for pure reasoning steps
           - Keep steps <= {bp.planning.max_steps}
           - Return ONLY valid JSON, no markdown fences
           {bp.system_prompt_extra}
       """).strip()


   def plan(self, task: str, memory: MemoryManager) -> Plan:
       system_prompt = self._build_planner_prompt()
       messages = memory.get_messages(system_prompt)
       messages.append({"role": "user", "content":
           f"Create a plan to complete this task: {task}"})
       resp = self.client.chat.completions.create(
           model="gpt-4o-mini", messages=messages,
           max_tokens=1200, temperature=0.2
       )
       raw = resp.choices[0].message.content.strip()
       raw = raw.replace("```json", "").replace("```", "").strip()
       data = json.loads(raw)
       steps = [
           PlanStep(
               step_id=s["step_id"], description=s["description"],
               tool=s.get("tool"), tool_args=s.get("tool_args", {}),
               reasoning=s.get("reasoning", "")
           )
           for s in data["steps"]
       ]
       return Plan(task=task, steps=steps, strategy=self.blueprint.planning.strategy)


@dataclass
class StepResult:
   step_id: int
   success: bool
   output: str
   tool_used: Optional[str]
   error: Optional[str] = None


@dataclass
class ExecutionTrace:
   plan: Plan
   results: List[StepResult]
   final_answer: str


class Executor:
   def __init__(self, blueprint: CognitiveBlueprint,
                registry: ToolRegistry, llm_client: OpenAI):
       self.blueprint = blueprint
       self.registry  = registry
       self.client    = llm_client

We implement the planning system that transforms a user task into a structured execution plan. The planner instructs the language model to return a JSON plan in which each step carries its reasoning, an optional tool, and that tool's arguments, so the agent breaks a complex problem into small executable actions before acting. We also define the result types and open the Executor class, whose methods continue in the next block.
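Model output does not always follow the requested JSON shape, so a slightly stricter parser can be useful. The following is a sketch under stated assumptions, not the tutorial's `Planner.plan`: `parse_plan` is a hypothetical helper that additionally rejects tools outside the blueprint, normalizes the string `"null"` to `None`, and truncates the plan to `max_steps`.

```python
import json
from typing import Any, Dict, List

FENCE = "`" * 3  # markdown code-fence marker, built here to keep this block readable


def parse_plan(raw: str, allowed_tools: List[str], max_steps: int) -> List[Dict[str, Any]]:
    """Parse a JSON plan string into a list of normalized step dicts."""
    cleaned = raw.replace(FENCE + "json", "").replace(FENCE, "").strip()
    data = json.loads(cleaned)  # raises json.JSONDecodeError on malformed output
    steps: List[Dict[str, Any]] = []
    for s in data["steps"][:max_steps]:  # enforce the blueprint's step budget
        tool = s.get("tool")
        if tool in (None, "", "null"):
            tool = None  # pure reasoning step
        elif tool not in allowed_tools:
            raise ValueError(f"Plan references a tool outside the blueprint: {tool!r}")
        steps.append({
            "step_id": int(s["step_id"]),
            "description": s["description"],
            "tool": tool,
            "tool_args": s.get("tool_args") or {},
            "reasoning": s.get("reasoning", ""),
        })
    return steps


# Simulated model output, wrapped in a fence despite the prompt's instructions.
sample = FENCE + "json\n" + json.dumps({"steps": [
    {"step_id": 1, "description": "Compute 15% of 2500", "tool": "calculator",
     "tool_args": {"expression": "2500 * 0.15"}, "reasoning": "Direct arithmetic"},
    {"step_id": 2, "description": "Explain the result", "tool": "null",
     "tool_args": {}, "reasoning": "No tool needed"},
]}) + "\n" + FENCE

plan_steps = parse_plan(sample, allowed_tools=["calculator"], max_steps=6)
print([(s["step_id"], s["tool"]) for s in plan_steps])  # [(1, 'calculator'), (2, None)]
```

Rejecting unknown tools at parse time surfaces a bad plan before any step runs, rather than as a failed step during execution.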

   def execute_plan(self, plan: Plan, memory: MemoryManager,
                    verbose: bool = True) -> ExecutionTrace:
       results: List[StepResult] = []
       if verbose:
           console.print(f"\n[bold yellow]⚡ Executing:[/] {plan.task}")
           console.print(f"   Strategy: {plan.strategy.value} | Steps: {len(plan.steps)}")


       for step in plan.steps:
           if verbose:
               console.print(f"\n  [cyan]Step {step.step_id}:[/] {step.description}")
           try:
               if step.tool and step.tool != "null":
                   if verbose:
                       console.print(f"   🔧 Tool: [green]{step.tool}[/] | Args: {step.tool_args}")
                   output = self.registry.call(step.tool, **step.tool_args)
                   result = StepResult(step.step_id, True, str(output), step.tool)
                   if verbose:
                       console.print(f"   ✅ Result: {output}")
               else:
                   context_text = "\n".join(
                       f"Step {r.step_id} result: {r.output}" for r in results)
                   prompt = (
                       f"Previous results:\n{context_text}\n\n"
                       f"Now complete this step: {step.description}\n"
                       f"Reasoning hint: {step.reasoning}"
                   ) if context_text else (
                       f"Complete this step: {step.description}\n"
                       f"Reasoning hint: {step.reasoning}"
                   )
                   sys_prompt = (
                       f"You are {self.blueprint.identity.name}. "
                       f"{self.blueprint.identity.description}. "
                       f"Constraints: {'; '.join(self.blueprint.constraints)}"
                   )
                   resp = self.client.chat.completions.create(
                       model="gpt-4o-mini",
                       messages=[
                           {"role": "system", "content": sys_prompt},
                           {"role": "user",   "content": prompt}
                       ],
                       max_tokens=500, temperature=0.3
                   )
                   output = resp.choices[0].message.content.strip()
                   result = StepResult(step.step_id, True, output, None)
                   if verbose:
                       preview = output[:120] + "..." if len(output) > 120 else output
                       console.print(f"   🤔 Reasoning: {preview}")
           except Exception as e:
               result = StepResult(step.step_id, False, "", step.tool, str(e))
               if verbose:
                   console.print(f"   ❌ Error: {e}")
           results.append(result)


       final_answer = self._synthesize(plan, results, memory)
       return ExecutionTrace(plan=plan, results=results, final_answer=final_answer)


   def _synthesize(self, plan: Plan, results: List[StepResult],
                   memory: MemoryManager) -> str:
       steps_summary = "\n".join(
           f"Step {r.step_id} ({'✅' if r.success else '❌'}): {r.output[:300]}"
           for r in results
       )
       synthesis_prompt = (
           f"Original task: {plan.task}\n\n"
           f"Step results:\n{steps_summary}\n\n"
           f"Provide a clear, complete final answer. Integrate all step results."
       )
       sys_prompt = (
           f"You are {self.blueprint.identity.name}. "
           + ("Always show your reasoning. " if self.blueprint.validation.require_reasoning else "")
           + f"Goals: {'; '.join(self.blueprint.goals)}"
       )
       messages = memory.get_messages(sys_prompt)
       messages.append({"role": "user", "content": synthesis_prompt})
       resp = self.client.chat.completions.create(
           model="gpt-4o-mini", messages=messages,
           max_tokens=600, temperature=0.3
       )
       return resp.choices[0].message.content.strip()


@dataclass
class ValidationResult:
   passed: bool
   issues: List[str]
   score: float


class Validator:
   def __init__(self, blueprint: CognitiveBlueprint, llm_client: OpenAI):
       self.blueprint = blueprint
       self.client    = llm_client


   def validate(self, answer: str, task: str,
                use_llm_check: bool = False) -> ValidationResult:
       issues = []
       v = self.blueprint.validation


       if len(answer) < v.min_response_length:
           issues.append(f"Response too short: {len(answer)} chars (min: {v.min_response_length})")


       answer_lower = answer.lower()
       for phrase in v.forbidden_phrases:
           if phrase.lower() in answer_lower:
               issues.append(f"Forbidden phrase detected: '{phrase}'")


       if v.require_reasoning:
           indicators = ["because", "therefore", "since", "step", "first",
                         "result", "calculated", "computed", "found that"]
           if not any(ind in answer_lower for ind in indicators):
               issues.append("Response lacks visible reasoning or explanation")


       if use_llm_check:
           issues.extend(self._llm_quality_check(answer, task))


       return ValidationResult(passed=len(issues) == 0,
                               issues=issues,
                               score=max(0.0, 1.0 - len(issues) * 0.25))


   def _llm_quality_check(self, answer: str, task: str) -> List[str]:
       prompt = (
           f"Task: {task}\n\nAnswer: {answer[:500]}\n\n"
           f'Does this answer address the task? Reply JSON: {{"on_topic": true/false, "issue": "..."}}'
       )
       try:
           resp = self.client.chat.completions.create(
               model="gpt-4o-mini",
               messages=[{"role": "user", "content": prompt}],
               max_tokens=100
           )
           raw = resp.choices[0].message.content.strip().replace("```json","").replace("```","")
           data = json.loads(raw)
           if not data.get("on_topic", True):
               return [f"LLM quality check: {data.get('issue', 'off-topic')}"]
       except Exception:
           pass
       return []

We build the executor and validation logic that actually performs the steps generated by the planner. We implement a system that can either call registered tools or perform reasoning through the language model, depending on the step definition. We also add a validator that checks the final response against blueprint constraints such as minimum length, reasoning requirements, and forbidden phrases.
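The rule-based part of validation is easy to exercise on its own. The sketch below re-implements the three checks and the scoring rule from `Validator.validate` as a standalone function; `rule_check` is a hypothetical name, and the issue messages are shortened for readability.

```python
from typing import List, Tuple

REASONING_INDICATORS = ["because", "therefore", "since", "step", "first",
                        "result", "calculated", "computed", "found that"]


def rule_check(answer: str, min_length: int, forbidden: List[str],
               require_reasoning: bool) -> Tuple[List[str], float]:
    """Apply the three rule-based checks and return (issues, score)."""
    issues: List[str] = []
    lowered = answer.lower()
    if len(answer) < min_length:
        issues.append("too short")
    issues.extend(f"forbidden: {p}" for p in forbidden if p.lower() in lowered)
    if require_reasoning and not any(ind in lowered for ind in REASONING_INDICATORS):
        issues.append("no visible reasoning")
    # Each issue costs 0.25, floored at 0.0, as in Validator.validate.
    return issues, max(0.0, 1.0 - 0.25 * len(issues))


forbidden = ["I don't know", "I cannot determine"]  # from RESEARCH_AGENT_YAML
bad_issues, bad_score = rule_check("I don't know", 20, forbidden, True)
good_issues, good_score = rule_check(
    "15% of 2,500 is 375, calculated as 2500 * 0.15.", 20, forbidden, True)
print(len(bad_issues), bad_score)    # 3 0.25
print(len(good_issues), good_score)  # 0 1.0
```

The reasoning check matches substrings, so "step" is also found inside "footstep". It is a cheap heuristic for visible reasoning, not proof that the reasoning is sound, which is where the optional LLM quality check can help.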

@dataclass
class AgentResponse:
   agent_name: str
   task: str
   final_answer: str
   trace: ExecutionTrace
   validation: ValidationResult
   retries: int
   total_steps: int


class RuntimeEngine:
   def __init__(self, blueprint: CognitiveBlueprint,
                registry: ToolRegistry, llm_client: OpenAI):
       self.blueprint = blueprint
       self.memory    = MemoryManager(blueprint.memory, llm_client)
       self.planner   = Planner(blueprint, registry, llm_client)
       self.executor  = Executor(blueprint, registry, llm_client)
       self.validator = Validator(blueprint, llm_client)


   def run(self, task: str, verbose: bool = True) -> AgentResponse:
       bp = self.blueprint
       if verbose:
           console.print(Panel(
               f"[bold]Agent:[/] {bp.identity.name} v{bp.identity.version}\n"
               f"[bold]Task:[/] {task}\n"
               f"[bold]Strategy:[/] {bp.planning.strategy.value} | "
               f"Max Steps: {bp.planning.max_steps} | "
               f"Max Retries: {bp.planning.max_retries}",
               title="🚀 Runtime Engine Starting", border_style="blue"
           ))


       self.memory.add("user", task)
       retries, trace, validation = 0, None, None


       for attempt in range(bp.planning.max_retries + 1):
           if attempt > 0 and verbose:
               console.print(f"\n[yellow]⟳ Retry {attempt}/{bp.planning.max_retries}[/]")
               console.print(f"  Issues: {', '.join(validation.issues)}")


           if verbose:
               console.print("\n[bold magenta]📋 Phase 1: Planning...[/]")
           try:
               plan = self.planner.plan(task, self.memory)
               if verbose:
                   tree = Tree(f"[bold]Plan ({len(plan.steps)} steps)[/]")
                   for s in plan.steps:
                       icon = "🔧" if s.tool else "🤔"
                       branch = tree.add(f"{icon} Step {s.step_id}: {s.description}")
                       if s.tool:
                           branch.add(f"[green]Tool:[/] {s.tool}")
                           branch.add(f"[yellow]Args:[/] {s.tool_args}")
                   console.print(tree)
           except Exception as e:
               if verbose: console.print(f"[red]Planning failed:[/] {e}")
               break


           if verbose:
               console.print("\n[bold magenta]⚡ Phase 2: Executing...[/]")
           trace = self.executor.execute_plan(plan, self.memory, verbose=verbose)


           if verbose:
               console.print("\n[bold magenta]✅ Phase 3: Validating...[/]")
           validation = self.validator.validate(trace.final_answer, task)


           if verbose:
               status = "[green]PASSED[/]" if validation.passed else "[red]FAILED[/]"
               console.print(f"  Validation: {status} | Score: {validation.score:.2f}")
               for issue in validation.issues:
                   console.print(f"  ⚠️  {issue}")


           if validation.passed:
               break


           retries += 1
           self.memory.add("assistant", trace.final_answer)
           self.memory.add("user",
               f"Your previous answer had issues: {'; '.join(validation.issues)}. "
               f"Please improve."
           )


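       # A failed attempt was already written to memory inside the loop,
       # so only record the answer here when it passed validation.
       if trace and validation and validation.passed:
           self.memory.add("assistant", trace.final_answer)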


       if verbose:
           console.print(Panel(
               trace.final_answer if trace else "No answer generated",
               title=f"🎯 Final Answer — {bp.identity.name}",
               border_style="green"
           ))


       return AgentResponse(
           agent_name=bp.identity.name, task=task,
           final_answer=trace.final_answer if trace else "",
           trace=trace, validation=validation,
           retries=retries,
           total_steps=len(trace.results) if trace else 0
       )


   def reset_memory(self):
       self.memory.clear()


def build_engine(blueprint_yaml: str, registry: ToolRegistry,
                llm_client: OpenAI) -> RuntimeEngine:
   return RuntimeEngine(load_blueprint_from_yaml(blueprint_yaml), registry, llm_client)


if __name__ == "__main__":


   print("\n" + "="*60)
   print("DEMO 1: ResearchBot")
   print("="*60)
   research_engine = build_engine(RESEARCH_AGENT_YAML, registry, client)
   research_engine.run(
       task=(
           "how many steps of 20cm height would that be? Also, if I burn 0.15 "
           "calories per step, what's the total calorie burn? Show all calculations."
       )
   )


   print("\n" + "="*60)
   print("DEMO 2: DataAnalystBot")
   print("="*60)
   analyst_engine = build_engine(DATA_ANALYST_YAML, registry, client)
   analyst_engine.run(
       task=(
           "Analyze this dataset of monthly sales figures (in thousands): "
           "142, 198, 173, 155, 221, 189, 203, 167, 244, 198, 212, 231. "
           "Compute key statistics, identify the best and worst months, "
           "and calculate growth from first to last month."
       )
   )


   print("\n" + "="*60)
   print("PORTABILITY DEMO: Same task → 2 different blueprints")
   print("="*60)
   SHARED_TASK = "Calculate 15% of 2,500 and tell me the result."


   responses = {}
   for name, yaml_str in [
       ("ResearchBot",    RESEARCH_AGENT_YAML),
       ("DataAnalystBot", DATA_ANALYST_YAML),
   ]:
       eng = build_engine(yaml_str, registry, client)
       responses[name] = eng.run(SHARED_TASK, verbose=False)


   table = Table(title="🔄 Blueprint Portability", show_header=True, show_lines=True)
   table.add_column("Agent",   style="cyan",   width=18)
   table.add_column("Steps",   style="yellow", width=6)
   table.add_column("Valid?",  width=7)
   table.add_column("Score",   width=6)
   table.add_column("Answer Preview", width=55)


   for name, r in responses.items():
       table.add_row(
           name, str(r.total_steps),
           "✅" if r.validation.passed else "❌",
           f"{r.validation.score:.2f}",
           r.final_answer[:140] + ("..." if len(r.final_answer) > 140 else "")
       )
   console.print(table)

We assemble the runtime engine that orchestrates planning, execution, memory updates, and validation into a complete autonomous workflow. We run multiple demonstrations showing how different blueprints produce different behaviors while using the same core architecture. Finally, we illustrate blueprint portability by running the same task across two agents and comparing their results.
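The control flow of the engine, stripped of planning and model calls, reduces to a validate-and-retry loop. The sketch below assumes stub functions in place of the planner, executor, and validator: `run_with_retries`, `stub_produce`, and `stub_validate` are hypothetical names, and the return value counts attempts rather than retries.

```python
from typing import Callable, List, Tuple


def run_with_retries(
    produce: Callable[[List[str]], str],
    validate: Callable[[str], List[str]],
    max_retries: int,
) -> Tuple[str, int, bool]:
    """Return (answer, attempts_made, passed)."""
    feedback: List[str] = []
    answer = ""
    for attempt in range(1, max_retries + 2):  # first try plus max_retries retries
        answer = produce(feedback)
        issues = validate(answer)
        if not issues:
            return answer, attempt, True
        feedback = issues  # the next attempt sees what went wrong
    return answer, max_retries + 1, False


def stub_produce(feedback: List[str]) -> str:
    # Stand-in for plan + execute: improves only once it receives feedback.
    if feedback:
        return "15% of 2,500 is 375 because 2500 * 0.15 = 375."
    return "375"


def stub_validate(answer: str) -> List[str]:
    return [] if "because" in answer else ["Response lacks visible reasoning"]


answer, attempts, passed = run_with_retries(stub_produce, stub_validate, max_retries=2)
print(attempts, passed)  # 2 True
```

In the engine above, the feedback travels through memory instead of a function argument: the failed answer and the list of issues are appended as messages, so the next planning call sees them as context.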

In conclusion, we created a working Auton-style runtime system that integrates cognitive blueprints, a tool registry, memory management, planning, execution, and validation into one framework. We demonstrated that different agents can share the same underlying architecture and still behave differently, because identity, goals, tools, memory settings, and validation rules all come from the blueprint rather than from code. This implementation gives us a foundation that we can extend with richer tools, stronger memory systems, and more advanced autonomous behaviors.
