• About Us
  • Disclaimer
  • Contact Us
  • Privacy Policy
Friday, January 23, 2026
mGrowTech
No Result
View All Result
  • Technology And Software
    • Account Based Marketing
    • Channel Marketing
    • Marketing Automation
      • Al, Analytics and Automation
      • Ad Management
  • Digital Marketing
    • Social Media Management
    • Google Marketing
  • Direct Marketing
    • Brand Management
    • Marketing Attribution and Consulting
  • Mobile Marketing
  • Event Management
  • PR Solutions
  • Technology And Software
    • Account Based Marketing
    • Channel Marketing
    • Marketing Automation
      • Al, Analytics and Automation
      • Ad Management
  • Digital Marketing
    • Social Media Management
    • Google Marketing
  • Direct Marketing
    • Brand Management
    • Marketing Attribution and Consulting
  • Mobile Marketing
  • Event Management
  • PR Solutions
No Result
View All Result
mGrowTech
No Result
View All Result
Home Al, Analytics and Automation

Building AI agents is 5% AI and 100% software engineering

Josh by Josh
September 19, 2025
in Al, Analytics and Automation
0
0
SHARES
0
VIEWS
Share on FacebookShare on Twitter


Production-grade agents live or die on data plumbing, controls, and observability—not on model choice. The doc-to-chat pipeline below maps the concrete layers and why they matter.

What is a “doc-to-chat” pipeline?

A doc-to-chat pipeline ingests enterprise documents, standardizes them, enforces governance, indexes embeddings alongside relational features, and serves retrieval + generation behind authenticated APIs with human-in-the-loop (HITL) checkpoints. It’s the reference architecture for agentic Q&A, copilots, and workflow automation where answers must respect permissions and be audit-ready. Production implementations are variations of RAG (retrieval-augmented generation) hardened with LLM guardrails, governance, and OpenTelemetry-backed tracing.

How do you integrate cleanly with the existing stack?

Use standard service boundaries (REST/JSON, gRPC) over a storage layer your org already trusts. For tables, Iceberg gives ACID, schema evolution, partition evolution, and snapshots—critical for reproducible retrieval and backfills. For vectors, use a system that coexists with SQL filters: pgvector collocates embeddings with business keys and ACL tags in PostgreSQL; dedicated engines like Milvus handle high-QPS ANN with disaggregated storage/compute. In practice, many teams run both: SQL+pgvector for transactional joins and Milvus for heavy retrieval.

Key properties

  • Iceberg tables: ACID, hidden partitioning, snapshot isolation; vendor support across warehouses.
  • pgvector: SQL + vector similarity in one query plan for precise joins and policy enforcement.
  • Milvus: layered, horizontally scalable architecture for large-scale similarity search.

How do agents, humans, and workflows coordinate on one “knowledge fabric”?

Production agents require explicit coordination points where humans approve, correct, or escalate. AWS A2I provides managed HITL loops (private workforces, flow definitions) and is a concrete blueprint for gating low-confidence outputs. Frameworks like LangGraph model these human checkpoints inside agent graphs so approvals are first-class steps in the DAG, not ad hoc callbacks. Use them to gate actions like publishing summaries, filing tickets, or committing code.

Pattern: LLM → confidence/guardrail checks → HITL gate → side-effects. Persist every artifact (prompt, retrieval set, decision) for auditability and future re-runs.

How is reliability enforced before anything reaches the model?

Treat reliability as layered defenses:

  1. Language + content guardrails: Pre-validate inputs/outputs for safety and policy. Options span managed (Bedrock Guardrails) and OSS (NeMo Guardrails, Guardrails AI; Llama Guard). Independent comparisons and a position paper catalog the trade-offs.
  2. PII detection/redaction: Run analyzers on both source docs and model I/O. Microsoft Presidio offers recognizers and masking, with explicit caveats to combine with additional controls.
  3. Access control and lineage: Enforce row-/column-level ACLs and audit across catalogs (Unity Catalog) so retrieval respects permissions; unify lineage and access policies across workspaces.
  4. Retrieval quality gates: Evaluate RAG with reference-free metrics (faithfulness, context precision/recall) using Ragas/related tooling; block or down-rank poor contexts.

How do you scale indexing and retrieval under real traffic?

Two axes matter: ingest throughput and query concurrency.

  • Ingest: Normalize at the lakehouse edge; write to Iceberg for versioned snapshots, then embed asynchronously. This enables deterministic rebuilds and point-in-time re-indexing.
  • Vector serving: Milvus’s shared-storage, disaggregated compute architecture supports horizontal scaling with independent failure domains; use HNSW/IVF/Flat hybrids and replica sets to balance recall/latency.
  • SQL + vector: Keep business joins server-side (pgvector), e.g., WHERE tenant_id = ? AND acl_tag @> ... ORDER BY embedding <-> :q LIMIT k. This avoids N+1 trips and respects policies.
  • Chunking/embedding strategy: Tune chunk size/overlap and semantic boundaries; bad chunking is the silent killer of recall.

For structured+unstructured fusion, prefer hybrid retrieval (BM25 + ANN + reranker) and store structured features next to vectors to support filters and re-ranking features at query time.

How do you monitor beyond logs?

You need traces, metrics, and evaluations stitched together:

  • Distributed tracing: Emit OpenTelemetry spans across ingestion, retrieval, model calls, and tools; LangSmith natively ingests OTEL traces and interoperates with external APMs (Jaeger, Datadog, Elastic). This gives end-to-end timing, prompts, contexts, and costs per request.
  • LLM observability platforms: Compare options (LangSmith, Arize Phoenix, LangFuse, Datadog) by tracing, evals, cost tracking, and enterprise readiness. Independent roundups and matrixes are available.
  • Continuous evaluation: Schedule RAG evals (Ragas/DeepEval/MLflow) on canary sets and live traffic replays; track faithfulness and grounding drift over time.

Add schema profiling/mapping on ingestion to keep observability attached to data shape changes (e.g., new templates, table evolution) and to explain retrieval regressions when upstream sources shift.

Example: doc-to-chat reference flow (signals and gates)

  1. Ingest: connectors → text extraction → normalization → Iceberg write (ACID, snapshots).
  2. Govern: PII scan (Presidio) → redact/mask → catalog registration with ACL policies.
  3. Index: embedding jobs → pgvector (policy-aware joins) and Milvus (high-QPS ANN).
  4. Serve: REST/gRPC → hybrid retrieval → guardrails → LLM → tool use.
  5. HITL: low-confidence paths route to A2I/LangGraph approval steps.
  6. Observe: OTEL traces to LangSmith/APM + scheduled RAG evaluations.

Why “5% AI, 100% software engineering” is accurate in practice?

Most outages and trust failures in agent systems are not model regressions; they’re data quality, permissioning, retrieval decay, or missing telemetry. The controls above—ACID tables, ACL catalogs, PII guardrails, hybrid retrieval, OTEL traces, and human gates—determine whether the same base model is safe, fast, and credibly correct for your users. Invest in these first; swap models later if needed.


References:


Asif Razzaq is the CEO of Marktechpost Media Inc.. As a visionary entrepreneur and engineer, Asif is committed to harnessing the potential of Artificial Intelligence for social good. His most recent endeavor is the launch of an Artificial Intelligence Media Platform, Marktechpost, which stands out for its in-depth coverage of machine learning and deep learning news that is both technically sound and easily understandable by a wide audience. The platform boasts of over 2 million monthly views, illustrating its popularity among audiences.

🔥[Recommended Read] NVIDIA AI Open-Sources ViPE (Video Pose Engine): A Powerful and Versatile 3D Video Annotation Tool for Spatial AI



Source_link

READ ALSO

Joi Chatbot Access, Pricing, and Feature Overview

Qwen Researchers Release Qwen3-TTS: an Open Multilingual TTS Suite with Real-Time Latency and Fine-Grained Voice Control

Related Posts

Joi Chatbot Access, Pricing, and Feature Overview
Al, Analytics and Automation

Joi Chatbot Access, Pricing, and Feature Overview

January 23, 2026
Qwen Researchers Release Qwen3-TTS: an Open Multilingual TTS Suite with Real-Time Latency and Fine-Grained Voice Control
Al, Analytics and Automation

Qwen Researchers Release Qwen3-TTS: an Open Multilingual TTS Suite with Real-Time Latency and Fine-Grained Voice Control

January 23, 2026
Quality Data Annotation for Cardiovascular AI
Al, Analytics and Automation

Quality Data Annotation for Cardiovascular AI

January 23, 2026
A Missed Forecast, Frayed Nerves and a Long Trip Back
Al, Analytics and Automation

A Missed Forecast, Frayed Nerves and a Long Trip Back

January 23, 2026
Microsoft Releases VibeVoice-ASR: A Unified Speech-to-Text Model Designed to Handle 60-Minute Long-Form Audio in a Single Pass
Al, Analytics and Automation

Microsoft Releases VibeVoice-ASR: A Unified Speech-to-Text Model Designed to Handle 60-Minute Long-Form Audio in a Single Pass

January 23, 2026
Slow Down the Machines? Wall Street and Silicon Valley at Odds Over A.I.’s Nearest Future
Al, Analytics and Automation

Slow Down the Machines? Wall Street and Silicon Valley at Odds Over A.I.’s Nearest Future

January 22, 2026
Next Post
How to Migrate Azure File Shares with Zero Downtime

How to Migrate Azure File Shares with Zero Downtime

POPULAR NEWS

Trump ends trade talks with Canada over a digital services tax

Trump ends trade talks with Canada over a digital services tax

June 28, 2025
Communication Effectiveness Skills For Business Leaders

Communication Effectiveness Skills For Business Leaders

June 10, 2025
15 Trending Songs on TikTok in 2025 (+ How to Use Them)

15 Trending Songs on TikTok in 2025 (+ How to Use Them)

June 18, 2025
App Development Cost in Singapore: Pricing Breakdown & Insights

App Development Cost in Singapore: Pricing Breakdown & Insights

June 22, 2025
Google announced the next step in its nuclear energy plans 

Google announced the next step in its nuclear energy plans 

August 20, 2025

EDITOR'S PICK

Five B2B Marketing Themes for 2026 – Beyond B2B Marketing Podcast – TopRank® Marketing

Five B2B Marketing Themes for 2026 – Beyond B2B Marketing Podcast – TopRank® Marketing

January 12, 2026
Google announces ‘voluntary exit program’ for Pixel, Android team

Google announces ‘voluntary exit program’ for Pixel, Android team

June 7, 2025
Crafting a Winning Mother’s Day Advertising Strategy for 2025: Insights and Trends

Crafting a Winning Mother’s Day Advertising Strategy for 2025: Insights and Trends

June 4, 2025

Building Bridges Between Sales & Marketing: A People-First Approach to Unifying Teams

May 28, 2025

About

We bring you the best Premium WordPress Themes that perfect for news, magazine, personal blog, etc. Check our landing page for details.

Follow us

Categories

  • Account Based Marketing
  • Ad Management
  • Al, Analytics and Automation
  • Brand Management
  • Channel Marketing
  • Digital Marketing
  • Direct Marketing
  • Event Management
  • Google Marketing
  • Marketing Attribution and Consulting
  • Marketing Automation
  • Mobile Marketing
  • PR Solutions
  • Social Media Management
  • Technology And Software
  • Uncategorized

Recent Posts

  • B2B Attendees Demand More Value at Events—8 Brands Weigh In
  • Celebrating the art of teaching and the science of learning
  • What is ChatGPT Atlas? – Moz
  • What Still Matters and What Doesn’t
  • About Us
  • Disclaimer
  • Contact Us
  • Privacy Policy
No Result
View All Result
  • Technology And Software
    • Account Based Marketing
    • Channel Marketing
    • Marketing Automation
      • Al, Analytics and Automation
      • Ad Management
  • Digital Marketing
    • Social Media Management
    • Google Marketing
  • Direct Marketing
    • Brand Management
    • Marketing Attribution and Consulting
  • Mobile Marketing
  • Event Management
  • PR Solutions

Are you sure want to unlock this post?
Unlock left : 0
Are you sure want to cancel subscription?