
A weekend ‘vibe code’ hack by Andrej Karpathy quietly sketches the missing layer of enterprise AI orchestration

By Josh
November 26, 2025



This weekend, Andrej Karpathy, the former director of AI at Tesla and a founding member of OpenAI, decided he wanted to read a book. But he did not want to read it alone. He wanted to read it accompanied by a committee of artificial intelligences, each offering its own perspective, critiquing the others, and eventually synthesizing a final answer under the guidance of a "Chairman."


To make this happen, Karpathy wrote what he called a "vibe code project" — a piece of software written quickly, largely by AI assistants, intended for fun rather than function. He posted the result, a repository called "LLM Council," to GitHub with a stark disclaimer: "I’m not going to support it in any way… Code is ephemeral now and libraries are over."

Yet, for technical decision-makers across the enterprise landscape, looking past the casual disclaimer reveals something far more significant than a weekend toy. In a few hundred lines of Python and JavaScript, Karpathy has sketched a reference architecture for the most critical, undefined layer of the modern software stack: the orchestration middleware sitting between corporate applications and the volatile market of AI models.

As companies finalize their platform investments for 2026, LLM Council offers a stripped-down look at the "build vs. buy" reality of AI infrastructure. It demonstrates that while the logic of routing and aggregating AI models is surprisingly simple, the operational wrapper required to make it enterprise-ready is where the true complexity lies.

How the LLM Council works: Four AI models debate, critique, and synthesize answers

To the casual observer, the LLM Council web application looks almost identical to ChatGPT. A user types a query into a chat box. But behind the scenes, the application triggers a sophisticated, three-stage workflow that mirrors how human decision-making bodies operate.

First, the system dispatches the user’s query to a panel of frontier models. In Karpathy’s default configuration, this includes OpenAI’s GPT-5.1, Google’s Gemini 3.0 Pro, Anthropic’s Claude Sonnet 4.5, and xAI’s Grok 4. These models generate their initial responses in parallel.

In the second stage, the software performs a peer review. Each model is fed the anonymized responses of its counterparts and asked to evaluate them based on accuracy and insight. This step transforms the AI from a generator into a critic, forcing a layer of quality control that is rare in standard chatbot interactions.

Finally, a designated "Chairman LLM" — currently configured as Google’s Gemini 3 — receives the original query, the individual responses, and the peer rankings. It synthesizes this mass of context into a single, authoritative answer for the user.
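
In code, that pipeline is short enough to sketch in a few dozen lines. The snippet below is an illustrative reconstruction of the flow described above, not Karpathy's actual implementation: the model slugs are placeholders, and query_model stands in for whatever broker call the application makes.

```python
# Illustrative reconstruction of the three-stage council flow, not the
# actual LLM Council code. Model slugs and query_model are placeholders.
import asyncio

COUNCIL_MODELS = [
    "openai/gpt-5.1",
    "google/gemini-3-pro",
    "anthropic/claude-sonnet-4.5",
    "x-ai/grok-4",
]
CHAIRMAN_MODEL = "google/gemini-3-pro"

async def query_model(model: str, prompt: str) -> str:
    """Placeholder for a single broker call (sketched concretely below)."""
    raise NotImplementedError

async def run_council(user_query: str) -> str:
    # Stage 1: every council member drafts an answer in parallel.
    drafts = await asyncio.gather(
        *(query_model(m, user_query) for m in COUNCIL_MODELS)
    )

    # Stage 2: each member reviews the anonymized drafts of its peers.
    review_prompt = (
        f"Question: {user_query}\n\n"
        + "\n\n".join(f"Response {i + 1}:\n{d}" for i, d in enumerate(drafts))
        + "\n\nRank these responses by accuracy and insight."
    )
    rankings = await asyncio.gather(
        *(query_model(m, review_prompt) for m in COUNCIL_MODELS)
    )

    # Stage 3: the chairman synthesizes drafts and rankings into one answer.
    chairman_prompt = (
        f"Question: {user_query}\n\n"
        "Council responses:\n" + "\n\n".join(drafts)
        + "\n\nPeer rankings:\n" + "\n\n".join(rankings)
        + "\n\nSynthesize a single, final answer."
    )
    return await query_model(CHAIRMAN_MODEL, chairman_prompt)
```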

Karpathy noted that the results were often surprising. "Quite often, the models are surprisingly willing to select another LLM's response as superior to their own," he wrote on X (formerly Twitter). He described using the tool to read book chapters, observing that the models consistently praised GPT-5.1 as the most insightful while rating Claude the lowest. However, Karpathy’s own qualitative assessment diverged from his digital council; he found GPT-5.1 "too wordy" and preferred the "condensed and processed" output of Gemini.

FastAPI, OpenRouter, and the case for treating frontier models as swappable components

For CTOs and platform architects, the value of LLM Council lies not in its literary criticism, but in its construction. The repository serves as a primary document showing exactly what a modern, minimal AI stack looks like in late 2025.

The application is built on a "thin" architecture. The backend uses FastAPI, a modern Python framework, while the frontend is a standard React application built with Vite. Data storage is handled not by a complex database, but by simple JSON files written to the local disk.
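
The storage layer, in particular, can be that minimal. The sketch below shows the general pattern of writing conversations as plain JSON files rather than rows in a database; the directory layout and function names are illustrative assumptions, not taken from the repository.

```python
# Illustrative sketch of flat-file persistence: each conversation is a JSON
# file on local disk, no database involved. Paths and names are assumptions.
import json
from pathlib import Path

DATA_DIR = Path("data/conversations")
DATA_DIR.mkdir(parents=True, exist_ok=True)

def save_conversation(conversation_id: str, messages: list[dict]) -> None:
    """Overwrite the conversation file with the full message history."""
    path = DATA_DIR / f"{conversation_id}.json"
    path.write_text(json.dumps(messages, indent=2))

def load_conversation(conversation_id: str) -> list[dict]:
    """Return the message history, or an empty list for a new conversation."""
    path = DATA_DIR / f"{conversation_id}.json"
    return json.loads(path.read_text()) if path.exists() else []
```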

The linchpin of the entire operation is OpenRouter, an API aggregator that normalizes the differences between various model providers. By routing requests through this single broker, Karpathy avoided writing separate integration code for OpenAI, Google, and Anthropic. The application does not know or care which company provides the intelligence; it simply sends a prompt and awaits a response.
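
Concretely, routing through a single broker means the entire provider integration can collapse into one HTTP call. The sketch below assumes OpenRouter's OpenAI-compatible chat completions endpoint; the model slug and environment variable name are illustrative.

```python
# Hedged sketch of a single-broker call via OpenRouter's OpenAI-compatible
# endpoint. The model slug and env var name are illustrative assumptions.
import os
import requests

OPENROUTER_URL = "https://openrouter.ai/api/v1/chat/completions"

def ask(model: str, prompt: str) -> str:
    """Send one prompt to one model through the broker and return its reply."""
    response = requests.post(
        OPENROUTER_URL,
        headers={"Authorization": f"Bearer {os.environ['OPENROUTER_API_KEY']}"},
        json={
            "model": model,  # e.g. "openai/gpt-5.1" (placeholder slug)
            "messages": [{"role": "user", "content": prompt}],
        },
        timeout=120,
    )
    response.raise_for_status()
    return response.json()["choices"][0]["message"]["content"]
```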

This design choice highlights a growing trend in enterprise architecture: the commoditization of the model layer. By treating frontier models as interchangeable components that can be swapped by editing a single line in a configuration file — specifically the COUNCIL_MODELS list in the backend code — the architecture protects the application from vendor lock-in. If a new model from Meta or Mistral tops the leaderboards next week, it can be added to the council in seconds.

What's missing from prototype to production: Authentication, PII redaction, and compliance

While the core logic of LLM Council is elegant, it also serves as a stark illustration of the gap between a "weekend hack" and a production system. For an enterprise platform team, cloning Karpathy’s repository is merely step one of a marathon.

A technical audit of the code reveals the missing "boring" infrastructure that commercial vendors sell for premium prices. The system lacks authentication; anyone with access to the web interface can query the models. There is no concept of user roles, meaning a junior developer has the same access rights as the CIO.

Furthermore, the governance layer is nonexistent. In a corporate environment, sending data to four different external AI providers simultaneously triggers immediate compliance concerns. There is no mechanism here to redact Personally Identifiable Information (PII) before it leaves the local network, nor is there an audit log to track who asked what.
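
For comparison, even a minimal governance shim would have to intercept every outbound prompt along these lines. The patterns and log path below are crude, illustrative placeholders; a real compliance layer would rely on vetted PII-detection tooling rather than a pair of regexes.

```python
# Illustrative only: naive pre-flight redaction and an append-only audit log
# applied before a prompt leaves the network. Patterns and paths are placeholders.
import json
import re
import time
from pathlib import Path

EMAIL = re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+")
US_SSN = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")
AUDIT_LOG = Path("audit.log")

def redact(text: str) -> str:
    """Mask obvious PII patterns; real systems use dedicated detection services."""
    return US_SSN.sub("[SSN]", EMAIL.sub("[EMAIL]", text))

def audited_prompt(user_id: str, prompt: str) -> str:
    """Record who asked what, then return the redacted prompt for dispatch."""
    clean = redact(prompt)
    with AUDIT_LOG.open("a") as log:
        log.write(json.dumps({"ts": time.time(), "user": user_id, "prompt": clean}) + "\n")
    return clean
```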

Reliability is another open question. The system assumes the OpenRouter API is always up and that the models will respond in a timely fashion. It lacks the circuit breakers, fallback strategies, and retry logic that keep business-critical applications running when a provider suffers an outage.
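
Closing that gap typically means wrapping every broker call in retry and fallback logic along these lines; the retry count, backoff schedule, and fallback model below are arbitrary illustrative choices, and `ask` refers to the placeholder broker function sketched earlier.

```python
# Illustrative hardening around the single-broker call sketched earlier:
# bounded retries with exponential backoff, then one fallback model.
import time

def ask_with_fallback(model: str, prompt: str, fallback_model: str, retries: int = 3) -> str:
    for attempt in range(retries):
        try:
            return ask(model, prompt)      # `ask` from the broker sketch above
        except Exception:
            time.sleep(2 ** attempt)       # exponential backoff: 1s, 2s, 4s
    # Primary model kept failing; try the fallback once before surfacing an error.
    return ask(fallback_model, prompt)
```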

These absences are not flaws in Karpathy’s code — he explicitly stated he does not intend to support or improve the project — but they define the value proposition for the commercial AI infrastructure market.

Companies like LangChain, AWS Bedrock, and various AI gateway startups are essentially selling the "hardening" around the core logic that Karpathy demonstrated. They provide the security, observability, and compliance wrappers that turn a raw orchestration script into a viable enterprise platform.

Why Karpathy believes code is now "ephemeral" and traditional software libraries are obsolete

Perhaps the most provocative aspect of the project is the philosophy under which it was built. Karpathy described the development process as "99% vibe-coded," implying he relied heavily on AI assistants to generate the code rather than writing it line-by-line himself.

"Code is ephemeral now and libraries are over, ask your LLM to change it in whatever way you like," he wrote in the repository’s documentation.

This statement marks a radical shift in software engineering philosophy. Traditionally, companies build internal libraries and abstractions to manage complexity, maintaining them for years. Karpathy is suggesting a future where code is treated as "promptable scaffolding" — disposable, easily rewritten by AI, and not meant to last.

For enterprise decision-makers, this poses a difficult strategic question. If internal tools can be "vibe coded" in a weekend, does it make sense to buy expensive, rigid software suites for internal workflows? Or should platform teams empower their engineers to generate custom, disposable tools that fit their exact needs for a fraction of the cost?

When AI models judge AI: The dangerous gap between machine preferences and human needs

Beyond the architecture, the LLM Council project inadvertently shines a light on a specific risk in automated AI deployment: the divergence between human and machine judgment.

Karpathy’s observation that his models preferred GPT-5.1, while he preferred Gemini, suggests that AI models may have shared biases. They might favor verbosity, specific formatting, or rhetorical confidence that does not necessarily align with human business needs for brevity and accuracy.

As enterprises increasingly rely on "LLM-as-a-Judge" systems to evaluate the quality of their customer-facing bots, this discrepancy matters. If the automated evaluator consistently rewards "wordy and sprawled" answers while human customers want concise solutions, the metrics will show success while customer satisfaction plummets. Karpathy’s experiment suggests that relying solely on AI to grade AI is a strategy fraught with hidden alignment issues.

What enterprise platform teams can learn from a weekend hack before building their 2026 stack

Ultimately, LLM Council acts as a Rorschach test for the AI industry. For the hobbyist, it is a fun way to read books. For the vendor, it is a threat, proving that the core functionality of their products can be replicated in a few hundred lines of code.

But for the enterprise technology leader, it is a reference architecture. It demystifies the orchestration layer, showing that the technical challenge is not in routing the prompts, but in governing the data.

As platform teams head into 2026, many will likely find themselves staring at Karpathy’s code, not to deploy it, but to understand it. It proves that a multi-model strategy is not technically out of reach. The question remains whether companies will build the governance layer themselves or pay someone else to wrap the "vibe code" in enterprise-grade armor.


