Saturday, June 20, 2026
mGrowTech

Yandex Open-Sources YaFF: A Zero-Copy Wire Format for Protobuf With Near-Struct Read Speed

By Josh
June 20, 2026


TLDR

  • YaFF is Yandex’s open-source zero-copy wire format for Protobuf — Apache 2.0, currently C++, v0.1.0.
  • The .proto file stays the source of truth; only the physical memory layout changes.
  • On Yandex’s benchmarks, the Flat Layout reads hot data ~3.8× faster than FlatBuffers, within 1.2× of a raw C++ struct.
  • Four layouts — Fixed, Flat, Sparse, Dynamic — trade read speed for schema flexibility; Dynamic is the default.
  • YaFF runs in Yandex’s advertising recommendation system, where the company reports 10–20% CPU savings at production scale.
  • Adoption is incremental: drop it into one hot path, with two-way Protobuf conversion at the edges.

Yandex has open-sourced YaFF (Yet another Flat Format) under Apache 2.0. It is a high-performance C++ serialization library. YaFF provides a zero-copy wire format for the Protobuf ecosystem. Your .proto file stays the single source of truth. The format only changes how data sits in memory. It concentrates on server-side runtimes.

What is YaFF 

YaFF is not a replacement for Protobuf. It is an alternative wire format for Protobuf messages. The same .proto schema generates a proto-like C++ API. Reads need no parsing step, so fields come straight from the buffer. Less performance-sensitive code can still parse the wire format back into Protobuf messages. That two-way conversion is what makes module-by-module adoption realistic. You introduce YaFF in one hot path and leave the rest on Protobuf.

The Problem it Targets

Protobuf parsing can consume double-digit percentages of CPU in high-load backends. At scale, that maps to thousands of physical cores. The common zero-copy option is FlatBuffers, also from Google. But FlatBuffers is not a drop-in replacement: it is semantically incompatible with Protobuf. Migrating means duplicated schemas, different schema-evolution rules, and a conversion layer of hand-written field converters. Many teams conclude the cost is not worth it. YaFF aims at that gap: zero-copy reads with Protobuf semantics preserved.

How the Layouts Work

A layout decides how a message is stored in the buffer. It changes only the physical representation, leaving the schema and generated interfaces unchanged. YaFF ships four layouts. Fixed is a plain packed struct with no header and a frozen schema. Flat adds a two-byte header and supports schema evolution. Sparse addresses fields through a meta table, fitting sparse schemas. Dynamic is the default and selects Flat or Sparse at runtime. It uses Flat while the schema permits, then switches to Sparse when evolution breaks flat alignment.

Layout            | Read access               | Per-message overhead | Schema evolution               | Best for
Fixed             | 1 read, 0 branches        | 0 bytes              | Frozen                         | Small inlined primitives
Flat              | 2 reads, 1 branch         | 2 bytes              | Restricted (type preservation) | Dense, hot data
Sparse            | 4 reads, 2 branches       | 6 bytes              | Unrestricted                   | Sparse schemas, free evolution
Dynamic (default) | Flat or Sparse at runtime | 2 or 6 bytes         | Unrestricted                   | General application logic
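
To make the "1 read, 0 branches" versus "2 reads, 1 branch" distinction concrete, here is a minimal sketch of the two access patterns. It is a toy model, not YaFF's actual wire format: the meaning of the two-byte header (assumed here to hold the payload size the writer knew about) and the names ReadFixedI32, ReadFlatI32, and WriteFlat are all hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Toy illustration only: these layouts are NOT YaFF's actual wire format.

// Fixed-style read: one load, no branch. In generated code the offset would
// be a compile-time constant; it is a parameter here to keep the sketch short.
inline std::int32_t ReadFixedI32(const std::uint8_t* msg, std::size_t offset) {
    std::int32_t value;
    std::memcpy(&value, msg + offset, sizeof(value));  // well-defined type punning
    return value;
}

// Flat-style read (assumed design): a two-byte header stores the payload size
// the writer knew about. A reader built from a newer schema checks that the
// field lies inside the payload and otherwise returns the default value.
inline std::int32_t ReadFlatI32(const std::uint8_t* msg, std::size_t offset,
                                std::int32_t default_value) {
    std::uint16_t payload_size;
    std::memcpy(&payload_size, msg, sizeof(payload_size));    // read 1: header
    if (offset + sizeof(std::int32_t) > payload_size) {       // the one branch
        return default_value;                                 // field not written
    }
    return ReadFixedI32(msg + sizeof(payload_size), offset);  // read 2: field
}

// Hypothetical writer for the toy flat message: header, then packed int32 fields.
// Returns the number of bytes written; `out` must be large enough to hold them.
inline std::size_t WriteFlat(std::uint8_t* out, const std::int32_t* fields,
                             std::uint16_t field_count) {
    assert(out != nullptr && fields != nullptr);
    const std::uint16_t payload_size =
        static_cast<std::uint16_t>(field_count * sizeof(std::int32_t));
    std::memcpy(out, &payload_size, sizeof(payload_size));
    std::memcpy(out + sizeof(payload_size), fields, payload_size);
    return sizeof(payload_size) + payload_size;
}
```

In this model, a reader that asks for a field beyond what an older writer produced gets the default back, which is one plausible way a small header can buy schema evolution at the cost of a single extra read and branch.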

Benchmark

Yandex ships a reproducible benchmark suite, built with google/benchmark in a Release build. The numbers below are median nanoseconds per read on an AMD EPYC 7713 with Clang 20.1.8. Lower is faster. In the hot hierarchical case, the Flat Layout reads in 9.79 ns. FlatBuffers needs 37.30 ns, and Protobuf needs 219.35 ns. The raw C++ struct baseline is 8.14 ns. So the Flat Layout reads about 3.8× faster than FlatBuffers here, and about 22× faster than Protobuf. It stays within 1.2× of the raw struct.

Format             | Read time (ns) | Slowdown vs raw struct
Raw C++ struct     | 8.14           | 1.0×
YaFF Flat Layout   | 9.79           | 1.2×
YaFF Sparse Layout | 21.23          | 2.6×
FlatBuffers        | 37.30          | 4.6×
Protobuf           | 219.35         | 26.9×
Median ns per read, hierarchical / hot / no chain caching. Source: https://yaff.tech/docs/en/benchmarks/access 

Note: The absolute numbers depend on the host CPU and memory. The ratios between formats are expected to hold across hardware.
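
The ratios quoted above follow directly from the reported medians. The short sketch below recomputes them; the constant and function names are illustrative, and the only data used are the nanosecond figures from the table.

```cpp
#include <cassert>
#include <cmath>

// Median read times in nanoseconds, copied from the benchmark table.
constexpr double kRawStructNs   = 8.14;
constexpr double kFlatNs        = 9.79;
constexpr double kSparseNs      = 21.23;
constexpr double kFlatBuffersNs = 37.30;
constexpr double kProtobufNs    = 219.35;

// How many times slower `ns` is than `baseline_ns`.
constexpr double Slowdown(double ns, double baseline_ns) {
    return ns / baseline_ns;
}

// Rounds to one decimal place, matching the table's presentation.
inline double RoundToTenth(double x) {
    assert(std::isfinite(x));
    return std::round(x * 10.0) / 10.0;
}
```

For example, Slowdown(kFlatBuffersNs, kFlatNs) is 37.30 / 9.79 ≈ 3.81, the "about 3.8×" figure, and Slowdown(kProtobufNs, kFlatNs) is 219.35 / 9.79 ≈ 22.4, the "about 22×" figure.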

The Compiler Aliasing Detail

FlatBuffers and YaFF both read fields by reinterpreting raw memory as the target type. That type-punning leaves type-based alias analysis (TBAA) without strong enough facts, so LLVM’s alias analysis falls back to a conservative MayAlias verdict. The compiler then cannot prove that repeated accesses are safe to reuse. Writing root.intermediate().leaf().a() twice re-walks the tree each time. YaFF adds annotations in its generated code that tell the compiler when reuse is safe. As long as nothing writes to the relevant memory between reads, those annotations can often let the compiler reuse the access chain without manual caching.
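
The cost of re-walking a chain, and the manual workaround of hoisting it into a local, can be shown with a toy zero-copy view. Everything below is hypothetical (the Root, Intermediate, and Leaf types, the offset-based buffer, and the g_hops counter); it is not YaFF's or FlatBuffers' generated code. The counter makes each pointer hop observable, standing in for the loads a compiler cannot merge when it must assume MayAlias.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Toy zero-copy views (hypothetical, not any library's generated code).
static int g_hops = 0;  // counts offset lookups so the effect is observable

// Follows a 32-bit relative offset stored at `p`.
inline const std::uint8_t* Follow(const std::uint8_t* p) {
    std::uint32_t off;
    std::memcpy(&off, p, sizeof(off));
    ++g_hops;
    return p + off;
}

inline std::int32_t LoadI32(const std::uint8_t* p) {
    std::int32_t v;
    std::memcpy(&v, p, sizeof(v));
    return v;
}

struct Leaf {
    const std::uint8_t* p;
    std::int32_t a() const { return LoadI32(p); }
    std::int32_t b() const { return LoadI32(p + 4); }
};
struct Intermediate {
    const std::uint8_t* p;
    Leaf leaf() const { return Leaf{Follow(p)}; }
};
struct Root {
    const std::uint8_t* p;
    Intermediate intermediate() const { return Intermediate{Follow(p)}; }
};

// Re-walks the chain for each field: two hops per access, four in total.
inline std::int32_t SumNaive(const Root& root) {
    return root.intermediate().leaf().a() + root.intermediate().leaf().b();
}

// Hoists the chain into a local view: two hops in total.
inline std::int32_t SumHoisted(const Root& root) {
    const Leaf leaf = root.intermediate().leaf();
    return leaf.a() + leaf.b();
}

// Lays out root (offset 0) -> intermediate (offset 4) -> leaf {a, b} (offset 8).
inline void BuildToyBuffer(std::uint8_t (&buf)[16], std::int32_t a, std::int32_t b) {
    const std::uint32_t step = 4;
    std::memcpy(buf, &step, sizeof(step));
    std::memcpy(buf + 4, &step, sizeof(step));
    std::memcpy(buf + 8, &a, sizeof(a));
    std::memcpy(buf + 12, &b, sizeof(b));
}
```

SumHoisted is the hand-written form of the reuse that, according to the article, YaFF's annotations aim to let the compiler perform on its own.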

Where It Fits: Use Cases

YaFF targets systems where you control both producer and consumer. Recommendation and ad-serving backends are the clearest fit. According to Yandex, YaFF runs in its advertising recommendation system, where it reports 10–20% CPU savings at production scale. Memory-mapped indexes are a second fit. A host can hold tens of gigabytes of local data. Those mmap-able indexes survive service restarts without re-parsing. Search indexes, feature stores, and feed services share that read-heavy profile. The planned Columnar Layout targets analytics and ML pipelines with large repeated fields. YaFF can also be more compact than FlatBuffers, which helps cache behavior.
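
The memory-mapped index case can be sketched with plain POSIX calls. This is a minimal illustration under stated assumptions, not YaFF code: the file format (an IndexHeader followed by packed int32 records), the magic value, and the names WriteToyIndex and MappedIndex are all hypothetical. The point is that a restarted process only maps the file and reads fields in place, with no parse step.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstdio>
#include <cstring>
#include <fcntl.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

// Toy on-disk index (hypothetical format): a header followed by int32 records.
struct IndexHeader {
    std::uint32_t magic;
    std::uint32_t record_count;
};
constexpr std::uint32_t kToyMagic = 0x58444E49u;

// Writes the toy index once; later processes only map it.
inline bool WriteToyIndex(const char* path, const std::int32_t* records,
                          std::uint32_t count) {
    std::FILE* f = std::fopen(path, "wb");
    if (f == nullptr) return false;
    const IndexHeader header{kToyMagic, count};
    bool ok = std::fwrite(&header, sizeof(header), 1, f) == 1;
    ok = ok && std::fwrite(records, sizeof(std::int32_t), count, f) == count;
    return std::fclose(f) == 0 && ok;
}

// Read-only view over the mapped file. Fields are copied out with memcpy,
// so there is no deserialization pass over the whole file.
class MappedIndex {
public:
    explicit MappedIndex(const char* path) {
        const int fd = ::open(path, O_RDONLY);
        if (fd < 0) return;
        struct stat st {};
        if (::fstat(fd, &st) == 0 &&
            static_cast<std::size_t>(st.st_size) >= sizeof(IndexHeader)) {
            const std::size_t size = static_cast<std::size_t>(st.st_size);
            void* p = ::mmap(nullptr, size, PROT_READ, MAP_PRIVATE, fd, 0);
            if (p != MAP_FAILED) {
                base_ = static_cast<const std::uint8_t*>(p);
                size_ = size;
            }
        }
        ::close(fd);  // the mapping stays valid after the descriptor is closed
        if (base_ != nullptr && !HeaderIsValid()) Unmap();
    }
    ~MappedIndex() { Unmap(); }
    MappedIndex(const MappedIndex&) = delete;
    MappedIndex& operator=(const MappedIndex&) = delete;

    bool ok() const { return base_ != nullptr; }

    // Precondition for both accessors: ok() is true.
    std::uint32_t record_count() const { return Header().record_count; }
    std::int32_t record(std::uint32_t i) const {
        assert(ok() && i < record_count());
        std::int32_t value;
        std::memcpy(&value, base_ + sizeof(IndexHeader) + i * sizeof(value),
                    sizeof(value));
        return value;
    }

private:
    IndexHeader Header() const {
        IndexHeader h;
        std::memcpy(&h, base_, sizeof(h));
        return h;
    }
    // Rejects files with the wrong magic or a record count larger than the file.
    bool HeaderIsValid() const {
        const IndexHeader h = Header();
        const std::size_t payload = size_ - sizeof(IndexHeader);
        return h.magic == kToyMagic &&
               h.record_count <= payload / sizeof(std::int32_t);
    }
    void Unmap() {
        if (base_ != nullptr) ::munmap(const_cast<std::uint8_t*>(base_), size_);
        base_ = nullptr;
        size_ = 0;
    }
    const std::uint8_t* base_ = nullptr;
    std::size_t size_ = 0;
};
```

Startup cost in this sketch is one mmap call plus a header check, regardless of how many records the file holds; pages are faulted in only when a record is actually read.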

A Look at the Code

The read path mirrors Protobuf, minus the parse step.

#include <string_view>

#include "feed.pb.h"     // generated by protoc
#include "feed.yaff.h"   // generated by yaff_generate()

// Stands in for application code that produces the Protobuf message.
feed::FeedResponse LoadFeedResponse();

void RoundTrip() {
    // 1. Serialize an existing Protobuf message into a YaFF buffer.
    feed::FeedResponse proto = LoadFeedResponse();
    const auto buffer = yaff::Serialize<protoyaff::feed::FeedResponse>(proto);

    // 2. Read fields directly from the buffer. There is no parsing step.
    //    The views below point into `buffer`, so it must outlive them.
    const auto& response = yaff::ReadMessage<protoyaff::feed::FeedResponse>(buffer.Data());
    for (const auto& item : response.items()) {
        std::string_view title  = item.title();
        std::string_view author = item.author().name();  // empty if author is unset
    }

    // 3. Convert back to Protobuf when a consumer needs the parsed message.
    feed::FeedResponse restored;
    response.ParseTo(restored);
}

You add YaFF through CMake (find_package) or Conan. Code generation runs protobuf_generate() then yaff_generate(). Generated YaFF types live in the protoyaff::<package> namespace. Most projects only link yaff::core and yaff::proto.


Resources:

Check out the GitHub repository and Documentation.



