Yandex Open-Sources YaFF: A Zero-Copy Wire Format for Protobuf With Near-Struct Read Speed

TLDR

YaFF is Yandex’s open-source zero-copy wire format for Protobuf — Apache 2.0, currently C++, v0.1.0.
The .proto file stays the source of truth; only the physical memory layout changes.
On Yandex’s benchmarks, the Flat Layout reads hot data ~3.8× faster than FlatBuffers, within 1.2× of a raw C++ struct.
Four layouts — Fixed, Flat, Sparse, Dynamic — trade read speed for schema flexibility; Dynamic is the default.
YaFF runs in its advertising recommendation system, where it reports 10–20% CPU savings at production scale.
Adoption is incremental: drop it into one hot path, with two-way Protobuf conversion at the edges.

Yandex has open-sourced YaFF (Yet another Flat Format) under Apache 2.0. It is a high-performance C++ serialization library. YaFF provides a zero-copy wire format for the Protobuf ecosystem. Your .proto file stays the single source of truth. The format only changes how data sits in memory. It concentrates on server-side runtimes.

What is YaFF

YaFF is not a replacement for Protobuf. It is an alternative wire format for Protobuf messages. The same .proto schema generates a proto-like C++ API. Reads need no parsing step, so fields come straight from the buffer. Less performance-sensitive code can still parse the wire format back into Protobuf messages. That two-way conversion is what makes module-by-module adoption realistic. You introduce YaFF in one hot path and leave the rest on Protobuf.

NVIDIA AI Introduce SpatialClaw: A Training-Free Agent That Treats Code as the Action Interface for Spatial Reasoning

A better way to model the behavior of metal alloys | MIT News

The Problem it Targets

Protobuf parsing can consume double-digit percentages of CPU in high-load backends. At scale, that maps to thousands of physical cores. The common zero-copy option is FlatBuffers, also from Google. But FlatBuffers is not a Protobuf drop-in and requires maintaining a separate schema and conversion layer. semantically incompatible with Protobuf. Migrating means duplicated schemas, different schema-evolution rules , and hand-written field converters. Many teams conclude the cost is not worth it. YaFF aims at that gap: zero-copy reads with Protobuf semantics preserved.

How the Layouts Work

A layout decides how a message is stored in the buffer. It changes only the physical representation, leaving the schema and generated interfaces unchanged. YaFF ships four layouts. Fixed is a plain packed struct with no header and a frozen schema. Flat adds a two-byte header and supports schema evolution. Sparse addresses fields through a meta table, fitting sparse schemas. Dynamic is the default and selects Flat or Sparse at runtime. It uses Flat while the schema permits, then switches to Sparse when evolution breaks flat alignment.

Layout	Read access	Per-message overhead	Schema evolution	Best for
Fixed	1 read, 0 branches	0 bytes	Frozen	Small inlined primitives
Flat	2 reads, 1 branch	2 bytes	Restricted (type preservation)	Dense, hot data
Sparse	4 reads, 2 branches	6 bytes	Unrestricted	Sparse schemas, free evolution
Dynamic (default)	Flat or Sparse at runtime	2 or 6 bytes	Unrestricted	General application logic

Benchmark

Yandex ships a reproducible benchmark suite, built with google/benchmark in a Release build. The numbers below are median nanoseconds per read on an AMD EPYC 7713 with Clang 20.1.8. Lower is faster. In the hot hierarchical case, the Flat Layout reads in 9.79 ns. FlatBuffers needs 37.30 ns, and Protobuf needs 219.35 ns. The raw C++ struct baseline is 8.14 ns. So the Flat Layout reads about 3.8× faster than FlatBuffers here, and about 22× faster than Protobuf. It stays within 1.2× of the raw struct.

Format	Read time (ns)	Slowdown vs raw struct
Raw C++ struct	8.14	1.0×
YaFF Flat Layout	9.79	1.2×
YaFF Sparse Layout	21.23	2.6×
FlatBuffers	37.30	4.6×
Protobuf	219.35	26.9×

Median ns per read, hierarchical / hot / no chain caching. Source: https://yaff.tech/docs/en/benchmarks/access

Note: The absolute numbers depend on the host CPU and memory. The ratios between formats are expected to hold across hardware.

The Compiler Aliasing Detail

FlatBuffers and YaFF both read fields by reinterpreting raw memory as the target type. That type-punning leaves TBAA without strong enough facts. So LLVM’s alias analysis falls back to a conservative MayAlias verdict. The compiler then cannot prove that repeated accesses are safe to reuse. Writing root.intermediate().leaf().a() twice re-walks the tree each time. YaFF adds annotations in its generated code that tell the compiler when reuse is safe. YaFF’s generated-code annotations can often help the compiler reuse the access chain, as long as the relevant memory is not modified between reads. As long as nothing writes to memory between reads, YaFF caches the access chain on its own.

Where It Fits: Use Cases

YaFF targets systems where you control both producer and consumer. Recommendation and ad-serving backends are the clearest fit. According to Yandex, YaFF runs in its advertising recommendation system, where it reports 10–20% CPU savings at production scale. Memory-mapped indexes are a second fit. A host can hold tens of gigabytes of local data. Those mmap-able indexes survive service restarts without re-parsing. Search indexes, feature stores, and feed services share that read-heavy profile. The planned Columnar Layout targets analytics and ML pipelines with large repeated fields. YaFF can also be more compact than FlatBuffers, which helps cache behavior.

A Look at the Code

The read path mirrors Protobuf, minus the parse step.

#include "feed.pb.h"     // generated by protoc
#include "feed.yaff.h"   // generated by yaff_generate()

// 1. Serialize an existing Protobuf message into a YaFF buffer.
feed::FeedResponse proto = LoadFeedResponse();
const auto buffer = yaff::Serialize<protoyaff::feed::FeedResponse>(proto);

// 2. Read fields directly from the buffer. There is no parsing step.
const auto& response = yaff::ReadMessage<protoyaff::feed::FeedResponse>(buffer.Data());
for (const auto& item : response.items()) {
    std::string_view title  = item.title();
    std::string_view author = item.author().name();  // empty if author is unset
}

// 3. Convert back to Protobuf when a consumer needs the parsed message.
feed::FeedResponse restored;
response.ParseTo(restored);

You add YaFF through CMake (find_package) or Conan. Code generation runs protobuf_generate() then yaff_generate(). Generated YaFF types live in the protoyaff::<package> namespace. Most projects only link yaff::core and yaff::proto.