Cap’n Proto

Introduction

Cap’n Proto is an insanely fast data interchange format and capability-based RPC system. Think JSON, except binary. Or think Protocol Buffers , except faster. In fact, in benchmarks, Cap’n Proto is INFINITY TIMES faster than Protocol Buffers.

This benchmark is, of course, unfair. It is only measuring the time to encode and decode a message in memory. Cap’n Proto gets a perfect score because there is no encoding/decoding step . The Cap’n Proto encoding is appropriate both as a data interchange format and an in-memory representation, so once your structure is built, you can simply write the bytes straight out to disk!

But doesn’t that mean the encoding is platform-specific?

NO! The encoding is defined byte-for-byte independent of any platform. However, it is designed to be efficiently manipulated on common modern CPUs. Data is arranged like a compiler would arrange a struct with fixed widths, fixed offsets, and proper alignment. Variable-sized elements are embedded as pointers. Pointers are offset-based rather than absolute so that messages are position-independent. Integers use little-endian byte order because most CPUs are little-endian, and even big-endian CPUs usually have instructions for reading little-endian data.

Doesn’t that make backwards-compatibility hard?

Not at all! New fields are always added to the end of a struct (or replace padding space), so existing field positions are unchanged. The recipient simply needs to do a bounds check when reading each field. Fields are numbered in the order in which they were added, so Cap’n Proto always knows how to arrange them for backwards-compatibility.

Won’t fixed-width integers, unset optional fields, and padding waste space on the wire?

Yes. However, since all these extra bytes are zeros, when bandwidth matters, we can apply an extremely fast Cap’n-Proto-specific compression scheme to remove them. Cap’n Proto calls this “packing” the message; it achieves similar (better, even) message sizes to protobuf encoding, and it’s still faster.

When bandwidth really matters, you should apply general-purpose compression, like zlib or LZ4 , regardless of your encoding format.

Isn’t this all horribly insecure?

No no no! To be clear, we’re NOT just casting a buffer pointer to a struct pointer and calling it a day.

Cap’n Proto generates classes with accessor methods that you use to traverse the message. These accessors validate pointers before following them. If a pointer is invalid (e.g. out-of-bounds), the library can throw an exception or simply replace the value with a default / empty object (your choice).

Thus, Cap’n Proto checks the structural integrity of the message just like any other serialization protocol would. And, just like any other protocol, it is up to the app to check the validity of the content.

Cap’n Proto was built to be used in Sandstorm.io , where security is a major concern. As of this writing, Cap’n Proto has not undergone a security review, therefore we suggest caution when handling messages from untrusted sources. That said, our response to security issues was once described by security guru Ben Laurie as “the most awesome response I’ve ever had.” (Please report all security issues to security@sandstorm.io .)

Are there other advantages?

Glad you asked!

Incremental reads: It is easy to start processing a Cap’n Proto message before you have received all of it since outer objects appear entirely before inner objects (as opposed to most encodings, where outer objects encompass inner objects). Random access: You can read just one field of a message without parsing the whole thing. mmap: Read a large Cap’n Proto file by memory-mapping it. The OS won’t even read in the parts that you don’t access. Inter-language communication: Calling C++ code from, say, Java or python tends to be painful or slow. With Cap’n Proto, the two languages can easily operate on the same in-memory data structure. Inter-process communication: Multiple processes running on the same machine can share a Cap’n Proto message via shared memory. No need to pipe data through the kernel. Calling another process can be just as fast and easy as calling another thread. Arena allocation: Manipulating Protobuf objects tends to be bogged down by memory allocation, unless you are very careful about object reuse. Cap’n Proto objects are always allocated in an “arena” or “region” style, which is faster and promotes cache locality. Tiny generated code: Protobuf generates dedicated parsing and serialization code for every message type, and this code tends to be enormous. Cap’n Proto generated code is smaller by an order of magnitude or more. In fact, usually it’s no more than some inline accessor methods! Tiny runtime library: Due to the simplicity of the Cap’n Proto format, the runtime library can be much smaller. Time-traveling RPC: Cap’n Proto features an RPC system that implementstime travel such that call results are returned to the client before the request even arrives at the server!

Why do you pick on Protocol Buffers so much?

Because it’s easy to pick on myself. :) I, Kenton Varda, was the primary author of Protocol Buffers version 2, which is the version that Google released open source. Cap’n Proto is the result of years of experience working on Protobufs, listening to user feedback, and thinking about how things could be done better.

Note that I no longer work for Google. Cap’n Proto is not, and never has been, affiliated with Google; in fact, it is a property of Sandstorm.io , of which I am co-founder.

OK, how do I get started?

To install Cap’n Proto, head over to theinstallation page. If you’d like to help hack on Cap’n Proto, such as by writing bindings in other languages, let us know on the discussion group . If you’d like to receive e-mail udpates about future releases, add yourself to the announcement list .

Cap’n Proto

Trending Articles

《沈冰自述——我和周永康的故事》全本

Moog - Subsequent 25

出售: 林憶蓮•回來愛的身邊 (東芝1A1頭版)

筆記 - 使用 PowerShell 清除停用 AD 帳號與 OU

df-dferh-01 中国区 Android 安装 Google Play Store 后报错的解决办法

「一棒接一棒、棒棒強棒」108學年度家長會長交接典禮

吸烟与MBTI类型判断捷径 (豆瓣 INFJ的奇幻之旅小组)

acermark龍璿國際展出多款包裝設備

枋寮北勢寮隆山宮睽違12年再辦迎王祭典

日本女优有村千佳COS集锦：狂三&黑白岩&亚丝娜&绫波丽

有遇到过这个问题么。/jsb-videoplayer.js not found, possible missing file.

MAS v2.8 magicgenius 汉化版 - 11.11更新

出售: Monster Cable Interlink Reference 2

福建佛教人士望云和尚(林斌)的九仙禅寺被强行收走，望云妈妈被赶出寺庙

R 语言中的OpenBLAS*和英特尔® 数学核心函数库的性能比较

[转载]煞貢、直星、人專吉日\金神七煞歌

HAKERS哈克士戶外 12月8~14日廠拍

OBS Studio 23.2.1 免安裝中文版 - 免費網路實況廣播軟體實況主必備軟體取代Fraps

<請教>行駛中安卓機會重新開機

Udp2raw-tunnel 及其一键安装脚本