Hacker News

Another downside of gRPC is that (de)serialization can be relatively slow. One of my pet projects is a tagging server that takes a list of tags and returns a list of objects that match, retrieved from RocksDB. I tested gRPC (with both proto and FlatBuffers) and Cap'n Proto. For all of them, the process looked like:

Insert: Object comes in (built as the RPC system's message type in each case), object is assigned an ID (which is added before serialization), object is written to RocksDB, object is serialized and indexed.

Query: List of strings comes in, strings are looked up in the tag index, the results are intersected, objects are retrieved from RocksDB, objects are deserialized, objects are added to a list, objects are returned to the client.
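The two paths above can be sketched in a few lines of plain Python. This is only an illustration of the shape of the pipeline: a dict stands in for RocksDB, pickle stands in for the RPC serializer, and all names are hypothetical, not the actual project's code.

```python
# Toy sketch of the insert/query paths: dict = RocksDB stand-in,
# pickle = serializer stand-in. Illustrative only.
import pickle

store = {}       # object id -> serialized object bytes
tag_index = {}   # tag -> set of object ids

def insert(obj_id, tags, payload):
    obj = {"id": obj_id, "tags": tags, "payload": payload}
    store[obj_id] = pickle.dumps(obj)            # serialize and write
    for tag in tags:                             # index by each tag
        tag_index.setdefault(tag, set()).add(obj_id)

def query(tags):
    # Look each tag up, intersect the id sets, fetch and deserialize.
    id_sets = [tag_index.get(t, set()) for t in tags]
    matching = set.intersection(*id_sets) if id_sets else set()
    return [pickle.loads(store[i]) for i in sorted(matching)]
```

In a real server the serialize/deserialize steps here are exactly where the gRPC-vs-Cap'n Proto cost difference shows up.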

FlatBuffers was unfortunately a no-go since it doesn't permit modification of fields that haven't yet been set and it didn't have a sane way of making a copy (which would also be slow).

Protocol Buffers worked, but the pure-Python driver is incredibly slow, and the C++-backed driver for Python was non-default and marked "experimental". I eventually adapted to it, but it was still quite slow even on the server side. From memory, a response with a C++ client and server took ~9ms, 4-5ms of which were spent on serialization/deserialization.

Cap'n Proto eventually won for me. The Python driver is unstable and memory usage soars like an eagle until Linux shoots it down with an OOM, but the server was much faster. Typical response times were closer to 3ms, most of which was spent in RocksDB or the actual index lookup. A downside, though, is that Cap'n Proto has its own odd built-in library for async work and doesn't really support threading.



gRPC is slow with Python, but most people don't pick Python if they are serious about performance. gRPC has continuous benchmarks running[1] to track and improve perf. I agree the Python implementation has some serious problems, but don't let it be the whole story. The C++ implementation is 100x faster.

[1] https://performance-dot-grpc-testing.appspot.com/explore?das...


For the non-ping-pong benchmark [1], C++ (300k RPS) looks to be around 2x faster than Java (150k RPS), while the latter is around 2x faster than Go (75k RPS).

It looks like Go's non-compacting, non-generational GC is at play here. Even C# is faster than Go on that benchmark.

The one you posted has no deserialization overhead since it's just ping/pong.

[1] https://performance-dot-grpc-testing.appspot.com/explore?das...

edit: fix typo


I'm aware, and I did try out a C++ client, but there was still a definite upper bound to its performance, and latency was still ~2x Cap'n Proto's.


If the perf of gRPC is not good enough, you should file an issue on GitHub. I'm kind of surprised that gRPC lost by 2x. That's about how much better netperf is.

When you say you used Cap'n Proto, did you mean the RPC system and the serializer, or just the serializer? I ask because gRPC's serialization is pluggable and doesn't have a hard dependency on Protobuf.


It is important to note that message complexity has a lot to do with proto performance. Anyone with a basic understanding of algorithms could tell you that, but I prefer to be shown things, so here's a repo with some examples of Python proto serialization performance so you can see how it does as message complexity increases:

https://github.com/theRealWardo/proto-bench

So yes, if your application has to send a lot of structured data around, you will end up paying a non-trivial serialization cost. On the other hand, if your application is sending really simple messages then I highly doubt proto serialization is the performance bottleneck you should focus on.
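The general point, that serialization cost scales with message structure rather than raw byte count, is easy to demonstrate even without protobuf installed. The sketch below uses stdlib json purely as a stand-in serializer; the messages are made up for illustration.

```python
# Illustration (stdlib json, not protobuf) that serialization cost
# grows with message complexity: a flat message vs a nested one.
import json
import timeit

flat = {"id": 1, "name": "x" * 64, "score": 3.14}
nested = {"id": 1, "children": [
    {"name": "x" * 8, "attrs": {"a": i, "b": [i, i + 1]}}
    for i in range(8)
]}

# Time 10k serializations of each shape.
flat_t = timeit.timeit(lambda: json.dumps(flat), number=10_000)
nested_t = timeit.timeit(lambda: json.dumps(nested), number=10_000)
print(f"flat: {flat_t:.4f}s  nested: {nested_t:.4f}s")
```

The nested shape costs noticeably more per message even though its scalar content is modest, which is the same effect the linked repo measures for protos.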


I looked at Cap'n Proto as well, but ultimately decided against it because the common view online was that gRPC was much better at RPC. Maybe it's gotten better since then, but Cap'n Proto at the time was really comparable to Protobuf, not gRPC.


Cap'n Proto includes both serialization (comparable to protobuf) and RPC (comparable to gRPC).

The main argument I've heard in favor of gRPC is that it is supported in more languages. Otherwise, I'm not sure what would make gRPC "much better at RPC" -- Cap'n Proto is actually much more expressive, with features like capabilities and promise pipelining, which gRPC lacks.
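For readers unfamiliar with promise pipelining, here is a toy model of why it matters; this is my own illustration, not Cap'n Proto's actual API. When a method call on a not-yet-resolved result can be queued and shipped with the original request, a chain of n dependent calls costs one network round trip instead of n.

```python
# Toy model of promise pipelining (not Cap'n Proto's real API).
class ToyConnection:
    def __init__(self):
        self.round_trips = 0

    def send(self, start, ops):
        """One request = one round trip; the server applies every
        queued op to the running value before replying."""
        self.round_trips += 1
        value = start
        for op in ops:
            value = op(value)
        return value

def pipelined(conn, start, ops):
    # Calls on an unresolved result are batched into a single request.
    return conn.send(start, ops)

def unpipelined(conn, start, ops):
    # Each call must wait for the previous reply: one trip per call.
    value = start
    for op in ops:
        value = conn.send(value, [op])
    return value
```

With three chained ops, `pipelined` finishes in one round trip where `unpipelined` needs three, which is the latency win gRPC's request/response model can't express.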

Do you remember the arguments you saw? I'd be curious to know what they were.

(Disclosure: I'm the author of Cap'n Proto and also Protobuf v2.)


In my experience, gRPC's RPC was nicer to work with, but Cap'n Proto's RPC was faster (due to the format, I believe).

Why was gRPC nicer to work with in my opinion?

- Comprehensive and readable documentation. Cap'n Proto has a high-level overview on the main page, but the documentation is far from complete. For example, regarding I/O, a core part of many RPC systems, the docs say "Function calls that do I/O must do so asynchronously, and must return a “promise” for the result.", then give no explanation of how to implement a function that does it (all the examples assume an existing library returns kj promises, and since that's custom, no such libraries exist). For serialization, how do I build a list of integers? The "Lists" section on the serialization page doesn't explain this at all; it only mentions the init method for lists of lists or lists of blobs. The gRPC docs, on the other hand, are quite good, with numerous easy-to-read tutorials and examples.

- Built-in support for threading. An event loop is limited to the throughput of a single core, and there'll always be a point at which you need either threading or multiple processes. Since I don't want to have a bunch of different copies of a particular thing in memory, it works better for me to have a bunch of threads with access to shared memory. Again, the docs say "While multiple threads are allowed, each thread must have its own event loop." but don't document how to do it (and in particular, how to do it while sharing a port). Plus, there are no suggestions on how to return a promise that's fulfilled by another thread (this was and still is a big problem for me; I still haven't figured it out). gRPC makes this dead simple by just using threads (or a queue system I haven't looked into much yet).

- No kj. I might be more okay with it if the documentation were better, but at the moment it's near-nonexistent, and wandering around the code base for ~30 minutes just to figure out how to construct and return a promise was pretty painful.
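For what it's worth, the "promise fulfilled by another thread" pattern is trivial in plain Python with the stdlib, which is part of why gRPC's thread-based model feels simpler; this sketch shows the generic stdlib pattern, not anything kj- or Cap'n Proto-specific.

```python
# Fulfilling a promise/future from another thread, stdlib only:
# one thread hands out a Future, a worker thread fills it in,
# and the caller blocks on the result.
import threading
from concurrent.futures import Future

def start_lookup(key, table):
    """Return a Future immediately; a worker thread fulfills it."""
    fut = Future()

    def worker():
        try:
            fut.set_result(table[key])   # fulfill the promise
        except KeyError as e:
            fut.set_exception(e)         # or reject it

    threading.Thread(target=worker).start()
    return fut

fut = start_lookup("a", {"a": 42})
print(fut.result(timeout=5))  # prints 42
```

The kj equivalent requires routing the fulfillment back onto the owning thread's event loop, which is exactly the part the docs don't spell out.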

Really, you can sum up most of my problems in one word: Documentation.

Oh and as you said, language support. The Python driver is really important to me yet seems unfinished.


Those are legitimate complaints. Unfortunately Cap'n Proto lacks tech writers. :(

(It is actually possible to write a multithreaded server, but it needs to be easier and with better documentation...)


I think you can go a long way just with example code. A simple (ideally single file, <300 lines) RPC server that does I/O in a thread, an example of building, reading and mutating a complex data structure with lots of lists and nesting and such, some kind of demo of kj and another server that has multiple event loops could be quite useful.


I agree: skip the documentation; someone can fill that gap with a PR.

We'd be really happy if you just provided many little code examples that can be put together =))

I've not used Cap'n Proto yet, but I want to build an ML endpoint and wish to have Python support, so I'm also interested in that.


I also started with Cap'n Proto in my system, but then moved over to gRPC. The reason? There's no RPC support for Java (although there is serialization support). It was early on, so I decided to switch.

Still wondering which one is better, although really, it probably doesn't matter nearly as much as the rest of the code I write. For now I just want it to be easy.

Totally agree that the gRPC docs are much better. It took me quite a while to figure out how to just take a message in memory and get it out (all the examples I found are about reading from a file descriptor).

Also agree about the threading model suiting my purposes more, since I don't need any other synchronization (I've got reader/writer locks for that).


For many use cases you can just send streams of serialised data as request/response with Cap'n Proto; you don't necessarily need the full RPC protocol.


Thanks for responding; I'll give it another shot. I will say that it was almost 2 years ago, so I don't remember the arguments. However, the comments below about gRPC's documentation are not true at all for C++. The async example is really poor, and pretty much nothing else comes up from a Google search. That would be fine if the feature weren't really complicated, but it is.


Last time I looked, Cap'n Proto was a no-go on Windows for RPC.



Wow, great. I last looked around March. I'll take a look, as I'm currently unhappy with Thrift.


What are you unhappy with in Thrift?


Sorry, I only saw this today. We're using the C++ bindings and having severe performance problems, particularly with the speed of the serialisation.

The generated code is not particularly efficient (maps of strings to function pointers) and not very modern. The code itself relies on Boost shared pointers everywhere, which is hugely wasteful, and the actual process of getting Thrift up and running was non-trivial on Windows (still, easier than gRPC and Cap'n Proto were at the time). The compiler itself is a bit funny, and hasn't been particularly easy to integrate into our build system (CMake).


> From memory a response with C++ client and server took ~9ms, 4-5ms of which were spent on serialization/deserialization

What's the message size?


It was about a megabyte, from memory. Lots of text.



