gRPC in Production (sourcegraph.com)
186 points by loppers92 on July 21, 2017 | hide | past | favorite | 56 comments


Another downside in gRPC is that (de)serialization can be relatively slow. One of my pet projects is a tagging server that takes a list of tags and returns a list of objects that match, retrieved from RocksDB. I tested gRPC (proto and FlatBuffers) and Cap'n'Proto. For all of them, the process looked like:

Insert: Object comes in (built in RPC system for both), object is assigned ID (which is added before serialization), object is written to RocksDB, object is serialized and indexed.

Query: List of strings come in, strings are looked up in tag index, intersection of results is done, objects are retrieved from RocksDB, objects are deserialized, objects are added to list, objects are returned to client.
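The insert/query pipeline above can be sketched in plain Python, with a dict standing in for RocksDB and sets for the tag index (all names here are illustrative, not from the original project):

```python
# Stand-ins: a dict for RocksDB, sets of object IDs for the tag index.
store = {}          # id -> serialized object bytes
tag_index = {}      # tag -> set of ids

def insert(obj_id, obj, tags, serialize):
    blob = serialize(obj)                # e.g. proto/capnp encoding
    store[obj_id] = blob                 # write to the KV store
    for tag in tags:                     # index under every tag
        tag_index.setdefault(tag, set()).add(obj_id)

def query(tags, deserialize):
    # Look up each tag, intersect the ID sets, then fetch and decode.
    sets = [tag_index.get(t, set()) for t in tags]
    ids = set.intersection(*sets) if sets else set()
    return [deserialize(store[i]) for i in ids]
```

The (de)serialization cost being discussed is the `serialize`/`deserialize` calls; everything else is set intersection and KV lookups.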

FlatBuffers was unfortunately a no-go since it doesn't permit modification of fields that haven't yet been set and it didn't have a sane way of making a copy (which would also be slow).

Protocol Buffers worked but the pure Python driver is incredibly slow and the C++ driver for Python was non-default and marked "experimental". I eventually adapted to it but it was still quite slow even on the server side. From memory a response with C++ client and server took ~9ms, 4-5ms of which were spent on serialization/deserialization.
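For reference, the non-default C++ backend for Python protobuf is selected via an environment variable, which must be set before `google.protobuf` is first imported. A minimal sketch (the env var name is the real one from the protobuf docs; the generated message class is hypothetical):

```python
import os

# Must be set before the first `import google.protobuf...`,
# otherwise the pure-Python implementation is locked in.
os.environ["PROTOCOL_BUFFERS_PYTHON_IMPLEMENTATION"] = "cpp"

# from my_service_pb2 import TagResponse  # hypothetical generated code
# resp = TagResponse(ids=[1, 2, 3])
# blob = resp.SerializeToString()         # now backed by the C++ runtime
```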

Cap'n'Proto eventually won for me. The Python driver is unstable and memory usage soars like an eagle until Linux shoots it down with an OOM but the server was much faster. Typical response times were closer to 3ms, most of which was spent in RocksDB or the actual index lookup. A downside though was that Cap'n'Proto has its own built in and odd library for async stuff and doesn't really support threading.


gRPC is slow with Python, but most people don't pick Python if they are serious about performance. gRPC has continuous benchmarks running[1] to track and improve perf. I agree the Python one has some serious problems, but don't let it be the whole story. The C++ implementation is 100x faster.

[1] https://performance-dot-grpc-testing.appspot.com/explore?das...


For the non-pingpong benchmark [1], c++ (300k rps) looks to be around 2x faster than java (150k rps), while the latter is around 2x faster than go (75k rps).

It looks like go's non-compacting/non-generational gc is at play here. Even csharp is faster than go on that benchmark.

The one you posted has no deserialization overhead since it's just ping/pong.

[1] https://performance-dot-grpc-testing.appspot.com/explore?das...

edit: fix typo


I'm aware and I did try out a C++ client but there was still a definite upper bound to its performance and latency was still ~2x Cap'n'Proto.


If the perf of gRPC is not good enough, you should file an issue on GitHub. I'm kind of surprised that gRPC lost by 2x. That's about how much better netperf is.

When you say you used Cap'n'proto, did you mean the RPC and serializer, or just the serializer? I ask because gRPC's serialization is pluggable, and doesn't have a dependency on Protobuf.


It is important to note that message complexity has a lot to do with proto performance. Anyone with a basic understanding of algorithms could tell you that, but I prefer to be shown things so here's a repo with some examples of Python proto serialization performance so you can see how it does as message complexity increases:

https://github.com/theRealWardo/proto-bench

So yes, if your application has to send a lot of structured data around, you will end up paying a non-trivial serialization cost. On the other hand, if your application is sending really simple messages then I highly doubt proto serialization is the performance bottleneck you should focus on.
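The effect is easy to reproduce with any encoder. Here is a stdlib-only sketch using `json` as a stand-in for proto, showing serialization cost and payload size growing with message complexity:

```python
import json
import timeit

def nested(depth, width=5):
    """Build a message with `width` fields per level, `depth` levels deep."""
    msg = {f"field{i}": i for i in range(width)}
    if depth > 0:
        msg["child"] = nested(depth - 1, width)
    return msg

flat, deep = nested(0), nested(50)
t_flat = timeit.timeit(lambda: json.dumps(flat), number=1000)
t_deep = timeit.timeit(lambda: json.dumps(deep), number=1000)
# The deep message costs roughly `depth` times more to serialize.
```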


I looked at Cap'n Proto as well, but ultimately decided against it because the common view online was that gRPC was much better at RPC. Maybe it's gotten better since then, but Cap'n Proto at the time was really comparable to Protobuf and not gRPC.


Cap'n Proto includes both serialization (comparable to protobuf) and RPC (comparable to gRPC).

The main argument I've heard in favor of gRPC is that it is supported in more languages. Otherwise, I'm not sure what would make gRPC "much better at RPC" -- Cap'n Proto is actually much more expressive, and has features like capabilities and promise pipelining which gRPC lacks.

Do you remember the arguments you saw? I'd be curious to know what they were.

(Disclosure: I'm the author of Cap'n Proto and also Protobuf v2.)


In my experience, RPC in gRPC was nicer to work with but Cap'n'Proto's RPC was faster (due to the format I believe).

Why was gRPC nicer to work with in my opinion?

- Comprehensive and readable documentation. Cap'n'Proto has a high level overview on the main page but the documentation is far from complete. For example regarding I/O, a core part of many RPC systems, "Function calls that do I/O must do so asynchronously, and must return a “promise” for the result." is said, then no explanation is given on how to implement a function that does it (all the examples assume an existing library returns kj promises and since that's custom, no such libraries exist). For serialization, how do I build a list of integers? The "Lists" section on the serialization page doesn't explain this at all, it only mentions the init method for lists of lists or lists of blobs. The gRPC docs on the other hand are quite good, with numerous easy to read tutorials and examples.

- Built in support for threading. An event loop is limited to the throughput of a single core and there'll always be a point at which you need either threading or multiple processes. Since I don't want to have a bunch of different copies of a particular thing in memory, it works better for me to have a bunch of threads with access to shared memory. Again, the docs say "While multiple threads are allowed, each thread must have its own event loop." but don't document how to do it (and particularly, do it while sharing a port). Plus, there's no suggestions on how to return a promise that's fulfilled by another thread (this was and still is a big problem for me, I still haven't figured it out). gRPC makes this dead simple by just using threads (or a queue system I haven't looked into much yet).

- No kj. I might be more okay with it if the documentation was better but at the moment it's near-nonexistent and wandering around the code base for ~30 mins just to figure out how to construct and return a promise was pretty painful.

Really, you can sum up most of my problems in one word: Documentation.

Oh and as you said, language support. The Python driver is really important to me yet seems unfinished.
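The threading model described above (a pool of worker threads sharing one in-memory structure, rather than one event loop per core) can be sketched with the stdlib alone; this mimics the shape of gRPC's Python thread-pool server but uses no gRPC APIs:

```python
import threading
from concurrent.futures import ThreadPoolExecutor

# One shared in-memory index; all worker threads see the same copy.
shared_index = {"red": {1, 2}, "fast": {1}}
lock = threading.Lock()   # protects mutation of the shared structure

def handle_query(tags):
    # Each worker reads the shared index directly -- no per-thread copies.
    with lock:
        sets = [shared_index.get(t, set()) for t in tags]
    return set.intersection(*sets) if sets else set()

with ThreadPoolExecutor(max_workers=4) as pool:
    futures = [pool.submit(handle_query, ["red", "fast"]) for _ in range(8)]
    results = [f.result() for f in futures]
```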


Those are legitimate complaints. Unfortunately Cap'n Proto lacks tech writers. :(

(It is actually possible to write a multithreaded server, but it needs to be easier and with better documentation...)


I think you can go a long way just with example code. A simple (ideally single file, <300 lines) RPC server that does I/O in a thread, an example of building, reading and mutating a complex data structure with lots of lists and nesting and such, some kind of demo of kj and another server that has multiple event loops could be quite useful.


I agree. Skip the documentation for now; someone can fill that gap with a PR.

We'd be really happy if you just provided many little code examples that can be put together =))

I've not used Cap'n Proto yet, but I want to build an ML endpoint and would like Python support, so I'm also interested in that.


I also started with Cap'n Proto in my system, but then moved over to gRPC. Reason? There's no RPC support for Java (although there is serialization support). It was early on, so I decided to switch.

Still wondering which one is better, although really, it probably doesn't matter nearly as much as the rest of the code I write. For now I just want it to be easy.

Totally agree that the grpc docs are much better. It took me quite a while to figure out how to just take a message in memory and get it out (all the examples I found are about reading from a file descriptor).

Also agree about the threading model suiting my purposes more, since I don't need any synchronization (I've got reader/writer locks for that).


For many use cases you can just send streams of serialised data as request/response with Cap'n Proto; you don't necessarily need the full RPC protocol.


Thanks for responding, and I'll give it another shot. I will say that it was almost 2 years ago, so I don't remember the arguments. However, the comments here about gRPC's documentation are not true at all for C++. The async example is really poor, and pretty much nothing else comes up from a Google search. That would be fine if the feature weren't really complicated, but it is.


Last time I looked, Cap'n Proto was a no-go on Windows for RPC.



Wow, great. I last looked around March. I'll take a look, as I'm currently unhappy with Thrift.


What are you unhappy with in Thrift?


Sorry, I only saw this today. We're using the c++ bindings, and having severe performance problems, particularly the speed of the serialisation.

The generated code is not particularly efficient (maps of strings to function pointers) and not very modern. The code itself relies on Boost shared pointers everywhere, which is hugely wasteful, and the actual process of getting Thrift up and running was non-trivial on Windows (still easier than gRPC and Cap'n Proto were at the time). The compiler itself is a bit funny, and hasn't been particularly easy to integrate into our build system (CMake).


> From memory a response with C++ client and server took ~9ms, 4-5ms of which were spent on serialization/deserialization

What's the message size?


It was about a megabyte from memory. Lots of text.


The only thing holding back gRPC is JS web support. If it had that, it'd be time to drop Swagger completely. As it is you need to go protobuf -> swagger -> js lib, which is cumbersome and doesn't work 100% (e.g. adding auth keys, etc.).

Will there be any progress on JS web, or can it just not be done at all with HTTP1? Even a subset of features for basic GET/POST would be fine...


We've been using grpc-web from Improbable successfully for internal services. I'm working on a more production-scale application soon that I'm planning on leveraging grpc-web for. The biggest missing feature is client-side streaming support, which is hamstrung by the WHATWG fetch/streams API, but I feel confident in finding solutions for situations that generally require client-side streaming.

https://github.com/improbable-eng/grpc-web


We've just gone live with a grpc-web based application. It's still early days and there are some rough edges, but it's an absolute joy to use and the amount of time we've saved is crazy.


I hadn't seen that. Thanks!


I'd say load balancing is still a big gRPC hurdle. See https://github.com/grpc/grpc/blob/master/doc/load-balancing.... for the current official stance on load balancing. It's pretty convoluted IMO.


I agree. You kind of have to roll your own for every language you use which is odd. And retries. Hopefully some more standard way of handling this will emerge.

EDIT: I enjoyed the GopherCon talk... I don't think the video is up yet but when it is I recommend watching...


I like the approach of custom name resolver though. Pretty straight forward to build one e.g. backed by Consul.
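A resolver of that shape boils down to "given a service name, return a fresh address list, and rotate across it." A stdlib-only sketch (the Consul lookup is stubbed out; a real one would query Consul's HTTP health API):

```python
import itertools

def resolve(service):
    # Stub: a real resolver would hit Consul's health endpoint, e.g.
    # GET /v1/health/service/<service>?passing=1, and parse out addresses.
    catalog = {"pricing": ["10.0.0.1:50051", "10.0.0.2:50051"]}
    return catalog[service]

class RoundRobin:
    """Rotate across whatever the resolver last returned."""
    def __init__(self, service):
        self._cycle = itertools.cycle(resolve(service))

    def pick(self):
        return next(self._cycle)

lb = RoundRobin("pricing")
```

A production version would also re-resolve periodically (or watch Consul's blocking queries) so the address list tracks membership changes.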


IIRC on K8s there are some gRPC-aware load balancers, possibly Envoy, I forget.


Surprisingly, the reason why gRPC JS web support doesn't exist is Chrome. There is currently no way to access HTTP trailers from JavaScript. gRPC depends on trailers to tell when an RPC is complete when doing bidirectional streaming.

Please tell browser devs to provide access to them! They have been part of the HTTP spec for years. Only recently have browsers entertained the idea:

https://bugzilla.mozilla.org/show_bug.cgi?id=1339096

https://bugs.webkit.org/show_bug.cgi?id=168232

https://bugs.chromium.org/p/chromium/issues/detail?id=691599

https://developer.microsoft.com/en-us/microsoft-edge/platfor...


There's also the fact that, for high level languages like python, the gRPC server has lower throughput than available python http servers. Even with the overhead of parsing http and whatever message format you're sending over it.


This is surprising given that the Python implementation is apparently C. Anyway, I can't imagine this is the bottleneck in any nontrivial Python app.


There's a known issue that's been open for a while, where a Python server could get stuck consuming 100% of a single CPU core, see https://github.com/grpc/grpc/issues/9688


Granted, but this is clearly a bug in a pathological case; it's not consistent bad performance.


Decent overview, but remember that when evaluating something, projects change over time. Even better, you can be the change you want to see! Thus far, the gRPC project has been fairly responsive in making solid changes, whether through PRs or filed issues. ;)

Regarding the complaint about errors, there is already protocol support for structured error handling: https://godoc.org/google.golang.org/grpc/status. https://github.com/grpc/grpc-go/pull/1358 should make this easier to use. In practice, the provided error code set is very good, so give them a try before making things more complex.

Worst case, you can just stuff things in a header or trailer.
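The status model being referenced is just "a canonical code, a message, and optional typed details" (gRPC carries the details in the `grpc-status-details-bin` trailer). A stdlib stand-in, not the real gRPC API:

```python
from dataclasses import dataclass, field

@dataclass
class Status:
    """Stand-in for gRPC's status model: canonical code + message + details."""
    code: int            # e.g. 0 = OK, 3 = INVALID_ARGUMENT, 5 = NOT_FOUND
    message: str
    details: list = field(default_factory=list)   # structured extras

def lookup(store, key):
    # Return (result, status) instead of raising, mirroring RPC-style errors.
    if key not in store:
        return None, Status(5, f"no object for key {key!r}",
                            details=[{"resource": key}])
    return store[key], Status(0, "OK")

obj, st = lookup({"a": 1}, "b")
```

In real gRPC the provided code set (NOT_FOUND, INVALID_ARGUMENT, UNAVAILABLE, ...) covers most cases before you need custom details.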


You can add Expedia to the list of production gRPC users. Our entire hotel pricing backend is gRPC microservices written in Scala. Some of these services take > 100k TPS. We have found gRPC to be extremely scalable.

If that sounds interesting, I'm hiring engineers, shoot me a note at joholland at Expedia.com


Do you use the Java generated stubs or generate your own? There are several people using gRPC with Scala that could benefit from more idiomatic stubs.


We use the java stubs. We briefly looked at some scala proto generators, but most times you dump the proto class into a rich scala type right away anyways.


The point about operations is a very, very valid constraint of REST. Easy and common stuff like "run this thing in the background" or "send off this one ephemeral message" is very unnatural. Maybe a hybrid / bastard child of REST and gRPC would be a good marriage of resource and operation modelling.


The Google Cloud standard for async operations is the google.longrunning.Operation service: https://github.com/googleapis/googleapis/blob/master/google/.... This should be usable either via REST/JSON or gRPC.

It's designed to be generic for any kind of async operation, and to be "mixed in" with your APIs. There are utilities for waiting for an operation to finish (via polling); in theory it should be possible to use some type of server-push to avoid polling, but I'm not sure if anybody's doing this.

(I'm an Alphabet employee who's working on APIs delivered via gRPC / REST/JSON.)
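A client-side wait loop of the polling sort described above might look like this sketch (the `get_operation` fetcher is hypothetical, standing in for a generated Operations stub returning a dict shaped like `google.longrunning.Operation`):

```python
import time

def wait_for_operation(get_operation, name, timeout=60.0):
    """Poll an Operation-style endpoint with exponential backoff until done."""
    delay, waited = 0.1, 0.0
    while waited < timeout:
        op = get_operation(name)
        if op.get("done"):
            # Per the proto, a finished op carries either `response` or `error`.
            return op.get("response", op.get("error"))
        time.sleep(delay)
        waited += delay
        delay = min(delay * 2, 5.0)   # back off, capped at 5s between polls
    raise TimeoutError(name)
```

Server push would replace the `sleep` loop entirely, which is why it would be attractive here.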


Thrift is Facebook's version of gRPC right? If so, I don't quite understand the comparison, and how gRPC succeeds where thrift "fails". Wouldn't all language implementations of gRPC have to be well documented, reliable, highly performant and easy to install?


I guess Google wanted to start with a clean slate based on the design principles already established internally (a system called Stubby).

gRPC was designed from the start for HTTP/2, which comes with some benefits: It's able to work wherever HTTP works (load balancers and proxies), can multiplex calls over a single stream (Thrift on the JVM, where it's most popular, uses a thread per socket), supports cancelation and streaming and so on. gRPC is arguably more opinionated than Thrift here; Thrift is both a serialization format and an RPC mechanism, and for Thrift RPC you can choose between different transports and framings, of which HTTP is just one. gRPC is HTTP-only, and is (at least nominally) serialization-format-agnostic; you could, in principle, use Thrift over gRPC instead of Protocol Buffers, for example. So in this sense, gRPC is more pure and generic than Thrift. (I'm sure Thrift fans might disagree here.)

I haven't actually used Thrift, so maybe someone else can chime in about other reasons gRPC is preferable.


My understanding is that Thrift is Facebook's analog to Google's Stubby. Or gRPC [which is the next generation of Stubby].

It's not implementing the gRPC interface. I don't think gRPC was open-sourced [2015] early enough for Thrift [2007] to be built to its API.

Disclosure: Google employee who uses Stubby. [but is not on the team]


Thrift was written at Facebook by an ex Google intern trying to recreate something close to protobuf+Stubby.


Thrift is comparable to gRPC, and more user-friendly in practice. gRPC is more flexible and rigorous (anal).

I disagree that Thrift failed, but gRPC has more momentum right now IMO.

Source: gRPC at current startup, multiple years of Thrift.


Thrift is popular and it "supports" more targets, but I found a lot of bugs in some of its lesser-known implementations. This may not be an issue if you're using a widely-used Thrift platform, but if you're choosing Thrift because it supports the target you want while another RPC doesn't, do watch out for weird bugs and issues.

I haven't tried to use Protobufs or gRPC in a serious way so I can't say if it's better or worse, but I would hope it supports fewer targets because it takes stability and QC more seriously.


On the contrary, I always found Thrift to lack good documentation.


That's the sense I got too but beyond the initial hurdles it's good enough.


> Inefficient (textual representations aren’t optimal for networks)

REST APIs don't _have_ to be text-based, AFAIK. Why not just send/receive binary?


Sure, but then you'd have to write a serializer/deserializer for your messages. You could also use an existing one like BSON, but then why not just use Protobuf?


Because BSON, msgpack, etc. are self-describing? That means a much lower barrier to entry.


I'm not sure I see how self-describing is a lower barrier to entry. It seems like generally the main reason why you'd want a self-describing format is if you're writing client (and server) language bindings by hand. If you have a tool to auto-generate those language bindings, you can skip that step, and there's no need to make a step easier if it's done for you automatically.

I can see where self-describing is better when it comes to side issues like debugging, exploration, or the hassle of configuring your build to generate code from the IDL files, though. But if I had to choose only one, those are lower priority to me.
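The trade-off is easy to see with the stdlib: a self-describing format like JSON repeats field names in every message, while a schema'd encoding (here `struct`, standing in for proto/Cap'n Proto) ships only values, because both sides know the layout out of band:

```python
import json
import struct

point = {"x": 1.5, "y": -2.0, "z": 0.25}

# Self-describing: the keys travel inside every message.
self_desc = json.dumps(point).encode()

# Schema'd: both sides agree up front on "three little-endian doubles".
schema_fmt = "<3d"
packed = struct.pack(schema_fmt, point["x"], point["y"], point["z"])

# The JSON payload repeats "x"/"y"/"z"; the packed payload is 24 bytes flat.
```

The receiver can decode `packed` only because it holds the same `schema_fmt`; strip the schema and the bytes are opaque, which is exactly the debugging/exploration cost mentioned above.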


I was under the impression it was trivial to store the message description in the message itself with protobuf.


But sadly it also means redundancy with every message sent, which must carry this self-description.


For the same reason people use JSON. It's simpler, and there is a fast parser built into the browser. Even non browser clients have optimized JSON coders which is good enough.

That said, JSON is still slower and more bloated than Protobuf, and you get Protobuf for free when using gRPC.



