
Yes, right now Pulsar does everything Kafka does but easier, faster, and with more features. It's especially good at combining message queuing with low-latency pub/sub.

It has also been around for years but was only recently open-sourced, so the community is smaller than Kafka's. However, it's growing quickly, along with the usual ecosystem of drivers, extensions, and services. Unless you're already running Kafka, I would look at Pulsar first for new projects.

A great overview of Pulsar history with plenty of links is here: https://streaml.io/blog/messaging-storage-or-both/



Unfortunately, Pulsar does not have anything like Kafka Streams. With Pulsar we need to use Spark to do the same (https://pulsar.incubator.apache.org/docs/latest/adaptors/Pul...).

And Pulsar does not provide exactly-once either, which is something very important.


Kafka Streams is great. But it is a processing library, so it isn't an apples-to-apples comparison with a messaging system like Pulsar. First, there are already many mature stream processing engines and libraries (Spark, Flink, Heron, Storm). They have been in production for years, and it made no sense to write a new one without a good reason. Second, Kafka Streams could have done a better job of not being tied so tightly to Kafka. Technically, very little in the design or implementation of Kafka Streams is specific to Kafka. If Confluent had done a better job on abstraction, I would expect it to be very easy to plug in different messaging systems or log stores to run "Kafka Streams", although I am not sure Confluent wants to see that happen.

“Kafka has exactly-once delivery but messaging system x/y/z doesn’t provide it” is also confusing and misleading. Exactly-once is technically effectively-once: deliver “at-least-once” and make the processing of the messages idempotent or “de-duplicated”. This has been done in the industry for a long time, including in mature stream processing engines like Heron and Flink. It isn’t really a new thing. And many messaging systems like Pulsar already provide the primitives (e.g. idempotent producing, at-least-once delivery) that processing jobs need to achieve effectively-once very easily. The Streamlio folks did a great job of explaining exactly-once and effectively-once. It is worth checking out this blog post: https://streaml.io/blog/exactly-once/
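To make the "at-least-once plus idempotent processing" point concrete, here is a minimal sketch in Python. It is a toy model only: the message shape, `apply_idempotent`, and `at_least_once` are hypothetical names made up for illustration and are not part of Pulsar, Kafka, or any client library.

```python
# Toy model: effectively-once = at-least-once delivery + idempotent processing.
# All names here are hypothetical; no real broker or client API is used.

def apply_idempotent(store, message):
    """Write the payload under the message's unique ID.

    Re-applying the same message overwrites the entry with the same value,
    so a redelivery leaves the store unchanged.
    """
    store[message["id"]] = message["payload"]


def at_least_once(messages, redeliver_ids):
    """Yield every message, and yield those in redeliver_ids a second time."""
    for message in messages:
        yield message
        if message["id"] in redeliver_ids:
            yield message  # simulated retry after a lost acknowledgement


messages = [
    {"id": "order-1", "payload": 10},
    {"id": "order-2", "payload": 5},
    {"id": "order-3", "payload": 7},
]

store = {}
deliveries = 0
for message in at_least_once(messages, redeliver_ids={"order-1", "order-3"}):
    apply_idempotent(store, message)
    deliveries += 1

print(deliveries)           # 5 deliveries reached the processor
print(sum(store.values()))  # 22, the same total as with no duplicates
```

Five deliveries arrive, but the observable result is the same as if each message had been processed once, which is all "effectively-once" claims.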

I think Pulsar itself, as a distributed messaging system, does provide all three delivery semantics: at-most-once, at-least-once, and effectively-once. It is very easy for people to use and integrate. I don’t think it is technically difficult to make Kafka Streams run with Pulsar. The question is more whether there is value in doing that, whether the Kafka folks want to do it, and whether the collaboration can happen in the ASF.

That's just my two cents.


Kafka Streams is just a small wrapper library to make it easy to work with Kafka partitions, along with some utilities for reading, transforming, and writing to another stream in a single step.

Pulsar doesn't have any client overhead since it's all tracked on the broker so there's no real need for a separate library. Read, process, and push messages using any code you want, 1 at a time or in batches.

There's no realistic "exactly once" either; it's idempotency or some local cache of processed events used to dedup. You can read more here: https://streaml.io/blog/exactly-once/
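As a rough illustration of the "local cache of processed events" approach, here is a small Python sketch. `RecentIdCache` is a hypothetical helper written for this example, not something shipped by Pulsar or Kafka, and it assumes every event carries a unique ID.

```python
from collections import OrderedDict


class RecentIdCache:
    """Remembers the most recently seen event IDs, up to a fixed capacity."""

    def __init__(self, capacity):
        self.capacity = capacity
        self._ids = OrderedDict()

    def first_time(self, event_id):
        """Return True if event_id is not in the cache, and record it."""
        if event_id in self._ids:
            self._ids.move_to_end(event_id)
            return False
        self._ids[event_id] = None
        if len(self._ids) > self.capacity:
            self._ids.popitem(last=False)  # evict the oldest ID
        return True


cache = RecentIdCache(capacity=2)
processed = [e for e in ["a", "b", "a", "c", "a"] if cache.first_time(e)]
print(processed)  # ['a', 'b', 'c']

# A duplicate that arrives after its ID was evicted slips through.
small = RecentIdCache(capacity=2)
leaked = [e for e in ["a", "b", "c", "a"] if small.first_time(e)]
print(leaked)  # ['a', 'b', 'c', 'a']
```

The second run shows the trade-off: a bounded cache only dedups within its window, so the capacity has to cover the longest redelivery delay you expect.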


Is it possible to use Kafka Streams without Kafka by listening and processing events from another source?


I would say no, because Kafka Streams relies on Kafka's own partitioning mechanism to be able to scale.
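For anyone unfamiliar with what scaling on partitions means, here is a simplified Python sketch. It assumes records are routed by hashing their key and that each partition is owned by exactly one worker. `zlib.crc32` is only a stand-in hash (Kafka's own clients use a different function), and both helper names are hypothetical.

```python
import zlib


def partition_for(key, num_partitions):
    """Map a record key to a partition with a stable hash (CRC32 stand-in)."""
    return zlib.crc32(key.encode("utf-8")) % num_partitions


def assign_partitions(num_partitions, workers):
    """Give each partition to exactly one worker, round-robin."""
    assignment = {worker: [] for worker in workers}
    for partition in range(num_partitions):
        assignment[workers[partition % len(workers)]].append(partition)
    return assignment


assignment = assign_partitions(num_partitions=6, workers=["w1", "w2", "w3"])
print(assignment)  # {'w1': [0, 3], 'w2': [1, 4], 'w3': [2, 5]}

# The same key always maps to the same partition, so one worker sees
# every record for that key and can keep per-key state locally.
print(partition_for("user-42", 6) == partition_for("user-42", 6))  # True
```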


I do think it is technically possible, based on my understanding of Pulsar:

- Pulsar provides a failover subscription mode, which seems to be the equivalent of partition rebalancing within a consumer group in Kafka. https://pulsar.incubator.apache.org/docs/latest/getting-star...

- It has partitioned topics as well.

- It supports idempotent producing and has effectively-once delivery semantics.

It seems to have all the kinds of primitives Kafka Streams would need.
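To show the idea behind a failover subscription, here is a toy Python model: one consumer is active, the others stand by, and a standby takes over when the active one disconnects. `FailoverSubscription` is a hypothetical class for illustration; picking the earliest-connected consumer is an assumption of this sketch, and the real broker's selection rule may differ.

```python
class FailoverSubscription:
    """Toy model: one active consumer at a time, the rest on standby."""

    def __init__(self):
        self.consumers = []  # connection order; first entry is active

    def connect(self, name):
        self.consumers.append(name)

    def disconnect(self, name):
        self.consumers.remove(name)

    def active(self):
        return self.consumers[0] if self.consumers else None

    def dispatch(self, message):
        """Return (consumer, message) for the currently active consumer."""
        consumer = self.active()
        if consumer is None:
            raise RuntimeError("no consumer connected")
        return consumer, message


sub = FailoverSubscription()
sub.connect("c1")
sub.connect("c2")
first = sub.dispatch("m1")   # ('c1', 'm1')
sub.disconnect("c1")         # c1 fails; c2 is promoted
second = sub.dispatch("m2")  # ('c2', 'm2')
print(first, second)
```

With one such subscription per partition of a partitioned topic, each partition has a single reader at any moment, which is the property a Kafka consumer group gives a stream processor.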



