I journal everything to S3 after batching and compressing the batches. Batches are minute-resolution per machine, so reading the S3 prefix ROLE/YYYY/MM/DD/HH/MM/* returns everything the machines wrote during that minute. I also have replay capability and a utility library, built up for my company's use cases, for finding specific events. Fundamentally I use an architecture similar to s3-journal [0]. Some stream systems, such as Onyx [1][2], can write to and read from S3 natively, and can also checkpoint stateful operations against S3 [3].
We use replay capability to fix bugs, add new features to existing data, and to load qa environments.
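To make the prefix layout concrete, here is a minimal sketch of building that minute-resolution prefix and listing the batches under it. The bucket name, role name, and helper function are placeholders for illustration, not the actual library:

```python
from datetime import datetime, timezone

def minute_prefix(role: str, ts: datetime) -> str:
    """Build the S3 key prefix for one minute's batch window.

    Layout assumed from the description above: ROLE/YYYY/MM/DD/HH/MM/*,
    where each machine writes its compressed batch under that prefix.
    """
    return ts.strftime(f"{role}/%Y/%m/%d/%H/%M/")

# Example: all batches written by "web" machines during one minute.
ts = datetime(2020, 1, 2, 3, 4, tzinfo=timezone.utc)
prefix = minute_prefix("web", ts)  # "web/2020/01/02/03/04/"

# Listing the objects requires boto3 and credentials; the bucket name
# is a placeholder:
# import boto3
# s3 = boto3.client("s3")
# resp = s3.list_objects_v2(Bucket="my-journal-bucket", Prefix=prefix)
# keys = [obj["Key"] for obj in resp.get("Contents", [])]
```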
[0] https://github.com/Factual/s3-journal
[1] http://www.onyxplatform.org/
[2] https://github.com/onyx-platform/onyx-amazon-s3
[3] http://www.onyxplatform.org/docs/api/0.10.x/onyx.storage.s3....