Hacker Newsnew | past | comments | ask | show | jobs | submitlogin

I have been waiting for this to hit 1.0 and more importantly get popular so that I can use it everywhere. I am really a fan of Yann Collet's work. These are extremely impressive work specially when you consider that lz4 seems to be better than snappy (by google) and zstandard from LZFSE (from apple). I think he is the first one to write a practical fast arithmetic coder using ANS. And look at how his huffman implementation blazes past zlib huffman though compresses less than FSE [0]. I also like reading his blog posts. While a lot of them goes over my head I can generally make a sense of what he is trying and why something's working despite the complexity.

[0] https://github.com/Cyan4973/FiniteStateEntropy



Cyan4973/Yann also is well known for xxhash[1], which is one of the faster hashers[2] out there. Post a new hasher and people will probably ask about xxhash (ex: metrohash[3]). Guy is an absolute machine. If you search Google for 'zstd' right now, you'll find him, not Facebook, namely: https://github.com/Cyan4973/zstd . Glad his work is being supported by someone now! Immensely well deserved, after so many years of helping make everyone fast.

PS - I didn't know a lot of the terms you used, but the Finite State Entropy (FSE) link you provided does a good job intro'ing some of them, and the linked paper Asymmetric numeral systems: entropy coding combining speed of Huffman coding with compression rate of arithmetic coding [4] (ANS) seems interesting.

[1] https://github.com/Cyan4973/xxHash

[2] https://github.com/rurban/smhasher

[3] https://news.ycombinator.com/item?id=9613098

[4] http://arxiv.org/abs/1311.2540


I found his (Yann's) blog posts the most helpful for getting a basic idea of what was going on with FSE: https://fastcompression.blogspot.com/2013/12/finite-state-en... and the followup https://fastcompression.blogspot.fr/2014/01/fse-decoding-how...


Yes. I forgot about xxHash :). I think it was created as part of checksumming on LZ4 (not entirely sure). That is another amazing part of his work. He is producing this things xxHash, FSE, Huff0 that are state of the art projects in their own right for his state of the art compression algorithm making sure others can benefit the most from his work without reinventing it. At the same time he is also blogging his though process and experiment in a manner that even a layman like me gets the gist of for most part. Now whether its useful enough for an expert in the field I can't say.

There is a lot of other gem in his blog. He points to resources and other people he have worked with making it much easier to get a bigger picture on the related items.


> I think he is the first one to write a practical fast arithmetic coder using ANS.

I don't think he is the first; although RAD game tools has been cagey about the details, many strongly suspect their recently announced Kraken, etc. use ANS in some form and some of their previous products; see the discussion here:

https://news.ycombinator.com/item?id=11583898

...and here:

http://encode.ru/threads/2492-Kraken-compressor

http://cbloomrants.blogspot.ca/2016_05_01_archive.html


Yeah read about Kraken a while ago and found the encode.ru thread you pointed. Its unfortunate that we can't compare that side by side. Those guys seem another of those compression geniuses. Overall the world of compression is moving very fast and well these days.

But even with that Yann might be the first. Yan's work on FSE is old [0] and if I'm reading this correctly his work on FSE is a mix of his own work and Jarek Duda. But it seems one of his failed attempt challenged Jarek to invent another version of ANS called rANS [1] and as you can see in the encode.ru comments from the RAD game tools that they seem to be using rANS at least for Kraken. Regardless these are impressive works and these people are bouncing ideas of each others, being challenged and inspired which is a very good thing for the rest of us :).

[0] https://fastcompression.blogspot.com/2013/12/finite-state-en... [1] http://encode.ru/threads/1821-Asymetric-Numeral-System?p=360... [2] https://github.com/rygorous/ryg_rans


Before Yann, this ANS variant was implemented by Andrew Polar in 2008: http://www.ezcodesample.com/abs/abs_article.html Here is a list of implementations: http://encode.ru/threads/2078-List-of-Asymmetric-Numeral-Sys...


Yeah, it's all very hazy, and since RAD is (understandably) proprietary, it's difficult to figure out exactly when who did what.


There are now actually open source decompressors[0] for Oodle formats, and yes, many use rANS. Especially LZNA's clever use of SSE2 to decode and update 16 symbol models is interesting. FSE (which is tANS) is a different beast, though, and more suitable for rarely updated models.

[0] http://encode.ru/threads/2577-Open-source-Kraken-Mermaid-Sel...


...and got DMCAed [1]. At the very least the discussion itself is interesting for the inside infos, though.

[1] https://github.com/github/dmca/blob/master/2016/2016-08-30-O...


That was because of the belief that he had used an illegal copy of the DLL, and that he had copied all implementation details (neither is true). A representative from RAD said they would withdraw the DMCA complaint now after that misunderstanding was cleared.


Now here things gets really interesting [0]. And the fact that the claim is that higher level of compression has negligible impact on decode speed making Kraken really impressive for one time compression and lots of decompression situation.

[0] http://cbloomrants.blogspot.ca/2016/07/introducing-oodle-mer...


RAD use rANS for their compressors. See ryg's blog [0] for more details, and a bunch more interesting posts about compression and various low-level optimizations.

[0] https://fgiesen.wordpress.com/2015/12/21/rans-in-practice/


I agree, I just started using Zstd for uniprot.org (dev branch) and it looks like it is a lot faster in decoding than deflate or zlib and even smaller on disk. It is one of those things where the improvement really matters for us and gives us faster results for our users.

i.e. download speed is up to 20% faster with Zstd as compression algo for backing store compared to Deflate. Assuming bandwidth is available ;)




Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact

Search: