Reminds me of FC8, a compression project from a few years back by Steve Chamberlin at Big Mess O' Wires.
FC8 is designed to be as fast as possible to decompress on "legacy" hardware, while still maintaining a decent compression ratio. Generic C code for compression and decompression is provided, as well as an optimized 68K decompressor for the 68020 or later CPUs. The main loop of the 68K decompressor is exactly 256 bytes, so it fits entirely within the instruction cache of the 68020/030. Decompression speed on a 68030 is about 25% as fast as an optimized memcpy of uncompressed data.
The algorithm is based on the classic LZ77 compression scheme, with a sliding history window and duplicated data replaced by (distance,length) markers pointing to previous instances of the same data. No extra RAM is required during decompression, aside from the input and output buffers. The match-finding code and length lookup table were borrowed from liblzg by Marcus Geelnard.
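The (distance,length) mechanism described above can be sketched in a few lines of C. The token format here is invented for illustration (it is not FC8's actual encoding): each group is a literal count, the literal bytes, then a match length and distance. Note the byte-by-byte match copy: it handles the overlapping case where distance is smaller than length, which a plain memcpy would get wrong.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Minimal LZ77-style decoder sketch. Token format is invented for
 * illustration (NOT FC8's real encoding):
 *   repeat { [lit_count][literals...][match_len][match_dist] }
 * until the input is exhausted. Returns number of bytes written. */
static size_t lz_decode(const unsigned char *src, size_t srclen,
                        unsigned char *dst)
{
    size_t si = 0, di = 0;
    while (si < srclen) {
        size_t lits = src[si++];
        while (lits--)                  /* copy literal run */
            dst[di++] = src[si++];
        if (si >= srclen)
            break;
        size_t len  = src[si++];
        size_t dist = src[si++];
        while (len--) {                 /* byte-by-byte copy: correct even
                                           when dist < len (overlap) */
            dst[di] = dst[di - dist];
            di++;
        }
    }
    return di;
}
```

For example, the stream {3,'a','b','c',6,3} decodes to "abcabcabc": three literals, then six bytes copied from three bytes back, with the copy overlapping its own output.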
LZSA focuses on very fast decompression on 8-bit systems, so I suspect some of its variants could be used on MCUs as well, although matching the level of integration that Segger is selling with their compression method may not be trivial.
The author calls it "unmaintained", and Google Code was shut down in 2016; everything there is archived read-only. While such a project probably does not need regular updates, bugs are often found in code over time, and for this project there is effectively no place to report, discuss, fix, or integrate them. So "abandoned" does not seem inaccurate.
Yeah, it doesn't really have a proper home anymore, but it is pretty much finished as a library. I have used it for many years without a problem, though it may still have issues on edge cases I haven't hit. I mainly mention it here because it took me a while to find a small, simple library that can easily be embedded on smaller microcontrollers and uses no dynamic memory. Anything done well in that space is hard to come by.
Segger makes high quality stuff, but they’re expensive and their business model is “anti-modern”. The cost to use emCompress is $2500 USD.
Also, they specify that SMASH gets a worse compression ratio than LZMA but is an “order of magnitude less complex measured in lines of code”.
I’m not convinced that fewer lines of code means better performance on embedded systems. If anything, embedded code that has several thousand lines of code typically means there are hundreds of preprocessor macros that enable/disable certain paths depending on your architecture for performance.
Lines of code is not all that important. But program size is a key metric for microcontroller-type embedded systems, since the total available FLASH is typically in the 1-1000 kB range.
Same conclusion as others: no source, no figures, TFA is basically an ad.
There are already tools to compress on a PC and decompress on a small machine, with a more-than-decent compression ratio and decompression speed, for example Exomizer https://bitbucket.org/magli143/exomizer
As an example, a decompression routine for the Z80 is about 154 bytes of code, needs 156 bytes of private data, and the decompression speed is not that much lower than a memory-to-memory copy speed. It even has options to ease the case when decompressing in the same memory range as the compressed data so that you can load data in RAM and decompress in-place.
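The in-place option mentioned above can be illustrated with the same kind of toy decoder (an invented token format, not Exomizer's real one): the loader places the compressed stream at the end of the destination buffer, and decoding proceeds forward from the start. Since output is written behind the input read position, compressed bytes that have already been consumed can safely be overwritten.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Toy in-place decode sketch: the compressed stream sits at the END of
 * buf, and decoded output grows from the START of the same buffer.
 * Token format (invented for illustration):
 *   repeat { [lit_count][literals...][match_len][match_dist] } */
static size_t decode_in_place(unsigned char *buf, size_t buflen,
                              size_t complen)
{
    size_t si = buflen - complen;   /* read head: compressed data  */
    size_t di = 0;                  /* write head: decoded output  */
    while (si < buflen) {
        size_t lits = buf[si++];
        while (lits--)              /* literals: reader stays ahead */
            buf[di++] = buf[si++];
        if (si >= buflen)
            break;
        size_t len  = buf[si++];
        size_t dist = buf[si++];
        while (len--) {             /* match copy from earlier output */
            buf[di] = buf[di - dist];
            di++;
        }
    }
    return di;
}
```

In the general case the write head can catch up with the read head if the data compresses poorly in places, so real tools compute a small safety margin when placing the compressed stream; this sketch only demonstrates the buffer layout.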
While interesting, the niche for SMASH seems small.
You'd need a system big enough to have some RAM to decompress into in the first place, so that rules out the ultra-small systems with 256 bytes of RAM or less.
On mid-sized systems (say 16 kB+ of RAM), I think you'd still mostly run directly from ROM; any RAM is probably needed for other uses.
Larger systems with ARM M0 core (or better) and 64 kB+ RAM have a ton of CPU power and typically plenty of other resources as well.
Perhaps there are people who need to deal with mask ROMs, but do those still need to be that small in 2020? Or penny-pinching cases where you just can't add an extra I2C EEPROM / flash chip and your target uC doesn't have enough flash storage.
Maybe I'm missing something.
In any case, LZ-family compressors (like LZ4) have served me well so far.
Edit: That LZSA mentioned in other comment(s) looks cool. Regardless, a small niche of applications where you'd need this in the first place.
That would depend a lot on the architecture of the controller: not all of them let you execute code from RAM (Harvard vs. von Neumann architecture differences).
You would need to understand what the worst-case decoding time/operations are for a given-sized packet, and build your system/code around that.
Real-time does not necessarily mean that every operation X must take the exact same time; it just means that you know every operation X will complete within some known span.
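That bound can be turned into a concrete budget. If the decoder's inner loops do at most some fixed amount of work per output byte (true of simple LZ decoders, which emit one byte per inner-loop iteration), the worst case for a packet follows directly from its maximum decoded size. All the numbers below are made-up placeholders, not measurements:

```c
#include <assert.h>

/* Hypothetical figures -- replace with measured values for a real part. */
#define CYCLES_PER_BYTE_WORST 40ull     /* assumed worst-case decode cost */
#define MAX_DECODED_BYTES     4096ull   /* cap enforced by the packet format */
#define CPU_HZ                48000000ull

/* Worst-case decode time for one packet, in microseconds. */
static unsigned long worst_case_decode_us(void)
{
    return (unsigned long)(CYCLES_PER_BYTE_WORST * MAX_DECODED_BYTES
                           * 1000000ull / CPU_HZ);
}
```

If that figure fits inside the deadline with margin, variable per-packet decode time is acceptable in the real-time sense described above.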
Real-time systems are unfortunately named. Time-constrained systems would have been better. Variable naming is, after all, one of the two real hard things in Compsci.
I think yes in principle, although this is why you test things thoroughly. (In an embedded context there probably won't be much data flowing around, and the code will probably be simple enough that any glaring problems are fairly obvious.)
Ultimately it depends on whether you mean embedded as in a self-driving car, or embedded as in a microwave controller.
https://www.bigmessowires.com/2016/05/06/fc8-faster-68k-deco...