This post covers a topic I've spent some time with in the past, and is generally a good overview but unfortunately gets the idea of "linear RGB" wrong. That means all of the results need some attention, including the Go implementation. Maybe for a part II post?
Each color value (e.g. red) is represented by a value from zero to full intensity. It's easiest to think of it as a number between 0 and 1 in a linear space. You could use a floating point number for that, or a quantized/fixed point value. For example the 10-bit quantized value round(r_linear*1023) in the range 0 to 0x3ff.
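As a quick sketch in Go (the helper name is mine, not from any library), that 10-bit quantization step is just:

```go
package main

import (
	"fmt"
	"math"
)

// quantize10 maps a linear intensity in [0, 1] to a 10-bit
// fixed point value in the range 0 to 0x3ff.
func quantize10(rLinear float64) uint16 {
	return uint16(math.Round(rLinear * 1023))
}

func main() {
	fmt.Println(quantize10(0.0), quantize10(0.5), quantize10(1.0)) // 0 512 1023
}
```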
8-bit RGB color components are "encoded" from their linear version with a transfer curve (aka gamma compression). For sRGB, the curve is a piecewise linear and exponential combination. A good overview is [1]. There are many different encodings, including sRGB, BT.601, BT.709, etc. Then there's "full range" vs. "video range"... it can get complex pretty quickly.
Because of gamma encoding, an 8-bit R_sRGB red value is not equal to round(r_linear*255). You have to first compress r_linear via the gamma curve, then quantize that 0..1 value to 8-bits. When going in reverse (expanding an 8-bit sRGB value to linear), you generally take R_sRGB/255 to produce a value in the 0..1 range and then use the inverse gamma curve to get the linear value. These computations can be done in floating point, fixed point, or using lookup tables.
The takeaway is that you can't represent 8-bit sRGB color components in linear with just 8 bits without losing precision. You need at least 12 bits for linear sRGB, and many implementations just go straight for 32-bit float values for simplicity.
These conversions are required whenever you combine (blend) pixels encoded into sRGB: so for each pixel operation X, you decode sRGB to linear, perform X, then encode back to sRGB. It's expensive! That's why GPUs offer texture formats that specify a gamma encoding like sRGB, so a pixel shader can blissfully work in linear color, with the conversions done for it in hardware as a pre- and post-shader operation. On the CPU? You have to do it all yourself...
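A sketch in Go of that decode, operate, encode round trip for a simple 50/50 blend, using the common gamma ~= 2.2 approximation of the sRGB curve to keep it short:

```go
package main

import (
	"fmt"
	"math"
)

// blendSRGB averages two 8-bit sRGB components in linear light,
// approximating the sRGB transfer curve with a plain gamma of 2.2.
func blendSRGB(a, b uint8) uint8 {
	la := math.Pow(float64(a)/255, 2.2) // decode a to linear
	lb := math.Pow(float64(b)/255, 2.2) // decode b to linear
	l := (la + lb) / 2                  // the pixel operation, done in linear
	return uint8(math.Round(math.Pow(l, 1/2.2) * 255)) // encode back
}

func main() {
	fmt.Println(blendSRGB(0, 255)) // 186, not the naive 128
}
```

Averaging black and white in the encoded domain gives 128; doing it in linear light gives 186, which matches what a 50/50 spatial mix of black and white pixels actually looks like from a distance.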
Because of that, many software libraries don't bother with the proper gamma conversion and just compute everything in the logarithmic (gamma encoded) domain. And most of the time, it looks OK! But it really is just a "cheap" approximation -- sometimes it can look quite bad compared to the (proper) linear computation...
As far as I can tell, none of the Go standard library does linear blending; and all of the image formats are assumed to be sRGB encoded. There are some 3rd party packages like [3] that can do some color management on a 16-bit linear image format (RGBA64 == 16 bits/component RGBA).
The other thing the author might consider is revising the "Why?" footnote to the "Random Noise (grayscale)" section. What the author is actually doing there is just using a cheap approximation to a rounding function: round(x) ~= floor(x + 0.5). In general, doing a round like that introduces a bias [2]. That section can be summarized as: after every pixel operation, round and clamp back to the valid range.
Just a small correction:
Logarithmic != gamma encoded. Most software uses sRGB encoding, which is close to a power function (often referred to as a gamma function). Logarithmic encoding is often used for encoding HDR images, but it is not what most software uses.
When I referred to "logarithmic domain", I'm not talking about a purely log/exp transfer function, but one that is "log like". Perhaps it is more accurate to say "non-linear domain"... but I hope you get the idea. :)
The sRGB transfer function is piecewise linear + exponential but can be closely approximated by a simple exponential with \gamma ~= 2.2 [1]. Either way, the encoding between linear and non-linear is generally referred to as "gamma correction", even when using a transfer function that is not a simple exponential.
Even if the article gets the "linear RGB" idea wrong, it doesn't matter for the results, because each channel in all explored palettes is still 1-bit: either on, or off, sometimes with the constraint that red and green cannot be on at the same time.
Unfortunately not when you're performing error diffusion, where the residuals are added to neighboring pixels. If you're doing it in the non-linear domain, you're going to get a different result during that diffusion step, even when you're dithering to a target 1-bit/pixel image.
You can see this visually in Surma's excellent blog post [1]: look for the gradient strips in the "Gamma" section.
> none of the Go standard library does linear blending;
I would understand a C++ library from the 1990s getting this wrong, or some toy project not bothering to implement colour management properly.
But to develop a new programming language for the 2010s to 2020s and blithely assume that images are always 8-bit sRGB is lazy beyond belief...
To put things in perspective, this would be roughly the same as making an application around the same time that simply assumes that the screen resolution is a fixed 1024x768 pixels.
My receipt printer library[0] uses dithering for images, since the printers are black and white but fairly high resolution. It looks pretty decent [1], much better than a plain black and white conversion. I learned about dithering while implementing this library, and it’s a great technique! I originally implemented a black and white filter and then a Floyd–Steinberg dither natively in C#, but ended up using a library for it since we needed image transforms as well.
Ah, that 3rd image reminds me of trying to send images to my ESC/POS darling. I think I've got an entire roll of garbage from Linux Bluetooth glitches corrupting the output (macOS works 100% reliably on the same binary stream).
Awesome, please do try it and let us know how it goes. As for the garbage, gotta keep a hand on that power button so you can flip it off quickly when it starts scrolling garbage; it can really burn through paper quick!
I use an Epson printer[0] for developing the library, but this print out was from a contributor who’s trying to add support for the CUSTOM brand of receipt printers. If you have the Adafruit, please give the lib a try and let me know! It works on raspberry pis too.
Dithering is so cool! I love the different looks you can get with different algorithms. I added dithered output to a little 3D renderer I’m building and I think it looks super cool [0].
Also, it’s mentioned in the monochrome article, but Lucas Pope’s post on dithering in Obra Dinn is one of my favorite tech explorations [1].
Regarding the lack of info on "clustered dot" matrices, it may just be a keyword issue. This pattern is traditionally called "halftone" and comes from the limitations of print design.
I've done some graphics programming but never specifically for halftone patterns, a quick search turns up this SO post with some examples:
My understanding is that while "halftone" is an older term now, there was an in-between period where it meant any kind of dithering. Indeed, even on the page you linked you see people talking about Floyd-Steinberg and other kinds of dithering.
Thanks for the link, although the best I could find was this[1] old Java file that's completely uncommented. I was hoping for some textual documentation.
The StackOverflow answer above is referenced by various Python halftoning projects, which have documented or at least commented halftone code you may be interested in:
I think grandparent post is right, that you’re looking for the terms like halftone, halftoning, digital halftoning, binary halftoning, and the like. These are the terms that show up in papers:
Having worked in early digital press, I recall the term of art for a higher quality technique that came after “halftone screening” or “digital halftone” was “stochastic screening”:
This prevented the tell-tale circle effect (moiré or rosette patterns) of traditional screening. We used it to print high quality magazine photography. It also worked on Mac Plus for rendering photography as a better dither for black and white dots of irregular or non-geometric photography — which brings us full circle to dithering.
Using terms from Photoshop of the day, Pattern Dither uses a uniform pattern to represent levels of gray. Diffusion Dither uses a random pattern to represent levels of gray. Halftone Screen uses preset patterns (round, diamond, ellipse, line, square or cross) at frequencies and angles that can be varied as well.
When you're doing color error diffusion dithering, you have to make sure the edges and corners of the color space are in your output palette. Otherwise you may get errors that accumulate and can't be eliminated.
Thanks, I didn't know about this! Do you have any links to learn more? In any case, this is on the dev who uses the library and specifies a palette rather than the library itself though.
Sorry, I don't have any links - I learned it through personal experience. Kept getting ugly blotches in the output that weren't easily explained. Finally traced it to errors that were accumulating because the palette color they were mapped to left an even greater positive error.
> You can factor out the observer [the human] because what you are interested in here is basically energy conservation.
I would tend to disagree from a theoretical perspective, as you are trying to conserve energy from the point of view of the observer.
Our perception is not linear. The dithered images in the article appear too bright to me, probably due to the Helmholtz-Kohlrausch effect. Likewise, green is a lot brighter than other colors to us, and we have a greater depth of perception, so it might be more important to propagate errors there?
As another commenter has noted, the most likely reason for that is missing gamma correction. An sRGB value of 128 is dithered as 50% black, 50% white, when it should be around 78% black, 22% white.
No, the article and the library both do gamma correction, and I've observed the exact effect you're talking about before. I'm not sure why the images appear too bright for the person above. Maybe because it's using pure colours like red and green, which are bright on RGB displays? I should've chosen a different palette.
That’s what I get for parroting claims I haven’t verified! That’s great, thanks for the correction.
Hmm, as far as I understand the math, your palette being pure colors shouldn’t really matter if the display calibration were perfect. Which I guess is an unrealistic expectation. :)
However, pure red will always appear brighter than non-pure red, even at the same luminance (~energy), which is what Helmholtz-Kohlrausch is about. IIRC pure red appears brighter than other pure colors as well at the same luminance level, but I am not sure how computer monitors interact with all of the above...
Dithering has always given me the most incredibly overwhelming nostalgia, to the point where it’s emotionally moving. No doubt because my first computer was a Macintosh SE and it tickles all the right memories, but there’s something more to it, the technical impressiveness of how damn well it works to add realism with so few colours.
Doesn't it look cooler now because it's "retro"? I'm sure the people who were forced to use it for hardware reasons would've loved to have all 24 bits available to them.
Primitive/retro looking cool reminds me of Veblen on the "exaltation of the defective":
"...the cheap, and therefore indecorous, articles of daily consumption in modern industrial communities are commonly machine products; and the generic feature of the physiognomy of machine-made goods as compared with the hand-wrought article is their greater perfection in workmanship and greater accuracy in the detail execution of the design. Hence it comes about that the visible imperfections of the hand-wrought goods, being honorific, are accounted marks of superiority in point of beauty, or serviceability, or both. Hence has arisen that exaltation of the defective, of which John Ruskin and William Morris were such eager spokesmen in their time; and on this ground their propaganda of crudity and wasted effort has been taken up and carried forward since their time. And hence also the propaganda for a return to handicraft and household industry. So much of the work and speculations of this group of men as fairly comes under the characterisation here given would have been impossible at a time when the visibly more perfect goods were not the cheaper." – Thorstein Veblen, The Theory of the Leisure Class, 1899, p162
Posterizing, which I guess is a type of low-bit dithering, or not dithering at all, has been popular for a long time. There's the famous posterized Obama "Hope" poster that was everywhere in 2008.
As presented, it is not really about going beyond 1 bit. It goes beyond 1 color channel, with the twist that the color channels are not independent. The adaptation of dithering algorithms to a black-red-green palette (with a missing yellow) and the exploration of their behavior is the scientific content that the article showcases.
And the "missing yellow" constraint is important; without it, the problem would be separable and therefore uninteresting.
The matrices for clustered dot halftoning should not be too hard to generate. I think it boils down to which approach you take to generate the threshold matrix:
> (ordered dither): This algorithm is generally identified as a dispersed-dot technique [Limb 69], but if the intensity threshold levels are spatially concentrated it results in a clustered-dot dithering.
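As a sketch of that distinction, here is a hypothetical 4x4 clustered-dot threshold matrix (low thresholds concentrated in the middle, so the dot grows outward from the center like a printed halftone) used as a plain ordered dither:

```go
package main

import "fmt"

// A clustered-dot threshold matrix: the low thresholds sit together in
// the middle, so rising intensity grows one dot from the center. A
// dispersed-dot (Bayer) matrix would spread them apart instead.
var clustered = [4][4]int{
	{12, 5, 6, 13},
	{4, 0, 1, 7},
	{11, 3, 2, 8},
	{15, 10, 9, 14},
}

// ditherCell renders one 4x4 cell of a constant gray level in [0, 1].
func ditherCell(level float64) [4][4]bool {
	var out [4][4]bool
	for y := 0; y < 4; y++ {
		for x := 0; x < 4; x++ {
			out[y][x] = level*16 > float64(clustered[y][x])
		}
	}
	return out
}

func main() {
	// At 30% gray, 5 of the 16 thresholds are exceeded, all clustered
	// around the center:
	// ....
	// ###.
	// .##.
	// ....
	for _, row := range ditherCell(0.3) {
		for _, on := range row {
			if on {
				fmt.Print("#")
			} else {
				fmt.Print(".")
			}
		}
		fmt.Println()
	}
}
```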
You want to know something cool? Here's a tech report from HP that describes a steganography approach based on halftoning:
There also is the void and cluster method, which gives nice results comparable to error diffusion while not suffering its data dependencies, i.e. it's as parallelizable/GPU-friendly as ordered dithering.
I built a tool for playing around with different dithering effects and palettes if anyone is interested in having a play: https://doodad.dev/dither-me-this
Dithering is also really handy if you're screen printing and want to do things in one ink color (i.e. grey to 1-bit B&W). Having Illustrator trace the dot layer is a bit painful, but it seems to work!
In my experience, I've been unable to detect any discernible difference compared to other 16-bit or 24-bit hi-res audio at 44.1 or 96 kHz. What _is_ totally discernible is the file sizes in the gigabyte range, as opposed to tens of megabytes for equivalent 44.1/96 kHz FLACs. I've found this a good resource for comparison: http://www.2l.no/hires/
[1] https://blog.johnnovak.net/2016/09/21/what-every-coder-shoul...

[2] http://www.cplusplus.com/articles/1UCRko23/

[3] https://github.com/mandykoh/prism