Unicode is not hard. What's hard is converting between all these different systems. Unicode is simple enough to be done flawlessly as long as you stick to Unicode for everything.
If you only need to receive, store, and send text, Unicode is easy enough and you can just treat it as a byte stream. Once you get into things like manipulating text, comparisons and searches, or displaying text, things get hairy and all kinds of fun algorithms from the various Unicode Technical References and Notes make their appearance. Those parts are the ones that increase complexity.
Also, a major reason why Unicode is large and complex is that languages and scripts are large and complex. Unless we all agree to use simple, computer-friendly languages and scripts, that complexity is not going to change, and the need to work with older scripts (e.g. for historians and researchers) still requires something like Unicode. Unicode is the kind of thing that emerges from a messy world, and unsurprisingly it's messy as well.
Unicode is still _way_ less hard than anything else for manipulating text. Global human written language is complicated; Unicode is a pretty ingeniously designed standard, and it's got solutions that work pretty darn well for almost any common manipulation you'd want to do. Now, everything isn't always implemented or easily accessible on every platform, and people don't always understand what to do with it -- because global human written language is complicated -- but Unicode is a pretty amazing accomplishment, quite successful in various meanings of 'successful'.
It's hard, because there's a lot more to learn and to do than if you stick to (say) ASCII and ignore the problems ASCII can't handle.
It's easy, because if you want to solve a sizable fraction of all the problems ASCII just gives up on, Unicode's remarkably simple.
In the eyes of a monoglot Brit who just wants the Latin alphabet and the pound sign, unicode probably seems like a lot of moving parts for such a simple goal.
Something as simple as moving the insertion point in response to an arrow key requires a big table of code point attributes and changes with every new version of Unicode. Seemingly simple questions like "how long is this string?" or "are these two strings equal?" have multiple answers and often the answer you need requires those big version-dependent tables.
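A quick Python sketch of those multiple answers (the string contents are just illustrative):

```python
import unicodedata

s = "e\u0301"   # "é" as 'e' + U+0301 COMBINING ACUTE ACCENT (decomposed)
t = "\u00e9"    # "é" as the single precomposed code point U+00E9

# "How long is this string?" depends on what you count:
print(len(s), len(t))   # 2 1 -- code points, not user-perceived characters

# "Are these two strings equal?" depends on normalization:
print(s == t)                                # False: different code point sequences
print(unicodedata.normalize("NFC", s) == t)  # True after normalizing to NFC
```

And neither answer involves grapheme clusters yet, which is where the big version-dependent tables come in.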
I think Unicode is about as simple as it can possibly be given the complexity of human language, but that doesn't make it simple.
A Brit hoping to encode the Queen's English in ASCII is, I'm afraid, somewhat naïve. An American could, of course, be perfectly happy with the ASCII approximation of "naive", but wouldn't that be a rather barbaric solution? ;)
For anything resembling sanely typeset text you’d also want apostrophes, proper “quotes” — as well as various forms of dashes and spaces. Plus, many non-trivial texts contain words in more than one language. I’d rather not return to the times of in-band codepage switching, or embedding foreign words as images.
But comparing something to something else and finding it easier doesn't, by itself, make it easy.
Paraphrasing the joke about new standards: we had a problem, so we created a beautiful abstraction. Now we have more problems. One of the new problems being normalization.
It doesn't undermine the good that Unicode brought, but you can't claim to just include some unilib.h and use its functions without understanding all the Unicode quirks and its encodings, because some of the parameters wouldn't even make sense to you, like those same normalization forms.
1. Either you restrict yourself to the kind of text CP437/MCS/ASCII can handle (to name the three codecs in the blog posting). In that case unicode normalisation is a no-op, and you can use unicode without understanding all its quirks.
2. Or you don't restrict the input, in which case unicode may be hard, but using CP437/MCS/ASCII will be incomparably harder.
Unicode IS hard. It's hard because concepts that exist in ASCII don't really extend to Unicode, and many of them depend on what locale you're operating in. Things like case conversion (in Turkish, ToUpper("i") should be "İ", not "I"), comparison (where do you put é, ê, and e?), what constitutes a character, word, whitespace, what direction do you write text in, how many spaces do characters take up in the terminal, etc.
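Python's built-in case conversion illustrates a couple of these traps; it is deliberately locale-independent, so the Turkish rule above is exactly the kind of thing it gets wrong:

```python
# Case conversion is not a 1:1, locale-free mapping:
print("ß".upper())       # "SS" -- one character becomes two
print(len("İ".lower()))  # 2    -- lowercasing can add a combining mark
print("i".upper())       # "I"  -- always, even where Turkish needs "İ"
```

Locale-aware case mapping needs something like ICU; the standard library alone can't do it correctly for Turkish.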
Some of these concepts exist when limited to ASCII.
For example, in olden times, or when restricted to ASCII, the Nordic letter "å" is written "aa", but it is still sorted at the end of the alphabet — "Aarhus" will be close to the end of a list of towns.
In Welsh there are several digraphs, single letters written with two symbols. The town "Llanelli" has 6 letters in Welsh. (There are ligatures, but I don't think they're often used: Ỻaneỻi.)
Indeed, collation, case-insensitive string matching, and probably a bunch of other things must be used with an appropriate locale. That was the case before Unicode and is still the case with Unicode. The only difference is that the tables for how to do it are slightly larger now, but the operation itself isn't (much) more complex.
I would edit that to say "as long as you stick with UTF-8 for everything." Unicode defines more than one encoding, not just UTF-8, but also UTF-16 and UTF-32.
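The three encodings represent the same code point as different byte sequences, e.g.:

```python
s = "é"  # U+00E9
print(s.encode("utf-8"))      # b'\xc3\xa9'
print(s.encode("utf-16-le"))  # b'\xe9\x00'
print(s.encode("utf-32-le"))  # b'\xe9\x00\x00\x00'
```

Which is why "stick to Unicode" isn't quite enough: two programs can both use Unicode and still disagree at the byte level.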
It's hard in command line tools like coreutils since there is no setting (afaict) for making sure all string comparisons are normalized. So comparing files whose names use decomposed vs precomposed glyphs is painful. E.g. with make: if the generated files use decomposed glyphs but you type precomposed glyphs into your makefile, then nothing will work, despite the filenames appearing to be the same.
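A sketch of that make problem in Python ("café.txt" is a hypothetical filename); normalizing both sides to the same form is what makes the comparison work:

```python
import unicodedata

on_disk = "cafe\u0301.txt"  # decomposed, as e.g. macOS filesystems return it
typed   = "caf\u00e9.txt"   # precomposed, as typed into the makefile

print(on_disk == typed)  # False, though both display as "café.txt"

# Normalize both sides before comparing:
print(unicodedata.normalize("NFC", on_disk) ==
      unicodedata.normalize("NFC", typed))   # True
```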