My problem with base-122 is simply that it's not an even power of 2. It's very e...

kobeya · on Nov 27, 2016

I'm pretty sure that Bitcoin has dealt with this issue for its base58 encoding. It might be worth checking if their algorithm is generalizable to other radix sizes.

deckar01 · on Nov 27, 2016

Check out digit-array if you are interested in the generalized algorithm. The source code is commented with formal math notation for the operations.

https://github.com/deckar01/digit-array/blob/master/README.m...

CiPHPerCoder · on Nov 28, 2016

This doesn't appear to address the problem. At all.

https://paragonie.com/blog/2016/06/constant-time-encoding-bo...

deckar01 · on Nov 28, 2016

I misread your original comment. Nothing I posted is related to constant time encoding. I apologize for the confusion.

CiPHPerCoder · on Nov 28, 2016

It's OK. I just want to be clear that my objection to base-122 is the same as my objection to base-N where N != 2^m for some integer m.

CiPHPerCoder · on Nov 28, 2016

No, it hasn't. I sent a pull request that attempts to solve this problem for a C# implementation of Base58Check, but there's no way for me to reliably guarantee that data isn't leaking from the division.

https://github.com/adamcaudill/Base58Check/pull/3
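For anyone wondering where the division comes in, here is a minimal sketch of a textbook base58 encoder using the Bitcoin alphabet. It is not the code from the pull request above; `base58_encode` is a hypothetical name. The point is only that every output digit comes from a `divmod` on a big integer derived from the input, followed by a table lookup indexed by the remainder, and both of those depend on the data.

```python
# Bitcoin-style base58 alphabet (no 0, O, I, l).
ALPHABET = "123456789ABCDEFGHJKLMNPQRSTUVWXYZabcdefghijkmnopqrstuvwxyz"


def base58_encode(data: bytes) -> str:
    """Textbook base58 encoding; illustrative only, NOT constant-time."""
    n = int.from_bytes(data, "big")
    digits = []
    while n > 0:
        # Data-dependent big-integer division: the operand is the secret.
        n, rem = divmod(n, 58)
        # Data-dependent table lookup: the index is derived from the secret.
        digits.append(ALPHABET[rem])
    # Each leading zero byte is conventionally written as a leading '1'.
    pad = len(data) - len(data.lstrip(b"\x00"))
    return "1" * pad + "".join(reversed(digits))
```

With a power-of-2 radix the same job is done with fixed shifts and masks, which is why the division only shows up for radices like 58 or 122.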

deckar01 · on Nov 27, 2016

I think you are focusing on the wrong number.

Each character of base64 carries 6 bits of data, so the byte boundary aligns at 24 bits. LCM(6,8) = (6•8)/2 = 24

Each byte of base122 carries 7 bits of data, so the byte boundary aligns at 56 bits. LCM(7,8) = (7•8)/1 = 56

Edit: Due to the variable length encoding, there is no guarantee of byte alignment.
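The alignment arithmetic can be checked with a few lines of Python. This is just a sketch of the LCM calculation for fixed-width symbols; `alignment_bits` is a made-up helper name, and it does not model the variable-length case mentioned in the edit.

```python
from math import gcd


def alignment_bits(bits_per_symbol: int, byte_bits: int = 8) -> int:
    """Smallest bit count that is a whole number of symbols and of bytes."""
    return bits_per_symbol * byte_bits // gcd(bits_per_symbol, byte_bits)


print(alignment_bits(6))  # 6-bit symbols: 24 bits = 4 symbols = 3 bytes
print(alignment_bits(7))  # 7-bit symbols: 56 bits = 8 symbols = 7 bytes
```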

beagle3 · on Nov 27, 2016

If it were base 128 it would have produced 7 bits per symbol. But as it is 122, a symbol carries only log2(122) ≈ 6.93 bits, hence a problem.
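A quick way to see the difference, as a sketch: the information per symbol of a plain radix-N system is log2(N), and it is a whole number of bits only when N is a power of 2. The helper names here are made up for illustration.

```python
import math


def bits_per_symbol(radix: int) -> float:
    """Information carried by one symbol of a plain radix-N number system."""
    return math.log2(radix)


def is_power_of_two(radix: int) -> bool:
    """True when symbols map onto a whole number of bits."""
    return radix > 0 and radix & (radix - 1) == 0


for radix in (64, 122, 128):
    print(radix, round(bits_per_symbol(radix), 2), is_power_of_two(radix))
```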

deckar01 · on Nov 27, 2016

> This uses one-byte characters to encode seven bits and two-byte characters to encode fourteen bits. Hence this attains the goal of encoding seven bits per byte, i.e. the 8 : 7 inflation ratio.

The magic is in the 2 byte encoding 110sss1x 10xxxxxx.

There are really 890 characters used in this encoding, so it should be called base890.

((2^7)-6)+((6•2)•(2^6))=890
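To double-check that count, here is a small sketch that enumerates every two-byte pattern of the form 110sss1x 10xxxxxx and decodes it as UTF-8. It assumes sss takes six values (one per illegal one-byte character), which is what the formula above implies; the variable names are mine.

```python
# 128 seven-bit values minus the 6 that are not usable as one-byte characters.
one_byte = 2**7 - 6

two_byte = set()
for sss in range(6):          # assumed: one value per illegal character
    for x in range(2):        # the single data bit in the first byte
        for low in range(64):  # the six data bits in the continuation byte
            first = 0b11000010 | (sss << 2) | x
            second = 0b10000000 | low
            # decode() raises if the pair were not valid UTF-8
            two_byte.add(bytes([first, second]).decode("utf-8"))

total = one_byte + len(two_byte)
print(one_byte, len(two_byte), total)  # 122 768 890
```

The fixed 1 bit in the first byte keeps it at 0xC2 or above, so none of the pairs is an overlong encoding.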

theoh · on Nov 27, 2016

A number system can have any natural number as radix, so why not? https://en.m.wikipedia.org/wiki/Radix

From the posted article: "This leaves us with 122 legal one-byte UTF-8 characters to use"

Seems legit to me.

dom0 · on Nov 27, 2016

> It's very easy to write a cache-timing-safe version of base{16,32,64} encoding ...

> Base-122? Not sure if it's even possible.

Parent is clearly referring to the previous statement (that a cache-timing-safe version may not be possible), not claiming that Base-122 itself is impossible.
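For context, this is roughly what "easy" looks like for base64: each 6-bit value can be mapped to its character with arithmetic only, so there is no table lookup and no branch on the data. A minimal sketch follows; `encode_6bits` is a hypothetical name, and Python itself makes no timing guarantees, so this only illustrates the arithmetic, which would normally be written in C or similar.

```python
def encode_6bits(src: int) -> int:
    """Map a value in 0..63 to its base64 ASCII code without lookups or branches."""
    diff = 0x41                        # 0..25  -> 'A'..'Z'
    # (k - src) >> 8 is -1 (all ones) when src > k, else 0, so it acts as a mask.
    diff += ((25 - src) >> 8) & 6      # 26..51 -> 'a'..'z'
    diff -= ((51 - src) >> 8) & 75     # 52..61 -> '0'..'9'
    diff -= ((61 - src) >> 8) & 15     # 62     -> '+'
    diff += ((62 - src) >> 8) & 3      # 63     -> '/'
    return src + diff


print("".join(chr(encode_6bits(v)) for v in range(64)))
```

This works because 64 symbols split cleanly into 6-bit groups; with 122 symbols there is no such fixed-width grouping to mask out.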

theoh · on Nov 27, 2016

Not obvious to me at all.