ARM immediate value encoding (2014) (mcdiarmid.org)
97 points by JoshTriplett on Jan 30, 2018 | 26 comments


And with Aarch64 they invented several other types of immediate value encodings: https://stackoverflow.com/questions/30904718/range-of-immedi...


Including the incredibly weird "bitmask immediate" for logical operations.

It's so complicated that the assembler in binutils actually implements it by brute-forcing!

https://github.com/bminor/binutils-gdb/blob/c40d7e49cf0a6842...
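To make the brute-force idea concrete, here is a minimal Python sketch. It is not the binutils code, and every name in it is made up for illustration. It assumes the usual description of a bitmask immediate: an element of 2, 4, 8, 16, 32 or 64 bits that holds a rotated run of ones (neither all zeros nor all ones), replicated to fill the 64-bit register. Enumerating every combination once gives a lookup set.

```python
MASK64 = (1 << 64) - 1

def ror(value, amount, width):
    """Rotate `value` right by `amount` bits inside a `width`-bit field."""
    amount %= width
    mask = (1 << width) - 1
    return ((value >> amount) | (value << (width - amount))) & mask

def replicate(element, size):
    """Repeat a `size`-bit element until it fills 64 bits."""
    result = 0
    for shift in range(0, 64, size):
        result |= element << shift
    return result

def all_bitmask_immediates():
    """Enumerate every 64-bit value reachable under the assumptions above."""
    values = set()
    for size in (2, 4, 8, 16, 32, 64):
        for ones in range(1, size):        # run length; 0 and `size` ones are excluded
            run = (1 << ones) - 1
            for rotation in range(size):
                values.add(replicate(ror(run, rotation, size), size))
    return values

BITMASK_IMMEDIATES = all_bitmask_immediates()

def is_bitmask_immediate(value):
    """True if `value` (taken modulo 2**64) is in the enumerated set."""
    return (value & MASK64) in BITMASK_IMMEDIATES
```

Under those assumptions the set has 5334 members, so a one-time enumeration plus a lookup is cheap. Values such as 0, all-ones and 0x101 are not in it.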


Another very interesting characteristic of the ARM instruction set is that MOV and AND each have an opposite op (MVN and BIC).

That means you can write MOV r0, #-1 (equivalent to #0xFFFFFFFF) and the assembler will emit MVN r0, #0 for you. This increases the number of constants that can be directly assigned.
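A rough Python sketch of that substitution, assuming the A32 rule from the article (an 8-bit value rotated right by an even amount). The function names are hypothetical and a real assembler does considerably more than this.

```python
MASK32 = 0xFFFFFFFF

def rol32(value, amount):
    """Rotate a 32-bit value left by `amount` bits."""
    amount %= 32
    return ((value << amount) | (value >> (32 - amount))) & MASK32

def encode_a32_immediate(value):
    """Return (imm8, rot) such that value == imm8 ROR (2 * rot), or None."""
    value &= MASK32
    for rot in range(16):
        imm8 = rol32(value, 2 * rot)   # rotating left undoes the right rotation
        if imm8 <= 0xFF:
            return imm8, rot
    return None

def load_constant(reg, value):
    """Pick MOV or MVN for `value`; return None if neither can encode it."""
    value &= MASK32
    if encode_a32_immediate(value) is not None:
        return f"MOV {reg}, #0x{value:X}"
    inverted = ~value & MASK32
    if encode_a32_immediate(inverted) is not None:
        return f"MVN {reg}, #0x{inverted:X}"
    return None
```

With this sketch, `load_constant("r0", -1)` falls through to the MVN branch because 0xFFFFFFFF is not encodable but its complement, 0, is.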


MVN and BIC are not "opposite ops"; they are the same operations as MOV and AND, except that they first apply a bitwise NOT to every bit of the operand, which is easy to implement in hardware. It's true that this also allows for better code density.


Just wondering, would it be better to normalise like floating point does and always include a virtual 1 on the front of the number? This encoding wastes space because some numbers can be encoded multiple ways, e.g. 3 and 6.

On the other hand, that would mean zero couldn't be encoded, but you don't need a zero for most maths as it doesn't do anything (and you could use a zero register.)

Edit to add: the even shift is a problem, but you could use 7 bits of number, 1 implicit leading 1 bit, and 5 bits to cover any 32-bit shift.
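The amount of waste can be counted directly. The sketch below compares the existing scheme with one reading of this proposal. Treating the 5-bit field as a right-rotation is an assumption (the comment only says "shift"), and the function names are invented for illustration.

```python
MASK32 = 0xFFFFFFFF

def ror32(value, amount):
    """Rotate a 32-bit value right by `amount` bits."""
    amount %= 32
    return ((value >> amount) | (value << (32 - amount))) & MASK32

def current_scheme():
    """8 stored bits, rotated right by twice a 4-bit field (A32 as described)."""
    return {ror32(imm8, 2 * rot) for imm8 in range(256) for rot in range(16)}

def proposed_scheme():
    """7 stored bits plus an implicit leading 1, rotated by a 5-bit field (assumed)."""
    return {ror32(0x80 | frac, rot) for frac in range(128) for rot in range(32)}

if __name__ == "__main__":
    print(len(current_scheme()), "distinct values from 4096 encodings today")
    print(len(proposed_scheme()), "distinct values from 4096 encodings as proposed")
```

On this reading, the 12 bits go from 3073 distinct constants to 4096, at the cost of losing zero and needing a full 32-position rotator.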


"ARM, like other RISC architectures MIPS and PowerPC, has a fixed instruction size of 32 bits. This is a good design decision"

So good that they got rid of it in their new chips.


AArch64's ISA has only 32-bit instructions.


Can you clarify what you mean by that? Are you talking about Thumb? Although that's not exactly "new" I suppose.


Yep, all these ISAs have a 16-bit extension.


Except, for some reason, ARM's new 64-bit ISA, which has 32-bit instructions only. I wonder how long before they release a "Thumb-64" extension...


I doubt it'll be as easy as it was for 32-bit ARM because AArch64 has no (well, only a few) predicated instructions. Instead, the 64-bit ISA effectively uses those bits to name additional registers.

AArch64 already has similar code density to x86-64 anyway.


Wow, this is very cool and I had no idea it was part of the architecture. It's from 2014, but still news to me!

I find myself using (apparently unnecessary) 'LDR' instructions because of this sort of thing all the time. And that adds up, in Cortex-M chips with kilobytes of program memory.


Note that the Cortex-M chips implement only the Thumb/Thumb2 encodings. Those have different rules for encoding immediates -- the rules this article describes are for the 32-bit ARM (A32) encoding.


Oh...well, thanks. That is slightly less exciting, but still probably worth looking into.


Thumb-2 immediate encoding is even better (and cooler)!

The following classes of constant are available in Thumb-2:

• Any constant that can be produced by shifting an 8-bit value left by any number of bits.

• Any replicated halfword constant of the form 0x00XY00XY.

• Any replicated halfword constant of the form 0xXY00XY00.

• Any replicated byte constant of the form 0xXYXYXYXY.

See page 92 here: http://hermes.wings.cs.wisc.edu/files/Thumb-2SupplementRefer...
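The four classes above translate into a short membership test. This is a sketch based only on that list, with a hypothetical function name. It checks whether a value belongs to one of the classes and does not produce the actual 12-bit field.

```python
def is_thumb2_modified_immediate(value):
    """True if `value` falls into one of the four constant classes listed above."""
    value &= 0xFFFFFFFF
    if value == 0:
        return True

    # Class 1: an 8-bit value shifted left. Dropping the trailing zero bits
    # must leave something that fits in 8 bits (no wraparound is considered).
    shifted = value
    while (shifted & 1) == 0:
        shifted >>= 1
    if shifted <= 0xFF:
        return True

    low_byte = value & 0xFF
    if value == ((low_byte << 16) | low_byte):            # 0x00XY00XY
        return True

    second_byte = (value >> 8) & 0xFF
    if value == ((second_byte << 24) | (second_byte << 8)):  # 0xXY00XY00
        return True

    if value == low_byte * 0x01010101:                    # 0xXYXYXYXY
        return True

    return False
```

Note the contrast with A32: a wrapped constant such as 0xF000000F fits the A32 rotation scheme but none of these classes, while 0x00FF00FF is the other way round.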

There is also MOVT, which will set the top 16 bits of a register to any 16-bit value and not affect the bottom 16 bits.

And there is MOVW (not to be confused with MOV.W, the wide encoding of MOV that takes one of the constants above), which will set the bottom 16 bits to any value and zero the top 16 bits.

MOVW + MOVT allows loading any 32-bit value in 2 instructions (8 bytes).
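A small model of the pair, written from the description above. The helper names are invented, and the register is modelled as a plain Python integer.

```python
MASK32 = 0xFFFFFFFF

def movw(imm16):
    """MOVW: write the low halfword and zero the high halfword."""
    return imm16 & 0xFFFF

def movt(register, imm16):
    """MOVT: write the high halfword and keep the low halfword."""
    return ((imm16 & 0xFFFF) << 16) | (register & 0xFFFF)

def load32(value):
    """Model what the register holds after MOVW followed by MOVT."""
    value &= MASK32
    return movt(movw(value & 0xFFFF), value >> 16)

def movw_movt(reg, value):
    """Return the assembly lines needed; MOVT is skipped if the top half is 0."""
    value &= MASK32
    low, high = value & 0xFFFF, value >> 16
    lines = [f"MOVW {reg}, #0x{low:04X}"]
    if high:
        lines.append(f"MOVT {reg}, #0x{high:04X}")
    return lines
```

For example, `movw_movt("r0", 0x12345678)` yields a MOVW of 0x5678 followed by a MOVT of 0x1234. The order matters, because MOVW clears the top half.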


I did some off-the-cuff analysis of some random code a couple of years ago and found the traditional ARM constant encoding to be not as good - for that code, at least - as the MIPS-/POWER-style load low/or high business. So if my code is at all representative I'm not surprised they went for MOVW+MOVT in the end: https://news.ycombinator.com/item?id=11607119#11608650

Regarding the 00XY00XY and XY00XY00 forms: I'm going to have to sleep on this, but this is a bit of mystery to me so far. I'm going to have to keep an eye out for those sorts of constants now! Maybe they're actually quite common?? - I suppose this will certainly let you form many kinds of mask useful for SIMD operations.


Kind of off topic but does anyone have any resources on operating system development for AArch64?


PowerPC has addis (add immediate shifted) though, which, as the name implies, shifts the immediate value 16 bits to the left before the add. In the worst case, you can add any arbitrary 32-bit value to any register with two opcodes.

So, is it the case that with ARM, you _may_ be able to do the same add in one op, but its worst case appears to be 4 ops for the full range of immediate values?
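One detail of the two-opcode PowerPC sequence is that addi sign-extends its 16-bit immediate, so the high half sometimes has to be bumped by one to compensate. A sketch of the split on a 32-bit register model, with made-up helper names:

```python
MASK32 = 0xFFFFFFFF

def split_for_addis_addi(value):
    """Split a 32-bit constant into (high16 for addis, signed low16 for addi)."""
    value &= MASK32
    low = value & 0xFFFF
    if low >= 0x8000:
        low -= 0x10000            # addi treats the field as a signed value
    high = ((value - low) >> 16) & 0xFFFF   # compensates when low is negative
    return high, low

def add_constant(base, value):
    """Model addis followed by addi on a 32-bit register."""
    high, low = split_for_addis_addi(value)
    after_addis = (base + (high << 16)) & MASK32
    return (after_addis + low) & MASK32
```

For 0x12348000 the split is high = 0x1235 and low = -0x8000, not 0x1234 and 0x8000.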


For nonrepresentable constants, a compiler will usually generate a constant pool and emit a single PC-relative load instruction to materialize the value. Many assemblers have directives to do this automatically.


so... what is the proper way to set a register to the value 0x101, or any other value that cannot be represented?


In ARMv7 and previous, the program counter is an ordinary register, so you can perform a load relative to the PC for anything that doesn't fit into an instruction's immediate field. Compilers typically emit so-called literal pools before the entry or after the exit point of each function (or sometimes before a basic block inside very large functions). Some compilers support literal pool merging as a code size reduction technique, too.

Edit: Also, ARMv8 has dedicated instructions for PC-relative loads and jumps.
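A toy model of literal pool placement and merging, for illustration only. It assumes A32, where the PC reads as the instruction's address plus 8 and an LDR literal can reach about 4 KB in either direction. The function name, the addresses and the data layout are all invented.

```python
def build_literal_pool(loads, pool_base):
    """
    loads: list of (instruction_address, constant) pairs.
    Returns (pool, offsets): the merged pool contents and, for each load,
    the PC-relative offset its LDR would carry.
    """
    pool = []       # deduplicated constants in first-use order
    offsets = []
    for address, constant in loads:
        constant &= 0xFFFFFFFF
        if constant not in pool:        # merging: reuse an existing slot
            pool.append(constant)
        slot_address = pool_base + 4 * pool.index(constant)
        offset = slot_address - (address + 8)   # A32: PC = instruction + 8
        if not -4095 <= offset <= 4095:
            raise ValueError(f"pool slot out of range for load at {address:#x}")
        offsets.append(offset)
    return pool, offsets
```

Three loads at 0x1000, 0x1004 and 0x1008 with a pool at 0x1010, two of them using 0x101, end up sharing one slot, so the pool holds two words instead of three.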


On ARM1156, and ARMv7 and newer, use MOVW, which takes a simple 16-bit immediate. If 16 bits doesn’t do it, use a MOVW/MOVT pair which can load any 32-bit value.

On older ARMs, either do a move immediate followed by an immediate add, like so:

    MOV R0, #0x0100
    ADD R0, R0, #0x01

Or use a PC-relative load:

    LDR R0, [PC, #offset]    ; offset = distance from PC to the word holding 0x101

Which the assembler allows you to abbreviate as:

    LDR R0, =0x101

And it will find a spot for 0x101 and generate the correct PC-relative load.


Use two instructions; load part of it then load the rest.


If I understand correctly, for some values that would take four instructions.
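The bound of four follows from covering 32 bits with 8-bit chunks at even bit positions. Here is a rough sketch of one way to do the split (MOV for the first chunk, ORR for the rest). It is a greedy heuristic with invented names, it ignores MVN/BIC/SUB tricks, and it is not guaranteed to find the shortest sequence.

```python
MASK32 = 0xFFFFFFFF

def ror32(value, amount):
    """Rotate a 32-bit value right by `amount` bits."""
    amount %= 32
    return ((value >> amount) | (value << (32 - amount))) & MASK32

def rol32(value, amount):
    """Rotate a 32-bit value left by `amount` bits."""
    return ror32(value, 32 - amount % 32)

def is_a32_immediate(value):
    """True if value is an 8-bit constant rotated by an even amount."""
    return any(rol32(value, r) <= 0xFF for r in range(0, 32, 2))

def split_into_chunks(value):
    """Greedily cover the set bits with byte-wide chunks at even positions."""
    value &= MASK32
    best = None
    for start in range(0, 32, 2):          # try each even starting position
        rotated = ror32(value, start)
        chunks, position = [], 0
        while position < 32:
            if ((rotated >> position) & 3) == 0:
                position += 2              # nothing to cover here
                continue
            chunk = rotated & (0xFF << position) & MASK32
            chunks.append(rol32(chunk, start))
            position += 8
        if best is None or len(chunks) < len(best):
            best = chunks
    return best or [0]

def materialize(reg, value):
    """Return a MOV followed by ORRs that build `value` in `reg`."""
    chunks = split_into_chunks(value)
    lines = [f"MOV {reg}, #0x{chunks[0]:X}"]
    lines += [f"ORR {reg}, {reg}, #0x{c:X}" for c in chunks[1:]]
    return lines
```

With this sketch 0x101 needs two instructions, and a value with bits spread across all four bytes, such as 0x12345678, can need up to four.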


You would use the LDR instruction to load the word from memory. So just one instruction, but with a slower memory access.

When writing ARM assembly, the assemblers (at least the ones I've seen) abstract this detail away from you.

You can use

    ldr    r0, =0xff000000
and the assembler will emit a MOV if it can, and if it can't it will put the value in a literal pool somewhere in memory and use LDR to load it up.

ARM is fun.


Modern ARM actually has a 16-bit immediate version of MOV, plus MOVT which is the same but moves to the top 16 bits of a register. With a pair of those, you can load any 32-bit value in 2 instructions, without polluting the data cache like you would with LDR [pc+offset].



