According to a question at the end, this is about very old CPUs, K8/K10, because the newer ones authenticate microcode updates with public key cryptography which hasn't been broken. Still pretty amazing stuff.
That's just the tail end of K10 production (2012 according to Wikipedia). Its successor, Bulldozer, came out in 2011, but a new architecture being out doesn't mean its predecessor immediately stops production.
The "last time" this was done was in the x87 days, where if the math coprocessor wasn't installed you could trap the corresponding interrupt and handle it to emulate the instructions.
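To make the trap-and-emulate idea concrete, here is a toy model in Python, not real interrupt-handling code. The mnemonics (FSQRT, FABS, FCHS) are real x87 instructions, but the exception class, the dispatch table, and the function names are all hypothetical stand-ins: the exception plays the role of the CPU trap, and `run` plays the role of the OS handler.

```python
import math

class CoprocessorMissing(Exception):
    """Toy stand-in for the trap the CPU raises when no FPU is present."""
    def __init__(self, opcode, operand):
        super().__init__(opcode)
        self.opcode = opcode
        self.operand = operand

# Hypothetical software implementations, keyed by x87 mnemonic.
SOFTWARE_FPU = {
    "FSQRT": math.sqrt,
    "FABS": abs,
    "FCHS": lambda x: -x,
}

def execute(opcode, operand, has_fpu):
    """Toy 'hardware': run the op if an FPU exists, otherwise trap."""
    if not has_fpu:
        raise CoprocessorMissing(opcode, operand)
    # For simplicity the toy 'hardware' reuses the same table.
    return SOFTWARE_FPU[opcode](operand)

def run(opcode, operand, has_fpu):
    """Toy 'OS': catch the trap and emulate the instruction in software."""
    try:
        return execute(opcode, operand, has_fpu), "hardware"
    except CoprocessorMissing as trap:
        return SOFTWARE_FPU[trap.opcode](trap.operand), "emulated"
```

The program calling `run` gets the same answer either way; only the path (and, on real hardware, the speed) differs.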
> The "last time" this was done was in the x87 days, where if the math coprocessor wasn't installed you could trap the corresponding interrupt and handle it to emulate the instructions.
Wasn't Hackintosh (or getting new OS X running on too old hardware) also using this technology to "support" CPUs without the newest SSEx instruction sets?
AMD already implements 256bit operations in terms of 2 128bit operations. Seems like going to 4 wouldn't be a stretch, at least for some of the simpler operations. Seems like a subset of operations would be possible.
> AMD already implements 256bit operations in terms of 2 128bit operations.
Current Zen has actual 256-bit registers, it just doesn't have the execution units to process the whole register at once. It's not really the same thing.
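As a rough illustration of why a wide lane-wise operation can be issued in narrower pieces, here is a sketch in Python. It assumes a 256-bit register is modeled as eight 32-bit integer lanes; the function names are made up for the example and say nothing about how any AMD core actually schedules its micro-ops.

```python
MASK32 = 0xFFFFFFFF

def add128(a, b):
    """Lane-wise add of four 32-bit lanes with wraparound (one 128-bit op)."""
    assert len(a) == len(b) == 4
    return [(x + y) & MASK32 for x, y in zip(a, b)]

def add256_split(a, b):
    """256-bit add (eight 32-bit lanes) issued as two 128-bit halves."""
    assert len(a) == len(b) == 8
    low = add128(a[:4], b[:4])
    high = add128(a[4:], b[4:])
    return low + high

def add256_native(a, b):
    """Reference: the same add done across all eight lanes at once."""
    assert len(a) == len(b) == 8
    return [(x + y) & MASK32 for x, y in zip(a, b)]
```

The split works here only because the lanes never interact. Operations that move data across the 128-bit boundary, such as full-width permutes, do not decompose this cleanly.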
I think Zen2 implements 256 bit instructions natively [0]. For AVX512, the new instructions [1] rather than the floating point arithmetic will be a problem IMHO. Emulating them with the microcode will be expensive and will provide no performance gains.
I wonder if it would be possible to dump, from microcode, the contents of the microcode ROM. This would neatly sidestep the problems inherent in decoding the ROM contents from pictures of decapped chips.
Nope. The chip is very much designed around x86 decoding, even before you get to the ucode ROM/RAM. Additionally, you only have a handful of patch RAM locations.
It would be hard, because the ISA is tightly bound to the underlying silicon's structure.
Some instructions cannot be translated to the silicon efficiently, or at all.
E.g., MIPS64 has 32 general-purpose 64-bit registers, and you can use any of them as a source or a destination, whereas x86 always designates EAX as the ALU accumulator. This has some profound effects on silicon design.
Actually no. After decoding there is nothing special about the EAX register.
AMD at some point was going to release K12, which was basically Zen but with an ARM decoder. It got cancelled when Zen proved viable and AMD decided it was better to compete with Intel than with all the ARM vendors.
> Actually no. After decoding there is nothing special about the EAX register.
Modern x86 processors use register renaming to move things around internally, but the actual instructions imply that the results should end up in the EAX register. You cannot arbitrarily do a MUL and get the result from EBX, for example [3]. I.e., x86 assembly dictates where the results end up.
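For anyone unfamiliar with the implicit operands, here is a small Python model of the one-operand unsigned 32-bit MUL: it multiplies EAX by the named source and writes the 64-bit product to EDX:EAX. The register-file dictionary and the function name are just illustrative choices, not anything from a real emulator.

```python
MASK32 = 0xFFFFFFFF

def mul(regs, src):
    """Model one-operand unsigned MUL: EDX:EAX = EAX * regs[src].

    The destination is fixed by the instruction; the caller only
    chooses the source register.
    """
    product = (regs["eax"] & MASK32) * (regs[src] & MASK32)
    regs["edx"] = (product >> 32) & MASK32  # high half
    regs["eax"] = product & MASK32          # low half

regs = {"eax": 0xFFFFFFFF, "ebx": 0xFFFFFFFF, "edx": 0}
mul(regs, "ebx")
# EBX is only read; the result lands in EDX:EAX.
```

Note that this covers only the one-operand form. The two-operand IMUL form does take an explicit destination register, which is worth keeping in mind when reading the disagreement above.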
AMD played with two ideas: A pure ARM core, and a hybrid x86 core with ARM co-processor. The ARM core missed the performance targets [0], and they also abandoned the ARM accelerated x86 core [1], but I don't know why.
They never intended to go full TransMeta and transcode the x86 ASM into something proprietary or ARM.
Bonus: It seems they are still mulling the idea of an x86/ARM hybrid [2].
I'm asking why. Is there some reason for them not to open it? AMD are quite positive about opening up other things, like GPU drivers for example. So why not firmware as well?
In the GPU case I know the reason: it's the DRM garbage (HDCP and Co.). Supporting DRM requires them to keep it closed. But even there, they could provide alternative firmware without DRM and make that open. For the CPU, though, there seems to be no real reason.
because there's a lot of proprietary stuff in microcode that's used for accelerations. gfx drivers too. it's the reason the closed amd drivers are so much faster than the open mesa ones.
yeah... I don't know what numbers you're looking at but that's not true in the general case. and this isn't firmware, it's microcode. firmware is already on the chip. microcode is used so the os can take advantage of chip specific features, like security patches or even acceleration.
Do you have actual benchmarks which show the closed source OpenGL driver significantly faster than the open source one? In Phoronix benchmarks I've seen, the open source driver beats the closed source one by a large margin.
A lot has changed in the last two years. Nowadays you have an occasional game that is faster on the blob driver, but most are faster under Mesa, often significantly so.
Mesa almost always uses proprietary firmware. The fail0verflow guys did some work last year to at least document it for the PS4's GPU to patch a bug. But the upstream Radeon Mesa guys are really hesitant to upstream it to avoid pissing off AMD. https://github.com/fail0verflow/radeon-tools/tree/master/f32
Of course that's all sorta orthogonal because that's all not really microcode or firmware in the classic definition, but just "code for an embedded processor I don't want to document."