LLVM IR is a lot easier to generate than machine code and unlike machine code, it can be optimized after it is generated.
When using LLVM IR as a compiler target, you leave certain decisions to be made later by the LLVM-to-machine-code compiler. In particular, you don't have to do exact instruction selection, instruction scheduling, register allocation (LLVM IR has an infinite supply of virtual "registers"), or stack frame layout when compiling. The LLVM backend is very good at making these decisions, and writing a backend that is better than LLVM's would take years of work.
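To make "infinite registers" concrete, here is a small hand-written IR function (an illustrative sketch, not from the thread): every named value like `%sum` is an SSA virtual register, a function may use as many as it likes, and the backend later assigns them to physical registers or stack slots during register allocation.

```llvm
; Computes (a + b) * a. %a, %b, %sum and %prod are SSA virtual
; registers -- there is no fixed limit on how many a function may
; use; mapping them onto the target's real register file is the
; backend's job, not the frontend's.
define i32 @mul_add(i32 %a, i32 %b) {
entry:
  %sum = add i32 %a, %b
  %prod = mul i32 %sum, %a
  ret i32 %prod
}
```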
LLVM IR is also machine independent, so you can write just one compiler front end that emits LLVM IR and compile it to machine code for many different CPUs. You can even distribute the LLVM bitcode and use the LLVM JIT compiler to generate machine code at run time, when you know exactly what the target architecture is, and produce code optimized just for that platform (CPU instruction set extensions such as SSE, NEON, and other SIMD extensions, etc.).
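The one-frontend-many-targets workflow can be sketched with the standard LLVM command-line tools (this assumes an LLVM toolchain with `clang`, `llc`, and `lli` installed; target triples shown are examples):

```shell
# Emit LLVM IR from one frontend, once.
clang -S -emit-llvm hello.c -o hello.ll

# Compile the same IR for different CPUs with the llc backend.
llc -mtriple=x86_64-unknown-linux-gnu  hello.ll -o hello_x86.s
llc -mtriple=aarch64-unknown-linux-gnu hello.ll -o hello_arm.s

# Or enable platform-specific instruction set extensions.
llc -mtriple=x86_64-unknown-linux-gnu -mattr=+sse4.2 hello.ll -o hello_sse.s

# Or JIT the IR directly at run time with LLVM's interpreter/JIT driver.
lli hello.ll
```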
In summary, LLVM IR is a lot easier to generate than machine code and the LLVM backend will emit better machine code than a hand built backend (unless you have years to spend on it).
> LLVM IR is a lot easier to generate than machine code and unlike machine code, it can be optimized after it is generated.
Just a small nit: it's possible to optimize machine code too. In fact, LLVM has some peephole optimization passes at the MachineInstr level, and hooks to add others. LLVM IR is, however, much more amenable to optimization, since it's higher-level, contains type information, and was designed from the start to be optimizable (e.g. it's in SSA form).
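As a concrete illustration of why IR is so amenable to optimization (a hand-made example, not from the thread): in SSA form every value is defined exactly once and every operand's type is explicit, so use-def chains are trivial to follow and a standard pass can constant-fold this entire body.

```llvm
; Every value has a single definition and an explicit type, so a
; pass like instcombine can trace %b -> %a -> constants and fold
; the whole computation at compile time.
define i32 @fold_me() {
entry:
  %a = add i32 2, 3
  %b = mul i32 %a, 4
  ret i32 %b
}
```

Running `opt -O1 -S` over this module reduces the body to a single `ret i32 20`.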
LLVM IR is more abstract than a concrete machine assembly language, but it still carries implicit and explicit traces of the machine it was compiled for, for example:
* Types that exist on one machine and not another (e.g. x86_fp80 does not exist on ARM).
* Endianness
* Structure alignment
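For example, the header clang puts at the top of every emitted module already bakes in exactly these details. The string below is what clang emits for an x86-64 Linux target (the exact contents vary by target and LLVM version):

```llvm
; "e" declares little-endian byte order, "f80:128" gives the
; x86_fp80 type 128-bit alignment, and the triple names the exact
; OS and CPU -- none of which transfers to, say, a big-endian or
; ARM target.
target datalayout = "e-m:e-p270:32:32-p271:32:32-p272:64:64-i64:64-f80:128-n8:16:32:64-S128"
target triple = "x86_64-unknown-linux-gnu"
```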
As the link above says, LLVM IR is an excellent compiler IR, but it was never meant to be a portable machine-independent IR.
But it can be abstracted enough for some purposes, and some projects do use it for related things.
Look at IR as a layer of its own, separate from Clang. You can even use it as your source language (although it's inconvenient, of course). For example, the Kaleidoscope language developed in the LLVM tutorial is target independent. The same is probably true for high-level languages that have had LLVM backends attached, e.g. Haskell, Emscripten (JavaScript), etc.
In a theoretical sense, yes. But it would be very hard to avoid introducing nonportable elements into your code. With any existing LLVM frontend I am familiar with, there is no practical way to go from an existing language to LLVM IR and keep it portable.