I want to compile a C program so simple I can explain all of the assembly

old-gregg · on March 16, 2010

I wish someone would dissect JVM in exactly the same way: i.e. clearly explained what needs to be stripped off to have a quickly loading "hello world" implementation eating less than 1MB of RAM and starting in a few microseconds, like a normal, sane process should.

All these wonderful things are being built (Clojure, JRuby) on top of JVM that are of no use outside of web/EE because default JVM is so heavy.

Yes, a VM does a lot more than just bootstrapping your stdlib, like in case of libc, yet I keep thinking there must be plenty of unnecessary fat to strip off. Just look at Microsoft CLR: same feature set, yet none of that sluggish starting, RAM-wasting JVM nonsense.

kg · on March 16, 2010

This reminds me of a blog post from one of the Unity developers:

We joke that doing anything in C# will result in an XML parser being included somewhere. This is not that far from the truth; e.g. calling float.ToString() will pull in whole internationalization system, which probably somewhere needs to read some global XML configuration file to figure out whether daylight savings time is active when Eastern European Brazilian Chinese calendar is used.

http://aras-p.info/blog/2009/11/14/improving-cmono-for-games...

The sad thing is that he's not kidding - if you profile a typical application under Mono or .NET, it loads an XML parser almost immediately.

vl · on March 16, 2010

.NET policy configuration files are in XML. To parse them there is another internal XML parser in mscorlib (.NET runtime dll).

InclinedPlane · on March 17, 2010

I heard a joke once that anytime you run Javascript in a browser an XML parser is being included somewhere.

judofyr · on March 16, 2010

It's not very well documented, but BiteScript looks very interesting:

http://blog.headius.com/2009/03/bitescript-001-ruby-dsl-for-...

http://blog.headius.com/2009/05/bitescript-002-scripting-exa...

bokchoi · on March 17, 2010

There are folks within Sun/Oracle doing exactly that. One effort, Project Jigsaw, is aimed at splitting the large rt.jar into smaller modules to reduce the number of classes loaded on startup. Here is an update from last December:

http://blogs.sun.com/alanb/entry/is_the_jdk_losing_its

Somewhere in there is a comment that says the JVM loads around 300 classes to run an empty main class.

I'm curious to see what the merged HotSpot+JRockit JVM will look like in a few years.

Tuna-Fish · on March 17, 2010

> Just look at Microsoft CLR: same feature set, yet none of that sluggish starting, RAM-wasting JVM nonsense.

If anything, the CLR is more bloated than the JVM, not less. The libraries it uses are simply loaded at system startup, so by the time you start an app they are already in memory.

Note that I don't think that the bloatedness of JVM or CLR is a bad thing -- as long as you properly dynamically link, the cost of having them is quite minimal on modern hardware.

a-priori · on March 16, 2010

If you really want to write code that you can understand all the way down I suggest starting from as close to bare metal as your level of masochism allows. For me, that's GRUB.

This tutorial walks you through making a kernel image that GRUB can load, and shows how to poke bytes into video memory to print characters to the screen:

http://wiki.osdev.org/Bare_bones
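For readers who have not seen the "poke bytes into video memory" trick: in VGA text mode the screen is an 80x25 grid of 16-bit cells, with the character in the low byte and the colour attribute in the high byte. The sketch below is a hosted simulation, not code from the tutorial. The names `vga_cell` and `vga_puts` are made up for illustration, and the buffer is passed in as a parameter so it can be tried on an ordinary array. In a real kernel the buffer would be the text-mode framebuffer at physical address 0xB8000, and `assert` would not be available.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

enum { VGA_COLS = 80, VGA_ROWS = 25 };

/* One text cell: low byte = ASCII character, high byte = colour attribute. */
uint16_t vga_cell(char c, uint8_t attr)
{
    return (uint16_t)((uint16_t)(unsigned char)c | ((uint16_t)attr << 8));
}

/* Write s starting at (row, col), stopping at the end of that row.
 * Returns the number of characters actually written.
 * buf stands in for the framebuffer (0xB8000 on real hardware). */
size_t vga_puts(volatile uint16_t *buf, size_t row, size_t col,
                const char *s, uint8_t attr)
{
    assert(row < VGA_ROWS && col < VGA_COLS);
    size_t n = 0;
    while (s[n] != '\0' && col + n < VGA_COLS) {
        buf[row * VGA_COLS + col + n] = vga_cell(s[n], attr);
        n++;
    }
    return n;
}
```

Attribute 0x07 is light grey on black, so writing "Hi" at row 1, column 2 stores 0x0748 and 0x0769 in cells 82 and 83.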

akgerber · on March 16, 2010

And then buy an FPGA and bust out some Verilog and write the metal yourself.

VBprogrammer · on March 16, 2010

When I was about 14 I found writing a floppy disk boot loader in x86 Assembly to be a good way of learning how a computer really works at the most basic level. Including reading FAT12 to find the start of your next piece of code and even displaying a splash screen!

While a little masochistic I find I still call upon what I learned back then while writing in C and to a lesser extent higher level languages.
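The fiddly part of reading FAT12 is that each table entry is 12 bits, so two entries share three bytes. A rough sketch of the lookup in C (the original was in assembly, which is not shown here; `fat12_entry` and `fat12_chain` are hypothetical names, and the FAT is assumed to be already loaded into memory):

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Return the 12-bit FAT entry for `cluster`; fat points at the raw FAT bytes. */
uint16_t fat12_entry(const uint8_t *fat, size_t fat_len, uint16_t cluster)
{
    size_t off = (size_t)cluster + cluster / 2;   /* 1.5 bytes per entry */
    assert(off + 1 < fat_len);
    /* read the two bytes that contain the entry, little-endian */
    uint16_t pair = (uint16_t)(fat[off] | (fat[off + 1] << 8));
    /* odd clusters use the high 12 bits, even clusters the low 12 bits */
    return (cluster & 1) ? (uint16_t)(pair >> 4) : (uint16_t)(pair & 0x0FFF);
}

/* Follow a cluster chain from `start`, storing up to `max` clusters in out[].
 * Values of 0xFF0 and above are reserved, bad-cluster or end-of-chain marks,
 * so the walk stops there. Returns the number of clusters stored. */
size_t fat12_chain(const uint8_t *fat, size_t fat_len, uint16_t start,
                   uint16_t *out, size_t max)
{
    size_t n = 0;
    uint16_t c = start;
    while (n < max && c >= 2 && c < 0xFF0) {
        out[n++] = c;
        c = fat12_entry(fat, fat_len, c);
    }
    return n;
}
```

A boot loader would then read each cluster in the chain from disk in order to pull in the next stage.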

wallflower · on March 16, 2010

"So I said, "I'll look into this floppy disk." And I started pulling up the datasheet on that chip, and I started coming up with my first ideas of "how do I have that chip get the data to a floppy disk?" And then I came up with this clever little approach. I needed a little bit of logic in here..."

Steve Wozniak, Founders at Work

Amazing full inspiring interview (I have read every interview in JL's book and it is by far my favorite)

http://www.foundersatwork.com/steve-wozniak.html

signa11 · on March 17, 2010

which kind of reminds me of Alan Kay's musing "Hardware is really just software crystallized early"

schemer · on March 16, 2010

6.828 gets down to the bare metal. Or close to it.

http://pdos.csail.mit.edu/6.828/2009/

pgbovine · on March 16, 2010

reminds me of a really old article (before the term 'blog' even existed) about a person exploring why the heck a 'hello world' Linux ELF binary was so darn big. i think it's an interesting exercise to figure out why 'hello world' in your favorite language/runtime environment is the size that it is (e.g., what initialization code is being called, are any VMs or interpreters being set up, etc.)

AgentIcarus · on March 16, 2010

That was my first thought as well. That article is found at http://www.muppetlabs.com/~breadbox/software/tiny/teensy.htm... (if that's the one you were thinking of)

pgbovine · on March 16, 2010

yup exactly! HN makes for a great crowdsourced expert search engine ;)

breadbox · on March 18, 2010

I'm not surprised you thought of that; Jessica McKellar's article almost sounds as if it was modeled after mine in places. I'm wondering now if that's due to actual imitation, or a subconscious influence, or if it's just a natural approach for someone to take to the subject. Which naturally leads to the question: was I unconsciously imitating someone else when I wrote my article?

(Come to think of it, I vaguely remember having in mind some of Isaac Asimov's science essays when I wrote that -- one of the type in which he would trace the discovery of some principle through several refinements over the centuries. Those essays gave me an appreciation for the value of studying science in its historical context. But that's a much more general influence than what I'm referring to.)

jjguy · on March 17, 2010

The same project with Win32 PEs: http://www.phreedom.org/solar/code/tinype/

I've used that page as a reference several times over the years.

plaes · on March 16, 2010

Yup, it was the same article that I was thinking about. Although architecture has changed from x86 to amd64.

Periodic · on March 16, 2010

I love being reminded occasionally about the lower levels of computer operation. It never ceases to amaze me how easy it is to gloss over the details of computers with an abstraction. We create layer upon layer of abstraction and can quickly forget what is underneath, if we ever knew in the first place.

As long as those abstractions don't leak, it isn't a big deal, but when they do you had better know what's going on down below.

rythie · on March 16, 2010

I once wrote a CPUid feature check program in assembly, it took surprisingly long to learn and write those 80 lines (including blank lines and comments)

The executable was 736 bytes: http://rythie.com/labs/cpuid.php

ryanmerket · on March 17, 2010

Check out this old school keygen tutorial I wrote: http://krobar.by.ru/krobar/other/key107.txt

najirama · on March 17, 2010

This is 'Hacker' News, not 'Cracker' News..though I must admit it was a provocative read.

vinhboy · on March 16, 2010

That was a really cool article, anyone with more knowledge on the subject want to comment on how sound the author's thought process is? Thanks.

onedognight · on March 16, 2010

The article is quite sound and doing all this is standard practice in the embedded world where you can't afford the size of glibc. The article is a bit incomplete, but presumably he'll address the rest in his next installment.

As a preview, to provide the correct C ABI to main() you need to zero the BSS (this is how your static variables get initialized), maybe copy the initialized data, call the constructors and destructors for C++ / C (gcc extension), provide memcpy() and other stdlib functions (which gcc will use even if you don't), etc. You can do all of this without assembly in C, but it does take some effort.
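A minimal sketch of those startup steps in plain C, under some loud assumptions: `struct image_layout`, `crt_init` and `demo_ctor` are hypothetical names, and the region boundaries are passed in by hand, whereas a real embedded build would take them from symbols defined in the linker script. Destructors and the call to main() are left out.

```c
#include <assert.h>
#include <stddef.h>

typedef void (*init_fn)(void);

/* Hypothetical stand-in for the symbols a linker script would export. */
struct image_layout {
    char *bss_start, *bss_end;        /* zero-initialised statics */
    const char *data_load;            /* initial values as stored in ROM/flash */
    char *data_start, *data_end;      /* where .data lives at run time */
    init_fn *init_begin, *init_end;   /* constructor table */
};

/* Example constructor, only here so the table has something to call. */
int demo_ctor_runs = 0;
void demo_ctor(void) { demo_ctor_runs++; }

void crt_init(const struct image_layout *img)
{
    assert(img != NULL);

    /* 1. Zero the BSS so statics without an initialiser start at 0. */
    for (char *p = img->bss_start; p != img->bss_end; ++p)
        *p = 0;

    /* 2. Copy initialised data if it was loaded somewhere else. */
    if (img->data_load != img->data_start) {
        const char *src = img->data_load;
        for (char *dst = img->data_start; dst != img->data_end; ++dst)
            *dst = *src++;
    }

    /* 3. Run the constructors in table order. */
    for (init_fn *f = img->init_begin; f != img->init_end; ++f)
        (*f)();
}
```

The copy and clear loops are written out by hand because in a freestanding build there is no libc memset() or memcpy() to lean on unless you supply them yourself.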

jedbrown · on March 16, 2010

> presumably he'll address the rest in his

by Jessica McKellar

rue · on March 16, 2010

Pretty sexist to make assumptions just because the dude's name is Jessica.

jedbrown · on March 16, 2010

touché

ajross · on March 16, 2010

It all looks correct to me. Some of it is a little "needlessly-surprised", honestly, like the bit about having to make a syscall trap to exit the program. Programs don't exit on their own: something needs to tell the kernel that the process is done.
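To make that concrete, here is a small sketch for a POSIX system (the helper name `exit_status_of_child` is made up for illustration). The child process ends by calling `_exit()`, which is the thin wrapper that asks the kernel to terminate the process without running any libc cleanup, and the parent reads back the status the kernel recorded.

```c
#include <assert.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Fork a child that terminates itself with _exit(code) and return the
 * exit status the kernel reports to the parent, or -1 on error. */
int exit_status_of_child(int code)
{
    assert(code >= 0 && code <= 255);   /* only the low 8 bits are reported */

    pid_t pid = fork();
    if (pid < 0)
        return -1;
    if (pid == 0)
        _exit(code);                    /* child: tell the kernel we are done */

    int status = 0;
    if (waitpid(pid, &status, 0) < 0 || !WIFEXITED(status))
        return -1;
    return WEXITSTATUS(status);
}
```

When main() simply returns in a normal C program, it is the startup code in libc that makes this call on the program's behalf.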

jcdreads · on March 16, 2010

This point isn't actually blindingly obvious. Those of us raised on Pascal might have assumed (or, in my case, actually did assume, without thinking hard about it) that the program ended when the execution point reached "the end," where the main method was probably located in the binary.

It's disturbing how many Pascal-isms still pervade my thinking, even 20 years since I last (willingly) touched the language.

watmough · on March 16, 2010

I don't think this is blindingly obvious.

I always imagined something 'called us', then just returning from that would be enough to get everything shutdown.

Certainly when I wrote 6502 code, or 68000 code, I would just return.

On the iPhone, you get advised you are shutting down, do your clean-up, then someone shuts down the message loop and you are gone.

DanielBMarkham · on March 16, 2010

There was an example I read recently of doing this in Windows that I liked more -- perhaps because the setup and tear-down was more interesting.

sundeep · on March 16, 2010

Do you have a link for that article? Sounds like something I'd be interested in reading. Thanks.

Gonsalu · on March 16, 2010

I think he's talking about this article: http://www.phreedom.org/solar/code/tinype/

acg · on March 16, 2010

I'm not sure I understand the revelation here. GCC is often used to target single boards with little resources. You can build gcc for the environment you want to target. Sounds like attempting to use a compiler flag in a way it wasn't intended. If you wanted to reduce the binary size wouldn't you look at the linker?

ars · on March 17, 2010

Did you actually read the article? You don't sound like you did.

kbradero · on March 17, 2010

Let's see a really simple program where you can explain all the assembly :)

We need to write our program like this:

    $ echo 0000000: 55 48 89 e5 b8 ff aa 00 00 c9 c3 | xxd -r > sum2.bin

Here we have the SAME little program in C:

    $ cat sum.c
    int sum(void){ return 0x00ff + 0xaa00; }

Look at the results:

    $ gcc -c sum.c -o sum.o

Get the raw opcodes (on OS X, Intel arch):

    $ otool sum.o -td|sed -n '3,$p'| awk '{ print $0}'|xxd -r > sum.bin

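Now you can look at your code at the asm level:

    $ ndisasm -b 32 sum.bin
    00000000  55          push ebp
    00000001  48          dec eax
    00000002  89E5        mov ebp,esp
    00000004  B8FFAA0000  mov eax,0xaaff    <-- gcc put our 'sum' final product here at compilation time
    00000009  C9          leave
    0000000A  C3          ret

(The bytes are actually 64-bit code, so -b 32 misreads the 48 REX prefix as "dec eax"; decoded as 64-bit, 48 89 e5 is "mov rbp,rsp".)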

So your program is now reduced to this code:

    $ hexdump sum.bin
    0000000 55 48 89 e5 b8 ff aa 00 00 c9 c3
    000000b

Test your 2 files:

    $ md5 sum.bin sum2.bin
    MD5 (sum.bin) = a0ccc94bcdc860a81ff28252f56c2257
    MD5 (sum2.bin) = a0ccc94bcdc860a81ff28252f56c2257

We could probe our code with a selfmade userland loader:

    $ ./uloader sum2.bin
    Display Opcodes to exec:
     55 48 89 e5 b8 ff aa 00 00 c9 c3
    End opcodes
    code to exec address: exec_code =0x100100080
    new crafted Proc : address = 0x100100080
    returned value ==>aaff

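----BEGIN Code---

    #include <stdio.h>
    #include <stdlib.h>
    #include <unistd.h>
    #include <fcntl.h>

    int main( int argc, char *argv[] ){
        unsigned int (*proc)();
        int fdprog=0;
        unsigned int *exec_code=NULL;
        unsigned char *ptr=NULL;
        unsigned int returned_value=0x0;

        if( argc < 2 ) return 1;                  // need the opcode file name

        // NOTE: jumping into malloc'd memory only works where the heap is
        // executable; on systems that enforce NX use mmap() with PROT_EXEC.
        exec_code=(unsigned int *) malloc( 100 );
        ptr=(unsigned char *)exec_code;
        fdprog=open(argv[1], O_RDONLY );          // open the opcode file read-only
        if( exec_code == NULL || fdprog < 0 ) return 1;

        printf("Display Opcodes to exec:\n");
        // read one opcode byte at a time, never past the 100-byte buffer
        while( ptr < (unsigned char *)exec_code + 100 && read(fdprog, ptr, 1) == 1 ){
            printf(" %02x ", *ptr );
            ptr++;
        }

        printf("\nEnd opcodes\n");
        printf("code to exec address: exec_code =%p \n",(void *)exec_code);
        proc=(unsigned int (*)() ) exec_code;     // treat the buffer as a function
        printf("new crafted Proc : address = %p \n",(void *)proc);
        returned_value= (*proc)();    //here is the magic bro! :)

        printf("returned value ==>%x \n",returned_value);
        return 0;
    }

----END Code ---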
Saludos! Jorge A. Garcia.
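
The "final product" in the dump above can be checked by hand: B8 is the opcode for "mov eax, imm32", and the four bytes after it are the immediate in little-endian order, so ff aa 00 00 reads as 0x0000aaff. A small sketch of that decoding follows. The helper name `mov_eax_imm32` is made up for illustration, and the offset of the B8 byte is passed in rather than found by a real disassembler.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* The 11 opcode bytes from the post. */
const uint8_t sum_code[11] = {
    0x55, 0x48, 0x89, 0xe5, 0xb8, 0xff, 0xaa, 0x00, 0x00, 0xc9, 0xc3
};

/* Decode the 32-bit little-endian immediate of a "mov eax, imm32"
 * instruction (opcode B8) located at byte offset `off`. */
uint32_t mov_eax_imm32(const uint8_t *code, size_t len, size_t off)
{
    assert(off + 5 <= len && code[off] == 0xB8);
    return (uint32_t)code[off + 1]
         | (uint32_t)code[off + 2] << 8
         | (uint32_t)code[off + 3] << 16
         | (uint32_t)code[off + 4] << 24;
}

/* Same function as sum.c: the compiler folds the addition at build time. */
int sum(void) { return 0x00ff + 0xaa00; }
```

Calling `mov_eax_imm32(sum_code, sizeof sum_code, 4)` gives 0xaaff, the same value `sum()` returns.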

jriddycuz · on March 16, 2010

Thanks for the article! Also, I love that your computer is named "kid-charlemagne". :)