Depends on the language. If you're using something like C++, you can probably remove a lot of dead code. But with more dynamic languages, like Objective-C or Ruby or JavaScript, it becomes difficult or impossible to prove that a given chunk of code is never used. In Objective-C (which I'm most familiar with), the linker will keep all Objective-C classes and methods because it has no idea if you might be looking things up by name at runtime and invoking them that way.
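To make that concrete, here's a minimal sketch in Python, standing in for any dynamic language. The `Exporter` class and its method names are made up for illustration. No call site mentions `export_csv` by name, yet it is reachable at run time because the name is assembled from a string:

```python
class Exporter:
    """Hypothetical class; no call site names these methods directly."""

    def export_csv(self, rows):
        return "\n".join(",".join(str(v) for v in row) for row in rows)

    def export_tsv(self, rows):
        return "\n".join("\t".join(str(v) for v in row) for row in rows)


def export(rows, fmt):
    # The method name is built at run time, so a tool that only follows
    # direct references cannot tell which of the methods are live.
    method = getattr(Exporter(), "export_" + fmt)
    return method(rows)


result = export([(1, 2), (3, 4)], "csv")
```

If `fmt` comes from user input or a config file, a dead-code stripper has to keep every `export_*` method to be safe.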
Even if you can throw away dead code, you can run into bloat because a library has a massive set of foundational APIs that the rest is built on, and using one little feature of the library ends up bringing in half the library code because it's used everywhere.
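A rough way to picture this is dead-code stripping as reachability over a call graph. The sketch below uses an invented graph (all function names are hypothetical) where the program calls a single library feature, and that feature leans on the shared core:

```python
from collections import deque

# Hypothetical call graph: each function maps to the functions it calls.
CALL_GRAPH = {
    "main": ["lib_resize_image"],
    "lib_resize_image": ["core_alloc", "core_log"],
    "lib_parse_xml": ["core_alloc", "core_strings"],
    "core_alloc": ["core_log"],
    "core_log": ["core_strings", "core_format"],
    "core_strings": [],
    "core_format": ["core_strings"],
    "lib_unused_helper": [],
}


def reachable(graph, roots):
    """Return every function reachable from the roots (breadth-first)."""
    seen = set(roots)
    queue = deque(roots)
    while queue:
        for callee in graph[queue.popleft()]:
            if callee not in seen:
                seen.add(callee)
                queue.append(callee)
    return seen


live = reachable(CALL_GRAPH, ["main"])
dead = set(CALL_GRAPH) - live
```

Only `lib_parse_xml` and `lib_unused_helper` can be dropped here. One small feature keeps all four `core_*` functions alive.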
You could use static analysis to find out whether anything looks up objects dynamically by name at run time, which could be a useful optimization. To be safe, this would be a single boolean for the entire code base, but with some care it would help. The larger problem is that size simply isn't considered a significant issue.
With Objective-C, that will always come back "true." Even if you never use dynamic lookups, Apple's frameworks do. I suspect the same is true of other languages.
Nibs and storyboards are full of this stuff. They instantiate classes by name, call methods by name, etc. Some example documentation if you're curious:
Objective-C's dispatch is built around looking up methods by their selector, which is effectively an interned name. Looking up the selector can be slow, but once you have one, invoking a method using a dynamic selector is as fast as invoking one with a selector that's known at compile time.
> Interesting, I find that surprising due to the overhead involved.
Due to the frequency of this, objc_msgSend (which handles dynamic method calls) is hand-written in assembly, with caching and "fast paths" to improve speed. The overhead can usually be brought down to that of a virtual function call in C++.
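The caching idea can be sketched in a few lines of Python. This is only a toy model of a per-class method cache with a slow path and a fast path, not how objc_msgSend is actually implemented. `msg_send`, `_method_cache`, and the two classes are invented for the example:

```python
class Base:
    def greet(self):
        return "hello from Base"


class Derived(Base):
    pass


# Per-class cache mapping a selector (here just a method name) to its
# implementation. A toy model, not the real runtime's data structure.
_method_cache = {}
slow_lookups = 0


def msg_send(obj, selector, *args):
    global slow_lookups
    cls = type(obj)
    cache = _method_cache.setdefault(cls, {})
    impl = cache.get(selector)
    if impl is None:
        # Slow path: walk the class hierarchy looking for the method.
        slow_lookups += 1
        for klass in cls.__mro__:
            if selector in vars(klass):
                impl = vars(klass)[selector]
                break
        else:
            raise AttributeError(selector)
        cache[selector] = impl
    # Fast path: call the cached implementation directly.
    return impl(obj, *args)


d = Derived()
results = [msg_send(d, "greet") for _ in range(1000)]
```

After the loop, `slow_lookups` is 1: only the first send walks the hierarchy, and the other 999 hit the cache.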
That's the thing about performance. If you do something a million times, it's usually OK if the first time takes a thousand times longer than the fast case, as long as subsequent times are fast.
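Plugging in those numbers (the unit of cost is arbitrary, and the variable names are just for this example):

```python
fast_cost = 1                   # cost of one call on the fast path
first_cost = 1000 * fast_cost   # the first call is a thousand times slower
calls = 1_000_000

total = first_cost + (calls - 1) * fast_cost
average = total / calls
overhead = average / fast_cost - 1   # extra cost relative to the fast path
```

The total comes to 1,000,999 units, so the average call costs about 1.001 units. The slow first call adds roughly 0.1% overall.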
Look at how much code is written in Ruby and Python. Their method dispatches are way slower. To put it in perspective, it takes CPython about an order of magnitude longer to add two numbers together than it takes Objective-C to do a dynamic method dispatch.
The interesting part is that all this dynamism and dependence on the runtime actually improves performance compared to C++. For example, it is perfectly possible to implement objc_msgSend in portable C such that, on modern superscalar and heavily cache-dependent CPUs, it is on average faster than a C++-style virtual method call.
Really? That seems highly unlikely, and doesn't match with any of the speed testing I've done.
When you get down to it, an ObjC message send performs a superset of the work of a C++ virtual call. A C++ virtual call gets the vtable pointer from the object, indexes into that table by a constant offset, loads the function pointer at that offset, and calls it. An ObjC message send gets the class pointer from the object, indexes into the class structure by a constant offset, loads the method cache information at that offset, uses the selector to look up the entry in the cache's hash table, and then, if all goes well, loads the function pointer from the table and jumps to it.
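Here are the two sequences side by side as a Python model. It only mirrors the steps described above and says nothing about real timings. All the names (`VObject`, `MObject`, `SPEAK_SLOT`, and so on) are invented:

```python
def speak_impl(obj):
    return "woof"


# C++-style: the slot index of each virtual method is a compile-time constant.
SPEAK_SLOT = 0


class VObject:
    vtable = [speak_impl]                 # one table per class


def virtual_call(obj):
    table = obj.vtable                    # 1. load the vtable pointer
    fn = table[SPEAK_SLOT]                # 2. index by a constant offset
    return fn(obj)                        # 3. call the function pointer


class MObject:
    method_cache = {"speak": speak_impl}  # selector -> implementation


def message_send(obj, selector):
    cls = type(obj)                       # 1. load the class pointer
    cache = cls.method_cache              # 2. load the method cache
    fn = cache.get(selector)              # 3. hash lookup keyed by selector
    if fn is None:
        # A real runtime would fall back to a slow lookup here.
        raise AttributeError(selector)
    return fn(obj)                        # 4. jump to the implementation


v = virtual_call(VObject())
m = message_send(MObject(), "speak")
```

Both calls end up in the same function. The message send just has the extra hash lookup and the miss check in between.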
Depends on whether you're statically or dynamically linking. When statically linking, the final binary shouldn't contain unused functions from external libraries.