Code Craft

Software is equal parts Art, Craft, and Engineering

Code Bloat - On Size & Efficiency

Introduction

Today I happened to read an old question on StackOverflow.com asking about math parsers in Java. One of the answers pointed the poster to the Java Math Expression Parser (JEP) package. Having written a parser of my own, my curiosity was piqued to check it out again, since it had been some time since I last looked at it.

I was struck by the first bullet point in the JEP feature list: “small size (only 260KB as jar archive)” - that’s somewhere between 500 KiB and 800 KiB before JAR compression. The JEP evaluator consists of some 16 public packages (that’s public, mind you, not counting whatever packages it depends on which are not publicly documented), containing some 316 classes; let’s assume that 25% are examples and we are down to 237 classes. By way of comparison, my math parser, which is quite capable (though admittedly nowhere near as capable as JEP), is 1 class at about 9 KiB.

My, How We’ve Grown

The other night I made the mistake of running a Java profiler on my dual-core 1.5 GHz laptop running Windows Vista with 1 GiB RAM. I accidentally left it running unattended for a couple of hours, during which it grew to about 150 MiB of RAM use. When I tried to bring the laptop back to life to use it, to my dismay the thing sat there for (I kid you not) about 20 minutes with the HDD light on solid and the GUI almost completely unresponsive as it tried to get its act together. This same laptop routinely takes on the order of 30 to 90 seconds just to get from the desktop to the switch-user screen.

How much would Microsoft still have in the bank if they had to pay out a dime for every minute a user lost due to a fault or poor design in Microsoft Windows? The best O/S I ever worked on had a micro-kernel architecture with a modular design such that you could fit a very capable POSIX-compliant O/S on a single HD floppy disk. And it was so efficient that it could service 64 high-speed modems streaming data on a single 486DX machine.

I know that computers continue to make rapid gains in performance and memory size, but since when is 260 KiB small for a math evaluator? And since when is 20 minutes acceptable for an operating system to reorganize its memory pool in order to begin responding to a user’s requests?

Believing a Lie

Is it just me, or has the whole software industry by and large bought into the lie that size and performance don’t matter anymore? Are we programmers so incredibly full of hubris that we really believe that our time as programmers is so much more valuable than our users’ time - even though that user time is multiplied across many hundreds, thousands, or in some cases millions of individuals? It all kind of reminds me of a visit to the dentist or the doctor - you know the one where you have an appointment at 10:00 am, and if the doctor sees you before 11:30 you consider yourself fortunate.

Have we taken the statement by Donald Knuth that “we should forget about small efficiencies, say about 97% of the time: premature optimization is the root of all evil” so badly out of context that we now believe that all attempts at efficiency are evil? Even out of context, Knuth is saying that at least 3 percent of the time small efficiencies matter. But he is at least implying that large efficiencies often matter.

It seems to me that we programmers have taken this statement as license to ignore all considerations of efficiency, at least until the end of the program’s development, and to treat efficiency as an afterthought. However, like most software problems, problems with performance and memory use will typically be cheaper to address early in the cycle rather than later. And very often it is the design that is fundamentally flawed, having failed to take into consideration the capabilities of the hardware on which it will run. That kind of flaw just cannot be fixed with 3 days and a profiler.

The performance problem is related to that of code bloat; the two often go hand in hand. At my company we sell, among other products, a Java applet. Since the applet is downloaded on demand and not pre-installed, the overall JAR size is one of our paramount concerns. I have to consider carefully how I code things and what I include. Often 3rd party libraries are simply too bloated for me to even consider - I could not, for example, use JEP and double the size of my JAR just to get the ability to evaluate the expression (Right+1-Left)/2.
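To give a sense of scale, here is a minimal sketch of a recursive-descent evaluator that can handle an expression like (Right+1-Left)/2. It is neither the parser mentioned in the introduction nor anything from JEP: the class name TinyEval, its eval method, and the sample values for Left and Right are all hypothetical, and it assumes only the four basic operators, unary minus, parentheses, decimal numbers, and named variables (Java 9 or later for Map.of).

```java
import java.util.Map;

/** Hypothetical minimal evaluator; an illustration only, not a full math parser. */
final class TinyEval {
    private final String src;
    private final Map<String, Double> vars;
    private int pos;

    private TinyEval(String src, Map<String, Double> vars) {
        this.src = src;
        this.vars = vars;
    }

    /** Evaluates the expression, looking up identifiers in the given variable map. */
    public static double eval(String expression, Map<String, Double> variables) {
        TinyEval p = new TinyEval(expression, variables);
        double value = p.expr();
        p.skipSpaces();
        if (p.pos < p.src.length()) {
            throw new IllegalArgumentException("Unexpected '" + p.src.charAt(p.pos) + "' at " + p.pos);
        }
        return value;
    }

    // expr := term (('+' | '-') term)*
    private double expr() {
        double v = term();
        while (true) {
            if (accept('+')) v += term();
            else if (accept('-')) v -= term();
            else return v;
        }
    }

    // term := factor (('*' | '/') factor)*
    private double term() {
        double v = factor();
        while (true) {
            if (accept('*')) v *= factor();
            else if (accept('/')) v /= factor();
            else return v;
        }
    }

    // factor := '-' factor | '(' expr ')' | identifier | number
    private double factor() {
        if (accept('-')) return -factor();
        if (accept('(')) {
            double v = expr();
            if (!accept(')')) throw new IllegalArgumentException("Expected ')' at " + pos);
            return v;
        }
        int start = pos; // accept() has already skipped leading whitespace
        if (pos < src.length() && Character.isLetter(src.charAt(pos))) {
            while (pos < src.length() && Character.isLetterOrDigit(src.charAt(pos))) pos++;
            String name = src.substring(start, pos);
            Double v = vars.get(name);
            if (v == null) throw new IllegalArgumentException("Unknown variable: " + name);
            return v;
        }
        while (pos < src.length() && (Character.isDigit(src.charAt(pos)) || src.charAt(pos) == '.')) pos++;
        if (start == pos) throw new IllegalArgumentException("Expected a value at " + pos);
        return Double.parseDouble(src.substring(start, pos));
    }

    // Consumes the given character (after optional whitespace) if it is next.
    private boolean accept(char c) {
        skipSpaces();
        if (pos < src.length() && src.charAt(pos) == c) {
            pos++;
            return true;
        }
        return false;
    }

    private void skipSpaces() {
        while (pos < src.length() && Character.isWhitespace(src.charAt(pos))) pos++;
    }

    public static void main(String[] args) {
        // Sample values are made up for the demonstration.
        System.out.println(eval("(Right+1-Left)/2", Map.of("Left", 10.0, "Right", 29.0))); // prints 10.0
    }
}
```

A sketch like this omits functions, exponentiation, and friendly error reporting, so it is not a substitute for a full library; the point is only that the common case can be covered in well under a hundred lines.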

A Call to Considerate Programming

What will it take to start a renaissance of mindfulness about size and efficiency in the computing industry? How long will it be before we stop wasting cycles on fading menus and animated windows? What can we do to drum in the message that, still, even today, “size does matter”, and so does speed?

Programmers - learn your art. Learn what designs and constructs are inherently efficient both in CPU cycles and memory. When you make trade-offs, make them knowledgeably and with understanding of your language and platform. Question everything you do and try to do it better next time. Grow in your art, always. Tomorrow be better, faster, and leaner than you are today. Instead of creating code which is fatter and saying “hardware is cheap, my time is not”, create it leaner and faster and remind yourself that every cycle not spent on your program is available to another.

I am not saying consider efficiency and size out of proportion; rather, just consider them, period.