Code Layout & Formatting


Introduction

This article focuses primarily on indentation style, but touches on other broad aspects of code layout and alignment. One caveat up front: coders who use variable-pitch fonts will likely need to ignore the parts of this article that relate to alignment, but the primary focus, indent style, applies to all coders and all programming and markup languages.

Code layout, and indentation style in particular, is something of a holy war in software engineering circles. Please allow me to add fuel to the fire by stating that I unashamedly believe that the vast majority of the industry blindly follows layout styles simply because they dominate, and not based on their merits… consequently the dominant styles lack merit and the majority of coders are not writing code which is as easy to comprehend as it could be.

The important thing is not whether one can get used to seeing code in a particular style (that is a given), but whether one chooses a coding style motivated by a drive for achieving best practices - an ideal that should be pursued in all aspects of exercising our art.

Code Indentation

In my opinion the primary goal when approaching the issue of code-layout is that the logical structure of the code is represented visually as accurately as possible. Think of it as looking at a screen of code and allowing your eyes to go slightly out of focus - you should still be able to clearly see the structure of the code. This means that the location of the block delimiters or opening/closing keywords should be chosen so as to best visually reflect the logical structure. The secondary goal is that the style be as compact as possible without significantly detracting from the first goal.
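The "out of focus" test can be approximated mechanically. The following is a minimal sketch, not a tool used in preparing this article: the hypothetical `Silhouette.of` method replaces every non-whitespace character with `#`, so that only the outline of the code remains. The sketch itself is laid out in the style advocated later in this article.

```java
final class Silhouette {
    // Replace every non-whitespace character with '#', keeping spaces,
    // tabs and line breaks, so that only the outline of the code remains.
    static String of(String code) {
        StringBuilder out = new StringBuilder(code.length());
        for (int i = 0; i < code.length(); i++) {
            char c = code.charAt(i);
            out.append(Character.isWhitespace(c) ? c : '#');
            }
        return out.toString();
        }

    public static void main(String[] args) {
        // A tiny if/else sample; any source text can be passed instead.
        String code =
            "if(a) {\n" +
            "    b();\n" +
            "    }\n" +
            "else {\n" +
            "    c();\n" +
            "    }\n";
        System.out.print(of(code));
        }
    }
```

Running the same source through this in different brace styles gives a rough, side-by-side view of how each style shapes the outline.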

Following is the same snippet of code (taken directly from Wikipedia, so as to avoid accusations of choosing code to fit my bias), laid out first in the three most prevalent indentation styles, and finally in the style that I advocate. Each code extract is examined in the context of our two code-style goals: visual structure and brevity.

BSD KNF Style (The Predominant Java/Sun Style)

void defineProp(void *data, int res) {
    if(data!=NULL && res>0) {
        if(JS_DefineProperty(cx,o,"data",STRING_TO_JSVAL(JS_NewStringCopyN(cx,data,res)),NULL,NULL,JSPROP_ENUMERATE)!=0) {
            QUEUE_EXCEPTION("Internal error!");
            goto err;
        }
        PQfreemem(data);
    } else {
        if(JS_DefineProperty(cx,o,"data",OBJECT_TO_JSVAL(NULL),NULL,NULL,JSPROP_ENUMERATE)!=0) {
            QUEUE_EXCEPTION("Internal error!");
            goto err;
        }
    }
    ...
}

This style fails on the visual structure goal because the closing braces are “in the way” (around the “else”) and create visual breaks where they don’t belong (before “PQfreemem”). They also create “dangling” artifacts following each block of code (after “goto err”). One particular problem with this style is that the “else”, which is strongly structurally associated with its “if”, is visually offset from it by the closing brace.

On the brevity goal I give this style a solid “A”.

This style ranks 4th in my opinion.

Allman Style

void defineProp(void *data, int res)
{
    if(data!=NULL && res>0)
    {
        if(JS_DefineProperty(cx,o,"data",STRING_TO_JSVAL(JS_NewStringCopyN(cx,data,res)),NULL,NULL,JSPROP_ENUMERATE)!=0)
        {
            QUEUE_EXCEPTION("Internal error!");
            goto err;
        }
        PQfreemem(data);
    }
    else
    {
        if(JS_DefineProperty(cx,o,"data",OBJECT_TO_JSVAL(NULL),NULL,NULL,JSPROP_ENUMERATE)!=0)
        {
            QUEUE_EXCEPTION("Internal error!");
            goto err;
        }
    }
    ...
}

This style fails on the visual structure goal because the braces create a lot of extraneous whitespace which separates the contents of blocks from their “parent” statements. As with BSD KNF they also create visual breaks where they don’t belong (before “PQfreemem” and around the “else”). The “dangling” artifacts are avoided because the opening and closing braces visually align with each other on the vertical axis. However, the braces create significant visual noise around the “else” weakening the visual association with its “if”.

On the brevity goal I give this style a mediocre “C”.

This style ranks 3rd in my opinion, because I will take clarity over brevity any day of the week.

Whitesmiths Style

void defineProp(void *data, int res)
    {
    if(data!=NULL && res>0)
        {
        if(JS_DefineProperty(cx,o,"data",STRING_TO_JSVAL(JS_NewStringCopyN(cx,data,res)),NULL,NULL,JSPROP_ENUMERATE)!=0)
            {
            QUEUE_EXCEPTION("Internal error!");
            goto err;
            }
        PQfreemem(data);
        }
    else
        {
        if(JS_DefineProperty(cx,o,"data",OBJECT_TO_JSVAL(NULL),NULL,NULL,JSPROP_ENUMERATE)!=0)
            {
            QUEUE_EXCEPTION("Internal error!");
            goto err;
            }
        }
    ...
    }

This style is such a large improvement over Allman as to render Allman indefensible. However, the visual structure goal is impacted because the braces create noise around the contents of blocks, though without visually separating them from their “parent” statements. On the plus side, visual breaks and “dangling” artifacts are avoided because the opening and closing braces visually align with each other and the code they contain on the vertical axis. The association of the “if” with its “else” is nicely preserved and clearly visible.

On the brevity goal I give this style a mediocre “C”.

This style ranks 2nd in my opinion.

Banner Style (My Advocated Style)

void defineProp(void *data, int res) {
    if(data!=NULL && res>0) {
        if(JS_DefineProperty(cx,o,"data",STRING_TO_JSVAL(JS_NewStringCopyN(cx,data,res)),NULL,NULL,JSPROP_ENUMERATE)!=0) {
            QUEUE_EXCEPTION("Internal error!");
            goto err;
            }
        PQfreemem(data);
        }
    else {
        if(JS_DefineProperty(cx,o,"data",OBJECT_TO_JSVAL(NULL),NULL,NULL,JSPROP_ENUMERATE)!=0) {
            QUEUE_EXCEPTION("Internal error!");
            goto err;
            }
        }
    ...
    }

This style avoids the problems outlined with the previous three, with none of the detractions. Its main weakness is that it’s harder to visually associate the opening and closing braces (but this is greatly mitigated by the ability of all worthwhile code editors to match braces) - this style deliberately trades off the location of the opening braces against visually strengthened blocks and greater brevity.

On the brevity goal I give this style a solid “A”.
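Brace matching is mechanically simple, which is why editors can offer it so readily. The sketch below is a simplified illustration, not any particular editor's implementation: the hypothetical `BraceMatcher.findClosing` scans forward from an opening brace while counting nesting depth. It assumes braces never appear inside string literals, character literals or comments, which a real editor would have to handle.

```java
final class BraceMatcher {
    // Return the index of the '}' that closes the '{' at position 'open',
    // or -1 if 'open' is not a '{' or no matching '}' exists.
    // Simplifying assumption: no braces inside strings, chars or comments.
    static int findClosing(String code, int open) {
        if (open < 0 || open >= code.length() || code.charAt(open) != '{') {
            return -1;
            }
        int depth = 0;
        for (int i = open; i < code.length(); i++) {
            char c = code.charAt(i);
            if (c == '{') {
                depth++;
                }
            else if (c == '}') {
                depth--;
                if (depth == 0) {
                    return i;
                    }
                }
            }
        return -1;
        }
    }
```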

This style of indentation does translate in a logically coherent manner to markup languages such as HTML and XML and structural configuration languages like CSS and JSON. However, I am not at all convinced that markup languages which use begin/end tags should be indented in this way; specifically, I generally prefer to indent the closing tag to the same level as the opening tag. To illustrate, following is this site’s (one-time) side-navigation list indented in this style and then with the closing tags unindented. Note my preference for folding “span”-like elements into one line to reduce the structural noise.

Indented Closing Tags:

<div class="container">
    <div class="main">
        <div class="sidenav">
            <h1>What's New</h1>
            <ul>
                <li><a href="/program/TimeKeeper"   >Nov 2008 - TimeKeeper 2.1.6   </a></li>
                <li><a href="/program/RefactorBuddy">Nov 2008 - RefactorBUDDY 0.3.6</a></li>
                </ul>
            <h1>Recommended Sites</h1>
            <ul>
                <li><a href="http://www.java.com">Java Runtime Web Site</a></li>
                <li><a href="http://www.stackoverflow.com">Stack Overflow</a></li>
                </ul>
            </div>
        <div class="clearer">
            </div>
        </div>
    </div>

Unindented Closing Tags:

<div class="container">
   <div class="main">
      <div class="sidenav">
         <h1>What's New</h1>
         <ul>
            <li><a href="/program/TimeKeeper"   >Nov 2008 - TimeKeeper 2.1.6   </a></li>
            <li><a href="/program/RefactorBuddy">Nov 2008 - RefactorBUDDY 0.3.6</a></li>
         </ul>
         <h1>Recommended Sites</h1>
         <ul>
            <li><a href="http://www.java.com">Java Runtime Web Site</a></li>
            <li><a href="http://www.stackoverflow.com">Stack Overflow</a></li>
         </ul>
      </div>
      <div class="clearer">
      </div>
   </div>
</div>

Factoring in both goals, this style seems the clear winner. The ironic thing to me is that the style which, in my analysis, is the very worst, is the one which seems to dominate the industry.

Other Formatting Considerations

The considerations which follow are far less important than those of indentation. As such, individual preferences are more likely to be involved, and I have far less conviction about them.

Tabs: On the issue of tab vs space characters, it seems the ship has clearly sailed, with spaces being almost universally preferred. Since any difference in the number of spaces to which a tab is expanded leaves the code irreconcilably misformatted, it seems difficult to defend the use of tab characters.
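To illustrate the problem, here is a small sketch (the class, method and sample declarations are hypothetical) that expands each tab to the next tab stop, as editors conventionally do. Two declarations whose names line up at a tab width of 8 no longer line up at a tab width of 4.

```java
final class TabExpansion {
    // Expand each tab to the spaces needed to reach the next tab stop,
    // where tab stops fall at every multiple of tabWidth columns.
    // tabWidth must be greater than zero.
    static String expand(String line, int tabWidth) {
        StringBuilder out = new StringBuilder();
        for (int i = 0; i < line.length(); i++) {
            char c = line.charAt(i);
            if (c == '\t') {
                int pad = tabWidth - (out.length() % tabWidth);
                for (int p = 0; p < pad; p++) {
                    out.append(' ');
                    }
                }
            else {
                out.append(c);
                }
            }
        return out.toString();
        }

    public static void main(String[] args) {
        // Names were lined up using tabs in an editor set to a tab width of 8.
        String[] lines = { "int\t\tcnt;", "unsigned long\tlen;" };
        for (int width : new int[] { 8, 4 }) {
            System.out.println("tab width " + width + ":");
            for (String line : lines) {
                System.out.println(expand(line, width));
                }
            }
        }
    }
```

At a width of 8 both names start at the same column; at a width of 4 the first name moves left while the second stays put, so the alignment is lost.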

Indent Size: Next is the question of how many spaces constitute an “indent”. The popular choices are 2, 4 and 8 characters. The key consideration which informs the answer to this question is, again, visible structure. I find that 2 characters is too few and causes “bleeding” of the structural outline - it simply becomes difficult to visually track the structural outline. Conversely, 8 spaces causes excessive indenting and quickly pushes nested levels of code too far right for no demonstrable benefit. For program code I use 4 spaces. However, with XML-type markup languages, which seem to invariably end up very deeply nested, I currently use 2 spaces, though I must confess I have not arrived at a final conclusion regarding 2 spaces in this context.
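The cost of a wide indent is simple arithmetic: the leading whitespace is the nesting depth multiplied by the indent size. As a rough sketch, assuming a hypothetical 80-column line limit and five levels of nesting (neither figure is prescribed by this article):

```java
final class IndentBudget {
    // Columns left for code on a line of 'lineWidth' columns at the given
    // nesting depth; never negative.
    static int remaining(int lineWidth, int indentSize, int depth) {
        return Math.max(0, lineWidth - indentSize * depth);
        }

    public static void main(String[] args) {
        int lineWidth = 80;   // assumed limit, for illustration only
        int depth = 5;        // assumed nesting depth, for illustration only
        for (int indent : new int[] { 2, 4, 8 }) {
            System.out.println(indent + " spaces: "
                + remaining(lineWidth, indent, depth) + " columns left");
            }
        }
    }
```

Under those assumptions, 2, 4 and 8 spaces leave 70, 60 and 40 columns respectively, so an 8-space indent gives up half the line to whitespace.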

Alignment: I really like using columnar alignment for variables declared at the class and method level (less so for block-scoped variables). I find it helps me see the types and names much more easily than when the names are run together with the types. For C code where the types are more compact I indent the names to column 21; for Java code where types are more verbose I indent to column 41.

protected void processMessage() {
    String                              typ=request.getField("CounterType","");                 // prefix for the counter ID
    String                              nam=request.getField("CounterName","");                 // suffix for the counter ID

    String                              idn=(typ+"_"+nam);                                      // identifier for the counter
    int                                 cnt=0;

    cnt=handler.getPropertyAsInt(idn,0);
    cnt++;
    handler.setProperty(idn,Integer.toString(cnt));

    reply=new MiMessage("Done");
    reply.setField("CounterValue",cnt);
    }
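The padding involved is easy to compute. As a sketch only (the helper below is hypothetical, not a tool I use), the following pads a type name so that the variable name starts at a chosen 1-based column, falling back to a single space when the type is too long to fit.

```java
final class ColumnAlign {
    // Build "indent + type + padding + name" so that 'name' starts at the
    // given 1-based column. If the type already reaches or passes that
    // column, a single space separates type and name instead.
    static String declaration(String indent, String type, String name, int column) {
        StringBuilder out = new StringBuilder(indent).append(type);
        do {
            out.append(' ');
            } while (out.length() < column - 1);
        return out.append(name).toString();
        }

    public static void main(String[] args) {
        System.out.println(declaration("    ", "String", "typ;", 41));
        System.out.println(declaration("    ", "int", "cnt;", 41));
        }
    }
```

Both printed names begin at column 41, whatever the length of the type.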

Where multiple similar statements are on consecutive lines I also like to align them as I feel it makes the correlation much clearer and the code more comprehensible.

if     (par==null       ) { strOffset=0;             totCount =0;            markers=null;          }
else if(par==rootContext) { strOffset=offset;        totCount =0;            markers=null;          }
else                      { strOffset=par.strOffset; totCount =par.totCount; markers=par.markers;   }

Class Indentation: I feel that indenting an entire class one level is unnecessary and so I don’t - the opening brace for the class, its contents and the closing brace are all flush against the left margin. This is a considered and deliberate violation of the banner indent style.

package ...;

import ...;
import ...;

public class Xxx
extends Object
implements Yyy
{

private String                          someVar;                                                // used to blah blah

public Xxx(...) {
    blah blah
    }

...
}

Conclusions

I have ofttimes wondered if I should just bite the bullet and change the style I use, but I just can’t rationalize the change to an inferior style because “everyone else is doing it”.

Personally, as much as I love Java, and appreciate Sun for bringing it and the managed-language paradigm to our industry, I am disappointed that their widely published coding style has been thoughtlessly adopted wholesale as some sort of uncontestable truth.

It disturbs me that in an industry with such a relatively high number of arguably astute thinkers nobody seems to take the time to give any thought to some very fundamental practices. So many sheep where there ought to be more shepherds.
