Code Craft

Software is equal parts Art, Craft, and Engineering

Math Expression Evaluator

This math expression evaluator was born out of a need to have a small-footprint and efficient solution which could evaluate arbitrary expressions reasonably efficiently without requiring pre-compilation. I needed something which would do basic math with variables, expressions like: “Top+2”, “Bottom-2” and “(Right+1-Left)/2”.

Research on the Internet turned up a number of fairly good solutions, all of which revolved around creating parse trees (which makes sense). The problem was - they were all rather bulky, and I couldn’t afford to add 100K to my applet size just for math. So I started wondering about a linear recursive solution to the problem. The end result is an acceptably performing single source file with no external dependencies, weighing just over 20 KiB (just over 10 KiB when JAR’d).

Primary Features:

  • Basic math operators, with inferred precedence (^ * × · / ÷ % + -).
  • Explicit precedence with parenthesis.
  • Implicit multiplication of bracketed subexpressions.
  • Correct right-associativity of exponentials (power operator).
  • Direct support for hexadecimal numbers prefixed by 0x.
  • Constants and variables.
  • Extensible operators.
  • Extensible functions.
  • Tiny 20 KiB footprint.

Thanks to Carlos Gómez of Asturias, Spain, for his time, thoughts and input in the area of unary operators and operator associativity which proved invaluable to implementing these features and verifying their correct operation.

Date Update
2012-03-27 Fix problems caused by embedded whitespace in the expression. Add support for multiple argument functions (including Math.min() and Math.max() to illustrate).
2012-06-30 Add support for registering external function handlers.
2012-07-06 Fix a bug in parsing empty function arguments (broke the random function).
2012-10-29 Fix handling unary negation for sub-expressions, variables and functions, e.g. -(2+2), x+-y and -max(1,2), including multiple unary signs, e.g. --2. Fix precedence of power over unary negation such that -2^2 = -(2^2) = -4 instead of (-2)^2 = 4.
2012-10-31 Fix handling of power operator to bind right such that 2<sup>3</sup>4 = 2^(3^4) = 2^81 = 2.417e24.
2012-11-04 Operator association. Unary operators. Registered operator support. Pure vs. non-pure functions. Implicit multiplication for consecutive sub-expressions, e.g (1+2)(3+4). Improve power operator. Improve unary-negation. Improve the registered function implementation.
2012-11-07 Fix redundant assigments in operator implementation (which were a holdover from when the operators were evaluated inline to parsing).
2012-11-09 Clean up internal code relating to default operators and functions. Remove some code that had been rendered redundant by previous changes. Fix handling of unary operators which operate on the left-side operand.

How It Works

Here’s a simple example which calculates the middle column of the subsection of a text display (biased left).

MathEval            math=new MathEval();

math.setVariable("Top",    5);
math.setVariable("Left",  20);
math.setVariable("Bottom",15);
math.setVariable("Right", 60);

System.out.println("Middle: "+math.evaluate("floor((Right+1-Left)/2)"));                        // 20

Broadly speaking, we read our string looking for operators. When we find one, if its precedence is greater than the currently pending operation we recurse on a sub-expression from the current location to the end of the current expression. If we detect a bracketed sub-expression we also recurse to evaluate it. Otherwise, we perform the current operation and make the newly discovered operator current. If the now current operation does not have greater precedence than the pending operation, or we reach the end of the current expression, we return the left-value. When we return from the top recursion level we are done.

The heart of this class is the private recursive function _evaluate() (I use the lead underscore exclusively for private methods which share the same name as a public method because they provide its actual implementation; this is almost always a recursive method).

/**
 * Evaluate the next operand of an expression.
 * @param beg       Inclusive begin offset for subexpression.
 * @param end       Inclusive end offset for subexpression.
 * @param pnd       Pending operator (operator previous to this subexpression).
 * @param lft       Left-value with which to initialize this subexpression.
 * @param cur       Current operator (the operator for this subexpression).
 */
private double _evaluate(int beg, int end, double lft, Operator pnd, Operator cur) throws NumberFormatException, ArithmeticException {

This is an immediate divergence from more traditional ways of solving this problem; instead of generating a parse tree and subsequently evaluating it, we recursively parse and evaluate subsections of the original input string. Essentially, the expression is broken up into multiple simple expressions of <left-val> <op> <right-val>. Depending on the precedence of the following operator, this sub-expression is either evaluated arriving at a new left-value, or pushed to one side (via recursion) to be actioned once the following sub-expression is done. Thus “1 + 2 × 3” becomes “2 × 3” (=6), followed by “1 + 6”.

Operator Precedence

Generally, if the current operator is of lower precedence than the next operator, we recurse to evaluate the right value before applying the current operator to the current left value.

if(opPrecedence(cur,LEFT_SIDE)<opPrecedence(nxt,RIGHT_SIDE)) {                                  // correct even for last (non-operator) character, since non-operators have the artificial "precedence" zero
    rgt=_evaluate((ofs+1),end,rgt,cur,nxt);                                                     // from after operator to end of current subexpression
    ofs=offset;                                                                                 // modify offset to after subexpression
    nxt=(ofs<=end ? getOperator(expression.charAt(ofs)) : OPERAND);                             // modify next operator
    }

If parenthesis are used, we also recurse to evaluate the entire bracketed sub-expression:

else if(ch0=='(') {
    rgt=_evaluate(beg+1,end);
    ofs=skipWhitespace(expression,offset+1,end);                                                // skip past ')' and any following whitespace
    nxt=(ofs<=end ? getOperator(expression.charAt(ofs)) : OPERAND);                             // modify next operator
    }

Operator Associativity and Unary Operators

One interesting complication with evaluating precedence is that by convention multiple consecutive power operators bind in a right-associative manner such that x^y^z is evaluated as x^(y^z) not (x^y)^z. This problem is solved by assigning operators two precedence values, one for when the operator is on the left of an operand and the other for when the operator is on the right of an operand. When evaluating their precedence they can have a different value depending on whether they are on the left or right of the intervening operand. So the power operator has a precedence of 80 for the left and 81 for the right.

static private final Operator       OPR_PWR =new Operator('^',80,81,NO_SIDE   ,false,DefaultImpl.INSTANCE); // power

Handling of unary operators is also challenging for two reasons; first they have a missing operand, and second they must be evaluated independent of precedence on one side (the side with no operand), and consider precedence on the other. These problems are resolved by making unary operators “glued” to their operand when seen from the opposite side, or the side which has no operand. Thus the negation in 2^-x^y has maximum precedence when considering 2^-x because ^-x" is nonsense, but when considering -x^y the negation has its normal precedence, lower than power.

private int opPrecedence(Operator opr, int sid) {
    if     (opr==null)                              { return Integer.MIN_VALUE;                                    } // not an operator
    else if(opr.bindTo==NO_SIDE || opr.bindTo!=sid) { return (sid==LEFT_SIDE ? opr.precedenceL : opr.precedenceR); } // operator is binary or is unary and bound to the operand on the other side
    else                                            { return Integer.MAX_VALUE;                                    } // operator is unary and associates with the operand on this side
    }

Finally, when a blank operand is detected we specifically allow it if the current operator is left-binding (e.g. 2! × 2) or the next operator is right-binding (e.g. 2^-2). When the current operator is left-binding we will always proceed to evaluate it because of the special unary Integer.MAX_VALUE precedence, while when the next operator is right-binding there must be no operand immediately left of it and we will always recurse, again because of the special unary Integer.MAX_VALUE precedence.

if(beg==ofs && (cur.unary==LEFT_SIDE || nxt.unary==RIGHT_SIDE)) {
    rgt=Double.NaN;                                                                             // left-binding unary operator; right value will not be used and should be blank
    }

Default set of operators, showing precedence:

static private final Operator       OPR_EQU =new Operator('=',99,99,RIGHT_SIDE,true ,DefaultImpl.INSTANCE); // simple assignment, used as the final operation, must be maximum precedence
static private final Operator       OPR_PWR =new Operator('^',80,81,NO_SIDE   ,false,DefaultImpl.INSTANCE); // power
static private final Operator       OPR_NEG =new Operator('±',60,60,RIGHT_SIDE,true ,DefaultImpl.INSTANCE); // unary negation
static private final Operator       OPR_MLT1=new Operator('*',40                    ,DefaultImpl.INSTANCE); // multiply (classical)
static private final Operator       OPR_MLT2=new Operator('×',40                    ,DefaultImpl.INSTANCE); // multiply (because it's a Unicode world out there)
static private final Operator       OPR_MLT3=new Operator('·',40                    ,DefaultImpl.INSTANCE); // multiply (because it's a Unicode world out there)
static private final Operator       OPR_BKT =new Operator('(',40                    ,DefaultImpl.INSTANCE); // multiply (implicit due to brackets, e.g "(a)(b)")
static private final Operator       OPR_DIV1=new Operator('/',40                    ,DefaultImpl.INSTANCE); // divide (classical computing)
static private final Operator       OPR_DIV2=new Operator('÷',40                    ,DefaultImpl.INSTANCE); // divide (because it's a Unicode world out there)
static private final Operator       OPR_MOD =new Operator('%',40                    ,DefaultImpl.INSTANCE); // remainder
static private final Operator       OPR_ADD =new Operator('+',20                    ,DefaultImpl.INSTANCE); // add/unary-positive
static private final Operator       OPR_SUB =new Operator('-',20                    ,DefaultImpl.INSTANCE); // subtract/unary-negative

Operator Evaluation

After working out what operation we need to do next we simply do it, assigning the result to the current left-value which is our running total. Then we loop around for the next one.

lft=operator(exp,beg,end,lft,cur,rgt);

...

private double operator(String exp, int beg, int end, double lft, Operator opr, double rgt) {
    try { return opr.handler.evaluateOperator(lft,opr.symbol,rgt); }
    catch(ArithmeticException thr) {
        throw exception(exp,beg,"Mathematical expression \""+exp+"\" failed to evaluate",thr);
        }
    catch(UnsupportedOperationException thr) {
        int tmp=beg;
        while(tmp>0 && getOperator(exp.charAt(tmp))==null) { tmp--; }                           // set up for offset of the offending operator
        throw exception(exp,tmp,"Operator \""+opr.symbol+"\" not handled by math engine (Programmer error: The list of operators is inconsistent within the engine)");
        }
    }

...

static class DefaultImpl
extends Object
implements OperatorHandler, FunctionHandler
{
public double evaluateOperator(double lft, char opr, double rgt) {
    switch(opr) {
        case '=' : return rgt;                                                                  // simple assignment, used as the final operation, must be maximum precedence
        case '^' : return Math.pow(lft,rgt);                                                    // power
        case '±' : return -rgt;                                                                 // unary negation
        case '*' : return lft*rgt;                                                              // multiply (classical)
        case '×' : return lft*rgt;                                                              // multiply (because it's a Unicode world out there)
        case '·' : return lft*rgt;                                                              // multiply (because it's a Unicode world out there)
        case '(' : return lft*rgt;                                                              // multiply (implicit due to brackets, e.g "(a)(b)")
        case '/' : return lft/rgt;                                                              // divide (classical computing)
        case '÷' : return lft/rgt;                                                              // divide (because it's a Unicode world out there)
        case '%' : return lft%rgt;                                                              // remainder
        case '+' : return lft+rgt;                                                              // add/unary-positive
        case '-' : return lft-rgt;                                                              // subtract/unary-negative
        default  : throw new UnsupportedOperationException("MathEval internal operator setup is incorrect - internal operator \""+opr+"\" not handled");
        }
    }
...
}

Extensibility

The evaluator’s abilities can be extended by registering handlers for operators and functions.

The design concept for handlers is that any given handler is coded for multiple operators or functions, and that it is registered for each of the operators or functions you wish to use. An important thing to observe is that the implementation methods are passed the operator or function they are to handle - a pure OO design would not do this. However, this design allows some flexibility in coding handlers which a pure-OO design would prohibit. Note that this allowance for handlers performing multiple functions in no way requires it. If Java had lightweight first-class functions, they would have been my first choice for supplying a handler implementation.

The design I chose for implementing the default operators and functions represents a compromise between pure-OO and reducing the number of classes required. This consideration was important to me because I wanted keep the footprint of the solution at around 20 KiB. The benefit of this choice, which I think outweighs the negatives, is that instead of requiring forty classes (twelve for operators and twenty-eight for functions), I have one class to handle all of the default operators and functions.

However, my implementation has two minor undesirable aspects. First the need to specify the identifying element twice - the characters for the default operators appear once when the operators are created and again within the evaluateOperator function; likewise the function names at registration and in evaluateFunction(). The second undesirable outcome of this compromise is seen within the function handling. In order to mitigate the cost of the string comparisons, it does a switch on the first character. This is pure overhead since the evaluator has already performed a map lookup on the name to find the registered function handler. If every handler did just one function, that lookup would entirely suffice to locate the code which must be done for the function.

Adding an operator extension that does factorial using the unary left-binding operator '!' looks like this:

mathEval.setOperator(new MathEval.Operator('!',90,90,MathEval.LEFT_SIDE,new MathEvalHandler()));

...

public class MathEvalHandler
extends Object
implements MathEval.OperatorHandler
{

public double evaluateOperator(double lft, char opr, double rgt) throws ArithmeticException {
    switch(opr) {
        case '!': {
            if(lft<0) {
                throw new ArithmeticException("Cannot calculate factorial of negative number");
                }
            else if(lft<2) {
                return 1;
                }
            else {
                int tot=2;
                for(int xa=3; xa<=lft; xa++) { tot*=xa; }
                return tot;
                }
            } // returns
        default: {
            throw new ArithmeticException("Operator '"+opr+"' not supported in custom operator implementation");
            } // throws
        }
    }
}

Get The Source

The source compiles to Java 5, but modifications to target earlier JVM versions are trivial (some functions will have to be sacrificed and generics need to be removed). Adding functions supported by later JVM’s is also trivial.

Download MathEval.java.