An Example of Bad Code

Recently, Tom Wu here at Voltage found a bug in some third party JavaScript software we were using. Now, someone (possibly the person writing the original code) might argue that it wasn't a bug in the code, but a bug in the JavaScript engine. But whether or not it was a bug, or if so, where the bug belongs, I think it was bad code.

Over the years, I have argued that there are certain things you shouldn't do when writing code, even if it is technically correct. These things might not be portable, or at the very least they are confusing and don't buy you anything. A lot of people have told me I'm too paranoid about portability, or my thinking is out of date, or I'm just wrong. I have heard comments such as, "That's never going to happen," or "There's no compiler or OS that will get that wrong," or "The standard specifies <something> so there's no reason not to use it." My belief is that people have to get "bitten" before they start to believe that some of my coding rules might make sense. See my blog post about that here.

So here's the offensive code.

x ^= x2 || 1

I looked up some documentation on JavaScript to see if this is part of the language, or if the programmer had simply noticed something about logic and done something "cute". Apparently this is somewhat defined behavior. However, this highlights the portability problem.

You see, what went wrong was that different JavaScript engines evaluated the code differently. So most engines got it "right" and one got it "wrong". Even in JavaScript, there are different platforms, and sometimes they handle certain constructs differently. So to be truly portable, you must forego some constructs and write your code to a "least common denominator" to make it safer to run on multiple platforms.

The programmer might declare, "The JavaScript engine is wrong, so they have to change their code, not me." Unfortunately, you don't say to the customer, "You have a bug in your code, so encrypted data won't decrypt properly, and even though we have a workaround, we're demanding you make the fix," or "We won't take your money because your JavaScript engine behaves differently than we expect."

Besides, there is some ambiguity in the above construct. First, let's look at the logical OR rule.

Part 1: Both sides of the || are booleans. For example,

if ( (x==0) || (y==0) )

The rule says to evaluate the first. If true, return true and don't evaluate the second. If the first is false, evaluate the second. If true, return true, if false, return false. Incidentally, true is 1 and false is 0.

Part 2: One of the sides of the || is not a boolean. For eaxmple

z = "someString" || (y == 0) 
z = (x == 0) || "anotherString"

If the first is not a boolean, return the first (that is, don't return true or false, return the first thing). If the first is a boolean and true, return true and ignore the second. If the first is a boolean and false, and the second is not a boolean, return the second (once again, don't return true or false, return the second thing).

So let's get back to the bad code.

x ^= x2 || 1

We are going to XOR x with something. What? Is it going to be 0 or 1 (true or false)? Or is it going to be x2 or 1?

What if the value of x2 is 0. Is that false (part 1)? Or is x2 not a boolean, so we just return x2 (part 2). If the value of x2 is 3, does that mean the first part is true, so return 1 (part 1), or does it mean return 3 (part 2)?

Tom rewrote the code

x ^= (x2==0) ? 1 : x2

That worked, so we are certain the intention was

If x2 is 0,     compute  x XOR 1
If x2 is not 0, compute  x XOR x2

So the original code writer was saying, "If x2 is 0, it's a boolean. If x2 is not 0, it's not a boolean."

Can you blame the JavaScript engine for getting it "wrong"?

Besides, why use some cryptic construct anyway? Is it to save code size or improve performance? The problem with that thinking is that code size and performance are determined by how the compiler converts the code. That is, if you can trim a line or two off of the JavaScript code (or Java or C or C++ or FORTRAN or so on), that does not necessarily mean that the code that actually runs will be smaller or faster.

If you want to see the real code size and performance, look at what the chip actually does. The easiest way to see that is assembly code (even though JavaScript does not compile down to machine language, ultimately the computer's chip will run some set of instructions represented by assembly language).

In our example, the intention of the offensive code was to be something like this.

    loadw   r5, (x2)  # Load x2 into register 5
    loadw   r6, (x)   # Load x into register 6
    movw    r7, $1    # Load 1 into register 7
    cmpw    r5, $0    # compare x2 with 0
    bceq    .label1   # if x2 == 0, branch to .label1
    movw    r7, r5    # if x2 != 0, load x2 into register 7
.label1:
    xorw    r6, r7    # XOR x with whatever is in r7
                      # if x2 == 0, 1 is in r7
                      # if x2 != 0, x2 is in r7

This is fairly efficient. Now, suppose the JavaScript code is changed to

x ^= (x2==0) ? 1 : x2

What would the assembly code look like? It would look exactly the same. How about if the code were this.

xorVal = 1
if (x2 != 0)
  xorVal = x2
x ^= xorVal

What would the assembly code look like? It would look exactly the same.

Saving space in the code file does not necessarily translate into smaller and/or faster code. Because of that fact, I say you should write your code to be as clear as possible. If you can do something unclear, but it will truly translate into something smaller and/or faster at the machine level, go ahead and do it. But if the choice is between obfuscated code and clear code and there's no difference in size or performance, use the clear code.

Leave a Reply

Your email address will not be published. Required fields are marked *