Discussions

"Floating Point" Problems, Explanations, and Workarounds

  • 1.  "Floating Point" Problems, Explanations, and Workarounds

    Posted 08-26-2020 13:24
    Edited by J. Michael Hammond 08-27-2020 14:35

    We get a lot of questions about surprising behavior when numbers have digits after the decimal point. Sometimes numbers don't look right, sometimes they don't seem to behave right mathematically. Often someone thinks they've found a bug, and it's difficult to explain how the behavior is normal and that nearly all computer software shares the same behavior.

    As a software QA guy with a mathematics background (see my introduction in the second paragraph of my first Quick Base community post), I find this kind of question particularly interesting. Let me explain a little bit about how computers represent numbers and do math. Then I'll show a few common questions that people ask and walk through explanations - and workarounds, where possible. A quick note: most of the examples discussed are around formulas. That is just because they are usually easier to illustrate with. The same caveats and information applies to any part of Quick Base where you are comparing numbers. This includes, but is not limited to: report filters, custom data rules, permissions, etc.

    Floating-Point Arithmetic

    The phrase that computer people throw around to describe how computers do math is "floating-point representation." This phrase refers to a standard, called ANSI/IEEE 754, that describes how computers are expected to represent fractional numbers (I'm only giving you this link so you can fact-check me later if you want; you don't need to go there now). Most popular computer chips, operating systems, and languages have been following this standard, at least more-or-less, since the early 1990s. I think it's fair to say that all popular computer operating systems and languages follow this standard today. That means that most software applications, like spreadsheets, databases, Quick Base, and so on, also follow this standard.

    Let me show you one example in JavaScript (outside of Quick Base). Use this w3schools sample on the JavaScript “toFixed” function

    1. Change var n = num.toFixed(2) to var n = num.toFixed(16)
    2. Click “Run” at the top.
    3. Click “Try it” on the right.
    4. Notice that your browser is showing you the value 5.5678900000000002 even though we gave it the number 5.56789?

    Let me explain the two main things that are going on with this standard for computer mathematics.

    1. There's a limited amount of space to store a number. The amount of space a computer has to store a number means it can store about sixteen digits' worth of stuff before it has to chop off the rest.
      So here's a very simple example: Say you have a formula that computes 1/3. This cannot be stored exactly. The computer stores the number as something very close to 0.3333333333333333 and has to chop it off there. Computer mathematics has no special notion of fractions that are somehow simple; all numbers are stored the same way, as "just numbers".
    2. Computers don't do math in decimal; they do it in binary. Binary is just a series of 1s and 0s that make up instructions for a computer. The result is that most numbers that have digit(s) after the decimal point cannot be exactly represented in binary. This is a little harder for some people to understand and accept than the idea of limited space. It's pretty easy to accept that 1/3 can't be represented exactly, because you can see how it looks in decimal. But it turns out that most numbers that only have a few digits after the decimal point can't be represented exactly in binary.

      If you are already familiar with this, skip down to the next bullet ("It's easy to forget that displayed decimals and actual precision are different things.") If you are interested in some more context, let's dig into this a little more.
      Say you have two measuring sticks. One is a super-precise meter stick. It has a big mark at one meter, smaller marks every decimeter, smaller marks every centimeter, smaller marks every millimeter, and so on down to sixteen levels' worth of marks. Any (decimal) number that you can write with sixteen or fewer digits after the decimal point will correspond exactly to some mark on this meter stick.  The smallest marks are 1/10000000000000000 of a meter apart. (That big number is a one followed by sixteen zeros.)

      Ruler with inches and centimeters


      Now say you have a super-precise yardstick. Let's ignore the big marks at each foot, and start with the smaller marks every inch. Below that, there are smaller marks every half-inch, smaller marks every quarter-inch, even smaller marks every eighth-inch, and so on. If this stick has about fifty-two different sizes of marks, the interval between two of the tiniest lines will be 1/9007199254740992 of an inch. See how there are sixteen digits in that big number? That means that there are about as many marks between each inch, on this stick, as there are between meters on the other one. But the difference is very important. One difference is that we can't represent, say, 1/10 of an inch exactly:

      It's less than a half an inch (0.5").
      It's less than a quarter inch (0.25").
      It's less than 1/8 of an inch (0.125").
      It's more than 1/16 of an inch (0.06125").
      It's more than 3/32 of an inch (0.093125").
      It's less than 7/64 of an inch (0.1090625").
      It's less than 13/128 of an inch (0.10109375").
      It's more than 25/256 of an inch (0.097109375").
      It's more than 51/512 of an inch (0.099609375").
      It's less than 103/1024 of an inch (0.1005859375").
      It's less than 205/2048 of an inch (0.10009765625").
      It's more than 509/4096 of an inch (0.099853515625").
      It's more than 1019/8192 of an inch (0.0999755859375").

      I'm going to stop there, but I want to make two points:

      (1) This eventually settles into a pattern. If you write 1/10 out in binary, it's 0.0001100110011001100… where that "1100" repeats, and this corresponds to the pattern of "less than" and "more than" on this yardstick.
      (2) Even if you aren't fully aware of how this settles into a pattern, look at how the decimal expression of those fractional inches is running away. Each step of this process, we get one (or sometimes two) more digits, ending with a five. As we get closer and closer to 0.1 inches, we're picking up more and more digits at the very end there. Another way to say this is that the only numbers we can represent exactly on this yardstick are numbers whose fractional representation has a denominator that is some power of two. And since 1/10 has a denominator that is not a power of two, we're never going to be able to represent it exactly on this yardstick. The same goes for 1/100, 1/1000, and so on. So the vast majority of numbers that only take a few digits after the decimal place are not exactly representable to a computer, since the computer is "using a yardstick" (binary) instead of "using a meter stick" (decimal).


      It's easy to forget that displayed decimals and actual precision are different things.
      Most software applications allow you some way to choose how many digits you wish to display after the decimal point. Many systems automatically choose to display fewer digits than would be possible when the value is very close to a short value. For example, if you have the number 0.3499999999999999, many systems will automatically choose to display this value as "0.35".

      In Quick Base, if you go to the field properties page for a numeric field, you'll find in the "Display" section a setting called "Decimal places". Remember that this is only changing the maximum number of digits the application uses to show you the approximate value - it does not change the actual underlying value.

      Bringing that back to Quick Base, let's combine both of the above concepts. We can look at a scenario where we key in one of the above numbers, like this shown below. Rounded off, this "looks like" it is .10 - but it really isn't.

      Frequently Asked Questions, with Explanations and Workarounds

      So now that you're picturing computer arithmetic as being on a very (but not infinitely) precise yardstick, and now that you're keeping in mind that displayed decimals is different from mathematical precision, let's get into some typical questions and discuss workarounds.

      I have a formula that does some math and the computer's getting the last digit wrong. What's up with that? Is that a problem?

      That's just a normal outcome of the fact that computers have a limited amount of space to represent a number. The easy example to think through is if you have a formula that does 1 / 3 * 3. It's relatively easy to picture the computer doing the 1/3 part, getting 0.3333333333333333, and having to chop it off there. Once you picture that, it should be pretty easy to see that when it does the *3 part, the answer will be 0.9999999999999999 instead of exactly 1. The computer has "forgotten" that last little piece of the number after the sixteenth digit.

      The trickier situation is when you do some math on fractional numbers and it looks like it should work out based on the display values you are staring at on screen. Say you have a formula that does 1 / 10 * 10. When you're thinking in decimal, it seems that the 1/10 part should just be 0.1, and then when you multiply it by 10 the answer should just be 1. But remember the computer is doing math on a yardstick. So the 1/10 part is .0001100110011001100… in binary, which has to get rounded off somewhere, just like the above example.

      When that number gets rounded off, and then you multiply it by 10, the little error that crept in because of the rounding off will remain.
      So when you do 1 / 10 * 10, you are likely to get the answer 1.0000000000000001 rather than simply 1, because the closest binary number to 1/10 is just a little bit bigger.

      A simple visualization of this in Quick Base is mileage reimbursement. This looks quite straightforward. But after keying in the request, we can see a few issues manifest.


      Workaround: If you're only concerned about how the number looks, this is a great place to use the "Displayed decimals" property of the field. Say you reduce the displayed decimals of the result to eight digits. Quick Base will (in a manner of speaking) round off the answer to .10000000, recognize it does not need to display the trailing zeroes, and display the number as "0.1".

      If you're concerned about how the number behaves mathematically, keep reading.

       

      These two numbers sure look the same to me. Why doesn't the "=" in this formula say they're the same?
      This is illustrated in the mileage example above, and usually happens when at least one of the numbers is the result of some calculation - especially when you're comparing it to a fixed value with only a few decimal places, like "[Total Cost] = 19.98". Remember that the value ".98" is not exactly representable on the computer's yardstick. Nor are most of the cost values you're adding up to get to this total. Since all of these numbers are being rounded off a little bit before they get added up, it's possible we could run into a set of numbers where more of them are getting rounded in the same direction, and their sum is just a little bit different from how 19.98 gets rounded.

      Saying this another way - if you were considering writing a formula that said "[Total Cost] = 33.33333333333333", and you knew your formula took simple numbers and divided them by three before adding them up, you would probably be a little wary about expecting it to work. Remember not to be fooled by a number that looks simple in decimal, like 19.98, because in binary it's going to have to get rounded off just the same.

      Workaround: There are two common strategies to work around this problem.
      (1) Whenever you're comparing numbers with decimal places, compare them to some kind of tolerance. So, for example, instead of saying

      If ( [Cost] = 1.1, "Yes", "No" ), in a formula, you might consider saying If ( [Cost] > 1.09999 and [Cost] < 1.10001, "Yes", "No" )

      (2) Round the values to some number of decimal places before comparing. You should round both sides of the equality to the same number of decimal places - even if one of them is just a constant! - and you should still be aware that, with this strategy, there could be some very rare cases where things don't behave exactly as you'd expect.

      If ( ROUND([Cost],.00001) = ROUND(1.1,.00001), "Yes", "No" )

       

      When I round a number to a particular decimal place, it's not handling the "point fives" consistently or correctly. Why is that?

      (For example, if you're rounding to two decimal places, you might notice that 0.265 rounds up to 0.27, and 0.275 rounds up to 0.28, but 0.285 rounds down to 0.28.)

      This is another side effect of the fact that the computer stores fractional numbers in binary, not decimal. That number that looks like 0.265 when you display it in decimal might actually be just a tiny bit more, so it rounds up. That number that looks like 0.285 might actually be just a tiny bit less, so it rounds down.

      Workaround: The general strategy here is to round numbers as late as possible, to as many digits as possible.
      One example we've seen a few times now is when someone is computing a unit price for a large order. Some math gets done that comes up with a small price per item, that looks like it's got exactly half a cent in it (like the 0.285 example above). The application developer rounds this rate to the nearest cent before multiplying the number of units. The business owner expects this to be 0.29 cents per unit, but Quick Base computes it as 0.28 cents per unit, and the one-cent difference times tens of thousands of units comes up to a hundreds-of-dollars "discrepancy".

      In this case, we suggest that you don't round the unit rate to two digits. Consider rounding it to three or four digits, or even not rounding it at all and just displaying it to three or four digits, and then round the price after you multiply by the number of units.

      When I display a number to a particular number of decimal places, sometimes the last digit is wrong. Sometimes it's different from what I get when I round the number to the same number of decimal places. What's happening?

      Quick Base goes through different code paths when it is rounding numbers and when it is choosing how to display numbers. All it takes is a tiny little difference in the algorithms to cause rounding and display to make different decisions about that last digit.

      Workaround:
      There really isn't a direct workaround. The only thing I know how to suggest is that you learn to expect variability in the very last digit of any fractional number. This is really the most important principle of the whole story, right here. If you learn to not expect that last digit to be exactly right, you will recognize and figure out specific workarounds to any problems like these you encounter in the future.

      I have a custom key field (or I'm trying to merge on a numeric field). I'm getting duplicate entries. What's the problem?

      This is another symptom of the fact that two fractional numbers can look the same, even when displayed to full precision, but be mathematically different way down in the smallest bit or two. Remember that a value that looks simple in decimal, like 1.4, is not exactly representable in binary. The value already stored in a record might have come from some mathematical operation and be the binary number just bigger than 1.4, and when you type 1.4 in directly it might be the binary number just smaller than 1.4. Those numbers are not equal, so Quick Base thinks you're adding a new record, not editing an existing one.

      Workaround: As with the previous question, there is no direct workaround. If you use fractional values in an existing key field, you are almost guaranteed to eventually run into this problem. So the first rule is don't use fractional values in a numeric key field, or other matching criteria.

      If it turns out that a field that has fractional values in it is natural to use as a key field, or a merge field, the best recommendation I can give you is to tweak how you define the field so that its value is always an integer. For example, if you have a [Cost] field that contains values that look like dollars and cents, and for some reason you need to use this as a merge field or a key field, I recommend that you redesign your application so that you have a [Cost in pennies] field instead, whose values are all integers. This will be safe to use as a key field, merge value or matching criteria.

      Hopefully this helps. We encourage you to reach out to our Care team for assistance with specific build patterns.
      ------------------------------
      J. Michael Hammond
      Senior Software Engineer in Test
      Quick Base, Inc.
      ------------------------------