Tags: programming
<< previousnext >>Spot quiz! See how many of these calculations you get right. No calculator necessary.
Answers in 5...
4...
3...
2...
1...
spoiler: 25, 94, 7 and 11.
If you got all of them correct without finger-counting, without drops of nervous sweat rolling down your neck, and without asking any questions on StackOverflow, then congratulations. You are an Übermensch of calculation, and there is nothing for you to learn here.
If not, then you've fallen victim to one of the 2 hard problems1 in computer science: the off-by-1 error.
Off-by-1 errors have been written about since before Jesus Christ. At the latest, they were mentioned by a Roman named Vitruvius, who served in the army of Julius Caesar. He was talking about the number of fenceposts around a temple, but the principle is the same. The problem is old, and you shouldn't feel bad if it caught you out.
The good news is that there's a simple, easy-to-remember formula that applies to all of the above calculations. Once you've learned it, you'll never again suffer from this particular brand of off-by-1 horror. Nor will you miscount the number of days until Aunt Catherine's birthday.
Let's reconsider all of the above problems as calculations of interval length.
What's an interval? It's a set of integers that lies between a lower and upper bound. For example, the interval [3, 7]
contains the numbers 3, 4, 5, 6 and 7. It's a set, so we write it using the set notation {3,4,5,6,7}
.
The bounds of an interval can be open or closed. If a bound is open, it means that the bound itself is excluded from the interval. If a bound is closed, it's included. In the above example, both bounds were closed. The same interval with open bounds, written (3, 7)
, is the set {4,5,6}
.
The bounds can also be half-open. The half-open intervals (3, 7]
and [3, 7)
are {4,5,6,7}
and {3,4,5,6}
, respectively.
Now, let's rephrase the problems from before in terms of intervals.
(2, 28)
, contains 25=28-2-1 elements.(3, 97]
, contains 94=97-3 elements.[2, 9)
, contains 7=9-2 elements.[50, 60]
, contains 11=60-50+1 elements.We have all 3 types of interval here: open, half-open and closed. If you look closely, you'll notice a pattern in how we calculate the number of elements: subtract the lower bound from the upper bound, then add -1, 0 or +1 depending on the type of interval.
Here it is more explicitly. If you have an open interval (L, U)
, then the number of elements is U-L-1
. If you have a half-open interval (L, U]
or [L, U)
, then it's U-L
. And finally, if you have a closed interval [L, U]
, it's U-L+1
. They're all the same formula, you just have to add 1 if both bounds are closed and subtract 1 if they're both open.
This is the reason why programming interfaces use half-open bounds for ranges. In Python, for example, you say mylist[L:L+N]
to copy the list items with indexes in the interval [L, L+N)
. That's (L+N)-L=N elements. Much neater than having rogue +1s and -1s floating around the place.
That's it. I was going to make this into a whole big thing, but it's really that simple. I used to come up with small example cases and finger-count in order to figure out interval length. This way is easier.
You should now be able to count the number of days until your birthday with full confidence, and perhaps more usefully, avoid annoying off-by-1 errors in your programming. Happy counting, and say hello to Aunt Catherine.
From the over-used joke: "There are 2 hard problems in computer science: cache invalidation, naming things, and off-by-1 errors." ↩
I'd be happy to hear from you at galligankevinp@gmail.com.