Number Systems and Bases

Base systems like binary and hexadecimal seem a bit strange at first. The key is understanding how different systems “tick over” like an odometer when they are full. Base 10, our decimal system, “ticks over” when it gets 10 items, creating a new digit. We wait 60 seconds before “ticking over” to a new minute. Hex and binary are similar, but tick over every 16 and 2 items, respectively.

Try converting numbers to hex and binary here:

Way back when: Unary Numbers

Way back in the day, we didn’t have base systems! It was uphill both ways, through the snow and blazing heat. When you wanted to count one, you’d write:

When you wanted 5, you’d write

lllll

And clearly, 1 + 5 = 6

l + lllll = llllll

This is the simplest way of counting.

Enter the Romans

In Roman numerals, two was one, twice. Three was one, thrice:

one = I
two = II
three = III

However, they decided they could do better than the old tradition of lines in the sand. For five, we could use V to represent lllll and get something like

l + V = Vl

Not bad, eh? And of course, there are many more symbols (L, C, M, etc.) you can use.

The key point is that V and lllll are two ways of encoding the number 5.

Give each number a name

Another breakthrough was realizing that each number can be its own distinct concept. Rather than represent three as a series of ones, give it its own symbol: “3″. Do this from one to nine, and you get the symbols:

1 2 3 4 5 6 7 8 9

The Romans were close, so close, but only gave unique symbols to 5, 10, 50, 100, 1000, etc.

Use your position

Now clearly, you can’t give every number its own symbol. There’s simply too many.

But notice one insight about Roman numerals: they use position of symbols to indicate meaning.

IV means “subtract 1 from 5″

and VI means “add 1 to 5″.

In our number system, we use position in a similar way. We always add and never subtract. And each position is 10 more than the one before it.

So, 35 means “add 3*10 to 5*1″ and 456 means 4*100 + 5*10 + 6*1. This “positional decimal” setup is the Hindu-Arabic number system we use today.

Our choice of base 10

Why did we choose to multiply by 10 each time? Most likely because we have 10 fingers.

One point to realize is you need enough digits to “fill up” until you hit the next number. Let me demonstrate.

If we want to roll the odometer over every 10, so to speak, we need symbols for numbers one through nine; we haven’t reached ten yet. Imagine numbers as ticking slowly upward – at what point do you flip over the next unit and start from nothing?

Enter zero

And what happens when we reach ten? How do we show we want exactly one “ten” and nothing in the “ones” column?

We use zero, the number that doesn’t exist. Zero is quite a concept, it’s a placeholder, a blank, a space, and a whole lot more. Suffice it to say, Zero is one of the great inventions of all time.

Zero allows us to have an empty placeholder, something the Romans didn’t have. Look how unwieldly their numbers are without it.

George Orwell’s famous novel “1984″ would be “MCMLXXXIV”! Rolls right off the tongue, doesn’t it?

Considering other bases

Remember that we chose to roll over our odometer every ten. Our counting looks like this:

1
2
3
4
5
6
7
8
9 (uh oh, I’m getting full!)
10 (ticked over – start a new digit)

What if we ticked over at 60 when we counted, like we do for seconds and minutes?

1 second
2
3
4
5
…
58
59
1:00 (60 seconds aka 1 minute. We’ve started a new digit.)

Everything OK so far, right? Note that we use the colon (:) indicate that we are at a new “digit”. In base 10, each digit can stand on its own.

Try Base 16

If we want base 16, we could do something similar:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15 (we’re getting full)
1:00 (16 – we’ve started a new digit)

However, we don’t want to write hexadecimal numbers with the colon notation (even though we could). We’d rather cook up separate symbols for 10-15 so we can just write numbers like we’re used to. We’ve run out of numbers (1-9 already used, with 0 as a placeholder) so we need some other symbols. We could use some squiggly lines or other shapes, but the convenions is to use letters, Roman style. Just like 5 became V, programmers use letters A-F to get enough digits up to 16. That is,


1
2
3
4
5
6
7
8
9
A (10 – we’re using the symbol “A”)
B (11)
C (12)
D (13)
E (14)
F (15 – uh oh, we’re getting full)
10 (16 – we start a new digit)

Ahah! Now we can use one digit per “place”, and we know that 10 actually means we’ve “ticked over to 16″ once.

20 means we’ve ticked over to 16 twice (32).

25 means we’ve ticked over to 16 twice (giving us 32) and gone an extra 5. The total is 32 + 5 = 37.

Quick review

With me so far? This is pretty cool, right? We can count in any system we want. Also notice that base 16 is more “space efficient” in the sense we can write a number like 11 in a single digit: B.

Base 16 really isn’t that different from base 10, we just take longer to fill up.

The wonderful world of binary

We’ve seen plenty of base systems, from over-simple unary, to the unwiedly Roman numerals, the steady-going base 10 and the compact base 16.

What’s great about binary? In the spirit of keeping things simple, it’s the simplest number system that has the concept of “ticking over”. Unary, where we just write 1, 11, 111… just goes on forever. Binary, with two options (1 and 0) looks like this:


1: 1
2: 10 (we’re full – tick over)
3: 11
4: 100 (we’re full again – tick over)
5: 101
6: 110
7: 111
8: 1000 (tick over again)
…

and so on.

Because binary is so simple, it’s very easy to build in hardware. You just need things that can turn on or off (representing 1 and 0), rather than things that have 10 possible states (to represent decimal).

Because it’s so simple, binary is also resistant to errors. If your signal is “partially on” (let’s say 0.4), you can assume that’s a zero. And if it’s mostly on (say 0.8), then you can assume it’s a 1. If you’re using a system with 10 possible states, it’s difficult to tell when an error occurred. This is one reason digital signals are so resilient to noise.

Other examples of bases

We use other bases all the time, even dynamically changing bases. We usually don’t think of it that way:

Hours, minutes, seconds: 1:32:04

We know this is 1 hour, 32 minutes, 4 seconds. In seconds, this is (1 * 60 * 60) + (32 * 60) + 4.

Feet and inches: 3′ 5″

This is 3 feet, 5 in or 3 * 12 + 5 inches.

Pounds and ounces: 8 lbs, 5 oz

Since a pound is 16 oz, This is 8 * 16 + 5 oz. We’ve been using a base 16 number system all along!

Parting thoughts

The digits 10 in any number system means we have ticked over the base one time. For example, 10 in binary means two, 10 in decimal means ten, and 10 in hexadecimal is sixteen.

How do you keep these numbers apart, how do you tell what base was used? Programmers will often write 0b in front of binary numbers. So 2 in binary is

0b10

Similarly, they’ll write 0x in front of hex numbers. So 16 in hex is:

0x10

If there aren’t any symbols (0b or 0x) in front, we assume it’s base 10, a regular number.

Now go forth and enjoy your new knowledge!