We start off by looking at how we write numbers down in different bases, and then at how computers store them.
We write our numbers in base-10 (10 possible digits). Moving left from the decimal point (or a comma, in some locales), each digit contributes a factor of an increasing power of 10; moving right, the powers decrease through negative exponents. We write the base "10" in subscript after the number, to remind us which base is in use.
133.75₁₀
^^^ ^^- 5 · 10⁻² = 0.05
||| |-- 7 · 10⁻¹ = 0.7
|||---- 3 · 10⁰ = 3
||----- 3 · 10¹ = 30
|------ 1 · 10² = 100
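As a sanity check, we can reproduce this decomposition in Python (a minimal sketch; Fraction keeps the arithmetic exact):

from fractions import Fraction

# (digit, power of ten) pairs for 133.75
digits = [(1, 2), (3, 1), (3, 0), (7, -1), (5, -2)]
total = sum(Fraction(d) * Fraction(10) ** p for d, p in digits)
print(float(total))  # 133.75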
We can do the same thing for "binary" (base-2, two possible digits) numbers:
10000101.11₂
^    ^ ^ ^^- 1 · 2⁻² = 1 / 4 = 0.25₁₀
|    | | |-- 1 · 2⁻¹ = 1 / 2 = 0.50₁₀
|    | |---- 1 · 2⁰ = 1 = 1.00₁₀
|    |       0 · 2¹ = 0 · 2
|    |------ 1 · 2² = 4 = 4.00₁₀
|            0 · 2³ = 0 · 8
|            0 · 2⁴ = 0 · 16
|            0 · 2⁵ = 0 · 32
|            0 · 2⁶ = 0 · 64
|----------- 1 · 2⁷ = 128 = 128.00₁₀
= 128+4+1+0.5+0.25 (base-10)
= 133.75
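Python can verify this for us: int accepts a base argument for the integer part, and the two fractional bits contribute 1/2 and 1/4:

# the integer part, parsed as base 2
print(int('10000101', 2))  # 133
# add the fractional bits .11 = 1/2 + 1/4
print(int('10000101', 2) + 1/2 + 1/4)  # 133.75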
Binary is very appealing for digital electronics, because we can easily construct circuits that perform Boolean operations.
Instead of decreasing the number of possible digits, we could increase it; hexadecimal (base-16) is very popular, because one hexadecimal digit corresponds to exactly four binary digits. The symbols for hexadecimal run 0-9 and then A-F; here F₁₆ = 15₁₀.
Exercise: build a list of all four-digit binary numbers, and compute the corresponding single-digit hexadecimal number.
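A short Python loop can print the table for checking your answer afterwards:

# every four-bit pattern next to the matching hexadecimal digit
for n in range(16):
    print(format(n, '04b'), format(n, 'X'))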
85.C₁₆
^^ ^---- 12 · 16⁻¹ = 12 / 16 = 0.75₁₀
||------ 5 · 16⁰ = 5 · 1 = 5₁₀
|------- 8 · 16¹ = 128₁₀
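Python understands hexadecimal too; int takes base 16, and float.fromhex even accepts a fractional part:

print(int('85', 16))            # 133
print(float.fromhex('0x85.C'))  # 133.75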
Unsigned integers are stored as sequences of binary digits, grouped into bytes. On some processors, the bytes of an integer are stored in "reverse" order, with the most significant byte (largest contribution) last and the least significant byte (smallest contribution) first. The order in which bytes are stored is called "endianness": "little-endian" puts the least significant byte first, and "big-endian" puts the most significant byte first.
Most mainstream CPUs (x86, for example) are little-endian (the reverse of how we write numbers), while network protocols are big-endian ("network byte order"). Care must be taken when data crosses an endianness boundary.
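Python's struct module makes the two byte orders visible; a small sketch packing the same 32-bit integer both ways:

import struct

value = 0x0A0B0C0D
print(struct.pack('<I', value).hex())  # 0d0c0b0a  little-endian
print(struct.pack('>I', value).hex())  # 0a0b0c0d  big-endian (network order)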
Signed integers can be stored in several different ways; you may refer to the Wikipedia article on signed number representations. The most popular is called two's complement.
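As a small taste of two's complement (a sketch; Python integers are unbounded, so we mask to a fixed width by hand):

# -5 stored in 8 bits under two's complement
print(format(-5 & 0xFF, '08b'))  # 11111011
# and back again: subtract 2**8 when the top bit is set
print(0b11111011 - 2**8)         # -5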
Fractional numbers can be stored in several ways; the most common is "floating point": the number is stored as a pair of signed integers, the digits (called the mantissa) and the exponent.
Example for decimal:
133.75 = 13375 · 10⁻²
= [13375][-2]
^^^^^ ^^----- exponent
|----------- mantissa
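In Python, the pair reconstructs the original value:

mantissa, exponent = 13375, -2
print(mantissa * 10.0 ** exponent)  # 133.75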
The most common kind of floating point, implemented directly in processor hardware, is binary, meaning that the mantissa is scaled by powers of 2 rather than powers of 10.
For a standard 32-bit floating point number (IEEE 754 single precision), 1 bit is allocated to the sign, 8 bits to the exponent, and 23 bits to the mantissa; but this is a naïve story!
Read more at The Perils of Floating Point; if you want to know even more, read the Wikipedia article on IEEE 754 floating point.
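To peek at the real story, we can ask Python for the actual bits of our running example 133.75 as a 32-bit float; note that the exponent field holds 134 = 127 + 7 (a "biased" exponent), and the mantissa omits the implicit leading 1:

import struct

# reinterpret the four bytes of a 32-bit float as one unsigned integer
bits = struct.unpack('>I', struct.pack('>f', 133.75))[0]
s = format(bits, '032b')
print(s[0], s[1:9], s[9:])  # 0 10000110 00001011100000000000000
#   sign  exponent  mantissa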
Exercise: Pretend we use a naïve floating-point format with a 5-bit mantissa and a 3-bit exponent (base 2). What is the smallest positive number representable? What is the largest? The first bit of each field is used for the sign:
[±****][±**]
 ^^^^^  ^^^--- exponent
 |------------ mantissa
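A brute-force enumeration can confirm your answers afterwards (a sketch, assuming sign-and-magnitude for both fields, as drawn above):

# mantissa magnitude is 4 bits (1..15), exponent magnitude is 2 bits (-3..3)
values = {m * 2.0 ** e for m in range(1, 16) for e in range(-3, 4)}
print(min(values), max(values))  # 0.125 120.0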
One of the many perils of working with floating-point numbers: recall that the fraction 1/3 has an infinitely long decimal expansion, while 1/10 has a finite expansion:
1/10 = 0.1
1/3 = 0.333…
In binary, we have a similar problem: fractions whose denominators are powers of 2 have finite expansions, while 1/10 has an infinite expansion:
1/2 = 0.1₂
1/10 = 1/16 + 1/32 + 1/256 + ⋯
= 0.000110011…₂ (the group 0011 repeats forever)
Since our processors use binary floating point, we have to truncate (cut off) this representation, resulting in an imperfect approximation of 1/10.
Exercise: use Python to check this:
from decimal import Decimal
print(repr(Decimal(0.1)))
should result in
Decimal('0.1000000000000000055511151231257827021181583404541015625')
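The fractions module offers another view of the same stored value, as an exact ratio whose denominator is a power of two:

from fractions import Fraction
# the exact ratio stored for 0.1; the denominator is 2**55
print(Fraction(0.1))  # 3602879701896397/36028797018963968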
Exercise: what is the best approximation of 0.01?
Read the Python floating point tutorial for more information on how this applies to Python.