
2. Floating point

For the same number of available bits, floating point can represent a wider range of real numbers than the fixed point approach.

It is called floating point because the radix point (binary or decimal) is not held in a fixed position: it moves around.

Floating point is similar to representing a number in scientific notation:

[Figure: floating point number]

The 'mantissa' holds the value of the number and its sign, and the 'exponent' defines where the decimal point needs to be when the number is written in standard format. In the example above, the 10^3 indicates that the mantissa needs to be multiplied by a thousand, so the radix point has to move three places to the right.
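The effect of the exponent can be sketched in Python. This is only an illustration and not part of the original material: the function name `from_scientific` and the sample mantissa 1.2345 are hypothetical, chosen to show the point moving.

```python
from decimal import Decimal


def from_scientific(mantissa: str, exponent: int) -> Decimal:
    """Return mantissa x 10^exponent (hypothetical helper for illustration)."""
    # scaleb shifts the decimal point: right for a positive exponent,
    # left for a negative one.
    return Decimal(mantissa).scaleb(exponent)


print(from_scientific("1.2345", 3))   # prints 1234.5
print(from_scientific("1.2345", -3))  # prints 0.0012345
```

An exponent of 3 moves the point three places to the right, while an exponent of -3 moves it three places to the left.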

Binary floating point uses the same idea. A binary floating point number is in two parts: the mantissa and the exponent. The example here is an 8-bit floating point number.

Both the mantissa and the exponent are in two's complement. If there is a 1 in the most significant bit (MSB) of the mantissa, then it is a negative number. If there is a 1 in the MSB of the exponent, then it is a negative exponent (in decimal this is like 10^-3, which means shift the point left rather than right).

When the number is stored as an actual value in an 8-bit register, the mantissa has one bit for the integer part and three bits for the fractional part. In the example, there is a 1 in the leftmost bit, so it is a negative number.
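The decoding rules can be sketched in Python. This is a minimal sketch under stated assumptions, not the page's own method: it assumes the four leftmost bits are the mantissa and the remaining four bits are the exponent, both in two's complement. The function names and the two sample bit patterns are hypothetical.

```python
def twos_complement(bits: str) -> int:
    """Interpret a string of '0'/'1' characters as a two's complement integer."""
    value = int(bits, 2)
    if bits[0] == "1":  # MSB set, so the value is negative
        value -= 1 << len(bits)
    return value


def decode_float8(bits: str) -> float:
    """Decode 8 bits as a 4-bit mantissa then a 4-bit exponent (assumed layout)."""
    if len(bits) != 8 or set(bits) - {"0", "1"}:
        raise ValueError("expected exactly 8 binary digits")
    mantissa_bits, exponent_bits = bits[:4], bits[4:]
    # One integer bit and three fractional bits: divide by 2**3.
    mantissa = twos_complement(mantissa_bits) / 8
    exponent = twos_complement(exponent_bits)
    return mantissa * 2.0 ** exponent


# Mantissa 0.110 is 0.75, exponent 0010 is 2: point moves two places right.
print(decode_float8("01100010"))  # prints 3.0
# Mantissa 1.010 is -0.75, exponent 1111 is -1: point moves one place left.
print(decode_float8("10101111"))  # prints -0.375
```

In the second pattern the leftmost bit is 1, so the mantissa is negative, and the 1 in the MSB of the exponent makes the exponent negative as well.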
