IEEE Standard Floating Point format. Long answer "how does it work"

BinaryBob · May 21, 2001

I know this is one of the more time-consuming questions. The thing is, I just got to the FPU on the processor and am trying to figure out the pattern of the format. No one seems to know, and I'll be amazed if anyone here does. The only reason I want to know is so I can make a really good calculator, one that can solve straight algebra problems. So if anyone can help, that would be great! Even recommending a book is good. Oh, by the way, I'm only in geometry, so I'm illiterate in the ideas of algebra two. Thanks a lot.

AmkG · May 22, 2001

I'm not that good at understanding the format of the IEEE floating point itself, but I can give an analogy...

Here's how we people store a floating point number in what we call SCIENTIFIC NOTATION:

1.234 E -2
(which is 0.01234 )

Now floating point is similar. If you look at the format, you'll see one bit for the sign, a lot of bits put into a so-called exponent, and a lot of bits into a so-called mantissa (8 and 23 bits for a float, 11 and 52 for a double). The exponent portion is an integer field; it is not two's complement, but is stored with a fixed offset so that negative exponents fit in an unsigned field. This portion is similar to the right part of the SciNotation, which is an exponent, except that it is a power of 2 rather than a power of 10. The left part of the SciNotation is similar to the mantissa.
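To make the field layout concrete, here is a minimal sketch that splits a single-precision float into its three fields. The function name float_fields is made up for this illustration; the 1/8/23 bit widths are those of the IEEE 754 single format.

```python
import struct

def float_fields(x):
    """Split a 32-bit float into (sign, exponent_field, mantissa_field)."""
    # Reinterpret the 4 bytes of the float as one unsigned 32-bit integer.
    (bits,) = struct.unpack(">I", struct.pack(">f", x))
    sign = bits >> 31                # top bit
    exponent = (bits >> 23) & 0xFF   # next 8 bits, as stored (with the offset)
    mantissa = bits & 0x7FFFFF       # low 23 bits, as stored
    return sign, exponent, mantissa

print(float_fields(0.01234))
```

The values printed are the raw stored fields, not the decoded exponent or mantissa.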

Now, in SciNotation, leading zeros are not written in the mantissa, since the exponent already records where the point goes. The mantissa that the computer is concerned with is a binary mantissa, so for example, you have:
0.25 decimal
= 0.01 binary
0.125 decimal
= 0.001 binary
0.375 decimal
= 0.011 binary
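The conversions above can be reproduced with a short sketch that doubles the fraction repeatedly and peels off one binary digit each time. The helper name fraction_to_binary and the max_digits cutoff are assumptions made for this example.

```python
def fraction_to_binary(x, max_digits=16):
    """Return the binary digits of a fraction 0 <= x < 1, e.g. '0.011'."""
    digits = []
    while x > 0 and len(digits) < max_digits:
        x *= 2               # shift the binary point one place right
        if x >= 1:
            digits.append("1")
            x -= 1           # drop the digit that was just emitted
        else:
            digits.append("0")
    return "0." + ("".join(digits) or "0")

for value in (0.25, 0.125, 0.375):
    print(value, "decimal =", fraction_to_binary(value), "binary")
```

Fractions such as 0.1 have no finite binary expansion, which is why the sketch stops after max_digits digits.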

After the decimal (binary?) point, the binary digits represent 1/2, then 1/4, etc., the same way decimal digits after the decimal point represent 1/10, then 1/100, etc. Now one thing that the IEEE designers noticed about binary floating point is that, since there are only two possible digits, every binary mantissa (except in the case of 0.0) starts with a 1 once the point has been shifted to just after the first non-zero digit. So they decided to remove that 1 from the format, making it an implied 1. This complicates matters: with the implied 1 always set, there was no easy way of defining 0.0...
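As a rough illustration of the implied 1, the sketch below shifts a positive number into the form 1.xxx times a power of two and compares that with the 23 fraction bits a float actually stores. The names normalize and stored_fraction are invented for this example, and it only handles positive, non-zero inputs.

```python
import math
import struct

def normalize(x):
    """Return (significand, exponent) with 1 <= significand < 2
    and x == significand * 2**exponent, for positive x."""
    m, e = math.frexp(x)   # frexp gives 0.5 <= m < 1 and x == m * 2**e
    return m * 2, e - 1

def stored_fraction(x):
    """Return the 23 mantissa bits a single-precision float stores for x."""
    (bits,) = struct.unpack(">I", struct.pack(">f", x))
    return bits & 0x7FFFFF

sig, exp = normalize(0.375)                    # 0.375 = 1.1b x 2^-2
print(sig, exp)                                # 1.5 -2
print(format(stored_fraction(0.375), "023b"))  # 10000000000000000000000
```

Only the ".1" after the point ends up in the stored bits; the leading 1 is nowhere in the field.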

So IEEE decided that a float or double containing all 0's was called 0.0.

You might expect the problem to have merely shifted to 1.0. That's because 1.0 in binary floating point is:
1.00 E 0
Since that first 1 is thrown away, the mantissa stored is 0, and the exponent is 0... and if both the stored mantissa and the stored exponent were 0, the number would be 0.0!

It doesn't come to that, because the exponent is stored with an offset: an exponent of 0 is stored as 127 in a float (1023 in a double), not as 0. So 1.0 is represented exactly, by:
sign 0, exponent field 01111111, mantissa 00000000000000000000000 (23 zeros for a float, 52 for a double)

So 1.0 is a number with a zero mantissa field and a non-zero exponent field, and it cannot be confused with 0.0.
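One way to check how 0.0 and 1.0 are actually stored is to print their bit patterns directly. This is a small sketch for single precision; bits_of is a made-up helper name.

```python
import struct

def bits_of(x):
    """Return the 32 bits of a single-precision float as 'sign exponent mantissa'."""
    (bits,) = struct.unpack(">I", struct.pack(">f", x))
    s = format(bits, "032b")
    return f"{s[0]} {s[1:9]} {s[9:]}"

print(bits_of(0.0))  # 0 00000000 00000000000000000000000
print(bits_of(1.0))  # 0 01111111 00000000000000000000000
```

The exponent field of 1.0 reads 01111111, which is 127, so the two numbers have different bit patterns even though both have an all-zero mantissa field.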

Got that?

Anyway, these explanations aren't very precise, but they're pretty good ones, if only to help demonstrate how difficult floating point is.

BinaryBob · May 22, 2001

Wow, thanks a lot, this is exactly what I needed. Although I don't quite get it yet, I've already taken it to a calculus teacher who has volunteered to help. Adding this information will help even more, and when I figure it out I will post it here for all to see.

AmkG · Jun 1, 2001

Now there is also this little piece of data you need to know (and which I forgot the first time):
The exponent portion is NORMALIZED to 127 (short) or 1023 (long).

What "normalized" actually means here is that the number stored in the exponent portion of the format is the actual exponent +127 (or +1023 for a double). The stored field is an unsigned integer, so a small stored value stands for a very negative exponent rather than for zero. The sign of the whole number lives in a separate sign bit, and the mantissa is kept as a plain magnitude (the non-complemented version, not 2's complement), which is why the format has both a positive and a negative zero. The exponent is stored with this offset instead of in 2's complement, so the field itself is never negative; a nice side effect is that positive floats sort in the same order as their bit patterns read as integers. But of course, we all need a negative exponent somewhere, so the smallest stored values map to the most negative exponents. Two stored values are reserved: all zeros (for 0.0 and the denormals, which have no implied 1) and all ones (for infinity and NaN). So for example, a mantissa of (1)01 (first 1 is the implied one) with a 1 in the exponent field of a float is actually:
1.01b x 2^-126
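Here is a sketch of decoding a bit pattern by hand, assuming the IEEE 754 single-precision offset of 127 and skipping the two reserved exponent fields. The names BIAS and decode_normal are invented for this example.

```python
import struct

BIAS = 127  # single precision (float); a double uses 1023

def decode_normal(bits):
    """Decode a 32-bit pattern holding a normal (non-reserved) float."""
    sign = -1.0 if bits >> 31 else 1.0
    exp_field = (bits >> 23) & 0xFF
    fraction = bits & 0x7FFFFF
    if exp_field in (0, 255):
        raise ValueError("reserved exponent field: zero/denormal or infinity/NaN")
    significand = 1 + fraction / 2**23   # put the implied 1 back
    return sign * significand * 2.0 ** (exp_field - BIAS)

# Mantissa (1)01 with a stored exponent field of 1, i.e. 1.01b x 2^-126.
bits = (1 << 23) | (0b01 << 21)
value = decode_normal(bits)
# Cross-check against the machine's own interpretation of the same bits.
(check,) = struct.unpack(">f", struct.pack(">I", bits))
print(value == 1.25 * 2.0**-126, value == check)  # True True
```

Subtracting BIAS from the stored field is all the offset amounts to; the rest is putting the implied 1 back and applying the sign.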

Get?
"Information has a tendency to be free. Which means someone will always tell you something you don't want to know."

AmkG · Jun 2, 2001

Oops, that's BIASED, not NORMALIZED. Normalized refers to the removed first 1 on the mantissa; biased refers to the +127 of the exponent.
"Information has a tendency to be free. Which means someone will always tell you something you don't want to know."
