
float numbers

Status
Not open for further replies.

Cagliostro

Programmer
Sep 13, 2000
4,226
GB
By the way, what is the binary format of float/double numbers? For example, run this code to see what I mean:
#include<iostream>
using namespace std;
struct x
{
union
{
int i;
float f;
};
};
int main(int _args,char** _cmdline)
{
x a;
a.f=1.;
cout<<a.i<<endl;
a.i=1;
cout<<a.f<<endl;
return 0;
}
John Fill


ivfmd@mail.md
 
I was trying to figure this out a while ago as well... if you should figure it out please post.

Matt
 
OK, I think this is a question for asm programmers.
John Fill
 
Unfortunately, I don't have the book with me here, but the general binary format for floating-point numbers is defined by the IEEE (the IEEE 754 standard). It consists of a sign bit, several bits for the exponent, and the remaining bits for the mantissa. You can probably find it online; I believe VC++ uses the same format as the Pentium processor.
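To make the sign/exponent/mantissa split concrete, here is a minimal sketch that pulls the three fields out of a 4-byte float. It assumes the platform stores float in the 32-bit IEEE 754 layout (1 sign bit, 8 exponent bits, 23 mantissa bits); the names FloatFields and decode are made up for this example.

```cpp
#include <cstdint>
#include <cstring>

// Assumption: float is the 32-bit IEEE 754 format on this platform.
static_assert(sizeof(float) == sizeof(std::uint32_t),
              "this sketch assumes a 32-bit float");

// Hypothetical helper type holding the raw fields of a float.
struct FloatFields {
    std::uint32_t sign;      // 1 bit
    std::uint32_t exponent;  // 8 bits, stored with an offset (bias) of 127
    std::uint32_t mantissa;  // 23 bits, the fraction after the implied leading 1
};

// Copy the bytes of the float into an integer and mask out each field.
// memcpy is used rather than a union because reading the inactive member
// of a union is not guaranteed to work in standard C++.
FloatFields decode(float value) {
    std::uint32_t bits;
    std::memcpy(&bits, &value, sizeof bits);
    return FloatFields{bits >> 31, (bits >> 23) & 0xFFu, bits & 0x7FFFFFu};
}
```

Under that assumption, decode(1.0f) yields sign 0, exponent 127 and mantissa 0, which is why printing the integer view of 1.0 gives a large number rather than 1.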
 
Most likely... and I am not very good at it. I copied my idea and merged it with your post. Here is what I was thinking along the lines of:

#include <iostream>
using namespace std;

struct x
{
union
{
int i;
float f;
};
};
int main(int _args,char** _cmdline)
{
x a;
a.f=1.;
cout<<a.i<<endl;
cout<<"BINARY: ";
for(int i = 31;i+1;i--)
{
cout<< ((a.i & 1u<<i) ? 1:0); // 1u: shifting a signed 1 into bit 31 is not portable
}
cout<<endl<<endl;
a.i=1;
cout<<a.f<<endl;
cout<<"BINARY: ";
for(int j = 31;j+1;j--)
{
cout<< ((a.i & 1u<<j) ? 1:0); // 1u: shifting a signed 1 into bit 31 is not portable
}
cout<<endl;
return 0;
}
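The same bit dump can probably be done with less code using std::bitset. This is only a sketch: it assumes a 32-bit float, and float_bits and float_from_bits are names invented here, not anything from the posts above.

```cpp
#include <bitset>
#include <cstdint>
#include <cstring>
#include <string>

// Assumption: float occupies exactly 32 bits on this platform.
static_assert(sizeof(float) == sizeof(std::uint32_t),
              "this sketch assumes a 32-bit float");

// Return the 32 bits of a float as text, most significant bit first.
std::string float_bits(float value) {
    std::uint32_t bits;
    std::memcpy(&bits, &value, sizeof bits);  // copy the raw bytes, no conversion
    return std::bitset<32>(bits).to_string();
}

// The reverse direction: treat a 32-bit pattern as a float.
float float_from_bits(std::uint32_t bits) {
    float value;
    std::memcpy(&value, &bits, sizeof value);
    return value;
}
```

Something like cout << float_bits(1.0f) << endl; would then replace each hand-written loop, and float_from_bits(1) corresponds to the a.i=1 case in the code above.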
 
hi there
as far as i know, float values are laid out like this:
byte 0: exponent (e)
byte 1-5: mantissa (m)
the msb of the mantissa is always regarded as set, so that bit position is used as the sign bit. therefore 1/2 <= m < 1 is always valid.
the sign of the exponent is handled by offsetting, which means -128 is stored as 0x00.
the float value is m * 2^e

hope it helps
yours
copton
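The decomposition into a mantissa with 1/2 <= m < 1 and a power of two can be tried out directly with std::frexp from the standard library, which uses the same convention regardless of how the bytes are stored. A small sketch follows; Parts and split are hypothetical names for this example.

```cpp
#include <cmath>

// Hypothetical helper type: value == m * 2^e with 0.5 <= |m| < 1.
struct Parts {
    double m;  // mantissa
    int e;     // power of two
};

// std::frexp splits a finite non-zero value into mantissa and exponent;
// std::ldexp(m, e) puts the two pieces back together.
Parts split(double value) {
    int e = 0;
    const double m = std::frexp(value, &e);
    return Parts{m, e};
}
```

For example, split(8.0) gives m = 0.5 and e = 4, since 0.5 * 2^4 = 8.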

 
4-byte float
Bit Description
31 - Sign
30-23 - Exponent
22-0 - Mantissa (the significant bits of the number)

8-byte float (double)
63 - Sign
62-52 - Exponent
51-0 - Mantissa

10-byte float (long double)
79 - Sign
78-64 - Exponent
63-0 - Mantissa
John Fill
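As a rough check of the 8-byte layout, the sketch below takes a double apart using those bit positions and then rebuilds the value from the pieces. Two things here are assumptions that the table does not state: the exponent is stored with an offset of 1023, and the mantissa has an implied leading 1 (both are part of IEEE 754 double precision). The function name rebuild is made up, and the sketch only handles normal numbers (not zero, denormals, infinity or NaN).

```cpp
#include <cmath>
#include <cstdint>
#include <cstring>

// Assumption: double is the 64-bit IEEE 754 format on this platform.
static_assert(sizeof(double) == sizeof(std::uint64_t),
              "this sketch assumes a 64-bit double");

// Split a normal double into sign / exponent / mantissa, then recompute
// the value as (-1)^sign * (1 + mantissa / 2^52) * 2^(exponent - 1023).
double rebuild(double value) {
    std::uint64_t bits;
    std::memcpy(&bits, &value, sizeof bits);

    const int sign = static_cast<int>(bits >> 63);                 // bit 63
    const int exponent = static_cast<int>((bits >> 52) & 0x7FFu);  // bits 62-52
    const std::uint64_t mantissa =
        bits & ((std::uint64_t{1} << 52) - 1);                     // bits 51-0

    // Add the implied leading 1, then apply the unbiased power of two.
    const double fraction =
        1.0 + std::ldexp(static_cast<double>(mantissa), -52);
    const double magnitude = std::ldexp(fraction, exponent - 1023);
    return sign ? -magnitude : magnitude;
}
```

If the layout and the two assumptions hold, rebuild(x) returns x unchanged for any normal double, for instance rebuild(-6.25) gives -6.25.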