Double vs Float 4

sabe · Sep 26, 2001

I don't understand why I'm getting this output...

Here's the program
#include <stdio.h>

int main(void)
{
    float x;
    double y;

    x = 145678.999817;
    y = 145678.999817;
    printf("Float is %f and double is %f\n", x, y);

    x = 456.9998651;
    y = 456.9998651;
    printf("Float is %f and double is %f\n", x, y);

    return 0;
}

OUTPUT:
Float is 145679.000000 and double is 145678.999817
Float is 456.999878 and double is 456.999865

---

I understand the first line of the output, but I don't understand why the float is 456.999878 and the double is 456.999865 on the second line.

ShowLine · Sep 27, 2001

You should use %lf (long float) for a double.

If I'm right, a float is 32 bits and a double is 64 bits.

adyrotaru · Sep 27, 2001

The thing is simple:

When you use the "%f" format, the printf function expects a double value, not a float. As ShowLine said above, float values are 32 bits long while doubles need 64 bits. This leads to a misinterpretation of float values.

Good luck!

Lim · Sep 27, 2001

It is not a printf problem; it is trickier than that.
The problem is in the conversion from float to double. These two types have different mathematical representations, so there is a specific formula for converting float to double, and float.h contains

#define FLT_EPSILON 1.192092896e-07F
/* smallest such that 1.0+FLT_EPSILON != 1.0 */
which tells you that the precision of this conversion will be no better than FLT_EPSILON.
Just keep in mind that float->double is a mathematical operation.

In your case:
x = 145678.999817;
y = x; /* mathematical operation with FLT_EPSILON accuracy */
/* y = x is the same as passing x straight to printf */

printf("%12.8e\n",(y - 145678.999817)/145678.999817);

will be 1.25617794e-08, which is less than FLT_EPSILON and perfectly legitimate.
The same goes for the second example.
Or:
float x=99.9;
double y=x;
printf("%12.8e\n", y);
will be 9.99000015e+1 (and Epsilon will be 1.52e-08).

Working with float and double is trickier than working with integers.
Good luck.

Lim · Sep 27, 2001

Second thoughts about
FLT_EPSILON 1.192092896e-07F

A float is stored in the computer in binary, and its binary representation (with its mantissa) matches the decimal value only to within FLT_EPSILON.
When a float is converted to double, the extra mantissa bits are (I think) filled with zeros, so you get a double built from a binary number that was only close to the original decimal value.
With float and double you generally do not have the exact value, simply because the decimal->binary float conversion has an error of up to FLT_EPSILON.
Keep in mind that Epsilon = (X1 - X2)/X1, not just X1 - X2.

adyrotaru · Sep 27, 2001

Lim,

I do not blame the printf function, which works fine.
As I pointed out, the difference lies in the internal representation of the two real number types.
Of course, you gave a more detailed answer, which I admit I didn't know, but because the printf function really does expect a double value for the "%f" format, it will try to treat a float value as a double. If I'm wrong here, please let me know.
And, to make my post useful:

Type     Exponent length   Mantissa length
float    8 bits            23 bits
double   11 bits           52 bits

Bye

sabe · Sep 27, 2001

Thank you for your help! =)

smaniraja · Sep 28, 2001

Hi all. Fractional constants are double by default.

So in your program the numbers
x = 145678.999817;
y = 145678.999817;
are double by default, but you are storing one double-precision constant in a float variable (x) and the other in a double-precision variable.

As you all know, the memory allocated for a float variable is 32 bits, and for a double variable 64 bits. So the double constant can be stored in the double variable without loss,

but that is not the case for x, because part of the binary representation of the double may be discarded when it is stored in a float variable.

The second problem may come from the printf statement (printing a double number using the float format). %g is preferred.

So if you want to use a float constant in your program, write the constant followed by the letter f, like 1.34f.

Otherwise these things will cause major problems in comparisons.

Maniraja S

Lim · Sep 28, 2001

adyrotaru,
You are right, it expects a double.
Inside printf, the va_arg macro is used to fetch the variable arguments.
In printf there will be something like:
double d;
d = va_arg(ArgList, double);
where va_arg is a macro.

In stdarg.h you can see (for Intel):
#define _INTSIZEOF(n) ( (sizeof(n) + sizeof(int) - 1) & ~(sizeof(int) - 1) )
#define va_arg(ap,t) (*(t*)((ap+=_INTSIZEOF(t)) - _INTSIZEOF(t)))

As you can see, it casts the argument pointer to a double pointer and reads a double from it, much as if you had written
y=(double)x;

What is interesting is that it advances the pointer in the argument list by the size of a double, not a float, which should invalidate the second variable argument. But that is not happening, as we see in sabe's example: the second double prints OK. If you put an integer in place of the first float, you will invalidate your second double.
This is interesting because sizeof(int) and sizeof(float) are the same, 4 bytes.

So I think that C allocates 8 bytes on the stack for a float when passing it as a variable argument, and does the float->double conversion right at the function call.
What do you think?
Leon.
