
char - '0' == int?? 2

Status: Not open for further replies.

AtomicChip (Programmer) — May 15, 2001
Hey all,

Just looking for an explanation of how and why this works:

Code:
const char* foo = "123";
int n = foo[ 0 ] - '0'; // resulting in the int value 1

Now, I understand that C++ will do automatic type conversions and such, like in the following example:

Code:
const int n = 3;
const float m = 1.1;
const int l = n*m; // n*m is 3.3, truncated to 3

...but why does subtracting the ASCII value '0' from an ASCII char result in a proper int value??

Thanks in advance..

-----------------------------------------------
"The night sky over the planet Krikkit is the least interesting sight in the entire universe."
-Hitch Hiker's Guide To The Galaxy
 
If you look at an ASCII table, you'll see that '0' = 0x30 (or 48 in decimal). Thankfully when they invented the ASCII table, they grouped numbers and letters together in the proper order; so '1' is right after '0' and therefore has a value of 0x31 (or 49 in decimal).

'1' - '0' is the same as 49 - 48, which equals 1.
 
Code:
const char* foo = "123";
int n = foo[ 0 ] - '0'; // resulting in the int value 1
In char arithmetic, both operands are first converted to int (the so-called integral promotion). Through the foo pointer you get the first element of "123", which is '1' — see cpjust's explanation above. The initializer is therefore evaluated as int - int, and the resulting int value initializes n.
Be careful: in C++ the basic execution character set is not necessarily ASCII.
Code:
int n = '0';
In that case, the value of the char '0' depends on the basic character set of the C++ implementation. Moreover, plain char may be signed or unsigned (it's implementation-specific).
 
Thanks guys.. Makes sense.

-----------------------------------------------
"The night sky over the planet Krikkit is the least interesting sight in the entire universe."
-Hitch Hiker's Guide To The Galaxy
 
In that case char value '0' depends on basic character set of the C++ implementation

Can you name any "character set of the C++ implementation" other than ASCII where '0' has a different numerical value? Yeez... some people really see simple things in a hard way.

And for the answer to the first post — there is no such thing as a character type for a computer. Everything it understands is just numbers, and characters are just a way we interpret one number or another. What we see as the text 'AAA' is, to the computer, just 3 bytes with the value 65 (ASCII encoding, no terminating byte). Once you start looking at characters this way, you'll start understanding how some things really work (like tolower() being just a range check and a simple ch+32 operation).

------------------
When you do it, do it right.
 
Not all of us are gifted with detailed knowledge of what character sets will be implemented next year. ArkM does have a point. It never does any harm to understand the assumptions on which your work is based.
 
EBCDIC on IBM mainframes, for example.
 
like tolower() being just a range check and a simple ch+32 operation
This is true for ASCII, but not for things like EBCDIC. It would be even worse if you were using something like UTF-16...

So to be clear, when you want to convert characters to upper or lower case, never try to optimize your code by adding or subtracting 32 from the character value -- always use the library functions like toupper(). Otherwise, if you decide to port your program to another platform, you might be in big trouble.
 
Come on, don't be silly, guys. EBCDIC is a mainframe encoding — something maybe 0.01% of the world's programmers care about. Not sure what you use to code for those machines, but that's definitely not the C++ we're talking about here.
UTF-16 is a Unicode encoding, and for programming languages Unicode will always be ASCII-compatible, meaning the basic set (the first 128 characters) is ALWAYS in the same positions as ASCII.
And I'm not telling people to add 32 to make their letters lowercase, because some <noob> reading the code will spend 2 weeks trying to understand what that hardcoded +32 means. What I am saying is that people should try looking deeper into how programs actually work, instead of coming back later talking about "integral promotion" stuff...

------------------
When you do it, do it right.
 
> (like tolower() being just a range check and a simple ch+32 operation).
What about locales?
As well as being more readable, it will also do the right thing (to paraphrase your sig) when you're working in a foreign locale.

> Be careful: in C++ basic execution character set may be not only ASCII.
True, but this quote from draft C99 is pertinent (the C++ standards are equivalent on this point).
draft C99 said:
In both the source and execution basic character sets, the value of each character after 0 in the above list of decimal digits shall be one greater than the value of the previous.
So you can always assume that no matter what the underlying character set is, at least '0' to '9' will be consecutive to permit the [tt]ch - '0'[/tt] type of calculation. For everything else though, the appropriate API should be used.

--
 
And finally, to answer the initial question:

but why does subtracting a '0' ascii value from an ascii char result in a proper int value??

Because char in C/C++ is simply an 8-bit-wide int.

 
But see cpjust's post, the 2nd in this thread. The important thing isn't the 8-bitness of char; it's that char is a number, and the ASCII character set happens to list 0-9 consecutively in the correct order. It would work equally well on a hypothetical 19-bit processor if the designers had the sense to keep the digits in the right order.
 
I'd say it's not the 8-bitness but the int-ness of char — char is one of the integer types, like short and long.
 