Converting numeric ASCII string to a long integer.

denc4 · Feb 17, 2005

I have been trying to write this converter. I wrote one
for 16-bit values in the distant past, but now I am having
trouble writing one for 32-bit values. I use MASM 6.11 in
a DOS box under Win95.

At the end of the program I expect the return value in
dx:ax to be FFFF:FFFF. But the code returns FFFF:FFFD.
What happened to those two bytes? The Art of Assembly book
did not answer my question either. Any help would be
great.

I know this ain't the fastest code on earth, so any
optimization suggestions would also be more than welcome!

thx.

Code:

cseg segment                ;opens the segment closed by "cseg ends" below.
  assume cs:cseg,ds:cseg,ss:cseg
  org 100h            ;.com format.
  main proc near
    lea si,long+9
    call convert
    int 20h
  main endp

  long byte '4294967295'    ;0FFFFFFFFh
  power10 dword 1,10,100,1000,10000,100000,1000000, \
                10000000,100000000,1000000000

; Convert assumes end of buffer in si.
; It outputs the numeric value in dx:ax and si points at
; si - length(ASCIInumber) - 1.

  convert proc near
    push bx
    push cx
    std

    xor bx,bx
    push bx
    push bx
    mov bp,sp       ;use this dword for temporary storage.

  lp0:
    xor ax,ax
    lodsb
    cmp al,30h
    jb endproc
    cmp al,39h
    ja endproc
    sub al,30h                ;character is numeric ASCII.

    mov cx,ax                 ;save value.
    mul word ptr power10[bx]
    add [bp],ax
    mov ax,cx                 ;restore value.
    mov cx,dx                 ;save H.O. word.

    mul word ptr power10[bx+2]
    add ax,cx                 ;add in saved H.O. word.
    add [bp+2],ax

    add bx,4                  ;increment power10 offset.
    cmp bx,sizeof power10
    jnz lp0
    dec si                    ;si = offset(ASCIInum) - 2

  endproc:
    pop dx
    pop ax
    pop cx
    pop bx
    ret
  convert endp

cseg ends
end main

TessaBonting · Feb 18, 2005

    mov cx,ax                 ;save value.
    mul word ptr power10[bx]
    add [bp],ax               ;!!!! this add could generate a carry
    mov ax,cx                 ;restore value.
    mov cx,dx                 ;save H.O. word.

    mul word ptr power10[bx+2]
                              ;!!!! and is lost here
    add ax,cx                 ;add in saved H.O. word.
    add [bp+2],ax

    add bx,4                  ;increment power10 offset.

See the !!!! marks. Change your code to save the carry and
add everything at once.

For example:

    mov cx,ax                 ;save value.
    mul word ptr power10[bx]
    add [bp],ax
    pushf                     ;save CF, the next mul overwrites it.
    mov ax,cx                 ;restore value.
    mov cx,dx                 ;save H.O. word.

    mul word ptr power10[bx+2]
    popf                      ;restore CF from the add.
    adc ax,cx                 ;add in saved H.O. word plus carry.
    adc [bp+2],ax

Tessa
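
The lost carry can be reproduced outside the assembler. What follows is a minimal Python model of the inner loop, not the original routine: the names `convert_model` and `keep_carry` are invented for illustration, registers are modelled as integers masked to 16 bits, and the input is assumed to hold at most ten decimal digits.

```python
# 1, 10, ..., 1000000000: the same values as the power10 table.
POWER10 = [10 ** i for i in range(10)]

def convert_model(digits, keep_carry):
    """Model of the convert loop; returns (high_word, low_word)."""
    low = high = 0
    # The routine walks the string backwards, least significant digit first.
    for power, ch in zip(POWER10, reversed(digits)):
        d = ord(ch) - 0x30
        product = d * (power & 0xFFFF)       # first mul: result in dx:ax
        ax, dx = product & 0xFFFF, product >> 16
        total = low + ax                     # add [bp],ax
        carry = total >> 16                  # CF after that add
        low = total & 0xFFFF
        ax = (d * (power >> 16)) & 0xFFFF    # second mul, low word only
        if keep_carry:
            ax = (ax + dx + carry) & 0xFFFF  # popf / adc ax,cx
        else:
            ax = (ax + dx) & 0xFFFF          # add ax,cx: CF is gone
        high = (high + ax) & 0xFFFF          # accumulate into [bp+2]
    return high, low

print([hex(w) for w in convert_model("4294967295", keep_carry=False)])
print([hex(w) for w in convert_model("4294967295", keep_carry=True)])
```

Without the carry the model gives a high word of FFFDh and a low word of FFFFh: the low word wraps twice while the digits are summed, and both carries are dropped. As far as the listing shows, `pop dx` retrieves the word at [bp], which is the low word, so the reported FFFF:FFFD appears to have the low word in dx; swapping the two pops would put the high word in dx.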

denc4 · Feb 18, 2005

Thank you, that's what was missing. The code works.

Although I do not understand why the last ADC should be ADC,
and not ADD. Considering the power10 array values, the
ADC AX,CX can never overflow, or am I wrong?

lionelhill · Feb 18, 2005

If you don't mind assuming that your code will never run on a pre-386 processor (!), you could consider using the 32-bit registers for 32-bit arithmetic. They can be used in 16-bit assembly (provided you don't try 32-bit addressing. That isn't allowed). The only downside is that each such instruction needs an operand-size override prefix, which makes your code a byte longer every time you do it and also increases processing time slightly. Against that, you can do a 32-bit addition in one go.

TessaBonting · Feb 18, 2005

The second ADC is of importance.

65535 + 1 = 0000h 0000h if you omit the second ADC

now it will convert to 0001h 0000h as it should.
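
The worked numbers above can be checked with a small sketch. It models a 32-bit value as two 16-bit words and adds a 16-bit value to the low word, with and without passing the carry on; `add_to_dword` and `propagate_carry` are hypothetical names used only here.

```python
def add_to_dword(low, high, value, propagate_carry=True):
    """Add a 16-bit value to a 32-bit number held as two 16-bit words."""
    total = low + value
    carry = total >> 16                   # 1 if the low word wrapped around
    low = total & 0xFFFF
    if propagate_carry:
        high = (high + carry) & 0xFFFF    # what adc high,0 would do
    return high, low

print(add_to_dword(0xFFFF, 0x0000, 1, propagate_carry=False))  # (0, 0)
print(add_to_dword(0xFFFF, 0x0000, 1))                         # (1, 0)
```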

Dear "lionelhill", sometimes the best way to learn to
handle software problems is by just doing it from the base
up.
The use of EAX will do the job, but then "denc4" wouldn't
know what to do when converting a quad number. Now he
knows what can go wrong if you forget that a result
can overflow.

Greetings, Tessa

denc4 · Feb 19, 2005

I just found out the second ADC can just as easily be an ADD
instruction. Take a look at the following "power10 array to
hex" table:

1: 0000 0001
10: 0000 000A
100: 0000 0064
1000: 0000 03E8
10000: 0000 2710
100000: 0001 86A0
1000000: 000F 4240
10000000: 0098 9680
100000000: 05F5 E100
1000000000: 3B9A CA00

The highest value the left-most ASCII character in my "long"
buffer can contain is 4. When we arrive at the second mul
instruction in my code it will operate on the high word,
which is 3B9A. 3B9Ah * 4 = EE68h (61032t).

The largest high word the previous mul instruction can
leave in CX for that digit is 3: CA00h * 4 = 3 2800h.

So EE68h + 3 + CF is at most EE6Ch, and ADC AX,CX can NEVER
generate an overflow.

......criticism more than welcome.
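
One way to get the requested second opinion is to try every table entry with every digit. The sketch below (Python, with the invented helper name `overflowing_cases`) assumes the worst case, in which the preceding add into [bp] also set the carry flag, and lists each (exponent, digit) pair for which the second mul or the ADC AX,CX would not fit in 16 bits.

```python
POWER10 = [10 ** i for i in range(10)]

def overflowing_cases():
    """Return (exponent, digit) pairs where the high-word sum leaves 16 bits."""
    bad = []
    for exp, power in enumerate(POWER10):
        lo, hi = power & 0xFFFF, power >> 16
        for digit in range(10):
            dx = (digit * lo) >> 16       # high word of the first mul
            prod_hi = digit * hi          # second mul, before truncation
            # Worst case: the preceding add also set the carry flag.
            if prod_hi > 0xFFFF or prod_hi + dx + 1 > 0xFFFF:
                bad.append((exp, digit))
    return bad

print(overflowing_cases())
```

Only the 10^9 entry with a digit of 5 or more shows up, and such a number does not fit in 32 bits anyway. For every valid input ADC AX,CX therefore leaves CF clear, so the following ADC [bp+2],AX adds a carry of zero and behaves like ADD. The check covers that one step only; it says nothing about the final addition into [bp+2] for inputs above 4294967295.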

lionelhill:
32-bit programming sounds great.
Maybe it will be time to get into it when I am good enough
to figure out by myself that I forgot to add in the CF.

I want to try to understand 16 bit programming the most I
can before I get into 32bit programming (if ever).

lionelhill · Feb 21, 2005

Dear Both,

please don't think I'm criticising you for going for the pure 16-bit approach. You are absolutely right about the educational value - I was merely pointing out an alternative with which you might like to experiment.

You should certainly learn about carries and the difference between adc and add, because you'll still need that however big your processor. A 32 bit instruction can only add 64 bits if you understand how to handle the carry, and it's Exactly The Same as you're doing now!
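
That the pattern is independent of word size can be sketched in a few lines. This is an illustrative Python model, not processor code: `add_multiword` is a hypothetical name, and the numbers are lists of words with the least significant word first.

```python
def add_multiword(a, b, bits):
    """Add two equal-length lists of words, least significant word first."""
    mask = (1 << bits) - 1
    result, carry = [], 0
    for x, y in zip(a, b):
        total = x + y + carry     # add for the first word, adc for the rest
        result.append(total & mask)
        carry = total >> bits     # carry out of this word
    return result, carry

# 32-bit addition with 16-bit words, then 64-bit addition with 32-bit words.
print(add_multiword([0xFFFF, 0x0000], [1, 0], 16))
print(add_multiword([0xFFFFFFFF, 0x00000000], [1, 0], 32))
```

Both calls return the words [0, 1] with no final carry; only the `bits` argument differs.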

The real problem of 32 bit world is probably learning how to handle the addressing. The only thing I'd say now is, when learning in 16-bit world, try not to get into bad habits like using segment registers to contain data numbers (you probably haven't even thought of this. It's an evil and silly thing to do!). There are a few things like this that you can get away with in 16-bit world that will definitely cause problems if you go 32-bit.

denc4, I think you might appreciate Michael Abrash as an author. A long time ago he wrote various things on assembly programming, mostly with graphics in mind. Very good stuff.

If you want to experiment with the 32-bit registers, just insert .386 to let your assembler know it's allowed to burst into longer stuff, and then try out all your favourite instructions in their eax, ebx, etc. form.

denc4 · Feb 21, 2005

I did not think you were criticising me. You were right,
16 bit programming IS old. But since I am used to DEBUG.EXE
to debug my code, I am sort of reluctant to switch to 32bit.
I could use CodeView, but CodeView only supports .EXE files,
and since I don't know anything about .EXE files (header
layout, register initialization, how DOS loads 'em.. etc.
basically how to exploit them best), I do not want to switch
to .EXE, as of now.

Why not use a segment register to hold temporary values?
If you want to execute some code that does not use the es
register for instance, and yer out of registers to hold
values, why not push es on the stack and pop it later on?
It IS tricky, I know. But since memory operations are
slower, and a segment register is not used anyway, why not
use it? Just be bold.

Ah yes, Michael Abrash, he helped code Quake 1.. wow.
The book you are referring to is probably "Zen of Assembly
Language". I've got it. But I am lacking time. I will
definitely read it some time though. What's the use of
programming in assembly anyway if yer code is not optimal?
I wouldn't be surprised if some C compilers out there can
do a better job in generating machine code than me...

ontopic:
Can anybody confirm my findings in my previous post?
I would like a second opinion.

thx.

lionelhill · Feb 22, 2005

You can push and pop what you like on the stack, but my reason for urging caution about putting ordinary numbers (rather than segment addresses) in the segment registers is that in 32-bit world it won't work.
In 16-bit world it's fine because a segment number is simply an address shifted right 4 places (the processor shifts it left again and adds the offset). Any segment number can be valid. In 32-bit world a segment is an index into a table, and if you choose an index that doesn't exist, there will be an error. Therefore the segment registers in 32-bit world may only be used to hold the numbers of valid segments, not the number of dots you still need to draw before the end of the screen, or whatever other piece of data won't fit into the normal register set.
Besides which, there's not much to be gained by misuse of a seg register; pushing and popping is a pretty quick way to free up a register temporarily.
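
The 16-bit rule is short enough to write down. A minimal sketch in Python, assuming plain real-mode addressing and ignoring the wrap-around at 1 MB; `real_mode_linear` is a name made up for this example.

```python
def real_mode_linear(segment, offset):
    """Linear address of segment:offset in real mode."""
    return ((segment & 0xFFFF) << 4) + (offset & 0xFFFF)

print(hex(real_mode_linear(0x1234, 0x0010)))   # 0x12350
print(hex(real_mode_linear(0x1235, 0x0000)))   # 0x12350, the same byte
```

Every 16-bit value gives some address, which is why arbitrary data in a segment register goes unpunished in 16-bit code.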

A cheeky thing you can do in 16-bit code using the 32-bit instructions is to use bswap (486 and later), which reverses the byte order of a 32-bit register, turning Mac (big-endian) order into IBM (little-endian) order and vice versa. This provides a convenient way to move the bottom (working) 16 bits into the upper (out of the way) bits, from which they can be retrieved later in the same way. But beware, bswap isn't the fastest/most pairable instruction, so far as I know.
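
A rough model of the trick, in Python rather than assembly: `bswap` below is a stand-in for the instruction, and the register contents are made-up values.

```python
def bswap(value):
    """Reverse the four bytes of a 32-bit value, as the bswap instruction does."""
    return int.from_bytes((value & 0xFFFFFFFF).to_bytes(4, "little"), "big")

eax = 0x0000BEEF              # working value in the low 16 bits (ax)
parked = bswap(eax)           # value now sits byte-swapped in the upper half
print(hex(parked), hex(parked & 0xFFFF))
print(hex(bswap(parked)))     # a second bswap brings it back
```

Whatever was in the upper half before the first bswap lands, byte-swapped, in ax, so the upper half has to be free or expendable, and nothing that writes the full eax may run before the second bswap.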

I agree entirely about only using assembler where it's important, and therefore also needs to be right. Good attitude.
