ALIGN directive for inline ASM code...

Status
Not open for further replies.

juliak66ussd

Programmer
Feb 25, 2004
5
CA
Hi, everyone... first time here.
I am taking on a big project with a lot of parsing
and such, so I use inline ASM a lot. As far as I know, the
latest Pentium CPUs perform better when code
and data are aligned on a 32-byte boundary. Here is a code
sample with a loop aligned on a 16-byte boundary:
Code:
_asm {
 ALIGN 16
 NEXT:
 // ... some instructions here ...
 loop NEXT
}
That works OK. However, I need 32-byte alignment, and
when I specify it, the compiler gives the following:

error C2413: 'align' : illegal align size

My question is: is there any directive or command-line flag
that will fix this problem? And why is 16 fine, but 32 is not?
My guess is that the segment where the code is generated does not have
an alignment property compatible with 32 bytes. I
encountered this in pure ASM code. So, how can that be changed? Is it possible at all?
 
> so I use inline ASM a lot
I would suggest that you stop now and rewrite what you have in easy-to-read C++. In addition, unless you're a hardcore ASM programmer, the chances that you'll be able to make the best use of things like pipelining and instruction scheduling are small.

Then read this.
Optimization of Computer Programs in C

Trying to predict in advance where all the hotspots are going to be is unreliable. You're simply wasting effort trying to optimise things long before you know if they're even a problem. Not only that, you're making your debugging and testing much harder as well.

A parser sounds like it's going to be reading files a lot, so it could be I/O bound to start with. Carefully rewriting your "stringlower" function in ASM isn't going to help with that.
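For comparison, here is a rough sketch of what such a helper could look like in plain C++. The name string_lower is made up for illustration (the original "stringlower" is not shown), and it assumes single-byte characters in the default "C" locale.

```cpp
#include <algorithm>
#include <cctype>
#include <string>

// Hypothetical helper: returns a lowercase copy of `s`.
// Assumes single-byte characters; std::tolower follows the current C locale.
std::string string_lower(std::string s) {
    std::transform(s.begin(), s.end(), s.begin(),
                   [](unsigned char c) {
                       // Cast through unsigned char so negative char values
                       // are never passed to std::tolower.
                       return static_cast<char>(std::tolower(c));
                   });
    return s;
}
```

Something this small is easy to read and test, and a profile run will show whether it ever matters.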


The first point is key: "Choose a Better Algorithm". No amount of ASM will ever rescue you from a bubble sort compared to a decent implementation of quicksort.
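To put a rough number on that, the sketch below counts comparisons for a textbook bubble sort and for std::sort. Note that std::sort is used here as a stand-in for "a decent quicksort" (typical implementations are a quicksort hybrid, but that is an assumption about your library), and both function names are made up for illustration.

```cpp
#include <algorithm>
#include <cstddef>
#include <utility>
#include <vector>

// Textbook bubble sort with no early exit; returns the number of comparisons.
std::size_t bubble_sort_count(std::vector<int>& v) {
    std::size_t comparisons = 0;
    for (std::size_t i = 0; i + 1 < v.size(); ++i) {
        // After pass i, the last i elements are already in place.
        for (std::size_t j = 0; j + 1 < v.size() - i; ++j) {
            ++comparisons;
            if (v[j + 1] < v[j]) std::swap(v[j], v[j + 1]);
        }
    }
    return comparisons;
}

// std::sort with a counting comparator; returns the number of comparisons.
std::size_t std_sort_count(std::vector<int>& v) {
    std::size_t comparisons = 0;
    std::sort(v.begin(), v.end(), [&comparisons](int a, int b) {
        ++comparisons;
        return a < b;
    });
    return comparisons;
}
```

For n elements this bubble sort always makes n*(n-1)/2 comparisons, while std::sort is required to stay within O(n log n). No amount of instruction-level tuning of the inner loop closes that gap for large n.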

WHEN your program is complete, then you can run a profiled version of your program with a representative sample of your data. The profile will tell you exactly how many times each function was run, and how much time was spent in each one.

Look at how often each one is called. Is it in line with your expectations? If you're parsing a file, you may expect several functions to be called exactly once for each line in the input file.
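A real profiler does this for you, but if you just want to sanity-check call counts by hand, something like the sketch below would do. Everything in it (ProfileEntry, profile_table, ScopedTimer, parse_line) is a made-up name for illustration, and it is not thread-safe.

```cpp
#include <cctype>
#include <chrono>
#include <cstdint>
#include <map>
#include <string>

// Per-function totals: number of calls and accumulated wall-clock time.
struct ProfileEntry {
    std::uint64_t calls = 0;
    std::chrono::steady_clock::duration total{};
};

// Hypothetical global table keyed by function name.
std::map<std::string, ProfileEntry>& profile_table() {
    static std::map<std::string, ProfileEntry> table;
    return table;
}

// Records one call and its elapsed time when it goes out of scope.
class ScopedTimer {
public:
    explicit ScopedTimer(const char* name)
        : name_(name), start_(std::chrono::steady_clock::now()) {}
    ~ScopedTimer() {
        ProfileEntry& e = profile_table()[name_];
        ++e.calls;
        e.total += std::chrono::steady_clock::now() - start_;
    }
    ScopedTimer(const ScopedTimer&) = delete;
    ScopedTimer& operator=(const ScopedTimer&) = delete;
private:
    const char* name_;
    std::chrono::steady_clock::time_point start_;
};

// Hypothetical per-line routine: counts whitespace-separated tokens.
int parse_line(const std::string& line) {
    ScopedTimer timer("parse_line");
    int tokens = 0;
    bool in_token = false;
    for (unsigned char c : line) {
        if (std::isspace(c)) {
            in_token = false;
        } else if (!in_token) {
            in_token = true;
            ++tokens;
        }
    }
    return tokens;
}
```

After a run, profile_table()["parse_line"].calls should equal the number of input lines. If it is much higher, something is calling it more often than you think.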

Remember, the biggest optimisation is not calling the function in the first place. If you detect unnecessary work being done, that is where you'll get your big wins. I didn't get a 10x increase in performance (twice) by heading straight for the ASM manual in an old project I was involved in.
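One common way of "not calling the function" is to remember results you have already computed. The sketch below is only an illustration: expensive_hash stands in for whatever costly per-token work your parser does, and all names are made up. It also assumes the work costs more than a hash-table lookup, which you should measure, not guess.

```cpp
#include <cstddef>
#include <string>
#include <unordered_map>

// Counts how often the hypothetical expensive routine actually runs.
std::size_t g_expensive_calls = 0;

// Hypothetical stand-in for costly per-token work.
std::size_t expensive_hash(const std::string& token) {
    ++g_expensive_calls;
    std::size_t h = 0;
    for (unsigned char c : token) h = h * 31 + c;
    return h;
}

// Caches results so each distinct token is processed only once.
std::size_t cached_hash(const std::string& token) {
    static std::unordered_map<std::string, std::size_t> cache;
    auto it = cache.find(token);
    if (it != cache.end()) return it->second;  // already known: skip the work
    std::size_t h = expensive_hash(token);
    cache.emplace(token, h);
    return h;
}
```

If the input repeats the same few keywords thousands of times, the expensive routine runs once per distinct keyword, not once per occurrence.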

If everything looks OK, and it's still too slow, then start looking at how you can make certain things faster. There are an awful lot of things to try before resorting to assembler.
The important point is that you have a completed program to play with, so your changes can be carefully controlled and measured. Before you accept a change which improves things, consider whether the code is now less readable, harder to maintain and perhaps more prone to bugs.

--
 