Developing an assembler

Status
Not open for further replies.

borgrulze

Programmer
Apr 4, 2002
Hi,
Does anyone here have experience implementing an assembler? I need to implement an 8086 assembler as part of my college project; I have no choice.
Please send me any pointers or suggestions.
Bye
 
First:
Most of the lists of instruction opcodes you will find will be in hex. I suggest you convert them to binary so that you will see some special relationships between certain instructions. Better yet, find one with those relationships already shown in the binary code numbers. I have a manual for Masm v.5.00 and it contains that info.
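To show what the binary view buys you, here is a small sketch in Python (my choice of language, not the poster's). It uses four one-byte 8086 instruction families whose low three bits select the 16-bit register. The helper names `encode` and `show` are made up for this example.

```python
# 8086 16-bit register numbers, as stored in the low three bits of these opcodes.
REGS = ["ax", "cx", "dx", "bx", "sp", "bp", "si", "di"]

# Base opcodes of four one-byte instruction families (register 0 = ax).
BASES = {"inc": 0x40, "dec": 0x48, "push": 0x50, "pop": 0x58}

def encode(mnemonic, reg):
    """Return the single opcode byte for e.g. ('inc', 'bx')."""
    return BASES[mnemonic] | REGS.index(reg)

def show(mnemonic, reg):
    """Format one line: hex opcode, then binary split into operation and register bits."""
    op = encode(mnemonic, reg)
    bits = format(op, "08b")
    return f"{mnemonic} {reg}: {op:02X}h = {bits[:5]} {bits[5:]}"

if __name__ == "__main__":
    for m in BASES:
        for r in ("ax", "bx"):
            print(show(m, r))
```

In hex, 40h, 43h, 50h and 53h look unrelated. In binary the top five bits name the operation and the bottom three name the register, so one table entry per family is enough.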

Second:
I suggest you limit yourself to assembly into DOS *.com format.

Third:
Implement a memory-based database with the following characteristics:
Three fields:
Text Field (variable-length!!!)
Type Field (one byte will do)
Numeric Field (Four bytes, one dword)
VERY SMALL SIZE
SEARCH VIA TEXT FIELD
This database will contain the labels of the assembly file. Text Field will contain the name of the label, Type Field the type, Numeric Field the value.
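A minimal sketch of such a label database in Python, assuming a plain dictionary is acceptable for the text search. The names `Label`, `LabelTable`, `add` and `find` are hypothetical, and treating label names as case-insensitive is my assumption, not something the post requires.

```python
from collections import namedtuple

# One record: Text Field, Type Field, Numeric Field.
Label = namedtuple("Label", "name kind value")

class LabelTable:
    """In-memory label database, searched by the text field."""

    def __init__(self):
        self._by_name = {}

    def add(self, name, kind, value):
        key = name.lower()  # assumption: label names are case-insensitive
        if key in self._by_name:
            raise ValueError(f"duplicate label: {name}")
        if not 0 <= kind <= 0xFF:
            raise ValueError("type must fit in one byte")
        # Keep the numeric field within four bytes (one dword).
        self._by_name[key] = Label(name, kind, value & 0xFFFFFFFF)

    def find(self, name):
        """Return the Label record, or None if the name is unknown."""
        return self._by_name.get(name.lower())
```

Usage would look like `table.add("start", 1, 0x100)` followed by `table.find("start").value`. There is no delete method because labels are never removed once defined.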

Fourth:
Make sure you know what types your label might be. Is it an address?
address db 12 ;data address
address: ;code address

A numeric constant value?
constant EQU 32

A string constant value?
s_constant EQU <string>

REMEMBER:
mov ax,label

can assemble to two different things! It might mean, load the value of label into ax (since it was defined as EQU), or load the memory area pointed to by label into ax (since it was defined using db or dw or whatnot).

Assign each type you can think of with a separate value for the Type Field of your label database. Mind you, the label database NEEDS a variable length Text Field.
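Here is a rough Python sketch of how the Type Field can drive the two encodings of `mov ax,label`. The type codes `T_CODE`, `T_DATA`, `T_NUMBER`, `T_STRING` and the function name are invented for illustration. The opcodes are the standard 8086 ones: B8h for `mov ax, imm16` and A1h for `mov ax, [addr16]`.

```python
import struct

# Hypothetical type codes for the Type Field of the label database.
T_CODE, T_DATA, T_NUMBER, T_STRING = 1, 2, 3, 4

def assemble_mov_ax(label, table):
    """Encode 'mov ax,label'; table maps label name -> (type, value)."""
    kind, value = table[label]
    word = struct.pack("<H", value & 0xFFFF)  # 16-bit operand, low byte first
    if kind == T_NUMBER:
        return b"\xB8" + word   # mov ax, imm16: load the constant itself
    if kind == T_DATA:
        return b"\xA1" + word   # mov ax, [addr16]: load the word stored at the address
    raise ValueError(f"cannot use label '{label}' of type {kind} here")
```

With `constant EQU 32` the result is B8 20 00, while a data label at offset 0110h gives A1 10 01. Same source text, different machine code, decided purely by the type stored with the label.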

Fifth:
TIME. Give yourself that. Don't tell me you need it in a month; unless you cheat and use someone else's assembler, you can't do it that fast. Give yourself three months at the least, maybe more if you find it difficult.

Sixth:
Good Luck and Have Fun!

"Information has a tendency to be free. Which means someone will always tell you something you don't want to know."
 
Developing an assembler is certainly not an easy task. You need a thorough knowledge of assembly language, and you will also need the developer's manuals from Intel. I am not trying to discourage you, but I tried writing one myself and it ended up in the garbage. Maybe I'm just not good enough at it, but I strongly suggest you write one for DOS and NOT Windows. Do some research online and see where you can get an open-source assembler.
 
Here is my ten pence worth:

I would use free format (i.e. a text editor) for the source code.

Forget about labels to start with.

I would create a database of all the instruction opcodes.

Then I would implement a usage database. Basically, certain bits of each opcode change depending on usage, i.e. the difference between 'inc eax' and 'inc ebx', etc.

Each line is decoded from text into an opcode; then you know how to interpret the operand.
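The opcode and usage databases could be sketched like this in Python. The table layout, the usage names ("none", "reg16", "imm8") and the function names are my own invention. I use the 16-bit registers ax/bx because the target is the 8086; eax/ebx are 386 registers but share the same 3-bit register numbers. Only the short one-byte forms of inc/dec are covered.

```python
# Register numbers for the 16-bit registers.
REG16 = {"ax": 0, "cx": 1, "dx": 2, "bx": 3, "sp": 4, "bp": 5, "si": 6, "di": 7}

# Opcode database: mnemonic -> (base opcode, usage).
OPCODES = {
    "nop": (0x90, "none"),
    "ret": (0xC3, "none"),
    "inc": (0x40, "reg16"),   # register number goes in the low 3 bits
    "dec": (0x48, "reg16"),
    "int": (0xCD, "imm8"),    # opcode followed by one immediate byte
}

def parse_number(text):
    """Accept decimal ('33') or assembler-style hex ('21h')."""
    text = text.lower()
    return int(text[:-1], 16) if text.endswith("h") else int(text)

def assemble_line(line):
    """Turn one source line (no labels yet) into machine code bytes."""
    words = line.split(";", 1)[0].split()   # drop the comment, split on spaces
    if not words:
        return b""
    base, usage = OPCODES[words[0].lower()]
    operand = words[1].lower() if len(words) > 1 else None
    if usage == "none":
        return bytes([base])
    if usage == "reg16":
        return bytes([base | REG16[operand]])
    if usage == "imm8":
        return bytes([base, parse_number(operand) & 0xFF])
    raise ValueError(f"unknown usage: {usage}")
```

For example, `assemble_line("inc bx")` gives the single byte 43h and `assemble_line("int 21h")` gives CD 21. Adding an instruction with the same usage pattern is then just one more table row.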

Must go, more later.
0000:0000:0000:0000h
The people who have nothing to say and say it too loud have little knowledge, It's the quiet ones you need to worry about!
 
I'm back,

The operand field should be quite easy, as it is a number of either 8, 16, or 32 bits.

Once you have this working you can then concentrate on the labels. All a label does is reference a point within the program. Usually an assembler makes several passes over the source code. The first pass produces the opcodes and calculates the number of bytes each instruction uses, which allows it to resolve all labels, i.e. give each one a number (its offset address from the start). On the last pass the assembler can calculate the exact value it needs to put into each operand by using the label values and performing adjustments.

Let's recap:
1st pass: the assembler determines the opcode and size of each instruction, building the basic results and determining the program size.
2nd pass: the assembler calculates the offset address for all labels defined.
3rd pass: the assembler places the calculated value into every operand where a label has been referenced.
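One common case of "performing adjustments" on a label value is a relative jump, where the operand is not the label's offset but its distance from the end of the jump instruction. A hedged Python sketch, with the function name `encode_jmp` made up for this example:

```python
import struct

def encode_jmp(address, target):
    """Encode 'jmp target' for a jump instruction located at 'address'."""
    # Short form: EBh + signed 8-bit displacement, 2 bytes in total.
    short = target - (address + 2)      # counted from the next instruction
    if -128 <= short <= 127:
        return b"\xEB" + struct.pack("<b", short)
    # Near form: E9h + 16-bit displacement, 3 bytes in total.
    near = (target - (address + 3)) & 0xFFFF   # wraps around within the 64K segment
    return b"\xE9" + struct.pack("<H", near)
```

A jump to itself at offset 100h comes out as EB FE (displacement -2). Note that choosing between the 2-byte and 3-byte form changes the instruction size, which in turn moves every label after it. That is one reason assemblers fix the sizes in an early pass and only fill in the values later.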

Obviously there are many more aspects to creating an assembler, but this should get you going for now.

TaTar
Straiph.
 
You can put things into two simple passes:
First pass - Assemble, but if you encounter an unknown label, make assumptions. Do NOT generate code. In the first pass all you're interested in is to get all the labels. This means you want to keep your program counter ($) updated for 'label:' and 'label proc' type labels. E.g.
mov al,label1 ;leave one byte for label1 to go into.
jmp label2 ;leave two bytes and assume it's a near jump

Second pass - assemble for real. Generate code. If you encounter an unknown label here, tell the user he is stupid. No, wait, tell him there's an error instead. Since in the first pass you already made sure the labels are there, there should be no more unknown labels.
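The two passes can be sketched in Python for a deliberately tiny instruction set (nop, ret, and a near jmp that is always assumed to take 3 bytes). The function names are hypothetical, and the starting offset of 100h assumes DOS .com output, where the program is loaded at that offset.

```python
import struct

ORIGIN = 0x100                            # .com programs start at offset 100h
SIZES = {"nop": 1, "ret": 1, "jmp": 3}    # jmp: always assume a near jump

def split_line(line):
    """Return (label, mnemonic, operand); any of them may be None."""
    text = line.split(";", 1)[0].strip()
    label = None
    if ":" in text:
        label, text = text.split(":", 1)
        label = label.strip().lower()
    parts = text.split()
    mnemonic = parts[0].lower() if parts else None
    operand = parts[1].lower() if len(parts) > 1 else None
    return label, mnemonic, operand

def assemble(source):
    lines = [split_line(l) for l in source.splitlines()]
    # Pass 1: no code generated, only track the program counter and collect labels.
    labels, pc = {}, ORIGIN
    for label, mnemonic, _ in lines:
        if label:
            labels[label] = pc
        if mnemonic:
            pc += SIZES[mnemonic]
    # Pass 2: generate code; an unknown label is now a real error.
    out, pc = bytearray(), ORIGIN
    for _, mnemonic, operand in lines:
        if mnemonic is None:
            continue
        if mnemonic == "nop":
            out.append(0x90)
        elif mnemonic == "ret":
            out.append(0xC3)
        elif mnemonic == "jmp":
            if operand not in labels:
                raise ValueError(f"undefined label: {operand}")
            rel = (labels[operand] - (pc + 3)) & 0xFFFF
            out += b"\xE9" + struct.pack("<H", rel)
        pc += SIZES[mnemonic]
    return bytes(out)
```

Feeding it `start: jmp done`, `nop`, `done: ret` produces E9 01 00 90 C3. The forward reference to `done` is harmless because pass 1 already recorded it at offset 104h.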

More on the label database:
Label database requires simply a record-ADDING routine. No delete routine is required because we assume labels cannot be removed. This allows us to use variable sized records more easily. Each record can be something like this:
byte 0-3: value
byte 4: type
byte 5: label length
byte 6-n: label itself
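The record layout above maps directly onto Python's `struct` module. This is only a sketch: the function names are invented, the value is assumed to be stored little-endian (as the 8086 itself would), and label names are assumed to be plain ASCII.

```python
import struct

# Fixed part of a record: value (4 bytes), type (1 byte), label length (1 byte).
HEADER = struct.Struct("<IBB")

def add_record(db, name, kind, value):
    """Append one variable-length record to the bytearray 'db'."""
    raw = name.encode("ascii")
    if len(raw) > 255:
        raise ValueError("label length must fit in one byte")
    db.extend(HEADER.pack(value & 0xFFFFFFFF, kind, len(raw)) + raw)

def find_record(db, name):
    """Walk the records from the start; return (type, value) or None."""
    wanted = name.encode("ascii")
    pos = 0
    while pos < len(db):
        value, kind, length = HEADER.unpack_from(db, pos)
        start = pos + HEADER.size
        if db[start:start + length] == wanted:
            return kind, value
        pos = start + length          # skip to the next record
    return None
```

Because records are only ever appended, the length byte is all that is needed to step from one record to the next, and a label such as `start` costs just 11 bytes.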


 
We need 8086 assembly language source code... it's very, very important for us. If you want to help us, send projects to

experiments_lab@virgilio.it
or
paolomilanesi@libero.it


THANK YOU!!!! ;-))))
 