#2576 - Dragon - Fri Feb 07, 2003 4:52 pm
Hi everyone.
I have decided to write an assembler for my college programming project. I have programmed a bit for the GBA, though I never much liked the assemblers I have used. Having some knowledge of processor instruction sets, and liking the idea of an efficient assembler for the GBA, has inspired me to design one.
Can anyone guide me along this process? I can't seem to find a very informative document on the ARM/Thumb instruction set that could aid me in programming the assembler. If anyone has any links that could help me out, I would deeply appreciate it.
Thanks.
#2577 - jeff - Fri Feb 07, 2003 4:59 pm
Re-Ejected has a few sheets for the ARM/THUMB instruction sets that specify what each opcode is -- these are what I used (and Goldroad assembler to check) for my assembler. Be aware you'll need his version 2 for THUMB, as version 1 has several errors on it. You could also use ARM's data sheet which is huge, but very accurate (of course). I forget all the web addresses, but I'm sure someone else has them handy -- and you can get them all at the DOCS section of gbadev.org.
Actually, aside from parsing ARM instructions (which isn't that hard, just more painful than something like a Z80), writing an assembler for the ARM is very easy indeed.
If you have never written an assembler before, I'd suggest reading up on multi-pass assemblers.
Good luck!
Jeff
#2639 - Dragon - Sun Feb 09, 2003 9:46 pm
I am pressed for time and haven't had much luck learning about these multi-pass assemblers. Could somebody please give me a few links that could help me out? Anything that can help me write an ARM assembler would really be appreciated. Thanks.
#2652 - pelrun - Mon Feb 10, 2003 1:20 am
The actual *assembly* job is easy - once you work out what the instruction on a particular line is, you just look it up in a table and dump out the appropriate opcode, along with whatever operands were given.
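A minimal sketch of that table lookup in Python, covering only the Thumb move/compare/add/subtract-immediate group. The names (IMM8_BASE, encode_imm8) are made up for illustration, and the bit layout (001, 2-bit op, 3-bit Rd, 8-bit immediate) is my reading of the Thumb format, so check it against the ARM data sheet before relying on it:

```python
# Base opcodes for the Thumb "mov/cmp/add/sub Rd, #imm8" group.
# Assumed layout: bits 15-13 = 001, bits 12-11 = op, bits 10-8 = Rd, bits 7-0 = imm8.
IMM8_BASE = {"mov": 0x2000, "cmp": 0x2800, "add": 0x3000, "sub": 0x3800}


def encode_imm8(line):
    """Encode one line such as 'mov r0, #1' into a 16-bit Thumb opcode."""
    mnemonic, operands = line.split(None, 1)
    rd_text, imm_text = [part.strip() for part in operands.split(",")]

    base = IMM8_BASE[mnemonic.lower()]        # the table lookup
    rd = int(rd_text.lower().lstrip("r"))     # 'r3' -> 3
    imm = int(imm_text.lstrip("#"), 0)        # '#0x10' -> 16, '#1' -> 1

    if not 0 <= rd <= 7:
        raise ValueError("this format only reaches r0-r7")
    if not 0 <= imm <= 255:
        raise ValueError("immediate must fit in 8 bits")

    # Drop the operands into their bit fields.
    return base | (rd << 8) | imm


print(hex(encode_imm8("mov r0, #1")))
```

A real assembler would need one such table per instruction format, but the pattern (look up the base opcode, then OR in the operand fields) stays the same.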
The multiple-pass thing is not strictly necessary for the task of converting a mnemonic to its opcode. What *does* require a second pass is when you have labels. And if you want to support macros then that needs a third pass.
How it works is this:
Pass 1: read through the source file and expand any macros found.
This may need multiple passes if you want your macros to nest...
Pass 2: "Assemble" the output from pass 1, but *don't* generate object code. Store all labels (and the current address you're "assembling to" when you encounter them) into a table.
You need to do this pass if you want to be able to use labels which are defined *after* the instruction that references them (a forward reference.) Otherwise, how are you supposed to know what address to substitute for the label when you haven't seen the label defined yet?
Pass 3: Assemble the output from pass 1 again, generating object code. When label references are encountered look up their address from the previously generated table.
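Passes 2 and 3 can be sketched in Python roughly like this. It is a toy, not a real assembler: it assumes the macros are already expanded, understands only "nop" and "b label", treats every instruction as 2 bytes (Thumb), and uses ';' for comments. All function names are invented for the example, and the branch encoding (0xE000 plus an 11-bit halfword offset relative to the instruction address + 4) is my understanding of the Thumb unconditional branch, so verify it against the data sheet:

```python
NOP = 0x46C0            # Thumb "mov r8, r8", commonly used as a no-op
INSTRUCTION_SIZE = 2    # every Thumb instruction handled here is 2 bytes


def split_label(line):
    """Return (label or None, instruction text) for one source line."""
    line = line.split(";", 1)[0].strip()      # strip ';' comments (assumed syntax)
    if ":" in line:
        label, rest = line.split(":", 1)
        return label.strip(), rest.strip()
    return None, line


def collect_labels(lines, origin=0):
    """Pass 2: walk the source, track the address, record labels, emit nothing."""
    labels = {}
    address = origin
    for line in lines:
        label, text = split_label(line)
        if label:
            labels[label] = address
        if text:
            address += INSTRUCTION_SIZE
    return labels


def emit(lines, labels, origin=0):
    """Pass 3: walk the source again, this time generating opcodes."""
    words = []
    address = origin
    for line in lines:
        _, text = split_label(line)
        if not text:
            continue
        parts = text.split()
        if parts[0] == "nop":
            words.append(NOP)
        elif parts[0] == "b":
            target = labels[parts[1]]          # known even for forward references
            offset = (target - (address + 4)) >> 1
            if not -1024 <= offset <= 1023:
                raise ValueError("branch target out of range")
            words.append(0xE000 | (offset & 0x7FF))
        else:
            raise ValueError("unknown mnemonic: " + parts[0])
        address += INSTRUCTION_SIZE
    return words


def assemble(source):
    lines = source.splitlines()
    return emit(lines, collect_labels(lines))


program = "start: b end\n       nop\nend:   b start"
print([hex(word) for word in assemble(program)])
```

The "b end" on the first line is the forward reference case: when pass 3 reaches it, "end" is already in the table because pass 2 saw the whole file first.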