#6137 - torne - Sun May 18, 2003 2:29 am
OK; it's time for me to start thinking about my final year project (I'm a BA Computer Science undergraduate at Cambridge), and I'm intending to write a so-called 'smart assembler', initially targeting the ARM processor. Coincidentally, I might just happen to implement stuff such that it can be used for GBA development; wouldn't that be handy. =)
This is not intended to be a competitor to DevKitAdvance or HAM.. it should end up just being a drop-in replacement for GNU as or GoldRoad, and will still require a linker, libraries .. etc.
Plans currently include:
* Syntax modelled on, or just copied directly from, GNU as, because it's nicely standard
* Ability to declare functions, rather than just jump entry points; this means the assembler can handle the calling semantics for you. (saving/loading registers as required by call standard; and not doing so when not required, for optimisation; also making interworking work without any effort)
* Ability to use 'register variables' - symbolic names which are bound to available registers at assemble-time. Means you get to use sane names for values, and not worry about which register they go into. This cooperates with the above to let you use named parameters/return values.
* The ability to declare local variables and have stack space allocated for them such that you can refer to them with symbolic names (and have SP-relative offsets produced for you).
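To make the 'register variables' item in the list above a bit more concrete, here is a minimal sketch in Python of binding symbolic names to registers at assemble time. It is only an illustration: the names `bind_registers` and `substitute`, the r4-r11 pool, and the decision to raise an error instead of spilling are all assumptions, not part of the planned design.

```python
# Hypothetical sketch: bind symbolic names to free ARM registers at assemble time.

# Pool of registers the sketch may hand out (an assumption; a real assembler
# would derive this from the calling standard and from liveness information).
FREE_POOL = ["r4", "r5", "r6", "r7", "r8", "r9", "r10", "r11"]


def bind_registers(names, reserved=()):
    """Map each symbolic name to a distinct register not listed in `reserved`."""
    available = [r for r in FREE_POOL if r not in reserved]
    if len(names) > len(available):
        raise ValueError("not enough free registers; some names would need spilling")
    return dict(zip(names, available))


def substitute(line, binding):
    """Replace symbolic operands in one instruction with their bound registers."""
    mnemonic, _, operands = line.strip().partition(" ")
    ops = [binding.get(op.strip(), op.strip()) for op in operands.split(",")]
    return f"{mnemonic} {', '.join(ops)}"


binding = bind_registers(["count", "total"], reserved=["r4"])
print(substitute("add total, total, count", binding))  # add r6, r6, r5
```

The point of the sketch is only that the programmer writes `total` and `count` and never has to pick r5 or r6 by hand.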
What I'm posting for is, well, does anyone who codes in ASM have suggestions for things that would be useful? Anything GNU as can already do, I'll try to clone as much as possible; I don't use GoldRoad so if there's anything special it can do, let me know about it.
I don't intend to make it TOO smart. Any optimisations outside the scope of the above are probably too much. What I'm mostly interested in is things that are mechanical but necessary, like choosing a free register for some arbitrary temp value, and junk like that which we can all do without. =)
Suggestions specific to GBA development will be considered, but suggestions applicable to all ARM coding are more interesting.
This is a long-ish term project, since it's for my course, and I won't be able to share it with you until the end (it will be released under the GPL but not until after it's been marked this time next year, due to regulations about project submission). Implementation isn't going to start until about October. Right now I'm just fishing for fun things to talk to my project supervisor about, and to try and work into some kind of design. I do fully intend to continue to maintain and extend it after my project is over, btw.
Thanks in advance for any suggestions, thoughts, or words of encouragement;
Torne
assembler-lover and proud
#6150 - arundel - Sun May 18, 2003 11:31 am
Very nice. I'm currently using goldroad, but since it seems the development has been stopped it would be nice to have another standalone assembler.
You might want to think about letting the user switch between the 2 syntax models (at&t and intel), which would be very cool. Interworking would be nice, too. I don't know if you'll have the time to implement different endianness models (I think the AGB only uses little-endian).
Plus all those commands which come in handy, like:
Code: |
incbin and include
textarea and org
dup
fsize (padding the obj file - goldroad uses 0xff)
echo and lineo (just some (line)echoing for pseudo debugging)
macros (oh yeahh...) |
...and of course define. It would be nice if define would work for memory locations and internal assembler vars like the registers or even all the commands. Since I'm programming in pure assembly it would be nice if you could add some switches to the assembler for header manipulation (nintendo logo, crc check, multiboot support,...).
Since I'm also studying CS (2nd semester - algorithms and data structures) this is really exciting for me to witness. Keep going and don't worry about the lack of a few details. I think everybody would appreciate a new assembler which will be updated regularly and whose developer is staying in contact with the users.
You might want to check out Martin Korth's GBA development suite also, because it's got a pretty nice assembler built inside.
Good luck.
_________________
http://www.nausicaa.net
#6153 - torne - Sun May 18, 2003 1:48 pm
I don't like GoldRoad's syntax, but it would be trivial to implement it, really; just an alternate parser grammar. Interworking is easy to implement. I intend to write the assembler in such a fashion that it can generate code for any platform at all, regardless of instruction set or endianness, but during the project I will only be implementing it for little-endian ARM.
As for the others:
incbin is silly. Use the linker.
include is a preprocessor function. Depends whether I use a preprocessor or not.
textarea/org/dup.. what do these do?
padding object files: No chance. I will be outputting ELF object files. GoldRoad's method of directly outputting a ROM is absurd and you should be using a linker and loader to do that.
What's echo supposed to do?
Macros.. maybe.
define, as a preprocessor instruction, already works for absolutely anything. If Goldroad can't do this, then it sucks. =)
No header manipulation; headers should not be inserted by an assembler.
As far as a 'conventional' assembler goes, GNU as is already, well, finished. I haven't heard of any desirable features in GoldRoad yet. I really just want smart register assignment and such. =)
Martin Korth's assembler doesn't really have anything new or exciting in it, either.
T.
#6155 - arundel - Sun May 18, 2003 3:15 pm
torne wrote: |
textarea/org/dup.. what do these do? |
textarea: define where the code will execute from (ROM, IWRAM, EXWRAM)
org: increase the address. same as add lx,lx,0x** (increase PC)
dup: dcb 0xff,0xff,0xff,0xff = dup 0x04 0xff
torne wrote: |
What's echo supposed to do? |
echo: The assembler echoes this line if it assembles it, e.g. echo add r1,r1,r1 would output add r1,r1,r1 to the output when the command is being assembled.
lineo: If the assembler comes across this it outputs the current line.
Keep going.
#6158 - torne - Sun May 18, 2003 4:10 pm
textarea, then, is just a limited section declaration; org is a skip directive, dup is a fill directive. I'll read the GoldRoad docs sometime. In general, though, GoldRoad seems flawed in that it is targeting a single platform, rather than being a more general tool.
echo/lineno I can't think of a use for but are trivial anyway.
When I asked for features, I meant, well, *features*, not just random directives. ..
T.
#6164 - tom - Sun May 18, 2003 6:35 pm
incbin is not silly. it can come in very handy. actually it's the only thing that's cool about goldroad =)
(as a side note i would advise anybody not to use goldroad. it's buggy)
how about support for structures, like in tasm/masm ?
sane mechanism for local labels like in nasm, like this:
globallabel1:
.locallabel:
globallabel2:
.locallabel:
local labels start with a dot, and their scope reaches between the two nearest global labels.
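The scoping rule described above can be sketched in a few lines of Python. This is a simplified model, not NASM's implementation: the function name `qualify_labels` and the one-label-per-line grammar are assumptions made for the example. It prefixes each dot-label with the most recent global label, which is why the same local name can appear twice.

```python
# Hypothetical sketch of NASM-style local label scoping.

def qualify_labels(lines):
    """Return {qualified_name: line_index} for a list of source lines.

    A label starting with '.' is treated as local and is prefixed with the
    most recent global label, so the same local name can be reused.
    """
    table = {}
    current_global = None
    for index, line in enumerate(lines):
        text = line.strip()
        if not text.endswith(":"):
            continue  # not a label definition in this simplified grammar
        name = text[:-1]
        if name.startswith("."):
            if current_global is None:
                raise ValueError(f"local label {name} has no enclosing global label")
            name = current_global + name
        else:
            current_global = name
        if name in table:
            raise ValueError(f"duplicate label {name}")
        table[name] = index
    return table


source = ["globallabel1:", ".locallabel:", "globallabel2:", ".locallabel:"]
labels = qualify_labels(source)
print(labels)
```

With the four lines from the post, the two `.locallabel` definitions end up as `globallabel1.locallabel` and `globallabel2.locallabel`, so they do not clash.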
#6168 - arundel - Sun May 18, 2003 8:01 pm
What I expect from a good assembler are the really basic features. It should be fast, it should produce solid code and it shouldn't be too complicated, meaning it should not introduce too many syntax changes. I'm not a fan of High Level Assembly (HLA), because I don't quite see the sense in turning a 2nd generation programming language into a 3rd generation language. We've got some excellent languages like C(++/#), Java and Delphi already.
A nice feature would be a complete register push and the complete pop of course. Maybe an internal macro for interrupts (sti/cli), but that can be done by hand very quickly. And you might want to think about porting the assembler to several OSes. w32, linux, *bsd at least. Did you say you wanted to release the src?
I believe the best ideas you can come up with are hidden somewhere in the nasm manuals. Nasm is an outstanding assembler and showed everybody that, even though assembler was supposed to be dead, it can compete with tasm and masm.
#6172 - tepples - Sun May 18, 2003 9:16 pm
arundel wrote: |
I'm not a fan of High Level Assembly (HLA), because I don't quite see the sense in turning a 2nd generation programming language into a 3rd generation language. We've got some excellent languages like C(++/#), Java and Delphi already. |
The point of turning assembly language into a higher-level language is to have some of the advantages of a C-level language available and instructions such as umull/smull (32x32=64 multiplication) and ldm/stm (multi-word data transfers) available inline. (I've read that GCC's inline asm support imposes a lot of overhead at the beginning and end of each inline asm block.)
Either that, or go the other way and make C--, the portable assembly language.
Quote: |
A nice feature would be a complete register push and the complete pop of course. |
Many assemblers do have 'prolog' and 'epilog' pseudo-operations to set up and tear down stack frames, push and pop specified registers, etc.
_________________
-- Where is he?
-- Who?
-- You know, the human.
-- I think he moved to Tilwick.
#6173 - torne - Sun May 18, 2003 10:17 pm
Quote: |
incbin is not silly. it can come in very handy. |
Uhm.. for what? I can't think of any reason to want to include binary data in an assembler source file.
Structures already exist in GNU as, as do local labels.
arundel, I have no intention of making it into an HLA. Automatic register binding and the other things I discussed just save you making arbitrary decisions and unnecessary typing. =)
A complete register push is rarely necessary, and my assembler will just do it anyway even if it is. It will run on anything already; I develop in Java. (comments about this will be ignored). I will be releasing the source under the GPL and continuing to maintain the project.
I'll poke at nasm a bit, but I am not really after 'traditional' features on the whole; I can get enough of those to fill my project time just from GNU as, and on the whole, implementing directives to do this and that, and macros for this and that, is trivial and therefore not interesting. I am interested in intelligent features; having the assembler do things that have previously required human decision-making.
Tepples, I don't know what you mean about having mull/ldm/stm available 'inline'.. surely you can just use them anyway.
What you mention about setting up stack frames I already intend to provide, but not with pseudo-ops; I intend to have function declarations. =)
T.
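As a rough illustration of the named-locals idea from the first post (stack space allocated per function, with SP-relative offsets produced for you), here is a small Python sketch. The helper names `layout_locals` and `load_local`, the 4-byte slot alignment, and the layout order are assumptions for the example only.

```python
# Hypothetical sketch: lay out named locals in a stack frame and emit
# SP-relative accesses for them.

def layout_locals(locals_with_sizes, alignment=4):
    """Assign each local an offset from SP; return (offsets, frame_size)."""
    offsets = {}
    cursor = 0
    for name, size in locals_with_sizes:
        # Round the cursor up so every slot starts on an aligned boundary.
        cursor = (cursor + alignment - 1) // alignment * alignment
        offsets[name] = cursor
        cursor += size
    frame_size = (cursor + alignment - 1) // alignment * alignment
    return offsets, frame_size


def load_local(register, name, offsets):
    """Emit an ARM load of a named local using an SP-relative address."""
    return f"ldr {register}, [sp, #{offsets[name]}]"


offsets, frame_size = layout_locals([("counter", 4), ("flag", 1), ("buffer", 16)])
print(f"sub sp, sp, #{frame_size}")         # reserve the frame on entry
print(load_local("r0", "buffer", offsets))  # ldr r0, [sp, #8]
print(f"add sp, sp, #{frame_size}")         # release it on exit
```

Here the three locals take a 24-byte frame, and the source only ever mentions `buffer`, never the offset 8.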
#6175 - arundel - Sun May 18, 2003 10:34 pm
Well...I'm currently out of ideas. It would be very nice, if you (after the concept phase) let us somehow take part in the development.
I'd really appreciate it if you would regularly update a diary or .plan to let us know what you're doing.
Good luck.
btw.: Java rules...sometimes. ;)
#6176 - Dracula-X - Sun May 18, 2003 11:59 pm
You can't think of a reason to include binary data in an assembler source file? Ask any game developer who's been in the business for twenty years to explain to you why INCBIN is NOT silly... Out of most features this one will be the easiest for you to implement. It simply makes life easy to include assets. Period. It's not much work going the linker route but incbin is easier and faster to have at your disposal.
Professional outfits like SN Systems have been modifying gnu as for various platforms (gba, cube, etc) to include features like incbin for developers because it has been demanded of them...
-DX
#6179 - tepples - Mon May 19, 2003 2:54 am
Dracula-X wrote: |
Ask any game developer who's been in the business for twenty years to explain to you why INCBIN is NOT silly... Out of most features this one will be the easiest for you to implement. It simply makes life easy to include assets. Period. It's not much work going the linker route but incbin is easier and faster to have at your disposal. |
I agree that incbin can be useful, but using incbin for assets that change often during development requires a recompile and relink every time you change an asset, which means the artist needs to have a GCC toolchain installed on his or her machine. I'm working on an extension to GBFS that allows for inserted assets in addition to appended assets, and it may even allow for inserting assets into a .elf file to preserve debuggability. (Or does .elf carry a CRC that ld checks?)
#6184 - Dracula-X - Mon May 19, 2003 4:56 am
Quote: |
I agree that incbin can be useful, but using incbin for assets that change often during development requires a recompile and relink every time you change an asset, which means the artist needs to have a GCC toolchain installed on his or her machine. |
A coder could very easily knock up a simple engine and a tool to insert these assets so the artist could check them out. That was never a serious problem. Certainly not to discount your efforts, I agree with you - these days projects are more complex and would benefit from solutions like GBFS. I'd still use incbin to include assets that we're comfortable they wouldn't change much, if at all (title screens, music, etc). I'd still use incbin to include data tables for example, generated by my own tools, rather than pissing around either making my tool spit out ascii values to include in my c files or converting them with bin2c type tools. It's easier + less code to export binary files from my own tools and it takes all of 2 seconds to write:
levelmap: incbin "l33tmap.bin"
and I'm done with it.
Quote: |
I'm working on an extension to GBFS that allows for inserted assets in addition to appended assets, and it may even allow for inserting assets into a .elf file to preserve debuggability. (Or does .elf carry a CRC that ld checks?) |
Good question. Sorry, I'm not sure if .elf files are crc'd, but I'd be quite surprised if they are.
-DX
#6186 - sgeos - Mon May 19, 2003 5:55 am
tepples wrote: |
I've read that GCC's inline asm support imposes a lot of overhead at the beginning and end of each inline asm block.
|
I used this in my rotation/scaling demo:
Code: |
mov r0,%0
mov r1,%1
mov r2,#1
swi 0x0E0000 |
The compiler used r12 for %0, something else for %1. It pushed a bunch of registers onto the stack before executing the asm block, and restored them afterwards. (If I remember correctly.)
-Brendan
#6191 - torne - Mon May 19, 2003 9:27 am
Arundel, sure; I'll try and keep you updated on my site or something. I can't share any code until after the project's been marked, though, but I can show you examples of what the syntax will look like and stuff. Oh, and yes, Java rules sometimes. =)
dracula-x: uhm, well, I've never needed to use it in a few years programming in assembler; I find using the linker much easier. It doesn't matter; you are right that it is trivial. More importantly, features like that I would rather leave until after the project's over; they aren't going to give me anything to write about, just more code to write and test; so for now, they aren't 'important'. Basically, as long as it works a little bit, such that I can demonstrate and write about my more innovative functionality (the 'smart' bits), that'll do for the project; then once I release it, I can add all the trivial things like that =)
tepples: elf has no checksums, but it's, well, a complicated format. One of the things I will have to implement will be a library for manipulating ELF files, though of course I can't release that to you for a year either.
dracula-x, your second post: You don't need to mess around with ascii values or bin2c or anything; a sensible linker can just include a raw binary file directly into the output, and will automatically create associated symbols for the start, end and length. Total code required: 0. Just give the file a name that reflects its target section and add it to the linker's input, which is probably LESS typing than your incbin. =)
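For reference, the automatically created symbols can be predicted from the file name. The sketch below follows the naming that GNU objcopy and ld use for raw binary inputs as far as I know it (`_binary_<name>_start`, `_end`, `_size`, with non-alphanumeric characters turned into underscores); the post does not say which linker is meant, so treat that as an assumption. `binary_symbols` is a made-up helper name.

```python
# Hypothetical sketch: predict the symbols a GNU-style toolchain creates
# when a raw binary file is pulled into the link.
import re


def binary_symbols(path):
    """Return the (start, end, size) symbol names for a raw binary input."""
    # Every character that is not a letter or digit becomes an underscore.
    mangled = re.sub(r"[^A-Za-z0-9]", "_", path)
    return tuple(f"_binary_{mangled}_{suffix}" for suffix in ("start", "end", "size"))


print(binary_symbols("gfx.bin"))
```

Note that the mangled name is built from the path as given to the tool, so a file in a subdirectory gets a longer symbol.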
sgeos: GCC's inline assembler does slightly odd things sometimes. It doesn't try to be very smart, I don't think; I wouldn't be surprised if it pushed/popped more than it needed to. What my 'smart' assembler should eliminate is people wanting to write C code like:
int somefunction(int x, int y, int z)
{
    asm (" stuff ");
}
i.e. using GCC to generate function pre/postamble, as the assembler will be able to take a similar function declaration and do the same or better code generation than GCC.
Torne
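A minimal sketch of what generating the pre/postamble from a function declaration could involve, written in Python for illustration. It assumes the usual ARM procedure call convention in which r4-r11 are callee-saved and lr must be preserved across nested calls; `frame_code` and its arguments are hypothetical names, and a real assembler would work out the clobbered set itself.

```python
# Hypothetical sketch: derive a function's prologue and epilogue from the
# registers its body modifies, instead of writing them by hand.

CALLEE_SAVED = [f"r{n}" for n in range(4, 12)]  # r4-r11 must be preserved


def frame_code(clobbered, makes_calls):
    """Return (prologue, epilogue) as lists of ARM instructions."""
    saved = [r for r in CALLEE_SAVED if r in clobbered]
    if makes_calls:
        saved.append("lr")  # a nested call overwrites the return address
    if not saved:
        return [], ["bx lr"]  # leaf function touching only scratch registers
    reglist = ", ".join(saved)
    prologue = [f"stmfd sp!, {{{reglist}}}"]
    # Restore, then return with bx so ARM/Thumb interworking keeps working.
    epilogue = [f"ldmfd sp!, {{{reglist}}}", "bx lr"]
    return prologue, epilogue


prologue, epilogue = frame_code({"r0", "r4", "r6"}, makes_calls=True)
print(prologue)  # ['stmfd sp!, {r4, r6, lr}']
print(epilogue)  # ['ldmfd sp!, {r4, r6, lr}', 'bx lr']
```

A leaf function that only touches r0-r3 gets no prologue at all, which is the "not doing so when not required" case from the first post.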
#6200 - Dracula-X - Mon May 19, 2003 4:05 pm
torne wrote: |
dracula-x: uhm, well, I've never needed to use it in a few years programming in assembler; I find using the linker much easier. It doesn't matter; you are right that it is trivial. More importantly, features like that I would rather leave until after the project's over; they aren't going to give me anything to write about, just more code to write and test; so for now, they aren't 'important'. Basically, as long as it works a little bit, such that I can demonstrate and write about my more innovative functionality (the 'smart' bits), that'll do for the project; then once I release it, I can add all the trivial things like that =) |
I understand. I'm just defending the relevance of that feature. You've posted about your project in a forum consisting mostly of those who aspire to or currently write games. As someone who has written assembler since the 80's I can tell you I've used incbin extensively, and so has everyone I know :)
Quote: |
dracula-x, your second post: You don't need to mess around with ascii values or bin2c or anything; a sensible linker can just include a raw binary file directly into the output, and will automatically create associated symbols for the start, end and length. Total code required: 0. Just give the file a name that reflects its target section and add it to the linker's input, which is probably LESS typing than your incbin. =) |
As I said before, it's not much work going the linker route, I'm doing as such in my current project. I'll probably disagree that it's less typing tho :)
And why is it that of the many projects I've seen for the gba, people are still converting files with typical bin2c tools when using the linker is allegedly simple? Pissing around with a linker means I have to add these filenames in a makefile, make sure I know how to use objcopy, fudge them into .rodata sections with another tiny linkscript (with my current environment, easily fixed if I get off my ass and modify my linkscript), a few other things, and some more makefile wizardry to export these symbols to a header file to make my life a bit easier. It still forces me to become more intimate with these tools than I need to. It is still less typing to do this once in an assembler file:
symbol: incbin "gfx.bin"
symbolend:
than to do this for this very same file going the linker route, and the programmer can expect to not worry about the asset, the symbol, and the size of the asset. Being left with '_binary_symbol_bin_start[];' is barbaric.
A good assembler will just leave me with 'symbol' using incbin.
There's even more situations and tricks I can describe where I'd prefer to have incbin handy, but I feel I've gone on about this already long enough and I'm sure ppl are bored :) Regardless, I wish you luck with your project, I'd love to see a good replacement for 'as'...
-DX
#6217 - torne - Tue May 20, 2003 1:35 am
I'll stop arguing this back and forth; I can still get my linker to do it for me in less than one line, and get convenient symbol names. And, as I said, it's just trivial.
Any more smart stuff that people would like to see? Even if it seems unfeasible to implement? =)
Torne
#6223 - sgeos - Tue May 20, 2003 3:53 am
torne wrote: |
Any more smart stuff that people would like to see? Even if it seems unfeasible to implement? =) |
I want the compiler to speak to me. Telepathically, so it doesn't bother anyone. =P j/k
Code blocks that can be relocated without me having to add those details every time. I think all it would have to do is load the pc into a register and treat branches and load instructions relative to that. It might need to know what to treat relative to the start of the code block.
It would be really neat to make 32k player unit code blocks (gfx and all) that can be traded between games and have the various games put the data where-ever they see fit.
Then again my asm is weak. Whatever is out there already might already be doing this by looking at the pc!
-Brendan
#6235 - torne - Tue May 20, 2003 12:37 pm
Most assembler code is already relocatable. Branches and loads are generated as offsets to SP or PC whenever possible. In GBA development, the only things which typically produce absolute addresses are references to variables in iwram/ewram, and jumps between functions in different modules (and even the latter can be avoided if they're small enough).
It's technically possible to consume a spare register in order to point to the start of your variable section, and you can do cunning things with so-called 'jump islands' (having sections of code that just re-branch to a new offset) in order to avoid making longjumps.
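To show why long jumps need special handling, here is a small Python sketch of the range check for an ARM-state `B` instruction. The facts it relies on are that the instruction holds a signed 24-bit word offset (about +/-32 MB) and that PC reads as the instruction address plus 8; the function names are made up for the example, and Thumb branches (which have shorter ranges) are not covered.

```python
# Hypothetical sketch: decide whether an ARM-state branch can reach its target
# directly, and encode it if so.

def branch_offset(source, target):
    """Byte offset an ARM B instruction must encode (PC reads as source + 8)."""
    return target - (source + 8)


def fits_direct_branch(source, target):
    """True if the offset fits the signed 24-bit word field of ARM B/BL."""
    offset = branch_offset(source, target)
    return offset % 4 == 0 and -(1 << 25) <= offset < (1 << 25)


def encode_branch(source, target):
    """Encode an unconditional ARM B, or raise if the target is out of range."""
    if not fits_direct_branch(source, target):
        raise ValueError("target out of range; needs a jump island or an absolute jump")
    imm24 = (branch_offset(source, target) >> 2) & 0xFFFFFF
    return 0xEA000000 | imm24


print(hex(encode_branch(0x08000000, 0x08000100)))    # 0xea00003e
print(fits_direct_branch(0x08000000, 0x03000000))    # False: ROM to IWRAM is too far
```

On the GBA this is the ROM (0x08000000) to IWRAM (0x03000000) case: the gap is larger than a single `B` can cover, so the jump has to go through some other mechanism, such as an island or an address loaded into a register.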
I will look at relocation issues as I make the assembler; if there are few enough places where absolute addressing is naturally used, then I might try to implement some automatic method of avoiding them. =)
So, not sure how feasible it is, but it's a nice idea to consider, thanks.
It'd be slightly useful to me; my freestanding OS currently has to have its executable image produced by the linker, in order that it is possible to relocate all the thread code at build time. If everything was relocatable, I'd only need to have absolute addressing for the kernel, and could just append thread code to the end =)
Torne