gbadev.org forum archive

This is a read-only mirror of the content originally found on forum.gbadev.org (now offline), salvaged from Wayback Machine copies. A new forum can be found here.

Coding > Building my own compiler.....

#21076 - MumblyJoe - Sun May 23, 2004 10:03 am

I have been studying compiler design recently in the hope that it will give me some insight into how to write more efficient code, and I am surprised at how simple compiler design can be for a limited language.

So what I am thinking of doing is writing a compiler for a language with C style syntax, but pretty cut down to start off with. I have a general plan of how it will all be put together in my head to make it as simple as possible.

Basically the language would be custom-built for the GBA. I would ditch a lot of things that are either unnecessary (or, truthfully, just too hard for me to implement) and it would only output Thumb .S files to begin with (I am striving to make the output compatible with gcc because I really can't be bothered writing my own assembler and linker).

In general my scribblings of ideas include (mostly with the aim of making it incredibly simple):
*Only 6 types of variables: s8,s16,s32,u8,u16,u32. No floats EVER.
*No typedefs (makes my job marginally easier).
*No volatile keyword - I won't be optimising enough for it to be needed.
*Built-in optimising for constants (too many compilers don't seem to care; for example, a constant pointer or reference still being dereferenced in the output code even though it's always the same after its declaration and is initialised at the declaration).
*Optimise things like vram[7] to an actual constant address in the output (but not with a variable as the index like vram[i]).
*Proper bios calls for divide etc.
*Perhaps built in structures for sprites etc.
*No dynamic memory.
*Simple dma/memcpy use built into the language and easily selectable (example: palette = dmacpy(0) mypalette; palette = memcpy palette;). Sounds odd, but think of the new and delete operators in C++, that sort of thing. Just something to save on work for the user.

Anyway, I am just throwing around ideas and playing with flex and bison so far, I still have a bit to go and a bit of research on the target CPU etc. Throw me some other optimisations you wish gcc had etc...

Oh, and don't tell me it's a stupid idea, because I am mostly only doing it as a project to learn some new things. If I can create a compiler that takes a VERY simplified language as input and outputs highly optimised Thumb/ARM code that can be linked to gcc projects, then I will be happy if some people find it useful.
_________________
www.hungrydeveloper.com
Version 2.0 now up - guaranteed at least 100% more pleasing!

#21078 - torne - Sun May 23, 2004 3:50 pm

You will need to output ELF files in order to be able to link against normal binaries. Here's some advice: don't try to implement ELF yourself. You will go hopelessly mad. The standard is missing, presumed gone. Use libbfd from GNU binutils. =)

#21087 - tepples - Sun May 23, 2004 8:07 pm

You don't want to use libbfd directly. Instead, emit ARM7TDMI assembly language and pass it to as, the same way GCC does.
_________________
-- Where is he?
-- Who?
-- You know, the human.
-- I think he moved to Tilwick.

#21099 - sgeos - Sun May 23, 2004 9:36 pm

Support for coroutines would be great.

-Brendan

#21112 - MumblyJoe - Mon May 24, 2004 1:03 am

torne wrote:
You will need to output ELF files in order to be able to link against normal binaries. Here's some advice: don't try to implement ELF yourself. You will go hopelessly mad. The standard is missing, presumed gone. Use libbfd from GNU binutils. =)


Trust me, the last thing I wanna do is get any more complex with this than really necessary. Basically I wanna concentrate more on optimisation and output .S files that gcc/as can worry about.

sgeos wrote:
Support for coroutines would be great.


True, but based on my limited understanding of coroutines I would have to totally redesign how the stack etc works right?



Can anyone give me any hints on how gcc-arm handles parameter passing. I am considering using my own method to give me maximum control and more chances to optimise code, and allow an extern keyword that makes a function use gcc's standard method. Or maybe I should always use gcc's method and allow a fastcall modifier like borland does. What should be the default in your opinion?
_________________
www.hungrydeveloper.com
Version 2.0 now up - guaranteed at least 100% more pleasing!

#21117 - tepples - Mon May 24, 2004 4:59 am

MumblyJoe wrote:
based on my limited understanding of coroutines I would have to totally redesign how the stack etc works right?

Given that coroutines can be implemented in C, which uses a common stack model, this page may give you some implementation hints.
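The trick that kind of article describes can be sketched in plain C with a switch over a saved resume point. This is a minimal illustration only; the function name and values are made up, not taken from the thread:

```c
/* Switch-based coroutine sketch: the static variables hold the resume
   point and loop state between calls, so each call picks up where the
   last one left off. The case label inside the for loop is legal C,
   the same way Duff's device is. */
static int next_value(void)
{
    static int state = 0;       /* 0 = start from the top */
    static int i;
    switch (state) {
    case 0:
        for (i = 0; i < 3; i++) {
            state = 1;
            return i;           /* "yield i" */
    case 1:;                    /* resume here on the next call */
        }
    }
    state = 0;                  /* exhausted: restart next time */
    return -1;
}
```

Because the state is static, this version gives you exactly one instance of the coroutine; per-instance state needs a caller-owned struct instead.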

Quote:
Can anyone give me any hints on how gcc-arm handles parameter passing.

First four args in r0-r3; remainder on the stack; return value in r0. See also the ARM-Thumb Procedure Call Standard. Function may trash only r0-r3 and r12.
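As a concrete illustration of that convention (sum5 is a made-up example, not from the thread; the register comments describe ATPCS, not anything the C code enforces):

```c
/* Per ATPCS: a..d arrive in r0-r3, e comes in on the stack, and the
   result goes back in r0. The callee may freely clobber r0-r3 and
   r12; r4-r11 must be saved and restored if used. */
int sum5(int a, int b, int c, int d, int e)
{
    return a + b + c + d + e;
}
```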

Quote:
Or maybe I should always use gcc's method and allow a fastcall modifier like borland does.

That might be the best option. Perhaps make all 'static' functions fastcall by default. What improvements do you think you can make over ATPCS?
_________________
-- Where is he?
-- Who?
-- You know, the human.
-- I think he moved to Tilwick.

#21119 - MumblyJoe - Mon May 24, 2004 6:40 am

Thanks for those links tepples, the ATPCS pdf will make my life a lot easier when I get to that stage. Truthfully I really don't know how I can improve on it; maybe a fastcall modifier which uses more registers and forbids taking parameter addresses so nothing "escapes" to the stack. Of course that would mean any function that is extern and can link to other languages would have to have code inserted by the compiler to clean up the registers to conform to the ATPCS. Still, I think that if I come up with some good optimisations and it's totally optional to use them in cases where they can be dodgy....

As for coroutines (which I have no experience with, btw), I would basically use a pointer to call the function, and when it returns I set that pointer to the address where it needs to re-enter next, so the next time it is called it continues from there. All local variables would have to be static, of course. I assume that if the function body ends, the next time it is called it begins from the start again... Am I right about this? Also, is there any reason this would be useful for GBA coding...
_________________
www.hungrydeveloper.com
Version 2.0 now up - guaranteed at least 100% more pleasing!

#21120 - sajiimori - Mon May 24, 2004 6:47 am

Quote:

*Only 6 types of variables: s8,s16,s32,u8,u16,u32. No floats EVER.

How about arrays, pointers, and structs?
Quote:

*No typedefs (makes my job marginally easier).

After you have the basic types working, you'll probably find that typedefs are trivial.
Quote:

*Optimise things like vram[7] to an actual constant address in the output (but not with a variable as the index like vram[i]).

Since array[index] is the same as *(array+index), what you'll really be doing is optimizing arithmetic that involves constants.
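That folding can be seen directly in the address arithmetic. In this sketch, vram_addr is an illustrative helper (not from the thread), using 0x06000000 as the GBA VRAM base with u16 entries:

```c
#include <stdint.h>

/* array[index] is *(array + index): base + index * sizeof(element).
   When both base and index are constants, the whole sum is a
   compile-time constant the compiler can bake into a single
   load/store, which is exactly the vram[7] case. */
static uintptr_t vram_addr(uintptr_t index)
{
    const uintptr_t vram_base = 0x06000000u;
    return vram_base + index * sizeof(uint16_t);
}
```

With a variable index like vram[i], only the multiply-and-add form survives to runtime.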
Quote:

*Perhaps built in structures for sprites etc.

If you have structs, there's no real reason to build a special struct into the language. I guess you could have a certain header file that's always included or something... at least that wouldn't dirty up the language.
Quote:

*No dynamic memory.

You don't have to do anything to support malloc, except allow external functions to be called in the standard C manner.
Quote:

*Simple dma/memcpy use built into language and easily selectable.(example: palette = dmacpy(0) mypalette; palette = memcpy palette;) Sounds odd but think of the new and delete operators in c++, that sort of thing. Just something to save on work for the user.

It's tempting to add lots of special syntax and neat tricks to your language, but do you really want another perl?
Quote:

*No volatile keyword - I wont be optimising enough for it to be needed.
...
if I can create a compiler that takes a VERY simplified language as input and output highly optimised thumb/arm code...

Your code will definitely not be highly optimized if you don't do the sorts of optimizations that require the 'volatile' keyword.

Writing optimizers is not a walk in the park. It's taken GCC many years to get where it is, and people still complain about it (though it's the most retargetable compiler around).

#21130 - Daniel Andersen - Mon May 24, 2004 3:41 pm

As I did in my custom-written Basic for the GBA, you could make a language construct to implement something like co-routines (or more like just interrupts :). In my language you could write

Run <proc> As <number>
Stop <number>
Stop All

A call to Run would "start the procedure" in every V-Blank; in that way you don't have to worry about monsters, effects etc. You could have a main game loop like:

Repeat
Until Player.state = psDead

Good for beginners to programming, unnecessary for the rest of us and implementable in a flexible language! ;)
_________________
In a world without fences and walls you will never need gates and windows.

#21131 - Daniel Andersen - Mon May 24, 2004 3:49 pm

Quote:

Writing optimizers is not a walk in the park. It's taken GCC many years to get where it is, and people still complain about it (though it's the most retargetable compiler around).


C is a language that is pretty hard to optimize, but a custom-built language could most likely be optimized much more easily; in fact, in my custom-made Basic language (yet again) I made some quite good optimizations that resulted in very good code indeed.

But actually, haha, you *can* get lost in the park sometimes; first I did a *genius* optimization that skipped pre-evaluated expressions... only to realize that I had forgotten all about side effects! :)
_________________
In a world without fences and walls you will never need gates and windows.

#21160 - MumblyJoe - Tue May 25, 2004 3:28 am

Well in response to a couple of good points, I am very very early in the development stage and hopefully anything I produce eventually will be useful.

I am ditching the volatile keyword because it seems like the sort of thing that I don't need yet. For now I'm actually tweaking together something that outputs a .c file with variables for all the registers and functions for all of the instructions, only because this makes it very easy for me to look at the output and test it without worrying about compiler directives etc. yet. When I get something together that I'm proud enough of, I will post it and show you. Maybe I will add volatile when I get to optimisations, but once again I doubt that I will create a compiler that a whole game/program could be done with; I just hope I can make something useful to link in.

I should have said that I meant to include pointers and structs; I should have been clearer on that. Wouldn't be much good for a memory-mapped system like the GBA if all you had was local variables :P

As for defending my idea of built-in systems like sprite structs and DMA modifiers: I want sprites etc. built in because I am using a fairly modular method anyway, so I can always remove it later if I wanna change to the DS or something, and I think my chances of optimising struct dereferencing will be high, but even higher if the struct is designed into the language. It's the difference between optimising s[i].x=160; when someone else has defined the struct, where only the general optimisations I implement for all structs will be applied, and any special shit I can come up with on my own.

As for the DMA, I intend to have at least a couple of modifier keywords which can be overloaded. Example, the code:
Code:
memarea1 = copy(1) memarea2;

And you can overload the copy keyword by defining a function roughly like:
Code:
T* copy(T*to, T*from, T n=0);

so the end user can save on some effort by overloading copy to their own preferences to work with DMA or whatever they want. The n argument would be up to them to fill, so they can use a switch or something to decide how to copy. If a switch seems like a bad idea, don't give an argument to copy.

I don't expect to replace gcc, but I have read a lot of compiler comparisons and gcc really doesn't do a great job. Check out http://www.ddj.com/documents/s=9132/ddj0405a/0405a.htm if you wanna see what I mean.
_________________
www.hungrydeveloper.com
Version 2.0 now up - guaranteed at least 100% more pleasing!

#21169 - sgeos - Tue May 25, 2004 7:09 am

MumblyJoe wrote:
As for coroutines (which I have no experience with btw) I would basically use a pointer to call the function and when it returns I set that pointer to the address where it needs to reenter next, and next time it calls it continues from there.

That is more or less how coroutines are implemented.

MumblyJoe wrote:
All local variables would have to be static of course.

Local variables do not have to be static. Subroutines are always called by and return to a 'parent function' of sorts. They always have a 'parent-child' relationship. Coroutines don't. I have only seen coroutines packaged as sets. Coroutines are chunks of executable code that call each other.

Let's say we have a couple of parent functions, mama() and papa(). Let's say we also have three coroutines, a_ko(), b_ko(), and c_ko(). (Seen Project A-Ko?)

Let's say that mama() calls a_ko(). a_ko() works with b_ko() and c_ko() to get the job done. When they are done, b_ko() may well end up returning to mama(), even though mama() told a_ko() to do the job in the first place. When papa() calls a_ko(), b_ko(), or c_ko(), they work as a group to get the job done.

Local variables want to be considered local to the set of coroutines. When one of the routines returns to the caller of the coroutine set, the local variables are no longer needed and can be clobbered. I'm not overly fond of static data.

MumblyJoe wrote:
I assume that if the function body ends the next time it is called it begins from the start again...


That depends on how it is set up. As the compiler writer you get to decide if that function restarts, or if the coroutine set returns to mama() or papa(). =) (I think the second option makes a little more sense.)

MumblyJoe wrote:
Am I right about this? Also is there any reason this would be useful for gba coding...

*Yes!* If the coroutines include an initialization section that gets called every time, it makes changing the game mode very easy:

Code:
void load_game(void)
{
   int file;

   init {
      load_gfx(LOAD_GAME);
      show_gfx(LOAD_GAME);
      load_music(LOAD_GAME);
      play_music();
   }

   file = get_file_from_sram();
   while (1) {
      update_keypress();
      if (release(UP))
         file--;
      if (release(DOWN))
         file++;
      fix_file_number(file);
      if (release(A))
         start_game(file);  // Coroutine
      if (release(B))
         title();  // Coroutine
      vsync();
   }
}


I may want a coroutine in my set for each major game state:
splash_screen()
title_screen()
save_game()
load_game()
walk_about()
battle()
menu()
credits()
game_over()

I'm not sure how other non-assembler languages handle coroutines, but I think that the biggest difficulty will be defining how members of a coroutine set are allowed to communicate, share data and return to the calling function. If it would interest anyone, I would be happy to write an article on how to implement coroutines in C using a controller function. It will assume that the reader has read Simon Tatham's article, mentioned above. Coroutines are also detailed in section 1.4.2 of volume 1 of Donald Knuth's The Art of Computer Programming. (I highly recommend the boxed set.)

I'll note that if your compiler provides language level support for coroutines I will use it and probably become reluctant to touch other compilers.

-Brendan

#21232 - MumblyJoe - Wed May 26, 2004 8:11 am

hmmmm sgeos, you raise some interesting ideas. I still don't really understand coroutines, but I will do some research now while I may still be able to squeeze them into my notes and test-run grammar.

What I really need to know at this stage is: is there any good reason C/C++ don't natively support them?

http://www.sidhe.org/~dan/blog/archives/000178.html describes them pretty well, and I think I will use the yield keyword if I do implement them. Which method that he describes on that page should I attempt: ignoring parameters on subsequent calls, creating new instances, altering the variable, some other way? Which way is most useful for most people?

I am pretty sure I get the whole yielding idea, but having more than one coroutine that shares data with another or something.... I dunno yet.
_________________
www.hungrydeveloper.com
Version 2.0 now up - guaranteed at least 100% more pleasing!

#21280 - sajiimori - Wed May 26, 2004 5:28 pm

Quote:

What I really need to know at this stage is: is there any good reason C/C++ don't natively support them?

Mainly because C was designed with a single hardware stack in mind, and real coroutines each need their own stack. The compiler could emulate additional stacks, but that would be a higher-order feature and C was also designed with low-level programming in mind.

#21303 - MumblyJoe - Thu May 27, 2004 3:44 am

Hmmm, I might just get on with some of the other work that I badly need to do for the compiler and keep the yield and coroutine keywords reserved just in case (just in case I can't figure out a grammar where I can tell that a coroutine is what is wanted through use of the yield keyword). If anyone can give me an example of how other languages use it, that would be cool.

Anyway, I have this new optimisation in my head and I want to know if it has been done before or not...

First step: __superconst, or something like that, is a language extension in my compiler (I can change the word in a second if anyone has a better one). A superconst cannot have its constness cast away or have its address taken or any of the other things that C/C++ allow to un-const something. This allows me some assumptions.

A function that takes a superconst can have the modifier __multiple. Then imagine this:

Code:


void __multiple SomeFunction(__superconst u32 in);

int main()
{
    __superconst u32 somenum = 90;

    SomeFunction(somenum);
    SomeFunction(56);

    return 0;
}



Now the compiler expands this into two SomeFunctions, both of which are built with the constant passed to them folded into the final asm as much as possible. If this sounds odd to you, consider why people sometimes prefer a #define when an inline function could be used; the end result is sometimes better.

When the compiler is done it will output some data to a file or to stdout or whatever that gives information on what optimisations have been taken, how much larger they have caused the end result to be, an estimate on if it has made it a lot or a little faster (or even slower).

Obviously __multiple functions are static etc. Anyway, I can see it as a useful optional extension that could allow some cool things done for really time critical code.
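Hand-expanded, the idea amounts to one clone of the function per distinct constant argument. This is only a sketch of what such a compiler might emit; the names and the doubling body are made up, not from the post:

```c
/* The generic function the user writes... */
static unsigned some_function(unsigned in)
{
    return in * 2u;
}

/* ...and the specialised clones a __multiple/__superconst compiler
   could emit for the two call sites in the example above, with the
   constant argument folded all the way through the body. */
static unsigned some_function_90(void) { return 180u; }
static unsigned some_function_56(void) { return 112u; }
```

The win over plain inlining would be that identical call sites share one clone instead of each getting a copy.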
_________________
www.hungrydeveloper.com
Version 2.0 now up - guaranteed at least 100% more pleasing!

#21310 - bats - Thu May 27, 2004 1:00 pm

It sounds like your idea is to mimic inlining without actually inlining. It might help fight code bloat (only if a function is called multiple times with the same constant), but it also removes some of the benefits of inlining: removing the function call, and allowing the inlined function to optimize "into" the calling function.

If you do decide to implement this feature, you shouldn't need a superconst to do it. If you calculate reaching definitions (a more useful optimization I would guess) your compiler will be able to determine constant expressions without the user's help.

If you're interested in writing an optimizing compiler, I'd suggest reading the "Dragon" Book. It's a nice starting point for the aspiring compiler writer.

Ben

#21319 - sajiimori - Thu May 27, 2004 7:49 pm

Instead of speculating about optimizations that C compilers don't do, you might start by doing the usual optimizations -- they are far more important. When you have a working compiler, then you can experiment with these other ideas and find out if they're worthwhile.

BTW, look at the assembler g++ generates for this:
Code:

const int x=100;

int main()
{
  *(int*)&x=123;
  return x*2;
}

I don't know what kind of assumptions you wanted to be able to make with your superconst idea, but it appears g++ makes a lot of them anyway, even when it's wrong.

Casting away const is just stupid. :P

#21338 - keldon - Fri May 28, 2004 1:26 am

Forget about optimisations at first; in most cases you are just correcting a coder's poor algorithm. The only optimisation you ever need to do is with the assembler, not coroutines etc.

If you want to know how to optimise you need to create code-tracing AI, or at least create a few finite rules to optimise. Code-tracing AI is easily tackled with graph theory - and the higher-level your language is, the easier this is.

Now where graph theory comes into play is when, for example, you have poor code that asks a function twice for the same data, or where the function need not be called twice with different data because the second result can be computed from the first. An example of this is x = a^6 and y = a^7, where y = x*a.
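keldon's power example can be sketched in C. The code below is illustrative, not from the post, and ^ in the example means "to the power of", not C's XOR:

```c
/* Naive route: compute a^6 and a^7 as two independent power loops. */
static unsigned pow_u(unsigned a, unsigned n)
{
    unsigned r = 1u;
    while (n--)
        r *= a;
    return r;
}

/* Optimised route: reuse x = a^6, so y = a^7 costs one extra
   multiply instead of a whole second loop. */
static unsigned pow7_via_pow6(unsigned a)
{
    unsigned x = pow_u(a, 6u);  /* x = a^6 */
    return x * a;               /* y = a^7 = x * a */
}
```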

Also variables could be replaced by registers completely; that is quite simple to tackle in some instances. But focus on getting it working, find optimisations by hand, and then apply a model and algorithm for finding them.

#21342 - sgeos - Fri May 28, 2004 2:28 am

keldon wrote:
Forget about optimisations at first; in most cases you are just correcting a coder's poor algorithm. The only optimisation you ever need to do is with the assembler, not coroutines etc.

Coroutines are not an optimisation. They are something that I wish was easier to implement in C.

MumblyJoe wrote:
hmmmm sgeos, you raise some interesting ideas. I still don't really understand coroutines, but I will do some research now while I may still be able to squeeze them into my notes and test-run grammar.

Naturally, I have some suggestions (later in the post).

Quote:
What I really need to know at this stage is if there is any good reason c/c++ dont natively support them?

C is not the divine language. It is lacking in some areas. Depending on who you ask, C++ is better or worse. At any rate, C++ is far from divine.

sajiimori suggested the stack. Coroutines can be implemented in ANSI C, which uses "a common stack model", as tepples pointed out. It's a bit of a pain, but in vanilla C, I can set up coroutines with the following features:

shared variables (I pass a struct in from my controller function)
persistent variables (in the passed-in struct)
the ability to resume state (switch on __LINE__ trick)
an initialization section (anything before the switch statement)

The goal is to have the language facilitate the programmer. I can do everything now, I just want it to be easier to implement and easier to read. These go hand in hand.
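Those features can be sketched in plain C roughly like this. It is a minimal illustration of the __LINE__-switch approach; the struct, macros, and counter function are all made-up names, not sgeos's actual code:

```c
/* Caller-owned state: shared/persistent variables plus the resume
   point, passed in by a controller function. */
struct co {
    int resume;  /* 0 = start from the top */
    int i;       /* persistent across yields */
};

/* The switch on the saved line number is the resume mechanism; any
   code placed before CO_BEGIN runs on every call. Both __LINE__ uses
   in CO_YIELD expand to the invocation line, so label and saved
   value always match. */
#define CO_BEGIN(c)    switch ((c)->resume) { case 0:
#define CO_YIELD(c, v) do { (c)->resume = __LINE__; return (v); \
                            case __LINE__:; } while (0)
#define CO_END(c)      } (c)->resume = 0; return -1

static int counter(struct co *c)
{
    CO_BEGIN(c);
    for (c->i = 0; c->i < 2; c->i++)
        CO_YIELD(c, c->i);
    CO_END(c);
}
```

Because all state lives in the struct, several independent instances of the same coroutine can run at once, which the static-variable version cannot do.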

MumblyJoe wrote:
http://www.sidhe.org/~dan/blog/archives/000178.html describes them pretty well, and I think I will use the yield keyword if I do implement them.

Neat article. I do not like the way he opens the article though:
Squawks of the Parrot wrote:
Anyway, what is a coroutine? Well, a coroutine is a subroutine or function that can pause in the middle and return a value.

Subroutines may or may not return values. The same is true of coroutines.

Knuth does a much better job of describing them:
Donald E. Knuth wrote:
Subroutines are special cases of more general program components, called coroutines. In contrast to the unsymmetric relationship between a main routine and a subroutine, there is complete symmetry between coroutines, which call on each other.

To understand the coroutine concept, let us consider another way of thinking about subroutines. The viewpoint adopted in the previous section was that a subroutine merely was an extension of the computer hardware, introduced to save lines of coding. This may be true, but another point of view is possible: We may consider the main program and the subroutine as a team of programs, each member of the team having a certain job to do. The main program, in the course of doing its job, will activate the subprogram; the subprogram will perform its own function and then activate the main program. We might stretch our imagination to believe that, from the subroutine's point of view, when it exits it is calling the main routine; the main routine continues to perform its duty, then "exits" to the subroutine. The subroutine acts, then calls the main routine again.

This somewhat far-fetched philosophy actually takes place with coroutines, for which it is impossible to distinguish which is a subroutine of the other. Suppose we have coroutines A and B; when programming A, we may think of B as our subroutine, but when programming B, we may think of A as our subroutine. That is, in coroutine A, the instruction "JMP B" is used to activate coroutine B. In coroutine B the instruction "JMP A" is used to activate coroutine A again. Whenever a coroutine is activated, it resumes execution of its program at the point where the action was last suspended.

The coroutines A and B might, for example, be two programs that play chess. We can combine them so that they will play against each other. (193-194)


MumblyJoe wrote:
i think i will use the yield keyword if i do implement them.

"yield" sounds good.

MumblyJoe wrote:
Which method that he describes on that page should I attempt: The ignoring of parameters on subsequent calls,

When dealing with two coroutines that work together, they can essentially return to each other. I do not think that using two coroutines is the best way to illustrate the concept. A lot of assumptions can be made that cease to exist as soon as a third coroutine enters the picture.

My opinion is that coroutines, in general, should not return values to each other. Values should either be passed using parameters, or coroutines should use shared data to communicate. Consider the following scenario:

mama() "There is some work that needs to get done. a_ko(), I'll leave it up to you to make sure that this gets started."

a_ko() *works for a while* "Hey, b_ko(), I really need you to do your part before I can continue."

b_ko() *finishes quickly* "Ok c_ko()! Your turn."

c_ko() *works for a while* "b_ko(), I need you work on this some more."

b_ko() *finishes quickly* "Ok c_ko()! Your turn again."

c_ko() *works for a while* "a_ko(), you need to have a look at this..."

(Continues from there)

The problem is that a_ko() gives control to b_ko(), but has control returned from c_ko(). I think that passing control from one coroutine to another should always be done in a forward fashion. If one ever wants a coroutine to return to its calling coroutine, I suspect that that individual really wants a subroutine setup. (Chances may also be that that individual has not figured out what he or she is doing.)

To be clear, I think that parameter values should be changed when we yield to another coroutine. This essentially allows more than one value to be returned to each coroutine in an input only fashion. "Returning" values is bad because we do not know who is returning the value. a_ko() has no way of knowing if b_ko() or c_ko() is going to get back to her.

Assuming a structure model for shared variables, the parameters could be part of the structure, or they could be passed like subroutine parameters.

Input only is limiting because it allows only one set of data passed, with the types predefined. If a given type is needed that is not part of the parameter list, that can be done with shared data.

To the extent that coroutines can communicate using nothing but shared data, I could see an argument that the parameter list and return type should *only* be used to communicate with the "outside world". Using that approach, the parameter "variables" should not change on subsequent calls.

MumblyJoe wrote:
creating new instances,

If parameters are used as a way for coroutines to communicate with each other, then new instances are not useful. Regardless of how I think about it, I don't see any merit in new instances, actually. Discarding on a parameter change is also not useful.

MumblyJoe wrote:
altering the variable, some other way... Which way is most useful for most people?

I think that altering the variable is the way to go, if parameters are used as a means for coroutine communication. Otherwise I'd ignore parameters on subsequent calls.

Code:
int a_ko(char x, char y, char z)
{
}

If mama() calls a_ko() and later, c_ko() yields to a_ko() with different values, I think that a_ko() should pick up where she left off and run with the new parameter "variable" values. I say this because I do not think that coroutines should return values to one another as such. Again, all of this could be done with shared variables. Regardless, int a_ko() indicates that a_ko() returns int to mama(). b_ko() and c_ko() should also return int to mama().

Quote:
I am pretty sure I get the whole yielding idea, but having more than one coroutine that shares data with another or something.... I dunno yet.

I do not think that having more than one coroutine with shared data is an especially difficult concept. "More than one coroutine" and "shared data" are different concepts. I'll start with the "more than one coroutine" concept.

Coroutines work as teams. a_ko(), b_ko() and c_ko() may work together, while d_ko() and e_ko() also work together. (d_ko() and e_ko() are not part of a_ko()'s team.) There has to be a way of telling who is on which team. The compiler should not lump all coroutines together, nor should it allow coroutines on different teams to yield to one another any more than it allows subroutines to yield to one another.

This works well for subroutines:
Code:
type_t subroutine(arg_t args);

Subroutines work on teams of one. (These loners are often enslaved by supreme calling functions.) Coroutines band together, so how about something like this?
Code:
coroutine {
   u16 a_ko(int);
   u16 b_ko(char);
   u16 c_ko(void);
}

coroutine {
   void d_ko(u16);
   void e_ko(u16);
}

My main objection to this setup is that if another routine wants to join a team, the programmer has to hunt down the team's declaration. These may be at the top of the file, or in a header. In other words, far, far away. The advantage is that once we find the team declaration, we know exactly who is on the team.

I much prefer team_name delimiter member_name. Something like this:
Code:
u16 abc@a_ko(int);
u16 abc@b_ko(char);
u16 abc@c_ko(void);
void de @ d_ko(u16);
void de @ e_ko(u16);


That should make things easy for the compiler. More importantly, somebody reading the code knows that these functions are coroutines. Not only that, the reader knows what team the coroutines are on. The reader may not know who all the team members are.

I could see mandatory declaration of all coroutines on a team within a file. Something like this:
Code:
u16 abc@a_ko(int);
extern u16 abc@b_ko(char);
extern u16 abc@c_ko(void);

Then there is the question of moving from one routine to the next. I think that the easiest way would be for yield b_ko(); (from a_ko()) to expand to the following pseudocode:
Code:
save_a_ko_state(resume_here);
branch(b_ko, parameters);
resume_here:

The other way is for a_ko() to branch to something like __a_ko2b_ko:
Code:
__a_ko2b_ko:
   save_a_ko_state(resume_here);
   branch(b_ko_starts_here);
__c_ko2b_ko:
   save_c_ko_state(other_resume_here);
   /* No branch needed */
b_ko_starts_here:

This would have to be set up for every function. All in all, I suspect that the first way would be easier to implement. The second way is how Knuth does things in the main text of TAOCP.

Naturally, trying to yield to d_ko() from a_ko() or mama() should result in a compiler error.

Also, I'd like to point out that, while:
Code:
void a_ko(void) {
   yield b_ko();
}

passes control to b_ko(),
Code:
void a_ko(void) {
   b_ko();
}

creates a new instance of the abc team, and control starts there with b_ko().

Next is the issue of shared memory. When I write coroutines in C, I have my controller function direct control from one routine to the next. It also passes the same structure to every routine. This is bad because the structure is defined at the top of the file or in a header. It is a pain to add shared variables, and somebody reading my code has to hunt the structure down. It would be nice if the compiler put the struct together so that the data could be declared in the functions, where it really belongs!
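The hand-written pattern being described looks roughly like this; the contents of data_t and the routine names are invented for illustration:

```c
/* A sketch of the hand-written pattern described above: a controller
   passes the same shared structure to every routine and directs
   control.  data_t's contents and the routine names are invented. */
typedef struct {
    int total;                  /* shared between the two routines */
    int next;                   /* which routine runs next         */
} data_t;

static void producer(data_t *d) { d->total += 10; d->next = 1; }
static void consumer(data_t *d) { d->total *= 2;  d->next = 0; }

/* The controller function directs control from one routine to the
   next, handing each the same shared struct. */
static int controller(int steps)
{
    data_t d = { 0, 0 };
    while (steps-- > 0) {
        if (d.next == 0)
            producer(&d);
        else
            consumer(&d);
    }
    return d.total;
}
```

Every shared variable has to be added to data_t by hand, which is exactly the pain point the shared keyword would remove.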

If I were doing things, this code:
Code:
u16 abc @ a_ko(int a)
{
   shared data_t data;

   while (1) { /* ... */ }
}

u16 abc @ b_ko(char b)
{
   shared data_t data;
   int i;

   init { /* ... */ }
   { /* ... */ }
}

u16 abc @ c_ko(void)
{
   int i;
   int j;
   int k;

   init { /* ... */ }
}

Would be compiled into the rough moral equivalent of:
Code:
struct __abc {
   data_t   __shared_data;
   int   __b_ko_i;   
   int   __c_ko_i;   
   int   __c_ko_j;   
   int   __c_ko_k;   
   void *   __a_ko_state;
   void *   __b_ko_state;
};

u16 __abc__a_ko(int a, struct __abc *__abc)
{
   /* No init */
   branch(__abc->__a_ko_state);
   while (1) { /* ... */ }
}

u16 __abc__b_ko(char b, struct __abc *__abc)
{
   init { /* ... */ }
   branch(__abc->__b_ko_state);
   { /* ... */ }
}


u16 __abc__c_ko(struct __abc *__abc)
{
   init { /* ... */ }
}

Everything starts with __ because that is the compiler's name space. c_ko() does not have a state because everything is in its initialization section. data is not shared with c_ko(), so trying to access data from within c_ko() should result in a compiler error. b_ko() and c_ko() both have variables named i; these are not shared, they are different variables. In theory, c_ko() should be able to have a non-shared variable named data.

I think that initializing the function state pointers properly is something to look out for. I'm not sure how tricky the (shared) variables would be to implement.

In closing, I'd love to see coroutines included. I think that the keywords yield, shared, and init (or their moral equivalents) and a team delimiter would be a good way of doing things. I'd also like to hear what other people have to say on the topic.

I do think that your first priority should be to get a basic compiler up and running.

-Brendan

#21343 - MumblyJoe - Fri May 28, 2004 2:32 am

Don't worry, I have a table of textbooks on the subject that I am studying and I have every intention of performing any standard optimisations too, but I need to throw ideas around before they get stagnant in my head :P
_________________
www.hungrydeveloper.com
Version 2.0 now up - guaranteed at least 100% more pleasing!

#21346 - MumblyJoe - Fri May 28, 2004 2:59 am

Thanks again sgeos, that's given me some good insights on how to allow for coroutines at a later stage. This is important for me to do at this stage because I am toying with namespaces and the like in theory, and until I get that done I can't really even start on register allocation.

From what I can tell they shouldn't get me in any trouble with the grammar; the coroutine {} method would get through my parser with a minimum of fuss because it's so similar to things like extern "C" {} or static {} groupings. Of course your idea of using a delimiter like @ wouldn't take long to add in either. Also thanks for the examples, I am pretty sure I get the shared data now. Adding support for the shared keyword is obviously gonna be as easy as it looks.

Back to the books...
_________________
www.hungrydeveloper.com
Version 2.0 now up - guaranteed at least 100% more pleasing!

#21377 - keldon - Fri May 28, 2004 10:32 am

Brendan, I think you've hit the nail on the head. Whenever yield is used to call a co-routine, the instance variables created (if any) are global to that level of co-routine calls, just as a function gets its own instance variables when it is called.

This is no difficult task, nor is co-routine implementation in general; it is merely a jump. Passing parameters to a new instance when yield is not declared could be handled like so:

Code:
coRoutineJump ( (void *)function, param1, param2, ... ) {
   jmp function
}


So for example you can have recursive co-routines, etc.
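In portable C, that "merely a jump" view can be sketched as a tail call through a function pointer, which most back ends can emit as a plain branch. The names co_routine_jump and target are invented stand-ins:

```c
/* A sketch of coRoutineJump in portable C: the parameter is placed
   where the target expects it and control transfers via a tail
   call, which a compiler can emit as a plain branch.  All names
   here are invented. */
typedef int (*routine_t)(int);

static int target(int param)
{
    return param + 1;           /* stand-in coroutine body */
}

/* Effectively: forward param1, then jmp function. */
static int co_routine_jump(routine_t function, int param1)
{
    return function(param1);
}
```

Because the transfer is just a call through a pointer, nothing stops a routine from jumping back into itself, which is what makes recursive co-routines possible.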

I also agree with them being grouped together, that way the concept shown above works with ease, so that you always pass x amount of instance variables to the function.

With the co-routines, you may not implement them now, but make sure that your compiler is structured to be able to insert its functionality later.

#21414 - sajiimori - Fri May 28, 2004 10:12 pm

Quote:

Forget about optimisations at first, in most cases you are just correcting a coders poor algorithm.

I strongly disagree with the above. When given the choice between a highly optimized and hard-to-read solution and an easy-to-read solution that the compiler will optimize to be as fast as the first solution, a good programmer will always choose the latter.

Consider: Why does anybody write inline functions? To be faster, right? Wrong. They could have inlined the function by hand, and they wouldn't have needed an optimizing compiler to do it. Using an inline function makes the code easier to write, read, and maintain. The inlining compiler allows the coder to write code that is fast and good.
Quote:

The only optmisation you need ever do is with the assembler...

I've never heard of an "optimizing assembler". That wouldn't really make any sense.

#21421 - keldon - Fri May 28, 2004 11:22 pm

sajiimori wrote:
Quote:

Forget about optimisations at first, in most cases you are just correcting a coders poor algorithm.

I strongly disagree with the above. When given the choice between a highly optimized and hard-to-read solution and an easy-to-read solution that the compiler will optimize to be as fast as the first solution, a good programmer will always choose the latter.

Consider: Why does anybody write inline functions? To be faster, right? Wrong. They could have inlined the function by hand, and they wouldn't have needed an optimizing compiler to do it. Using an inline function makes the code easier to write, read, and maintain. The inlining compiler allows the coder to write code that is fast and good.
Quote:

The only optmisation you need ever do is with the assembler...

I've never heard of an "optimizing assembler". That wouldn't really make any sense.


You didn't understand what I meant. When I said optimising assembler, I meant optimising the output by using variations of methods in assembler. E.g.
Code:
int function ( int f ) {
   return f * 3 % diff ( f );
}

would be outputted as say
Code:
   ;; f = r1
   sub r1 3
   ldr r2 0
   cmp r1 r2
   sub lt r2 r1
   mul r2 r1

...returning in r1. Excuse my assembler mistakes, such as how the stack is handled here; I have no idea how it is done in the GBA as I haven't looked through it yet.

And what I meant by the programmer writing better code is the case where they have created a poor algorithm, not an algorithm that doesn't use loads of optimisations such as, for example, directly accessing registers. A good algorithm doesn't necessarily have to be unreadable, as any good programmer knows.

As a note, there are numerous optimisations that a compiler can choose by examining the path of data and processing. That means it can even examine the processes of the functions being called and calculate what portion of the process could be extracted, etc.

#21424 - torne - Sat May 29, 2004 12:05 am

Actually, I've been writing an optimising assembler for my final-year project. Some time in the next few months I will hopefully have it in a condition that someone else could use it. =)

#21432 - sajiimori - Sat May 29, 2004 2:29 am

Quote:

Actually, I've been writing an optimising assembler for my final-year project.

Hmm... maybe it ceases to be an assembler then, and becomes a compiler for a language that strongly resembles assembly. =)
Quote:

When I said optmizing assembler, I meant that optimise the output using variations of methods in assembler.

Oh, alright then. I guess you were distinguishing between optimizations in output versus optimizations in lines of source or programmer effort. Normally, the phrase "optimizing compiler" only refers to optimizations that affect the output.
Quote:

And what I meant by the programmer writing better code, I meant in the case that they have created a poor algorithm, not an algorithm that doesn't use loads of optmisations such as, for example, directly accessing registers. A good algorithm doesn't necessarily have to be unreadable, which any good programmer knows.

Ok. I still think you're wrong to say that optimizing compilers are usually only useful for correcting bad algorithms. Optimizers allow me to write code that is clearer and more modular, without sacrificing speed.

Or are you going to argue that there is no such thing as a trade-off between speed and clarity/modularity?

#21446 - keldon - Sat May 29, 2004 11:47 am

I can write code that is fast, optimised and modular without it being unreadable. You are simply not a programmer in the poor category, so this does not apply to your code.

Consider this code:

Code:
   b = 0; m = 4; x = 1;
   for ( int i = 0; i < 25; i++ ) {
      b = b + (m ^ x);   /* ^ meant here as exponentiation, not C's XOR */
      x = x + 1;
   }


Now a smarter algorithm is:
Code:
   b = 0; m = 4;
   for ( int i = 0; i < 25; i++ ) {
      b = b + m;
      m = m * m;
   }


A compiler may wish to change the first algorithm into the second, however that transformation takes time to design. Both are readable, but the first would take ages to compute, simply because of what's behind the exponential function.

#21471 - sajiimori - Sat May 29, 2004 7:12 pm

Quote:

I can write code that is fast, optmised, modular without being unreadable. You are simply not a programmer in the poor category, so this does not apply to your code.

Are you saying there is such a thing as a trade-off between speed and clarity/modularity, but only for bad programmers?
Code:

   b = 0; m = 4;
   for ( int i = 0; i < 25; i++ ) {
      b = b + m;
      m = m * m;
   }

Oops, I think the compiler had to correct your bad algorithm! Here's the x86 output with -O0:
Code:

   movl   $0, -4(%ebp)
   movl   $4, -8(%ebp)
   movl   $0, -12(%ebp)
L2:
   cmpl   $24, -12(%ebp)
   jle   L5
   jmp   L3
L5:
   movl   -8(%ebp), %edx
   leal   -4(%ebp), %eax
   addl   %edx, (%eax)
   movl   -8(%ebp), %eax
   imull   -8(%ebp), %eax
   movl   %eax, -8(%ebp)
   leal   -12(%ebp), %eax
   incl   (%eax)
   jmp   L2
L3:

And here's the output with -O3:
Code:

   movl   $24, %eax
   xorl   %ecx, %ecx
   movl   $4, %edx
L6:
   addl   %edx, %ecx
   imull   %edx, %edx
   decl   %eax
   jns   L6

Does that mean you're a poor programmer after all?
Quote:

A compiler may wish to change the first algorithm to create the second, however that takes time for some to design.

Yes, it's hard to write optimizing compilers, but I don't see how that applies to your argument that optimizers are only useful for bad programmers.

#21492 - keldon - Sun May 30, 2004 12:42 am

Quote:
Are you saying there is such a thing as a trade-off between speed and clarity/modularity, but only for bad programmers?

Well, a poor programmer would mostly write code in the hope that it works. You don't need to know ways around a compiler to be a good programmer, just knowledge and understanding of algorithms. Many people don't have that, and some methods of software optimisation only exist for their benefit.

Quote:
Does that mean you're a poor programmer after all?


And this may be slightly off topic, but you have illustrated a great level of stupidity, ignorance, and poor understanding of algorithms and assembler.

The first output is the compiler using stack variables for every variable; the second is the compiler realising that it only needs to use registers. If you were clever you might have tried inputting the first algorithm, or even realised how fundamentally wrong it was; that first algorithm is what many poor programmers may write, and the second is a faster version of it.

Also, being an assembler programmer at heart, I would have written it like the second output anyway. And you failed to realise that I meant a compiler may wish to optimise the first code I gave you (you know, the one that called the exponent function multiple times) into the same code you've given there.

In fact I am surprised that the first output came out like that, but I wouldn't consider the second one an optimisation of the code; it is simply better code than the first one given. Then again, I mostly program in assembler so all my code would come out pretty much like that anyway :-). So however good a programmer you are, or may be, you're not a bright spark.

#21494 - torne - Sun May 30, 2004 12:55 am

sajimori is completely right, you know. Don't be so pompous. There are many compiler optimisations which cannot be done at C source level at all, however illegibly you write your code. Allocating local variables to registers *is* an optimisation; the standard 'model' of the compiler states that local variables are on the stack, so doing anything different for performance reasons is optimising.

#21495 - keldon - Sun May 30, 2004 1:06 am

Quote:
Oops, I think the compiler had to correct your bad algorithm! Here's the x86 output with -O0:

Just thought I'd add to your stupidity: both of the outputs you've given me are the same algorithm, merely represented with different code.



---
Now in regards to the compiler it just goes to show the difficulties of converting a high level expression into a low level expression.

But even that is not a true display of optimisation; the biggest optimisation is when operations are arranged so that as many instructions as possible are carried out at the same time, sometimes up to 4 or 5, which is simply not an issue on the Game Boy Advance.

#21496 - keldon - Sun May 30, 2004 1:15 am

torne wrote:
sajimori is completely right, you know. Don't be so pompous. There are many compiler optimisations which cannot be done at C source level at all, however illegibly you write your code. Allocating local variables to registers *is* an optimisation; the standard 'model' of the compiler states that local variables are on the stack, so doing anything different for performance reasons is optimising.


Well, like I said, I write in assembler, and to suggest that as being pompous is quite far-fetched, although I did give quite a stately display. I hate stupid comments like that when people attempt to cover up their own mistake.

But I am mostly pointing out his attempt to insult me by suggesting I created a poor algorithm by my own definition, when in fact no change was made to the algorithm and he completely ignored the original code. I was also expressing the difference between good and bad code, and why writing an optimiser that focuses on fixing bad code into good code (which can be done) should be dismissed.

Also, this is all over the discussion of focusing on optimisation. There are various methods to optimise code, and if using a register instead of a variable on the stack is seen as an optimisation then I will accept that; it is an optimisation that should not be missed. Just remember that I code in assembler, so this is a quite humorous joke to assembler programmers given the arguments against writing in assembler. A little too off topic, so I'll stop here.

#21499 - sajiimori - Sun May 30, 2004 1:45 am

I'm done talking to you, keldon. Let me know when you grow up.

#21502 - keldon - Sun May 30, 2004 2:18 am

Me grow up?!? Whether I spoke out of turn or was a little 'pompous', the fact remains that you tried to be facetious and incorrectly attempted to point out a fault, which was in fact no fault. So is it me who should grow up and stop overreacting to your mistakes, or you who should grow up and accept your mistakes, as I have accepted mine?

Also this little squabble is taking up considerable space in MumblyJoe's topic.

I would also add that I was not attempting to squabble or argue with you until you attempted to show me up with your output; you therefore initiated the bad response and now attempt to cover it by calling me childish, bearing in mind who wanted to insult whom first. It would be far more intelligent to admit you misunderstood what I said, and that I also misunderstood what you said and did not explain myself fully; that is the wise and ADULT way of doing things in this case.

Also, don't take my comments on being an assembler programmer as a declaration of importance and priority; it is a choice, because I learned more doing assembly and found working with it a lot easier.