#165463 - tepples - Mon Dec 22, 2008 4:29 am
I figured one of the regulars would know how to best phrase an answer to this question.
A user on another board wrote: |
Is there any specific reason that you really don't want to use .include statements in a big C program in particular? |
As I understand it, there are two ways to set up a compile-link step.
Option A:
Each .c file is a separate translation unit producing a separate object file and doesn't #include any other files containing executable code. The linker combines the object files into a finished binary.
Advantages:
- Shorter compilation times with Make
- Uses less RAM
Option B:
Each .c file #includes other .c files, and each file has multiple-inclusion guards with #ifndef / #define / #endif. Only main.c is passed to the compile/link tool.
Advantage:
- Simpler for a novice to set up if a toolchain distribution doesn't already provide Make and example makefiles. I ask people to install MSYS (which includes Make) through devkitPro Updater, but not everyone wants to go that route.
But what's the other killer advantage of Option A on a PC with a fat CPU and RAM?
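For concreteness, a minimal sketch of the two layouts (the file and symbol names are made up for illustration):

/* ---- Option A: every .c is its own translation unit ---- */

/* sound.h -- only declarations are shared */
void sound_init(void);

/* sound.c -- compiled on its own into sound.o */
#include "sound.h"
void sound_init(void) { /* set up mixer, etc. */ }

/* main.c -- compiled into main.o, then linked with sound.o */
#include "sound.h"
int main(void) { sound_init(); return 0; }

/* ---- Option B: one translation unit pulls in the code ---- */

/* main.c -- the only file handed to the compile/link tool */
#include "sound.c"   /* executable code, not just declarations */
int main(void) { sound_init(); return 0; }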
_________________
-- Where is he?
-- Who?
-- You know, the human.
-- I think he moved to Tilwick.
#165471 - gmiller - Mon Dec 22, 2008 2:14 pm
Since C does not have all of the encapsulation options that C++ does, compiling separate files helps you isolate variables and functions that would otherwise be exposed if everything lived in a single file.
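A quick sketch of what that isolation looks like (the file and function names here are invented):

/* spi.c -- with separate compilation, "static" limits these names to
   this translation unit; no other .c file in the program can touch them. */
static int spi_busy;                /* private state                 */
static void spi_wait(void) {}       /* private helper                */

void spi_send(int byte)             /* the module's one public entry */
{
    spi_wait();
    spi_busy = (byte != 0);
}

If spi.c were instead #included into main.c along with everything else, all of those "private" names would end up in one big translation unit, where any other code could poke at them and same-named static helpers from different files would collide.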
Having said that, there are libraries out there that do #include source (I tend not to like this, but I understand it in rare cases).
By keeping the modules in separate files, it is possible for the linker to see that routines in an object file are never referenced, so they can be excluded from the final executable - which may be your "save memory".
There is no hard rule against doing it; it's just not standard practice, so doing it is considered bad programming practice. But there are exceptions to every rule.
Of course, these statements were targeted at C, not C++ ...
#165483 - sajiimori - Mon Dec 22, 2008 9:17 pm
Do you spend much time modifying a single file, recompiling, and testing your changes? I do.
Think about this: The link step for a substantial program already takes much longer than recompiling a single module. If just linking together already compiled modules takes that long, imagine having to actually compile the entire program every time you modify a single line!
But by all means, try it -- there's no better conclusion than one you reach yourself. =)
Edit: I realize tepples asked for a different reason than the ones he mentioned, but really, this is the only reason you'll need.
#165488 - DekuTree64 - Tue Dec 23, 2008 12:02 am
Yeah, the main 3 reasons I can think of are:
1. Speed. Always recompile the smallest unit you can.
2. Encapsulation. Static variables and functions are essentially global if everything is included into one file.
3. Duplicate definitions. Not a problem if you have a single .c file that includes everything else, but if you have multiple .c files and some of them #include other .c files, then two of them might pull in the same thing, causing linker errors or ROM bloat (see the sketch below). This is the same as the old "don't put your data in .h files" thing.
None of those are troublesome on a small project, but since the question was about larger projects, I'd sum it up as a waste of time and effort to put off learning something you'll need eventually anyway.
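A minimal sketch of the duplicate-definition case (file names are invented):

/* tables.c -- data that two modules both want */
const unsigned short palette[256] = { 0x7FFF /* ... */ };

/* player.c */
#include "tables.c"   /* player.o now carries its own palette[] */

/* enemy.c */
#include "tables.c"   /* enemy.o carries palette[] as well      */

/* Linking player.o with enemy.o fails with "multiple definition
   of `palette'".  Make the array static and the link succeeds
   instead, but the ROM silently carries two copies of the table.
   Include guards don't help: each .c file is preprocessed
   separately, so each one still sees the definition once.       */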
_________________
___________
The best optimization is to do nothing at all.
Therefore a fully optimized program doesn't exist.
-Deku
#165489 - wintermute - Tue Dec 23, 2008 1:55 am
tepples wrote: |
Option B:
Each .c file #includes other .c files, and each file has multiple-inclusion guards with #ifndef / #define / #endif. Only main.c is passed to the compile/link tool.
Advantage:
- Simpler for a novice to set up if a toolchain distribution doesn't already provide Make and example makefiles. I ask people to install MSYS (which includes Make) through devkitPro Updater, but not everyone wants to go that route.
But what's the other killer advantage of Option A on a PC with a fat CPU and RAM? |
This is one of those things that you just don't do. While Option B may seem easier for a novice, and there may even seem to be some advantages to having everything in a single source file, this approach will eventually kill your project stone dead if it evolves into anything complex.
Regardless of your CPU power and available RAM, there will come a point when the compiler just won't be able to cope and you'll have to start splitting things up into smaller source files anyway. It doesn't even take a very large source file to kill the compiler stone dead - one of the things I get asked about regularly is why attempting to compile VBA with any of the devkitPro toolchains never completes. The cause is gba.cpp, which won't currently compile with optimisation enabled. Right now I'm still not sure why, but that file leaves the compiler requesting more and more memory and entering a continuous swapping cycle when it runs out of RAM. It may complete eventually, but I don't know - I've let it run overnight without success, and that's on a 3.8GHz AMD64 with 2 gig of RAM.
One of the things gcc is particularly bad at is compiling large data arrays - it can take a lot of RAM to hold both the source file and the compiled data. In game programming these arrays tend to be graphics and audio, which change far less often than your code. Having to rebuild them every time you compile your project results in ever more time-consuming compile/debug cycles, which in turn tends to demotivate the programmer and leads to the eventual failure of the project.
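One common way around that (sketched here with invented names) is to give the data its own translation unit, so it is compiled once and only relinked afterwards:

/* titlescreen.h -- the rest of the code sees only a declaration */
extern const unsigned short titlescreen_bitmap[240 * 160];

/* titlescreen.c -- the huge generated array lives alone in its own
   object file; editing game logic never touches this file, so it is
   never recompiled, only relinked.                                  */
#include "titlescreen.h"
const unsigned short titlescreen_bitmap[240 * 160] = {
    0x7FFF, /* ...thousands of generated values... */
};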
Something else no-one has pointed out is that including your entire project source in a single file effectively forces the compiler to optimise it as a whole. On the face of it this looks like a good thing - dead code will be removed, lots of code will be inlined making it faster and of course the compiler knows a lot more about the intent of the code since it can see everything at once. Unfortunately compilers are complicated beasts and rarely bug free - the optimiser can and will make mistakes. The more code the compiler can see at once, the more likely it is that latent optimiser bugs will trigger.
These days gcc does quite a bit of code re-ordering too, wherever it believes that order of execution doesn't matter. This can be particularly troublesome when memory is banked - after all, the compiler has absolutely no idea that writing to a particular memory location will move memory to a completely different place. That one has bitten me several times when copying data into a VRAM bank before switching it to the ARM7 or the 3D hardware, and it's not that easy to debug either. The more of your code the compiler has access to, the easier it is for it to put something in entirely the wrong place.
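A hypothetical sketch of the kind of sequence that can get rearranged (the register name and addresses are placeholders, not real hardware values):

#include <stdint.h>

#define REG_BANK_CTRL (*(volatile uint8_t *)0x04001234) /* made-up address */
#define BANK_TO_CPU   0x80
#define BANK_TO_ARM7  0x82

void upload_to_bank(const uint16_t *src, int halfwords)
{
    uint16_t *vram = (uint16_t *)0x06800000; /* bank base while CPU-mapped */
    int i;

    REG_BANK_CTRL = BANK_TO_CPU;    /* volatile write: stays where written */
    for (i = 0; i < halfwords; i++)
        vram[i] = src[i];           /* plain stores: nothing tells the     */
                                    /* compiler they must complete before  */
    REG_BANK_CTRL = BANK_TO_ARM7;   /* this volatile write hands the bank  */
                                    /* away, so the optimiser is free to   */
                                    /* shuffle the copy loop around it.    */
}

Declaring vram as a pointer to volatile (or putting an explicit barrier between the copy and the bank switch) is the usual way to pin the stores between the two control writes.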
In short, #include of data and code is a horrible evil habit that will mess you up.
The best way to break a bad habit is never to form it in the first place.
As for not installing make - you're absolutely crazy if you try to create anything significant with gnu tools without make. That's one of the reasons I spent so much time creating the madness that is the devkitPro automagic makefile - you don't need to understand make to use it, and it includes many features that programmers don't know they need until the associated problems bite them in the ass, automatic dependency generation being one.
_________________
devkitPro - professional toolchains at amateur prices
devkitPro IRC support
Personal Blog
#165509 - sajiimori - Tue Dec 23, 2008 10:58 pm
Well, go ahead and do without Make... if you want to use Jam instead. =)
Jam rocks.
#165512 - silent_code - Wed Dec 24, 2008 1:13 am
KJam, baby. KJam. :^D
_________________
July 5th 08: "Volumetric Shadow Demo" 1.6.0 (final) source released
June 5th 08: "Zombie NDS" WIP released!
It's all on my page, just click WWW below.
#165527 - sajiimori - Wed Dec 24, 2008 10:17 pm
The author currently explicitly forbids KJam from being used for anything besides evaluation purposes.
I love the author's page about the IP addresses of the visitors to his website. Look, people from Microsoft are reading about his software! Aren't you all impressed?
I'm also skeptical of the supposed performance benefits over vanilla Jam (or at least Boost.Jam):
http://www.oroboro.com/kjam//docserv.php/page/doc_perf
XP's task manager says I'm using just about 100% of my CPU time (on all cores) while doing a full build, so it doesn't seem like more aggressive task scheduling could yield much of an improvement for me.
#166855 - Exophase - Fri Feb 20, 2009 9:06 pm
Hope no one minds me reviving a dead topic, but something unmentioned: having a single compilation unit lets the compiler optimize things that the linker may not be able to. This includes such things as inlining and custom calling conventions, and it can make a pretty big difference in performance. It's also part of the reason why you should always declare functions static if they're not required outside of the module.
I personally never bundle everything up into one module, but it can pay to keep performance-critical functions in the same file where it makes sense.
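A small sketch of the static point (names invented):

/* Because clamp() is static, the compiler knows every caller lives in
   this translation unit, so it can inline it or give it a non-standard
   calling convention and drop the out-of-line copy entirely.  A
   non-static function has to keep the normal ABI in case some other
   object file calls it.                                               */
static int clamp(int v, int lo, int hi)
{
    return v < lo ? lo : (v > hi ? hi : v);
}

int brighten(int pixel)
{
    return clamp(pixel + 16, 0, 255); /* typically inlined: no call at all */
}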
#166856 - Dwedit - Fri Feb 20, 2009 9:30 pm
Don't inline functions go in header files anyway?
_________________
"We are merely sprites that dance at the beck and call of our button pressing overlord."
#166857 - the-anonymous-coward - Fri Feb 20, 2009 10:35 pm
Perhaps this should be a sticky. Seems to come up a lot, plagues beginners (and some not-so-beginners), and is important.