#162344 - RobinWatts - Mon Sep 01, 2008 1:20 am
Hi all,
I've just written the guts of a simple processor sampling profiler for the NDS (hugely inspired by TickProf, by Julian Smith, for various systems).
Initially it's intended simply to point to areas of tuna-vids that might benefit from tuning, but it could well be useful to other coders.
I still need to write the tool to take the binary output and combine it with the linker mapfile to produce a nice textual output, but the binary data it's producing looks plausible.
Anyone interested in beta (well, alpha!) testing it please speak now...
Robin
#162390 - RobinWatts - Mon Sep 01, 2008 8:43 pm
I've just got the analysis thing working too, so it seems appropriate for a post with some more details...
To use the profiler, in the arm9 code, you add the following line:
Profiler_autoInit();
Then, at the end of the program, you do:
Profiler_fin();
Avoid calling irqInit(); within your code, because Profiler_autoInit(); does that for you.
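As a rough illustration of the call order described above, here is a minimal host-side sketch. The two Profiler_* functions are stand-in stubs so it compiles off-target; their void(void) signatures, the do_work() workload, and the app_main() name are assumptions for illustration, not taken from the profiler archive. On the DS the body of app_main() would sit in the ARM9 main().

```c
#include <assert.h>
#include <stdio.h>

/* Host-side stand-ins for the real profiler entry points (signatures assumed). */
static int profiler_running = 0;
static int profile_dumped = 0;

static void Profiler_autoInit(void)
{
    /* The real call also sets up interrupts, so irqInit() is not called here. */
    profiler_running = 1;
}

static void Profiler_fin(void)
{
    assert(profiler_running);   /* fin only makes sense after autoInit */
    profiler_running = 0;
    profile_dumped = 1;         /* the real call writes PROFILE9 to the card */
}

/* Hypothetical workload standing in for the program being profiled. */
static unsigned do_work(void)
{
    unsigned sum = 0;
    for (unsigned i = 0; i < 1000; i++)
        sum += i;
    return sum;
}

/* Would be main() in the ARM9 binary. */
int app_main(void)
{
    Profiler_autoInit();        /* first thing in the program */
    unsigned result = do_work();
    printf("work result: %u\n", result);
    Profiler_fin();             /* last thing before exit */
    return profile_dumped ? 0 : 1;
}
```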
Then compile, deploy, and run your program. The program should run as normal, but on closedown, it'll dump a file 'PROFILE9' onto your memory card.
Take this file, and do:
annotate PROFILE9 mapfile
where mapfile is the mapfile produced by the link stage.
This will output timings for your program, first listing by address, then by percentage of time spent.
For instance, for Tuna-Vids-1.1, it gives:
Code: |
Profile by Address
01000000 ( 0.00%: 0) Profile_passThru
0100001c ( 0.00%: 0) Profile_getSp
01000024 ( 0.00%: 0) Profile_offsetFinderWaitLoop
01000034 ( 0.00%: 0) Profile_offsetFinderWaitLoopEnd
01000038 ( 0.00%: 0) Profile_tick
010000f0 ( 17.78%: 6217) yv12_to_rgb555_asm
01000230 ( 0.00%: 0) irqTable
010002f8 ( 0.04%: 13) IntrMain
01000424 ( 0.11%: 37) init_vlc_tables
01000abc ( 0.17%: 58) check_resync_marker
[Snip]
Profile by Time
010000f0 ( 17.78%: 6217) yv12_to_rgb555_asm
02033330 ( 11.82%: 4135) simple_idct_ARM
02035580 ( 10.35%: 3619) _M3CF_startup
0200a6a8 ( 7.59%: 2653) interpolate8x8_halfpel_hv_c
0202d724 ( 6.96%: 2433) transfer_16to8copy_c
0202cf40 ( 6.63%: 2320) dequant_h263_intra_c
0202d960 ( 6.19%: 2163) transfer_16to8add_c
02047770 ( 5.54%: 1938) memcpy
0204293c ( 4.81%: 1683) scanKeys
02001b04 ( 4.23%: 1478) play_movie
020479b0 ( 3.56%: 1246) memset
[Snip]
|
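For anyone curious how a listing like this can be produced, here is a minimal sketch of the attribution step. It assumes the samples have already been decoded into program counter values and the mapfile has already been parsed into a sorted symbol table; the real PROFILE9 layout and the real annotate source are not shown here, so the Symbol type and every function name below are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>

typedef struct {
    uint32_t addr;      /* symbol start address from the mapfile */
    const char *name;
    unsigned hits;      /* samples that landed in this symbol */
} Symbol;

/* Index of the last symbol with addr <= pc, or -1 if pc is below all of them.
 * syms must be sorted by ascending address. */
static int find_symbol(const Symbol *syms, size_t n, uint32_t pc)
{
    size_t lo = 0, hi = n;
    while (lo < hi) {
        size_t mid = lo + (hi - lo) / 2;
        if (syms[mid].addr <= pc)
            lo = mid + 1;
        else
            hi = mid;
    }
    return (int)lo - 1;
}

/* Credit each sampled pc to its enclosing symbol; returns samples attributed. */
static unsigned attribute_samples(Symbol *syms, size_t n,
                                  const uint32_t *pcs, size_t npcs)
{
    unsigned total = 0;
    assert(syms != NULL && pcs != NULL);
    for (size_t i = 0; i < npcs; i++) {
        int idx = find_symbol(syms, n, pcs[i]);
        if (idx >= 0) {
            syms[idx].hits++;
            total++;
        }
    }
    return total;
}

/* qsort comparator: most hits first, for the "by time" listing. */
static int by_hits_desc(const void *a, const void *b)
{
    const Symbol *sa = a, *sb = b;
    return (sa->hits < sb->hits) - (sa->hits > sb->hits);
}

/* Print one listing in roughly the format shown above. */
static void print_report(const Symbol *syms, size_t n, unsigned total)
{
    for (size_t i = 0; i < n; i++) {
        double pct = total ? 100.0 * syms[i].hits / total : 0.0;
        printf("%08x (%6.2f%%:%6u) %s\n",
               (unsigned)syms[i].addr, pct, syms[i].hits, syms[i].name);
    }
}
```

Calling print_report() once on the address-sorted table, then again after qsort() with by_hits_desc, gives the two listings. Since this sketch has no symbol sizes, a sample past the end of the last symbol is still credited to that symbol.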
Robin
#162495 - kleph - Wed Sep 03, 2008 11:37 pm
hello Robin,
I am pretty interested in testing your profiler.
I was just looking for a tool like this a few days ago :)
#162504 - RobinWatts - Thu Sep 04, 2008 1:26 am
kleph wrote: |
hello Robin,
I am pretty interested in testing your profiler.
I was just looking for a tool like this a few days ago :) |
I've PM'd you with a location for an archive. Let me know how you get on.
Robin
#162516 - a128 - Thu Sep 04, 2008 7:41 am
why not post the archive link here?!
#162529 - RobinWatts - Thu Sep 04, 2008 12:38 pm
a128 wrote: |
why not post the archive link here?! |
Cos I haven't sorted out a proper distribution yet. When I get a license sorted, and decent instructions, I'll do a proper distribution, and post a link.
Until then I'd like to keep track of who's laughing at my code. :)
Robin
#162555 - kleph - Thu Sep 04, 2008 9:27 pm
RobinWatts wrote: |
kleph wrote: | hello Robin,
I am pretty interested in testing your profiler.
I was just looking for a tool like this a few days ago :) |
I've PM'd you with a location for an archive. Let me know how you get on.
Robin |
thanks
I just tried it with my current project, but unfortunately a linking error occurs.
Since I've messed a bit with the linker while porting a few libraries, I suspected my build.
I tried with a few other simple programs (in this case a simple file copier I wrote to test my microSD card) and the same error occurs.
Here is the error message :
Code: |
profile9.o: In function `Profile_init':
/home/kleph/progs/ds/filecp/./profile9.c:177: relocation truncated to fit: R_ARM_THM_CALL against symbol `Profile_offsetFinderWaitLoop' defined in .itcm section in profileARM9.o
collect2: ld returned 1 exit status
make[1]: *** [/home/kleph/progs/ds/filecp/filecp.elf] Error 1
|
I don't understand the meaning of that message.
I am pretty tired tonight, so I'll try to investigate more during the weekend.
By the way, the annotate program compiles fairly well :)
#162563 - RobinWatts - Fri Sep 05, 2008 1:08 am
kleph wrote: |
I just tried it with my current project but unfortunately a linking error occurs.
Code: |
profile9.o: In function `Profile_init':
/home/kleph/progs/ds/filecp/./profile9.c:177: relocation truncated to fit: R_ARM_THM_CALL against symbol `Profile_offsetFinderWaitLoop' defined in .itcm section in profileARM9.o
collect2: ld returned 1 exit status
make[1]: *** [/home/kleph/progs/ds/filecp/filecp.elf] Error 1
|
|
That's saying, I think, that it's trying to call an ARM routine (my assembly code) from a Thumb routine (profile9.c), and not being able to fit the offset into the instruction.
The simple solution is, I think, to ensure that profile9.c is built as ARM, not Thumb.
If you post your makefile, I'll try and figure out the runes to make it work (and when I fail, someone with more clue than me can look :) ).
Robin
#162567 - Cearn - Fri Sep 05, 2008 10:08 am
RobinWatts wrote: |
kleph wrote: | I just tried it with my current project but unfortunately a linking error occurs.
Code: |
profile9.o: In function `Profile_init':
/home/kleph/progs/ds/filecp/./profile9.c:177: relocation truncated to fit: R_ARM_THM_CALL against symbol `Profile_offsetFinderWaitLoop' defined in .itcm section in profileARM9.o
collect2: ld returned 1 exit status
make[1]: *** [/home/kleph/progs/ds/filecp/filecp.elf] Error 1
|
|
That's saying, I think, that it's trying to call an ARM routine (my assembly code) from a Thumb routine (profile9.c), and not being able to fit the offset into the instruction.
|
That's not quite it. Jumps between ARM and Thumb code shouldn't be a problem, but jumping between sections is. From the looks of it, Profile_offsetFinderWaitLoop is in ITCM but its caller presumably isn't. The reverse (calling main RAM code from ITCM) has similar problems. The ITCM_CODE macro should have taken care of this, but perhaps something else is going on as well.
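To illustrate what such a macro does, here is a small sketch. SKETCH_ITCM_CODE is a made-up name; as far as I recall, libnds defines ITCM_CODE along the same lines (a section attribute plus GCC's ARM-only long_call attribute), but check your own headers rather than trusting this. The guard makes the macro expand to nothing on non-ARM hosts so the sketch still compiles there. hot_loop() and caller_in_main_ram() are hypothetical.

```c
#include <assert.h>

/* Stand-in for an ITCM placement macro (name invented for this sketch).
 * section(".itcm") asks the linker script to place the function in ITCM;
 * long_call makes callers load the full address and branch through a
 * register instead of using a range-limited BL. */
#if defined(__GNUC__) && defined(__arm__)
#define SKETCH_ITCM_CODE __attribute__((section(".itcm"), long_call))
#else
#define SKETCH_ITCM_CODE
#endif

/* The attribute goes on the prototype that callers see. */
SKETCH_ITCM_CODE int hot_loop(int n);

int hot_loop(int n)
{
    int sum = 0;
    assert(n >= 0);
    for (int i = 0; i < n; i++)
        sum += i;
    return sum;
}

/* An ordinary function, which would live in main RAM on the DS. Because the
 * prototype carries long_call, this call does not depend on BL range. */
int caller_in_main_ram(void)
{
    return hot_loop(10);
}
```

The important detail is that the attribute must be visible wherever the call is compiled. If the caller only sees a plain prototype, the compiler emits an ordinary BL and the linker can hit the same relocation error.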
#162575 - RobinWatts - Fri Sep 05, 2008 11:15 am
Cearn wrote: |
That's not quite it. Jumps between ARM and Thumb code shouldn't be a problem, but jumping between sections is. From the looks of it, Profile_offsetFinderWaitLoop is in ITCM but its caller presumably isn't. The reverse (calling main RAM code from ITCM) has similar problems. The ITCM_CODE macro should have taken care of this, but perhaps something else is going on as well. |
Yes, it's jumping between sections that's causing the problem.
If memory serves though, jumps in ARM code can safely jump further than ones from Thumb code (a 24-bit word offset, roughly +/-32MB, rather than a 22-bit halfword offset, roughly +/-4MB).
So swapping to using ARM code for profile9.c should (could?) solve it, I hope.
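A quick way to sanity-check that reasoning is the arithmetic below. It is only a sketch: it ignores the small pipeline offset (PC+8 in ARM state, PC+4 in Thumb) and treats the ranges as symmetric. The target address is Profile_offsetFinderWaitLoop from the listing earlier in the thread; the caller address is a hypothetical main RAM address chosen for illustration, and bl_reaches() is a made-up helper.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* ARM BL: signed 24-bit offset in words     -> about +/-32 MiB. */
#define ARM_BL_RANGE   ((int64_t)32 * 1024 * 1024)
/* Thumb BL: signed 22-bit offset in halfwords -> about +/-4 MiB. */
#define THUMB_BL_RANGE ((int64_t)4 * 1024 * 1024)

/* Returns nonzero if a direct BL at 'from' could reach 'to'. */
static int bl_reaches(uint32_t from, uint32_t to, int64_t range)
{
    int64_t delta = (int64_t)to - (int64_t)from;
    assert(range > 0);
    return delta >= -range && delta < range;
}

/* Prints whether each encoding can span the given call. */
static void report_call(uint32_t from, uint32_t to)
{
    printf("call %08x -> %08x: ARM BL %s, Thumb BL %s\n",
           (unsigned)from, (unsigned)to,
           bl_reaches(from, to, ARM_BL_RANGE) ? "reaches" : "out of range",
           bl_reaches(from, to, THUMB_BL_RANGE) ? "reaches" : "out of range");
}
```

With ITCM linked at 0x01000000 and main RAM at 0x02000000 the two regions are about 16 MiB apart, so report_call(0x02001b04, 0x01000024) should show the ARM encoding reaching and the Thumb encoding failing, which matches the "relocation truncated to fit: R_ARM_THM_CALL" message.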
I'll check out the ITCM_CODE macro later though; thanks for that.
Robin
#162662 - RobinWatts - Mon Sep 08, 2008 12:38 am
RobinWatts wrote: |
I'll check out the ITCM_CODE macro later though; thanks for that. |
New version up in the same place, that uses the ITCM_CODE macro.
Robin
#162687 - kleph - Tue Sep 09, 2008 7:53 am
The new version compiles well. (I still had trouble with the assembler file when compiling profile9.c in ARM mode.)
However, the program seems to crash before returning from Profile_init().
The real DS stays silent after the four iprintf() calls.
desmume reports : main (arm_instructions.c:243): Undefined instruction: 93004B25
I did not test with simple programs this time; I'll do that tonight, I think.
#162926 - Lazy1 - Tue Sep 16, 2008 3:38 pm
Can I get a copy of this profiler?
#162933 - RobinWatts - Tue Sep 16, 2008 5:24 pm
Lazy1 wrote: |
Can I get a copy of this profiler? |
Link sent via PM.
Robin