gbadev.org forum archive

This is a read-only mirror of the content originally found on forum.gbadev.org (now offline), salvaged from Wayback Machine copies. A new forum can be found here.

ASM > n00by question (code not working right)

#127978 - JapanLover - Mon May 07, 2007 4:15 am

hey there...

i'm just starting to do asm on gba (i've done some asm in dos and windows, but never arm)...

here is a little bit of code that doesn't seem to work right.........

Code:
@ --- screen.h (presumably) ---
#define REG_DISPCNT     0x04000000

#define MODE_3          0x0003
#define BG2_ENABLE      0x0400

#define vram    0x06000000

@ --- main.s ---
    .text         @ where the code should go
    .code   32    @ code type (ARM in this case)
    .global main  @ makes main visible for the whole project
main:
    ldr r1,=REG_DISPCNT
    ldr r2,=BG2_ENABLE|MODE_3
    str r2,[r1]
    ldr r1,=0x03E0
    ldr r2,=0x06000000
   ldr r3,=0x06012C00
   .Lopp:
   str r1,[r2]
   add r2,#1
   cmp r2,r3
   ble .Lopp
.Lloop:
    b .Lloop
.pool


shouldn't that fill the screen with green ?...

instead i get stripes...

what am i doing wrong?

i appreciate any replies =)

thanks in advance for your time/help =)

#127988 - kusma - Mon May 07, 2007 7:30 am

you're storing 0x000003E0 (str writes 32bits, each pixel is 16bits), that is one green and one black pixel. Try either loading 0x03E003E0 into r1 instead of 0x03E0, or using the strh-instruction to store a halfword instead.
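
To see the effect kusma describes on a PC rather than on hardware, here is a minimal host-side sketch in C. It is not GBA code: the `vram` array and `store32` are made-up names for illustration, and it assumes the GBA's little-endian byte order, so the low halfword of a word lands in the lower-addressed pixel.

```c
#include <assert.h>
#include <stdint.h>

#define GREEN 0x03E0u   /* 15-bit BGR colour: green at full intensity */

/* Hypothetical stand-in for mode 3 VRAM: one uint16_t per pixel. */
static uint16_t vram[8];

/* Model of `str`: one 32-bit store covers two adjacent 16-bit pixels.
   The low halfword goes to the lower address (little-endian). */
static void store32(uint32_t pixel_index, uint32_t value)
{
    assert(pixel_index % 2 == 0 && pixel_index + 1 < 8);  /* word-aligned, in range */
    vram[pixel_index]     = (uint16_t)(value & 0xFFFFu);
    vram[pixel_index + 1] = (uint16_t)(value >> 16);
}
```

Calling `store32(0, GREEN)` leaves pixel 0 green and pixel 1 black, which is the stripe pattern from the first post. `store32(0, GREEN | (GREEN << 16))` colours both pixels, matching the 0x03E003E0 suggestion.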

#127989 - kusma - Mon May 07, 2007 9:09 am

oh, and i also see that you increase the pointer you're writing to by one. This means you'll do a lot of unaligned stores, which isn't supported on ARM. You should increase it by the size of the data you store (in this case 4) instead. This can be done without a separate instruction by using the !-postfix on the destination-operand. ("str r1, [r2]!" - notice the exclamation-mark)

edit: In case you didn't know: unaligned writes are writes to addresses that aren't exact multiples of the data-size. This means that for 32bit writes, the two least significant bits of the address must be 0, and for 16 bit writes the one least significant bit must be 0. IIRC, most ARM-implementations ignore the lower bits, but this might not always be the case. The result is undefined.
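
The alignment rule above can be written as a bit test. The following is a small host-side C sketch, not GBA code; `is_aligned` and `count_unaligned` are hypothetical helper names. It walks the same address range as the loop in the first post, which ends inclusively at 0x06012C00 because of the `ble`, and counts how many 32-bit stores would be unaligned.

```c
#include <assert.h>
#include <stdint.h>

/* Nonzero if addr is an exact multiple of size (size must be 1, 2 or 4). */
static int is_aligned(uint32_t addr, uint32_t size)
{
    assert(size == 1 || size == 2 || size == 4);
    return (addr & (size - 1)) == 0;
}

/* Walk first..last (inclusive) in steps of `step` and count the
   addresses that would be unaligned for a store of `size` bytes. */
static uint32_t count_unaligned(uint32_t first, uint32_t last,
                                uint32_t step, uint32_t size)
{
    uint32_t n = 0;
    assert(step != 0 && first <= last && last <= UINT32_MAX - step);
    for (uint32_t a = first; a <= last; a += step)
        if (!is_aligned(a, size))
            n++;
    return n;
}
```

With a step of 1, three out of every four word stores in that range are unaligned. With a step of 4, none are.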

#127993 - Cearn - Mon May 07, 2007 11:08 am

kusma wrote:
This can be done without a separate instruction by using the !-postfix on the destination-operand. ("str r1, [r2]!" - notice the exclamation-mark)

This isn't quite true. The exclamation mark means 'pre-indexed write-back'. "str Rd, [Rn, Op2]!" means you take the address given by Rn+Op2, store Rd there, and then update Rn to the offset address: Rn = Rn+Op2. But as Op2=0 in kusma's case, nothing would actually change. He probably meant "str r1, [r2,#4]!".

However, that's not quite right either because that's pre-indexing, so you'd miss the first address. What you want is post-indexed write-back: "str r1, [r2], #4".

There's a slightly faster way to loop as well: don't count up to 0x12C00, count down to 0. Also, code needs to be aligned just like data.

Code:
@ --- screen.h (presumably) ---
#define REG_DISPCNT     0x04000000

#define MODE_3          0x0003
#define BG2_ENABLE      0x0400

#define VRAM            0x06000000

@ --- main.s ---
    .text           @ where the code should go
    .align          @ Align to words
    .arm            @ code type (ARM in this case)
    .global main    @ makes main visible for the whole project
main:
    ldr     r1,=REG_DISPCNT
    ldr     r2,=BG2_ENABLE|MODE_3
    str     r2,[r1]

    ldr     r1,=0x03E0
    orr     r1, r1, r1 lsl #16      @ r1 |= r1<<16
    ldr     r2,=VRAM
    ldr     r3,=0x12C00             @ Number of bytes
.Lopp:
        str     r1, [r2], #4        @ *wordptr++ = r1
        subs    r3, r3, #4          @ subtract and compare to 0
        bne     .Lopp
.Lloop:
    b       .Lloop
.pool
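
For readers more at home in C, the difference between the two write-back forms and the count-down loop above can be modelled roughly as follows. This is a host-side sketch only; the function names are invented for illustration, and the address helpers model just the address arithmetic, not a real store.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* "str Rd, [Rn, #off]!": pre-indexed with write-back.
   The store goes to Rn+off, and Rn is updated to that address. */
static uint32_t addr_pre_indexed(uint32_t *rn, uint32_t off)
{
    *rn += off;
    return *rn;
}

/* "str Rd, [Rn], #off": post-indexed.
   The store goes to the old Rn, and Rn is advanced afterwards. */
static uint32_t addr_post_indexed(uint32_t *rn, uint32_t off)
{
    uint32_t addr = *rn;
    *rn += off;
    return addr;
}

/* Count-down fill in the shape of the loop above: post-indexed word
   store, subtract 4, loop while the byte count is non-zero. */
static size_t fill_words(uint32_t *dst, uint32_t value, uint32_t nbytes)
{
    size_t stores = 0;
    assert(nbytes != 0 && nbytes % 4 == 0);  /* otherwise it never hits 0 */
    do {
        *dst++ = value;      /* str  r1, [r2], #4 */
        nbytes -= 4;         /* subs r3, r3, #4   */
        stores++;
    } while (nbytes != 0);   /* bne  .Lopp        */
    return stores;
}
```

Starting from 0x06000000 with an offset of 4, the pre-indexed form stores to 0x06000004 and so skips the first word, while the post-indexed form stores to 0x06000000. Both leave the base register at 0x06000004.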

#127998 - kusma - Mon May 07, 2007 12:36 pm

Cearn wrote:

However, that's not quite right either because that's pre-indexing, so you'd miss the first address. What you want is post-indexed write-back: "str r1, [r2], #4".

Right. Sorry about my confusion here ;)

#128044 - JapanLover - Tue May 08, 2007 12:56 am

thanks for the help guys, i thought it might have something to do with storing too much, i just couldn't figure out where i was doing that...

cearn i got an error when trying to assemble the code you provided...

Quote:
Error: garbage following instruction -- `orr r1,r1,r1 lsl#16'


edit:

another quick question...

i'm now messing around with mode 4...

and from what i understood, you write a byte into the vram that corresponds to a color in the palette, so i used strb to write that byte to vram, but when i did that i got two pixels instead of the one that i wanted, so then i tried strh and got the one pixel i was after...

does that make any sense? it doesn't to me...

#128067 - Cearn - Tue May 08, 2007 7:32 am

JapanLover wrote:
thanks for the help guys, i thought it might have something to do with storing too much, i just couldn't figure out where i was doing that...

cearn i got an error when trying to assemble the code you provided...

Quote:
Error: garbage following instruction -- `orr r1,r1,r1 lsl#16'


`orr r1,r1,r1, lsl #16', then. I often miss the comma before the shift >_>.

JapanLover wrote:

edit:

another quick question...

i'm now messing around with mode 4...

and from what i understood, you write a byte into the vram that corresponds to a color in the palette, so i used strb to write that byte to vram, but when i did that i got two pixels instead of the one that i wanted, so then i tried strh and got the one pixel i was after...

does that make any sense? it doesn't to me...

You can't write in byte-sized chunks to VRAM. Well, you can, but it'd fill both pixels in that halfword with that byte. With strh you have something similar going on as the str case in mode 3: you're still plotting two pixels, but if you're only filling for one you'd get one colored pixel and one zero-pixel.

The byte-write thing is covered in most tutorials. While most GBA tutorials focus on C, they can still help you with the intricacies of the GBA hardware. Tonc covers nearly all aspects of the hardware in detail, and has an assembly chapter to boot.
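
As a rough illustration of the byte-write behaviour described above, and of the usual workaround of reading the halfword, replacing one byte and writing it back, here is a host-side C sketch. It is not code for the hardware: `vram`, `vram_write8` and `m4_plot` are illustrative names, and the array merely stands in for one mode 4 page of 240x160 8-bit pixels.

```c
#include <assert.h>
#include <stdint.h>

#define M4_WIDTH  240
#define M4_HEIGHT 160

/* Stand-in for one mode 4 page: two 8-bit pixels per halfword. */
static uint16_t vram[M4_WIDTH * M4_HEIGHT / 2];

/* What a strb to VRAM ends up doing, per the post above:
   the byte is copied into both halves of the halfword. */
static void vram_write8(uint32_t byte_offset, uint8_t v)
{
    assert(byte_offset < sizeof vram);
    vram[byte_offset / 2] = (uint16_t)(v | (v << 8));
}

/* Read-modify-write plot: only halfword accesses, and the
   neighbouring pixel in the same halfword is preserved. */
static void m4_plot(unsigned x, unsigned y, uint8_t clr)
{
    assert(x < M4_WIDTH && y < M4_HEIGHT);
    uint16_t *dst = &vram[(y * M4_WIDTH + x) / 2];
    if (x & 1)
        *dst = (uint16_t)((*dst & 0x00FFu) | (clr << 8));  /* odd x: high byte */
    else
        *dst = (uint16_t)((*dst & 0xFF00u) | clr);         /* even x: low byte */
}
```

A plain strh of 0x0001 sets the left pixel and zeroes the right one, which is why it appeared to draw a single pixel on a blank screen. The read-modify-write version keeps whatever was already in the other half.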

#128136 - JapanLover - Tue May 08, 2007 10:45 pm

gotcha!

thanks for the help =)

i have been reading Tonc but i've been sorta just scanning it so i missed the part about 'the byte-write thing'

Tonc is really great, it must have taken a lot of effort on your part, so i thank you very much

#128274 - JapanLover - Thu May 10, 2007 6:16 am

alllllright i've got another problem with non-working code again....

not sure at all whats wrong this time...

i'm trying to draw a diagonal line, in mode 4...

i feel like i'm so close to getting it right, but it's off for some reason...

please take a look...

Code:
.text                      @ where the code should go
.align                     @ align to words
.arm                       @ code type (ARM in this case)
.global main               @ makes main visible for the whole project

main:
   ldr    r1,=0x04000000   @ display register location thingy
   ldr    r2,=0x0404       @ BG2 orr MODE4
   str    r2,[r1]          @ put display setting in register

   ldrh   r1,=0x03E0       @ 16bit colour into r1
   ldr    r2,=0x5000002    @ loading palette location into r2
   strh   r1, [r2]         @ and finally putting colour into palette

   ldr    r2,=0x06000000   @ vram loaded into r2
   ldr    r1,=0x01         @ color from palette loaded
   ldr    r3,=0x0A         @ number of pixels to draw
   ldr    r4,=0x0101       @ number to later xor with loaded

.Lop:
   str    r1,[r2],#0xf0    @ palette number put into vram and 240 added to pointer
   eor    r1,r1,r4         @ palette numbers switched with a xor
   str    r1,[r2],#0xf2    @ palette number put into vram and 242 added to pointer
   eor    r1,r1,r4         @ palette numbers switched with a xor
   subs   r3,r3,#1         @ decrease number of pixels left
   bne    .Lop             @ loop until we've drawn all the pixels we wanted

.Lloop:
   b     .Lloop            @ continuous loop
.pool


also, what exactly is .pool for?...

thanks for any replies once again =)

#128284 - Cearn - Thu May 10, 2007 8:35 am

Inside the loop, use strh, not str.

.pool is where literal values (as in ldr Rd,=value) are dumped. I think it's not really required anymore though. Answers to these questions can be found in the manual: http://sourceware.org/binutils/docs-2.17/as/index.html. Manuals for other GNU tools can be found here. Note that they've hidden the GNU assembler documents under binutils, rather than give it its own slot :(.

Other points:
  • The registers start at r0, not r1.
  • Registers r0-r3 and r12 can be used as scratch registers, the rest should leave a function with the same value as they came in. Use the stack for temporary storage. r0-r3 are also used for function parameters and r0 for the return value.
  • Thumb code is recommended for GBA ROM because of its 16bit bus.
  • Memory loads (ldr) are usually slightly slower than mov. The assembler will turn 'ldr r0,=xx' into a mov if possible by itself, but it's still something to be aware of.
These aren't really important now, but could be later.
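
To check the address arithmetic of the diagonal loop with halfword stores, here is a host-side C model. It is a sketch under assumptions, not GBA code: `vram` and `m4_diagonal` are made-up names, and the array stands in for a blank mode 4 page. Pixel (i, i) lives at byte offset 241*i, so consecutive halfword addresses are alternately 240 and 242 bytes apart, and the colour alternates between the low and high byte.

```c
#include <assert.h>
#include <stdint.h>

#define M4_WIDTH  240
#define M4_HEIGHT 160

/* Stand-in for one mode 4 page: two 8-bit pixels per halfword. */
static uint16_t vram[M4_WIDTH * M4_HEIGHT / 2];

/* Draw 2*pairs pixels down the main diagonal using halfword stores only.
   Each store also zeroes the other pixel in its halfword, which is
   fine on a blank screen. */
static void m4_diagonal(unsigned pairs, uint8_t clr)
{
    assert(pairs <= M4_HEIGHT / 2);
    uint32_t off = 0;                               /* byte offset, always even */
    uint16_t val = clr;                             /* colour in the low byte   */
    uint16_t flip = (uint16_t)(clr | (clr << 8));   /* 0x0101 when clr == 1     */
    for (unsigned i = 0; i < pairs; i++) {
        vram[off / 2] = val;  off += 240;   /* strh r1,[r2],#0xf0 */
        val ^= flip;                        /* colour to high byte */
        vram[off / 2] = val;  off += 242;   /* strh r1,[r2],#0xf2 */
        val ^= flip;                        /* colour to low byte  */
    }
}
```

Note that each pass through the loop plots two pixels, so a count of 0x0A gives a line of 20 pixels rather than 10.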

#128381 - JapanLover - Fri May 11, 2007 2:12 am

thanks Cearn =)

and thanks for the extra tips!...

one thing though...

i'm a little confused as to when i'm using thumb or arm...

is it just that some instructions are thumb and some arm? or is there something else to it?

#128433 - tepples - Fri May 11, 2007 1:01 pm

ARM ordinarily uses a 32-bit instruction encoding. Thumb is a compressed 16-bit encoding of common ARM instructions. A subroutine ("function" in C) can be compiled to ARM instructions or to Thumb instructions. (In GCC, this is controlled by the -marm or -mthumb compiler option.) When a subroutine puts the instruction decoder into Thumb mode, the CPU acts as if each Thumb instruction were decoded into one ARM instruction. A subroutine compiled to Thumb instructions may run faster from 16-bit memory than the same subroutine compiled to ARM instructions because each fetch cycle executes faster, pulling one 16-bit unit rather than two. However, in 32-bit memory (such as IWRAM on the GBA or any ARM9 cacheable memory on the DS), ARM instructions run faster because many ARM instructions correspond to two or three Thumb instructions.
_________________
-- Where is he?
-- Who?
-- You know, the human.
-- I think he moved to Tilwick.