gbadev.org forum archive

This is a read-only mirror of the content originally found on forum.gbadev.org (now offline), salvaged from Wayback Machine copies.

Announcements And Comments > devkitARM release 30

#173186 - wintermute - Thu Mar 25, 2010 9:56 pm

A couple of odd bugs crept into release 29, so say hello to release 30.

http://www.devkitpro.org/devkitarm/devkitarm-release-30/

And no, hopefully this 3-day release cycle won't become a habit :p
_________________
devkitPro - professional toolchains at amateur prices
devkitPro IRC support
Personal Blog

#173229 - wintermute - Sun Mar 28, 2010 10:59 am

Bumping this back to the top after replying to posts in the earlier release threads.

#173264 - sverx - Mon Mar 29, 2010 9:58 am

Oops... so I should bump this one again now! :) (Wouldn't it have been easier to make this sticky? ;) )

#173277 - Dwedit - Mon Mar 29, 2010 6:09 pm

If this newlib doesn't contain an optimized memset and memcpy, I'm going to cry. No more excuses, just replace their C memcpy and memset functions with optimized ASM versions already.

edit: What the hell is this?
Code:

lib_a-mempcpy.o:     file format elf32-littlearm


Disassembly of section .text:

00000000 <mempcpy>:
   0:   e352000f    cmp   r2, #15
   4:   e92d0030    push   {r4, r5}
   8:   e1a03000    mov   r3, r0
   c:   e1a04001    mov   r4, r1
  10:   e1a0c002    mov   ip, r2
  14:   9a000002    bls   24 <mempcpy+0x24>
  18:   e1815003    orr   r5, r1, r3
  1c:   e3150003    tst   r5, #3
  20:   0a00000a    beq   50 <mempcpy+0x50>
  24:   e35c0000    cmp   ip, #0
  28:   0a000006    beq   48 <mempcpy+0x48>
  2c:   e3a02000    mov   r2, #0
  30:   e7d41002    ldrb   r1, [r4, r2]
  34:   e7c01002    strb   r1, [r0, r2]
  38:   e2822001    add   r2, r2, #1
  3c:   e152000c    cmp   r2, ip
  40:   1afffffa    bne   30 <mempcpy+0x30>
  44:   e0800002    add   r0, r0, r2
  48:   e8bd0030    pop   {r4, r5}
  4c:   e12fff1e    bx   lr
  50:   e1a04002    mov   r4, r2
  54:   e1a0c001    mov   ip, r1
  58:   e59c5000    ldr   r5, [ip]
  5c:   e5805000    str   r5, [r0]
  60:   e59c5004    ldr   r5, [ip, #4]
  64:   e5805004    str   r5, [r0, #4]
  68:   e59c5008    ldr   r5, [ip, #8]
  6c:   e5805008    str   r5, [r0, #8]
  70:   e2444010    sub   r4, r4, #16
  74:   e59c500c    ldr   r5, [ip, #12]
  78:   e354000f    cmp   r4, #15
  7c:   e580500c    str   r5, [r0, #12]
  80:   e28cc010    add   ip, ip, #16
  84:   e2800010    add   r0, r0, #16
  88:   8afffff2    bhi   58 <mempcpy+0x58>
  8c:   e2422010    sub   r2, r2, #16
  90:   e1a00222    lsr   r0, r2, #4
  94:   e0425200    sub   r5, r2, r0, lsl #4
  98:   e2800001    add   r0, r0, #1
  9c:   e1a00200    lsl   r0, r0, #4
  a0:   e3550003    cmp   r5, #3
  a4:   e0814000    add   r4, r1, r0
  a8:   e1a0c005    mov   ip, r5
  ac:   e0830000    add   r0, r3, r0
  b0:   9affffdb    bls   24 <mempcpy+0x24>
  b4:   e3a01000    mov   r1, #0
  b8:   e7942001    ldr   r2, [r4, r1]
  bc:   e7802001    str   r2, [r0, r1]
  c0:   e2811004    add   r1, r1, #4
  c4:   e0612005    rsb   r2, r1, r5
  c8:   e3520003    cmp   r2, #3
  cc:   8afffff9    bhi   b8 <mempcpy+0xb8>
  d0:   e245c004    sub   ip, r5, #4
  d4:   e1a0212c    lsr   r2, ip, #2
  d8:   e2823001    add   r3, r2, #1
  dc:   e1a03103    lsl   r3, r3, #2
  e0:   e0844003    add   r4, r4, r3
  e4:   e04cc102    sub   ip, ip, r2, lsl #2
  e8:   e0800003    add   r0, r0, r3
  ec:   eaffffcc    b   24 <mempcpy+0x24>


I don't see any ldmia / stmia in 32-byte chunks anywhere here. I just see it switching to word-at-a-time copies when both pointers are word-aligned, and nothing further than that.
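
For reference, the control flow of that listing reads roughly like the C below. This is only a sketch reconstructed from the disassembly above, not newlib's actual source; the name sketch_mempcpy and the word_t typedef are made up for illustration, and the may_alias attribute assumes a GCC-compatible compiler.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper type: a 32-bit word that GCC allows to alias other
   types, so the word copies below do not break strict aliasing. */
typedef uint32_t __attribute__((may_alias)) word_t;

/* Sketch of the structure seen in the disassembly. Returns dst + n, as
   mempcpy does. The name is made up to avoid clashing with libc. */
void *sketch_mempcpy(void *dst, const void *src, size_t n)
{
    unsigned char *d = dst;
    const unsigned char *s = src;

    /* Word path only when n > 15 and both pointers are 4-byte aligned
       (cmp r2, #15, then orr + tst #3 in the listing). */
    if (n > 15 && (((uintptr_t)d | (uintptr_t)s) & 3u) == 0) {
        word_t *dw = (word_t *)d;
        const word_t *sw = (const word_t *)s;

        /* Four separate ldr/str pairs per pass: 16 bytes at a time. */
        while (n > 15) {
            dw[0] = sw[0];
            dw[1] = sw[1];
            dw[2] = sw[2];
            dw[3] = sw[3];
            dw += 4;
            sw += 4;
            n -= 16;
        }
        /* Then single words while more than 3 bytes remain. */
        while (n > 3) {
            *dw++ = *sw++;
            n -= 4;
        }
        d = (unsigned char *)dw;
        s = (const unsigned char *)sw;
    }

    /* Byte loop: short or unaligned copies, and the tail of aligned ones. */
    while (n > 0) {
        *d++ = *s++;
        n--;
    }
    return d;
}
```

The 16-byte loop is the spot where a hand-written ARM version would typically use an ldmia / stmia pair instead of four separate loads and stores.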
_________________
"We are merely sprites that dance at the beck and call of our button pressing overlord."

#173281 - wintermute - Mon Mar 29, 2010 7:30 pm

If you don't have something constructive to say then please don't bother. I'm getting really seriously bored with you.

Something you need to consider: devkitARM is not now, nor has it ever been, only a DS compiler. It is also capable of producing code for any ARMv4T+ processor in ARM or Thumb mode, with both little- and big-endian support, as well as several of the Cortex chips. If you want an "optimised" memcpy and memset, feel free to write them yourself. If you want to see them in devkitARM, then please ensure that your code will run on any ARM chip supported by devkitARM.

#173288 - keldon - Mon Mar 29, 2010 11:47 pm

Is there an easy way to select libraries at compile/make time so that optimized code can be linked instead? I've seen it done on one commercial toolchain for an embedded system and it was seamless. That way everyone can be happy!
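
One low-tech way to get that kind of selection, sketched under assumptions: route copies through a project-level macro and let a build flag decide what it expands to. USE_OPTIMISED_COPY, platform_memcpy, project_memcpy and copy_halfwords are all hypothetical names, not anything devkitARM or newlib provides.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical switch: pass -DUSE_OPTIMISED_COPY in CFLAGS to enable it. */
#ifdef USE_OPTIMISED_COPY
/* Hypothetical routine, assumed to live in a platform-specific assembly
   or C file that is added to the build only for that target. */
void *platform_memcpy(void *dst, const void *src, size_t n);
#define project_memcpy(d, s, n) platform_memcpy((d), (s), (n))
#else
/* Default: fall back to the portable libc routine. */
#define project_memcpy(d, s, n) memcpy((d), (s), (n))
#endif

/* Example caller: it never names the implementation directly, so the
   choice is made entirely by the build flags. */
void copy_halfwords(unsigned short *dst, const unsigned short *src,
                    size_t count)
{
    project_memcpy(dst, src, count * sizeof *src);
}
```

With this layout the generic library stays untouched, and a project that only targets one chip could opt in to its own routine from the makefile.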

#173684 - FluBBa - Fri Apr 23, 2010 10:03 am

Code:

#---------------------------------------------------------------------------------
# options for code generation
#---------------------------------------------------------------------------------
ARCH   :=   -mthumb -mthumb-interwork

CFLAGS   :=   -g -Wall -O2\
          -march=armv5te -mtune=arm946e-s -fomit-frame-pointer\
         -ffast-math \
         $(ARCH)

CFLAGS   +=   $(INCLUDE) -DARM9
CXXFLAGS   := $(CFLAGS) -fno-rtti -fno-exceptions

ASFLAGS   :=   -g $(ARCH) $(INCLUDE)
LDFLAGS   =   -specs=ds_arm9.specs -g $(ARCH) -mno-fpu -Wl,-Map,$(notdir $*.map)


Shouldn't "-march=armv5te" be placed in ARCH, so that it also reaches ASFLAGS and we can use ARM9 opcodes from the assembler? (This is how it appears in all the example makefiles...)
Or can I screw something up by doing that?
_________________
I probably suck, my not is a programmer.