#39898 - Stan64 - Wed Apr 13, 2005 9:25 am
I was thinking that with PassMe, you could make an emulator that triggers the wireless connection and fools the GBA portion that you load through the system into thinking there is a link port. For example, you play on the top screen and choose "rooms" and so on with the bottom screen. Then you have X and Y to do whatever you want with.
It should work. =)
#39903 - bluknight - Wed Apr 13, 2005 11:19 am
On what basis do you say it should work? I agree that it would be nice if we could get it working, but should it work? Where would you even put the code for it? I think there would be too much to fit in the firmware, and you would already be using the GBA slot...
#39905 - netdroid9 - Wed Apr 13, 2005 11:29 am
Stan64 wrote: |
I was thinking that with PassMe, you could make an emulator that triggers the wireless connection and fools the GBA portion that you load through the system into thinking there is a link port. For example, you play on the top screen and choose "rooms" and so on with the bottom screen. Then you have X and Y to do whatever you want with. |
OI! That was my idea :)!
My idea is basically the same, except it'd support GB and GBC emulation (Probably not going to happen).
Code would go on a DS flash cart. It'll happen. Use a GBA flash cart to grab the GBA BIOS, flick over to the DS cart and download the BIOS off the flash cart... Easy as pie :). Load a homebrew game, start up some kind of wireless broadcast with the ROM and a cut-down emulator and get killed at pong due to lag :). (Most commercial ROMs are probably heaps bigger than 4 MB, but simple homebrew games should be small enough.)
Simply emulate the hardware with the ARM7 (Like the original GBA emulation) and it'd work. Assuming you can get direct access to the ARM7. And do some basic WiFi processing on the ARM9.
Of course the theory would be simple, but the emulator itself would be hell to program.
#39936 - Stan64 - Wed Apr 13, 2005 6:50 pm
You may not need the emulator. The DS has hardware to play GBA games. I think there is only a raw code unlock for both GB and GBC games, and for the machine to use some kind of communication with the multiplayer parts in games. The only thing you need then is an app that loads and releases those locks and then acts like a wireless host. You get the picture.
We just have to wait for DS flashcarts. =)
#39948 - octopusfluff - Wed Apr 13, 2005 9:14 pm
Stan64 wrote: |
You may not need the emulator. The DS has hardware to play GBA games. I think there is only a raw code unlock for both GB and GBC games, and for the machine to use some kind of communication with the multiplayer parts in games. The only thing you need then is an app that loads and releases those locks and then acts like a wireless host. You get the picture.
We just have to wait for DS flashcarts. =) |
Problem: DS has hardware for GBA games, yes, but that hardware is largely controlled by the GBA application during that time.
Problem: The DS does /not/ have the hardware to execute GB or GBC games.
Problem: An app might well run on the ARM9 while the ARM7 is doing the GBA work, but the WiFi is believed to be only accessible on the ARM7, which is busy running GBA code.
#39963 - lambi1982 - Wed Apr 13, 2005 10:35 pm
I do believe no one really knows anything about the DS YET....
_________________
Who, Me?
#40005 - Stan64 - Thu Apr 14, 2005 12:27 pm
Why not let ARM9 do the GBA stuff then and run on ARM7? There has to be some options.
#40006 - Sebbo - Thu Apr 14, 2005 12:35 pm
You're right, octopusfluff, the DS doesn't have the hardware for GB and GBC games. The GBA even required a Z80-like CPU as well as the ARM7 for that, though you might be able to emulate it.
Stan64, that'd be my first guess, but I think there may be some problems with the different instruction sets on the ARM9 and ARM7, unless you ran an emulator on the ARM9. I dunno.
#40010 - Stan64 - Thu Apr 14, 2005 1:37 pm
Yeah. Then you may have to use an emulator. Would you then be able to run GB/GBC games?
#40048 - tepples - Thu Apr 14, 2005 7:11 pm
That would be called "porting Goomba to the Nintendo DS". It's already half done, as Goomba runs many monochrome Game Boy games on the GBA.
_________________
-- Where is he?
-- Who?
-- You know, the human.
-- I think he moved to Tilwick.
#40057 - Stan64 - Thu Apr 14, 2005 8:36 pm
I would LOOVE to play F1 in 4 player. =)
#40058 - ampz - Thu Apr 14, 2005 8:41 pm
As pointed out soooo many times before in this forum. The GBA link port can NOT be tunneled over any kind of network. Let alone wireless.
If you want the explanation as to why: Do a search.
#40079 - Stan64 - Thu Apr 14, 2005 11:19 pm
I don't get it. If you emulate the GB/GBC and GBA system, you could emulate the wireless part if you know how to use it. It is around 54 Mbit/s and I don't think that the GBA/GB link cable handled greater speeds.
EDIT: I searched the forum but didn't find anything.
#40080 - ampz - Thu Apr 14, 2005 11:36 pm
Stan64 wrote: |
I don't get it. If you emulate the GB/GBC and GBA system, you could emulate the wireless part if you know how to use it. It is around 54 Mbit/s and I don't think that the GBA/GB link cable handled greater speeds.
EDIT: I searched the forum but didn't find anything. |
No, even gigabit ethernet would NOT be fast enough.
I entered the keywords "link" and "tunnel", and look what I found:
http://forum.gbadev.org/viewtopic.php?t=3019
http://forum.gbadev.org/viewtopic.php?t=5114
Now read those two threads very carefully, especially the first one. Please?
There are even more threads on the subject.
#40086 - tepples - Fri Apr 15, 2005 12:33 am
But if you emulate all GBs on each host machine (the "TGB Dual method" as Forgotten used to call it), you can just exchange controller data 30 or 60 times a second, which I'm guessing is well within WiFi tolerances.
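A minimal sketch of that lockstep idea, assuming a deterministic toy "emulator": the `step` function and the 16-bit key mask layout are made up for illustration and are not taken from TGB Dual or Goomba. Each host runs both machines locally, and only the pad state crosses the link.

```python
import struct

FRAMES_PER_SECOND = 60   # upper exchange rate mentioned in the post
KEY_MASK_BYTES = 2       # assumption: one 16-bit key mask per frame


def pack_keys(mask: int) -> bytes:
    """Serialise one frame of controller state as a 16-bit little-endian word."""
    return struct.pack("<H", mask & 0xFFFF)


def unpack_keys(payload: bytes) -> int:
    """Recover the key mask from a received packet."""
    return struct.unpack("<H", payload)[0]


def step(state: int, keys_a: int, keys_b: int) -> int:
    """Hypothetical deterministic frame update standing in for two emulated Game Boys."""
    return (state * 31 + keys_a * 7 + keys_b) & 0xFFFFFFFF


def run_host(local_keys, remote_packets, local_is_player_a: bool) -> int:
    """Advance the shared simulation one frame per (local, remote) input pair."""
    state = 0
    for mine, packet in zip(local_keys, remote_packets):
        theirs = unpack_keys(packet)
        a, b = (mine, theirs) if local_is_player_a else (theirs, mine)
        state = step(state, a, b)
    return state


keys_a = [0x001, 0x010, 0x000, 0x201]
keys_b = [0x000, 0x002, 0x002, 0x100]

# Each host receives only the other player's packed key masks.
state_on_a = run_host(keys_a, [pack_keys(k) for k in keys_b], True)
state_on_b = run_host(keys_b, [pack_keys(k) for k in keys_a], False)

# Payload only, per direction, before any packet overhead.
bytes_per_second = FRAMES_PER_SECOND * KEY_MASK_BYTES
```

Both hosts end in the same state because they apply the same inputs in the same order, and the payload works out to 120 bytes per second each way.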
#40092 - octopusfluff - Fri Apr 15, 2005 1:38 am
tepples wrote: |
But if you emulate all GBs on each host machine (the "TGB Dual method" as Forgotten used to call it), you can just exchange controller data 30 or 60 times a second, which I'm guessing is well within WiFi tolerances. |
Just as long as people realize that's feasible for GB games, MAYBE GBC (that double clock could make it too difficult depending on what other overhead is introduced), and not GBA games.
Since people seem obsessed with GBA multiplayer over wireless.
#40095 - tepples - Fri Apr 15, 2005 1:56 am
octopusfluff wrote: |
Just as long as people realize that's feasible for GB games, MAYBE GBC (that double clock could make it too difficult depending on what other overhead is introduced) |
Target: GB to GBC
Given the more complex DMA model and the more complex video, this might need double the cycles for both the CPU and the PPU.
Target: GBC to dual GBC
You have to do full video emulation only for one of the two GBC units. You can get by with just emulating vblank, vcount, and hblank for the other. Therefore, you'll probably need slightly less than double host CPU speed.
Host: GBA to Nintendo DS
From GB to dual GBC would approximately quadruple the required host CPU speed. Guess what? The ARM9 in the Nintendo DS is clocked at quadruple the rate of the ARM7 in the GBA, plus the ARM9's instructions per clock is greater.
Verdict: If SNES Advance is doable, then Goomba Dual for DS is doable.
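The arithmetic behind that estimate can be written out as a rough sketch. The two scaling factors are the guesses from this post (with "slightly less than double" rounded up to 2), and the clock figures are the commonly quoted ones for the GBA's ARM7 and the DS's ARM9. It ignores instructions-per-clock, caches and memory waitstates entirely.

```python
GBA_ARM7_HZ = 16_777_216   # GBA CPU clock, about 16.78 MHz
DS_ARM9_HZ = 67_027_964    # DS main CPU clock, about 67.03 MHz

GB_TO_GBC_FACTOR = 2.0       # guess from the post: CPU and PPU work roughly doubles
SINGLE_TO_DUAL_FACTOR = 2.0  # guess from the post, rounded up from "slightly less than double"

# Host speed needed relative to what Goomba uses on a GBA today.
required_speedup = GB_TO_GBC_FACTOR * SINGLE_TO_DUAL_FACTOR

# Raw clock ratio between the two host CPUs.
clock_ratio = DS_ARM9_HZ / GBA_ARM7_HZ

# Values near 1.0 mean the clock increase alone roughly covers the extra work.
headroom = clock_ratio / required_speedup

print(f"required x{required_speedup:.1f}, available x{clock_ratio:.2f}, headroom {headroom:.2f}")
```

On clock speed alone this comes out roughly break-even, so any per-clock advantage of the ARM9 would be the margin.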
Quote: |
and not GBA games. |
Granted. People who want wireless GBA play should go get a GBA SP and the Majesco adapter.
#40096 - Sukanu - Fri Apr 15, 2005 2:32 am
Back when Xlink was still claiming that they had successfully tunneled the DS, I said it was not feasibly possible due to the latencies of the internet. praxis and I got into a small flame war (I believe he moved to these forums (no hard feelings, praxis)), and he said that the latency of wireless would be much greater than over the internet, so I wrote a report on tunneling with my own experiments and put it on the site. (This was done very spur of the moment and written at 2am, so forgive any small errors.)
(also what set me off to write this was praxis' last comment "do you even have a wireless network at home?" hence the quote on the last page)
http://people.msoe.edu/~chambers/writeings/Latency.doc
#40114 - ampz - Fri Apr 15, 2005 6:32 am
tepples wrote: |
[*snip*, plus the ARM9's instructions per clock is greater. |
No, I think it is worse due to the deeper pipeline.
#40117 - Extreme Coder - Fri Apr 15, 2005 7:05 am
Maybe one (host) DS should emulate both GBA games and the cable between them, send the video and sound signals to the other (slave) DS over WiFi, and receive the button signals from it. I don't know if the DS is that powerful though.
VBA Link emulates up to 4 GBAs on one PC, and emulates GBA link on LAN. Same could be done with DS.
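A back-of-the-envelope sketch of what streaming the second player's picture would cost, uncompressed. The GBA screen size and colour depth are standard figures. The 2 Mbit/s link rate is only an assumption about what the DS radio can sustain, so treat the final ratio as illustrative.

```python
WIDTH, HEIGHT = 240, 160   # GBA LCD resolution in pixels
BITS_PER_PIXEL = 15        # GBA colour depth
FPS = 60                   # rounded frame rate

# Uncompressed video stream for one remote screen.
raw_bits_per_second = WIDTH * HEIGHT * BITS_PER_PIXEL * FPS

# Assumption, not a measured figure: usable wireless throughput.
ASSUMED_LINK_BITS_PER_SECOND = 2_000_000

# How much the video would have to shrink to fit, before sound is added.
compression_needed = raw_bits_per_second / ASSUMED_LINK_BITS_PER_SECOND

print(f"{raw_bits_per_second / 1e6:.2f} Mbit/s raw, needs about {compression_needed:.1f}:1 compression")
```

Sending button states the other way is negligible by comparison, so the video direction is the part that decides whether this approach is workable.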
#40118 - DekuTree64 - Fri Apr 15, 2005 7:10 am
ampz wrote: |
tepples wrote: | [*snip*, plus the ARM9's instructions per clock is greater. |
No, I think it is worse due to the deeper pipeline. |
I believe ARM9 will always get more instructions per clock. Most instructions take the same number of cycles as ARM7, except that you don't have to wait for them to finish before starting on the next one. With clever assembly coding, you get 1 instruction for every clock. It will stop and wait if the result of the instruction is needed before it's finished.
For example, if you load a value and add it to something immediately, it will wait for the load to finish, just like ARM7. But if you load the value, and then run some other instructions unrelated to the result of the load, it will only occupy the processor for 1 cycle.
The pipeline is 5 levels instead of the ARM7's 3, but according to http://www.arm.com/pdfs/DDI0165B_9ES_trm.pdf, branches still only take 3 cycles. Probably because it only has to go through the fetch and decode stages before it can start executing, just like ARM7.
One of these days I'll have to run some of my old speed tests to see exactly how it acts. In particular, I'm curious how loads behave with waitstates, and if you do lots of loads back to back. I'd assume they'd all still take 1 cycle, provided that you wait long enough for the result, but I need to see it for myself to be sure.
_________________
___________
The best optimization is to do nothing at all.
Therefore a fully optimized program doesn't exist.
-Deku
#40121 - ampz - Fri Apr 15, 2005 7:33 am
DekuTree64 wrote: |
ampz wrote: | tepples wrote: | [*snip*, plus the ARM9's instructions per clock is greater. |
No, I think it is worse due to the deeper pipeline. |
I believe ARM9 will always get more instructions per clock. Most instructions take the same number of cycles as ARM7, except that you don't have to wait for them to finish before starting on the next one. With clever assembly coding, you get 1 instruction for every clock. It will stop and wait if the result of the instruction is needed before it's finished.
For example, if you load a value and add it to something immediately, it will wait for the load to finish, just like ARM7. But if you load the value, and then run some other instructions unrelated to the result of the load, it will only occupy the processor for 1 cycle.
The pipeline is 5 levels instead of the ARM7's 3, but according to http://www.arm.com/pdfs/DDI0165B_9ES_trm.pdf, branches still only take 3 cycles. Probably because it only has to go through the fetch and decode stages before it can start executing, just like ARM7.
One of these days I'll have to run some of my old speed tests to see exactly how it acts. In particular, I'm curious how loads behave with waitstates, and if you do lots of loads back to back. I'd assume they'd all still take 1 cycle, provided that you wait long enough for the result, but I need to see it for myself to be sure. |
I think you have not entirely understood the principle of a CPU pipeline and how the number of cycles per instruction are documented.
The number of cycles stated in the manual is the number of cycles an instruction requires to execute assuming there is no interlock. The next instruction will not execute in "parallel" with these cycles.
As you can see, the number of cycles per instruction for the ARM9 is the same as the number of cycles on the ARM7. This means they will perform identically at a given clock speed assuming there is NO interlock.
Interlocks will occur if instruction A is a load or a complicated arithmetic instruction and instruction B needs the result from instruction A. Interlock means the instruction will take one additional cycle to complete. That is, if the manual says it will take 2 cycles to complete, it will require 3 cycles instead.
A pipeline flush is another thing, which occurs mainly when PC is loaded with a new value (this does not include branches). A pipeline flush takes 4 additional cycles.
#40123 - DekuTree64 - Fri Apr 15, 2005 8:38 am
ampz wrote: |
The number of cycles stated in the manual is the number of cycles an instruction requires to execute assuming there is no interlock. The next instruction will not execute in "parallel" with these cycles. |
Are you sure about that? To quote http://www.arm.com/pdfs/DVI0018B_946E_S(R0)_po.pdf,
Quote: |
By overlapping the various stages of the pipeline, ARM9E-S maximizes the clock rate achievable to execute each instruction. It delivers a throughput approaching one instruction per cycle |
Sounds like parallel execution to me.
Also, in that document I linked in the previous post, it specifically lists all the possible cycle counts for LDR, and the interlock conditions causing them. Where did you read that they take more?
Granted, I'm not entirely confident in my understanding of it, but everything I've read seems to agree on the faster timings. I'll whip up a quick tester tomorrow and find out for sure.
#40124 - ampz - Fri Apr 15, 2005 8:56 am
DekuTree64 wrote: |
ampz wrote: | The number of cycles stated in the manual is the number of cycles an instruction requires to execute assuming there is no interlock. The next instruction will not execute in "parallel" with these cycles. |
Are you sure about that? To quote http://www.arm.com/pdfs/DVI0018B_946E_S(R0)_po.pdf,
Quote: | By overlapping the various stages of the pipeline, ARM9E-S maximizes the clock rate achievable to execute each instruction. It delivers a throughput approaching one instruction per cycle |
Sounds like parallel execution to me. |
Yes, different stages of the pipeline are executed in parallel, but the cycle count refers to the number of cycles required in the execute stage (and the memory stage in case of interlocking) of the pipeline. Only one instruction is in the execute stage at any given time.
Quote: |
Also, in that document I linked in the previous post, it specifically lists all the possible cycle counts for LDR, and the interlock conditions causing them. Where did you read that they take more? |
Interlocking is better documented than I remembered. My argument is still valid: ARM9 instructions can suffer higher interlock penalty than the equivalent ARM7 instruction.
#40165 - tepples - Fri Apr 15, 2005 4:43 pm
ampz wrote: |
Interlocks will occur if instruction A is a load or a complicated arithmetic instruction and instruction B needs the result from instruction A. |
You claim that interlock happens in the case of "complicated" expressions. The ARM7 will stall most notably on a load (2 cycles), store (1 cycle), register specified shift (1 cycle), branch (2 cycles), multiply (1 to 4 cycles), or multiply-add (2 to 5 cycles). But which expressions are "complicated" enough to trigger interlock in the ARM9 but not in the ARM7?
Quote: |
ARM9 instructions can suffer higher interlock penalty than the equivalent ARM7 instruction. |
I just looked at the ARM v5T PDF (http://www.arm.com/pdfs/DVI0018B_946E_S(R0)_po.pdf), and I finally know what "TCM" is: it's what GBA developers have been calling "IWRAM" (p. 3). The 5-stage pipeline (fetch, decode, execute, memory, register write) (p. 9) resembles the MIPS pipeline, which is explained in detail in Computer Organization and Design, Second Edition. The instruction timing table (p. 10) resembles that of ARM7, except for the following:
- mul takes about one cycle less on ARM9 if the result isn't immediately used.
- muls, which modifies the flags, may take a cycle longer than it did on ARM7.
The Harvard architecture of the ARM9, which uses separate internal buses for instruction and data streams, allows the following further improvements to memory access speed, which should be very nice to emulators:
- str is one cycle faster (1 cycle vs. 2 on ARM7).
- stm is ridiculously faster (1 cycle vs. 1 + n_words on ARM7).
- ldr and ldm are one or two cycles faster (1 + n_words cycles, 1 less if the last loaded value is not immediately used vs. 2 + n_words on ARM7).
- swp is faster (3 cycles, 1 less if result is not immediately used, vs. 4 on ARM7)
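To make the load and store figures concrete, here is a small sketch that just encodes the cycle counts as quoted in the list above. They are this post's reading of the timing table, not independently checked, and stm is left out. The function names are made up for illustration.

```python
ARM7_STR_CYCLES = 2   # single store on ARM7, as quoted above
ARM9_STR_CYCLES = 1   # single store on ARM9, as quoted above


def arm7_load_cycles(n_words: int) -> int:
    """ldr/ldm on ARM7: 2 + n_words cycles."""
    return 2 + n_words


def arm9_load_cycles(n_words: int, result_used_immediately: bool) -> int:
    """ldr/ldm on ARM9: 1 + n_words cycles, one less if the last
    loaded value is not needed by the very next instruction."""
    cycles = 1 + n_words
    return cycles if result_used_immediately else cycles - 1


# One word copied with a load followed by a store.
arm7_copy = arm7_load_cycles(1) + ARM7_STR_CYCLES

# Same pair on ARM9 when the store needs the loaded value right away.
arm9_copy_interlocked = arm9_load_cycles(1, True) + ARM9_STR_CYCLES

# Same pair on ARM9 when an unrelated instruction is scheduled in between
# (that extra instruction's own cycle is not counted here).
arm9_copy_scheduled = arm9_load_cycles(1, False) + ARM9_STR_CYCLES

print(arm7_copy, arm9_copy_interlocked, arm9_copy_scheduled)
```

If those figures hold, a memory-heavy inner loop of the kind an emulator runs gains per clock even before the higher clock rate is counted, provided loads are scheduled away from their first use.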
Extreme Coder wrote: |
VBA Link emulates up to 4 GBAs on one PC |
A PC also has a 2 GHz CPU, unlike the Nintendo DS.
#40176 - ampz - Fri Apr 15, 2005 6:42 pm
tepples wrote: |
I just looked at the ARM v5T PDF (http://www.arm.com/pdfs/DVI0018B_946E_S(R0)_po.pdf), and I finally know what "TCM" is: it's what GBA developers have been calling "IWRAM" (p. 3). The 5-stage pipeline (fetch, decode, execute, memory, register write) (p. 9) resembles the MIPS pipeline, which is explained in detail in Computer Organization and Design, Second Edition. The instruction timing table (p. 10) resembles that of ARM7, except for the following:
- mul takes about one cycle less on ARM9 if the result isn't immediately used.
- muls, which modifies the flags, may take a cycle longer than it did on ARM7.
The Harvard architecture of the ARM9, which uses separate internal buses for instruction and data streams, allows the following further improvements to memory access speed, which should be very nice to emulators:
- str is one cycle faster (1 cycle vs. 2 on ARM7).
- stm is ridiculously faster (1 cycle vs. 1 + n_words on ARM7).
- ldr and ldm are one or two cycles faster (1 + n_words cycles, 1 less if the last loaded value is not immediately used vs. 2 + n_words on ARM7).
- swp is faster (3 cycles, 1 less if result is not immediately used, vs. 4 on ARM7)
|
stm takes n_words cycles on the ARM9 if the number of words is >1. It takes 2 cycles if the number of words is one.
You are right, the separated instruction and data paths together with the cache will make a difference.
#40186 - Stan64 - Fri Apr 15, 2005 7:38 pm
OK, I understand now that it is a latency problem. But how much latency does wireless have when both are in the same room? In the other threads they talked about internet tunneling. I'm not talking about that.
And I think if you have two good computers with the same NIC and gigabit, it would have low enough latency for tunneling. ;) But then there is no point. You can have your GBA SP and play with a cable anyway. But I'm talking about not having to carry around your DS, SP and PSP. >_< I don't have room for them all in my pockets.
#40193 - tepples - Fri Apr 15, 2005 8:13 pm
Stan64 wrote: |
OK, I understand now that it is a latency problem. But how much latency does wireless have when both are in the same room? |
I'd assume less than 10ms, where 15ms is about the upper bound for the TGB Dual method.
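As a quick check of those two numbers against the frame time, here is a sketch. The 10 ms and 15 ms values are the guesses from this post, and 60 Hz is the higher of the two exchange rates mentioned earlier.

```python
FRAME_RATE_HZ = 60
frame_period_ms = 1000 / FRAME_RATE_HZ   # time available per emulated frame

ASSUMED_WIFI_LATENCY_MS = 10   # guess from the post, not a measurement
TGB_DUAL_BOUND_MS = 15         # rough upper bound quoted in the post

# Time left over for emulation hiccups, retries and so on.
slack_ms = TGB_DUAL_BOUND_MS - ASSUMED_WIFI_LATENCY_MS

# The scheme works only if the controller packet arrives within the bound,
# and the bound itself sits inside one frame.
fits = ASSUMED_WIFI_LATENCY_MS <= TGB_DUAL_BOUND_MS < frame_period_ms

print(f"frame {frame_period_ms:.2f} ms, slack {slack_ms} ms, fits: {fits}")
```

The bound lands just under one 60 Hz frame, which is presumably where it comes from: the other player's input has to arrive before the next frame is emulated.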
Quote: |
But I'm talking about not having to carry around your DS, SP and PSP. >_< |
How'd you get so rich? I'd pay to rent a PSP to play Lumines, but I don't think one game is worth $290.
#40225 - Stan64 - Sat Apr 16, 2005 11:15 am
I have a job. =) I also paid the release price for a first-week PSP from Lik-Sang: $529.
And I have over 20 games for PSP and DS. =D
And how low latency do you need to be able to transfer gba/gb/gbc port data?
#40231 - ampz - Sat Apr 16, 2005 1:27 pm
Stan64 wrote: |
And how low latency do you need to be able to transfer gba/gb/gbc port data? |
1 µs could work.
#40234 - Stan64 - Sat Apr 16, 2005 2:44 pm
Ouch. =/
#40269 - octopusfluff - Sat Apr 16, 2005 11:15 pm
ampz wrote: |
Stan64 wrote: | And how low latency do you need to be able to transfer gba/gb/gbc port data? |
1 µs could work. |
I think with the clock rate of GB/GBC you have a little more tolerance than that. Maybe as much as 4 or 5. =>
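For a sense of scale, this sketch converts latencies into CPU cycles on the machine being tunneled. The clock rates are the usual GB and GBA figures. The 1 µs and 5 µs values are the tolerances floated in this thread, and 10 ms is the same-room Wi-Fi guess from earlier in the thread, so none of them are measurements.

```python
GB_CLOCK_HZ = 4_194_304     # original Game Boy CPU clock
GBA_CLOCK_HZ = 16_777_216   # GBA CPU clock


def cycles_during(latency_us: int, clock_hz: int) -> float:
    """CPU cycles that elapse while one message is in flight."""
    return latency_us * clock_hz / 1_000_000


gba_cycles_at_1us = cycles_during(1, GBA_CLOCK_HZ)        # tolerance suggested for raw tunneling
gb_cycles_at_5us = cycles_during(5, GB_CLOCK_HZ)          # looser tolerance suggested for GB/GBC
gba_cycles_at_10ms = cycles_during(10_000, GBA_CLOCK_HZ)  # guessed same-room Wi-Fi latency

# How far the guessed Wi-Fi latency is from the 1 microsecond target.
latency_gap = 10_000 // 1

print(f"{gba_cycles_at_1us:.1f} cycles vs {gba_cycles_at_10ms:.0f} cycles, gap x{latency_gap}")
```

A gap of four orders of magnitude is why tunneling the raw port looks hopeless over a network, while exchanging controller state once per frame does not.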
#40276 - sillyb - Sun Apr 17, 2005 12:22 am
Are you guys saying that the GBA SP wireless adapter does less than 4 µs??
#40279 - ampz - Sun Apr 17, 2005 12:53 am
sillyb wrote: |
Are you guys saying that the GBA SP wireless adapter does less than 4 µs?? |
No.
#40951 - Fox5 - Sat Apr 23, 2005 10:22 pm
ampz wrote: |
As pointed out soooo many times before in this forum. The GBA link port can NOT be tunneled over any kind of network. Let alone wireless.
If you want the explanation as to why: Do a search. |
Majesco has a wireless multiplayer adapter for the GBA that works with any link cable game. I wonder how it works. (Do they use a custom wireless protocol?)
#40958 - tepples - Sun Apr 24, 2005 12:25 am
Fox5 wrote: |
Majesco has a wireless multiplayer adapter for the GBA that works with any link cable game. I wonder how it works. |
It seems to tunnel at layer 1 using custom hardware. Wi-Fi hardware would not be suitable.
#41022 - ampz - Sun Apr 24, 2005 8:34 am
That Majesco adapter can't work with much of a distance. A single transmission error would crash your game.
#41024 - tepples - Sun Apr 24, 2005 8:45 am
ampz wrote: |
A single transmission error would crash your game. |
There are occasional bit errors in even wired communication. GBA games are built to handle this, freezing a bit for a resend should the CRC not match.
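A minimal sketch of the check-and-resend pattern described here. Which checksum any given game uses isn't stated, so CRC-CCITT via Python's `binascii.crc_hqx` is only a stand-in, and the packet layout, the `flaky_channel` helper and the retry limit are all made up for illustration.

```python
import binascii


def make_packet(payload: bytes) -> bytes:
    """Append a 16-bit CRC to the payload (hypothetical packet layout)."""
    crc = binascii.crc_hqx(payload, 0xFFFF)
    return payload + crc.to_bytes(2, "big")


def check_packet(packet: bytes):
    """Return the payload if the CRC matches, otherwise None."""
    payload, received_crc = packet[:-2], int.from_bytes(packet[-2:], "big")
    if binascii.crc_hqx(payload, 0xFFFF) == received_crc:
        return payload
    return None


def send_with_resend(payload: bytes, channel, max_attempts: int = 5):
    """Resend until the receiver sees a good CRC; the game 'freezes' meanwhile."""
    for attempt in range(1, max_attempts + 1):
        result = check_packet(channel(make_packet(payload)))
        if result is not None:
            return result, attempt
    raise IOError("link lost: too many bad packets in a row")


def flaky_channel(corrupt_first_n: int):
    """Build a test channel that flips one bit in the first N packets it carries."""
    remaining = [corrupt_first_n]

    def channel(packet: bytes) -> bytes:
        if remaining[0] > 0:
            remaining[0] -= 1
            return bytes([packet[0] ^ 0x01]) + packet[1:]
        return packet

    return channel


received, attempts = send_with_resend(b"\x12\x34\x56\x78", flaky_channel(2))
```

With two corrupted transmissions the payload gets through intact on the third attempt, which from the player's side would look like a short stutter rather than a crash.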
#41059 - octopusfluff - Sun Apr 24, 2005 7:47 pm
tepples wrote: |
There are occasional bit errors in even wired communication. GBA games are built to handle this, freezing a bit for a resend should the CRC not match. |
Having played Mario Kart Super Circuit over a 100 foot link cable made of cat5, I can vouch for this.
The game LARGELY played correctly, but would sometimes stutter a little when there was a large amount of EMI/RFI in the area.
We couldn't run the air conditioner while playing. :)
#41066 - ampz - Sun Apr 24, 2005 9:51 pm
tepples wrote: |
ampz wrote: | A single transmission error would crash your game. |
There are occasional bit errors in even wired communication. GBA games are built to handle this, freezing a bit for a resend should the CRC not match. |
True. There are, and saying that a single bit error would cause a crash might not be exactly true for all cases.
Bit errors in wired communication can occur due to noise or contact problems (like someone yanking the plug). Sensitivity to noise depends on cable length, and GBA link cables are very short. They also have filters near the cable ends, and the communication protocol is synchronous. Synchronous communication is less sensitive than asynchronous communication. Obviously it works, and I have not tried it, so what I say is pure speculation. But due to the virtually error free communication environment provided by GBA link cables, I doubt there is very elaborate error handling code present in all GBA games.
#41106 - tepples - Mon Apr 25, 2005 5:28 am
ampz wrote: |
Bit errors in wired communication can occur due to noise or contact problems (like someone yanking the plug). |
GBA link cables are not screwed in, and the connection doesn't even seem that secure when used in, say, a moving vehicle with a Ford-caliber suspension.
Quote: |
[Official GBA link cables] also have filters near the cable ends, and the communication protocol is synchronous. Synchronous communication is less sensitive than asynchronous communication. |
Really? I'd guess that synchronous communication might be more sensitive to noise on the clock line, especially for one-bit insertions or deletions.
Quote: |
But due to the virtually error free communication environment provided by GBA link cables, I doubt there is very elaborate error handling code present in all GBA games. |
Hence the CRCs and resends, not some sort of Hamming overkill.
#41132 - ampz - Mon Apr 25, 2005 8:32 pm
tepples wrote: |
ampz wrote: | Bit errors in wired communication can occur due to noise or contact problems (like someone yanking the plug). | GBA link cables are not screwed in, and the connection doesn't even seem that secure when used in, say, a moving vehicle with a Ford-caliber suspension. |
Indeed.
I have even had Zelda crash when playing 3 players with link cables. That's why I don't trust the error detection capabilities of GBA games to be that good.
Quote: |
Quote: | [Official GBA link cables] also have filters near the cable ends, and the communication protocol is synchronous. Synchronous communication is less sensitive than asynchronous communication. |
Really? I'd guess that synchronous communication might be more sensitive to noise on the clock line, especially for one-bit insertions or deletions. |
That's what I have read. But I agree with what you say.
However I do imagine that synchronous communication should be more or less immune to delayed signal edges due to minor contact problems.
Asynchronous communications would have more of a problem with that.
Quote: |
Quote: | But due to the virtually error free communication environment provided by GBA link cables, I doubt there is very elaborate error handling code present in all GBA games. |
Hence the CRCs and resends, not some sort of Hamming overkill. |
And perhaps sometimes the error handling code is not put through too many tests before release?