gbadev.org forum archive

This is a read-only mirror of the content originally found on forum.gbadev.org (now offline), salvaged from Wayback Machine copies. A new forum can be found here.

DS Misc > Dictionary Please!

#92013 - jbullfrog - Tue Jul 11, 2006 3:40 am

Could any of you hardy developers make a homebrew dictionary for the DS?

I don't know how to develop anything, but I believe it could be relatively simple. Couldn't you just copy the entire Merriam-Webster's Dictionary into a text file, and then include some type of search function?

The top screen could be used to view the definition of a word. The bottom screen could show a search box where you could use the stylus to write in the word.

This would be handy for all you office people. I bet this would be a great app for DSOrganize!

What do you think?


Last edited by jbullfrog on Mon Jul 17, 2006 1:41 pm; edited 1 time in total

#92014 - clone dad - Tue Jul 11, 2006 3:57 am

I've been thinking about this ever since I got into homebrew. It would be freakin' awesome!

When I start developing apps for the DS, this will most likely be my first one.
_________________
I don't know anything.

#92015 - CubeGuy - Tue Jul 11, 2006 4:30 am

I'm trying to think of a way to speed up the lookup of words. It seems that not only would that take up a lot of CF space, but scanning a text file that big could take forever.
_________________
It's 'CubeGuy.' One word. No space.

#92016 - Mr. Picklesworth - Tue Jul 11, 2006 4:32 am

One could use DSLinux's Retawq web browser and a dictionary web site :)
Not exactly mobile, but handy nonetheless.

I agree, a dictionary would be great!
I know that Roget's Thesaurus is available from Project Gutenberg, and it takes up about 1 MB.
_________________
Thanks!
MKDS Friend Code: 511165-679586
MP:H Friend Code: 2105 2377 6896

#92028 - HyperHacker - Tue Jul 11, 2006 5:07 am

I was thinking about this the other day. I figure the longest word you'd need is maybe 21 characters (you could probably get away with base words; no need to have "buzzing" when you already have "buzz"). If you store an index in which each entry is a word padded to 21 characters plus a 4-byte file address at which that word's definition resides, you'd be able to do lookups much faster, but the index alone could be a few megabytes. You might get even more speed by having a second, per-letter index that tells where in the file the index for words starting with each letter is located. Since the two indexes aren't likely to be more than 16MB, each entry in the letter index could be one byte for the character and three for the address. Then compress the entire thing, and you might just fit it onto a memory card. The double-indexing would help fit it all into memory:
1) Take the first letter of the word.
2) Decompress the index for words starting with that letter into memory.
3) Scan the index for the word.
4) Decompress its definition and display it.
Since the index of words alone could be >4MB, splitting it up into 26 separate indexes might be the only way to fit it into memory.
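
The four steps above can be sketched roughly as follows. This is only an illustration under stated assumptions, written in Python for brevity (a real DS build would more likely be C): the names `RECORD`, `build_letter_indexes` and `lookup_offset` are made up for this sketch, zlib stands in for whatever compression would actually be used, words are assumed to be non-empty ASCII, and step 4 (decompressing the definition itself) is left out.

```python
import struct
import zlib

# One index record: a word padded to 21 bytes plus a 4-byte file address.
# "<" disables alignment padding, so each record is exactly 25 bytes.
RECORD = struct.Struct("<21sI")

# Rough size check (the word count is an assumption, not from the thread):
# 100,000 words * 25 bytes = 2,500,000 bytes, i.e. "a few megabytes".

def build_letter_indexes(entries):
    """entries maps word -> definition offset; returns {letter: compressed index}."""
    per_letter = {}
    for word in sorted(entries):
        packed = RECORD.pack(word.encode("ascii"), entries[word])
        per_letter.setdefault(word[0], bytearray()).extend(packed)
    return {letter: zlib.compress(bytes(buf)) for letter, buf in per_letter.items()}

def lookup_offset(indexes, word):
    """Steps 1-3: pick the letter, decompress its index, scan it record by record."""
    blob = indexes.get(word[0])                       # step 1
    if blob is None:
        return None
    data = zlib.decompress(blob)                      # step 2
    key = word.encode("ascii")
    for pos in range(0, len(data), RECORD.size):      # step 3
        raw, offset = RECORD.unpack_from(data, pos)
        if raw.rstrip(b"\0") == key:
            return offset
    return None
```

Only one letter's index is ever decompressed at a time, which is the point of splitting it 26 ways.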

As for the source of the definitions... make a bot to download them from dictionary.com?
_________________
I'm a PSP hacker now, but I still <3 DS.

#92034 - Mr. Picklesworth - Tue Jul 11, 2006 7:14 am

For searching, the definitions could be assumed to be in alphabetical order... so it could scan with a collection of loops like this pseudo-code:
Code:

Local SearchString:String //Word being searched for
Local SearchBoundMin:Int=0
Local SearchBoundMax:Int=NumberOfLinesInDictionary
Local StepCount:Int=50

For SearchLength:Int=1 To Len(SearchString)
{
     ScanString:String=Left(SearchString, SearchLength) //ScanString is SearchLength # of characters from the left of SearchString

     For ReadLineNumber:Int=SearchBoundMin To SearchBoundMax Step=StepCount //Step is just BASIC-ish code for the amount that ReadLineNumber is incremented each loop
     {
          CheckWord:String=ReadWordOnLine(ReadLineNumber)
          //Some stuff, much like this, which checks the word against the last word read to see whether the search word lies between them. If it does, SearchBoundMax=ReadLineNumber and SearchBoundMin=LastReadLineNumber. StepCount is then decreased by some fancy calculated value based on how many letters of the other words matched the search query... or something like that. Eventually, we whittle our way down to the final answer.
     }
}
Next

That pitiful excuse for an example assumes that each line in the dictionary is a new word (which assumes no line breaks in definitions, which would be absurd).
Using an Index as per HyperHacker's suggestion could cause such crazy assumptions to be very possible :)

I don't think my example is actually done properly... I just couldn't think of a word for what such an algorithm would be referred to as, so I wrote it.
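
The usual name for this kind of narrowing search over sorted data is binary search. A minimal sketch, assuming the words are already sorted and held in memory as a Python list (the function name `find_word` is made up here, and this is not the pseudo-code above made runnable, just the standard halving version of the same idea):

```python
def find_word(sorted_words, target):
    """Return the position of target in sorted_words, or -1 if it is absent."""
    lo, hi = 0, len(sorted_words)      # search window is [lo, hi)
    while lo < hi:
        mid = (lo + hi) // 2
        if sorted_words[mid] < target:
            lo = mid + 1               # target can only be after mid
        else:
            hi = mid                   # target is at mid or before it
    if lo < len(sorted_words) and sorted_words[lo] == target:
        return lo
    return -1
```

Instead of a step count that shrinks by a calculated amount, the window is simply halved on each pass, so even a list of a few hundred thousand words needs only about twenty comparisons.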


Here are some dictionaries, btw:
http://www.google.com/search?q=dictionary&domains=gutenberg.org&sitesearch=gutenberg.org&btnG=Google+Search

I'm thinking this would best be combined with a fancy plain text reading program, which I may or may not be doing and finishing at an unknown time assuming it even exists, which may or may not be true in the same way that it may or may not be true whether it may or may not include fancy Search functions.
(I wonder if that even makes sense...)

#92037 - Devil_Spawn - Tue Jul 11, 2006 7:58 am

People do translation projects. Is there any easy way to translate the Korean dictionary, or is it word by word, changing everything?

#92078 - tepples - Tue Jul 11, 2006 1:53 pm

jbullfrog wrote:
Couldn't you just copy the entire Merriam-Webster's Dictionary

No piracy talk please.
_________________
-- Where is he?
-- Who?
-- You know, the human.
-- I think he moved to Tilwick.

#92080 - clone dad - Tue Jul 11, 2006 1:55 pm

You could have the A words in one txt file, the B's in another, and so forth. Wouldn't this make scanning faster? The text file to open would be chosen by the first letter the user enters.
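
A rough sketch of the per-letter file idea, in Python for brevity. Everything specific here is an assumption for illustration: the file names (`a.txt`, `b.txt`, ...), the one-entry-per-line layout with a tab between word and definition, and the function names.

```python
import os

def letter_file(word):
    """Pick the file name from the first letter of the word (hypothetical naming)."""
    first = word.strip().lower()[:1]
    if not ("a" <= first <= "z"):
        raise ValueError("word must start with a letter A-Z")
    return first + ".txt"

def lookup_definition(directory, word):
    """Scan only the one file that can contain the word; None if not found."""
    target = word.strip().lower()
    path = os.path.join(directory, letter_file(target))
    with open(path, encoding="utf-8") as handle:
        for line in handle:
            # Assumed line layout: word<TAB>definition
            entry, _, definition = line.rstrip("\n").partition("\t")
            if entry.lower() == target:
                return definition
    return None
```

This still reads a file line by line, but on average only about 1/26 of the dictionary, so it is faster than scanning one big file without being as fast as a real index.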

#92108 - TheYak - Tue Jul 11, 2006 5:36 pm

The search function could be done in a supplemental version. It'd still be quite usable with the separate text files (per suggestion, alist.txt, etc.) and a good line-by-line / page-up/page-down implementation. Search functions could be refined later.

Of course, without a search function you could accomplish pretty much the same thing with text files in Moonshell, DSOrganize or the M3's reader.

#92316 - jbullfrog - Wed Jul 12, 2006 7:39 pm

So... is anyone going to do it and make the dictionary?

#92351 - tyraen - Thu Jul 13, 2006 12:37 am

Are we talking about a translation thingy here or just a dictionary? I can't find too much on the legal side of these dictionaries:

http://www.mozilla.com/thunderbird/dictionaries.html

But they're probably alright to use; OpenOffice apparently uses them too. I don't know much about the format but it doesn't look very complex.

#92590 - tyraen - Fri Jul 14, 2006 12:40 am

And now that I think about it, you were talking about a dictionary with definitions, right, not a spell checker. *bonk self*

#92593 - tyraen - Fri Jul 14, 2006 1:20 am

Here's a GNU licensed English dictionary:
ftp://ftp.gnu.org/gnu/dictionary

It's pretty hard to find a downloadable dictionary that has definitions, heh, mostly what I was coming across were bilingual ones...

#92603 - HyperHacker - Fri Jul 14, 2006 1:51 am

Mr. Picklesworth wrote:
That pitiful excuse for an example assumes that each line in the dictionary is a new word (which assumes no line breaks in definitions, which would be absurd).
Using an Index as per HyperHacker's suggestion could cause such crazy assumptions to be very possible :)

You have the right idea, but scanning for line breaks means reading every byte, which would be slow. The reason I suggested an index and padding words to 21 characters is that you can compress the entire thing, then just decompress the index into memory and jump ahead 25 bytes at a time (21 chars + 4-byte address) until you find the word you're looking for.
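
Because every record is the same 25 bytes, record number n always starts at byte n * 25, so the index can also be binary-searched instead of stepped through one record at a time. The sketch below combines the two ideas from this thread; it is not something either poster wrote. It assumes the index is already decompressed into memory, sorted, ASCII, and padded with NUL bytes; `RECORD` and `find_offset` are names invented for the example.

```python
import struct

# 21-byte padded word + 4-byte little-endian address = 25 bytes per record.
RECORD = struct.Struct("<21sI")

def find_offset(index_bytes, word):
    """Binary search over fixed-width records; returns the address or None."""
    key = word.encode("ascii").ljust(21, b"\0")   # pad the same way as the index
    lo, hi = 0, len(index_bytes) // RECORD.size
    while lo < hi:
        mid = (lo + hi) // 2
        raw, offset = RECORD.unpack_from(index_bytes, mid * RECORD.size)
        if raw == key:
            return offset
        if raw < key:
            lo = mid + 1
        else:
            hi = mid
    return None
```

Comparing the padded byte strings directly works because NUL sorts before every letter, so padded words keep the same order as unpadded ones.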

BTW I imagine a dictionary would function as a simple spell checker. Even if it doesn't look for variations of the word, at least if you get a definition, you know you spelled it right. :-p

#92812 - tepples - Sat Jul 15, 2006 12:36 am

Perhaps once Wiktionary is more complete, we will have a data set against which someone can consider developing this project.

#92908 - Kir - Sat Jul 15, 2006 2:22 pm

One of our Russian GBA/DS community members has already made a homebrew English-Russian dictionary. Here's the topic about that proggie (Russian language). The download link is in the first post. The author says it's pretty easy to change dictionaries, so it can be used with almost any dictionary in .txt format. Program development has been paused for a while, since Cluster (the author of the dictionary) has successfully passed his English exams :) .

#93193 - dantheman - Mon Jul 17, 2006 4:46 pm

HyperHacker wrote:
BTW I imagine a dictionary would function as a simple spell checker. Even if it doesn't look for variations of the word, at least if you get a definition, you know you spelled it right. :-p

There's actually already a spellchecker out there for the GBA called Scrabble Dictionary, available at http://pdroms.de/file_details.php?fn=1554

Granted, it's not a full dictionary, just a spellchecker, but it works nonetheless. On real hardware however, it does tend to recognize a keystroke both when pressing and releasing a button, so you often end up moving two spaces in one direction instead of one or typing double letters where you meant to only type one.

#101369 - jbullfrog - Mon Sep 04, 2006 7:26 pm

As author of the Bible and Dictionary requests, I humbly ask...

Is anything going on towards these two wonderful apps?

#101385 - IxthusTiger - Mon Sep 04, 2006 9:41 pm

dantheman wrote:
There's actually already a spellchecker out there for the GBA called Scrabble Dictionary, available at http://pdroms.de/file_details.php?fn=1554

Granted, it's not a full dictionary, just a spellchecker, but it works nonetheless. On real hardware however, it does tend to recognize a keystroke both when pressing and releasing a button, so you often end up moving two spaces in one direction instead of one or typing double letters where you meant to only type one.


Must be an old build? Mine doesn't do that. I would love the Scrabble Dictionary on the DS though; typing the letters would be easier.

#101428 - errabes - Tue Sep 05, 2006 12:55 pm

There is already an English -> Chinese dictionary, using PAlib. I haven't tried it yet, but I'm interested, as I've been learning Chinese for a few years. It supports sound as well, for English word pronunciation, I guess.
Latest dev thread: http://www.ndsbbs.com/simple/index.php?t43299.html (Chinese BBS)
Anyway, it's using the DICT "format" (like in StarDict), and I invite anyone who would like to write a dictionary to use it. Then you can use tons of free dictionaries for many languages (http://stardict.sourceforge.net/Dictionaries.php)
I also hope to write a minimalistic dict reader for DSLinux, using zhcon for Asian language support, though I'm far from having started anything.

#101509 - jojjy - Wed Sep 06, 2006 3:42 am

tyraen wrote:
Here's a GNU licensed English dictionary:
ftp://ftp.gnu.org/gnu/dictionary

It's pretty hard to find a downloadable dictionary that has definitions, heh, mostly what I was coming across were bilingual ones...


Hey, I'd love a French/English dictionary for the DS.

#101676 - tuLL - Thu Sep 07, 2006 10:52 am

I have tons of dictionaries on my DS.

I have this one for example:

http://www.gutenberg.org/etext/673

Or divided by letters: 660 -> 670

In txt file mode and separated by letters. It's easy to search: you just have to use Moonshell's sidebar.

I then have others in Portuguese, and some scientific dictionaries too.

Can't remember where I got them tho.

#102006 - jbullfrog - Sat Sep 09, 2006 7:54 pm

tuLL wrote:
I have tons of dictionaries on my DS.

I have this one for example:

http://www.gutenberg.org/etext/673

Or divided by letters: 660 -> 670

In txt file mode and separated by letters. It's easy to search: you just have to use Moonshell's sidebar.

I then have others in Portuguese, and some scientific dictionaries too.

Can't remember where I got them tho.


What do you mean by "or divided by letters"?

Also, what is up with all the extra gibberish letters in the text file?

#102109 - tuLL - Sun Sep 10, 2006 10:38 pm

jbullfrog wrote:
What do you mean by "or divided by letters"?

Also, what is up with all the extra gibberish letters in the text file?


Divided by letters is:

A.txt
B.txt
C.txt

...

which you can download from these URLs:

http://www.gutenberg.org/etext/660

to

http://www.gutenberg.org/etext/670

:)