gbadev.org forum archive

This is a read-only mirror of the content originally found on forum.gbadev.org (now offline), salvaged from Wayback Machine copies. A new forum can be found here.

DS Misc > Tesseract OCR Port?

#101497 - OOPMan - Tue Sep 05, 2006 11:40 pm

In case you didn't notice already, Google recently released a cleaned up version of the Tesseract OCR engine developed by HP between 1985 and 1995.

By all accounts it's a damn good OCR system...

Does anyone think this could be used to develop a general purpose writing recognition library for the DS?

I haven't taken a look at it yet, but I believe the release is source code. Certainly seems like something to look into, though...

http://sourceforge.net/project/showfiles.php?group_id=158586
_________________
"My boot, your face..." - Attributed to OOPMan, Emperor of Eroticon VI

You can find my NDS homebrew projects here...

#101516 - Lick - Wed Sep 06, 2006 6:31 am

Wonder how fast it will run on the DS. (I don't think it will run that fast, honestly.. :()

I'd rather see a full-blown (yet easy and small) gesture system. That would be useful to a lot of homebrewers! *drools*

Nice find anyway! Thanks,
- Lick
_________________
http://licklick.wordpress.com

#101554 - OOPMan - Wed Sep 06, 2006 1:01 pm

I think you could be wrong...

Things to keep in mind...

Tesseract was developed between 1985 and 1995. Hence, I somehow doubt it was targeted at machines boasting ridiculous levels of processing power.

The specific scenario on the DS would involve processing only one or two characters at a time. Take a look at the Opera DS stylus entry system and you'll see what I mean...

While an HW recognition system would be nice, I think this could be a good place to start looking, since HWR is a similar problem to OCR in many ways (You could say that one is a variant of the other...)

Anyway, I'm going to grab the Tesseract source now and just see what the documentation (If any) says...

EDIT: I've downloaded the source and consulted the docs. They're pretty sparse, to say the least. So I've emailed the project leader at his SourceForge address. If/when he replies, I'll post the details here :-)

#101645 - Dannon - Wed Sep 06, 2006 11:42 pm

I'm pretty sure this has already been done, though not using Tesseract. It was a while ago now, but I think searching for DS Merlin might point you in the right direction. Someone else can probably give you more exact details.

--Dannon

#102312 - OOPMan - Tue Sep 12, 2006 9:55 pm

Hmmmm, I googled that a while back, Dannon, and didn't find anything :-)

In other news, Ray Smith, the lead coder maintaining Tesseract at the moment, finally got a chance to reply. Printed below is my original mail with his replies to my questions inlined (I have removed my own name. I'm paranoid :-) ). Enjoy :-)

OOPMan wrote:

Hi [removed],

I have inlined my answers below. Hope it helps.
Ray Smith

On 9/6/06, [removed] <oop_man_za@ananzi.co.za> wrote:
Hi there, hopefully I'm not bothering you with this
email...

I noticed the Tesseract OCR release thanks to Slashdot and
headed over to grab a copy. I was immediately interested in
the possibility of using Tesseract to develop a character
recognition library for use by homebrew Nintendo DS
developers. However, I'm still not certain whether this is
viable. Hopefully you can answer some of my questions on
this matter?

Anyway, here goes.

1: What kind of processing power does Tesseract require?

The NDS features a 33 MHz ARM7 secondary processor and a
66 MHz ARM9 primary processor. If Tesseract were to run on
the NDS, it would probably run off the ARM9 (Since the ARM7
is used mainly for input handling and sound). Furthermore,
Tesseract would probably be used to recognise characters
one by one, in a sequential fashion.

An example of a possible input interface is illustrated in the
following image from a commercial NDS program:
http://files.myopera.com/DotEd/albums/89050/dsbrowser7.jpg

While the interface in question actually demonstrates
hand-writing recognition, it does give a good basic idea of
the kind of input system.

Anyway, do you think Tesseract could function on the NDS?
Would it be able to do so in real-time?

Ray Smith wrote:


Real-time relative to what kind of input?
In 1998, on a 200 MHz Pentium II, it managed about 100 characters per second.
It has been run successfully on both big- and little-endian architectures, but not
on 64-bit yet, so you should be OK there.
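
As a rough back-of-envelope reading of those figures, here is a minimal sketch that scales the quoted throughput by clock speed alone. The linear-scaling model and the function name are illustrative assumptions, not anything from Tesseract. It ignores differences in architecture, cache, and memory speed between the Pentium II and the ARM9, so the result is best read as an optimistic guess.

```python
def scaled_chars_per_second(ref_cps, ref_mhz, target_mhz):
    """Naive estimate: assume throughput scales linearly with clock speed."""
    return ref_cps * target_mhz / ref_mhz


# Figures quoted in the thread: ~100 chars/s at 200 MHz, and a 66 MHz ARM9.
estimate = scaled_chars_per_second(ref_cps=100, ref_mhz=200, target_mhz=66)
print(f"~{estimate:.0f} characters per second (optimistic guess)")
```

Even if the real figure turned out to be several times lower, it would not obviously rule out recognising one or two characters at a time.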

2: What are the memory requirements of Tesseract?

The NDS features 4 MB of primary memory, a pretty small
amount. It is possible to extend this memory via externally
attached devices, to a maximum of 36 MB, including the
built-in 4 MB of memory. The extension memory does not
function at quite the same speed, however, and so treating
it as being entirely contiguous and uniform would be
problematic.

Do you think Tesseract would be able to function in the
low-memory environment of the NDS?

Ray Smith wrote:
The memory requirement is heavily dependent on the size of the input image and number of characters it contains. You could probably run it in as little as 4-8MB, but for a full letter page of 4000 characters you will need more like 20-30MB. Somebody is working on some hacks to make it easier to run on an embedded system, which you might find useful.


3: Assuming Tesseract could be adapted to work on the NDS
in an efficient manner, do you think it would be possible,
at all, to adapt it to the task of hand-writing
recognition? I realise this may not be possible, but it is
nevertheless something I'd like to be sure of before
embarking on any kind of project involving Tesseract and
the NDS...

Ray Smith wrote:
Unlikely. From the look of your screenshot, you are interested in online handwriting recognition, and Tesseract has no mechanism to make use of the time-domain information that you have available. If you could constrain the users to isolated handprint and make them go through a training phase, then you might stand a chance at static handprint recognition, but people hate to write that way, and there is Graffiti for that anyway. That said, I have never tried it with handprint, and would be interested to know if you could train it to recognize handprint successfully. You will need the training system, which is not yet available, but coming soon.
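
To illustrate the online/offline distinction in that reply, here is a minimal sketch of what an image-based recogniser would get to see. The stylus sample format `(x, y, t_ms)` and the `rasterize` helper are hypothetical, made up for this example, and the code only plots the sampled points rather than drawing connected lines. Two different ways of writing a "+" end up as the same static bitmap, so stroke order, direction, and timing are lost.

```python
def rasterize(strokes, width, height):
    """Plot timestamped stylus samples into a static bitmap.

    Time stamps and stroke order are deliberately discarded, which is
    all an offline (image-based) recogniser would have to work with.
    """
    bitmap = [[0] * width for _ in range(height)]
    for stroke in strokes:
        for x, y, _t in stroke:
            if 0 <= x < width and 0 <= y < height:
                bitmap[y][x] = 1
    return bitmap


# Hypothetical stylus samples (x, y, t_ms) for a "+" drawn in two strokes.
horizontal = [(0, 1, 0), (1, 1, 10), (2, 1, 20)]
vertical = [(1, 0, 100), (1, 1, 110), (1, 2, 120)]

# The same shape drawn with the strokes in opposite order and direction.
reversed_order = [
    [(x, y, 200 - t) for x, y, t in reversed(vertical)],
    [(x, y, 200 - t) for x, y, t in reversed(horizontal)],
]

a = rasterize([horizontal, vertical], 3, 3)
b = rasterize(reversed_order, 3, 3)
print(a == b)  # True: both writings collapse to the same image
```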



#102329 - josath - Wed Sep 13, 2006 12:27 am

OOPMan wrote:
Hmmmm, (I have removed my own name. I'm paranoid :-) ). Enjoy :-)


So you're paranoid, OOPMan? Or should I say... A. J.? *cackles manically*

#102361 - OOPMan - Wed Sep 13, 2006 8:14 am

Argh!!!

That's exactly what I'm talking about...

Now, now, calm down OOP, there's a perfectly good reason why he might *know* things...

Perfectly good...

Argh, I give up, he's wid der guvermunt, wun away!!!

Anyway, back on topic...

If anyone's keen to do a Tesseract port then it does at least seem to be within the bounds of reality...