On 28.07.2010 17:02, Jimmy O'Regan wrote:
>>>> I grepped the code and it seems to be looking for something called
>>>> LANG.user-words, but that didn't seem to do anything -- I got the same
>>>> garbled text when I ran Tesseract 3 the second time.
>>
>> Turns out T3 doesn't even access $LANG.user-words. I suspect it's looking
>> for it in the traineddata file...
>
> Hmm... probably... which is quite a stupid thing to do, really, but I
> presume nobody in Google actually uses this, so it's probably quite
> neglected.
>
> I'm toying with the idea of adding support for an actual *user* list -
> i.e., that tesseract would check $HOME/.tesseract/lang.user-words -
> because assuming a single user system that the user has full control over is
> still a braindamaged assumption.

Just an idea: maybe this should be handled by an environment variable. If I set

    export TESSDATA_PREFIX=~/.

tesseract will try to get ALL files from "$HOME/.tessdata".
The problem is that if tesseract does not find all the files it needs (e.g. eng.traineddata) in $TESSDATA_PREFIX, it stops; it does not fall back to the "standard" installation directories such as /usr/share/tessdata or /usr/local/share/tessdata.

I tried

    export TESSDATA_PREFIX=~/.:/usr/local/share/tessdata

but it did not work: tesseract tried to open the file "/home/zdeno/.:/usr/local/share/tessdatatessdata/eng.traineddata", which is not a valid path.

Zd.
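
A possible workaround until tesseract itself understands a search path: a small shell function that walks a colon-separated list and prints the first prefix that actually contains the language file. This is only a sketch. The name find_tessdata_prefix is made up for illustration and is not part of tesseract, and it assumes tesseract simply appends "tessdata/" to the prefix, as the failed path above suggests.

```shell
# Hypothetical helper, not part of tesseract.
# Usage: find_tessdata_prefix LANG PREFIX1:PREFIX2:...
# Prints the first prefix for which "<prefix>tessdata/<LANG>.traineddata"
# exists and returns 0; returns 1 if no prefix matches.
# Assumption: tesseract concatenates the prefix and "tessdata/" directly,
# so each prefix should end in "/" (or in "." for a hidden ~/.tessdata).
find_tessdata_prefix() {
    lang=$1
    list=$2
    old_ifs=$IFS
    IFS=:
    # $list is split on ":" once, when the loop starts
    for prefix in $list; do
        IFS=$old_ifs
        if [ -f "${prefix}tessdata/${lang}.traineddata" ]; then
            printf '%s\n' "$prefix"
            return 0
        fi
    done
    IFS=$old_ifs
    return 1
}
```

It could then be used like this before calling tesseract:

    TESSDATA_PREFIX=$(find_tessdata_prefix eng "$HOME/.:/usr/local/share/:/usr/share/") && export TESSDATA_PREFIX

Note that this still picks a single directory. It does not merge files from several locations, so a per-user word list next to a system-wide traineddata file would need real support in tesseract.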