Hm.
I guess I'll just ship all three of them. *lol*
And I'll add the text of the wiki to the README.

Greetings,
Simon


On 04.03.2018 at 18:43, ShreeDevi Kumar wrote:
The traineddata files in tessdata_best are larger in size and OCR takes
more time. They are supposedly slightly more accurate, but there are no
definitive results provided by Ray.

tessdata_fast is what has been shipped for Debian and Ubuntu, so that seems
the way to go for doing OCR. These files, however, cannot be used for
fine-tune training.

Those who want to do training need to use files from tessdata_best.
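
Since no definitive accuracy figures are given here, one way to judge the trade-off is to run both sets on your own pages. The following is only a minimal sketch, not an official benchmark: the helper names build_command and time_ocr are hypothetical, the directory and image names are placeholders, and it assumes a tesseract 4.x binary on the PATH that accepts --tessdata-dir, -l and --oem.

```python
import subprocess
import time


def build_command(image, tessdata_dir, lang="eng"):
    """Return the argv list for one LSTM-only OCR run (hypothetical helper)."""
    return [
        "tesseract", str(image), "stdout",   # write recognized text to stdout
        "--tessdata-dir", str(tessdata_dir),  # e.g. a checkout of tessdata_fast
        "-l", lang,
        "--oem", "1",                         # LSTM engine only
    ]


def time_ocr(image, tessdata_dir, lang="eng"):
    """Run tesseract once and return (elapsed seconds, recognized text)."""
    start = time.perf_counter()
    result = subprocess.run(
        build_command(image, tessdata_dir, lang),
        capture_output=True, text=True, check=True,
    )
    return time.perf_counter() - start, result.stdout
```

Calling time_ocr("page.png", "tessdata_fast") and time_ocr("page.png", "tessdata_best") on the same image gives a rough speed comparison; the two text outputs can then be compared against a known transcription.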

On Sun, 4 Mar 2018, 10:55 PM, Simon Eigeldinger <simon.eigeldin...@vol.at> wrote:

Hi ShreeDevi,

I have scrapped the Cygwin builds.
I am now using the builds I get from AppVeyor, which only require me to
repackage the resulting files.

So tessdata_best isn't, as the wiki says, for better accuracy?

Greetings,
Simon

On 03.03.2018 at 05:12, ShreeDevi Kumar wrote:
Hi Simon,

If you are planning to package using 4.00alpha from the master branch, please
use traineddata files from tessdata_fast. These are the files that have
been shipped for Ubuntu 18.04 and included in Debian. See
https://github.com/tesseract-ocr/tesseract/wiki for some links.

You can update the wiki page regarding Cygwin.

FYI: the tessdata repo supports both --oem 0 and --oem 1, but the files are
older and may NOT be fully compatible with the current code.

tessdata_best has files which can be used for further finetune/plusminus
type training.

*tessdata_fast has faster integer models and is the recommended one to be
used for OCR.*
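
The guidance above can be summarized as a small decision helper. This is only an illustrative sketch of the rule of thumb in this message; the function name pick_tessdata_repo and its parameters are hypothetical and not part of Tesseract.

```python
def pick_tessdata_repo(needs_training=False, needs_legacy_engine=False):
    """Return the name of the traineddata repository suggested in this thread."""
    if needs_training and needs_legacy_engine:
        # This combination is not covered by the advice in the thread.
        raise ValueError("combination not covered by this rule of thumb")
    if needs_legacy_engine:
        # Only the older tessdata files support --oem 0 alongside --oem 1.
        return "tessdata"
    if needs_training:
        # Fine-tune / plus-minus training needs the tessdata_best models.
        return "tessdata_best"
    # Plain OCR: the faster integer models are the recommended default.
    return "tessdata_fast"
```

A packager who only needs to run OCR would call pick_tessdata_repo() and ship that single set; shipping all three remains an option when the use case is unknown.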

ShreeDevi
____________________________________________________________
भजन - कीर्तन - आरती @ http://bhajans.ramparivar.com

On Sat, Mar 3, 2018 at 2:04 AM, Simon Eigeldinger <simon.eigeldin...@vol.at> wrote:

Hi all,

I just looked at the git commits for Tesseract and read that there have
been changes to the OCR modes.
Are the three tessdata sets still valid?
tessdata_fast and tessdata_best have been updated, so I guess those reflect
the latest developments, but tessdata hasn't had an update since September.
Is that third set still usable, or should it no longer be used?
On the wiki
https://github.com/tesseract-ocr/tesseract/wiki/Data-Files
it's still listed as usable.

Any suggestions?

Greetings and thanks,
Simon


--
You received this message because you are subscribed to the Google Groups
"tesseract-ocr" group.
To unsubscribe from this group and stop receiving emails from it, send an
email to tesseract-ocr+unsubscr...@googlegroups.com.
To post to this group, send email to tesseract-ocr@googlegroups.com.
Visit this group at https://groups.google.com/group/tesseract-ocr.
To view this discussion on the web visit
https://groups.google.com/d/msgid/tesseract-ocr/3c4c0b75-b411-3227-26e1-d1d2485b9572%40vol.at.
For more options, visit https://groups.google.com/d/optout.


