Hi Simon,

If you are planning to package 4.00alpha from the master branch, please use the traineddata files from tessdata_fast. These are the files that have been shipped for Ubuntu 18.04 and included in Debian. See https://github.com/tesseract-ocr/tesseract/wiki for some links.
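For a package smoke test, something like the following could be used to point tesseract at a directory of tessdata_fast files. This is only a rough sketch: the helper name `build_tesseract_cmd`, the directory `/usr/share/tessdata_fast`, and the file names `page.png` and `page` are made up for illustration. It assumes the tessdata_fast models are run with the LSTM engine (`--oem 1`), and it only builds the command line without running it.

```python
import shlex


def build_tesseract_cmd(image, out_base, tessdata_dir, lang="eng", oem=1):
    """Return the argv list for one tesseract run.

    tessdata_dir is assumed to hold files taken from tessdata_fast,
    for example <tessdata_dir>/eng.traineddata (hypothetical layout).
    """
    return [
        "tesseract",
        str(image),                           # input image
        str(out_base),                        # output base name, without extension
        "--tessdata-dir", str(tessdata_dir),  # where the traineddata files live
        "--oem", str(oem),                    # 1 selects the LSTM engine
        "-l", lang,                           # language of the traineddata file
    ]


if __name__ == "__main__":
    # Hypothetical paths; adjust to wherever the package installs the files.
    cmd = build_tesseract_cmd("page.png", "page", "/usr/share/tessdata_fast")
    print(shlex.join(cmd))
```

The list can be passed to `subprocess.run(cmd, check=True)` once tesseract and the traineddata files are actually installed.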
You can update the wiki page re cygwin.

FYI - the tessdata repo supports both --oem 0 and --oem 1, but the files are older and may NOT be fully compatible with current code. tessdata_best has files which can be used for further finetune/plusminus type training. *tessdata_fast has faster integer models and is the recommended one to be used for OCR.*

ShreeDevi
____________________________________________________________
भजन - कीर्तन - आरती @ http://bhajans.ramparivar.com

On Sat, Mar 3, 2018 at 2:04 AM, Simon Eigeldinger <simon.eigeldin...@vol.at> wrote:

> Hi all,
>
> I just looked at the git commits for tesseract and read that there have been
> changes to the OCR modes.
> Are the 3 tessdata sets still valid?
> tessdata_fast and tessdata_best have been updated, so I guess those reflect
> the latest developments, but tessdata hasn't had an update since September.
> Is that 3rd set still usable, or should it not be used anymore?
> On the wiki
> https://github.com/tesseract-ocr/tesseract/wiki/Data-Files
> it's still listed as usable.
>
> Any suggestions?
>
> Greetings and thanks,
> Simon

--
You received this message because you are subscribed to the Google Groups "tesseract-ocr" group.
To unsubscribe from this group and stop receiving emails from it, send an email to tesseract-ocr+unsubscr...@googlegroups.com.
To post to this group, send email to tesseract-ocr@googlegroups.com.
Visit this group at https://groups.google.com/group/tesseract-ocr.
To view this discussion on the web visit https://groups.google.com/d/msgid/tesseract-ocr/CAG2NduW8Fzpm-Pff1Oq3AdPuQAaSvzPfgn7fb0mGty6qHcDJ0Q%40mail.gmail.com.
For more options, visit https://groups.google.com/d/optout.