Re: [tesseract-ocr] Questions about recognize Chinese characters

2019-04-09 Thread Aaron Shieh
I get '焊接' with the following:

    tesseract 67.png o -l chi_tra --oem 0 --psm 7

I'm using the tesseract 4.1.0 64-bit build on Windows 10, with traineddata from https://github.com/tesseract-ocr/tessdata
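For anyone who wants to script the same call, here is a minimal Python sketch that builds the command line quoted in this message. The helper names `build_tesseract_cmd` and `run_tesseract` are made up for illustration and are not part of tesseract. It assumes the `tesseract` executable is on PATH and that the chi_tra traineddata includes the legacy model.

```python
import shutil
import subprocess


def build_tesseract_cmd(image, output_base, lang="chi_tra", oem=0, psm=7):
    """Return the argument list for the tesseract CLI (hypothetical helper).

    --oem 0 selects the legacy engine; --psm 7 treats the image as a
    single line of text. output_base is the output file name without
    extension (tesseract appends .txt).
    """
    return [
        "tesseract", image, output_base,
        "-l", lang,
        "--oem", str(oem),
        "--psm", str(psm),
    ]


def run_tesseract(image, output_base, **kwargs):
    """Run tesseract if it is installed; raise otherwise (hypothetical helper)."""
    if shutil.which("tesseract") is None:
        raise RuntimeError("tesseract executable not found on PATH")
    cmd = build_tesseract_cmd(image, output_base, **kwargs)
    return subprocess.run(cmd, check=True)
```

Calling `run_tesseract("67.png", "o")` should be equivalent to the command above and write the recognized text to `o.txt`.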

Re: [tesseract-ocr] Questions about recognize Chinese characters

2019-04-10 Thread Aaron Shieh
I tried using --oem 1, but the results were really bad, which is why I resorted to legacy mode. Have you had any luck with the LSTM models?

[tesseract-ocr] Need advice for training_text.txt

2019-04-10 Thread Aaron Shieh
Hi, I noticed that in the langdata_lstm/chi_tra repo the training text contains long lines of text. My application only needs to recognize single lines of text with at most 15 Chinese characters, so my question is: how should I prepare my training text? I was thinking of something like this, where each
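One possible way to turn an existing long-line training text into short lines is sketched below. This is only an illustration, not the layout proposed in the message (which is cut off in this archive), and whether short lines actually improve training is not established here. The function name `split_training_text` and the fixed-width chunking are assumptions.

```python
def split_training_text(text, max_chars=15):
    """Split each non-empty line into chunks of at most max_chars characters.

    Hypothetical helper: Python strings are sequences of Unicode code
    points, so each Chinese character counts as one character.
    """
    short_lines = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # skip blank lines
        for start in range(0, len(line), max_chars):
            short_lines.append(line[start:start + max_chars])
    return short_lines
```

The resulting list can be joined with newlines and written out as UTF-8 to use as a training text file.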