I’m getting the “phantom character” issue as well using the OCRB that Shree
trained on MRZ lines. For example, for a 0 it will sometimes add both a 0
and an O to the output, thus outputting 45 characters total instead of 44.
I haven’t looked at the bounding box output yet but I suspect a phantom
th
Can someone explain what the lstm_use_matrix option does?
On Thu, Jul 18, 2019 at 11:36 AM Shree Devi Kumar
wrote:
> Binarize and invert the images to get black text on white. I tried with the
> latest code from the master branch on GitHub, and it gives correct results.
>
> tesseract 2-bw.png stdout --psm 6 -
also
>>>> improved.
>>>>
>>>> I wrote some code that uses symbols iterator to discard symbols that
>>>> are clearly duplicated: too small, overlapping, etc. But it was not easy to
>>>> make it work decently and it is not 100% reliable w
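The kind of filtering described above (discarding symbols that are too small or that overlap a neighbour) could be sketched roughly as follows. This is not the poster's code: the `Symbol` structure, the `drop_phantoms` helper, and both thresholds are hypothetical, and it assumes the symbol boxes and confidences have already been read from the symbol-level result iterator.

```python
from typing import List, NamedTuple


class Symbol(NamedTuple):
    """One recognized character with its bounding box (hypothetical structure)."""
    text: str
    left: int
    top: int
    right: int
    bottom: int
    confidence: float


def horizontal_overlap(a: Symbol, b: Symbol) -> float:
    """Fraction of the narrower box's width that both boxes share."""
    shared = min(a.right, b.right) - max(a.left, b.left)
    narrower = min(a.right - a.left, b.right - b.left)
    if shared <= 0 or narrower <= 0:
        return 0.0
    return shared / narrower


def drop_phantoms(symbols: List[Symbol],
                  min_width: int = 3,
                  max_overlap: float = 0.5) -> List[Symbol]:
    """Drop tiny boxes; of two overlapping boxes keep the more confident one."""
    kept: List[Symbol] = []
    for sym in sorted(symbols, key=lambda s: s.left):
        if sym.right - sym.left < min_width:
            continue  # too narrow to be a real glyph (assumed threshold)
        if kept and horizontal_overlap(kept[-1], sym) > max_overlap:
            # Two symbols on the same spot: keep the higher-confidence one.
            if sym.confidence > kept[-1].confidence:
                kept[-1] = sym
            continue
        kept.append(sym)
    return kept


# Made-up example: a "0" followed by a phantom "O" on nearly the same box.
line = [
    Symbol("4", 0, 0, 20, 30, 95.0),
    Symbol("0", 22, 0, 42, 30, 90.0),
    Symbol("O", 24, 0, 42, 30, 60.0),
    Symbol("7", 44, 0, 64, 30, 93.0),
]
print("".join(s.text for s in drop_phantoms(line)))  # prints 407
```

The thresholds would need tuning per font and resolution, which matches the remark that such a filter is hard to make fully reliable.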
>
> See
> https://github.com/tesseract-ocr/tesseract/wiki/APIExample#getcomponentimages-example
>
>
> On Fri, Jul 19, 2019 at 2:50 PM Claudiu wrote:
>
>> Is there any way to pass bounding boxes to use to the LSTM? We have an
>> algorithm that cleanly gets boundi
Please provide an SSCCE -
http://sscce.org . We can’t help you without it. How about the image, the
command line you ran, the expected output, and the output you got?
On Fri, Jul 26, 2019 at 9:19 PM Raghwendra Dey
wrote:
> My output type is known, meaning the numbers can have two digits or one
Did you ever resolve the difference between the two commands? I am having
the same issue - LSTM training reports 0 error, but running the model with
tesseract gives errors.
On Thursday, May 31, 2018 at 1:13:43 PM UTC+2, Julien Jemine wrote:
>
> Hi,
>
> I've trained an LSTM model for a custom language fro
specify that only letters and
numbers are used in the images and also there's only one block/word of text?
Thank you,
Claudiu
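
One possible way to express both constraints on the command line is a character whitelist plus a page segmentation mode, sketched below. The `build_tesseract_command` helper and the `word.png` file name are hypothetical; `--psm` and `-c tessedit_char_whitelist=...` are existing tesseract options. As far as I know, whether the LSTM engine honours the whitelist depends on the tesseract version, so treat this as a starting point rather than a guaranteed fix.

```python
import string
from typing import List


def build_tesseract_command(image_path: str, single_word: bool = True) -> List[str]:
    """Build a tesseract CLI call restricted to letters and digits."""
    whitelist = string.ascii_uppercase + string.ascii_lowercase + string.digits
    # --psm 8 treats the image as a single word,
    # --psm 6 assumes a single uniform block of text.
    psm = "8" if single_word else "6"
    return [
        "tesseract", image_path, "stdout",
        "--psm", psm,
        "-c", f"tessedit_char_whitelist={whitelist}",
    ]


# Hypothetical input file; the list can be passed to subprocess.run().
print(" ".join(build_tesseract_command("word.png")))
```

Building the arguments as a list keeps the whitelist intact when the command is passed to `subprocess.run` without a shell.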