Following up: try uploading images of real-world documents. Please avoid taking photos of photos (that is, photos of a computer screen displaying a document). Capture the real document itself and upload that.

On Sat, Jun 5, 2021, 12:38 AM Ajinkya Bobade <ajinkyabobad...@gmail.com> wrote:

> Hi Timo,
>
> The result is low resolution because the image you uploaded must have been
> taken from a sample set; it was not taken with a real mobile phone camera.
>
> I recommend uploading an image captured with a good-quality phone camera
> and retrying a few more times with different phone-camera images.
> My software works poorly on sample images that are not from the real world.
> It works very well on real-world images.
>
> Feel free to reach out to me if you have any questions or concerns.
>
> Regards
> Ajinkya
>
> On Thu, Jun 3, 2021, 4:38 PM Timo Richter <timo.j...@gmail.com> wrote:
>
>> Hi Ajinkya,
>>
>> the result looks better than mine. But it appears to be very low
>> resolution, and the text is not readable. How did you do it?
>> Still, the Google AI website is a lot more accurate. How did they
>> do this?
>>
>> ajinkya...@gmail.com wrote on Wednesday, June 2, 2021 at 17:23:44 UTC+2:
>>
>>> Hello,
>>> I have created a web extension that solves this problem. Upload an image
>>> to https://imagescanner-online.com/ and it will remove the noise and
>>> pixel-segment the text, giving you a very good quality input that you can
>>> feed to Tesseract to get good output.
>>>
>>> Regards
>>> Ajinkya
>>>
>>> On Wed, Jun 2, 2021 at 12:13 AM Timo Richter <timo...@gmail.com> wrote:
>>>
>>>> Hi everyone,
>>>>
>>>> I tried to OCR an identity card [1] and large parts were not
>>>> recognised. I get nothing from the headline or the first few rows.
>>>> From the middle, Tesseract finds partially correct text. There are lines
>>>> and other patterns in the background, as usual. In the monochrome picture
>>>> I could not completely separate the letters from the background; some
>>>> gray pixels remain. But there is a website that does OCR, and it works
>>>> perfectly [2]. Why do I get bad results, and why does my Tesseract not
>>>> read the text? What does the website do differently?
>>>>
>>>> Thank you in advance,
>>>>
>>>> Timo
>>>>
>>>> [1] https://en.wikipedia.org/wiki/Philippine_passport#/media/File:Philippine_passport_(2016_edition)_data_page.jpg
>>>> (public domain)
>>>> [2] https://cloud.google.com/document-ai#section-2
>>>>
>>>> --
>>>> You received this message because you are subscribed to the Google
>>>> Groups "tesseract-ocr" group.
>>>> To unsubscribe from this group and stop receiving emails from it, send
>>>> an email to tesseract-oc...@googlegroups.com.
>>>> To view this discussion on the web visit
>>>> https://groups.google.com/d/msgid/tesseract-ocr/4f6d0261-5e0a-49c8-b6db-3e2b0e4ad9f5n%40googlegroups.com
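
Regarding the binarization problem described in the original message quoted above (gray background pixels surviving in the monochrome image): one common way to choose a global black/white cut-off automatically is Otsu's method. The following is only a minimal sketch, assuming the image is already available as rows of 8-bit gray values. The function names `otsu_threshold` and `binarize` are made up for this illustration, and nothing here is known to be what imagescanner-online.com or Google Document AI actually does.

```python
def otsu_threshold(pixels):
    """Return the gray level t (0-255) that maximizes between-class variance.

    `pixels` is a list of rows, each row a list of ints in 0..255.
    Pixels <= t are treated as dark (text), pixels > t as background.
    """
    # Histogram of gray levels.
    hist = [0] * 256
    for row in pixels:
        for v in row:
            hist[v] += 1

    total = sum(hist)
    sum_all = sum(level * count for level, count in enumerate(hist))

    best_t, best_var = 0, -1.0
    w0 = 0    # number of pixels at or below the candidate threshold
    sum0 = 0  # sum of gray values at or below the candidate threshold
    for t in range(256):
        w0 += hist[t]
        sum0 += t * hist[t]
        if w0 == 0:
            continue  # no dark pixels yet
        w1 = total - w0
        if w1 == 0:
            break     # no bright pixels left
        mean0 = sum0 / w0
        mean1 = (sum_all - sum0) / w1
        # Between-class variance, up to a constant factor.
        var_between = w0 * w1 * (mean0 - mean1) ** 2
        if var_between > best_var:
            best_var, best_t = var_between, t
    return best_t


def binarize(pixels, threshold):
    """Map every pixel to 0 (dark) or 255 (bright) using a fixed threshold."""
    return [[0 if v <= threshold else 255 for v in row] for row in pixels]


if __name__ == "__main__":
    # Tiny made-up example: dark text (20, 30), gray residue (180),
    # bright background (200, 210).
    image = [[20, 200, 210],
             [30, 180, 200]]
    t = otsu_threshold(image)
    print(t, binarize(image, t))
```

A single global threshold like this tends to struggle when the background is unevenly lit or carries printed patterns, as on passports and ID cards; in that case a locally adaptive threshold is usually the next thing to try before handing the image to Tesseract.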