Apologies. The Python file is in the Google Group, but for some reason it didn’t come down with the email.
Also, I now have a sample program (nearly) working in C++. My last step was to copy all the DLLs from the vcpkg install into the source directory; otherwise they weren’t found at run time. All that remains is to set the location of the language file, and then it should work.

But the Python will be helpful nonetheless.

Iain

From: tesseract-ocr@googlegroups.com [mailto:tesseract-ocr@googlegroups.com] On Behalf Of Dominic Mukilan
Sent: 13 July 2024 17:42
To: tesseract-ocr@googlegroups.com
Subject: Re: [tesseract-ocr] Tessarct won't recognise single characters

Attaching the Python file, the supporting files, and requirements.txt

On Sat, 13 Jul 2024 at 21:56, Iain Downs <i...@idcl.co.uk> wrote:

Can you give me some example code? I'm currently trying to get tesseract working for C++ in Visual Studio and it's a bit of a nightmare. Python seems easier, though it's not one of my main languages. I can try it out though!

Iain

On Saturday, July 13, 2024 at 11:20:54 AM UTC+1 renec...@gmail.com wrote:

Hi, I tried your example with tesseract for Python and it works well.

On Thu, 11 Jul 2024 at 20:35, Iain Downs <ia...@idcl.co.uk> wrote:

I'm trying to extract page numbers from scanned pages of text. Page numbers are either at the top or at the bottom, sometimes with titles / authors / chapters. Occasionally they are elsewhere, but I don't care about the exceptions.

I've loaded tesseract 5.4 (Windows) and run some tests using the executable. I'm finding that if the page number is a single digit on the line, tesseract ignores it (but otherwise does a fantastic job of OCR, even with skewed and noisy images).

I've isolated the single line and used that as input, and tesseract tells me 'the page is empty'.

Here is a sample of a single line with a '1' in it; the resolution is 300dpi.

Ultimately I would be writing a program using tesseract, but in the first instance I'd like to see it work with the exe.
So, can I tell tesseract to be less fussy with individual characters and, if not, how would I do so programmatically, if possible?

Thanks

Iain

--
You received this message because you are subscribed to the Google Groups "tesseract-ocr" group.
To unsubscribe from this group and stop receiving emails from it, send an email to tesseract-oc...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/tesseract-ocr/c42d435c-4db5-48b5-94d3-5b761d340731n%40googlegroups.com.
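For anyone landing on this thread with the same single-digit problem: the thread itself does not record which settings resolved it, so the following is only a sketch of one commonly tried approach, not a confirmed fix. It assumes the `tesseract` executable is on PATH and relies on the documented CLI options `--psm` (10 treats the image as a single character, 7 as a single text line), `--dpi`, `--tessdata-dir`, and the `tessedit_char_whitelist` variable. The helper names `build_tesseract_command` and `read_page_number` are made up for illustration.

```python
import shutil
import subprocess
from typing import List, Optional


def build_tesseract_command(
    image_path: str,
    psm: int = 10,                        # 10 = single character, 7 = single text line
    whitelist: Optional[str] = "0123456789",
    dpi: Optional[int] = 300,             # the scans in the thread are 300dpi
    tessdata_dir: Optional[str] = None,   # folder holding e.g. eng.traineddata
    lang: str = "eng",
) -> List[str]:
    """Build (but do not run) a tesseract CLI call for a small page-number crop."""
    # "stdout" as the output base makes tesseract print the text instead of writing a file.
    cmd = ["tesseract", image_path, "stdout", "-l", lang, "--psm", str(psm)]
    if dpi is not None:
        cmd += ["--dpi", str(dpi)]
    if tessdata_dir is not None:
        cmd += ["--tessdata-dir", tessdata_dir]
    if whitelist:
        # Assumption: page numbers are plain digits, so restrict the character set.
        cmd += ["-c", f"tessedit_char_whitelist={whitelist}"]
    return cmd


def read_page_number(image_path: str, **options) -> str:
    """Run tesseract on the crop and return the recognised text, stripped."""
    if shutil.which("tesseract") is None:
        raise RuntimeError("tesseract executable not found on PATH")
    result = subprocess.run(
        build_tesseract_command(image_path, **options),
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout.strip()
```

With a hypothetical crop saved as `line.png`, `read_page_number("line.png")` would try single-character mode, and `read_page_number("line.png", psm=7)` single-line mode. The same argument list can be typed directly at the command prompt to test with the exe first, as the original question asks.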