Apologies.  The Python file is in the Google Group, but for some reason it didn’t 
come down with the email.

 

Also, I now have a sample program (nearly) working in C++.  My last step was to 
copy all the DLLs from the vcpkg install into the source directory; otherwise 
they weren’t found at run time.  All that is left is setting the location of the 
language file, and then it should work.  The Python will be helpful nonetheless.
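
For the language file location, here is a minimal Python sketch of one option: setting the TESSDATA_PREFIX environment variable before starting tesseract. It assumes Tesseract 4 or later, where the variable is expected to name the tessdata directory itself. The helper name tessdata_env is made up for illustration and is not part of any Tesseract API.

```python
import os
from pathlib import Path


def tessdata_env(tessdata_dir, lang="eng", base_env=None):
    """Return a copy of the environment with TESSDATA_PREFIX set.

    Raises FileNotFoundError if <lang>.traineddata is not in tessdata_dir,
    so a wrong path is caught before tesseract is started.
    """
    tessdata_dir = Path(tessdata_dir)
    model = tessdata_dir / f"{lang}.traineddata"
    if not model.is_file():
        raise FileNotFoundError(f"missing language file: {model}")
    # Copy rather than mutate, so the caller's environment is left alone.
    env = dict(os.environ if base_env is None else base_env)
    env["TESSDATA_PREFIX"] = str(tessdata_dir)
    return env
```

The returned dictionary can be passed as env= to subprocess.run when launching the tesseract executable. Passing the same directory on the command line with --tessdata-dir is an alternative.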

 

Iain

 

From: tesseract-ocr@googlegroups.com [mailto:tesseract-ocr@googlegroups.com] On 
Behalf Of Dominic Mukilan
Sent: 13 July 2024 17:42
To: tesseract-ocr@googlegroups.com
Subject: Re: [tesseract-ocr] Tessarct won't recognise single characters

 

Attaching the python file, the supporting files, and requirements.txt

 

On Sat, 13 Jul 2024 at 21:56, Iain Downs <i...@idcl.co.uk 
<mailto:i...@idcl.co.uk> > wrote:

Can you give me some example code?  I'm currently trying to get Tesseract 
working for C++ in Visual Studio and it's a bit of a nightmare.  Python seems 
easier, and although it's not one of my main languages I can try it out!

 

Iain

On Saturday, July 13, 2024 at 11:20:54 AM UTC+1 renec...@gmail.com 
<mailto:renec...@gmail.com>  wrote:

Hi,

I tried your example with Tesseract for Python, and it works well.

 

On Thu, 11 Jul 2024 at 20:35, Iain Downs <ia...@idcl.co.uk> wrote:

I'm trying to extract page numbers from scanned pages of text.  Page Numbers 
are either at the top or at the bottom - sometimes with titles / authors / 
chapters.  Occasionally elsewhere, but I don't care about the exceptions.

 

I've loaded tesseract 5.4 (windows) and run some tests using the executable.  
I'm finding that if the page number is a single digit on the line, tesseract 
ignores it (but otherwise does a fantastic job of OCR even with skewed and 
noisy images).

 

I've isolated the single line and used that as input, and tesseract tells me 
'the page is empty'.

 

Here is a sample of a single line with a '1' in it; the resolution is 300 dpi.



 

Ultimately I would be writing a program using tesseract, but in the first 
instance I'd like to see it work with the exe.

 

So, can I tell tesseract to be less fussy with individual characters and, if 
not, how would I do so programmatically - if possible?
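
One thing that may be worth trying is the page segmentation mode. The sketch below builds a command line for the tesseract executable using --psm 10 (treat the image as a single character), a digits-only tessedit_char_whitelist, and an explicit --dpi. Whether this clears the empty-page result for this particular image is untested. The function names and the file name page_number.png are placeholders.

```python
import shutil
import subprocess


def build_tesseract_command(image_path, psm=10, whitelist="0123456789", dpi=300):
    """Return the argument list for a tesseract CLI call on a tiny image."""
    return [
        "tesseract",
        str(image_path),
        "stdout",              # write recognised text to standard output
        "--psm", str(psm),     # 10 = single character, 7 = single text line
        "--dpi", str(dpi),     # the sample in this thread is 300 dpi
        "-c", f"tessedit_char_whitelist={whitelist}",
    ]


def read_page_number(image_path, **options):
    """Run tesseract; return the stripped text, or None if the exe is not on PATH."""
    if shutil.which("tesseract") is None:
        return None
    result = subprocess.run(
        build_tesseract_command(image_path, **options),
        capture_output=True, text=True, check=False,
    )
    return result.stdout.strip()


if __name__ == "__main__":
    # "page_number.png" is a placeholder file name.
    print(build_tesseract_command("page_number.png"))
```

For page numbers with more than one digit, psm=7 (single text line) is probably the better fit, e.g. read_page_number("page_number.png", psm=7).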

 

Thanks

 

Iain

-- 
You received this message because you are subscribed to the Google Groups 
"tesseract-ocr" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to tesseract-oc...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/tesseract-ocr/c42d435c-4db5-48b5-94d3-5b761d340731n%40googlegroups.com
 
<https://groups.google.com/d/msgid/tesseract-ocr/c42d435c-4db5-48b5-94d3-5b761d340731n%40googlegroups.com?utm_medium=email&utm_source=footer>
 .

