I can't help with Tesseract advice - when I wanted to do the same thing I
found it easier to write a custom OCR for this specific problem from
scratch.  It's very much an experiment and a work-in-progress (although
I've not worked on it for about a year I'm afraid) but you might find
something helpful from the discussion or the code:
https://retrocomputingforum.com/t/custom-ocr-for-printer-listings/4016 and
http://gtoal.com/src/OCR/

However you *will* need to do better scans using a flatbed scanner if you
still have access to the originals. Those scans are unusable - the pages in
the recent one had not been laid flat - it looks like they were taken with
an overhead camera.

Graham

On Mon, Feb 10, 2025 at 4:29 AM Mixotricha <connolly.dam...@gmail.com>
wrote:

>
> I have a question about using Tesseract for trying to recover some source
> code of a printed listing that most likely would have come off a line
> printer in the early 70's, probably scanned in by photocopier and then more
> recently by a more modern digital scanner.
>
> I have two copies of the document. One the original scan and another that
> was recently scanned for me by the archive area of the University that
> houses the document. Unfortunately both have different problems!
>
