Hi Mark,
On 08/03/2024 20:24, Mark Pellegrino wrote:
Thank you Merlijn, this is very helpful. I'm very interested in IA's
process so I'll have a deep dive through those tools. This confirms my
suspicions that there's no way to use an off-the-shelf text editor with a
glyphless font. I'll explore these hOCR editor options. All the best,
On Fri, Mar
Thanks Zdenko, PyMuPDF is an intriguing option. I'll check it out further.
On Fri, Mar 8, 2024 at 6:14 AM Zdenko Podobny wrote:
> Hello,
>
>
> I am not sure if OCRmyPDF (https://ocrmypdf.readthedocs.io/en/latest/)
> allows redaction.
>
> If you would like to implement the text layer yourself with cust
Warning: LSTMTrainer deserialized an LSTMRecognizer!
Error, data/eng/eng_num_vert.lstm is an integer (fast) model, cannot continue training
Failed to continue from: data/eng/eng_num_vert.lstm
make: *** [Makefile:351: data/eng_num_vert/checkpoints/eng_num_vert_checkpoint] Error 1
I need to fine t
Hi Mark,
On 07/03/2024 20:53, Mark Pellegrino wrote:
I found more info here:
https://github.com/tesseract-ocr/tesseract/issues/1769#issuecomment-509490277
Glyphless appears to be an 'invisible font' and the only one Tesseract
supports. It seems like the solution is to use Tesseract to generate
hO
Hello,
I am not sure if OCRmyPDF (https://ocrmypdf.readthedocs.io/en/latest/)
allows redaction.
If you would like to implement the text layer yourself with a custom font,
have a look at PyMuPDF:
- https://github.com/pymupdf/PyMuPDF/discussions/775 (Adding text layer
to a scanned PDF)
- https://