Re: [tesseract-ocr] OCR add extra characters from image file

2025-01-28 Thread Farokh Irani
I'm using everything as provided in the download. I had some success by enlarging the image a bit when I cropped and converted it from PDF to TIF, but the problem still occurs on other images. On Tuesday, January 28, 2025 at 1:41:35 AM UTC-5 sara.el...@gmail.com wrote: > I'm also facing the
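
For anyone who wants to try the same enlargement outside the PDF conversion step, here is a rough sketch. It assumes Pillow is installed; the file names and the 1.5 scale factor are placeholders, not values from the message above, and the grayscale conversion is my own choice so the resampling filter has something to smooth.

```python
def scaled_size(size, factor):
    """Return (width, height) multiplied by factor, rounded to whole pixels."""
    width, height = size
    return (max(1, round(width * factor)), max(1, round(height * factor)))


def enlarge(src_path, dst_path, factor=1.5):
    """Write an enlarged copy of src_path to dst_path.

    Assumption: Pillow is available. The factor is a guess to experiment
    with, not a recommended value.
    """
    from PIL import Image  # imported here so scaled_size works without Pillow

    with Image.open(src_path) as img:
        # Convert bilevel input to grayscale before resampling.
        gray = img.convert("L")
        bigger = gray.resize(scaled_size(gray.size, factor), Image.LANCZOS)
        # Keep the 300 DPI tag that the original scan reportedly has.
        bigger.save(dst_path, dpi=(300, 300))
```

Calling `enlarge("crop.tif", "crop_big.tif")` (hypothetical file names) and running OCR on the result would show whether scaling alone changes the output for a given image.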

Re: [tesseract-ocr] OCR add extra characters from image file

2025-01-27 Thread Sara Elshobaky
I'm also facing the same problem. - Which model are you using? - Is it from the original tessdata models or a new one you tuned? - Also, is the original model from the tessdata folder, or from the tessdata/scripts folder? On Mon, Jan 27, 2025 at 5:56 PM Farokh Irani wrote: > I have a small .TIF

[tesseract-ocr] OCR add extra characters from image file

2025-01-27 Thread Farokh Irani
I have a small .TIF file with only around 28 characters. It's 300 DPI, B&W, no compression. The issue is that the image contains the following text: 04-50288 2 and after OCR, I wind up with the text 0464-502882. I've tried several --psm values (6, 7, 11, 13); all produce the same output. Any i
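
To make the comparison across page segmentation modes repeatable, something like the sketch below could be used. It assumes the `tesseract` executable is on PATH and that the image really is 300 DPI as described; `sample.tif` in the usage note is a placeholder file name, and the helper names are made up for illustration.

```python
import shutil
import subprocess

# The page segmentation modes mentioned in the message above.
PSM_MODES = (6, 7, 11, 13)


def build_command(image_path, psm, dpi=300):
    """Return the tesseract argument list for one page segmentation mode."""
    return ["tesseract", image_path, "stdout",
            "--psm", str(psm), "--dpi", str(dpi)]


def run_all(image_path, modes=PSM_MODES):
    """Run tesseract once per mode and return {psm: recognized text}."""
    if shutil.which("tesseract") is None:
        raise RuntimeError("tesseract executable not found on PATH")
    results = {}
    for psm in modes:
        proc = subprocess.run(build_command(image_path, psm),
                              capture_output=True, text=True, check=True)
        results[psm] = proc.stdout.strip()
    return results
```

Running `run_all("sample.tif")` and printing the dictionary puts the output of each mode side by side, which makes it easy to confirm whether the extra characters really appear in every mode.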