I know there are some similar posts - I've read them all! - but they don't seem to provide an answer. I'm in Windows 11 with Tesseract 5.2.0.20220712.
I was having trouble applying a user word list instead of the dawg list, so I made a very simple example: one word that is not correctly detected, plus a user-words file with a single entry that is a close match.

Here's the image, temp.png, which is a slightly blurred image of "testW0rd". Using this command:

    "C:\Program Files\Tesseract-OCR\tesseract" temp.png output --psm 3

I get the result "testwurd" in output.txt.

OK, so following the instructions, I put a file called eng.user-words with one entry, "testWord", in C:\Program Files\Tesseract-OCR\tessdata, and a text file called bazaar in C:\Program Files\Tesseract-OCR\tessdata\configs with the following lines:

    load_system_dawg F
    load_freq_dawg F
    user_words_suffix user-words
    language_model_penalty_non_dict_word 1

When I run again, I get the same result as before: "testwurd". It doesn't seem to be using the user-words file. Or rather, since it errors if the file is not there, it is accessing it but possibly not doing anything with it?

Any ideas why this is not working? I would really appreciate some help with this from an expert.
Attachments: eng.user-words, bazaar
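
For anyone reproducing the setup described above, here is a minimal Python sketch that writes the two files and builds the command line. The helper names write_setup and build_command are hypothetical and exist only for illustration. Passing the config name (bazaar) as the trailing argument is an assumption about how the second run is meant to be invoked, since the post does not show that command. The paths, file names and config lines are copied from the post.

```python
from pathlib import Path

# Paths copied from the post; adjust them for your own install.
TESSERACT_EXE = r"C:\Program Files\Tesseract-OCR\tesseract"
TESSDATA_DIR = r"C:\Program Files\Tesseract-OCR\tessdata"

# Config lines exactly as listed in the post.
CONFIG_LINES = [
    "load_system_dawg F",
    "load_freq_dawg F",
    "user_words_suffix user-words",
    "language_model_penalty_non_dict_word 1",
]


def _write_lines(path, lines):
    # Plain text, one entry per line, "\n" line endings, trailing newline.
    with open(path, "w", encoding="utf-8", newline="\n") as handle:
        handle.write("\n".join(lines) + "\n")


def write_setup(tessdata_dir, lang="eng", words=("testWord",), config_name="bazaar"):
    """Write <lang>.user-words and configs/<config_name> under tessdata_dir."""
    tessdata_dir = Path(tessdata_dir)
    configs_dir = tessdata_dir / "configs"
    configs_dir.mkdir(parents=True, exist_ok=True)

    words_path = tessdata_dir / f"{lang}.user-words"
    config_path = configs_dir / config_name
    _write_lines(words_path, list(words))
    _write_lines(config_path, CONFIG_LINES)
    return words_path, config_path


def build_command(image="temp.png", output_base="output", psm=3, config_name="bazaar"):
    """Build the argument list for the run that uses the config.

    Assumption: the config name is given as the last argument, after the
    options, so that tessdata/configs/<config_name> is loaded.
    """
    return [TESSERACT_EXE, image, output_base, "--psm", str(psm), config_name]
```

Calling write_setup(TESSDATA_DIR) needs write access to the install directory. The list from build_command() can be passed to subprocess.run(..., check=True). Without the trailing config name, the command is the same as the first run in the post.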