Hey folks, I downloaded tesseract tonight and I'm having an issue I can't 
get past. The error output is as follows:

Deserialize header failed: ☺
First document cannot be empty!!
num_pages_per_doc_ > 0:Error:Assert failed:in file ../../../src/ccstruct/imagedata.cpp, line 704

I am using a TIFF file as my raw image source, and I have tried two 
different methods of generating it. The first is taking a screenshot with 
Snipping Tool, pasting it into GIMP, and saving it as a .tif; I also tried 
Print Screen instead of Snipping Tool. The second is taking a screenshot 
with Snipping Tool, saving it as a .png, and then converting that to .tif 
with the ImageMagick command line. I am creating the box file like so:

tesseract 9.tif 9 makebox

I then edit the box file to make sure it is an accurate representation 
of the characters on the screen. I have also tried creating the box file 
and leaving it unedited to see if that resolves the issue; it does not. I 
then proceed to create the lstmf file like so:

tesseract 9.tif 9 --psm 6 lstm.train
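
For anyone trying to reproduce this, here is a minimal sketch of a sanity check around that step. It assumes tesseract is on the PATH and that 9.tif and the edited 9.box sit in the current directory; BASE and check_nonempty are names made up for this sketch, not tesseract options. The only goal is to rule out a missing or zero-byte 9.lstmf before lstmtraining or lstmeval ever sees it; it does not validate the contents of the file.

```shell
#!/bin/sh
# Sketch only: wraps the lstm.train step from this post with output checks.
# BASE and check_nonempty are illustrative names, not part of tesseract.
BASE="${BASE:-9}"

# Succeeds only if the file exists and is larger than zero bytes.
check_nonempty() {
    if [ -s "$1" ]; then
        echo "ok: $1"
    else
        echo "missing or empty: $1"
        return 1
    fi
}

# Run the step only when tesseract is installed and the image is present.
if command -v tesseract >/dev/null 2>&1 && [ -f "$BASE.tif" ]; then
    # The box file is not regenerated here, so manual edits are preserved.
    check_nonempty "$BASE.box" &&
        tesseract "$BASE.tif" "$BASE" --psm 6 lstm.train &&
        check_nonempty "$BASE.lstmf"
fi
```

If the last check reports "missing or empty", the problem is in the lstm.train step rather than in lstmtraining or lstmeval.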

I then try to run lstmtraining or lstmeval and I get the header error every 
time. I am using version 5.3.3, but I have also tried v4.1, recreating all 
the files, and I still got the same issue. Does anyone know why I'm getting 
this error and how to resolve it? I'm about to give up on tesseract because 
this shit does not work out of the box. I am following the Google 
instructions to a T, so either I overlooked something crucial that is 
ruining my lstmf file or this shit just does not work for me. I appreciate 
any help that can be provided.
