First of all:
Unless you share the input image, it does not make sense to share the output.

Next: read the documentation. You can start here
If image preprocessing and document analysis/text detection fail,
training will not help you.
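
To make the preprocessing point concrete, here is a minimal sketch, not a
recommended recipe. It assumes a scanned page that benefits from simple global
binarization; your image may need something else (deskewing, rescaling,
denoising). The helper names otsu_threshold, binarize and
ocr_with_preprocessing are made up for this example, and "--psm 6" (treat the
page as a single uniform block of text) is just one page segmentation mode to
try. Pillow and pytesseract are only imported when the OCR helper is called.

```python
def otsu_threshold(pixels):
    """Return an Otsu threshold (0-255) for 8-bit grayscale values."""
    hist = [0] * 256
    for p in pixels:
        hist[p] += 1
    total = sum(hist)
    sum_all = sum(i * h for i, h in enumerate(hist))

    best_t, best_var = 0, -1.0
    w_bg = 0    # pixels with value <= t
    sum_bg = 0  # sum of their gray values
    for t in range(256):
        w_bg += hist[t]
        sum_bg += t * hist[t]
        if w_bg == 0:
            continue
        w_fg = total - w_bg
        if w_fg == 0:
            break
        mean_bg = sum_bg / w_bg
        mean_fg = (sum_all - sum_bg) / w_fg
        # Between-class variance (up to a constant factor).
        var_between = w_bg * w_fg * (mean_bg - mean_fg) ** 2
        if var_between > best_var:
            best_var, best_t = var_between, t
    return best_t


def binarize(pixels, threshold):
    """Map every pixel to pure black (0) or pure white (255)."""
    return [255 if p > threshold else 0 for p in pixels]


def ocr_with_preprocessing(path, psm=6):
    """Grayscale + Otsu binarization, then OCR (hypothetical helper)."""
    # Imported lazily so the functions above work without these packages.
    from PIL import Image
    import pytesseract

    gray = Image.open(path).convert("L")
    t = otsu_threshold(gray.tobytes())  # mode "L" is one byte per pixel
    bw = gray.point(lambda p: 255 if p > t else 0)
    return pytesseract.image_to_string(bw, config=f"--psm {psm}")
```

In Colab you would call ocr_with_preprocessing("page.png") on your own file
(the file name is a placeholder) and compare the text with what you get from
the unprocessed image.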

If you need to know the model in detail, you will need to read the source
code, I am afraid.


Tue, 5 Oct 2021 at 16:51, Ruchika Tyagi <> wrote:

> hi Zdenko,
> Thanks for your feedback!
> I have implemented the following things in Colab:
> 1/ installed tesseract ocr and pytesseract
> 2/ Used pytesseract.image_to_string to convert the image of scanned
> document to text.
> The output text is like:
> sae S\Pewnowet refer Yo We Uniovetha, Bops don't a where MWAH ple
> Commvadityer gre. Avediarie tee wode Onden OMe wol ' and On Wigs kcale. of
> Oferakin, nee. es: [rer Bat Chain in Prd Vegelanie “roger | SP in Pst
> Vegelasie “Wieder | ; AD Me ]8 inc ug Maer Contumneg hom Nes “I —> ty Uae |
> . Mere ed Serigh Soma)
> Which is not making sense.
> So I was asking if there are ways to dig deeper into tesseract built in
> model and understand the output of each layer. And then try some
> enhancements to decode this better.
> But for that, I need to know the model in detail and be able to use it in
> Colab, and I am not able to find any relevant text about it. All I could
> find is tuning the model from the command line, and that only on Linux
> machines. So if there is any, I would request you to provide a reference.
> Ruchika
> On Tuesday, October 5, 2021 at 3:12:02 PM UTC+5:30 zdenop wrote:
>> Generally: new user + "i want to train tesseract" = fail
>> If you are asking for help/support, provide information about what you
>> have already tried, some examples of input images, tools you are able/plan
>> to use...
>> Zdenko
>> Tue, 5 Oct 2021 at 11:36, Ruchika Tyagi <> wrote:
>>> hello,
>>> I am new to Tesseract and trying to use it for one of my use cases.
>>> I wonder if there is any way to use the already trained models through
>>> Colab? And further train them if required.
>>> I am actually looking for the outputs after each layer, and maybe to
>>> remove the top layer for further processing. However, so far I have not
>>> found anything relevant on this.
>>> Can anyone please help?
>>> Thanks
>>> --
>>> You received this message because you are subscribed to the Google
>>> Groups "tesseract-ocr" group.
>>> To unsubscribe from this group and stop receiving emails from it, send
>>> an email to
>>> To view this discussion on the web visit
>>> <>
>>> .
