See tesstrain_utils.sh
On Thu, 20 Jun 2019, 10:55 hrishikesh kaulwar, wrote:
>
> Hey shree could you tell me what line in tesstrain.sh takes care of user
> provided tiff box pairs. Like what is the line which creates lstmf files
> from those pairs and then puts the name of lstmf files in trainin
Hey shree, could you tell me which line in tesstrain.sh takes care of
user-provided tiff/box pairs? That is, which line creates lstmf files from
those pairs and then puts the names of the lstmf files in training_list?
Thanks in advance.
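
For readers following this thread, the step being asked about can be sketched roughly as below. This is a minimal illustration, not the actual code in tesstrain_utils.sh: the helper names (`find_pairs`, `lstmf_command`, `write_training_list`) and the list file name are hypothetical. The only assumption about Tesseract itself is that running it with the `lstm.train` config on an image that has a matching .box file writes a .lstmf file at the given output base.

```python
from pathlib import Path


def find_pairs(train_dir):
    """Return sorted .tif paths that have a matching .box file beside them."""
    train_dir = Path(train_dir)
    return sorted(
        tif for tif in train_dir.glob("*.tif") if tif.with_suffix(".box").exists()
    )


def lstmf_command(tif):
    """Build the tesseract call that should write <base>.lstmf for one image."""
    base = tif.with_suffix("")  # output base; tesseract appends .lstmf to it
    return ["tesseract", str(tif), str(base), "lstm.train"]


def write_training_list(tifs, list_path):
    """Write one expected .lstmf path per line (hypothetical list file)."""
    lines = [str(tif.with_suffix(".lstmf")) for tif in tifs]
    Path(list_path).write_text("\n".join(lines) + "\n", encoding="utf-8")
    return lines
```

Each command list can be passed to `subprocess.run(cmd, check=True)` once Tesseract is installed; the sketch only builds the commands so that it can be inspected without running anything.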
On Tuesday, June 18, 2019 at 2:54:09 PM UTC+5:30, hrishik
Yes, there are a few cases like this in my data due to partially skewed
scanning. Can anyone tell me how to train Tesseract or how to proceed with
OCR in such cases?
A dictionary is a nice idea, I think, since I and l have nearly
indistinguishable shapes in this font. Thanks for helping.
On Thursday, June
Yes, there are a few cases like this in my data due to partially skewed
scanning. Can anyone tell me how to train Tesseract or how to proceed with
OCR in such cases?
On Wednesday, June 19, 2019 at 5:07:18 PM UTC+5:30, hrishikesh kaulwar
wrote:
>
> Dear all,
> In the above image tesseract
There are some cases like this in my data due to partially skewed scanning.
On Wednesday, June 19, 2019 at 5:07:18 PM UTC+5:30, hrishikesh kaulwar
wrote:
>
> Dear all,
> In the above image tesseract could not detect the first letter S
> which is important for my purpose.Also there are fe
Does it have to be distorted like that? It's amazing that a human being can
read it as an S. Is a neural network ever capable of doing the same thing?
If I and l do not take the same shape, I'd think of a dictionary or
post-processing to switch them around.
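
The dictionary idea could look something like the sketch below. It is only an illustration under stated assumptions: the function names and the vocabulary are hypothetical, words are split on whitespace only, and punctuation is not handled. For each OCR'd word it tries every spelling obtained by swapping capital I and small l, and keeps the first one found in a known word list.

```python
from itertools import product

# Capital i and small L are treated as interchangeable OCR readings.
CONFUSABLE = {"I": "Il", "l": "lI"}


def candidates(word):
    """Yield every spelling obtained by swapping I and l in any position."""
    options = [CONFUSABLE.get(ch, ch) for ch in word]
    for combo in product(*options):
        yield "".join(combo)


def fix_word(word, vocabulary):
    """Return the first variant found in vocabulary, else the word unchanged."""
    if word in vocabulary:
        return word
    for variant in candidates(word):
        if variant in vocabulary:
            return variant
    return word


def fix_text(text, vocabulary):
    """Apply fix_word to each whitespace-separated token of the OCR output."""
    return " ".join(fix_word(w, vocabulary) for w in text.split())
```

The number of variants doubles with every I or l in a word, so this is only practical for short tokens; a real vocabulary would come from the domain of the scanned documents.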
On Wednesday, June 19, 2019 at 20:37:18 UTC+9, hrishikesh ka
I'm fine-tuning for chi_sim, not eng, which seems to be more complicated.
On Wednesday, June 19, 2019 at 4:22:29 PM UTC-4, Jingjing Lin wrote:
>
> We know that we do fine tuning for tesseract based on tessdata_best, but
> what do we inherit from tessdata_best? Is it just the weights of the neural
> nets?
>
> From wh
We know that we do fine tuning for tesseract based on tessdata_best, but
what do we inherit from tessdata_best? Is it just the weights of the neural
nets?
From what I have, it looks like the new .unicharset only contains those
characters in the .training_text I created. I guess this means the
Would you be able to provide an example of said table?
On Wed, Jun 19, 2019 at 8:40 AM Momene Vigal wrote:
> Hello, I'm a beginner with Tesseract, currently using it with Java.
> Can anyone please help me with how to OCR a table with Tesseract
> in Python or Java?
>
Thanks for your comments.
So did you mean that the method for adding a special character to eng cannot
be used to add a special character to chi_sim? We'll have to retrain the top
layer to achieve this?
Another question is, when we use a smaller .training_text, the .unicharset
only contains a limited a
Old thread
https://groups.google.com/forum/#!searchin/tesseract-ocr/layer$20chi_sim%7Csort:date/tesseract-ocr/iFMg7Gjczq4/f7_XRop2BAAJ
On Wed, Jun 19, 2019 at 9:13 PM Shree Devi Kumar
wrote:
> Update:
>
> 1. When using a smaller training_text for chi_sim for plus training, the
> unicharset gets
Update:
1. When using a smaller training_text for chi_sim for plus training, the
unicharset gets restricted. So, merge the lstm-unicharset with it.
2. The unicharset for chi_sim using langdata is different from the one
extracted from tessdata_best. So using training_text from langdata will add
mo
Hello, I'm a beginner with Tesseract, currently using it with Java.
Can anyone please help me with how to OCR a table with Tesseract
in Python or Java?
--
You received this message because you are subscribed to the Google Groups
"tesseract-ocr" group.
To unsubscribe from this group
> @Mox Betex
>
>> Did you train Tesseract?
>
Yes, I have.
In the lstmtraining process, it says
Can't encode transcription: 'ကင်းသည် ဖြစ်ရာ၏၊ မင်းမြတ် *ခြင်္သေ့*၏
ရှေးဦးစွာသောX ဤအင်္ဂါကို ယူအပ်၏။' in language '
when there is a kinzi in the string.
On Wednesday, 19 June 2019 11:28:00 UTC+6:30, Pndaza wrote:
>
> I wrongly gave old traineddata (mya-layer.trained
Dear all,
In the above image, Tesseract could not detect the first letter S,
which is important for my purpose. Also, there are a few cases where I
(capital i) and l (small L) are detected wrongly. What training or method
can I use to improve Tesseract's results in such cases?
Thanks in
Thanks all for your answers!
@Mox Betex
> Did you train Tesseract?
>
@ElGato ElMago
> Those images and fonts obviously are not for OCR. Need to improve images
> and train fonts.
No, I use tesseract vanilla, only binary tuning parameters.
I'd like to avoid training my own model at first, but I
Okay, I will ignore it. I just wanted to know what the generation of the text
file signifies in the lstm train step, since it's unusual. Is it some
decoding or encoding error? Does it show incomplete lstm training? I have
attached a sample text file. You can check out the file. Tell me if you know what is
w
Okay.
On Wednesday, June 19, 2019 at 3:18:12 PM UTC+5:30, shree wrote:
>
> > Also, one more doubt: when I use the lstm.train command, a text file also
> > gets generated along with the lstmf file.
> You can ignore that txt file. Only lstmf is used for further processing.
>
> On Wed, Jun 19, 2019 at 2:44 PM hrishi
Hi Nicolas, I think what you did is good, you just need to play with
pre-processing more.
I usually process the images with Gimp until I get good results, then
I try to do the same processing with opencv/PIL.
You do not strictly need to threshold the image, a very very strong
contrast is en
> Also, one more doubt: when I use the lstm.train command, a text file also
> gets generated along with the lstmf file.
You can ignore that txt file. Only lstmf is used for further processing.
On Wed, Jun 19, 2019 at 2:44 PM hrishikesh kaulwar
wrote:
> Hello shree,
> I tried again with .tif and lstm.train co
Hello shree,
I tried again with .tif, and the lstm.train command generated a .txt file
again along with the lstmf file. I don't think that's the error. Thanks for
helping.
On Wednesday, June 19, 2019 at 2:02:54 PM UTC+5:30, shree wrote:
>
> > eng.Arial_Regular.exp0.png
>
> The script expects tif file
> eng.Arial_Regular.exp0.png
The script expects tif files, not png.
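
One way to act on this is to re-save each png as a tif next to the original, keeping the same base name so the box file still matches. The sketch below is only an assumption about how one might do that: it relies on the third-party Pillow package being installed, and both helper names are hypothetical.

```python
from pathlib import Path


def tif_path_for(image_path):
    """Return the .tif path that sits next to a given image path."""
    return Path(image_path).with_suffix(".tif")


def convert_to_tif(image_path):
    """Re-save an image as TIFF using Pillow (assumed to be installed)."""
    from PIL import Image  # third-party; imported lazily so the helper above works without it

    target = tif_path_for(image_path)
    with Image.open(image_path) as img:
        img.save(target, format="TIFF")
    return target
```

For example, `convert_to_tif("eng.Arial_Regular.exp0.png")` would write `eng.Arial_Regular.exp0.tif` in the same directory.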
On Wed, Jun 19, 2019 at 1:42 PM hrishikesh kaulwar
wrote:
> Thank you for your help. I have checked it many times. Could you tell me
> where I am going wrong? It takes my 3 tiff box pairs for example and copies
> it into train d
Thank you for your help. I have checked it many times. Could you tell me
where I am going wrong? It takes my 3 tiff/box pairs, for example, and copies
them into the train directory. Then it overwrites the exp0.tif file using
randomly generated text and the text2image tool. Although 3 tiff/box pairs are accepted
I mean to train with OCR-D.