Re: Having traindata files uncombined

2012-11-15 Thread Zdenko Podobný
estions? Thanks! On Sat, Aug 11, 2012 at 6:48 PM, Chathuri Gunawardhana < lanch.gunawardh...@gmail.com> wrote: -- Forwarded message -- From: zdenko podobny Date: Sat, Aug 11, 2012 at 6:38 PM Subject: Re: Having traindata files uncombined To: tesseract-ocr@googlegroups.com

Re: Having traindata files uncombined

2012-08-12 Thread Chathuri Gunawardhana
Chathuri Gunawardhana < >> lanch.gunawardh...@gmail.com> wrote: >> >>> >>> >>> -- Forwarded message -- >>> From: zdenko podobny >>> Date: Sat, Aug 11, 2012 at 6:38 PM >>> Subject: Re: Having traindata files unc

Re: Having traindata files uncombined

2012-08-12 Thread zdenko podobny
lanch.gunawardh...@gmail.com> wrote: > >> >> >> -- Forwarded message -- >> From: zdenko podobny >> Date: Sat, Aug 11, 2012 at 6:38 PM >> Subject: Re: Having traindata files uncombined >> To: tesseract-ocr@googlegroups.com >> >> >> Yea

Re: Having traindata files uncombined

2012-08-12 Thread Chathuri Gunawardhana
ug 11, 2012 at 6:48 PM, Chathuri Gunawardhana < lanch.gunawardh...@gmail.com> wrote: > > > -- Forwarded message -- > From: zdenko podobny > Date: Sat, Aug 11, 2012 at 6:38 PM > Subject: Re: Having traindata files uncombined > To: tesseract-ocr@googlegr

Re: Having traindata files uncombined

2012-08-11 Thread Chathuri Gunawardhana
Thank you very much! On Sat, Aug 11, 2012 at 6:38 PM, zdenko podobny wrote: > Yeah - it is much better ;-) > Unfortunately at the moment I do not have time for deep testing so here > are my suggestions: > >- if you are using tesseract via api, try to set rectangles (instead >of whole ima

Re: Having traindata files uncombined

2012-08-11 Thread zdenko podobny
Yeah - it is much better ;-) Unfortunately, at the moment I do not have time for deep testing, so here are my suggestions: - if you are using tesseract via the API, try to set rectangles (instead of the whole image) with the coordinates of the city names to avoid "noise" (e.g. contours) from the map. tesseract is "
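
The rectangle suggestion above could be prepared along these lines. This is a minimal sketch, not something from the thread: it assumes the bounding box of each city name is already known in pixel coordinates, and `padded_rect` is a hypothetical helper. It only computes a clamped (left, top, width, height) tuple, which is the argument order taken by `TessBaseAPI::SetRectangle` in the C++ API.

```python
from typing import Tuple


def padded_rect(
    box: Tuple[int, int, int, int],
    image_size: Tuple[int, int],
    pad: int = 4,
) -> Tuple[int, int, int, int]:
    """Return (left, top, width, height) for a label, padded and clamped.

    box        -- (left, top, right, bottom) of the text in pixels (assumed known)
    image_size -- (width, height) of the whole map image
    pad        -- extra pixels kept around the text; the default is arbitrary
    """
    left, top, right, bottom = box
    img_w, img_h = image_size

    # Grow the box slightly, but never beyond the image borders.
    left = max(0, left - pad)
    top = max(0, top - pad)
    right = min(img_w, right + pad)
    bottom = min(img_h, bottom + pad)

    if right <= left or bottom <= top:
        raise ValueError("box lies outside the image or is empty")

    return left, top, right - left, bottom - top


if __name__ == "__main__":
    # Hypothetical label position on an 800x600 map.
    print(padded_rect((10, 20, 110, 40), (800, 600)))
```

Recognizing one small rectangle per label keeps map contours and neighbouring symbols out of the region that tesseract sees.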

Re: Having traindata files uncombined

2012-08-11 Thread Chathuri Gunawardhana
Actually, you can use the image at http://www.taprobanetravels.com/images/map-of-sri-lanka.jpg instead. It is of higher quality than the one above. On Sat, Aug 11, 2012 at 4:40 PM, zdenko podobny wrote: > > On Sat, Aug 11, 2012 at 12:58 PM, Chathuri Gunawardhana < > lanch.gunawardh...@gmail.com> wrote: > >> Imag

Re: Having traindata files uncombined

2012-08-11 Thread zdenko podobny
On Sat, Aug 11, 2012 at 12:58 PM, Chathuri Gunawardhana < lanch.gunawardh...@gmail.com> wrote: > Image that I'm trying to identify is attached. Most words in here are not > identified correctly. I added these words to user words and combined. But > still didn't get the expected output. > > your at

Re: Having traindata files uncombined

2012-08-11 Thread Chathuri Gunawardhana
The image that I'm trying to identify is attached. Most of the words in it are not identified correctly. I added these words to user words and combined the files, but still didn't get the expected output. On Sat, Aug 11, 2012 at 4:24 PM, zdenko podobny wrote: > > > On Sat, Aug 11, 2012 at 12:46 PM, Chathuri Gunawa

Re: Having traindata files uncombined

2012-08-11 Thread zdenko podobny
On Sat, Aug 11, 2012 at 12:46 PM, Chathuri Gunawardhana < lanch.gunawardh...@gmail.com> wrote: > Yes I was able to unpack them, added words to wordlist and word-freq files > created dawg from these 2 files and then pack all to create traindata. But > with newly created traindata also, tesseract do

Re: Having traindata files uncombined

2012-08-11 Thread Chathuri Gunawardhana
Yes, I was able to unpack them. I added words to the wordlist and word-freq files, created dawg files from these 2 files, and then packed everything to create the traindata. But even with the newly created traindata, tesseract does not identify these words. Can you please help me? On Fri, Aug 10, 2012 at 11:35 PM, Nick White wr
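
When adding words to an unpacked word list as described above, one thing worth ruling out is that the new entries went in with stray whitespace or as duplicates. The following is a minimal sketch of that merge step only; `merge_words` is a hypothetical helper, the place names are illustrative, and nothing here is claimed to explain why the words were not recognized in this thread.

```python
from typing import Iterable, List


def merge_words(existing: Iterable[str], new_words: Iterable[str]) -> List[str]:
    """Merge two word collections, one word per entry.

    Blank entries and surrounding whitespace are dropped, and each word
    is kept only once, in first-seen order.
    """
    seen = set()
    merged: List[str] = []
    for word in list(existing) + list(new_words):
        word = word.strip()
        if word and word not in seen:
            seen.add(word)
            merged.append(word)
    return merged


if __name__ == "__main__":
    # Illustrative entries; a real list would be read from the unpacked
    # word list file, one word per line.
    base = ["the", "of"]
    extra = ["Colombo", "the", " Kandy ", ""]
    for word in merge_words(base, extra):
        print(word)
```

The merged list would then be written back one word per line before the dawg file is rebuilt.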

Re: Having traindata files uncombined

2012-08-11 Thread Nick White
See my reply to you sent yesterday, subject: 'Re: Tesseract does not identify local words written in English'. Basically you can extract the needed files then recombine them using combine_tessdata. Look at the man page for details. Nick
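
The extract-and-recombine round trip mentioned here can be sketched roughly as follows. This is an outline under assumptions, not a tested recipe: the component names (`<lang>.word-dawg`, `<lang>.unicharset`) and the word list path are assumptions, `user_words_roundtrip` is a hypothetical helper, and the exact arguments should be checked against the combine_tessdata and wordlist2dawg man pages for the installed version. The function only builds the command lines; it does not run them.

```python
from typing import List


def user_words_roundtrip(traineddata: str, lang: str, wordlist: str) -> List[List[str]]:
    """Return the command lines for extract -> rebuild dawg -> overwrite.

    traineddata -- path to the combined file, e.g. "eng.traineddata"
    lang        -- language prefix used for component file names
    wordlist    -- plain-text word list, one word per line (assumed edited already)
    """
    word_dawg = f"{lang}.word-dawg"
    unicharset = f"{lang}.unicharset"
    return [
        # Extract the components needed to rebuild the dictionary.
        ["combine_tessdata", "-e", traineddata, unicharset, word_dawg],
        # Convert the edited word list into a dawg, using the unicharset.
        ["wordlist2dawg", wordlist, word_dawg, unicharset],
        # Overwrite only the dictionary component in the combined file.
        ["combine_tessdata", "-o", traineddata, word_dawg],
    ]


if __name__ == "__main__":
    for cmd in user_words_roundtrip("eng.traineddata", "eng", "words.txt"):
        print(" ".join(cmd))
```

Each list can be passed to `subprocess.run(cmd, check=True)` once the arguments have been confirmed. Working on a copy of the traineddata file is advisable, since the last step modifies it in place.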

Having traindata files uncombined

2012-08-10 Thread Chathuri Gunawardhana
Dear sir, can we have the files mentioned in http://code.google.com/p/tesseract-ocr/wiki/TrainingTesseract3 that have been trained for English? I need to add some new words to user-words, but since the user-words file is not available with the tesseract installation, I can't do it. I just need to add so