I have exactly the same problem with Amharic: three characters are 
missing, and this is degrading the OCR results. 
Dear Shree, could you please help me?

On Friday, January 6, 2017 at 3:50:38 PM UTC+3 shree wrote:

> I have uploaded modified nor.traineddata at
>
> https://github.com/Shreeshrii/tessdata4alpha/blob/master/nor.traineddata
>
> See the attached log and info file for the commands used in training. 
> Training ran for about 9 hours on my PC and reached only about 1700 
> iterations before the PC froze, so I rebooted and created the traineddata 
> from norlayer0.853_1615.lstm, i.e. a 0.853% character error rate at 
> iteration 1615.
>
>
> ShreeDevi
> ____________________________________________________________
> भजन - कीर्तन - आरती @ http://bhajans.ramparivar.com
>
> On Fri, Jan 6, 2017 at 5:59 PM, ShreeDevi Kumar <shree...@gmail.com> 
> wrote:
>
>> @Peter, have you tried the 4.0.0alpha version yet?
>>
>> @Ludvig F. Aarstad - Training by adding a layer worked for adding 'Æ'. I 
>> will upload the new traineddata so that you can test it. You will need the 
>> 4.0.0alpha version for testing.
>>
>> Here are a couple of the training TIFs and the OCRed text.
>>
>> ShreeDevi
>> ____________________________________________________________
>> भजन - कीर्तन - आरती @ http://bhajans.ramparivar.com
>>
>> On Fri, Jan 6, 2017 at 5:01 PM, Peter <pe...@peterkrantz.se> wrote:
>>
>>>
>>>
>>> On Thursday, January 5, 2017 at 04:39:01 UTC+1, shree wrote:
>>>>
>>>> Ray is planning to retrain the languages for the new 4.0.0 version 
>>>> sometime in January. So it would be helpful if you could open an issue on 
>>>> https://github.com/tesseract-ocr/langdata/issues with this information.
>>>>
>>>
>>> Is it possible to contribute training data for this effort? I realise 
>>> Swedish will not be at the top of the list, but I think it would be easy 
>>> to involve some of the research community here in contributing training 
>>> data if it could improve the language model.
>>>
>>> /Peter 
>>>
>>> -- 
>>> You received this message because you are subscribed to the Google 
>>> Groups "tesseract-ocr" group.
>>> To unsubscribe from this group and stop receiving emails from it, send 
>>> an email to tesseract-ocr+unsubscr...@googlegroups.com.
>>> To post to this group, send email to tesser...@googlegroups.com.
>>> Visit this group at https://groups.google.com/group/tesseract-ocr.
>>> To view this discussion on the web visit 
>>> https://groups.google.com/d/msgid/tesseract-ocr/9788db26-bb8a-4861-b29e-80db2b5a687f%40googlegroups.com.
>>>
>>> For more options, visit https://groups.google.com/d/optout.
>>>
>>
>>
>

