Please share a sample of the image you're trying to recognize.
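
In case it helps when comparing runs: the character error rate mentioned further down the thread can also be checked by hand on a few known images, independently of the training log. Below is a minimal sketch, assuming plain Levenshtein distance over characters. It is not the BCER computation that the training tools print, and the function names are made up for illustration.

```python
def edit_distance(ref: str, hyp: str) -> int:
    """Levenshtein distance between two strings (insert, delete, substitute)."""
    # prev[j] holds the distance between the processed prefix of ref and hyp[:j].
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]  # distance from ref[:i] to the empty string
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def char_error_rate(ref: str, hyp: str) -> float:
    """Edit distance divided by the length of the reference text."""
    if not ref:
        raise ValueError("reference text must not be empty")
    return edit_distance(ref, hyp) / len(ref)
```

Passing the ground-truth text as `ref` and the OCR output for the same image as `hyp` gives a per-image error rate that can be compared before and after a training run.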

On Saturday, April 1, 2023 at 10:56:58 UTC-4, ali8a...@gmail.com wrote:

> Is it best to train a new language? 
>
> On Saturday, April 1, 2023 at 7:54:30 a.m. UTC-7 shree wrote:
>
>> Aurebesh seems to be different symbols mapped to the English alphabet 
>> rather than a new font for English, hence training would need to be for a 
>> new language rather than just fine-tuning.
>>
>> On Sat, Apr 1, 2023, 10:47 Ali Abedian <ali8a...@gmail.com> wrote:
>>
>>> Hello,
>>>
>>> Thank you for providing the references, but I'm still a bit confused. I 
>>> have trained tesseract using the same method as described in 
>>> https://github.com/tesseract-ocr/tesstrain/blob/main/ocrd-testset.zip, 
>>> with 100,000 sentences and a maximum iteration of 10,000. However, it still 
>>> cannot recognize a 6-letter word that I input from a TIF file using the 
>>> same font and settings. I have tried using fewer iterations, such as 1,000, 
>>> as well as more iterations, such as 20,000 and 100,000, but still no 
>>> results. Additionally, the BCER (Character Error Rate) doesn't seem to 
>>> change significantly with more iterations, remaining at 3.56%. I'm 
>>> unsure of what I'm doing wrong or what I should do next, but any help would 
>>> be appreciated.
>>>
>>> Thank you.
>>> On Saturday, April 1, 2023 at 12:05:36 a.m. UTC-7 zdenop wrote:
>>>
>>>> Please have a look at https://github.com/tesseract-ocr/tesstrain 
>>>> (especially 
>>>> https://github.com/tesseract-ocr/tesstrain/blob/main/ocrd-testset.zip)
>>>>
>>>>
>>>> Zdenko
>>>>
>>>>
>>>> On Fri, Mar 31, 2023 at 7:03, Ali Abedian <ali8a...@gmail.com> wrote:
>>>>
>>>>> Hey everyone! I'm currently working on a personal project where I'm 
>>>>> training a new font for the English language using Tesseract. The font is 
>>>>> called Aurebesh and it's from the Star Wars universe. Basically, each 
>>>>> letter in Aurebesh corresponds to a letter in English. I've collected close 
>>>>> to 100,000 images and their corresponding translations, but I'm not sure 
>>>>> how many iterations I should run for a file of this size. I've tried 
>>>>> training with only 100 images, but it didn't work out. Can anyone advise me 
>>>>> on how many iterations I should run and whether it's even possible to train 
>>>>> a new font like this?
>>>>>
>>>>> -- 
>>>>> You received this message because you are subscribed to the Google 
>>>>> Groups "tesseract-ocr" group.
>>>>> To unsubscribe from this group and stop receiving emails from it, send 
>>>>> an email to tesseract-oc...@googlegroups.com.
>>>>> To view this discussion on the web visit 
>>>>> https://groups.google.com/d/msgid/tesseract-ocr/1b20c2e0-76b2-41a0-bc9f-e1a16b9c67a2n%40googlegroups.com
>>>>>

