Aurebesh seems to be a set of different symbols mapped onto the English
alphabet rather than a new font for English, so the training would need to
treat it as a new language rather than just fine-tuning an existing model.
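
If the model is trained as a new language, then as far as I understand the
character set is derived from the transcriptions, so it may be worth checking
that every line image has a transcription and that the characters in them are
the ones you expect. Below is a minimal sketch of such a check, not part of
tesstrain itself. It assumes the tesstrain convention of pairing each image
`name.tif` or `name.png` with `name.gt.txt`; the function name
`check_ground_truth` and the suffix list are hypothetical, chosen only for
illustration.

```python
from pathlib import Path

# Assumption: line images use one of these suffixes; adjust as needed.
IMAGE_SUFFIXES = (".tif", ".png")


def check_ground_truth(directory):
    """Return (images_without_transcription, charset) for a ground-truth folder.

    Assumes each image `name.tif` is paired with a text file `name.gt.txt`.
    """
    root = Path(directory)
    missing = []
    charset = set()
    for image in sorted(root.iterdir()):
        if image.suffix.lower() not in IMAGE_SUFFIXES:
            continue  # skips the .gt.txt files and anything else
        transcript = image.with_name(image.stem + ".gt.txt")
        if not transcript.is_file():
            missing.append(image.name)
            continue
        # Collect every distinct character seen in the transcriptions.
        charset.update(transcript.read_text(encoding="utf-8").strip())
    return missing, charset
```

Calling `check_ground_truth("path/to/ground-truth")` returns the images that
lack a transcription and the set of characters found. An unexpected character
in that set, or a long list of unpaired images, would be something to fix
before spending more iterations.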

On Sat, Apr 1, 2023, 10:47 Ali Abedian <ali8abed...@gmail.com> wrote:

> Hello,
>
> Thank you for providing the references, but I'm still a bit confused. I
> have trained tesseract using the same method as described in
> https://github.com/tesseract-ocr/tesstrain/blob/main/ocrd-testset.zip,
> with 100,000 sentences and a maximum iteration of 10,000. However, it still
> cannot recognize a 6-letter word that I input from a TIF file using the
> same font and settings. I have tried using fewer iterations, such as 1,000,
> as well as more iterations, such as 20,000 and 100,000, but still no
> results. Additionally, the BCER (Character Error Rate) doesn't seem to
> change significantly with more iterations, remaining at 3.56%. I'm
> unsure of what I'm doing wrong or what I should do next, but any help would
> be appreciated.
>
> Thank you.
> On Saturday, April 1, 2023 at 12:05:36 a.m. UTC-7 zdenop wrote:
>
>> Please have a look at https://github.com/tesseract-ocr/tesstrain
>> (especially
>> https://github.com/tesseract-ocr/tesstrain/blob/main/ocrd-testset.zip)
>>
>>
>> Zdenko
>>
>>
>> On Fri, Mar 31, 2023 at 7:03, Ali Abedian <ali8a...@gmail.com> wrote:
>>
>>> Hey everyone! I'm currently working on a personal project where I'm
>>> training a new font for the English language using Tesseract. The font is
>>> called Aurebesh and it's from the Star Wars universe. Basically, each
>>> letter in Aurebesh corresponds to a letter in English. I've collected close
>>> to 100,000 images and their corresponding translations, but I'm not sure
>>> how many iterations I should run for a file of this size. I've tried
>>> training with only 100 images, but it didn't work out. Can anyone advise me
>>> on how many iterations I should run and whether it's even possible to train
>>> a new font like this?
>>>
>>> --
>>> You received this message because you are subscribed to the Google
>>> Groups "tesseract-ocr" group.
>>> To unsubscribe from this group and stop receiving emails from it, send
>>> an email to tesseract-oc...@googlegroups.com.
>>> To view this discussion on the web visit
>>> https://groups.google.com/d/msgid/tesseract-ocr/1b20c2e0-76b2-41a0-bc9f-e1a16b9c67a2n%40googlegroups.com
>>> <https://groups.google.com/d/msgid/tesseract-ocr/1b20c2e0-76b2-41a0-bc9f-e1a16b9c67a2n%40googlegroups.com?utm_medium=email&utm_source=footer>
>>> .
>>>

