Re: [tesseract-ocr] Tesseract Training: Error 'Integer (fast) model' When Using Apex.lstm

2025-05-07 Thread ZeroCool Zero
You should use the eng.traineddata file from the tesseract "best" repository, as required: https://github.com/tesseract-ocr/tessdata_best The error suggests you may be using the wrong eng.traineddata file. On Sunday, March 23, 2025 at 1:59:37 UTC+7, zdenop wrote:
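
A minimal sketch of how the float LSTM model might be extracted from a "best" traineddata file before fine-tuning. It assumes the `combine_tessdata` tool from the Tesseract training tools is installed and on PATH; the helper name and the local paths are hypothetical and not taken from this thread.

```python
from pathlib import Path


def extract_lstm_command(traineddata: Path, output_lstm: Path) -> list[str]:
    """Build the combine_tessdata call that extracts the LSTM component."""
    # `combine_tessdata -e <file.traineddata> <output>` extracts a component;
    # the component is selected by the extension of the output file (.lstm here).
    return ["combine_tessdata", "-e", str(traineddata), str(output_lstm)]


if __name__ == "__main__":
    # Hypothetical paths: a local checkout of tessdata_best and an output file.
    cmd = extract_lstm_command(Path("tessdata_best/eng.traineddata"), Path("eng.lstm"))
    print(" ".join(cmd))
```

The list can be passed to `subprocess.run(cmd, check=True)` to run the extraction. Starting from a file in tessdata_fast instead would yield an integer model, which is what the error in the subject line complains about.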

Re: [tesseract-ocr] Tesseract training with Custom Dataset

2025-04-20 Thread TheComplete BookOfMormon
Yes, you can. This video is very good: https://www.youtube.com/watch?v=SvhoBT-PnME&lc=UgyKAwYjRNAb0P45CYp4AaABAg You should use the most recent ara.traineddata file from the tesseract "best" repository as your basis: https://github.com/tesseract-ocr/tessdata_best I found that training it further ac

[tesseract-ocr] Tesseract training with Custom Dataset

2025-04-18 Thread Ishak DÖLEK
Hello, I am writing to inquire about the possibility of training a Tesseract model using my custom dataset. This dataset consists of Arabic image lines paired with corresponding Latin-based text lines. Specifically, I have the following questions: Is it possible to train Tesseract with a dataset

Re: [tesseract-ocr] Tesseract Training: Error 'Integer (fast) model' When Using Apex.lstm

2025-03-22 Thread Zdenko Podobny
Hello, I notice there may be some gaps in your understanding of Tesseract and its training requirements. Training Tesseract effectively requires careful adherence to its documentation and established processes. Proceeding without this foundation risks wasting both your time and ours. Anyway I put

[tesseract-ocr] Tesseract Training: Error 'Integer (fast) model' When Using Apex.lstm

2025-03-21 Thread Mitya
I’ve been following this tutorial from YouTube: Guide to Tesseract Training https://www.youtube.com/watch?v=KE4xEzFGSU8&t=13s and its corresponding GitHub repository: astutejoe/tesseract_tutorial. https://github.com/astutejoe/tesseract_tutori

[tesseract-ocr] Tesseract training ground truth: I'm confused about the box files

2024-07-10 Thread Mateusz Matela
Hi all, Sorry if double posting, my previous message didn't appear and I don't see any info about waiting for acceptance or something. I was searching for this topic in this forum and it was mentioned a few times, but I couldn't find a clear and definitive explanation. How does the information

Re: [tesseract-ocr] Tesseract training for New font/language

2023-10-02 Thread Fish Money
Please share a sample of the image you're trying to recognize. On Saturday, April 1, 2023 at 10:56:58 UTC-4, ali8a...@gmail.com wrote: > Is it best to train a new language? > > On Saturday, April 1, 2023 at 7:54:30 a.m. UTC-7 shree wrote: > >> Aurebesh seems to be different symbols mapped to the English alpha

Re: [tesseract-ocr] Tesseract training for New font/language

2023-04-01 Thread Ali Abedian
Is it best to train a new language? On Saturday, April 1, 2023 at 7:54:30 a.m. UTC-7 shree wrote: > Aurebesh seems to be different symbols mapped to the English alphabet > rather than a new font for English, hence training would need to be for a > new language rather than just fine-tuning. > >

Re: [tesseract-ocr] Tesseract training for New font/language

2023-04-01 Thread Shree Devi Kumar
Aurebesh seems to be different symbols mapped to the English alphabet rather than a new font for English, hence training would need to be for a new language rather than just fine-tuning. On Sat, Apr 1, 2023, 10:47 Ali Abedian wrote: > Hello, > > Thank you for providing the references, but I'm st

Re: [tesseract-ocr] Tesseract training for New font/language

2023-04-01 Thread Ali Abedian
Hello, Thank you for providing the references, but I'm still a bit confused. I have trained tesseract using the same method as described in https://github.com/tesseract-ocr/tesstrain/blob/main/ocrd-testset.zip, with 100,000 sentences and a maximum iteration of 10,000. However, it still canno

Re: [tesseract-ocr] Tesseract training for New font/language

2023-04-01 Thread Zdenko Podobny
Please have a look at https://github.com/tesseract-ocr/tesstrain (especially https://github.com/tesseract-ocr/tesstrain/blob/main/ocrd-testset.zip) Zdenko On Fri, 31 Mar 2023 at 7:03, Ali Abedian wrote: > Hey everyone! I'm currently working on a personal project where I'm > training a new font

[tesseract-ocr] Tesseract training for New font/language

2023-03-30 Thread Ali Abedian
Hey everyone! I'm currently working on a personal project where I'm training a new font for the English language using Tesseract. The font is called Aurebesh and it's from the Star Wars universe. Basically, each letter in Aurebesh corresponds to a letter in English. I've collected close to 100,

[tesseract-ocr] Tesseract Training Error

2022-03-30 Thread Hasan Kuray
Before starting the training process, I encountered an error like this: APPLY_BOXES: boxfile line 1/4 ((10,6),(60,90)): FAILURE! Couldn't find a matching blob I did some research but couldn't figure it out. I am attaching my training picture below. What should I do? -- You received this message

[tesseract-ocr] Tesseract training

2022-02-11 Thread b h
Hi, I'm trying to train a new model that will recognize IDs/dates (Indic digits + space + /). I've generated 10K single-line images at 300 DPI (random combinations of the above) with the font I need, in the 2 font sizes found on the documents I need to process. I've split the images into 2 set

Re: [tesseract-ocr] tesseract training in ubuntu (please help me!)

2021-09-03 Thread Saman Kurdi
Hello, Follow attached link. This might help https://github.com/tesseract-ocr/tesstrain On Fri, Sep 3, 2021 at 11:05 Nebiye Bulan wrote: > hello, can you please tell me briefly about the steps to train tesseract > in ubuntu? > https://tesseract-ocr.github.io/tessdoc/tess4/TrainingTesseract-4.

[tesseract-ocr] tesseract training in ubuntu (please help me!)

2021-09-03 Thread Nebiye Bulan
Hello, can you please tell me briefly about the steps to train Tesseract in Ubuntu? https://tesseract-ocr.github.io/tessdoc/tess4/TrainingTesseract-4.00.html#additional-libraries-required The steps in this link are mixed up.

[tesseract-ocr] tesseract training

2021-09-01 Thread Nebiye Bulan
[image: WhatsApp Image 2021-09-01 at 14.39.54.jpeg] Hi, I made a license plate recognition system on Linux. I used YOLOv5 to detect the plate and Tesseract to read the characters, but it doesn't read some letters. I want to train Tesseract for this. https://michaeljaylissner.com/posts/2012/02/11/ad

[tesseract-ocr] Tesseract training dataset

2021-07-06 Thread Kumar Rajwani
May I know which dataset Tesseract was trained on? If you know of any other OCR dataset of black-and-white images, please provide a link. Thanks

[tesseract-ocr] Tesseract training

2021-02-20 Thread karim abed el hadi
Hello everyone, My name is Karim Abed El Hadi, and I am a telecommunications engineering student at the Holy Spirit University of Kaslik, Lebanon. I am working on a Syriac OCR project. For the first part of my project, I built a GUI that lets the user select an image or a PDF file from the device,

[tesseract-ocr] Tesseract Training Expert Needed

2020-01-16 Thread Dave Wood
I am looking for a genuine expert in Tesseract training to hire on a contract basis. I am using Tesseract 5.0 (Windows) LSTM engine with the latest training data available from Github for Tesseract. What I think is needed is a 'tune-up' for the existing training data for the single specific fo

[tesseract-ocr] Tesseract training - font size

2019-08-29 Thread Honza
Hi all, Is the font size important when training a new model from scratch? I found that it is not important for Tesseract v3.0, but I was not able to find anything about Tesseract v4 and later. Right now I use default parameters: --xsize 3600 for tesstrain.sh script --ptsize 12 for tex

[tesseract-ocr] Tesseract Training

2019-03-10 Thread Shobhit Kapil
Hi Team, I am a .NET developer and I am currently using Tesseract, but I have no idea at all about Tesseract training. Could you please explain the basics of training and what its advantages are? -- *Thanks & Regards Shobhit kapil M: 9703597601*

[tesseract-ocr] Tesseract training has an upper limit on the use of cpu?Is the more cpu, the faster the training?

2018-11-13 Thread bruce
Does more CPU make training faster? Does Tesseract training have an upper limit on CPU use? Two other questions: What is the best value for the parameter *--ptsize* when training Chinese? 36 or 40 or other? What is the best value for the parameter *--leading* when training Chinese? 40 or 50 or ot

[tesseract-ocr] Tesseract Training using basic characters only

2018-06-25 Thread James Q
The text I want Tesseract to read will only contain the most basic characters. Is there a way of fine-tuning it so that it only includes basic upper/lower case letters, digits and punctuation marks? That way I could avoid 'c' getting misinterpreted as '¢' etc. Would simply passing in a n

[tesseract-ocr] Tesseract training

2017-02-01 Thread Bharathkumar Muthupandian
Can I use Tesseract to train on traffic symbols with Tesseract's built-in adaptive classifier?

[tesseract-ocr] Tesseract training details

2015-11-22 Thread Chen
I am trying to generate .traineddata myself. I have some questions related to the training procedure. We can find langdata and tessdata on GitHub. Is there an official document explaining how to convert langdata to the final .traineddata? I don't mean the basic procedure here in wiki/TrainingTes

[tesseract-ocr] Tesseract training for korean country

2015-04-21 Thread chan
Hi everyone, I'm trying to train for Korean characters by following the procedure from this link: https://code.google.com/p/tesseract-ocr/wiki/TrainingTesseract3 but I'm stuck. Could anyone guide me in training for Korean characters?

[tesseract-ocr] Tesseract training for Arabic

2014-11-06 Thread iram akbar
I am using jTessBoxEditor for TIFF generation and Serak for training Arabic data, but I am having issues with both tools, e.g. a shape clustering error during training in Serak, as attached. Question: Can anyone suggest alternative tools for TIFF generation and Training Arabi

[tesseract-ocr] Tesseract Training (Empty Page issue)

2014-05-13 Thread Awsomo
Hi there, I have installed Tesseract 3.02 and I am now busy training it. I created 32 PNG/box file pairs. There is a real problem with the .tr file creation: I started training on these 32 pairs with the following result. 12 .tr files could be generated from this, which leads me to the qu

Re: [tesseract-ocr] Tesseract Training Error (How can i train Handwriting)

2014-05-02 Thread Awsomo :(
There is another problem: -Visual C++ 2005 installed. -Visual C++ 2008 installed. -Visual C++ 2010 installed. I already tried installing all of them; is something missing, or should debugging be activated https://code.google.com/p/tesseract-ocr/wiki/AddOns , within the installation where errors reported

Re: [tesseract-ocr] Tesseract Training Error (How can i train Handwriting)

2014-04-29 Thread Nick White
On Fri, Apr 25, 2014 at 03:40:13AM -0700, Awsomo :( wrote: > and now its getting worse, > i tried again now i am getting message but that aint better.. The "FAILURE! Couldn't find a matching blob" message means that the places in the box file that a character is described don't appear to match u
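
A rough sketch of a sanity check that can be run over a box file before training, since the message above is about box positions that do not line up with the image. It relies on the box file line layout `<symbol> <left> <bottom> <right> <top> <page>` (pixel coordinates with the origin at the bottom-left of the image). The helper names, the sample symbol "A", and the image sizes are hypothetical; only the coordinates are borrowed from an error report elsewhere in this listing. It checks geometry only and does not reproduce Tesseract's own blob matching.

```python
from typing import NamedTuple


class Box(NamedTuple):
    symbol: str
    left: int
    bottom: int
    right: int
    top: int
    page: int


def parse_box_line(line: str) -> Box:
    """Parse one box file line: `<symbol> <left> <bottom> <right> <top> <page>`."""
    parts = line.split()
    if len(parts) != 6:
        raise ValueError(f"expected 6 fields, got {len(parts)}: {line!r}")
    left, bottom, right, top, page = (int(p) for p in parts[1:])
    return Box(parts[0], left, bottom, right, top, page)


def box_problems(box: Box, width: int, height: int) -> list[str]:
    """Return a list of geometric problems for a box on a width x height image."""
    problems = []
    if box.left >= box.right or box.bottom >= box.top:
        problems.append("box has zero or negative size")
    if box.left < 0 or box.bottom < 0 or box.right > width or box.top > height:
        problems.append("box extends outside the image")
    return problems


if __name__ == "__main__":
    box = parse_box_line("A 10 6 60 90 0")
    print(box_problems(box, width=50, height=100))
```

A box that passes this check can still fail in Tesseract if it does not actually surround a character in the image, so viewing the boxes over the image in a box editor remains the more reliable test.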

Re: [tesseract-ocr] Tesseract Training Error (How can i train Handwriting)

2014-04-29 Thread Nick White
On Fri, Apr 25, 2014 at 02:18:32AM -0700, Awsomo :( wrote: > the issue now is that in the _Tessdata Folder_ is no _tr_File_generated/shown > and it doesn´t show and report to the training like this The .tr files are saved in your working directory. So in this example they will be in the folder:

Re: [tesseract-ocr] Tesseract Training Error (How can i train Handwriting)

2014-04-25 Thread Awsomo :(
And now it's getting worse. I tried again and now I am getting a message, but that isn't any better.

Re: [tesseract-ocr] Tesseract Training Error (How can i train Handwriting)

2014-04-25 Thread Awsomo :(
Thank you, I will keep that in mind. After I renamed the tif and box files to deu.handwriting.exp0.tif and deu.handwriting.exp0.box, I have a result now. The image/tif file opens up after executing the command: deu.handwriting.exp0.tif deu.handwriting.exp0 box.train

Re: [tesseract-ocr] Tesseract Training Error (How can i train Handwriting)

2014-04-23 Thread Nick White
Hi Tobias, You're misreading the wiki slightly. The parts in square brackets in commands mean "replace this with your actual names as appropriate". So on the wiki: tesseract [lang].[fontname].exp[num].tif [lang].[fontname].exp[num] box.train Means something like this (for example): tesseract
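
A small sketch of the bracket substitution described above: each `[placeholder]` in the wiki command is replaced with a real value. The function name is hypothetical, and the sample values (deu, handwriting, 0) are taken from a later message in this thread rather than from the truncated example.

```python
def box_train_command(lang: str, fontname: str, num: int) -> list[str]:
    """Fill in `tesseract [lang].[fontname].exp[num].tif [lang].[fontname].exp[num] box.train`."""
    # The image name and the output base share the same stem.
    base = f"{lang}.{fontname}.exp{num}"
    return ["tesseract", f"{base}.tif", base, "box.train"]


if __name__ == "__main__":
    print(" ".join(box_train_command("deu", "handwriting", 0)))
    # prints: tesseract deu.handwriting.exp0.tif deu.handwriting.exp0 box.train
```

The square brackets themselves never appear in the final command; only the values inside them change from one training set to the next.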

[tesseract-ocr] Tesseract Training Error (How can i train Handwriting)

2014-04-22 Thread Tobias Schwarz
Hi, I have just taken my first steps with Tesseract, so I am really a newbie. The idea is to teach Tesseract handwriting. I use tesseract v3.02 cowboxer v1.02 ___ after I failed to install most of the box tools for Tesseract because of missing