As you suggested, I installed Tesseract v5.0.0 on my Windows machine (downloaded from <https://digi.bib.uni-mannheim.de/tesseract/>). This included the training tools too. I performed all the previous steps (box file, .lstmf file, unicharset).
However, after running the tesstrain.sh command in Cygwin, I am still getting the following error:

$ ./src/training/tesstrain.sh --fonts_dir C:/Windows/Fonts --lang eng --linedata_only --noextract_font_properties --langdata_dir "C:/Program Files/Tesseract-OCR/langdata" --tessdata_dir "C:/Program Files/Tesseract-OCR/tessdata" --output_dir D:/Bugs/1206806/folder/trainneddata
Creating new directory D:/Bugs/1206806/folder/trainneddata

=== Starting training for language 'eng'
which: no text2image in (/usr/local/bin:/usr/bin:/cygdrive/c/Program Files/Microsoft MPI/Bin:/cygdrive/c/buildtools:/cygdrive/c/Program Files (x86)/NVIDIA Corporation/PhysX/Common:/cygdrive/c/Program Files (x86)/Intel/Intel(R) Management Engine Components/iCLS:/cygdrive/c/Program Files/Intel/Intel(R) Management Engine Components/iCLS:/cygdrive/c/Python25:/cygdrive/c/ProgramData/Oracle/Java/javapath:/cygdrive/c/Perl/site/bin:/cygdrive/c/Perl/bin:/cygdrive/c/Oracle12C_64bCli/client_1/bin:/cygdrive/c/Oracle12C_32bCli/client_1/bin:/cygdrive/c/windows/system32:/cygdrive/c/windows:/cygdrive/c/windows/System32/Wbem:/cygdrive/c/windows/System32/WindowsPowerShell/v1.0:/cygdrive/c/windows/System32/OpenSSH:/cygdrive/c/Program Files (x86)/Microsoft SQL Server/100/Tools/Binn:/cygdrive/c/Program Files/Microsoft SQL Server/100/Tools/Binn:/cygdrive/c/Program Files/Microsoft SQL Server/100/DTS/Binn:/cygdrive/c/Program Files/Microsoft/Web Platform Installer:/cygdrive/c/Program Files (x86)/Microsoft ASP.NET/ASP.NET Web Pages/v1.0:/cygdrive/c/Program Files/Microsoft SQL Server/110/Tools/Binn:/cygdrive/c/windows/system32/config/systemprofile/.dnx/bin:/cygdrive/c/Program Files/Microsoft DNX/Dnvm:/cygdrive/c/Program Files (x86)/Windows Kits/8.1/Windows Performance Toolkit:/cygdrive/c/Program Files/Microsoft SQL Server/130/Tools/Binn:/cygdrive/c/Program Files (x86)/Windows Kits/10/Windows Performance Toolkit:/cygdrive/c/Program Files (x86)/Oracle/Berkeley DB 12cR1 6.0.20/bin:/cygdrive/c/Program Files/dotnet:/cygdrive/c/Program Files/Microsoft SQL Server/Client SDK/ODBC/170/Tools/Binn:/cygdrive/c/Program Files (x86)/IncrediBuild:/cygdrive/c/WINDOWS/system32:/cygdrive/c/WINDOWS:/cygdrive/c/WINDOWS/System32/Wbem:/cygdrive/c/WINDOWS/System32/WindowsPowerShell/v1.0:/cygdrive/c/WINDOWS/System32/OpenSSH:/cygdrive/c/Program Files (x86)/Microsoft SQL Server/150/Tools/Binn:/cygdrive/c/Program Files/Microsoft SQL Server/150/Tools/Binn:/cygdrive/c/Program Files (x86)/Microsoft SQL Server/150/DTS/Binn:/cygdrive/c/Program Files/Microsoft SQL Server/150/DTS/Binn:/cygdrive/c/Program Files (x86)/Microsoft SQL Server/Client SDK/ODBC/130/Tools/Binn:/cygdrive/c/Program Files (x86)/Microsoft SQL Server/140/Tools/Binn:/cygdrive/c/Program Files (x86)/Microsoft SQL Server/140/DTS/Binn:/cygdrive/c/Program Files (x86)/Microsoft SQL Server/140/Tools/Binn/ManagementStudio:/cygdrive/d/Git/cmd:/cygdrive/c/Users/sde26/AppData/Local/Microsoft/WindowsApps:/cygdrive/c/Users/sde26/.dotnet/tools)
which: no text2image in (./api)
which: no text2image in (./training)
ERROR: 'text2image' not found

Am I missing something? Can you please guide me?

Regards,
Samruddhi

On Tuesday, August 24, 2021 at 5:59:49 PM UTC+5:30 Samruddhi Dhake wrote:
>
> Can you please provide a link to the steps for installing Tesseract and the
> training tools on Windows?
>
> Samruddhi
> On Tuesday, August 24, 2021 at 3:42:48 PM UTC+5:30 Samruddhi Dhake wrote:
>
>> How do I install Tesseract and the training tools on Windows?
>> Do I have to install the Tesseract Windows exe?
>>
>> Samruddhi
>>
>> On Tuesday, August 24, 2021 at 3:20:37 PM UTC+5:30 zdenop wrote:
>>
>>> So there are only 2 possibilities:
>>>
>>> 1. Install tesseract and training tools
>>> 2. Learn how to handle & use not installed sw. This option is not
>>> related to tesseract.
>>>
>>>
>>> Zdenko
>>>
>>>
>>> Tue, Aug 24, 2021 at 9:17 Samruddhi Dhake <sam22...@gmail.com> wrote:
>>>
>>>> I haven't installed Tesseract. I have kept it in a folder and I am running
>>>> the exe by giving its path.
>>>> I have built the training tools from the source code.
>>>>
>>>> To create the box file (I gave the absolute path of tesseract.exe):
>>>> ..\tesseract.exe Dim4.tif Dim4 lstmbox
>>>>
>>>> To create the .lstmf file:
>>>> tesseract.exe Dim4.tif Dim4 lstm.train
>>>>
>>>> To create the unicharset:
>>>> unicharset_extractor.exe --output_unicharset ..\own.unicharset ..\langdata\eng\eng.training_text
>>>>
>>>> And to create the traineddata, using the tesstrain.sh command:
>>>> .\src\training\tesstrain.sh --fonts_dir C:\Windows\Fonts --lang eng --linedata_only --noextract_font_properties --langdata_dir langdata --tessdata_dir tessdata --output_dir trainneddata
>>>>
>>>>
>>>> Regards,
>>>> Samruddhi
>>>> On Tuesday, August 24, 2021 at 12:24:29 PM UTC+5:30 Samruddhi Dhake
>>>> wrote:
>>>>
>>>>> I have built the training tools from the source code.
>>>>>
>>>>> On Monday, August 23, 2021 at 7:09:02 PM UTC+5:30 zdenop wrote:
>>>>>
>>>>>> How did you install tesseract? Did you also install training tools?
>>>>>>
>>>>>> Zdenko
>>>>>>
>>>>>>
>>>>>> Mon, Aug 23, 2021 at 15:34 Samruddhi Dhake <sam22...@gmail.com>
>>>>>> wrote:
>>>>>>
>>>>>>> Hello,
>>>>>>>
>>>>>>> I am creating my own traineddata using Tesseract v4.1.1 on Windows
>>>>>>> 10.
>>>>>>> I am following the documentation at
>>>>>>> https://tesseract-ocr.github.io/tessdoc/tess4/TrainingTesseract-4.00.html
>>>>>>>
>>>>>>> I have successfully created the .box file and the .lstmf file using
>>>>>>> lstmbox and lstm.train respectively.
>>>>>>> As the next step, I installed Cygwin to run the tesstrain.sh command to
>>>>>>> create the training data.
>>>>>>> But I am getting the error below.
>>>>>>>
>>>>>>> $ ./src/training/tesstrain.sh --fonts_dir C:/Windows/Fonts --lang eng --linedata_only --noextract_font_properties --langdata_dir ./langdata --tessdata_dir ./tessdata --output_dir ./trainneddata
>>>>>>>
>>>>>>> === Starting training for language 'eng'
>>>>>>> which: no text2image in (/usr/local/bin:/usr/bin:/cygdrive/c/Program Files/Microsoft MPI/Bin:/cygdrive/c/buildtools:/cygdrive/c/Program Files (x86)/NVIDIA Corporation/PhysX/Common:/cygdrive/c/Program Files (x86)/Intel/Intel(R) Management Engine Components/iCLS:/cygdrive/c/Program Files/Intel/Intel(R) Management Engine Components/iCLS:/cygdrive/c/Python25:/cygdrive/c/ProgramData/Oracle/Java/javapath:/cygdrive/c/Perl/site/bin:/cygdrive/c/Perl/bin:/cygdrive/c/Oracle12C_64bCli/client_1/bin:/cygdrive/c/Oracle12C_32bCli/client_1/bin:/cygdrive/c/windows/system32:/cygdrive/c/windows:/cygdrive/c/windows/System32/Wbem:/cygdrive/c/windows/System32/WindowsPowerShell/v1.0:/cygdrive/c/windows/System32/OpenSSH:/cygdrive/c/Program Files (x86)/Microsoft SQL Server/100/Tools/Binn:/cygdrive/c/Program Files/Microsoft SQL Server/100/Tools/Binn:/cygdrive/c/Program Files/Microsoft SQL Server/100/DTS/Binn:/cygdrive/c/Program Files/Microsoft/Web Platform Installer:/cygdrive/c/Program Files (x86)/Microsoft ASP.NET/ASP.NET Web Pages/v1.0:/cygdrive/c/Program Files/Microsoft SQL Server/110/Tools/Binn:/cygdrive/c/windows/system32/config/systemprofile/.dnx/bin:/cygdrive/c/Program Files/Microsoft DNX/Dnvm:/cygdrive/c/Program Files (x86)/Windows Kits/8.1/Windows Performance Toolkit:/cygdrive/c/Program Files/Microsoft SQL Server/130/Tools/Binn:/cygdrive/c/Program Files (x86)/Windows Kits/10/Windows Performance Toolkit:/cygdrive/c/Program Files (x86)/Oracle/Berkeley DB 12cR1 6.0.20/bin:/cygdrive/c/Program Files/dotnet:/cygdrive/c/Program Files/Microsoft SQL Server/Client SDK/ODBC/170/Tools/Binn:/cygdrive/c/Program Files (x86)/IncrediBuild:/cygdrive/c/WINDOWS/system32:/cygdrive/c/WINDOWS:/cygdrive/c/WINDOWS/System32/Wbem:/cygdrive/c/WINDOWS/System32/WindowsPowerShell/v1.0:/cygdrive/c/WINDOWS/System32/OpenSSH:/cygdrive/c/Program Files (x86)/Microsoft SQL Server/150/Tools/Binn:/cygdrive/c/Program Files/Microsoft SQL Server/150/Tools/Binn:/cygdrive/c/Program Files (x86)/Microsoft SQL Server/150/DTS/Binn:/cygdrive/c/Program Files/Microsoft SQL Server/150/DTS/Binn:/cygdrive/c/Program Files (x86)/Microsoft SQL Server/Client SDK/ODBC/130/Tools/Binn:/cygdrive/c/Program Files (x86)/Microsoft SQL Server/140/Tools/Binn:/cygdrive/c/Program Files (x86)/Microsoft SQL Server/140/DTS/Binn:/cygdrive/c/Program Files (x86)/Microsoft SQL Server/140/Tools/Binn/ManagementStudio:/cygdrive/d/Git/cmd:/cygdrive/c/Users/sde26/AppData/Local/Microsoft/WindowsApps:/cygdrive/c/Users/sde26/.dotnet/tools)
>>>>>>> which: no text2image in (./api)
>>>>>>> which: no text2image in (./training)
>>>>>>> ERROR: 'text2image' not found
>>>>>>>
>>>>>>>
>>>>>>> I found that text2image is produced by running the command 'make training'.
>>>>>>> Can you please tell me how this can be done on Windows 10?
>>>>>>>
>>>>>>> Regards,
>>>>>>> Samruddhi
>>>>>>>
>>>>>>> --
>>>>>>> You received this message because you are subscribed to the Google
>>>>>>> Groups "tesseract-ocr" group.
>>>>>>> To unsubscribe from this group and stop receiving emails from it,
>>>>>>> send an email to tesseract-oc...@googlegroups.com.
>>>>>>> To view this discussion on the web visit
>>>>>>> https://groups.google.com/d/msgid/tesseract-ocr/5adf563d-117b-4bd8-a283-dd21e53575f4n%40googlegroups.com
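
For reference, regarding the 'text2image' not found error at the top of this message: the directory list printed by `which` contains no Tesseract-OCR entry, so one thing that could be checked from the Cygwin shell is whether the training tools become visible once the install directory is put on PATH. The following is only a rough sketch under assumptions. The install directory is inferred from the --tessdata_dir argument used above, the helper name `missing_tools` is made up for illustration, and the tool list is just an illustrative subset; whether text2image.exe actually sits in that directory has not been verified.

```shell
#!/usr/bin/env bash
# Sketch: report which training tools are not reachable through PATH.

# Hypothetical helper: print each named tool that PATH lookup cannot find.
missing_tools() {
    local tool
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
    done
}

# ASSUMPTION: installer location inferred from the --tessdata_dir argument;
# adjust it if Tesseract was installed somewhere else.
TESS_DIR='C:/Program Files/Tesseract-OCR'

# cygpath exists only under Cygwin/MSYS; elsewhere the path is left as-is.
if command -v cygpath >/dev/null 2>&1; then
    TESS_DIR=$(cygpath -u "$TESS_DIR")
fi

# Prepend the directory for the current shell session only.
if [ -d "$TESS_DIR" ]; then
    PATH="$TESS_DIR:$PATH"
fi

# Illustrative subset of the training tools; extend as needed.
missing=$(missing_tools text2image unicharset_extractor combine_tessdata)
if [ -n "$missing" ]; then
    printf 'Still not on PATH:\n%s\n' "$missing"
else
    echo 'All listed training tools were found.'
fi
```

If the check still lists text2image, the executable is probably not in that directory at all, and the PATH change alone will not help. The PATH change lasts only for the current shell session, so tesstrain.sh would have to be started from the same session.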