I checked this link. It has no reference to tesstrain.
tesstrain.sh internally calls text2image.exe.
So if I run text2image.exe directly, how do I get the traineddata? What are the 
further steps to get the traineddata file?
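
A rough sketch of the steps that usually follow text2image, going by the TrainingTesseract-4.00 documentation quoted further down in this thread. It only builds the command lines in Python and does not run anything. The file names (`eng.Arial.exp0`, `out`, `eng.training_files.txt`), the `tessdata/` path and the iteration count are placeholders I made up. As far as I know, fine-tuning also needs a float model such as the one from tessdata_best as the starting traineddata.

```python
from typing import List


def training_commands(base: str, lang: str = "eng", out_dir: str = "out") -> List[List[str]]:
    """Build the commands that typically come after text2image.

    base is the (hypothetical) --outputbase given to text2image, so
    base.tif and base.box are assumed to exist already.
    """
    start_model = f"tessdata/{lang}.traineddata"        # placeholder path
    list_file = f"{out_dir}/{lang}.training_files.txt"  # text file listing the .lstmf files
    return [
        # 1. Image + box file -> base.lstmf
        ["tesseract", f"{base}.tif", base, "lstm.train"],
        # 2. Extract the LSTM network from the starting traineddata
        ["combine_tessdata", "-e", start_model, f"{out_dir}/{lang}.lstm"],
        # 3. Fine-tune; checkpoints are written as <model_output>_checkpoint
        ["lstmtraining",
         "--model_output", f"{out_dir}/{lang}",
         "--continue_from", f"{out_dir}/{lang}.lstm",
         "--traineddata", start_model,
         "--train_listfile", list_file,
         "--max_iterations", "400"],
        # 4. Convert the checkpoint into the final traineddata file
        ["lstmtraining", "--stop_training",
         "--continue_from", f"{out_dir}/{lang}_checkpoint",
         "--traineddata", start_model,
         "--model_output", f"{out_dir}/{lang}.traineddata"],
    ]


if __name__ == "__main__":
    for cmd in training_commands("eng.Arial.exp0"):
        print(" ".join(cmd))
```

tesstrain.sh normally wraps step 1 and writes the list file itself, so the sketch only matters when text2image is run by hand.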


On Wednesday, September 1, 2021 at 6:25:13 PM UTC+5:30 P007 wrote:

>
> Check this 
> https://github.com/tesseract-ocr/tesseract/issues/1685
>
> On Wed, 1 Sep 2021 at 6:22 PM, Samruddhi Dhake <sam22...@gmail.com> wrote:
>
>> For images.
>> I have to create my own traineddata for my images. For that I am 
>> following the steps in this documentation: 
>> https://tesseract-ocr.github.io/tessdoc/tess4/TrainingTesseract-4.00.html
>> As per the steps, I have created the box file, the lstmf file and the 
>> unicharset file. The next step is to create the traineddata using 
>> tesstrain.sh, followed by lstmtraining.exe.
>> I am getting these errors at the tesstrain.sh step.
>>
>> On Wednesday, September 1, 2021 at 6:11:27 PM UTC+5:30 P007 wrote:
>>
>>> I mean, are you working with fonts only, 
>>> or with images?
>>>
>>> On Wed, 1 Sep 2021 at 6:09 PM, Samruddhi Dhake <sam22...@gmail.com> 
>>> wrote:
>>>
>>>> Yes, I am working with the eng language.
>>>> I am using tessdata (C:\Program Files\Tesseract-OCR\tessdata).
>>>>
>>>> On Wednesday, September 1, 2021 at 5:57:24 PM UTC+5:30 P007 wrote:
>>>>
>>>>> Okay.
>>>>>
>>>>> Wait, you are working with the English language, right?
>>>>> What kind of dataset did you use here?
>>>>>
>>>>> On Wed, 1 Sep 2021 at 5:53 PM, Samruddhi Dhake <sam22...@gmail.com> 
>>>>> wrote:
>>>>>
>>>>>> No, tesstrain.sh didn't work. I am running tesstrain.sh on Cygwin.
>>>>>> Command:
>>>>>> *$ ./src/training/tesstrain.sh --fonts_dir %WINDIR%/Fonts/ --lang eng 
>>>>>> --linedata_only --noextract_font_properties --langdata_dir 'C:/Program 
>>>>>> Files/Tesseract-OCR/langdata' --tessdata_dir 'C:/Program 
>>>>>> Files/Tesseract-OCR/tessdata' --output_dir D:/Test/trainneddata 
>>>>>> --fontlist 
>>>>>> 'Arial'*
>>>>>>
>>>>>> After I hit Enter, tesstrain.sh runs text2image and gives the 
>>>>>> following error:
>>>>>> === Starting training for language 'eng'
>>>>>> [Tue Aug 31 19:19:05 IST 2021] /cygdrive/c/Program 
>>>>>> Files/Tesseract-OCR/text2image --fonts_dir=%WINDIR%/Fonts/ --ptsize 12 
>>>>>> --font=Arial --outputbase=/tmp/font_tmp.0doGBqWc3I/sample_text.txt 
>>>>>> --text=/tmp/font_tmp.0doGBqWc3I/sample_text.txt 
>>>>>> --fontconfig_tmpdir=/tmp/font_tmp.0doGBqWc3I
>>>>>> Unable to open '/tmp/font_tmp.0doGBqWc3I/fonts.conf' for writing
>>>>>> Fontconfig error: Cannot load default config file
>>>>>> Could not find font named 'Arial'.
>>>>>> Please correct --font arg.
>>>>>> ERROR: Program Program failed. Abort.
>>>>>>
>>>>>> As per the previous suggestions, I ran the text2image.exe command in 
>>>>>> cmd; it works and lists all available fonts.
>>>>>>
>>>>>> So why does the text2image command fail when it is run from 
>>>>>> tesstrain.sh? It does not create the temp folder under /tmp/, and I 
>>>>>> get the fonts.conf error.
>>>>>> I expect the fonts.conf file to be written to the temp folder 
>>>>>> (font_tmp.0doGBqWc3I in my case) and to include the font 'Arial', so 
>>>>>> that the Arial font can be found.
>>>>>> I don't know why it is not being created.
>>>>>>
>>>>>> Regards,
>>>>>> Samruddhi
>>>>>>
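
One possible explanation, which nobody in the thread confirms: text2image.exe from the Windows installer is a native Windows program, so it may not resolve Cygwin's `/tmp` to the Cygwin install folder. The warning quoted further down (`"/tmp\fonts.conf"`) hints that it treats `/tmp` as a plain Windows path. Note also that `%WINDIR%` is cmd.exe syntax; bash passes it through literally, as the log above shows. Below is a minimal sketch of the path translation that `cygpath -m` performs, assuming a Cygwin root of `C:/cygwin64` (my assumption, adjust to your install):

```python
# Assumption: default 64-bit Cygwin install location; not taken from the thread.
CYGWIN_ROOT = "C:/cygwin64"


def cygwin_to_windows(posix_path: str, cygwin_root: str = CYGWIN_ROOT) -> str:
    """Translate an absolute Cygwin path to a Windows path with forward slashes."""
    parts = [p for p in posix_path.split("/") if p]
    # /cygdrive/<letter>/... maps straight onto a Windows drive.
    if len(parts) >= 2 and parts[0] == "cygdrive" and len(parts[1]) == 1:
        return f"{parts[1].upper()}:/" + "/".join(parts[2:])
    # Everything else lives under the Cygwin installation directory.
    return cygwin_root.rstrip("/") + "/" + "/".join(parts)


if __name__ == "__main__":
    print(cygwin_to_windows("/tmp/font_tmp.0doGBqWc3I"))
    print(cygwin_to_windows("/cygdrive/c/Program Files/Tesseract-OCR"))
```

If this is the cause, the temp folder would exist under the Cygwin root while the exe looks for it elsewhere. That would match the "Unable to open ... for writing" message, but it is only a guess.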
>>>>>> On Wednesday, September 1, 2021 at 5:31:10 PM UTC+5:30 P007 wrote:
>>>>>>
>>>>>>>
>>>>>>> Did tesstrain.sh work for you?
>>>>>>>
>>>>>>> On Wed, 1 Sep 2021 at 5:09 PM, Samruddhi Dhake <sam22...@gmail.com> 
>>>>>>> wrote:
>>>>>>>
>>>>>>>> text2image has an argument, --fontconfig_tmpdir, which names the 
>>>>>>>> temp folder where fonts.conf gets written.
>>>>>>>>
>>>>>>>> I checked /tmp/; the temp folder (font_tmp.0doGBqWc3I) was not 
>>>>>>>> created there.
>>>>>>>>
>>>>>>>> Has anybody else had this issue?
>>>>>>>>
>>>>>>>> Regards,
>>>>>>>> Samruddhi
>>>>>>>>
>>>>>>>> On Tuesday, August 31, 2021 at 7:24:46 PM UTC+5:30 Samruddhi Dhake 
>>>>>>>> wrote:
>>>>>>>>
>>>>>>>>> >"C:\Program Files\Tesseract-OCR\text2image.exe" 
>>>>>>>>> --fonts_dir=%WINDIR%/Fonts --fontconfig_tmpdir=/tmp 
>>>>>>>>> --list_available_fonts
>>>>>>>>> This worked. I got the list of available fonts, which contains 
>>>>>>>>> Arial and Arial Bold too.
>>>>>>>>>
>>>>>>>>> This time, in Cygwin Bash, I tried giving --fontlist 'Arial' 
>>>>>>>>> to tesstrain.sh.
>>>>>>>>> Command:
>>>>>>>>> *$ ./src/training/tesstrain.sh --fonts_dir %WINDIR%/Fonts/ --lang 
>>>>>>>>> eng --linedata_only --noextract_font_properties --langdata_dir 
>>>>>>>>> 'C:/Program 
>>>>>>>>> Files/Tesseract-OCR/langdata' --tessdata_dir 'C:/Program 
>>>>>>>>> Files/Tesseract-OCR/tessdata' --output_dir D:/Test/trainneddata 
>>>>>>>>> --fontlist 
>>>>>>>>> 'Arial'*
>>>>>>>>>
>>>>>>>>> === Starting training for language 'eng'
>>>>>>>>> [Tue Aug 31 19:19:05 IST 2021] /cygdrive/c/Program 
>>>>>>>>> Files/Tesseract-OCR/text2image --fonts_dir=%WINDIR%/Fonts/ --ptsize 
>>>>>>>>> 12 
>>>>>>>>> --font=Arial --outputbase=/tmp/font_tmp.0doGBqWc3I/sample_text.txt 
>>>>>>>>> --text=/tmp/font_tmp.0doGBqWc3I/sample_text.txt 
>>>>>>>>> --fontconfig_tmpdir=/tmp/font_tmp.0doGBqWc3I
>>>>>>>>> Unable to open '/tmp/font_tmp.0doGBqWc3I/fonts.conf' for writing
>>>>>>>>> Fontconfig error: Cannot load default config file
>>>>>>>>> Could not find font named 'Arial'.
>>>>>>>>> Please correct --font arg.
>>>>>>>>> ERROR: Program Program failed. Abort.
>>>>>>>>>
>>>>>>>>> I am still getting this fonts.conf error. Any idea how to 
>>>>>>>>> resolve it?
>>>>>>>>>
>>>>>>>>> Regards,
>>>>>>>>> Samruddhi
>>>>>>>>>
>>>>>>>>> On Tuesday, August 31, 2021 at 4:50:14 PM UTC+5:30 zdenop wrote:
>>>>>>>>>
>>>>>>>>>> Try running this:
>>>>>>>>>> "C:\Program Files\Tesseract-OCR\text2image.exe" 
>>>>>>>>>> --fonts_dir=%WINDIR%/Fonts --fontconfig_tmpdir=/tmp 
>>>>>>>>>> --list_available_fonts
>>>>>>>>>>
>>>>>>>>>> Zdenko
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> On Mon, 30 Aug 2021 at 16:45, Samruddhi Dhake <sam22...@gmail.com> 
>>>>>>>>>> wrote:
>>>>>>>>>>
>>>>>>>>>>> Hi,
>>>>>>>>>>>
>>>>>>>>>>> I am running command ->
>>>>>>>>>>>
>>>>>>>>>>> ./src/training/tesstrain.sh --fonts_dir C:/Windows/Fonts --lang 
>>>>>>>>>>> eng --linedata_only --noextract_font_properties --langdata_dir 
>>>>>>>>>>> "C:/Program 
>>>>>>>>>>> Files/Tesseract-OCR/langdata" --tessdata_dir "C:/Program 
>>>>>>>>>>> Files/Tesseract-OCR/tessdata" --output_dir D:\Test\trainneddata
>>>>>>>>>>>
>>>>>>>>>>> And after hitting enter -> (processing)
>>>>>>>>>>> === *Starting training for language 'eng'*
>>>>>>>>>>> *[Mon Aug 30 16:51:10 IST 2021] /cygdrive/c/Program 
>>>>>>>>>>> Files/Tesseract-OCR/text2image --fonts_dir=C:/Windows/Fonts/ 
>>>>>>>>>>> --ptsize 12 
>>>>>>>>>>> --font=Arial Bold 
>>>>>>>>>>> --outputbase=/tmp/font_tmp.s9cdSHrzKS/sample_text.txt 
>>>>>>>>>>> --text=/tmp/font_tmp.s9cdSHrzKS/sample_text.txt 
>>>>>>>>>>> --fontconfig_tmpdir=/tmp/font_tmp.s9cdSHrzKS*
>>>>>>>>>>> *Unable to open '/tmp/font_tmp.s9cdSHrzKS/fonts.conf' for 
>>>>>>>>>>> writing*
>>>>>>>>>>> *Fontconfig error: Cannot load default config file*
>>>>>>>>>>> *Could not find font named 'Arial Bold'.*
>>>>>>>>>>> *Please correct --font arg.*
>>>>>>>>>>> *ERROR: Program Program failed. Abort.*
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> I will break it down to ask a few questions.
>>>>>>>>>>>
>>>>>>>>>>> *[Mon Aug 30 16:51:10 IST 2021] /cygdrive/c/Program 
>>>>>>>>>>> Files/Tesseract-OCR/text2image --fonts_dir=C:/Windows/Fonts/ 
>>>>>>>>>>> --ptsize 12 
>>>>>>>>>>> --font=Arial Bold 
>>>>>>>>>>> --outputbase=/tmp/font_tmp.s9cdSHrzKS/sample_text.txt 
>>>>>>>>>>> --text=/tmp/font_tmp.s9cdSHrzKS/sample_text.txt 
>>>>>>>>>>> --fontconfig_tmpdir=/tmp/font_tmp.s9cdSHrzKS*
>>>>>>>>>>> *Unable to open '/tmp/font_tmp.s9cdSHrzKS/fonts.conf' for 
>>>>>>>>>>> writing*
>>>>>>>>>>> ----> Here, I did not pass Arial Bold as input. I expected 
>>>>>>>>>>> --outputbase to create the temp folder 'font_tmp.s9cdSHrzKS', but 
>>>>>>>>>>> it is not created, and the same goes for --fontconfig_tmpdir. 
>>>>>>>>>>> So it gives the write error.
>>>>>>>>>>>
>>>>>>>>>>> *Fontconfig error: Cannot load default config file*
>>>>>>>>>>> ----> To resolve this error, I added 
>>>>>>>>>>> FONTCONFIG_FILE=%WINDIR%\fonts.conf to the environment 
>>>>>>>>>>> variables (referring to 
>>>>>>>>>>> https://forums.wesnoth.org/viewtopic.php?t=22821), 
>>>>>>>>>>> but it is still not resolved.
>>>>>>>>>>>
>>>>>>>>>>> I also tried *text2image.exe --list_available_fonts*.
>>>>>>>>>>> After hitting Enter, I got: Fontconfig warning: 
>>>>>>>>>>> "/tmp\fonts.conf", line 4: empty font directory name ignored
>>>>>>>>>>>
>>>>>>>>>>> The contents of the fonts.conf file that gets created are:
>>>>>>>>>>> <?xml version="1.0"?>
>>>>>>>>>>> <!DOCTYPE fontconfig SYSTEM "fonts.dtd">
>>>>>>>>>>> <fontconfig>
>>>>>>>>>>> <dir></dir>
>>>>>>>>>>> <cachedir>/tmp</cachedir>
>>>>>>>>>>> <config></config>
>>>>>>>>>>> </fontconfig>
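
The empty `<dir></dir>` entry is what the "empty font directory name ignored" warning is pointing at. My guess is that it is empty because that call had no --fonts_dir. A small sketch for checking a generated fonts.conf for this, using only the Python standard library; the function name is made up for illustration:

```python
import xml.etree.ElementTree as ET

# The fonts.conf contents quoted above.
FONTS_CONF = """<?xml version="1.0"?>
<!DOCTYPE fontconfig SYSTEM "fonts.dtd">
<fontconfig>
<dir></dir>
<cachedir>/tmp</cachedir>
<config></config>
</fontconfig>
"""


def empty_font_dirs(fonts_conf_xml: str) -> int:
    """Count <dir> elements that contain no directory name."""
    root = ET.fromstring(fonts_conf_xml)
    return sum(1 for d in root.iter("dir") if not (d.text or "").strip())


if __name__ == "__main__":
    print(empty_font_dirs(FONTS_CONF), "empty <dir> entries")
```

With a font directory filled in (for example `<dir>C:/Windows/Fonts</dir>`) the count drops to zero, which is the state the file should be in before text2image can find any font.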
>>>>>>>>>>>
>>>>>>>>>>> Can you please help me resolve this? And am I giving the 
>>>>>>>>>>> correct tesstrain.sh command and arguments?
>>>>>>>>>>>
>>>>>>>>>>> Regards,
>>>>>>>>>>> Samruddhi
>>>>>>>>>>> On Monday, August 30, 2021 at 5:12:21 PM UTC+5:30 zdenop wrote:
>>>>>>>>>>>
>>>>>>>>>>>> First of all: use quotes for multi-word names, or escape 
>>>>>>>>>>>> spaces/special symbols (e.g. --font="Arial Bold")
>>>>>>>>>>>> Next: fix the error message: "Unable to open 
>>>>>>>>>>>> '/tmp/font_tmp.hbC9F3LEQX/fonts.conf' for writing" 
>>>>>>>>>>>> Next: check the available fonts for text2image with the option 
>>>>>>>>>>>> --list_available_fonts
>>>>>>>>>>>> etc...
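
To illustrate the quoting point: a small Python sketch showing how an unquoted `--font=Arial Bold` splits into two arguments in a POSIX shell such as Cygwin bash, and how `shlex` from the standard library quotes it. This applies to bash only; cmd.exe has different quoting rules. The argument list is just an example.

```python
import shlex


def quote_command(args):
    """Join arguments into one shell-safe command line (POSIX shell rules)."""
    return shlex.join(args)


ARGS = ["text2image", "--font=Arial Bold", "--fonts_dir=C:/Windows/Fonts",
        "--list_available_fonts"]

if __name__ == "__main__":
    # Naive joining loses the argument boundary at the space in the font name.
    print(shlex.split(" ".join(ARGS)))
    # Proper quoting keeps "--font=Arial Bold" as a single argument.
    print(quote_command(ARGS))
```

The quoted form round-trips: splitting it again gives back the original four arguments, whereas the naive form yields five.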
>>>>>>>>>>>>
>>>>>>>>>>>> PS: I would suggest using Linux for training instead of Windows 
>>>>>>>>>>>> (e.g. in WSL[1])
>>>>>>>>>>>> [1] https://docs.microsoft.com/en-us/windows/wsl/install-win10
>>>>>>>>>>>>
>>>>>>>>>>>> Zdenko
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> On Mon, 30 Aug 2021 at 12:12, Samruddhi Dhake <sam22...@gmail.com> 
>>>>>>>>>>>> wrote:
>>>>>>>>>>>>
>>>>>>>>>>>>> Hi,
>>>>>>>>>>>>>
>>>>>>>>>>>>> The text2image error is gone. Now I am getting a *fontconfig error*.
>>>>>>>>>>>>>
>>>>>>>>>>>>> SDE26@DTP-SDE26-IND /cygdrive/c/Program Files/Tesseract-OCR
>>>>>>>>>>>>> $ ./src/training/tesstrain.sh --fonts_dir C:/Windows/Fonts 
>>>>>>>>>>>>> --lang eng --linedata_only --noextract_font_properties 
>>>>>>>>>>>>> --langdata_dir 
>>>>>>>>>>>>> "C:/Program Files/Tesseract-OCR/langdata" --tessdata_dir 
>>>>>>>>>>>>> "C:/Program 
>>>>>>>>>>>>> Files/Tesseract-OCR/tessdata" --output_dir D:\Test\trainneddata
>>>>>>>>>>>>> Creating new directory D:Testtrainneddata
>>>>>>>>>>>>>
>>>>>>>>>>>>> === Starting training for language 'eng'
>>>>>>>>>>>>> [Mon Aug 30 15:34:53 IST 2021] /cygdrive/c/Program 
>>>>>>>>>>>>> Files/Tesseract-OCR/text2image --fonts_dir=C:/Windows/Fonts 
>>>>>>>>>>>>> --ptsize 12 
>>>>>>>>>>>>> --font=Arial Bold 
>>>>>>>>>>>>> --outputbase=/tmp/font_tmp.hbC9F3LEQX/sample_text.txt 
>>>>>>>>>>>>> --text=/tmp/font_tmp.hbC9F3LEQX/sample_text.txt 
>>>>>>>>>>>>> --fontconfig_tmpdir=/tmp/font_tmp.hbC9F3LEQX
>>>>>>>>>>>>> Unable to open '/tmp/font_tmp.hbC9F3LEQX/fonts.conf' for 
>>>>>>>>>>>>> writing
>>>>>>>>>>>>> Fontconfig error: Cannot load default config file
>>>>>>>>>>>>> Could not find font named 'Arial Bold'.
>>>>>>>>>>>>> Please correct --font arg.
>>>>>>>>>>>>> ERROR: Program Program failed. Abort.
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> I have the Arial Bold font on my machine, so I don't know why 
>>>>>>>>>>>>> it cannot be found. Also, there is no font_tmp.hbC9F3LEQX folder 
>>>>>>>>>>>>> in /tmp/, which would explain why fonts.conf cannot be opened 
>>>>>>>>>>>>> for writing.
>>>>>>>>>>>>> How can I resolve this?
>>>>>>>>>>>>>
>>>>>>>>>>>>> Regards,
>>>>>>>>>>>>> Samruddhi
>>>>>>>>>>>>>
>>>>>>>>>>>>> On Wednesday, August 25, 2021 at 8:18:47 PM UTC+5:30 zdenop 
>>>>>>>>>>>>> wrote:
>>>>>>>>>>>>>
>>>>>>>>>>>>>> Honestly, I have no clue what you are doing: text2image is at 
>>>>>>>>>>>>>> the same location as the tesseract executable. So if you have 
>>>>>>>>>>>>>> tesseract in 
>>>>>>>>>>>>>> the path, text2image must work too. 
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> [image: image.png]
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Zdenko
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> On Wed, 25 Aug 2021 at 16:26, Samruddhi Dhake <sam22...@gmail.com> 
>>>>>>>>>>>>>> wrote:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> As you suggested, I installed Tesseract v5.0.0 on my Windows 
>>>>>>>>>>>>>>> machine (Index of /tesseract (uni-mannheim.de) 
>>>>>>>>>>>>>>> <https://digi.bib.uni-mannheim.de/tesseract/>). This 
>>>>>>>>>>>>>>> included the training tools too.
>>>>>>>>>>>>>>> I performed all the previous steps (box file, lstmf file, 
>>>>>>>>>>>>>>> unicharset).
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> But after running the tesstrain.sh command in Cygwin, I am 
>>>>>>>>>>>>>>> still getting the following error:
>>>>>>>>>>>>>>> $ ./src/training/tesstrain.sh --fonts_dir C:/Windows/Fonts 
>>>>>>>>>>>>>>> --lang eng --linedata_only --noextract_font_properties 
>>>>>>>>>>>>>>> --langdata_dir 
>>>>>>>>>>>>>>> "C:/Program Files/Tesseract-OCR/langdata" --tessdata_dir 
>>>>>>>>>>>>>>> "C:/Program 
>>>>>>>>>>>>>>> Files/Tesseract-OCR/tessdata" --output_dir 
>>>>>>>>>>>>>>> D:/Bugs/1206806/folder/trainneddata
>>>>>>>>>>>>>>> Creating new directory D:/Bugs/1206806/folder/trainneddata
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> === Starting training for language 'eng'
>>>>>>>>>>>>>>> which: no text2image in 
>>>>>>>>>>>>>>> (/usr/local/bin:/usr/bin:/cygdrive/c/Program Files/Microsoft 
>>>>>>>>>>>>>>> MPI/Bin:/cygdrive/c/buildtools:/cygdrive/c/Program Files 
>>>>>>>>>>>>>>> (x86)/NVIDIA 
>>>>>>>>>>>>>>> Corporation/PhysX/Common:/cygdrive/c/Program Files 
>>>>>>>>>>>>>>> (x86)/Intel/Intel(R) 
>>>>>>>>>>>>>>> Management Engine Components/iCLS:/cygdrive/c/Program 
>>>>>>>>>>>>>>> Files/Intel/Intel(R) 
>>>>>>>>>>>>>>> Management Engine 
>>>>>>>>>>>>>>> Components/iCLS:/cygdrive/c/Python25:/cygdrive/c/ProgramData/Oracle/Java/javapath:/cygdrive/c/Perl/site/bin:/cygdrive/c/Perl/bin:/cygdrive/c/Oracle12C_64bCli/client_1/bin:/cygdrive/c/Oracle12C_32bCli/client_1/bin:/cygdrive/c/windows/system32:/cygdrive/c/windows:/cygdrive/c/windows/System32/Wbem:/cygdrive/c/windows/System32/WindowsPowerShell/v1.0:/cygdrive/c/windows/System32/OpenSSH:/cygdrive/c/Program
>>>>>>>>>>>>>>>  
>>>>>>>>>>>>>>> Files (x86)/Microsoft SQL 
>>>>>>>>>>>>>>> Server/100/Tools/Binn:/cygdrive/c/Program 
>>>>>>>>>>>>>>> Files/Microsoft SQL Server/100/Tools/Binn:/cygdrive/c/Program 
>>>>>>>>>>>>>>> Files/Microsoft SQL Server/100/DTS/Binn:/cygdrive/c/Program 
>>>>>>>>>>>>>>> Files/Microsoft/Web Platform Installer:/cygdrive/c/Program 
>>>>>>>>>>>>>>> Files 
>>>>>>>>>>>>>>> (x86)/Microsoft ASP.NET/ASP.NET Web 
>>>>>>>>>>>>>>> Pages/v1.0:/cygdrive/c/Program Files/Microsoft SQL 
>>>>>>>>>>>>>>> Server/110/Tools/Binn:/cygdrive/c/windows/system32/config/systemprofile/.dnx/bin:/cygdrive/c/Program
>>>>>>>>>>>>>>>  
>>>>>>>>>>>>>>> Files/Microsoft DNX/Dnvm:/cygdrive/c/Program Files 
>>>>>>>>>>>>>>> (x86)/Windows 
>>>>>>>>>>>>>>> Kits/8.1/Windows Performance Toolkit:/cygdrive/c/Program 
>>>>>>>>>>>>>>> Files/Microsoft 
>>>>>>>>>>>>>>> SQL Server/130/Tools/Binn:/cygdrive/c/Program Files 
>>>>>>>>>>>>>>> (x86)/Windows 
>>>>>>>>>>>>>>> Kits/10/Windows Performance Toolkit:/cygdrive/c/Program Files 
>>>>>>>>>>>>>>> (x86)/Oracle/Berkeley DB 12cR1 6.0.20/bin:/cygdrive/c/Program 
>>>>>>>>>>>>>>> Files/dotnet:/cygdrive/c/Program Files/Microsoft SQL 
>>>>>>>>>>>>>>> Server/Client 
>>>>>>>>>>>>>>> SDK/ODBC/170/Tools/Binn:/cygdrive/c/Program Files 
>>>>>>>>>>>>>>> (x86)/IncrediBuild:/cygdrive/c/WINDOWS/system32:/cygdrive/c/WINDOWS:/cygdrive/c/WINDOWS/System32/Wbem:/cygdrive/c/WINDOWS/System32/WindowsPowerShell/v1.0:/cygdrive/c/WINDOWS/System32/OpenSSH:/cygdrive/c/Program
>>>>>>>>>>>>>>>  
>>>>>>>>>>>>>>> Files (x86)/Microsoft SQL 
>>>>>>>>>>>>>>> Server/150/Tools/Binn:/cygdrive/c/Program 
>>>>>>>>>>>>>>> Files/Microsoft SQL Server/150/Tools/Binn:/cygdrive/c/Program 
>>>>>>>>>>>>>>> Files 
>>>>>>>>>>>>>>> (x86)/Microsoft SQL Server/150/DTS/Binn:/cygdrive/c/Program 
>>>>>>>>>>>>>>> Files/Microsoft 
>>>>>>>>>>>>>>> SQL Server/150/DTS/Binn:/cygdrive/c/Program Files 
>>>>>>>>>>>>>>> (x86)/Microsoft SQL 
>>>>>>>>>>>>>>> Server/Client SDK/ODBC/130/Tools/Binn:/cygdrive/c/Program Files 
>>>>>>>>>>>>>>> (x86)/Microsoft SQL Server/140/Tools/Binn:/cygdrive/c/Program 
>>>>>>>>>>>>>>> Files 
>>>>>>>>>>>>>>> (x86)/Microsoft SQL Server/140/DTS/Binn:/cygdrive/c/Program 
>>>>>>>>>>>>>>> Files 
>>>>>>>>>>>>>>> (x86)/Microsoft SQL 
>>>>>>>>>>>>>>> Server/140/Tools/Binn/ManagementStudio:/cygdrive/d/Git/cmd:/cygdrive/c/Users/sde26/AppData/Local/Microsoft/WindowsApps:/cygdrive/c/Users/sde26/.dotnet/tools)
>>>>>>>>>>>>>>> which: no text2image in (./api)
>>>>>>>>>>>>>>> which: no text2image in (./training)
>>>>>>>>>>>>>>> ERROR: 'text2image' not found
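
The `which:` lines show that the script looked for text2image on PATH, in ./api and in ./training, and found it nowhere. A quick way to check the same thing outside the script, sketched in Python with `shutil.which`; the helper name and the extra-directory idea are mine, and the install folder is whatever your setup uses:

```python
import os
import shutil
from typing import Optional, Sequence


def find_tool(name: str, extra_dirs: Sequence[str] = ()) -> Optional[str]:
    """Return the full path of an executable, searching extra_dirs before PATH."""
    search_path = os.pathsep.join([*extra_dirs, os.environ.get("PATH", "")])
    return shutil.which(name, path=search_path)


if __name__ == "__main__":
    # Hypothetical install folder, taken from the paths used in this thread.
    install_dir = "C:/Program Files/Tesseract-OCR"
    print(find_tool("text2image", [install_dir]) or "text2image not found")
```

If the tool is only found once the install folder is added, that folder is missing from PATH in the shell that runs tesstrain.sh. On Windows `shutil.which` also tries the PATHEXT extensions, so `text2image` matches `text2image.exe`.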
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Am I missing something? Can you please guide me?
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Regards,
>>>>>>>>>>>>>>> Samruddhi
>>>>>>>>>>>>>>> On Tuesday, August 24, 2021 at 5:59:49 PM UTC+5:30 Samruddhi 
>>>>>>>>>>>>>>> Dhake wrote:
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Can you please provide link for steps to install Tesseract 
>>>>>>>>>>>>>>>> and training tools on Windows?
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Samruddhi
>>>>>>>>>>>>>>>> On Tuesday, August 24, 2021 at 3:42:48 PM UTC+5:30 
>>>>>>>>>>>>>>>> Samruddhi Dhake wrote:
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> How do I install Tesseract and the training tools on Windows? 
>>>>>>>>>>>>>>>>> Do I have to install the Tesseract Windows exe?
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Samruddhi
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> On Tuesday, August 24, 2021 at 3:20:37 PM UTC+5:30 zdenop 
>>>>>>>>>>>>>>>>> wrote:
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> So there are only 2 possibilities:
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>    1. Install tesseract and training tools
>>>>>>>>>>>>>>>>>>    2. Learn how to handle and use software that is not 
>>>>>>>>>>>>>>>>>>    installed. This option is not related to Tesseract.
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> Zdenko
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> On Tue, 24 Aug 2021 at 9:17, Samruddhi Dhake <sam22...@gmail.com> 
>>>>>>>>>>>>>>>>>> wrote:
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> I haven't installed Tesseract. I have kept it in a folder 
>>>>>>>>>>>>>>>>>>> and I run the exe by giving its path. I built the 
>>>>>>>>>>>>>>>>>>> training tools from the source code.
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> To create the box file (I gave the absolute path of 
>>>>>>>>>>>>>>>>>>> tesseract.exe), command:
>>>>>>>>>>>>>>>>>>> ..\tesseract.exe Dim4.tif Dim4 lstmbox
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> To create the lstmf file, command:
>>>>>>>>>>>>>>>>>>> tesseract.exe Dim4.tif Dim4 lstm.train
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> To create unicharset, command->
>>>>>>>>>>>>>>>>>>> unicharset_extractor.exe --output_unicharset 
>>>>>>>>>>>>>>>>>>> ..\own.unicharset ..\langdata\eng\eng.training_text
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> And to create the traineddata, using the tesstrain.sh command:
>>>>>>>>>>>>>>>>>>> .\src\training\tesstrain.sh --fonts_dir C:\Windows\Fonts 
>>>>>>>>>>>>>>>>>>> --lang eng --linedata_only --noextract_font_properties 
>>>>>>>>>>>>>>>>>>> --langdata_dir 
>>>>>>>>>>>>>>>>>>> langdata --tessdata_dir tessdata --output_dir trainneddata
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> Regards,
>>>>>>>>>>>>>>>>>>> Samruddhi
>>>>>>>>>>>>>>>>>>> On Tuesday, August 24, 2021 at 12:24:29 PM UTC+5:30 
>>>>>>>>>>>>>>>>>>> Samruddhi Dhake wrote:
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> I have generated training tools through source code.
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> On Monday, August 23, 2021 at 7:09:02 PM UTC+5:30 
>>>>>>>>>>>>>>>>>>>> zdenop wrote:
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> How did you install tesseract? Did you also install 
>>>>>>>>>>>>>>>>>>>>> training tools?
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> Zdenko
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> On Mon, 23 Aug 2021 at 15:34, Samruddhi Dhake <
>>>>>>>>>>>>>>>>>>>>> sam22...@gmail.com> wrote:
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>> Hello,
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>> I am creating my own traineddata using Tesseract 
>>>>>>>>>>>>>>>>>>>>>> v4.1.1 on Windows 10.
>>>>>>>>>>>>>>>>>>>>>> I am following the documentation at 
>>>>>>>>>>>>>>>>>>>>>> https://tesseract-ocr.github.io/tessdoc/tess4/TrainingTesseract-4.00.html
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>> I have successfully created the .box file and the .lstmf 
>>>>>>>>>>>>>>>>>>>>>> file using lstmbox and lstm.train respectively.
>>>>>>>>>>>>>>>>>>>>>> As the next step, I installed Cygwin to run the 
>>>>>>>>>>>>>>>>>>>>>> tesstrain.sh command to create the training data.
>>>>>>>>>>>>>>>>>>>>>> But I am getting the error below.
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>> $ ./src/training/tesstrain.sh --fonts_dir 
>>>>>>>>>>>>>>>>>>>>>> C:/Windows/Fonts --lang eng --linedata_only 
>>>>>>>>>>>>>>>>>>>>>> --noextract_font_properties 
>>>>>>>>>>>>>>>>>>>>>> --langdata_dir ./langdata --tessdata_dir ./tessdata 
>>>>>>>>>>>>>>>>>>>>>> --output_dir 
>>>>>>>>>>>>>>>>>>>>>> ./trainneddata
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>> === Starting training for language 'eng'
>>>>>>>>>>>>>>>>>>>>>> which: no text2image in 
>>>>>>>>>>>>>>>>>>>>>> (/usr/local/bin:/usr/bin:/cygdrive/c/Program 
>>>>>>>>>>>>>>>>>>>>>> Files/Microsoft 
>>>>>>>>>>>>>>>>>>>>>> MPI/Bin:/cygdrive/c/buildtools:/cygdrive/c/Program Files 
>>>>>>>>>>>>>>>>>>>>>> (x86)/NVIDIA 
>>>>>>>>>>>>>>>>>>>>>> Corporation/PhysX/Common:/cygdrive/c/Program Files 
>>>>>>>>>>>>>>>>>>>>>> (x86)/Intel/Intel(R) 
>>>>>>>>>>>>>>>>>>>>>> Management Engine Components/iCLS:/cygdrive/c/Program 
>>>>>>>>>>>>>>>>>>>>>> Files/Intel/Intel(R) 
>>>>>>>>>>>>>>>>>>>>>> Management Engine 
>>>>>>>>>>>>>>>>>>>>>> Components/iCLS:/cygdrive/c/Python25:/cygdrive/c/ProgramData/Oracle/Java/javapath:/cygdrive/c/Perl/site/bin:/cygdrive/c/Perl/bin:/cygdrive/c/Oracle12C_64bCli/client_1/bin:/cygdrive/c/Oracle12C_32bCli/client_1/bin:/cygdrive/c/windows/system32:/cygdrive/c/windows:/cygdrive/c/windows/System32/Wbem:/cygdrive/c/windows/System32/WindowsPowerShell/v1.0:/cygdrive/c/windows/System32/OpenSSH:/cygdrive/c/Program
>>>>>>>>>>>>>>>>>>>>>>  
>>>>>>>>>>>>>>>>>>>>>> Files (x86)/Microsoft SQL 
>>>>>>>>>>>>>>>>>>>>>> Server/100/Tools/Binn:/cygdrive/c/Program 
>>>>>>>>>>>>>>>>>>>>>> Files/Microsoft SQL 
>>>>>>>>>>>>>>>>>>>>>> Server/100/Tools/Binn:/cygdrive/c/Program 
>>>>>>>>>>>>>>>>>>>>>> Files/Microsoft SQL 
>>>>>>>>>>>>>>>>>>>>>> Server/100/DTS/Binn:/cygdrive/c/Program 
>>>>>>>>>>>>>>>>>>>>>> Files/Microsoft/Web Platform 
>>>>>>>>>>>>>>>>>>>>>> Installer:/cygdrive/c/Program Files 
>>>>>>>>>>>>>>>>>>>>>> (x86)/Microsoft ASP.NET/ASP.NET Web 
>>>>>>>>>>>>>>>>>>>>>> Pages/v1.0:/cygdrive/c/Program Files/Microsoft SQL 
>>>>>>>>>>>>>>>>>>>>>> Server/110/Tools/Binn:/cygdrive/c/windows/system32/config/systemprofile/.dnx/bin:/cygdrive/c/Program
>>>>>>>>>>>>>>>>>>>>>>  
>>>>>>>>>>>>>>>>>>>>>> Files/Microsoft DNX/Dnvm:/cygdrive/c/Program Files 
>>>>>>>>>>>>>>>>>>>>>> (x86)/Windows 
>>>>>>>>>>>>>>>>>>>>>> Kits/8.1/Windows Performance Toolkit:/cygdrive/c/Program 
>>>>>>>>>>>>>>>>>>>>>> Files/Microsoft 
>>>>>>>>>>>>>>>>>>>>>> SQL Server/130/Tools/Binn:/cygdrive/c/Program Files 
>>>>>>>>>>>>>>>>>>>>>> (x86)/Windows 
>>>>>>>>>>>>>>>>>>>>>> Kits/10/Windows Performance Toolkit:/cygdrive/c/Program 
>>>>>>>>>>>>>>>>>>>>>> Files 
>>>>>>>>>>>>>>>>>>>>>> (x86)/Oracle/Berkeley DB 12cR1 
>>>>>>>>>>>>>>>>>>>>>> 6.0.20/bin:/cygdrive/c/Program 
>>>>>>>>>>>>>>>>>>>>>> Files/dotnet:/cygdrive/c/Program Files/Microsoft SQL 
>>>>>>>>>>>>>>>>>>>>>> Server/Client 
>>>>>>>>>>>>>>>>>>>>>> SDK/ODBC/170/Tools/Binn:/cygdrive/c/Program Files 
>>>>>>>>>>>>>>>>>>>>>> (x86)/IncrediBuild:/cygdrive/c/WINDOWS/system32:/cygdrive/c/WINDOWS:/cygdrive/c/WINDOWS/System32/Wbem:/cygdrive/c/WINDOWS/System32/WindowsPowerShell/v1.0:/cygdrive/c/WINDOWS/System32/OpenSSH:/cygdrive/c/Program
>>>>>>>>>>>>>>>>>>>>>>  
>>>>>>>>>>>>>>>>>>>>>> Files (x86)/Microsoft SQL 
>>>>>>>>>>>>>>>>>>>>>> Server/150/Tools/Binn:/cygdrive/c/Program 
>>>>>>>>>>>>>>>>>>>>>> Files/Microsoft SQL 
>>>>>>>>>>>>>>>>>>>>>> Server/150/Tools/Binn:/cygdrive/c/Program Files 
>>>>>>>>>>>>>>>>>>>>>> (x86)/Microsoft SQL 
>>>>>>>>>>>>>>>>>>>>>> Server/150/DTS/Binn:/cygdrive/c/Program Files/Microsoft 
>>>>>>>>>>>>>>>>>>>>>> SQL Server/150/DTS/Binn:/cygdrive/c/Program Files 
>>>>>>>>>>>>>>>>>>>>>> (x86)/Microsoft SQL 
>>>>>>>>>>>>>>>>>>>>>> Server/Client 
>>>>>>>>>>>>>>>>>>>>>> SDK/ODBC/130/Tools/Binn:/cygdrive/c/Program Files 
>>>>>>>>>>>>>>>>>>>>>> (x86)/Microsoft SQL 
>>>>>>>>>>>>>>>>>>>>>> Server/140/Tools/Binn:/cygdrive/c/Program Files 
>>>>>>>>>>>>>>>>>>>>>> (x86)/Microsoft SQL 
>>>>>>>>>>>>>>>>>>>>>> Server/140/DTS/Binn:/cygdrive/c/Program Files 
>>>>>>>>>>>>>>>>>>>>>> (x86)/Microsoft SQL 
>>>>>>>>>>>>>>>>>>>>>> Server/140/Tools/Binn/ManagementStudio:/cygdrive/d/Git/cmd:/cygdrive/c/Users/sde26/AppData/Local/Microsoft/WindowsApps:/cygdrive/c/Users/sde26/.dotnet/tools)
>>>>>>>>>>>>>>>>>>>>>> which: no text2image in (./api)
>>>>>>>>>>>>>>>>>>>>>> which: no text2image in (./training)
>>>>>>>>>>>>>>>>>>>>>> ERROR: 'text2image' not found
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>> I found that text2image is built by running the command 
>>>>>>>>>>>>>>>>>>>>>> 'make training'.
>>>>>>>>>>>>>>>>>>>>>> Can you please help me with how this can be done on 
>>>>>>>>>>>>>>>>>>>>>> Windows 10?
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>> Regards,
>>>>>>>>>>>>>>>>>>>>>> Samruddhi
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>> -- 
>>>>>>>>>>>>>>>>>>>>>> You received this message because you are subscribed 
>>>>>>>>>>>>>>>>>>>>>> to the Google Groups "tesseract-ocr" group.
>>>>>>>>>>>>>>>>>>>>>> To unsubscribe from this group and stop receiving 
>>>>>>>>>>>>>>>>>>>>>> emails from it, send an email to 
>>>>>>>>>>>>>>>>>>>>>> tesseract-oc...@googlegroups.com.
>>>>>>>>>>>>>>>>>>>>>> To view this discussion on the web visit 
>>>>>>>>>>>>>>>>>>>>>> https://groups.google.com/d/msgid/tesseract-ocr/5adf563d-117b-4bd8-a283-dd21e53575f4n%40googlegroups.com
>>>>>>>>>>>>>>>>>>>>>>  
>>>>>>>>>>>>>>>>>>>>>> <https://groups.google.com/d/msgid/tesseract-ocr/5adf563d-117b-4bd8-a283-dd21e53575f4n%40googlegroups.com?utm_medium=email&utm_source=footer>
>>>>>>>>>>>>>>>>>>>>>> .
>>>>>>>>>>>>>>>>>>>>>>

-- 
You received this message because you are subscribed to the Google Groups 
"tesseract-ocr" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to tesseract-ocr+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/tesseract-ocr/c8140d82-1cf1-410b-af44-747d11ffef1fn%40googlegroups.com.
