Re: Disable Special characters?

2010-04-18 Thread zdenko podobny
Hello, if I correctly understood "Comment by ffournel, Mar 30, 2010" on http://code.google.com/p/tesseract-ocr/wiki/FAQ we can achieve the same behavior by creating a config file (e.g. digits in directory tessdata/configs/) with the line: tessedit_char_whitelist 0123456789 and then running: C:>tesser
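
The whitelist approach described in this message can be sketched roughly as follows. This is only an illustration: the helper names write_whitelist_config and build_command, and the file names image.tif and result, are made up for the example, and the command is only assembled, not executed. It assumes the usual "tesseract <image> <outputbase> [configfile]" calling convention.

```python
import os
import tempfile


def write_whitelist_config(tessdata_dir, name="digits", chars="0123456789"):
    """Create <tessdata_dir>/configs/<name> with a single whitelist line."""
    configs_dir = os.path.join(tessdata_dir, "configs")
    os.makedirs(configs_dir, exist_ok=True)
    path = os.path.join(configs_dir, name)
    with open(path, "w") as handle:
        handle.write("tessedit_char_whitelist %s\n" % chars)
    return path


def build_command(image, output_base, config_name="digits"):
    """Assemble (but do not run) a tesseract call that names the config file."""
    # Assumed calling convention: tesseract <image> <outputbase> [configfile]
    return ["tesseract", image, output_base, config_name]


if __name__ == "__main__":
    demo_tessdata = tempfile.mkdtemp()  # stand-in for a real tessdata directory
    print(write_whitelist_config(demo_tessdata))
    print(" ".join(build_command("image.tif", "result")))
```

Writing into a temporary directory keeps the sketch harmless; with a real installation the tessdata directory would be passed in instead.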

Re: Training Tesseract 3

2010-04-23 Thread zdenko podobny
Hello, tesseract 3.0 is in svn: http://code.google.com/p/tesseract-ocr/source/checkout (source code). Some information can be found in http://code.google.com/p/tesseract-ocr/wiki/ReadMe (Installation Notes - 3.00 Prerelease) Zd. On Thu, Apr 22, 2010 at 10:48 AM, Ayatullah wrote: > What is the

Re: Tesseract 3.0 without page layout analysis?

2010-04-23 Thread zdenko podobny
Hello, http://code.google.com/p/tesseract-ocr/wiki/ReadMe, section Installation Notes - 3.00 Prerelease: In the executable, page layout analysis is enabled by default. You may need to turn it off to process small images. No command-line control for this yet. Sorry. See tesseractmain.cpp. Zd. On

Re: Extracting files from .tessdata

2010-04-28 Thread zdenko podobny
Hello Ramon, to extend an existing language you need "Tif/Box pairs"; see http://code.google.com/p/tesseract-ocr/wiki/FAQ and there "How do I add just one character or one font to my favourite language, without having to retrain from scratch?" Unfortunately tif/box pairs are provided only for eng

Re: Tesseract 3.0 without page layout analysis?

2010-04-28 Thread zdenko podobny
If you find out how to turn it off, please share this info ;-) Zd. On Sun, Apr 25, 2010 at 5:43 PM, Jan wrote: > Thanks for the info, when I will try to change in the > tesseractmain.cpp. > > Jan > > > > On 23 Apr., 09:38, zdenko podobny wrote: > > Hello, > >

Re: Cannot run tesseract.exe

2010-05-13 Thread zdenko podobny
How did you install tesseract? 2010/5/13 Mehmet Can Altıgül > Hi guys, > > I have been tryin to run tesseract.exe but it throws this error: "Unable > to load unicharset file ./tessdata/eng.unicharset" > > I use this command: "tesseract.exe ocr.bmp xx.txt" > > Seems like engilish unicharset

Re: Spaces situation in Training image

2010-05-23 Thread zdenko podobny
Hello, http://code.google.com/p/tesseract-ocr/wiki/TrainingTesseract: It is *ABSOLUTELY VITAL* to space out the text a bit when printing, so up the inter-character and inter-line spacing in your word processor. Not spacing text out sufficiently will cause "FAILURE! box overlaps no blobs or blobs

Re: Integrating Tesseract with another open source project

2010-05-23 Thread zdenko podobny
Hi, it would be better to use the forum than to contact me (I am not a programmer - I am just a user who tries to read the documentation :-) ) Zd. On Sat, May 22, 2010 at 7:38 PM, Thilanka wrote: > Hi Zdenko, > > Thank you very much for the tips. I'll contact you if > I face any problem on this.

Re: Danish fraktur support in r319

2010-05-25 Thread zdenko podobny
Did you try to use google ;-)? There are plenty of examples, e.g.: http://wiki.creativecommons.org/HOWTO_Patch Zd. On Tue, May 25, 2010 at 2:53 PM, Sriranga(77yrsold) wrote: > Jimmy, > How to do? Alternatively will you kindly forward copy of pached cp

Re: PLEASE GIVE ME THE CODE OF VERSION 3 OF TESSERACT-OCR

2010-05-26 Thread zdenko podobny
http://code.google.com/p/tesseract-ocr/source/checkout On Wed, May 26, 2010 at 11:14 AM, sushovon wrote: > please anyone mail me or give the from where i can download the code > along with build of version 3 which is yet to be published.i badly > need this.plz anyone help me. > > -- > You receiv

Re: Call for testers...

2010-05-27 Thread zdenko podobny
It looks like you did not run './runautoconf' before './configure' Zd. On Thu, May 27, 2010 at 2:34 PM, Karl Wettin wrote: > > 26 maj 2010 kl. 16.23 skrev Jimmy O'Regan: > > > I've just updated the SVN version to use libtool (and shared >> libraries, that sort of thing) but it's only teste

Re: newest release of tesseract

2010-06-08 Thread zdenko podobny
Hello, V3.00 has not been released yet. I am not sure what the criteria for stable are (you can compile it and run it ;-) ). You can get the code via svn: http://code.google.com/p/tesseract-ocr/source/checkout Zd. On Mon, May 31, 2010 at 4:43 PM, butch wrote: > I have seen mention "tesseract 3.0" in some p

Re: Forking tesseract.

2010-06-10 Thread zdenko podobny
No what? BTW: this question was intended for Ray Smith/Google or the provider of the data in svn. Do you speak on their behalf? Zd. On Thu, Jun 10, 2010 at 4:34 AM, Elmer Fittery wrote: > Sorry but no. > > On Wed, 2010-06-09 at 22:17 +0200, Zdenko Podobný wrote: > > Hello, > > > > do you intend to relea

Re: *** glibc detected *** tesseract: double free or corruption

2010-07-12 Thread zdenko podobny
Hello, How did you install Tesseract? Which version? Please provide more information. Zd. On Sun, Jul 11, 2010 at 6:16 PM, msjs08 wrote: > > I've installed Tesseract on Mandriva 2010 (64 bit) and I can't get it to > run. > It just segfaults. > I installed gimagereader. This is the error I go

Re: Problem using DangAmbigs and user-words files

2010-08-06 Thread zdenko podobny
Did you try the latest revision (r449)? Zd. On Wed, Aug 4, 2010 at 3:52 PM, caro wrote: > someone to help me? > > thank you > > On Jul 20, 4:18 pm, caro wrote: > > I try to complete these files, after looking at errors appearing > > during the recognition. > > Typically, I have the following

Announcement: new version of pyTesseractTrainer available

2010-08-13 Thread zdenko podobny
Hello, I would like to announce the new version 1.01 of pyTesseractTrainer, the successor of tesseractTrainer.py. Version 1.00 is identical to tesseractTrainer.py. Features: - visual editor of box file - layout of symbol from box fi

Re: Announcement: new version of pyTesseractTrainer available

2010-08-21 Thread zdenko podobny
Hi, your problem is that you use tesseractTrainer.py, which was written in 2007, and not pyTesseractTrainer.py (2010), which corrected this issue. I would suggest using http://code.google.com/p/pytesseracttrainer/downloads/detail?name=pyTesseractTrainer-1.01.py or (if you are brave enough) the devel version: h

Re: Which revision of tesseract 3.0 for win7 64bit

2010-08-23 Thread zdenko podobny
On Thu, Aug 19, 2010 at 11:45 PM, Max wrote: > > On Aug 19, 11:49 am, "Jimmy O'Regan" wrote: > > On 19 August 2010 11:23, Joe Degenhardt > wrote: > > > > No, that's the state of things. > > > > hmm... The latest code compiles and works for me :). May be I should > have mentioned that only the

Re: Which revision of tesseract 3.0 for win7 64bit

2010-08-23 Thread zdenko podobny
On Mon, Aug 23, 2010 at 1:19 PM, zdenko podobny wrote: > On Thu, Aug 19, 2010 at 11:45 PM, Max wrote: > >> >> On Aug 19, 11:49 am, "Jimmy O'Regan" wrote: >> > On 19 August 2010 11:23, Joe Degenhardt >> wrote: >> > >> > No, that

Re: Which revision of tesseract 3.0 for win7 64bit

2010-08-26 Thread zdenko podobny
> tried with -l vie, it again put out another error: > > Could not open file, ./tessdata/vie.user-words > > The program should be able to continue w/o any *.user-words files. > > Thanks. > > On Aug 23, 6:19 am, zdenko podobny wrote: > > > > There is new ver

Re: Tesseract Training Problem (under Mac)

2010-09-05 Thread zdenko podobny
Hello, Tesseract 2.04 does not use a "combined" file, so there is no combine_tessdata. Just copy your files to the tessdata directory. At the moment http://code.google.com/p/tesseract-ocr/wiki/TestingTesseract describes training for Tesseract 3.0 (with mistakes ;-) - I started to check it so soon there wi

Re: Alternatives to recompiling with libtiff?

2010-09-16 Thread zdenko podobny
Hi, for conversion I use ImageMagick ( http://www.imagemagick.org/script/index.php). It has a tool "convert" with the option -compress. If you need a tool with a GUI you can also use IrfanView on Windows. When saving the image, select the option "Show option dialog" and then you can choose compression ty
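
As a rough illustration of the ImageMagick route mentioned here, the sketch below assembles a "convert" call that rewrites a TIFF with a chosen compression type. The function name build_convert_command and the file names are invented for the example, the small set of compression names is only a subset of what ImageMagick accepts, and nothing is executed. The thread title suggests the goal is a TIFF that tesseract can read without libtiff, so "None" (uncompressed) is assumed as the default.

```python
# A few compression types accepted by ImageMagick's -compress option
# (subset chosen for this sketch; see the ImageMagick docs for the full list).
KNOWN_COMPRESSION = {"None", "LZW", "Group4", "Zip", "JPEG"}


def build_convert_command(src, dst, compression="None"):
    """Assemble (but do not run) an ImageMagick convert call."""
    if compression not in KNOWN_COMPRESSION:
        raise ValueError("unexpected compression type: %r" % compression)
    return ["convert", src, "-compress", compression, dst]


if __name__ == "__main__":
    # scan.tif / plain.tif are placeholder file names.
    print(" ".join(build_convert_command("scan.tif", "plain.tif")))
```

The resulting list can be handed to subprocess.run on a machine where ImageMagick is installed.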

Re: FAILURE! box overlaps no blobs or blobs in multiple rows

2010-09-28 Thread zdenko podobny
send also box file for that image. Zd. On Tue, Sep 28, 2010 at 5:51 PM, Bumbi wrote: > Here is the link to the image: > > http://www.sendspace.com/file/wrpke8 > > Thanks for the help! > > On szept. 28, 16:33, "Jimmy O'Regan" wrote: > > On 28 September 2010 13:11, Bumbi wrote: > > > > > If I u

Re: FAILURE! box overlaps no blobs or blobs in multiple rows

2010-09-28 Thread zdenko podobny
orks. If I resize with Mitchell to 200% it get an error message. > > On szept. 28, 18:43, "Jimmy O'Regan" wrote: > > On 28 September 2010 17:13, zdenko podobny wrote: > > > > > send also box file for that image. > > > > No need. There's a

Re: Help on training tesseract for new language

2010-09-29 Thread zdenko podobny
First of all - always specify the version you use. Based on the error I guess it is 3.00 (prerelease). On Linux I do not need to specify TESSDATA_PREFIX (unless you want to use a tessdata directory other than the standard one). I expect that you set the wrong TESSDATA_PREFIX. Zd. On Wed, Sep 29, 2010 at 8:18 AM, Tesser
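
One way to check a suspected wrong TESSDATA_PREFIX is to print the path where the language file would be looked up. The sketch below rests on two assumptions that are not stated in this message: that TESSDATA_PREFIX names the parent directory of tessdata (as in the 3.0x series), and that /usr/local/share/ is the default prefix (the location mentioned in other threads of this listing). The function name expected_traineddata is made up for the example.

```python
import os


def expected_traineddata(lang="eng", env=None, default_prefix="/usr/local/share/"):
    """Return the path where <lang>.traineddata is expected to live.

    Assumption: TESSDATA_PREFIX points at the directory that CONTAINS
    tessdata, not at tessdata itself.
    """
    env = os.environ if env is None else env
    prefix = env.get("TESSDATA_PREFIX", default_prefix)
    return os.path.join(prefix, "tessdata", lang + ".traineddata")


if __name__ == "__main__":
    path = expected_traineddata("eng")
    print(path, "(exists)" if os.path.exists(path) else "(missing)")
```

If the printed path contains "tessdata/tessdata", the variable was probably set one level too deep.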

Re: Help on training tesseract for new language

2010-10-01 Thread zdenko podobny
e that I am running tesseract in 2.04 version? > > On Sep 29, 4:07 pm, zdenko podobny wrote: > > First of all - alway specify version you use. Based on error I guess it > it > > 3.00 (prereleae). > > On linux I do not need specified TESSDATA_PREFIX (unless you want to use

Re: Tesseract 3.00 Released

2010-10-01 Thread zdenko podobny
On Fri, Oct 1, 2010 at 3:21 AM, Jimmy O'Regan wrote: > Tesseract release notes Sep 30 2010 - V3.00 > * Preparations for thread safety: > * Changed TessBaseAPI methods to be non-static > * Created a class hierarchy for the directories to hold instance data, > and began moving code i

Re: Tesseract 3.00 Released

2010-10-02 Thread zdenko podobny
On Sat, Oct 2, 2010 at 5:22 AM, Sriranga(77yrsold) wrote: > Zdenko, > > Downloaded windows binaries and works fine in WinXP Congratulations!!!. > It would have nice if you had > included relevant source code like tesseact.sln for VS2008C++ etc for > windows platform also. > > Did you try to loo

Re: Tesseract 3.00 Released

2010-10-03 Thread zdenko podobny
> > withblessi...@gmail.com> wrote: > >> Zdenko, >> tried to download from the website as suggested and tried to compile in >> VS2008 but error message disaplayed. >> this is brought to your kind notice. . i shall try svn r-498. >> BestRegards, >> -sr

Re: Tesseract 3.00 Released

2010-10-04 Thread zdenko podobny
On Tue, Oct 5, 2010 at 12:36 AM, Malky wrote: > I've compiled tesseract (and it works) but I don't know how to use the > language files from here: > https://code.google.com/p/tesseract-ocr/downloads/list > > I've unpacked language files into /usr/local/share/tessdata/ but I get > the error messag

Re: Tesseract 3.00 Released

2010-10-05 Thread zdenko podobny
On Tue, Oct 5, 2010 at 10:17 AM, Jimmy O'Regan wrote: > On 5 October 2010 07:45, zdenko podobny wrote: > > > > On Tue, Oct 5, 2010 at 12:36 AM, Malky wrote: > >> > >> I've compiled tesseract (and it works) but I don't know how to u

Released Windows installer (tesseract-ocr-setup-3.00.exe)

2010-10-12 Thread zdenko podobny
Windows installer for Tesseract-OCR 3.00 was released ( tesseract-ocr-setup-3.00.exe ). Features: - detection of installed Tesseract-OCR (Tesseract must be installed via the installer) - English languag

Re: 3.01 code

2010-11-28 Thread zdenko podobny
Just a note, in case somebody has not noticed yet: in svn (http://code.google.com/p/tesseract-ocr/source/checkout revision 527) there is 3.01 code that was built successfully on Linux (Mandrivalinux cooker 64bit) and Windows (XP SP3, VC++2008 Express). There is info about additional 3.01 code com

Re: 3.01 code

2010-11-30 Thread zdenko podobny
it linux > systems. I may fix this better tomorrow by removing the dependency on the > function that needs 1.67. This probably also breaks the Windows build. > > Ray. > > > > On Sun, Nov 28, 2010 at 4:21 AM, zdenko podobny wrote: > >> Just notice - if somebody did

Re: First use of tesseract

2011-01-27 Thread zdenko podobny
On Thu, Jan 27, 2011 at 7:21 PM, Grimble wrote: > Mandriva 2010.2 > Compiled tesseract 3.0 and Leptonlib-1.67, and moved eng.traineddata to > /usr/local/share/tessdata. Scanned one sheet with xsane to create out.tiff > When I ran tesseract, I get > [graeme@mozart ~]$ tesseract out.tiff nci.txt >

Re: what am i missing? tesseract runs but no output

2011-02-18 Thread zdenko podobny
Hi, Just a quick reply: I tried it on Windows XP with tesseract 3.00 and it produced a bad result (nothing useful). The IrfanView information dialog showed that the image has a resolution of 72x72 DPI -> too low... So I resampled it (with Lanczos algorithm) from 100% to 300% size, set DPI to 300 and decrease
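
The resampling step above was done in IrfanView; a comparable command-line route, swapped in here purely for illustration, is ImageMagick's "convert". The sketch assembles such a call (300% Lanczos resize, 300 DPI metadata) without running it. The function names and file names are invented for the example, and the step cut off at the end of the snippet is not covered.

```python
def scaled_size(width, height, percent=300):
    """Pixel dimensions after resizing by the given percentage."""
    return (width * percent // 100, height * percent // 100)


def build_resample_command(src, dst, percent=300, dpi=300):
    """Assemble (but do not run) an ImageMagick call that upsamples an image.

    -filter Lanczos selects the resampling filter, -resize N% scales the
    pixels, and -units/-density record the resolution in the output file.
    """
    return [
        "convert", src,
        "-filter", "Lanczos",
        "-resize", "%d%%" % percent,
        "-units", "PixelsPerInch",
        "-density", str(dpi),
        dst,
    ]


if __name__ == "__main__":
    # small.png / big.png are placeholder file names.
    print(scaled_size(640, 480))
    print(" ".join(build_resample_command("small.png", "big.png")))
```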

Re: VietOCR v2.0/3.1 & VietOCR.NET v2.0 Releases

2011-02-21 Thread zdenko podobny
Hello, can you please post a link where I can find the "speedy-ocr bash script"? Zd. On Tue, Feb 8, 2011 at 10:06 AM, SpeedyChair wrote: > Another way to prepare a PDF document for tesseract is to use the > 'convert' command from the ImageMagick package to split an image only PDF > file into a s

Re: [Tesseract 3] English training text

2011-02-22 Thread zdenko podobny
I doubt that Google will release their (full) training set :-( Have a look in svn at the file eng.cube.size [1]. There you can see the names of the fonts that were used for training English in 3.01. As far as I understand, there is an (unpublished/unreleased) possibility to train language data directly on font files.

Re: [Tesseract 3] English training text

2011-02-22 Thread zdenko podobny
to > appear in the source code but had no enough time to investigate it > thorougly > > Zdenko, would you please kindly share your other findings on Cube? > > Regards, > Dmitry > > On Tue, Feb 22, 2011 at 11:13 AM, zdenko podobny wrote: > > I doubt that google wi

Re: pixReadFromTiffStream: failed to read tiffdata

2011-02-25 Thread zdenko podobny
see: http://code.google.com/p/tesseract-ocr/issues/detail?id=340 http://code.google.com/p/tesseract-ocr/issues/detail?id=391 http://code.google.com/p/tesseract-ocr/issues/detail?id=443 Zdenko On Fri, Feb 25, 2011 at 9:15 AM, Nicolas Raoul wrote: > I get the following error on a TIFF created from

Re: pixReadFromTiffStream: failed to read tiffdata

2011-02-27 Thread zdenko podobny
> I get "failed to read tiffdata", which returns no results in Google, > so I believe it is a very new error, that has never been discussed > before. > > Thanks for your fast reply! > Nicolas Raoul > > On Feb 25, 6:57 pm, zdenko podobny wrote: > > see: >

Re: Tesseract 3.00 Released

2011-03-02 Thread zdenko podobny
On Sun, Oct 24, 2010 at 11:58 PM, Jimmy O'Regan wrote: > On 20 October 2010 23:15, Jimmy O'Regan wrote: > > On 21 October 2010 06:29, Jeffrey Ratcliffe > wrote: > >> Debian requires that each shared library have its own package. At the > >> moment, that would require the following extra package

Re: can't read frequent_words_list file

2011-03-04 Thread zdenko podobny
Please provide more information: how you are trying to create the dictionary, your platform, and the exact version of Tesseract (and maybe how you got it). Zdenko On Fri, Mar 4, 2011 at 2:50 PM, Sang Đặng Minh wrote: > hi all. my name is Sang. I'm trying to train Tessaract 2.0, everything > is ok, but i can't create DAWG

Re: What is everything I need for the linux version in English?

2011-03-20 Thread zdenko podobny
Did you try reading the wiki (http://code.google.com/p/tesseract-ocr/wiki/), e.g. ReadMe? Zdenko On Sun, Mar 20, 2011 at 2:20 AM, LAPIII wrote: > I read through the list on the Downloads page, but couldn't understand > everything I needed for an install on Linux. > > -- > You received this message

Re: Tesseract compilation on code blocks (gcc + mingw)

2011-03-22 Thread zdenko podobny
Hi, I tried (as an exercise ;-) ) to use cmake (http://cmake.org/) for building tesseract, because it would allow one build system to be used on several platforms. I have a first version (not very sophisticated :-) ) that works on Linux (with gcc) but I failed on Windows with mingw. The problem is that mingw mi

Re: tesseract.exe has stopped working on win2008 r2

2011-03-23 Thread zdenko podobny
Hi, tesseract is a command line tool. The item in the Windows menu is more or less just for testing purposes (it will not be present in the next version of the tesseract installer). If you need a GUI, have a look at VietOCR, PDF OCR X, lector etc. Zdenko On Wed, Mar 23, 2011 at 4:50 PM, moos3 wrote: > I have been t

Re: tesseract.exe has stopped working on win2008 r2

2011-03-26 Thread zdenko podobny
Convert it to PNG - you get a smaller picture with the same quality and tesseract should process it without problems. Zdenko On Fri, Mar 25, 2011 at 5:03 PM, Richard Genthner wrote: > Here is the screenshot and the tif file. Dmitri if you rename the .exe that > should work. I'm trying to get the tr

Re: tesseract.exe has stopped working on win2008 r2

2011-03-26 Thread zdenko podobny
On Fri, Mar 25, 2011 at 5:40 PM, Lutz, Michael wrote: > Hi, > > I just ran your tif file, I get no results, it must have something to do > with the size of the image. If I try to run a portion of tiff something > smaller than 1000x1000 then I get results. > > Can somebody explain why a tif size

Re: simple invocation of tesseract on ubuntu generates a single-byte output file

2011-03-26 Thread zdenko podobny
On Sat, Mar 26, 2011 at 2:34 PM, rpjday wrote: > long story short, i'm seeing this issue on my ubuntu 10.10 system: > > http://ubuntuforums.org/showthread.php?t=1599686 > > the packages i have installed: > > * tessearct-ocr > * tesseract-ocr-eng > > which version did you install? > i took a si

Re: simple invocation of tesseract on ubuntu generates a single-byte output file

2011-03-26 Thread zdenko podobny
On Sat, Mar 26, 2011 at 3:56 PM, Robert P. J. Day wrote: > On Sat, 26 Mar 2011, zdenko podobny wrote: > > > On Sat, Mar 26, 2011 at 2:34 PM, rpjday wrote: > > long story short, i'm seeing this issue on my ubuntu 10.10 system: > > > >http://ubuntu

Re: tesseract.exe has stopped working on win2008 r2

2011-03-27 Thread zdenko podobny
r message... > > Warm regards, > Dmitri Silaev > > > > > > On Sat, Mar 26, 2011 at 5:42 PM, zdenko podobny wrote: > > > > > > On Fri, Mar 25, 2011 at 5:40 PM, Lutz, Michael wrote: > >> > >> Hi, > >> > >> I just ran your tif

Re: tesseract.exe has stopped working on win2008 r2

2011-03-27 Thread zdenko podobny
On Sun, Mar 27, 2011 at 12:45 AM, TP wrote: > On Sat, Mar 26, 2011 at 7:42 AM, zdenko podobny wrote: > >> Can somebody explain why a tif size (2480x3508 @ 8BPP) is not processed? > > The test image has 16 bpp. > > Interesting. How did you get this information? I tried:

Re: tesseract.exe has stopped working on win2008 r2

2011-03-28 Thread zdenko podobny
utable, just rename it to tesseract.exe if it went through, it > is a release static build using Win7 and WinSDK 7.1 if anyone still wants > it. > > Regards, > Mike > > -Ursprüngliche Nachricht- > Von: Dmitri Silaev [mailto:daemons2...@gmail.com] > Gesendet: Samsta

Re: Newbie tesseract training question

2011-03-28 Thread zdenko podobny
Can you provide an example image file (TainingMontage.png)? Zdenko On Mon, Mar 28, 2011 at 11:12 PM, Robin wrote: > Hi, > > I'm reasonably new to tesseract and am trying to train it to recognise > hex characters from a dot matrix LED display. The characters are > clear and well spaced, but the bo

Re: Problem with Tesseract 3.00

2011-03-30 Thread zdenko podobny
Hi, unfortunately some fixes for the Windows build were committed after the 3.00 release (=revision 498). I thought about a 3.00.1 release (=revision 525) and as a "temporary solution" I created a 3.00.1 tesseract.exe (somebody asked for it). Then I changed my mind because it looks like developers

Re: disable newline in table layout recognition

2011-03-30 Thread zdenko podobny
On Wed, Mar 30, 2011 at 8:55 AM, Max Cantor wrote: > I had a similar issue. I couldn't get the config to work but basically > added this line to my code and it worked: > >api.SetPageSegMode(tesseract::PSM_SINGLE_COLUMN); > > For some reason, the tesseract binary doesn't pick up the config, b

Re: tesseract-3.01 compiling issue on linux

2011-04-07 Thread zdenko podobny
Did you also try a fresh svn checkout (e.g. delete the old local svn version or download to another directory)? Which Linux distribution do you use? Zdenko On Thu, Apr 7, 2011 at 11:30 PM, zl2k wrote: > I tried but no luck, 3.00 is compilable though. > > On Apr 7, 4:05 pm, Zdenko Podobný wrote: > > Did you run

Re: Tesseract OCR in daemon mode?

2011-04-08 Thread zdenko podobny
On Thu, Apr 7, 2011 at 9:33 PM, Mike Sandford wrote: > I don't know if it's strictly necessary for my application, but I am > trying to analyze anywhere from a few characters up to a few lines of > text rapidly. Tesseract is a portion of my application pipeline. > I've got my own document layout

Re: Problem with eng.traineddata after 3 or 4 successful runs against different pdf's

2011-04-14 Thread zdenko podobny
On Wed, Apr 13, 2011 at 2:31 AM, caudex wrote: > After using regedit and pointing tessdata_prefix to the right place > and running again I got an error that referred to unicharset. The > entire contents of my tessdata subdirectory is: > > Directory of C:\tesseract\Tesseract-OCR\tessdata > > 04/0

Re: Build files for the Tesseract OCR for android (Windows Xp)

2011-04-18 Thread zdenko podobny
Hi, the commands mentioned in the README are very simple and can be replaced by other Windows programs (e.g. for unpacking) or by gnuwin32 tools. If you can not do it by yourself, then I really suggest you invest your time in something else. You will face much more difficult tasks e.g. as far as i know no

Re: Difficulties to use Tesseract

2011-04-24 Thread zdenko podobny
Hello, I use tesseract on Mandrivalinux without problems. But I compiled it myself ;-) I am not satisfied with the packages provided by the Mandriva team ;-) e.g. they included tesseract 3.00 in cooker but without English language data, they did not include the leptonica library, which is used for image hand

Re: Difficulties to use Tesseract

2011-04-24 Thread zdenko podobny
Did you recompile tesseract? Can you send your out.tiff? (search the forum for problems/limitations of tiff images) -- Zdenko On Sun, Apr 24, 2011 at 7:21 PM, Giby_the_kid wrote: > Then after cheaking, libtiff is installed... I installed Leptonica, > but it still does not work :( > > On 24 avr, 18:4

Re: creating train data set for Korean

2011-04-28 Thread zdenko podobny
On Thu, Apr 28, 2011 at 6:03 PM, Oleg Tikhonov wrote: > Hi guys, > > I've installed tesseract-ocr 3.0 on Windows 7. All work fine if selected > language is English. > I tried to add/teach the system the Korean. The first step was creating > sample of data, I created some tiff files with Korean in

Re: creating train data set for Korean

2011-04-29 Thread zdenko podobny
usr/share/tessdata/kor.unicharset > > Thanks, > > --Oleg > > 2011/4/29 zdenko podobny > >> 2011/4/29 Oleg Tikhonov >> >>> Zdenko, Quan and Sven, >>> Thanks a lot for your suggestions, I think you nailed the problem, >>> So, I installed

Re: Deskew waves in a document

2011-05-07 Thread zdenko podobny
Hi, I am not sure if I understood your problem (e.g. if you are looking for a "dewarp" ("straighten text line") feature). In leptonica there are example programs for dewarping: dewarp_reg.c and dewarptest.c. I tried it on one of my projects, but it did not work on my images (e.g. I plan to play wi

Re: Deskew waves in a document

2011-05-07 Thread zdenko podobny
Here is a link to the leptonica dewarp documentation: http://tpgit.github.com/UnOfficialLeptDocs/leptonica/dewarping.html Zdenko On Sat, May 7, 2011 at 9:19 AM, zdenko podobny wrote: > Hi, > > I am not sure if I understood your problem (e.g. if you are looking for > "dewarp"

Re: Custom Wordlist without Retraining

2011-05-08 Thread zdenko podobny
see [1] or user-words on the same page. [1] http://code.google.com/p/tesseract-ocr/wiki/TrainingTesseract3#Putting_it_all_together Zdenko On Sun, May 8, 2011 at 5:53 PM, Max Cantor wrote: > Is there a way to set up a custom wordlist without going through the entire > retraining process? our w

Re: Custom Wordlist without Retraining

2011-05-08 Thread zdenko podobny
to get the component files for the > eng.trainneddata? > > sorry if i'm missing something obvious... > > max > On May 9, 2011, at 1:40 AM, zdenko podobny wrote: > > > see [1] or user-words on the same page. > > > > [1] > http://code.google.com/p/tes

Re: Custom Wordlist without Retraining

2011-05-09 Thread zdenko podobny
no problem :-) I think you will like option "-o" too. Zdenko On Mon, May 9, 2011 at 8:27 AM, Max Cantor wrote: > I feel really dumb now. Sorry for the bother. > > > Thanks, max > > On May 9, 2011, at 14:01, zdenko podobny wrote: > > Please try to re

Re: Catalan language

2011-05-11 Thread zdenko podobny
On Wed, May 11, 2011 at 9:22 PM, jinglada wrote: > In the /usr/share/tesseract-ocr/tessdata I have the following files: > > cat.DangAmbigs spa.DangAmbigs eng.DangAmbigs fra.DangAmbigs > por.DangAmbigs > cat.freq-dawg spa.freq-dawg eng.freq-dawg fra.freq-dawg > por.freq-dawg > cat.inttemp

Re: mftraining produces "Missing font_properties"

2011-05-17 Thread zdenko podobny
On Tue, May 17, 2011 at 9:08 AM, Eyal wrote: > Hi, > > I tried to train some letters & when I ran the *mftraining *with the > parameters*:* > *mftraining -U unicharset -O lang.unicharset font1.tr *I recieved an error > message: "Missing font_properties". > > I'm working on windows 7, visual studi

Re: mftraining produces "Missing font_properties"

2011-05-17 Thread zdenko podobny
On Tue, May 17, 2011 at 11:58 AM, Eyal wrote: > Quite a good guess, but I'm very disappointed to to say - I DID read the > documentation... > > And I even run the following command: > > *mftraining -F font_properties -U unicharset font1.tr* > > And I got results which don't show any error... : >

Re: About the jpn.traindata

2011-05-17 Thread zdenko podobny
On Tue, May 17, 2011 at 5:01 PM, Илья wrote: > IMHO alphabets can't be protected by copyright. > > Mostafa did not ask for alphabets. He asked for 'all the tif files that used for creating...' and the content of a tiff file (e.g. scanned books) could be protected by copyright. -- > Best regards

Re: Issue 490 - Exception while training with mftraining & cntraining

2011-05-19 Thread zdenko podobny
On Thu, May 19, 2011 at 9:38 AM, Eyal wrote: > I've opened this issue: > http://code.google.com/p/tesseract-ocr/issues/detail?id=490&start=100 > Afterward > I've noticed that there's alreadya similar issue 382: > http://code.g

Re: Issue 490 - Exception while training with mftraining & cntraining

2011-05-19 Thread zdenko podobny
I did it. Zdenko On Thu, May 19, 2011 at 11:52 AM, Eyal wrote: > I didn't find a way to mark an issue as duplicate. > > -- > You received this message because you are subscribed to the Google > Groups "tesseract-ocr" group. > To post to this group, send email to tesseract-ocr@googlegroups.com >

Re: About the jpn.traindata

2011-05-19 Thread zdenko podobny
e that contains all supported alphabetics characters. > > Also, Parts of scanned books could not be protected by copyright. > > > > Can you give any contacts of "jpn.traindata" dev team? > > > > -- > > Best regards, > > Ilia. > >

Re: mftraining produces "Missing font_properties"

2011-05-19 Thread zdenko podobny
On Wed, May 18, 2011 at 1:15 PM, Eyal wrote: > WOW!!! > > It worked. > > If you'll look again at the training manual, you'll see that there wasn't a > combination of both -F & -O and that's why I didn't write such command. > > I will try to improve the wiki pages (e.g. AddOns) in the next few days. If you or
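
For reference, the combination of -F and -O that this thread converged on can be sketched as below. The flags and file names (font_properties, unicharset, lang.unicharset, font1.tr) are the ones quoted in the thread; the helper names are invented for the example, and the font_properties line layout (font name followed by five 0/1 flags for italic, bold, fixed, serif, fraktur) is taken from the Tesseract 3 training documentation rather than from these messages. The command is only assembled, not run.

```python
def font_properties_line(font, italic=False, bold=False, fixed=False,
                         serif=False, fraktur=False):
    """One font_properties entry: <fontname> <italic> <bold> <fixed> <serif> <fraktur>."""
    flags = [italic, bold, fixed, serif, fraktur]
    return font + " " + " ".join(str(int(flag)) for flag in flags)


def build_mftraining_command(tr_files, lang="lang"):
    """Assemble (but do not run) mftraining with both -F and -O given."""
    return ["mftraining",
            "-F", "font_properties",
            "-U", "unicharset",
            "-O", lang + ".unicharset"] + list(tr_files)


if __name__ == "__main__":
    # "font1" mirrors the font1.tr file name used in the thread.
    print(font_properties_line("font1", serif=True))
    print(" ".join(build_mftraining_command(["font1.tr"])))
```

The font name in font_properties has to match the font part of the training file names, so one line per .tr font is expected.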

Re: can't compile tesseract on win7 with visual C++ 2010 express

2011-05-20 Thread zdenko podobny
Hi, It is written on the main page: the supported platform is Windows (x86/32) with Visual C++ Express 2008 [1]. As I heard, it is not a big problem to compile the VS2008 project files (in directory vs2008) in VS2010 ;-) In the svn version there is also initial support for VS2010 (directory vs2010) created by Micha

Re: can't compile tesseract on win7 with visual C++ 2010 express

2011-05-20 Thread zdenko podobny
help > > Thanks > > Sarel > > > > > > > On Fri, May 20, 2011 at 2:15 PM, zdenko podobny wrote: > >> Hi, >> >> It is written on main page: supported platform is Windows (x86/32) with >> Visual C++ Express 2008 [1]. As I heard it is not a big proble

Re: Tesseract 3.01 Training and Error opening unicharset file

2011-05-21 Thread zdenko podobny
On Fri, May 20, 2011 at 4:44 PM, Holm Dressler wrote: > Hi there, > > I want to create tessdata files on a given tiff on my Linux system. My > tiff is called k05.tif > > I used the description on > > http://aravindavk.in/view/tesseract_ocr_initial_setup > > which means I do the following step

Re: Create traineddata from different tif and box files

2011-05-26 Thread zdenko podobny
Hi, The problem is that you use the latest version but do not read the latest manual [1]. If I correctly understood that German manual (via google translate), it is for version 3.00, so it does not follow the changes in version 3.01. Another "problem": 3.01 is not released yet. It is for developers and

Re: Create traineddata from different tif and box files

2011-05-26 Thread zdenko podobny
3.01 :-) Some good information could be found in tesseract forums. All links are on main project page. Surprisingly ;-) Zdenko Thanks > > Sarel > > > > > On Thu, May 26, 2011 at 1:33 PM, zdenko podobny wrote: > > Hi, > > Problem is that you use the latest version

Re: Create traineddata from different tif and box files

2011-06-01 Thread zdenko podobny
EE: step 6 was missing (with >> which means you should have > two lines in your font_properties) > > > So Jimmi: now it is your turn :-) > > Talk soon > > Holm > > > > On May 26, 2:23 pm, zdenko podobny wrote: > > On Thu, May 26, 2011 at 2:02 PM,

Re: Building Tesseract with VC2008

2011-06-06 Thread zdenko podobny
Hello, leptonlib* is the leptonica library. You can download it (version 1.67) from http://code.google.com/p/leptonica/downloads/list. Or grab the latest svn version of tesseract (the leptonica library is included there). Other issues are solved there as well... Zdenko On Tue, Jun 7, 2011 at 2:05 AM, David Amazin

Re: When make , there are errors !!!!!!!!

2011-06-09 Thread zdenko podobny
I am sorry but I do not have a crystal ball ;-) Please provide the necessary details (what version of OS you use, exact version of tesseract, your compilation steps...) Zdenko On Wed, Jun 8, 2011 at 9:02 PM, ビ wrote: > make all-recursive > Making all in ccstruct > /bin/sh ../libtool --tag=CXX

Re: Tesseract doesn't work with a very simple example

2011-06-17 Thread zdenko podobny
First of all - please read the documentation, e.g. [1]. It can save you time ;-). [1] http://code.google.com/p/tesseract-ocr/wiki/FAQ#Is_there_a_Minimum_Text_Size?_(It_won't_read_screen_text!) Zdenko On Fri, Jun 17, 2011 at 4:05 PM, Felipe Coutinho wrote: > Hello, > > I'm a new tess user. I'm tryin

Re: Training procedure

2011-06-21 Thread zdenko podobny
If you got error on font_properties file, send also font_properties ;-) Zdenko On Tue, Jun 21, 2011 at 2:45 PM, Esteban Bordón wrote: > For example using these files provides in > http://tesseract-ocr.googlecode.com/files/boxtiff-2.01.spa.tar.gz and the > command lines bellow > > *]$ tesseract

Re: Training procedure

2011-06-21 Thread zdenko podobny
. > > 2011/6/21 zdenko podobny > >> If you got error on font_properties file, send also font_properties ;-) >> >> Zdenko >> >> On Tue, Jun 21, 2011 at 2:45 PM, Esteban Bordón wrote: >> >>> For example using these files provides in >>>

Re: Creating DLL for tessract3

2011-06-22 Thread zdenko podobny
Please read the ReadMe [1]. Unfortunately tessdll was not removed from the source in time, so it became part of the source code released as version 3.00. But it does not work. Have a look at the tesseract-dev forum [2] and search there for 'tessdll' (and maybe for 'wrapper' to get a better overview). Look at AddOns

Re: Creating DLL for tessract3

2011-07-09 Thread zdenko podobny
On Sat, Jul 9, 2011 at 2:21 PM, Sarel van der Merwe wrote: > look at this thread... > https://mail.google.com/mail/?shva=1#inbox/130fd043420179a8 > > Why send a link to (your) Gmail inbox? > > On Sat, Jul 9, 2011 at 1:38 PM, Alexander Lubyagin > wrote: > > On Jun 22, 4:09 pm, sisi wrote: >

Re: Compiling Tesseract SVN under windows

2011-07-11 Thread zdenko podobny
Hi, at the moment only Visual C++ Express 2008 (it is free) is supported on Windows (x86/32). In svn there is also support for VC2010... Zdenko On Mon, Jul 11, 2011 at 10:45 AM, wrote: > Hi all, > > It seems nobody knows how to compile tesseract using cygwin. > > Now I want to ask what is

Re: read_params_file

2011-07-26 Thread zdenko podobny
On Mon, Jul 25, 2011 at 10:50 PM, Donald Hume wrote: > I have used the tesseract.exe from the SVN for Tesseract 3.0. You can > see in this screenshot: http://j.drhu.me/TesseractBox.png > > The image says you use tesseract 3.01 (not 3.0) - i.e. you had to compile Tesseract yourself. As Dmitri pointe

Re: Memory management in Tesseract

2011-07-27 Thread zdenko podobny
see: http://code.google.com/p/tesseract-ocr/source/browse/trunk/api/baseapi.h#245 Zdenko On Wed, Jul 27, 2011 at 6:01 AM, Sandeep Parmar wrote: > Hello everyone, > > I am using the following code snippet, within this I would like to know > whether 'GetUTF8Text' will destroy my source image 'arr

Re: Re: query on French Script MT tif images

2011-07-27 Thread zdenko podobny
If you are really interested in help, then provide an example image ;-) Zdenko On Wed, Jul 27, 2011 at 11:45 AM, wrote: > Hi, > > When i run the command tesseract fsmt.tif output > it shows me some junk data "ȉY`I'I/2," for image with having "Mentally" > as the text in this font. > > Any idea p

Re: Problem with training Tesseract 3.01 (svn r596)

2011-07-28 Thread zdenko podobny
As always - can you please send an example image + box file? Zdenko On Thu, Jul 28, 2011 at 9:26 AM, Sandeep Parmar wrote: > Hi, > I am using English language fonts like 'Comic sans MS', 'Times','Arial' > etc. > > > On Thu, Jul 28, 2011 at 12:50 PM, Sriranga(78yrsold) < > withblessi...@gmail.com>

Re: [r596] "Error opening data file ./tessdata/xxx.traineddata"

2011-07-28 Thread zdenko podobny
On Thu, Jul 28, 2011 at 11:17 AM, 73r0 wrote: > Hi, > > I downloaded the last revision of Tesseract (r596) from SVN. Then I > built with vs2010 without any issues to get the tesseract.exe file. I > added the path of tesseract.exe in the PATH environment variable so > i can call tesseract in a s

Re: Problem with training Tesseract 3.01 (svn r596)

2011-07-28 Thread zdenko podobny
;-) ) My build can be found here [1] (built with VS 2008 on Windows XP SP3). I just compressed it with upx to get a smaller exe. Zdenko [1] https://github.com/zdenop/qt-box-editor/downloads Thanks > Sandeep > > > On Thu, Jul 28, 2011 at 3:36 PM, zdenko podobny wrote: > >> I run (svn r

Re: Problem with training Tesseract 3.01 (svn r596)

2011-07-28 Thread zdenko podobny
iles? > With warmest Regards, > -sriranga(78yrs) > > You can build it yourself. This is not an official release. It was published just to test one step of training. > > On Thu, Jul 28, 2011 at 5:32 PM, zdenko podobny wrote: > >> >> >> On Thu, Jul 28, 2

Re: "Error opening data file ./tessdata/xxx.traineddata"

2011-07-28 Thread zdenko podobny
AFAIR this error means that the version of the tessdata file(s) (xxx.traineddata) does not match the version of tesseract. Check that you have 3.01 data files (traineddata) in the tessdata folder. Zdenko On Thu, Jul 28, 2011 at 2:09 PM, 73r0 wrote: > Thanks a lot for the answer that was the problem. It was obvious
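
As a first step for the check suggested above, the following minimal Python sketch lists which traineddata files are actually present in a tessdata folder. The function name list_traineddata and the default path ./tessdata are assumptions made for illustration; the script only reports file names and sizes and cannot tell whether a file was built for 3.00 or 3.01.

```python
from pathlib import Path


def list_traineddata(tessdata_dir):
    """Return sorted (language, size_in_bytes) pairs for *.traineddata files.

    Hypothetical helper: it only shows what is in the folder; it does not
    verify that the data files match the installed tesseract version.
    """
    folder = Path(tessdata_dir)
    return sorted(
        (path.stem, path.stat().st_size)
        for path in folder.glob("*.traineddata")
    )


if __name__ == "__main__":
    import sys

    # Assumed default location; pass another folder as the first argument.
    target = sys.argv[1] if len(sys.argv) > 1 else "./tessdata"
    for lang, size in list_traineddata(target):
        print(f"{lang}\t{size} bytes")
```

If the language you pass to tesseract is missing from the output, the data file is not where tesseract looks for it; if it is present, the version mismatch described above is the more likely cause.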

Re: Error in Box Train for Tesseract3.01(svn r596)

2011-07-30 Thread zdenko podobny
On Sat, Jul 30, 2011 at 6:43 AM, Sandeep Parmar wrote: > Dear all, > > I am getting following error while training the Box files > > First of all: always write the exact command you use! > "read_params_file: parameter not found: tessedit_use_nn" > > This means that you have in your config file para
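
To locate the line that triggers a "parameter not found" message, a small Python sketch like the one below can search a config file for the reported name. It assumes the usual config layout of one "name value" pair per line; the helper name find_param_lines and the sample text (including the value 0) are made up for illustration and are not taken from the thread.

```python
def find_param_lines(config_text, name):
    """Return 1-based line numbers whose first token equals `name`."""
    hits = []
    for number, line in enumerate(config_text.splitlines(), start=1):
        tokens = line.split(None, 1)  # split on the first run of whitespace
        if tokens and tokens[0] == name:
            hits.append(number)
    return hits


# Hypothetical config content; only the parameter names come from the list.
sample = "tessedit_char_whitelist 0123456789\ntessedit_use_nn 0\n"
print(find_param_lines(sample, "tessedit_use_nn"))  # -> [2]
```

In practice you would read the text from the config file named on your tesseract command line and remove or correct the reported line.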
