Re: Fonts in tesseract 3.01

2011-08-03 Thread zdenko podobny
Did you try to open 'eng.traineddata' in some text viewer and search for "arial"? Zdenko On Wed, Aug 3, 2011 at 1:51 PM, Sandeep Parmar wrote: > Dear all, > > can anyone tell me which fonts are included in 'eng.traineddata' of > tesseract 3.01? > > Regards > Sandeep > > -- > You received this me

Re: Tesseract for recognising birth dates/names

2011-08-04 Thread zdenko podobny
Hello, first of all, please provide an example image. Then people can suggest some improvements. Zdenko On Thu, Aug 4, 2011 at 10:44 PM, sydd wrote: > Hello > > I need to make an OCR application, that can recognise birth dates, for > example: > "John, Smith" > "02/01/2011" > "John, S

Re: Tesseract for recognising birth dates/names

2011-08-05 Thread zdenko podobny
> be OCRed separately) > > On Aug 5, 8:17 am, zdenko podobny wrote: > > Hello, > > > > first of all - please provide example image. Than people can make you > > suggest you some improvement. > > > > Zdenko > > > > > > > > On Thu, A

Re: Configure tesseract-ocr

2011-08-08 Thread zdenko podobny
What do you want to configure, and what did you try? On Tue, Aug 9, 2011 at 6:53 AM, Cheyvarman wrote: > Anyone, can tell me how to configure tesseract-ocr any version in > windows? > It's not worked to configure it via instruction :( > Thanks in advance > > -- > You received this message because yo

Re: Tesseract 3.01 Training : Error opening unicharset file

2011-08-09 Thread zdenko podobny
Can you send the output of the "dir" command from the folder where you run "combine_tessdata fra."? On Tue, Aug 9, 2011 at 5:07 PM, ReneFR wrote: > Hi there, > > version 3.01 ; Training procedure ; windows XP SP3. > > the command "combine_tessdata fra." (i didn't forget the "." !) , > doesn't work and disp

Re: Configure tesseract-ocr

2011-08-09 Thread zdenko podobny
k google translate for help :-) ) but not to speak (and it's a shame because I learned Russian for many years). Zdenko > Thanks, Cheyvarman! > Best regards! > > 2011/8/9 zdenko podobny > >> What you want to configure and what did you try? >> >> On Tue, Aug 9

Re: Getting different (garbage) text by running tesseract from different folders

2011-08-12 Thread zdenko podobny
Can you please provide the image file and say which version of tesseract you used? Zdenko On Fri, Aug 12, 2011 at 9:03 AM, Parmeet bhatia wrote: > Hi All, > > I observe something pretty strange but could not figure out whats the > problem. When i run tesseract from command line from the same folder

Re: Getting different (garbage) text by running tesseract from different folders

2011-08-12 Thread zdenko podobny
le. The version is 3.0 >> > Some extra info. : I am doing automatic page layout before giving it to >> > tesseract but sometimes non-text blocks also got detected. The attached >> > image is one example. With proper text blocks, the results are same >> > but surpri

Re: Configure tesseract-ocr

2011-08-17 Thread zdenko podobny
9, 2011 at 11:56 AM, Sovila Srun >> wrote: >> > Thanks a lot, Zdenko! Now, I successfully configured. >> > I have a question to you. I would like to train to system for Khmer >> > language, do you have any comments about this? From what I need to start >> it.

Re: Ang.: Re: logging data with web camera and tessaract

2011-08-18 Thread zdenko podobny
I tried it with version 3.01 and: tesseract download\webcam4-2.jpg webcam4-2 produced an empty page. BUT: tesseract download\webcam4-2.jpg webcam4-2 -psm 7 produced the correct result... Zdenko On Fri, Aug 19, 2011 at 6:54 AM, Andres wrote: > Hi Andriy, > > I'm using Tesseract 2.04 > > I don't remember

Re: Ang.: Re: logging data with web camera and tessaract

2011-08-19 Thread zdenko podobny
it and could not find an > answer. When I try to run "tesseract.exe webcam4-2.jpg text -psm 7" I > get: > read_variables_files: Can't open psm > read_variables_files: Can't open 7 > > I googled that error as well and also got not much help. > Can you try also pr

Re: Configure tesseract-ocr

2011-08-23 Thread zdenko podobny
wLThiMjUtODdlZDI2N2ExMzYx&hl=en_US > > > https://docs.google.com/leaf?id=0B9BTtR5QkyOgZmRhMWFkODAtZDQ5OS00OWY5LTk5ZmUtZWRlZTc0N2ExMGZi&hl=en_US > > > https://docs.google.com/leaf?id=0B9BTtR5QkyOgZTE3N2RlZWMtYjRjNi00NTkyLTljZDQtOTgwNDljNmQ3ZDhi&hl=en_US > > > > > > > > On Wed,

Re: Configure tesseract-ocr

2011-08-23 Thread zdenko podobny
> That's my opinion. > > Warm regards, > Dmitri Silaev > www.CustomOCR.com > > > > > > On Tue, Aug 23, 2011 at 11:13 AM, zdenko podobny wrote: > > > > On Tue, Aug 23, 2011 at 9:01 AM, Dmitri Silaev > > wrote: > >> > >> He-he, IMHO is a wa

Re: Windows CreateProcess() scheduling

2011-08-24 Thread zdenko podobny
On Wed, Aug 24, 2011 at 4:26 AM, tteveris wrote: > Installed Tesseract 3.0 and then copied the installed folder to the > following location. > > > C:\Program Files (x86)\XXX\Software\OCR\Tesseract-OCR > C:\Program Files (x86)\XXX\Software\OCR\Tesseract-OCR\doc > C:\Program Files (x86)\XXX\Softwar

Re: training directory not in tarball

2011-09-05 Thread zdenko podobny
On Sun, Sep 4, 2011 at 10:04 PM, Muriel VD wrote: > Hi, > I tried downloading the 3.0 tarball form the site.. > please be specific - can you provide link? > make fails to find (the non existing) training directory. > training directory is part of official 3.00 tarball you can check it ;-) If

Re: Multiple columns text.

2011-09-09 Thread zdenko podobny
On Fri, Sep 9, 2011 at 9:48 AM, Slavko Kocjancic wrote: > Hello > > About digging I find the why didn't work... > In download section there is main download tesseract-ocr-setup-3.00.exe > > A

Re: combine_tessdata do not make a bnm.trainneddata in tessdata folder

2011-09-15 Thread zdenko podobny
Yes, you are right: combine_tessdata does not make a bnm.trainneddata in the tessdata folder. It never promised to do so. If you need a reply with more details, you must first provide details :-) E.g. OS, tesseract/combine_tessdata version, the exact command you ran, the content of the directory before/after running c

Re: Trainneddata on Linux vs Windows

2011-09-27 Thread zdenko podobny
yes, you can. On Tue, Sep 27, 2011 at 4:23 PM, merve t wrote: > Hello, > I know you are about to kick me from the list for too much asking but > > can i use trainneddata in Linux, even if i generate the trainneddata in > Windows? > > -- > You received this message because you are subscribed to t

Re: Compiling and Running on AIX

2011-09-29 Thread zdenko podobny
First of all, tell us which version/svn revision you are trying to use ;-) Next: if you have a problem (after changing code or testing a new installation), use the example images that are part of tesseract for testing: eurotext.tif and phototest.tif Zdenko On Thu, Sep 29, 2011 at 5:05 AM, Micha

Re: Convert TIF to PDF

2011-09-30 Thread zdenko podobny
Tesseract converts an image to txt or html/hocr. I remember that there are some tools that use tesseract for adding text to djvu. A quick check with google brought up this: http://www.djvu.org/forum/phpbb/viewtopic.php?t=765 http://en.wikisource.org/wiki/Help:DjVu_files/OCR_with_Tesseract http:

Re: Installation

2011-09-30 Thread zdenko podobny
On Fri, Sep 30, 2011 at 6:20 PM, merve t wrote: > Hello, > I installed tesseract-ocr software on ubuntu linux by downloading from svn > and compiling and making and installing. > Now i am looking at /usr/include directory and can not see tesseract > related headers. > > You should not look at "/u

Re: Illegal feature parameter spec!

2011-10-01 Thread zdenko podobny
try to post your files ... On Sat, Oct 1, 2011 at 10:48 AM, merve t wrote: > i have tried to delete first and last char but i can get no positive > result. > Thanks for sharing idea > > > 2011/9/30 Calomer > >> Sriranga, >> >> Quoting original poster of the post I've linked: >> >> "This bug doe

Re: Tesseract crashes while converting image on win7 machine

2011-10-18 Thread zdenko podobny
unsupported image type. On windows only debug version of Leptonica library produces error messages to terminal (e.g. you should build tesseract against debug version of leptonica if you want to see error messages produced by leptonica). Have a look at [1]. On linux I see this: Tesseract Open Sour

Re: Using Tesseract with a typed document

2011-10-22 Thread zdenko podobny
On Fri, Oct 21, 2011 at 11:53 PM, Giby_the_kid wrote: > I tried tesseract on a typed document, an old, huge and very important > document. > There is a long time I haven't used tesseract (and never on this > computer), then I do not know if the trouble come from the document > itself, from xsane w

Re: Using Tesseract with a typed document

2011-10-22 Thread zdenko podobny
It looks like there is a BOM [1]. I tested it today and was able to compile it without problems on Mandriva Linux and Windows XP. Try to open that file in an editor and remove the whole comment section (from /* to */)... or replace the file with this [2] file. Zdenko [1] http://en.wikipedia.org/wiki/B

Re: Using Tesseract with a typed document

2011-10-22 Thread zdenko podobny
non-MS Windows platforms. [1] http://tesseract-ocr.googlecode.com/svn/trunk/ccutil/strngs.h [2] find . -type f|while read file;do [ "`head -c3 -- "$file"`" == $'\xef\xbb\xbf' ] && echo "found BOM in: $file";done On 22 oct, 14:05, zdenko podobny
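The shell one-liner in [2] checks whether the first three bytes of each file are the UTF-8 BOM (EF BB BF). A rough Python equivalent is sketched below for platforms without a POSIX shell. The function names (`has_utf8_bom`, `find_bom_files`, `strip_utf8_bom`) are made up for this illustration and are not part of tesseract; stripping the BOM is one possible fix, offered as an assumption, alongside the suggestions in the thread (removing the comment section or replacing the file).

```python
import os

# The three bytes that make up a UTF-8 byte order mark.
UTF8_BOM = b"\xef\xbb\xbf"


def has_utf8_bom(path):
    """Return True if the file starts with the UTF-8 BOM bytes."""
    with open(path, "rb") as handle:
        return handle.read(3) == UTF8_BOM


def find_bom_files(root):
    """Walk `root` and return a sorted list of regular files that start with a BOM."""
    found = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            if os.path.isfile(path) and has_utf8_bom(path):
                found.append(path)
    return sorted(found)


def strip_utf8_bom(path):
    """Rewrite the file without its leading BOM; return True if anything changed."""
    with open(path, "rb") as handle:
        data = handle.read()
    if not data.startswith(UTF8_BOM):
        return False
    with open(path, "wb") as handle:
        handle.write(data[len(UTF8_BOM):])
    return True


if __name__ == "__main__":
    for bom_file in find_bom_files("."):
        print("found BOM in:", bom_file)
```

Run from the top of the source tree, this prints the same kind of "found BOM in: ..." lines as the shell version.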

Re: Is there minimum of letters?

2011-10-25 Thread zdenko podobny
On Mon, Oct 24, 2011 at 8:41 PM, patrickq wrote: > What's PSM? Alternative spelling for PMS :-)? > > See: $ tesseract Usage:tesseract imagename outputbase [-l lang] [-psm pagesegmode] [configfile...] pagesegmode values are: 0 = Orientation and script detection (OSD) only. 1 = Automatic page segme
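The usage line quoted above (`tesseract imagename outputbase [-l lang] [-psm pagesegmode] [configfile...]`) fixes the order of the arguments. A minimal sketch of building such a command from Python follows; `build_tesseract_command` is a hypothetical helper name, and the `-psm` spelling is the one shown in this thread for the 3.0x command line. Since the list of pagesegmode values is cut off in the archive, the helper does not validate the mode.

```python
def build_tesseract_command(image, output_base, lang=None, psm=None, configs=()):
    """Build an argument list in the order given by the usage text:
    tesseract imagename outputbase [-l lang] [-psm pagesegmode] [configfile...]
    """
    command = ["tesseract", image, output_base]
    if lang is not None:
        command += ["-l", lang]
    if psm is not None:
        # The mode is passed through unchecked; see `tesseract` usage for valid values.
        command += ["-psm", str(psm)]
    command += list(configs)
    return command


if __name__ == "__main__":
    # Example from another thread in this archive: single-line page segmentation mode.
    print(build_tesseract_command("webcam4-2.jpg", "webcam4-2", psm=7))
```

The resulting list can be handed to `subprocess.run(command)`, which avoids quoting problems when a path contains spaces.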

Re: Double or single Digit detection

2011-10-28 Thread zdenko podobny
'-psm 8' works for me On Fri, Oct 28, 2011 at 1:37 PM, Diez B. Roggisch wrote: > Hi all, > > I'm trying to detect page-numbers in an otherwise empty book. Using > the OpenCV I can extract the page-number, align it, and threshold a > good B/W picture out of it. > > However, when running tesseract

Tesseract 3.01 Released

2011-10-30 Thread zdenko podobny
Hello all, Tesseract 3.01 has been released and you can find it in the download section [1] or on the Project page in the "Featured" section. The Windows installer was built on Windows XP SP3 with VC++ 2008 Express, so you may need the Microsoft Visual C++ 2008 SP1 Redistributable Package (x86) [2]. Tesseract.

Re: Odd behaviour with scan resolution

2011-10-30 Thread zdenko podobny
At the moment I am working on the digitization (and tesseract training ;-) ) of an old book set in a Fraktur font. I got it as jpeg ;-) photos (size 1400 x 2127); one page is usually 6 Mb. I only deskewed the pages, fixed the geometry (baseline) and converted them to png. Interesting part (at least for me): an

Re: From the ReadMe - "The dll isn't supported in Tesseract-OCR 3.00"

2011-10-30 Thread zdenko podobny
On Fri, Oct 28, 2011 at 8:17 AM, Slavko Kocjancic wrote: > Just simple question... > > Why dll is removed from 3.00 at all? > Because it was not working and nobody fixed it. Feel free to create it. This is IMHO our task as tesseract community :-) . Those who are interesting could join to tesser

Re: get image direction ?

2011-11-01 Thread zdenko podobny
Have you tried "-psm 1" (e.g. 'tesseract yourImage.png output -l fra -psm 1')? It is not perfect. I get better results if I rotate the image myself ;-) but... it is there Zdenko On Mon, Oct 31, 2011 at 3:32 PM, speeder wrote: > I will love a release of Tesseract that auto-rotates. > > > On Mon, O

Re: get image direction ?

2011-11-03 Thread zdenko podobny
> > if i use "-psm 1", i became the message: "Warning orientation and > script detection requested, but osd language faild to load". > > What that this mean? I have installed 3.01 on Windows. > > adlerfalke > > On 1 Nov., 21:58, zdenko podobny wrote:

Re: Using other languages - installation goes wrong!!! Need help!

2011-11-06 Thread zdenko podobny
please try new version of installer: http://tesseract-ocr.googlecode.com/files/tesseract-ocr-setup-3.01-1.exe Zdenko On Sun, Nov 6, 2011 at 4:52 AM, Joe wrote: > Windows-Installer told me more than one time (s. below): > > "tgz_extract: bad header checksum > Error: Failure reading from tarball.

Re: on the iOS platform?

2011-11-08 Thread zdenko podobny
I am not sure what do you mean with SDK for tesseract, but regarding iOS have a look on this article: http://tinsuke.wordpress.com/2011/11/01/how-to-compile-and-use-tesseract-3-01-on-ios-sdk-5/ Zdenko On Mon, Nov 7, 2011 at 5:29 PM, ArtY wrote: > Anyone know if the SDK for tesseract is ported s

Re: Svn installation

2011-11-10 Thread zdenko podobny
On Wed, Nov 9, 2011 at 6:02 PM, Merve Temizer wrote: > Hi, > I run these commands on ubuntu > > apt-get install autoconf automake > > svn checkout http://tesseract-ocr.googlecode.com/svn/trunk/ tesseract-ocr > > > i can see "Checked out revision 640." > > cd tesseract-ocr > ./runautoconf > > > >

Re: bad detection of numbers with leading zeros - and a bug

2011-11-12 Thread zdenko podobny
Can you send an example image where tesseract detects a number with a leading zero? Zdenko On Fri, Nov 11, 2011 at 9:56 PM, Diez B. Roggisch wrote: > Hi, > > I'm trying to detect page numbers in a book. Contrary to normal page- > numbers, the ones below 10 are written with a leading zero - 01, > 02

Re: tesseract in c++?

2011-11-13 Thread zdenko podobny
On Sun, Nov 13, 2011 at 5:32 AM, cyrt wrote: > Could someone please explain how I can use tesseract in c++? use api/tesseractmain.cpp api/tesseractmain.h as example > I > downloaded the source files via svn and compiled the solution. Which > libraries do I now have to link and where do I find

Re: using opencv image in tesseract

2011-11-15 Thread zdenko podobny
try to have a look at this issue[1] - somebody sent there (python binding for tesseract) a patch to set OpenCV image directly to tesseract Zdenko [1] http://code.google.com/p/python-tesseract/issues/detail?id=8 On Mon, Nov 14, 2011 at 7:33 PM, cyrt wrote: > I'd like to perform OCR on subimage

Re: Hungarian language data

2011-11-24 Thread zdenko podobny
On Wed, Nov 23, 2011 at 11:14 AM, Örs wrote: > Hi there, > > I would like to run tesseract on Ubuntu 11.10 under gImageReader. I > have downloaded hun.traineddata.gz hun.traineddata.gz is tesseract 3.0x language data file. > file and extracted it into the /usr/ > share/tesseract-ocr/tessdata

Re: error while loading shared libraries: libtesseract.so.3: cannot open shared object file: No such file or directory

2011-11-24 Thread zdenko podobny
On Thu, Nov 24, 2011 at 12:36 PM, chethan wrote: > hi, > > i have installed tesseract 3.01 in ubuntu 11.10 from this link given > below. > http://code.google.com/p/tesseract-ocr/wiki/ReadMe - linx > > i have followed all the steps and installed, when i am > executing...command: tesseract ~/input

Re: newbe for ocr

2011-11-27 Thread zdenko podobny
On Sun, Nov 27, 2011 at 8:34 AM, dt wrote: > hi, > > i need to train the software for Hebrew language. > > can someone help on how to do it from the beginning. > > let say i have image with text "everybody! like to ?go to the (sea) of > love_ " > (in hebrew) > how do i train it to work? > > > 1.

Re: using tesseract hocr output to create a searchable PDF

2011-11-30 Thread zdenko podobny
Just a remark: in 2008 Mihail Radu Solcan posted 2 articles [1], [2] about adding text to DjVu files. I am not sure if there are such possibilities/tools for pdf. Anyway, he used a box file for this task (hocr was not available). You did not specify a language, but in the case of python try to have a

Re: Tesseract 3.00

2011-12-07 Thread zdenko podobny
Regarding "FreeOCR V3" you should ask authors of " FreeOCR V3" why it did not work... Zd. On Tue, Dec 6, 2011 at 5:33 PM, Onion wrote: > Hi, it's me again. Trying to figure this out 9 months later :) > > I opened FreeOCR V3. > I opened a scanned jpeg image. The image is two pages of a book with

Re: Tesseract 3.00

2011-12-08 Thread zdenko podobny
Have a look at http://code.google.com/p/tesseract-ocr/wiki/AddOns#GUI Zdenko On Wed, Dec 7, 2011 at 10:42 PM, Onion wrote: > That occurred to me, but I could not find a link to contact them, ie there > was no FreeOCR forum to be found :( > > Is there another way of achieving the results of tra

Re: Need a Visual Basic support

2011-12-13 Thread zdenko podobny
have a look at https://groups.google.com/group/tesseract-dev/browse_thread/thread/75be5c97eb4d1b3c Zdenko On Tue, Dec 13, 2011 at 5:35 AM, Lahiru Himash Madusanka < lahiru.lahirumadusa...@gmail.com> wrote: > I'm developing a tesseract gui from Visual Basic 2008. I need a Visual > Basic supporti

Re: The most simple use of tesseract in C++ app

2011-12-17 Thread zdenko podobny
You can try this: $ convert -rotate 10 phototest.tif phototest-r.png $ g++ -o test test.cpp -I/usr/local/include/tesseract/ -I/usr/local/include/leptonica/ -L/usr/local/libs -ltesseract $ ./test where: *phototest.tif* is from tesseract source *convert* - is part of imagemagick. First line is not

Re: using tesseract for a credit card reader

2011-12-27 Thread zdenko podobny
http://code.google.com/p/tesseract-ocr/issues/detail?id=574&can=1&q=card Maybe I am wrong - but I can not imagine legal reason to OCR credit cards... For legal reason I guess there are solutions ready... Zdenko On Tue, Dec 27, 2011 at 6:24 PM, Sven Pedersen wrote: > Hi Roy, > I think tesseract

Re: Error In Code

2011-12-30 Thread zdenko podobny
On Fri, Dec 30, 2011 at 5:14 AM, Lahiru Himash Madusanka < lahiru.lahirumadusa...@gmail.com> wrote: > I'm using tesseract in my own written Program. It uses Tesseract.exe > as it's engine. > > Here is my code > {C:\Documents and Settings\CR-PC-01\Desktop\New Folder > (2)\tesseract.exe "C:\Document

Re: Error In Code

2011-12-31 Thread zdenko podobny
er > (2)\tesseract.exe "C:\Documents and Settings\CR-PC-01\My Documents\My > Pictures\VidBlasterWS.jpg" output.txt -l eng -psm 3" > > As I already mentioned - it can not work because of missing quotes... > On 12/30/11, zdenko podobny wrote: > > On Fri, Dec 30, 2
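Both paths in the quoted command contain spaces, so each has to be wrapped in its own pair of double quotes before the Windows command line can be parsed correctly. The sketch below assumes that the unquoted executable path is the "missing quotes" problem mentioned above, and lets Python produce the quoting. `subprocess.list2cmdline` is an undocumented helper in the standard library, used here only to display the resulting string; in a real program, passing the list straight to `subprocess.run` gives the same quoting.

```python
import subprocess

# Paths copied from the command quoted in the thread; both contain spaces.
TESSERACT_EXE = r"C:\Documents and Settings\CR-PC-01\Desktop\New Folder (2)\tesseract.exe"
IMAGE = r"C:\Documents and Settings\CR-PC-01\My Documents\My Pictures\VidBlasterWS.jpg"

# One list element per argument; no manual quoting inside the elements.
ARGS = [TESSERACT_EXE, IMAGE, "output.txt", "-l", "eng", "-psm", "3"]

# list2cmdline applies the MS C runtime quoting rules: arguments that contain
# spaces are wrapped in double quotes, the others are left alone.
COMMAND_LINE = subprocess.list2cmdline(ARGS)

if __name__ == "__main__":
    print(COMMAND_LINE)
```

The printed line can be pasted into a command prompt to check that the command works there before calling it from a program.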

Re: Tesseract Dll For Visual Basic Express 2008

2011-12-31 Thread zdenko podobny
There is a link [1] to project files - you have to build it by yourself. Or you can use Naveen repository [2] that includes also tesseract sources... [1] http://www.mediafire.com/?2qt5b1s1si6qb42

Re: Error In Code

2011-12-31 Thread zdenko podobny
.tif or phototest.tif that are included in tesseract source - make sure that your command works in command line (as I already pointed: your command can not work because of missing quote(s) > On Dec 31, 3:15 pm, zdenko podobny wrote: > > On Sat, Dec 31, 2011 at 11:

Re: Tesseract Dll For Visual Basic Express 2008

2012-01-04 Thread zdenko podobny
> I have downloaded [1] and try to compile. I followed all the > directions. But it gives me errors. > I have added Build Log with this E-mail. Can you tell me what is the > wrong with that > > On 1/2/12, Lahiru Himash Madusanka > wrote: > > Thank You very much zdenko > >

Re: Error In Code

2012-01-05 Thread zdenko podobny
ra Sinhala OCR\Akshara Sinhala > OCR\bin\Debug\tesseract.exe" "C:\Documents and Settings\CR-PC-01\My > Documents\My Pictures\untitled.JPG" "TEmp" -l sinhala > > On 1/2/12, Lahiru Himash Madusanka > wrote: > > OK. I'll check them and give feedbac

Re: tesseract with cmake

2012-02-02 Thread zdenko podobny
Hi, since nobody replied for a long time ;-) have a look at [1]: I posted my attempt there, but I did not have time to finish it. I am still interested in a cmake build system for tesseract, but it is not very important for me at the moment. If somebody improves it or creates something better - p

Re: Version 3.02 in alpha

2012-02-03 Thread zdenko podobny
Do you have VS2008 for linux ;-) (as Ray wrote "currently Linux-only") ? PS: I am working on patches for VS2008, but there are some problems... I need to make some additional tests... Zdenko On Fri, Feb 3, 2012 at 1:06 PM, Sriranga(78yrsold) wrote: > When tried to generate exe files using VS2008 but

Re: Version 3.02 in alpha

2012-02-03 Thread zdenko podobny
> as such I tried- only 24 succeeded. Now I shall wait for patches for >> VS2008 are uploaded. >> With Warmest Regards, >> -sriranga(79yrs) >> >> >> On Fri, Feb 3, 2012 at 6:03 PM, zdenko podobny wrote: >> >>> Do you have VS2008 for linux ;-)

Re: Version 3.02 in alpha

2012-02-03 Thread zdenko podobny
I just uploaded some fixes to VC2008 build - target was to compile and run tesseract.exe ("tesseract.exe eurotext.tif eurotext" produced output :-) ) Please test it. Feel free to improve it. I still continue to support the current "vs2008 structure". When Tom will finalize his contribution[1] I

Re: Tess 3.02 English training set broken?

2012-02-05 Thread zdenko podobny
Just a quick test: I am able to run 'tesseract eurotext.tif eurotext' (it uses eng.traineddata) and I got a result on linux without any problem... Can you verify the downloaded file? You can find my md5 checksum in the attachment... tesseract 3.02 also works with the 3.01 data file (as I tested on linux), so t
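Verifying a downloaded data file against a published md5 checksum can be done with `md5sum` on linux. Where that tool is missing (for example on Windows), a short Python sketch like the one below does the same job. The function name `md5_of_file` is made up for this illustration, and the checksum from the attachment is not reproduced here, so the expected value has to be supplied by the reader.

```python
import hashlib


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex md5 digest of a file, reading it in chunks to limit memory use."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        # iter() with a sentinel stops when read() returns an empty bytes object.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_checksum(path, expected_hex):
    """Compare the file's md5 with an expected hex string, ignoring case and whitespace."""
    return md5_of_file(path) == expected_hex.strip().lower()


if __name__ == "__main__":
    import sys
    # Usage: python md5check.py eng.traineddata <expected md5>
    print(matches_checksum(sys.argv[1], sys.argv[2]))
```

A mismatch usually means the download was truncated or corrupted and the file should be fetched again.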

Re: Tess 3.02 English training set broken?

2012-02-05 Thread zdenko podobny
ote: > Also tested in the r-527 using the eng.trainedata of ver 3.02. I got > error message vide screenshot attached which is self explanatory. > -sriranga(79yrs) > > > On Sun, Feb 5, 2012 at 8:52 PM, zdenko podobny wrote: > >> Just quick tests: >> >> I am ab

Re: Tess 3.02 English training set broken?

2012-02-05 Thread zdenko podobny
>> error message vide screenshot attached which is self explanatory. >> -sriranga(79yrs) >> >> On Sun, Feb 5, 2012 at 8:52 PM, zdenko podobny wrote: >> >>> Just quick tests: >>> >>> I am able to run 'tesseract eurotext.tif eurotext' (it use &

Re: Version 3.02 in alpha

2012-02-07 Thread zdenko podobny
try r668. Zd. On Tue, Feb 7, 2012 at 1:54 PM, Sriranga(78yrsold) wrote: > Zdenko, > Downloaded r-667 from the svn today. Tried to generate exe files using > VS2008, I got result as follows: (first generated *debug* version and > again generated *release* version) > > Debug version = 25 succeed

Re: Version 3.02 in alpha

2012-02-07 Thread zdenko podobny
5* succeeded 0 failed -0 -skipped - in folder "bin" > contains *10 *exe files. > > This is brought to your kind notice for needful. > With Warmest Regards, > -sriranga(79yrs) > > > > On Tue, Feb 7, 2012 at 7:23 PM, zdenko podobny wrote: > >> try r668. &g

Re: Any one successfully build Tesseract 3.0.1 under MSYS+MinGW?

2012-02-08 Thread zdenko podobny
Hi, Have a look at https://github.com/zdenop/tesseract-mingw I was/am testing building tesseract with cmake... just with the intention of having one build system for linux, windows and possibly Mac (but I have no Mac computer for testing ;-) ). I created a script (CMakeLists.txt) that was able to compile te

Re: isinf on Windows

2012-02-08 Thread zdenko podobny
please create issue (in http://code.google.com/p/tesseract-ocr/issues/list ) - also for other problems you reported in other e-mails. Group/forum is good for discussion or asking question etc., but if something should be fixed - it should go to issue list. (and do not post question to issue list

Re: How to build documentation?

2012-02-08 Thread zdenko podobny
You need to do it this way: mv tesseract-ocr-read-only tesseract-ocr doxygen tesseract-ocr/doc/Doxyfile This is because of the 'makemoredists' script, which can create release tarballs. It needs to run on top of tesseract-ocr. Maybe there is a better way; improvements are welcome. Zdenko On Wed, F

Re: VS2010 and use of ResultIterator

2012-02-09 Thread zdenko podobny
On Thu, Feb 9, 2012 at 11:26 AM, TyDam' wrote: > Dear all, > I succeed to use tesseract with good recognition rate. So first, > thanks a lot for this project! > > But I’m face to an issue when I try to use tesseract::ResultIterator > and tesseract::ChoiceIterator > > I use visual 2010 and trying

Re: Any one successfully build Tesseract 3.0.1 under MSYS+MinGW?

2012-02-12 Thread zdenko podobny
On Sun, Feb 12, 2012 at 6:40 AM, asmwarrior wrote: > > ** > > On Thursday, February 9, 2012 5:47:17 AM UTC+8, zdpo wrote: >> >> Hi, >> >> Have a look at >> https://github.com/zdenop/**tesseract-mingw >> >> I was/am testing building tesseract by cmake...

Re: Training tesseract for hand written letters

2012-02-17 Thread zdenko podobny
Do you use the (not yet released) tesseract 3.02 (you can check with 'tesseract -v')? This feature (declaring multiple languages for OCR) is not available in earlier versions. Zdenko On Fri, Feb 17, 2012 at 6:06 AM, Aruna Devi wrote: > Sir i have the trained data file separately for small letters (

Re: detecting word areas

2012-02-17 Thread zdenko podobny
As far as I know there is no config/command line option for it. Have a look at this example[1] - it is not exactly what you need, but it is a good start point for you... Zdenko [1] http://code.google.com/p/tesseract-ocr/issues/attachmentText?id=622&aid=6220004000&name=test_box_word.cpp&token=UZ5y

Re: Cross compile tesseract 3.01 for ios 5

2012-02-19 Thread zdenko podobny
On Mon, Feb 20, 2012 at 3:06 AM, Ricky wrote: > Hey guys, > > I'm trying to compile tesseract 3.01 as ios static library...tried the > way from robert's blog: > > http://robertcarlsen.net/2010/09/24/compiling-tesseract-v3-for-iphone-1299 > > but it is not working for me, I keep seeing error when

Re: tesseract under windows and paths

2012-02-22 Thread zdenko podobny
Can you send the result of: echo %TESSDATA_PREFIX% Zd. On Thu, Feb 23, 2012 at 7:59 AM, wrote: > Hi all, > > i successfully compiled tesseract svn r 679 under windows using cygwin and > figured out that tesseract looks in the following directory for > .traineddata files: %programfilesdir%\tesserac
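The `echo %TESSDATA_PREFIX%` check can be extended to also list the .traineddata files that are reachable through that variable. This is only a sketch: `find_traineddata` is a hypothetical helper, and it assumes the 3.0x convention that the `tessdata` folder sits directly under the directory named by TESSDATA_PREFIX, which the thread itself does not spell out.

```python
import os


def find_traineddata(environ=None):
    """Return (prefix, names): the TESSDATA_PREFIX value and the .traineddata
    files found in <prefix>/tessdata. The prefix is None when the variable is unset.
    """
    environ = os.environ if environ is None else environ
    prefix = environ.get("TESSDATA_PREFIX")
    if not prefix:
        return None, []
    # Assumption: language data lives in a "tessdata" folder below the prefix.
    tessdata = os.path.join(prefix, "tessdata")
    if not os.path.isdir(tessdata):
        return prefix, []
    names = sorted(n for n in os.listdir(tessdata) if n.endswith(".traineddata"))
    return prefix, names


if __name__ == "__main__":
    prefix, names = find_traineddata()
    print("TESSDATA_PREFIX:", prefix)
    print("traineddata files:", names)
```

An unset variable or an empty list would suggest checking the installation path before looking for problems elsewhere.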

Re: tesseract under windows and paths

2012-02-23 Thread zdenko podobny
t method cause of above reasons with USB sticks or > different installations. > > greetings, > simon > > > On Thu, 23 Feb 2012 08:28:49 +0100 > zdenko podobny wrote: > >> can you sent result of: >> echo %TESSDATA_PREFIX% >> >> Zd. >> >&

Re: Replacing the tesseract 3.02 alpha vs2008 directory

2012-02-26 Thread zdenko podobny
So to make it easier: it is in svn (excluding APItest, which will be included in something like tesseract-3.02-win32-lib-include-dirs.zip; maybe during the week I will create and upload an alpha version for testing). The wiki, README etc. have not been updated for version 3.02. vs2008\doc is not served as html - I

Re: 3.01 doesn't compile

2012-02-26 Thread zdenko podobny
Unfortunately the information you gave is not sufficient. Try to run this: $ make clean $ ./autogen.sh && ./configure && make 2>&1 | tee make.log and then send the make.log file (maybe compressed). Zdenko On Sun, Feb 26, 2012 at 11:10 AM, Falke wrote: > I'm on Ubuntu 10.10 . > Tesseract 3.01 is failing

Re: Error during using tesseract-ocr

2012-03-06 Thread zdenko podobny
On Tue, Mar 6, 2012 at 6:20 PM, Ivan Mushketik wrote: > Here is my output: > $ tesseract -v > tesseract 3.02 > leptonica-1.68 > > This means your leptonica has no support for image files. With a good installation of leptonica you would see something like this: tesseract 3.02 leptonica-1.68 (Mar 14

Re: How to get a better result with tesseract

2012-03-06 Thread zdenko podobny
First of all: If it is possible - do not use jpeg for OCR. I think you will need to improve image before OCR. Try to have a look at FAQ[1] for some hints. [1] http://code.google.com/p/tesseract-ocr/wiki/FAQ#Output_it_without_result_or_wrong On Wed, Mar 7, 2012 at 7:05 AM, Roast wrote: > Anyone

Re: Is there a way to combine languages?

2012-03-07 Thread zdenko podobny
On Wed, Mar 7, 2012 at 11:51 PM, Falke wrote: > I did search this group but found only old posts regarding multiple > languages (regarding 2.0), but, looking forward to the new features in > 3.01... > > I am assuming it's still impossible, even in 3.01, to recognize a > mixture of languages (dist

Re: Error during python-tesseract installation

2012-03-08 Thread zdenko podobny
As Tom mentioned, you did not provide enough information. As far as I know python-tesseract works with 3.01, but it does not work with (unreleased) tesseract 3.02. Zdenko On Wed, Mar 7, 2012 at 8:27 PM, Ivan Mushketik wrote: > Hello. > > I want to install python-tesseract. To do it I've downl

Re: segfault in svn (3.02) (from three-four days ago)

2012-03-15 Thread zdenko podobny
On Thu, Mar 15, 2012 at 6:22 AM, Falke wrote: > > > On Mar 14, 11:25 pm, "Sriranga(78yrsold)" > wrote: > > tested in latest version r-703. Output as text.txt is attached. also > > image.tif converted to tif format for reference. No error message > > displayed. everything is OK. > > > > I just tr

Re: Preparing images with imagemagick on linux

2012-03-21 Thread zdenko podobny
On Tue, Mar 20, 2012 at 2:32 PM, Gytis wrote: > Hi all, > > first thank you for super software and FAQ on how to install it - went > smoothly. > > The question is though, maybe anyone has any tips on how to prepare > image for tesseract? I've read the FAQ on borders, resolution, etc, > but i have

Re: Problem with GetHOCRText (using OpenFrameworks)

2012-03-21 Thread zdenko podobny
see http://code.google.com/p/tesseract-ocr/issues/detail?id=463 e.g. it is fixed in 3.02. On Wed, Mar 21, 2012 at 4:33 PM, Jesse Fulton wrote: > Adding a call to SetInputName("") in my setup routine seemed to work! > > But may I ask *why* that worked? I'd assume that everything should be > hand

Re: New training only recognize if >3 chars

2012-03-22 Thread zdenko podobny
On Wed, Mar 21, 2012 at 7:22 PM, Jose Garcia wrote: > Hello, > > I've trained tesseract with only this characters: 0123456789-. > > I used one tiff with this characters, with 6 samples of each. > > After the successfully training, tesseract only recognize if in the > input tiff there are more tha

Re: segfaults in libtesseract

2012-03-27 Thread zdenko podobny
On Tue, Mar 27, 2012 at 5:07 PM, Stefan Malte Schumacher < stefanma...@gmail.com> wrote: > Hello > > I have installed tesseract 3.0.1 on my System as part of the > requirements of pyload. Now my system log > is full of these messages: > > [1723412.348162] tesseract[14859]: segfault at 0 ip b77320e

Re: Include Tesseract in C++ code

2012-03-29 Thread zdenko podobny
On Thu, Mar 29, 2012 at 4:52 PM, Gustavo Souto wrote: > Hi everyone, I need you help... > > I want to create a program in C++ with Tesseract, but when I try to > compile the source code some errors appear. I don't know well how to link > the libs to the source code, but I did do like this: > --

Re: problem in using tesseract API in c++ code

2012-04-14 Thread zdenko podobny
You need to build it by yourself. See [1]. [1] http://tesseract-ocr.googlecode.com/svn/trunk/vs2008/doc/building.html Zdenko On Sat, Apr 14, 2012 at 3:17 PM, Morné Jooste wrote: > Hi, > > Where can I find the dll for tesseract? > Sent via my BlackBerry from Vodacom - let your email find you! >

Re: Getting usable source files from traineddata files

2012-04-16 Thread zdenko podobny
On Mon, Apr 16, 2012 at 4:17 PM, Nick White wrote: > Hi there, > > There are lots of situations where it would be really useful to be > able to get some of the source files from a .traineddata file. For > example I am working on improving training of Ancient Greek (grc) - > which is basically the

Re: Specifying different dictionary files [was: Getting usable source files from traineddata files]

2012-04-17 Thread zdenko podobny
On Tue, Apr 17, 2012 at 4:26 PM, Nick White wrote: > On Mon, Apr 16, 2012 at 06:38:01PM +0200, zdenko podobny wrote: > > I think in 3.02 will provide solution this cases: you can use more than > one > > language for OCR. e.g. you can run something like this: > > > >

Re: php exec() and tesseract returns ''Cannot open input file'

2012-04-19 Thread zdenko podobny
On Thu, Apr 19, 2012 at 12:59 PM, droehn wrote: > Hi all, > > I posted this question already in stackoverflow but no one seems to > have a hint at hand. Thats why I repost in this dedicated group: > > I use Ghostscript to strip images from PDF files into jpg and run > Tesseract to save txt conten

Re: Strange blob fatalities that I don't know how to fix

2012-04-20 Thread zdenko podobny
On Thu, Apr 19, 2012 at 10:39 PM, xdhmoore wrote: > I am having exactly the same issue. I am trying to train based on > some very simple Czech sentences arranged plainly on black and white. > It seems to not be recognizing the periods when creating the box file, > and it is throwing this error w

Re: segfaulting again (svn 3.02)

2012-04-22 Thread zdenko podobny
On Sat, Apr 21, 2012 at 1:15 PM, Falke wrote: > I think I solved the problem: > > The segfaulting happened because I had old and incompatible > eng.traineddata (probably forgot to do "make install-langs", > mentioned in "INSTALL.SVN"). I was clued in when I tried the old "-l > deu", and, unlik

Re: Tessarct for separating handwritten words

2012-04-25 Thread zdenko podobny
On Wed, Apr 25, 2012 at 11:10 AM, Lucas Swartsenburg wrote: > Excuse me, the correct term would be: segmentation. So this sentence would > be segmented in: > > "So", "this", "sentence", "would", "be", "in". (all of these are images > of the handwritten words). > > > I am not sure if I got your po

Re: Version 3.02 in alpha

2012-04-26 Thread zdenko podobny
On Thu, Feb 2, 2012 at 7:55 PM, Ray Smith wrote: > Tesseract 3.02 is now available in svn for preliminary testing, currently > Linux-only. > > There are now 65 languages and some big improvements in layout analysis > and character accuracy. > This version will with luck make it into Ubuntu LTS Pr

Re: Training tessnet2 with custom font

2012-04-26 Thread zdenko podobny
On Wed, Apr 25, 2012 at 8:49 PM, Sebastian Siatkowski < sebastiansiatkow...@gmail.com> wrote: > I find a ton of data on how to train the tesseract 3 software with new > character sets by using the provided tools. However, I am wondering if > I could accomplish this by using the C# API of tessnet2.

Re: where to download apitest? or other sample application?

2012-04-26 Thread zdenko podobny
On Fri, Apr 27, 2012 at 7:24 AM, dev wrote: > Hi, > Where can I download the apitest sample mentioned on this page > http://tesseract-ocr.googlecode.com/svn/trunk/vs2008/doc/programming.html > > I managed to download and compile the tesseract source code but I > couldn't find the sample application an

Re: Include Tesseract in C++ code

2012-04-27 Thread zdenko podobny
On Fri, Apr 27, 2012 at 3:01 PM, Pavel Mazniker wrote: > > Hi, > > > >> I have a problem linking to tesseract in a Qt C++ project on Windows, >> >> > Hello. > I've built the libraries (together with one that is already within the > tesseract-3.01-win_vs - > vs2008): > > > 04/27/2012 08:47 AM

Re: Segfault using my own traineddata with latest SVN

2012-04-27 Thread zdenko podobny
On Fri, Apr 27, 2012 at 1:12 PM, Nick White wrote: > Hi, > > I'm encountering an odd segfault whenever I try to use a new > traineddata file with the latest svn release (r724). The same > process produces a perfectly usable traineddata file with 3.01. This > is the case on the two different Linux

Re: Include Tesseract in C++ code

2012-04-27 Thread zdenko podobny
On Fri, Apr 27, 2012 at 6:54 PM, Pavel Mazniker wrote: > > Hi, > > You did great work with tesseract! Thanks! > > I succeeded in linking it to a Qt project using one of the tesseract projects > posted at github (tesseract using mingw). Is it full (does that project > also include training?) > > Now

Re: Include Tesseract in C++ code

2012-04-27 Thread zdenko podobny
On Fri, Apr 27, 2012 at 8:37 PM, Pavel Mazniker wrote: > *1. tesseract-ocr 3.01 does not (officially) support/create a dll. As far as > I remember, only static linking was successful. Use 3.02 (in svn).* > What libs exactly should I create for linking and compiling on MinGW build > system? > I do

Re: Include Tesseract in C++ code

2012-04-29 Thread zdenko podobny
On Sun, Apr 29, 2012 at 8:59 AM, Pavel Mazniker wrote: > > Hi, > > trying to build using MinGW + MSYS the 3.02 > > checked out from svn r724. Is that right? > > yes > then ran in msys terminal in the root directory of the checkout > > ./autogen.sh > > and got: > > " > Running aclocal > ./autog
