Re: OCR Dvd Subtitles

2012-07-19 Thread Kiran Babu G
Hi Nick, Thanks for the reply. Please see the attached sample image I am trying. The output I get is: *A uth orities in s e veral s tates* *are ordering e va cua tians,* * * You can see all the additional spaces that are appearing. I am not sure which config setting I can condition to get correct ou

Re: Newbie: Training tesseract

2012-07-19 Thread Sven Pedersen
Hi Nikola, I would suggest searching through the archives. People have lots of examples of what they've done to train tesseract. --Sven On Thu, Jul 19, 2012 at 8:43 AM, Nikola Ivanovic wrote: > Hi, > I'm new in tesseract and I need help :) > > I've been reading manual but since I'm newbie it's li

Re: Newbie: Training tesseract

2012-07-19 Thread Merve Temizer
I could not find the tutorial that I used a year ago. Could you complete the training process, get a traineddata file, and pass it as a parameter to tesseract when trying your image one more time? 2012/7/19 Nikola Ivanovic > Hi, > I'm new in tesseract and I need help :) > > I've been read

Re: developing new program which passes memory buffer with OCR data to be recognized to tesseract library

2012-07-19 Thread Zdenko Podobný
On 19.07.2012 03:32, newtotesseract wrote: > Hi, > > Thanks for the suggestion. > I found the thread "Include Tesseract in C++ > code" > closer to what I am looking for. > > But, did not get how to create static arch

Re: OCR Dvd Subtitles

2012-07-19 Thread Nick White
Hi Kiran, On Thu, Jul 19, 2012 at 04:53:57AM -0700, Kiran Babu G wrote: > Currently, major issue is with the spacings, especially with italics. > I could see several config settings in textord.h.. > Modifying some of them on trial and error basis. > Some fonts have more kerning and some have less

Re: OCR Dvd Subtitles

2012-07-19 Thread Kiran Babu G
Hi Nick, Thanks for the reply. I do transform the subtitle images to black text on a white background before passing them to tesseract. I tried training on the text and am getting fairly good results. The fonts used in DVD subtitles change for each DVD. So, I tried with multiple fonts. Currently, major
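
The preprocessing step mentioned in this message (making subtitle images black text on a white background) could look roughly like the sketch below. It is not the poster's actual code: the function name `ensure_dark_on_light`, the plain list-of-rows image representation, and the "mean brightness below 128 means light-on-dark" heuristic are all assumptions made for illustration.

```python
def ensure_dark_on_light(pixels):
    """Return a copy of a grayscale image with dark text on a light background.

    `pixels` is a list of rows, each row a list of ints in 0..255.
    Assumption: text covers less area than the background, so a dark
    mean brightness indicates light text on a dark background.
    """
    flat = [value for row in pixels for value in row]
    if not flat:
        return []
    mean = sum(flat) / len(flat)
    if mean < 128:
        # Mostly dark image: invert so the background becomes light.
        return [[255 - value for value in row] for row in pixels]
    # Already light background: return an unmodified copy.
    return [list(row) for row in pixels]
```

An image that already has a light background is returned unchanged, so the function can be applied to every subtitle frame regardless of the DVD's styling.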

Re: [Annoucement] QT Box Editor 1.09

2012-07-19 Thread Nick White
On Thu, Jul 19, 2012 at 12:16:19PM +0200, zdenko podobny wrote: > On Thu, Jul 19, 2012 at 12:06 PM, Sriranga(78yrsold) < > withblessi...@gmail.com> wrote: > > Trust in the forthcoming next version, another feature "generate > > image/box files from the text file" > > will be added as a crown to QT

Re: [Annoucement] QT Box Editor 1.09

2012-07-19 Thread zdenko podobny
On Thu, Jul 19, 2012 at 12:06 PM, Sriranga(78yrsold) < withblessi...@gmail.com> wrote: > Hi Team, > Congratulations. Installed on WinXP with sp3. Tested - It is a pleasure to > use QT Box Editor for Kannada lang for editing box files without any > problems. I am really happy for excellent QTBox E

Re: Improving a tricky character recognition error

2012-07-19 Thread Nick White
Yes, I'm pretty sure it's a bounding box issue. I don't want to go the way of manually specifying bounding boxes, as I don't really have the time, and want the training to work for the general case anyway. I think I'll just have to declare it as "good enough" for now (and it really is pretty good!)

Re: [Annoucement] QT Box Editor 1.09

2012-07-19 Thread Sriranga(78yrsold)
Hi Team, Congratulations. Installed on WinXP with sp3. Tested - It is a pleasure to use QT Box Editor for Kannada lang for editing box files without any problems. I am really happy for excellent QTBox Editor created by the team of Zdenko Podobny. Trust in the forthcoming next version, another fea

[Annoucement] QT Box Editor 1.09

2012-07-19 Thread zdenko podobny
QT Box Editor 1.09 was released. It is a multi-platform visual editor for tesseract-ocr box files (used for OCR training), based on the QT4 library. Some of the featu