Hi,
I did the merge and also updated my repo with build instructions.
Changes:
- Extended ETEXT_DESC with a PROGRESS_FUNC field, so users of the API can
register a callback function to get notified of the progress percentage as
well as the word bounding boxes.
  (Most people I have shown my app to really liked how it highlighted
  the current word while doing the OCR.)
- Changed the percentage progress values to start at 0% instead of 30%.
- Added row attributes to the hOCR output so that I can produce straighter
lines when creating the PDF files.
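
To illustrate the callback and the new progress range, here is a minimal, self-contained sketch. It does not link against Tesseract: the PROGRESS_FUNC signature and the 0..70 / 70..100 split are taken from the attached patch, while MonitorSketch, OnProgress, ReportFirstPass and ReportSecondPass are hypothetical names used only for this example.

```cpp
#include <cassert>
#include <cstdio>

// Same signature as the PROGRESS_FUNC typedef in the attached patch.
typedef bool (*PROGRESS_FUNC)(int progress, int left, int right,
                              int top, int bottom);

// Hypothetical stand-in for the patched ETEXT_DESC (only the fields used here).
struct MonitorSketch {
  int progress;
  PROGRESS_FUNC progress_callback;
};

// Values recorded by the example callback so the caller can inspect them.
static int g_last_progress = -1;
static int g_last_left = 0, g_last_right = 0, g_last_top = 0, g_last_bottom = 0;

// Example callback: remember the reported values and log them.
bool OnProgress(int progress, int left, int right, int top, int bottom) {
  g_last_progress = progress;
  g_last_left = left;
  g_last_right = right;
  g_last_top = top;
  g_last_bottom = bottom;
  std::printf("%3d%% word box: l=%d r=%d t=%d b=%d\n",
              progress, left, right, top, bottom);
  return true;  // the patch as posted does not use the return value
}

// Mirrors the first-pass update in control.cpp: progress runs from 0 to 70
// and the callback receives the bounding box of the current word.
void ReportFirstPass(MonitorSketch* monitor, int word_index, int word_count,
                     int left, int right, int top, int bottom) {
  assert(word_count > 0);
  monitor->progress = 70 * word_index / word_count;
  if (monitor->progress_callback != nullptr) {
    (*monitor->progress_callback)(monitor->progress, left, right, top, bottom);
  }
}

// Mirrors the second-pass update: progress runs from 70 to 100 and the
// callback receives an all-zero box.
void ReportSecondPass(MonitorSketch* monitor, int word_index, int word_count) {
  assert(word_count > 0);
  monitor->progress = 70 + 30 * word_index / word_count;
  if (monitor->progress_callback != nullptr) {
    (*monitor->progress_callback)(monitor->progress, 0, 0, 0, 0);
  }
}
```

With the patched Tesseract, the equivalent would presumably be to set progress_callback on an ETEXT_DESC and pass that monitor to Recognize() or to the new GetHOCRText(monitor, page_number) overload from the diff.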
Cheers
Renard
On Monday, 20 May 2013 at 11:42:17 UTC+2, Nick White wrote:
>
> Hi Renard,
>
> Great, I'm glad you're merging them into the latest Tesseract
> revision. Could you then post the patches into the Tesseract bug
> tracker? http://code.google.com/p/tesseract-ocr/issues/list
>
> > But one change that I really need is the option to pass in a monitor
> > to the API. One reason is to cancel the OCR process from my app and
> > the other reason is so that I can show a progress bar to the user.
>
> That sounds like the sort of thing that others using the API would
> find useful too. If you can make the change in such a way that
> existing code using the API continues to work, I expect that could
> be incorporated into Tesseract too.
>
> Nick
>
> On Sat, May 18, 2013 at 07:01:52AM -0700, Renard Wellnitz wrote:
> > Hi Nick,
> >
> > thanks for the diff. I made the changes a long time ago and I did not
> > really remember everything. But the diff greatly helped me to remember :-)
> > I am currently merging my changes into the latest revision of Tesseract.
> > I am also trying to move as much of my code as possible out of your
> > source files.
> > I will also create a new git repo that should be much smaller. I will
> > also mavenize the whole project so that it can be built with just one
> > command.
> >
> > Cheers
> > Renard
> >
> > On Thursday, 16 May 2013 at 12:40:15 UTC+2, Nick White wrote:
> >
> > That looks right, thanks for that.
> >
> > I'll try to take a proper look soon and figure out how best to
> > upstream stuff, and where it's worth doing so. In the meantime I'll
> > attach the .diff (very small; only 200 lines), in case anyone else
> > is interested, and so I don't forget ;)
> >
> > Nick
> >
> > On Wed, May 15, 2013 at 07:18:42AM -0700, Renard Wellnitz wrote:
> > > Hi Nick,
> > >
> > > here is the console output:
> > >
> > >
> > > localhost:tesseract-ocr-3.02 renard$ svn log -r COMMITTED
> > > ------------------------------------------------------------------------
> > > r705 | [email protected] | 2012-03-15 22:05:12 +0100 (Thu, 15 Mar 2012) | 1 line
> > >
> > > fixed build in java directory; create documentation package with 'make doc-pack'
> > > ------------------------------------------------------------------------
> > >
> > >
> > > Cheers
> > > Renard
> > >
> > >
> > > On Wednesday, 15 May 2013 at 14:28:35 UTC+2, Nick White wrote:
> > >
> > > I'm no expert with SVN, but I think this command will tell me what I
> > > want to know:
> > >
> > > svn log -r COMMITTED
> > >
> > > Thanks.
> > >
> > > On Wed, May 15, 2013 at 04:02:34AM -0700, Renard Wellnitz wrote:
> > > > Hi Nick,
> > > >
> > > > I'm not really proficient with svn. Maybe this helps? If you want
> > > > me to run a specific svn command I'll gladly do it.
> > > >
> > > >
> > > > localhost:tesseract-ocr-3.02 renard$ svn ls "^/tags"
> > > > release-2.04/
> > > > release-3.00/
> > > > release-3.00.1/
> > > > release-3.01/
> > > > release-3.02.01/
> > > > release-3.02.02/
> > > > localhost:tesseract-ocr-3.02 renard$ svnversion .
> > > > 705M
> > > > localhost:tesseract-ocr-3.02 renard$
> > > >
> > > >
> > > > I do not remember the exact changes. But my main goal was to get
> > > > progress information during the OCR process so that my app could
> > > > show the bounding boxes of the currently processed word.
> > > >
> > > > Cheers
> > > > Renard
> > > >
> > > >
> > > > On Wednesday, 15 May 2013 at 11:37:26 UTC+2, Nick White wrote:
> > > >
> > > > Ah, I see it's pretty close to 3.02.01 (now only available as
> > > > an SVN tag). Am I correct in thinking that's the release you
> > > > used? Or was it an SVN revision near it?
> > > >
> > > > Thanks again,
> > > >
> > > > Nick
> > > >
> > > > On Wed, May 15, 2013 at 10:30:29AM +0100, Nick White wrote:
> > > > > Hi Renard,
> > > > >
> > > > > This is awesome, great job :)
> > > > >
> > > > > I was interested to see what changes you'd made to tesseract,
> > > > > so ran 'diff -r' on the tesseract-ocr-3.02 directory in
> > > > > github, but a quick look made it seem quite different to the
> > > > > tesseract-ocr-3.02.02.tar.gz currently available from Tesseract.
> > > > >
> > > > > Am I correct in thinking that? Is it based on a version from
> > > > > SVN? If so, which? If not, I'll just have to spend more time
> > > > > with diff ;-)
> > > > >
> > > > > I'd be keen to try and isolate and generalise any changes you
> > > > > made and get them back into the core code, if I can.
> > > > >
> > > > > Thanks for all this lovely free code!
> > > > >
> > > > > Nick
> > > > >
> > > > > On Tue, May 14, 2013 at 01:51:15PM -0700, Renard Wellnitz wrote:
> > > > > > Hi Tom,
> > > > > >
> > > > > > I decided to publish the code of the app under the Apache 2
> > > > > > licence. However, the C++ code that deals with image
> > > > > > processing uses the stricter GPL v3, since that is where I
> > > > > > put in a lot of effort.
> > > > > >
> > > > > > The project still needs a readme and instructions on how to
> > > > > > build the binaries. For someone with a bit of Android/NDK
> > > > > > experience it should not be a big problem, however.
> > > > > > Readme and build instructions will follow in a couple of days.
> > > > > >
> > > > > > https://github.com/renard314/textfairy
> > > > > >
> > > > > > Cheers!
> > > > > > Renard
> > > >
> > > > --
> > > > --
> > > > You received this message because you are subscribed to the Google
> > > > Groups "tesseract-ocr" group.
> > > > To post to this group, send email to [email protected]
> > > > To unsubscribe from this group, send email to
> > > > [email protected]
> > > > For more options, visit this group at
> > > > http://groups.google.com/group/tesseract-ocr?hl=en
> > > >
> > > > ---
> > > > You received this message because you are subscribed to the Google
> > > > Groups "tesseract-ocr" group.
> > > > To unsubscribe from this group and stop receiving emails from it,
> > > > send an email to [email protected].
> > > > For more options, visit https://groups.google.com/groups/opt_out.
> > > >
Index: ccmain/ltrresultiterator.h
===================================================================
--- ccmain/ltrresultiterator.h (revision 844)
+++ ccmain/ltrresultiterator.h (working copy)
@@ -110,6 +110,8 @@
int* pointsize,
int* font_id) const;
+ void RowAttributes(float* row_height, float* descenders, float* ascenders) const;
+
// Return the name of the language used to recognize this word.
// On error, NULL. Do not delete this pointer.
const char* WordRecognitionLanguage() const;
Index: ccmain/ltrresultiterator.cpp
===================================================================
--- ccmain/ltrresultiterator.cpp (revision 844)
+++ ccmain/ltrresultiterator.cpp (working copy)
@@ -161,6 +161,14 @@
return 0.0f;
}
+void LTRResultIterator::RowAttributes( float* row_height,
+ float* descenders,
+ float* ascenders) const{
+ *row_height = it_->row()->row->x_height() + it_->row()->row->ascenders() - it_->row()->row->descenders();
+ *descenders = it_->row()->row->descenders();
+ *ascenders = it_->row()->row->ascenders();
+}
+
// Returns the font attributes of the current word. If iterating at a higher
// level object than words, eg textlines, then this will return the
// attributes of the first word in that textline.
Index: ccmain/control.cpp
===================================================================
--- ccmain/control.cpp (revision 844)
+++ ccmain/control.cpp (working copy)
@@ -243,7 +243,11 @@
word_index++;
if (monitor != NULL) {
monitor->ocr_alive = TRUE;
- monitor->progress = 30 + 50 * word_index / stats_.word_count;
+ monitor->progress = 70 * word_index / stats_.word_count;
+ if (monitor->progress_callback!=NULL){
+ TBOX box = page_res_it.word()->word->bounding_box();
+ (*monitor->progress_callback)(monitor->progress,box.left(), box.right(), box.top(), box.bottom());
+ }
if (monitor->deadline_exceeded() ||
(monitor->cancel != NULL && (*monitor->cancel)(monitor->cancel_this,
stats_.dict_words)))
@@ -316,7 +320,10 @@
word_index++;
if (monitor != NULL) {
monitor->ocr_alive = TRUE;
- monitor->progress = 80 + 10 * word_index / stats_.word_count;
+ monitor->progress = 70 + 30 * word_index / stats_.word_count;
+ if (monitor->progress_callback!=NULL){
+ (*monitor->progress_callback)(monitor->progress,0,0,0,0);
+ }
if (monitor->deadline_exceeded() ||
(monitor->cancel != NULL && (*monitor->cancel)(monitor->cancel_this,
stats_.dict_words)))
Index: ccutil/ocrclass.h
===================================================================
--- ccutil/ocrclass.h (revision 844)
+++ ccutil/ocrclass.h (working copy)
@@ -101,6 +101,7 @@
* the OCR engine is storing its output to shared memory.
* During progress, all the buffer info is -1.
* Progress starts at 0 and increases to 100 during OCR. No other constraint.
+ * Additionally the progress callback contains the bounding box of the word that is currently being processed
* Every progress callback, the OCR engine must set ocr_alive to 1.
* The HP side will set ocr_alive to 0. Repeated failure to reset
* to 1 indicates that the OCR engine is dead.
@@ -108,6 +109,7 @@
* user words found. If it returns true then operation is cancelled.
**********************************************************************/
typedef bool (*CANCEL_FUNC)(void* cancel_this, int words);
+typedef bool (*PROGRESS_FUNC)(int progress, int left, int right, int top, int bottom );
class ETEXT_DESC { // output header
public:
@@ -117,6 +119,7 @@
volatile inT8 ocr_alive; // ocr sets to 1, HP 0
inT8 err_code; // for errcode use
CANCEL_FUNC cancel; // returns true to cancel
+ PROGRESS_FUNC progress_callback;/*called whenever progress increases*/
void* cancel_this; // this or other data for cancel
struct timeval end_time; // time to stop. expected to be set only by call
// to set_deadline_msecs()
Index: tessdata/Makefile.am
===================================================================
--- tessdata/Makefile.am (revision 844)
+++ tessdata/Makefile.am (working copy)
@@ -1,4 +1,4 @@
-datadir = @datadir@/tessdata
+dir = @datadir@/tessdata
SUBDIRS = configs tessconfigs
Index: api/baseapi.cpp
===================================================================
--- api/baseapi.cpp (revision 844)
+++ api/baseapi.cpp (working copy)
@@ -71,6 +71,9 @@
#include "version.h"
#endif
+/* Version number of package */
+#define VERSION "3.02"
+
namespace tesseract {
/** Minimum sensible image size to be worth running tesseract. */
@@ -1062,17 +1065,32 @@
* STL removed from original patch submission and refactored by rays.
*/
char* TessBaseAPI::GetHOCRText(int page_number) {
+ return GetHOCRText(NULL,page_number);
+}
+
+
+/**
+ * Make a HTML-formatted string with hOCR markup from the internal
+ * data structures.
+ * page_number is 0-based but will appear in the output as 1-based.
+ * Image name/input_file_ can be set by SetInputName before calling
+ * GetHOCRText
+ * STL removed from original patch submission and refactored by rays.
+ */
+char* TessBaseAPI::GetHOCRText(struct ETEXT_DESC* monitor, int page_number) {
if (tesseract_ == NULL ||
- (page_res_ == NULL && Recognize(NULL) < 0))
+ (page_res_ == NULL && Recognize(monitor) < 0))
return NULL;
int lcnt = 1, bcnt = 1, pcnt = 1, wcnt = 1;
int page_id = page_number + 1; // hOCR uses 1-based page numbers.
+ float row_height, descenders, ascenders;
STRING hocr_str("");
- if (input_file_ == NULL)
+ if (input_file_ == NULL) {
SetInputName(NULL);
+ }
#ifdef _WIN32
// convert input name from ANSI encoding to utf-8
@@ -1121,6 +1139,11 @@
}
if (res_it->IsAtBeginningOf(RIL_TEXTLINE)) {
hocr_str.add_str_int("\n <span class='ocr_line' id='line_", lcnt);
+ res_it->RowAttributes(&row_height,&descenders, &ascenders);
+ hocr_str.add_str_int("' font='", 15);
+ hocr_str.add_str_int("' size='", row_height);
+ hocr_str.add_str_int("' descenders='", descenders * -1);
+ hocr_str.add_str_int("' ascenders='", ascenders);
AddBoxTohOCR(res_it, RIL_TEXTLINE, &hocr_str);
}
Index: api/baseapi.h
===================================================================
--- api/baseapi.h (revision 844)
+++ api/baseapi.h (working copy)
@@ -521,8 +521,20 @@
* Make a HTML-formatted string with hOCR markup from the internal
* data structures.
* page_number is 0-based but will appear in the output as 1-based.
+ * monitor can be used to
+ * cancel the recognition
+ * receive progress callbacks
*/
+ char* GetHOCRText(struct ETEXT_DESC* monitor, int page_number);
+
+ /**
+ * Make a HTML-formatted string with hOCR markup from the internal
+ * data structures.
+ * page_number is 0-based but will appear in the output as 1-based.
+ */
char* GetHOCRText(int page_number);
+
+
/**
* The recognized text is returned as a char* which is coded in the same
* format as a box file used in training. Returned string must be freed with
Index: api/capi.cpp
===================================================================
--- api/capi.cpp (revision 844)
+++ api/capi.cpp (working copy)
@@ -319,7 +319,7 @@
TESS_API char* TESS_CALL TessBaseAPIGetHOCRText(TessBaseAPI* handle, int page_number)
{
- return handle->GetHOCRText(page_number);
+ return handle->GetHOCRText(NULL,page_number);
}
TESS_API char* TESS_CALL TessBaseAPIGetBoxText(TessBaseAPI* handle, int page_number)