>
> Hello all,
>
I'm working on a similar project; in my case I'm reading bank statements. I
noticed the following:
1. When you have a single line of text, Tesseract performs much better.
2. I'm using OpenCV to cut individual cells from a table (you always know
the order of cells since you cut the
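A minimal sketch of the cell-cutting idea above, in plain Python so that it does not depend on any particular OpenCV call. It assumes the row and column boundaries of the table have already been found (for example by line detection); the names `cell_boxes` and `crop` are hypothetical and do not come from the original post. Cells come out in row-major order, so each crop can be passed to Tesseract as a single line and the result mapped back to its position in the table.

```python
from typing import List, Tuple

# (left, top, right, bottom) in pixels; hypothetical helper type
Box = Tuple[int, int, int, int]


def cell_boxes(row_edges: List[int], col_edges: List[int]) -> List[Box]:
    """Return one box per table cell, in row-major (reading) order.

    row_edges and col_edges are assumed to be sorted pixel coordinates
    of the horizontal and vertical table lines.
    """
    boxes = []
    for top, bottom in zip(row_edges, row_edges[1:]):
        for left, right in zip(col_edges, col_edges[1:]):
            boxes.append((left, top, right, bottom))
    return boxes


def crop(image: List[List[int]], box: Box) -> List[List[int]]:
    """Cut one cell out of an image stored as a list of pixel rows."""
    left, top, right, bottom = box
    return [row[left:right] for row in image[top:bottom]]
```

With a NumPy or OpenCV image the same crop would be a slice such as `image[top:bottom, left:right]`.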
See https://github.com/tesseract-ocr/tesseract/wiki/APIExample
for an example of using Tesseract in a program.
The training tutorial you refer to is old.
See tesstrain.sh for creating synthetic training data.
On 10-Jan-2018 2:54 PM, "saumitra mallick"
wrote:
> Hello all,
>>
> I'm working on simi
It works!!
I modified your bash script and executed it. Finally I get a different
traineddata size.
But can I train it from scratch?
It needs a starting traineddata, which I can get from combine_lang_model,
doesn't it?
On Tuesday, January 9, 2018 at 7:36:08 PM UTC+7, shree wrote:
>
>
>> My reason
On Wed, Jan 10, 2018 at 3:56 PM, wrote:
> It works!!
> I modified your bash script and executed it. Finally I get a different
> traineddata size.
>
> But can I train it from scratch?
> It needs a starting traineddata, which I can get from combine_lang_model,
> doesn't it?
>
>
Starter traineddata will
Here is my code:
string text = "";
string tessDataPath = ConfigurationManager.AppSettings["TessPath"];
using (var engine = new TessBaseAPI(@tessDataPath, @"eng"))
{
    engine.SetVariable("tessedit_ocr_engine_mode", "0");
    engine.SetPageSegMode(PageSegmentationMode.SINGLE_LINE);
    engine.SetV
Just updated again to use Tesseract 4.00 fast data.
On Monday, January 8, 2018 at 5:16:50 PM UTC-6, Quan Nguyen wrote:
>
> Just updated the alpha versions with latest Tesseract 4.00alpha
> executables.
>
> https://sourceforge.net/projects/vietocr/files/
>
> On Monday, April 3, 2017 at 6:26:37 AM
I am trying to solve a similar problem, that of reading forms. Tesseract 4
is doing well but is DROPPING lots of words within boxes. I thought this
problem of dropping words existed with Indic languages but here I am having
this issue for English too!
I tried to fool around with some paramet
On Wed, Jan 10, 2018 at 8:07 PM, Afreen Ferdoash
wrote:
> I am trying to solve a similar problem, that of reading forms. Tesseract
> 4 is doing well but is DROPPING lots of words within boxes. I thought
> this problem of dropping words existed with Indic languages but here I am
> having this i
Hi guys, I am working on some degraded text images (Japanese). Is there
any way to adjust the training set for degraded images? And should I do this?
Regard
--
You received this message because you are subscribed to the Google Groups
"tesseract-ocr" group.
To unsubscribe from this group and st
It is still not making any difference.
On Wednesday, January 10, 2018 at 9:27:20 PM UTC+5:30, shree wrote:
>
>
> On Wed, Jan 10, 2018 at 8:07 PM, Afreen Ferdoash wrote:
>
>> I am trying to solve a similar problem, that of reading forms. Tesseract
>> 4 is doing well but is DROPPING lots of wor
Hi,
Just stumbled on this forum while looking for answers as to why the
Tesseract Demo on the site would fail with my images (using a very similar
approach of single digits in images, etc.).
Found that scaling the image height by 50% worked a charm, thanks!! Never
thought to do that!! Also cropp
Hi Shree,
The box file you uploaded as an attachment seems to contradict the
LSTM 4.0 training tutorial guidelines, which state that the boxes
should be at line level instead of at character level. Please do
correct me if I am wrong. I still am not able to understand ho