On Friday, August 9, 2019 at 7:31:03 AM UTC+2, ElGato ElMago wrote:
>
> Here's what I've shared on GitHub. I hope it's of some use to somebody.
>
> https://github.com/ElMagoElGato/tess_e13b_training
>
Thanks for sharing your experience with us.
Is it possible to share your Tesseract model (xxx.trainedda
I added eng.traineddata and LICENSE. I used my account name in the license
file. I don't know if it's appropriate or not. Please tell me if it's not.
On Friday, August 9, 2019 at 4:17:41 PM UTC+9, Mamadou wrote:
>
>
>
> On Friday, August 9, 2019 at 7:31:03 AM UTC+2, ElGato ElMago wrote:
>>
>> Here's my sharing on Git
On Friday, August 9, 2019 at 10:40:15 AM UTC+2, ElGato ElMago wrote:
>
> I added eng.traineddata and LICENSE. I used my account name in the
> license file. I don't know if it's appropriate or not. Please tell me if
> it's not.
>
It's ok.
Thanks. I'll share our dataset (real life samples) in
Try creating a multipage TIFF from your PDF and running Tesseract on that.
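
For anyone following along, here is a minimal sketch of that suggestion, assuming Ghostscript (`gs`) and Tesseract are both installed. The helper names and file paths are hypothetical and do not come from this thread, and Ghostscript is only one of several tools that can do the conversion. As far as I know, the Ghostscript TIFF devices write all pages into a single file when the output name contains no `%d`.

```shell
#!/bin/sh
# Sketch only: helper names and file paths are hypothetical.

# Derive a .tiff path from a .pdf path, e.g. docs/book.pdf -> docs/book.tiff
tiff_name_for() {
  printf '%s.tiff\n' "${1%.pdf}"
}

# Render every page of a PDF ($1) into one multipage grayscale TIFF ($2)
# at 300 dpi. Assumes Ghostscript is on the PATH.
pdf_to_multipage_tiff() {
  gs -q -dNOPAUSE -dBATCH -sDEVICE=tiffgray -r300 \
     -sOutputFile="$2" "$1"
}

# Convert the PDF, then OCR the resulting TIFF.
# Tesseract writes its text output to <outputbase>.txt.
ocr_pdf() {
  tiff=$(tiff_name_for "$1")
  pdf_to_multipage_tiff "$1" "$tiff" && tesseract "$tiff" "${1%.pdf}"
}
```

Calling `ocr_pdf Downloads/foundations-of-mathematics.pdf` would then be expected to print one "Page N" line per page while it runs.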
On Fri, 9 Aug 2019, 11:11 ilevy, wrote:
> I'm trying tesseract for the first time with a png of a multipage document
> I saved out of a pdf (which itself was just an image).
>
> When I run tesseract, I get an output of the first page, but that
I suggest renaming the traineddata file from eng.traineddata to
e13b.traineddata (or another similarly descriptive name) and adding a link
to it on the data file contributions wiki page.
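
A rough sketch of what the rename could look like on the user side, assuming the model is kept in the current directory. The helper name and the image file name are hypothetical; `-l` and `--tessdata-dir` are standard Tesseract options.

```shell
#!/bin/sh
# Sketch only: the helper name and file names are hypothetical.

# Copy a traineddata file under a more descriptive language name.
# $1 = existing file (e.g. ./eng.traineddata), $2 = new name (e.g. e13b)
# Prints the path of the new file on success.
rename_traineddata() {
  dir=$(dirname "$1")
  cp "$1" "$dir/$2.traineddata" && printf '%s\n' "$dir/$2.traineddata"
}

# Usage (not run here):
#   rename_traineddata ./eng.traineddata e13b
#   tesseract check.png out --tessdata-dir . -l e13b
```

A distinct name also avoids clashing with the stock English model if both files end up in the same tessdata directory.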
On Fri, 9 Aug 2019, 20:08 'Mamadou' via tesseract-ocr, <
tesseract-ocr@googlegroups.com> wrote:
>
>
> On Friday, August 9, 2019 at 10:
That's a good question. The png was exported from a pdf, so there may have
been some notion of pages encoded into it, but that's a guess. What I can
say is that the result is consistent. Running
    tesseract Downloads/foundations-of-mathematics.tiff foundations-of-mathematics
always yields the f
I exported a png from a pdf that seemed to be a scanned image of the
original text. I installed the latest tesseract and leptonica via Homebrew.
I then ran
    tesseract Downloads/foundations-of-mathematics.tiff foundations-of-mathematics
and it consistently outputs the first page only.
On Thursd
That worked, thank you very much Shree!
I could tell right away that it was working because it was writing to
stdout:
Tesseract Open Source OCR Engine v4.1.0 with Leptonica
Page 1
Page 2
Page 3
Page 4
Page 5
Page 6
Page 7
Page 8
Page 9
Page 10
Page 11
Page 12
Page 13
Detected 14 di