Hello,
We have open sourced (BSD license) an MRZ/MRP (machine-readable zone/passport)
dataset and models for Tesseract v4.
The dataset contains more than 7 thousand images (.tif) with ground truth
(.gt.txt) from Google Images, augmented with a small amount of synthetic data.
It's ready to be used to train with T
n(detector, recognizer). These 1376 images can't be
used directly with Tesseract and require a detector and preprocessor.
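
For a dataset laid out as described above (each .tif image with a .gt.txt transcription), a quick sanity check before training is to confirm that every image has its ground truth file. The following is only a minimal sketch: the function name `find_training_pairs` is hypothetical, and it assumes both files share a base name and sit in the same folder.

```python
from pathlib import Path


def find_training_pairs(dataset_dir):
    """Return (pairs, missing) for a folder of line images and transcriptions.

    pairs   -- list of (image_path, ground_truth_path) tuples
    missing -- list of .tif images that have no matching .gt.txt file
    """
    root = Path(dataset_dir)
    pairs, missing = [], []
    for image in sorted(root.glob("*.tif")):
        # Assumption: "sample.tif" sits next to "sample.gt.txt".
        ground_truth = image.with_name(image.stem + ".gt.txt")
        if ground_truth.is_file():
            pairs.append((image, ground_truth))
        else:
            missing.append(image)
    return pairs, missing
```

Calling `find_training_pairs("path/to/dataset")` gives the usable pairs plus the images that still need a transcription, which is worth fixing before starting a training run.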
On Wednesday, May 29, 2019 at 10:08:53 AM UTC+2, Lorenzo Blz wrote:
>
> Hi Mamadou,
> this sounds very interesting. How did you do the training and accuracy
> measur
Hello,
We've open sourced (BSD 3-Clause License) our MICR dataset and
*.traineddata for Tesseract v4.
This was developed as an internal R&D project and never went to production
as we ended up using TensorFlow.
Even as a PoC it's already more accurate than many commercial products. The
repo conta
I don't know if it's appropriate or not. Please tell me if
> it's not.
>
> On Friday, August 9, 2019 at 4:17:41 PM UTC+9, Mamadou wrote:
>>
>>
>>
>> On Friday, August 9, 2019 at 7:31:03 AM UTC+2, ElGato ElMago wrote:
>>>
>>> Here's my sharing on Git
Hello,
Are you planning to release the dataset or models?
I'm working on the same subject and planning to share both under BSD terms
On Tuesday, August 6, 2019 at 10:11:40 AM UTC+2, ElGato ElMago wrote:
>
> Hi,
>
> FWIW, I got to the point where I can feel happy with the accuracy. As the
> images
f Shree's text and mine. The
> instructions and tools I used already exist.
>
If you have a Github account just create a repo and publish the data and
instructions.
>
> ElMagoElGato
>
> On Wednesday, August 7, 2019 at 8:20:02 AM UTC+9, Mamadou wrote:
>
>> Hello,
>> Are you planning
On Wednesday, August 7, 2019 at 4:10:44 PM UTC+2, Cristobal Jesus Muñoz
Solano wrote:
>
> Hello, I have already tried mrz.traineddata. It is quite good, but it is not
> accurate enough. How can I improve it? Is it possible to use box / png
> files to improve its accuracy?
>
mrz.traineddata
e bit.
>> Will be out there soon.
>>
>> On Wednesday, August 7, 2019 at 9:11:01 PM UTC+9, Mamadou wrote:
>>>
>>>
>>>
>>> On Wednesday, August 7, 2019 at 2:36:52 AM UTC+2, ElGato ElMago wrote:
>>>>
>>>> Hi,
>>>>
>>>> I'
share our dataset (real life samples) in the coming days.
>
> On Friday, August 9, 2019 at 4:17:41 PM UTC+9, Mamadou wrote:
>>
>>
>>
>> On Friday, August 9, 2019 at 7:31:03 AM UTC+2, ElGato ElMago wrote:
>>>
>>> Here's my sharing on GitHub. Hope it's of any use
You can open a ticket on our issue tracker (
https://github.com/DoubangoTelecom/tesseractMICR/issues) and we will add it to
the roadmap for the coming days
On Thursday, March 12, 2020 at 10:16:54 AM UTC+1, haytham Arori wrote:
>
> Hi to all,
>
> I want to know if anyone has the .train data file for CMC
The easiest way to train the MICR CMC-7 font for Tesseract would be to use OCR-D
(https://github.com/OCR-D/ocrd-train). This is what we've used in our R&D
project (https://github.com/DoubangoTelecom/tesseractMICR). We open sourced
the MICR E-13B traineddata but not the CMC-7. We're not using these mo
les you're attaching won't help. You need thousands of
samples for training. In our case we have 17k samples to train with TensorFlow.
Try web scraping to collect real life samples instead of using synthetic
data.
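
One rough way to judge whether a collected set is large enough is to count how often each character appears in the transcriptions, since a symbol seen only a handful of times is unlikely to be learned well. This is only a sketch: it assumes the samples are stored with .gt.txt ground truth files as in Tesseract training sets, the helper names are hypothetical, and the threshold of 100 is an arbitrary placeholder rather than a recommended value.

```python
from collections import Counter
from pathlib import Path


def character_counts(dataset_dir):
    """Count how often each character appears across all .gt.txt files."""
    counts = Counter()
    for gt_file in Path(dataset_dir).glob("*.gt.txt"):
        text = gt_file.read_text(encoding="utf-8")
        # Whitespace (line breaks, spaces) is skipped; only visible glyphs count.
        counts.update(ch for ch in text if not ch.isspace())
    return counts


def rare_characters(counts, min_occurrences=100):
    """Return (character, count) pairs below min_occurrences, rarest first."""
    rare = [(ch, n) for ch, n in counts.items() if n < min_occurrences]
    # Sort by count, then by character, so the result is deterministic.
    return sorted(rare, key=lambda item: (item[1], item[0]))
```

Running `rare_characters(character_counts("path/to/dataset"))` lists the symbols that would benefit most from additional real-life samples.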
On Friday, April 3, 2020 at 7:11:01 PM UTC+2, Ghada Aruri wrote:
>
> hi
an online webapp to check the accuracy at
https://www.doubango.org/webapps/micr/
On Saturday, April 4, 2020 at 11:59:34 AM UTC+2, Essam Zaky wrote:
>
> Hi @mamadou
>
> How did you collect the 17,000 images? Are they real images?
> Also, which type of TensorFlow models do you use