Another example:

https://www.kaggle.com/code/sreesankar711/table-transformer-demo

Zdenko


On Sat, 21 Dec 2024 at 19:37, Zdenko Podobny <zde...@gmail.com> wrote:

> Hi,
>
> have a look at this example:
> article:
> https://iamrajatroy.medium.com/document-intelligence-series-part-2-transformer-for-table-detection-extraction-80a52486fa3
> notebook:
> https://nbviewer.org/github/iamrajatroy/Data-Science-Lab/blob/main/notebook/DETR_Document_Intelligence.ipynb
>
> Zdenko
>
>
> On Sat, 21 Dec 2024 at 6:56, Riccardo <riccardo.degioan...@gmail.com>
> wrote:
>
>> Hello,
>> I am trying to use Tesseract to create a small Windows application that
>> allows the user to:
>>
>>    - Take a screenshot of the monitor and crop a smaller region
>>    containing a table (the table always has the same format and the same
>>    labels; only the numerical data differ each time).
>>    - Provide the screenshot to Tesseract to extract the data. My
>>    strategy is to remove the vertical and horizontal lines in the table,
>>    extract the entire text, and collect the numerical values corresponding to
>>    the labels I want to capture.
>>    - Finally, generate a text output based on the extracted data.
>>
>> The app works fine, but there are still many errors in data extraction.
>> Sometimes, some values are not extracted at all because the label is not
>> correctly recognized. Other times, even if the labels are recognized
>> correctly and the data are extracted, the numbers are incorrect. I also
>> noticed that the error rate is higher on my work PC, probably because its
>> screen resolution is lower than that of my home PC.
>>
>> I am wondering if there is a more reliable way to accomplish my goal.
>>
>> Below I have attached some images of the app to give you an idea, an
>> example of the table, and the Python script I am using for OCR.
>>
>> Thank you very much for your help!!!
>>
>> tesseract v5.4.0.20240606
>>
>> Python 3.13.1
>>
>> --
>> You received this message because you are subscribed to the Google Groups
>> "tesseract-ocr" group.
>> To unsubscribe from this group and stop receiving emails from it, send an
>> email to tesseract-ocr+unsubscr...@googlegroups.com.
>> To view this discussion visit
>> https://groups.google.com/d/msgid/tesseract-ocr/191d869f-9ff0-4297-b539-aad42fc3c1e3n%40googlegroups.com
>> <https://groups.google.com/d/msgid/tesseract-ocr/191d869f-9ff0-4297-b539-aad42fc3c1e3n%40googlegroups.com?utm_medium=email&utm_source=footer>
>> .
>>
>
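
The label-matching step described in the quoted message (collecting the number that belongs to each known label) could be made more tolerant of misread labels with fuzzy matching. The following is only a minimal sketch using the Python standard library, not the script from the original post. The label names in EXPECTED_LABELS and the function parse_table_text are hypothetical placeholders, and it assumes each OCR line holds one label followed by one value, with a comma read as a decimal separator.

```python
import difflib
import re

# Hypothetical labels: the real table's labels are not shown in the thread.
EXPECTED_LABELS = ["Temperature", "Pressure", "Flow rate"]

# Integer or decimal number, accepting "." or "," as the decimal separator.
NUMBER_RE = re.compile(r"[-+]?\d+(?:[.,]\d+)?")


def parse_table_text(ocr_text, labels=EXPECTED_LABELS, cutoff=0.7):
    """Map each expected label to the last number found on its OCR line."""
    by_lower = {label.lower(): label for label in labels}
    values = {}
    for line in ocr_text.splitlines():
        matches = list(NUMBER_RE.finditer(line))
        if not matches:
            continue
        # Use the LAST number on the line, so that a digit misread inside
        # the label (e.g. "F1ow") is not mistaken for the value.
        number = matches[-1]
        raw_label = line[:number.start()].strip(" :|\t").lower()
        # Accept the closest known label if it is similar enough.
        close = difflib.get_close_matches(raw_label, list(by_lower), n=1, cutoff=cutoff)
        if close and by_lower[close[0]] not in values:
            # Assumption: "," is a decimal separator, not a thousands separator.
            values[by_lower[close[0]]] = float(number.group().replace(",", "."))
    return values
```

Passing the raw text returned by the OCR step to parse_table_text yields a dictionary of label to value. Labels missing from the result were not recognized closely enough and can be flagged for manual review. The cutoff of 0.7 is an arbitrary starting point and would need tuning against real screenshots.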

