Another example: https://www.kaggle.com/code/sreesankar711/table-transformer-demo

Zdenko

On Sat, 21 Dec 2024 at 19:37, Zdenko Podobny <zde...@gmail.com> wrote:

> Hi,
>
> have a look at this example:
> article:
> https://iamrajatroy.medium.com/document-intelligence-series-part-2-transformer-for-table-detection-extraction-80a52486fa3
> notebook:
> https://nbviewer.org/github/iamrajatroy/Data-Science-Lab/blob/main/notebook/DETR_Document_Intelligence.ipynb
>
> Zdenko
>
>
> On Sat, 21 Dec 2024 at 6:56, Riccardo <riccardo.degioan...@gmail.com>
> wrote:
>
>> Hello,
>> I am trying to use Tesseract to create a small Windows application that
>> allows the user to:
>>
>>    - Take a screenshot of the monitor and crop a smaller portion
>>    containing a table (the table always has the same format, and the labels
>>    are consistent; the numerical data are different each time).
>>    - Provide the screenshot to Tesseract to extract the data. My
>>    strategy is to remove the vertical and horizontal lines in the table,
>>    extract the entire text, and collect the numerical values corresponding to
>>    the labels I want to capture.
>>    - Finally, generate a text output based on the extracted data.
>>
>> The app works fine, but there are still many errors in data extraction.
>> Sometimes, some values are not extracted at all because the label is not
>> correctly recognized. Other times, even if the labels are recognized
>> correctly and the data are extracted, the numbers are incorrect. I also
>> noticed that the error rate is higher on my work PC, probably because its
>> screen resolution is lower than that of my home PC.
>>
>> I am wondering if there is a more reliable way to accomplish my goal.
>>
>> Below I have attached some images of the app to give you an idea, an
>> example of the table, and the Python script I am using for OCR.
>>
>> Thank you very much for your help!!!
>>
>> tesseract v5.4.0.20240606
>>
>> Python 3.13.1
>>
>> --
>> You received this message because you are subscribed to the Google Groups
>> "tesseract-ocr" group.
>> To unsubscribe from this group and stop receiving emails from it, send an
>> email to tesseract-ocr+unsubscr...@googlegroups.com.
>> To view this discussion visit
>> https://groups.google.com/d/msgid/tesseract-ocr/191d869f-9ff0-4297-b539-aad42fc3c1e3n%40googlegroups.com
>> .
>>
>
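
Regarding the values that go missing because a label is misread: since the labels are fixed, one option is to match each OCR'd label against the known list approximately instead of exactly. The sketch below is only an illustration, not the script from the original post. The label names, the `parse_ocr_lines` helper and the 0.7 cutoff are all made up for the example, and it assumes one label and one value per OCR line with no thousands separators. It uses only the Python standard library and would run on whatever text Tesseract returns.

```python
import difflib
import re

# Hypothetical label set: the real table's labels are not shown in the thread.
EXPECTED_LABELS = ["Temperature", "Pressure", "Flow rate"]

# Integer or decimal number, with either "." or "," as the decimal separator.
NUMBER_RE = re.compile(r"[-+]?\d+(?:[.,]\d+)?")


def parse_ocr_lines(text, labels=EXPECTED_LABELS, cutoff=0.7):
    """Map each expected label to the last number found on its OCR line."""
    results = {}
    for line in text.splitlines():
        numbers = list(NUMBER_RE.finditer(line))
        if not numbers:
            continue
        # Use the last number on the line, so a digit misread inside the
        # label (e.g. "F1ow") is not mistaken for the value.
        value = numbers[-1]
        raw_label = line[:value.start()].strip(" :|\t")
        if not raw_label:
            continue
        # Approximate match against the known labels (case-sensitive).
        close = difflib.get_close_matches(raw_label, labels, n=1, cutoff=cutoff)
        if close and close[0] not in results:
            results[close[0]] = float(value.group().replace(",", "."))
    return results


if __name__ == "__main__":
    sample = "Ternperature 23,5\nPressvre: 1013\nF1ow rate | 4.2\nnoise line\n"
    print(parse_ocr_lines(sample))
```

This only helps with misread labels. It does nothing for digits that are themselves recognized wrongly, so image quality (for example the lower screen resolution mentioned above) still matters.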