Use the TSV output, but you will still need to parse it to get line information.
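
A rough sketch of that parsing step, in C++ to match the API code in the question. It assumes the usual 12-column TSV layout (level, page_num, block_num, par_num, line_num, word_num, left, top, width, height, conf, text) in which level 5 rows are words. `TsvLine`, `splitTabs` and `groupTsvIntoLines` are names made up for this example, not part of Tesseract.

```cpp
#include <algorithm>
#include <cassert>
#include <map>
#include <sstream>
#include <string>
#include <tuple>
#include <vector>

// One reconstructed text line: union of its word boxes plus the joined text.
// (Hypothetical helper type, not a Tesseract structure.)
struct TsvLine {
    int left = 0, top = 0, right = 0, bottom = 0;  // right/bottom are exclusive
    double confSum = 0.0;                          // sum of word confidences
    int words = 0;                                 // number of words merged
    std::string text;

    int width() const { return right - left; }
    int height() const { return bottom - top; }
    double meanConf() const { return words ? confSum / words : 0.0; }
};

// Split one TSV row on tab characters.
static std::vector<std::string> splitTabs(const std::string& row) {
    std::vector<std::string> fields;
    std::string field;
    std::istringstream in(row);
    while (std::getline(in, field, '\t')) fields.push_back(field);
    return fields;
}

// Group word rows (level 5) by (page, block, paragraph, line) and merge them.
std::vector<TsvLine> groupTsvIntoLines(const std::string& tsv) {
    std::map<std::tuple<int, int, int, int>, TsvLine> lines;
    std::istringstream in(tsv);
    std::string row;
    while (std::getline(in, row)) {
        const std::vector<std::string> f = splitTabs(row);
        // Skip the header, non-word rows, and rows without a text field.
        if (f.size() < 12 || f[0] != "5") continue;

        const auto key = std::make_tuple(std::stoi(f[1]), std::stoi(f[2]),
                                         std::stoi(f[3]), std::stoi(f[4]));
        const int left = std::stoi(f[6]);
        const int top = std::stoi(f[7]);
        const int right = left + std::stoi(f[8]);
        const int bottom = top + std::stoi(f[9]);

        TsvLine& line = lines[key];
        if (line.words == 0) {
            // First word of this line: start from its box.
            line.left = left;
            line.top = top;
            line.right = right;
            line.bottom = bottom;
        } else {
            // Later words: grow the line box to cover them.
            line.left = std::min(line.left, left);
            line.top = std::min(line.top, top);
            line.right = std::max(line.right, right);
            line.bottom = std::max(line.bottom, bottom);
            line.text += ' ';
        }
        line.confSum += std::stod(f[10]);
        ++line.words;
        line.text += f[11];
    }

    // std::map iterates in key order, i.e. page, block, paragraph, line.
    std::vector<TsvLine> result;
    for (const auto& entry : lines) result.push_back(entry.second);
    return result;
}
```

The TSV text can come from `tesseract image.png out tsv` on the command line or from `api->GetTSVText(0)`. Each returned entry then carries x, y, w, h and the line text, much like the RIL_TEXTLINE loop prints. The confidence here is a plain average of the word confidences, so it may not match what MeanTextConf() reports for the same line.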

Zdenko


On Wed, 21 Apr 2021 at 16:38, Baris Unsal <[email protected]> wrote:

> I want it the other way around: getting RIL_TEXTLINE-like output by passing
> an argument to tesseract.
>
> On Wednesday, 21 April 2021 at 17:36:35 UTC+3 Quan Nguyen wrote:
>
>> I think it would need to operate at RIL_SYMBOL level, not RIL_TEXTLINE.
>>
>> On Wednesday, April 21, 2021 at 7:17:04 AM UTC-5 [email protected]
>> wrote:
>>
>>> Hi, when I pass the tessedit_create_boxfile 1 argument to tesseract, it
>>> outputs the location of each individual character. But when I use the API like this:
>>>
>>> ```
>>> // One box per detected text line (text components only).
>>> Boxa* boxes = api->GetComponentImages(tesseract::RIL_TEXTLINE, true, NULL, NULL);
>>> for (int i = 0; i < boxaGetCount(boxes); i++) {
>>>   BOX* box = boxaGetBox(boxes, i, L_CLONE);
>>>   // Restrict recognition to this line, then read its text and confidence.
>>>   api->SetRectangle(box->x, box->y, box->w, box->h);
>>>   char* outText = api->GetUTF8Text();
>>>   int conf = api->MeanTextConf();
>>>   fprintf(stdout, "Box[%d]: x=%d, y=%d, w=%d, h=%d, confidence: %d, text: %s",
>>>           i, box->x, box->y, box->w, box->h, conf, outText);
>>>   boxDestroy(&box);
>>>   delete[] outText;
>>> }
>>> boxaDestroy(&boxes);  // release the box array itself
>>> ```
>>> it outputs the whole line, like this:
>>> Box[1]: x=36, y=84, w=246, h=14, confidence: 44, text: #Spor #siyaset Fanket FIliskiler
>>>
>>> Is there any way to combine the individual character boxes so that they
>>> print like the API output? Thanks in advance.
>>>
>>> ### Environment
>>>
>>> * **Tesseract Version**:
>>> tesseract 4.1.1-rc2-25-g9707
>>>  leptonica-1.78.0
>>>   libgif 5.1.4 : libjpeg 6b (libjpeg-turbo 1.5.2) : libpng 1.6.36 :
>>> libtiff 4.1.0 : zlib 1.2.11 : libwebp 0.6.1 : libopenjp2 2.3.0
>>>  Found AVX2
>>>  Found AVX
>>>  Found FMA
>>>  Found SSE
>>>  Found libarchive 3.3.3 zlib/1.2.11 liblzma/5.2.4 bz2lib/1.0.6
>>> liblz4/1.8.3 libzstd/1.3.8
>>>
>>> * **Platform**:
>>> Linux pardus 4.19.0-13-amd64 #1 SMP Debian 4.19.160-2 (2020-11-28) x86_64 GNU/Linux
>>>
>> --
> You received this message because you are subscribed to the Google Groups
> "tesseract-ocr" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to [email protected].
> To view this discussion on the web visit
> https://groups.google.com/d/msgid/tesseract-ocr/8e54bc79-113a-4685-9bba-2353216dad2fn%40googlegroups.com
> <https://groups.google.com/d/msgid/tesseract-ocr/8e54bc79-113a-4685-9bba-2353216dad2fn%40googlegroups.com?utm_medium=email&utm_source=footer>
> .
>
