On Thursday, November 30, 2023 at 11:11:08 AM UTC-5 scott...@gmail.com 
wrote:

I'm running an image through Tesseract via a PHP library (
https://github.com/thiagoalessio/tesseract-ocr-for-php).


A lot of useful information is missing (e.g. the Tesseract version), but 
fortunately this is easily reproducible with the current development 
version.
 

The output seems to contain two potential matches for a single character.


That's not what's happening. It's actually recognizing two separate 
characters there (a 0 followed by an O), although I'm not sure why. The 
engine does consider the correct string, but the incorrect string scores 
better. I'm not familiar enough with the internals to interpret it fully, 
but the debug output is below in case someone else wants to give it a go. 
As you can see, it considers both P01.01 and P0O1.01, but picks the latter 
because its path total rating is marginally lower (9.1921 versus 10.0694).
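
One thing that can be read off the log without knowing the internals: the 
"Path total rating" lines appear to be the plain sum of the per-character 
r= values, and the lower total is the one reported as the best choice. The 
short Python sketch below checks that arithmetic using the numbers copied 
from the "Second choice path" section of the debug output. Both the "sum of 
r values" and the "lowest total wins" rules are my assumptions from reading 
this one log, not a statement about how the Tesseract code works.

```python
# Per-character ratings (the r= values) copied from the debug output below.
paths = {
    "P01.01": [1.12845, 3.98936, 1.86124, 0.765426, 0.964568, 1.36033],
    "P0O1.01": [1.12845, 1.91988, 1.8292, 1.22425, 0.765426, 0.964568, 1.36033],
}

# Assumption: a path's total rating is the plain sum of its r values.
totals = {text: sum(ratings) for text, ratings in paths.items()}

# Assumption: the path with the lowest total rating is picked as best choice.
best = min(totals, key=totals.get)

for text, total in totals.items():
    print(f"{text}: {total:.4f}")
print("best:", best)
```

The totals come out to the same 10.0694 and 9.1921 printed in the log. The 
extra O costs 1.8292 on its own, but in that path the 0 and the first 1 are 
rated 1.91988 and 1.22425 instead of 3.98936 and 1.86124, so the longer 
string still ends up with the lower total.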

Tom

Processing word with lang eng at:Bounding box=(21,46)->(143,75)
Trying word using lang eng, oem 1
Created window Convolve of size 530, 1700
Created window ConvNL of size 530, 2000
Created window Lfys64 of size 530, 2000
Created window Lfx96 of size 530, 1534
Created window Lrx96 of size 530, 1534
Created window Lfx512 of size 530, 2000
Created window Output of size 530, 1761
Created window LSTMForward of size 1418, 580
<null>=110 On [0, 2), scores= 100(i=83=0.00155) 100(P=28=0.00331), 
Mean=99.9928, max=99.9963
P=28 On [2, 8), scores= 26.8(<null>=110=69) 97.1(p=103=2.5) 
90.3(<null>=110=9.35) 3.18(<null>=110=96.8) 4.89e-05(<null>=110=100) 
6.7e-05(<null>=110=92.2), Mean=36.2217, max=97.0573
0=33 On [8, 9), scores= 67(O=21=31.2), Mean=67.0461, max=67.0461
O=21 On [9, 13), scores= 56.8(0=33=42.5) 10.5(<null>=110=82) 
1.01e-05(<null>=110=100) 7.72e-06(<null>=110=100), Mean=16.8307, max=56.8101
1=34 On [13, 18), scores= 64.2(<null>=110=34.6) 99.4(l=87=0.559) 
98.5(<null>=110=1.46) 5.78(<null>=110=94.2) 1.12e-05(<null>=110=100), 
Mean=53.576, max=99.3538
.=23 On [18, 23), scores= 83.1(<null>=110=16.9) 100(,=15=0.00592) 
94.8(<null>=110=5.24) 0.0434(<null>=110=100) 2.41e-07(<null>=110=100), 
Mean=55.5837, max=99.9923
0=33 On [23, 28), scores= 52.6(<null>=110=47.4) 99.8(O=21=0.124) 
82(<null>=110=17.9) 0.000255(<null>=110=100) 3.13e-11(<null>=110=100), 
Mean=46.8875, max=99.8451
1=34 On [28, 33), scores= 46.1(<null>=110=53.8) 99.8(l=87=0.0986) 
99.1(<null>=110=0.894) 0.586(<null>=110=99.4) 8.5e-09(<null>=110=100), 
Mean=49.1089, max=99.8028
0 null_char score=-0.191493, c=-0.191493, perm=2, hash=0
1 null_char score=-0.382826, c=-0.191333, perm=2, hash=0 prev:null_char 
score=-0.191493, c=-0.191493, perm=2, hash=0
2 label=28, uid=30=P [50 ]A score=-0.66966, c=-0.286834, perm=2, hash=1c 
prev:null_char score=-0.382826, c=-0.191333, perm=2, hash=0
3 label=28, uid=30=P [50 ]A score=-0.928113, c=-0.258453, perm=2, hash=1c 
prev:label=28, uid=30=P [50 ]A score=-0.66966, c=-0.286834, perm=2, hash=1c
4 label=28, uid=30=P [50 ]A score=-1.12845, c=-0.200338, perm=2, hash=1c 
prev:label=28, uid=30=P [50 ]A score=-0.928113, c=-0.258453, perm=2, hash=1c
5 null_char score=-1.39284, c=-0.264391, perm=2, hash=1c prev:label=28, 
uid=30=P [50 ]A score=-1.12845, c=-0.200338, perm=2, hash=1c
6 null_char score=-1.58412, c=-0.191278, perm=2, hash=1c prev:null_char 
score=-1.39284, c=-0.264391, perm=2, hash=1c
7 null_char score=-1.95755, c=-0.373434, perm=2, hash=1c prev:null_char 
score=-1.58412, c=-0.191278, perm=2, hash=1c
8 label=33, uid=35=0 [30 ]0 score=-3.04833, c=-1.09078, perm=2, hash=c45 
prev:null_char score=-1.95755, c=-0.373434, perm=2, hash=1c
9 label=21, uid=23=O [4f ]A score=-4.51186, c=-1.46353, perm=2, hash=55200 
prev:label=33, uid=35=0 [30 ]0 score=-3.04833, c=-1.09078, perm=2, hash=c45
10 label=21, uid=23=O [4f ]A score=-4.87753, c=-0.365671, perm=2, 
hash=55200 prev:label=21, uid=23=O [4f ]A score=-4.51186, c=-1.46353, 
perm=2, hash=55200
11 null_char score=-5.06878, c=-0.191256, perm=2, hash=55200 prev:label=21, 
uid=23=O [4f ]A score=-4.87753, c=-0.365671, perm=2, hash=55200
12 null_char score=-5.26093, c=-0.192142, perm=2, hash=55200 prev:null_char 
score=-5.06878, c=-0.191256, perm=2, hash=55200
13 label=34, uid=36=1 [31 ]0 score=-5.47957, c=-0.218643, perm=2, 
hash=24e8e22 prev:null_char score=-5.26093, c=-0.192142, perm=2, hash=55200
14 label=34, uid=36=1 [31 ]0 score=-5.68541, c=-0.205837, perm=2, 
hash=24e8e22 prev:label=34, uid=36=1 [31 ]0 score=-5.47957, c=-0.218643, 
perm=2, hash=24e8e22
15 label=34, uid=36=1 [31 ]0 score=-5.91023, c=-0.224826, perm=2, 
hash=24e8e22 prev:label=34, uid=36=1 [31 ]0 score=-5.68541, c=-0.205837, 
perm=2, hash=24e8e22
16 label=34, uid=36=1 [31 ]0 score=-6.10178, c=-0.191543, perm=2, 
hash=24e8e22 prev:label=34, uid=36=1 [31 ]0 score=-5.91023, c=-0.224826, 
perm=2, hash=24e8e22
17 null_char score=-6.29319, c=-0.191418, perm=2, hash=24e8e22 
prev:label=34, uid=36=1 [31 ]0 score=-6.10178, c=-0.191543, perm=2, 
hash=24e8e22
18 label=23, uid=25=. [2e ]p score=-6.48452, c=-0.191326, perm=2, 
hash=1000fa0d5 prev:null_char score=-6.29319, c=-0.191418, perm=2, 
hash=24e8e22
19 label=23, uid=25=. [2e ]p score=-6.67594, c=-0.191424, perm=2, 
hash=1000fa0d5 prev:label=23, uid=25=. [2e ]p score=-6.48452, c=-0.191326, 
perm=2, hash=1000fa0d5
20 label=23, uid=25=. [2e ]p score=-6.8672, c=-0.191259, perm=2, 
hash=1000fa0d5 prev:label=23, uid=25=. [2e ]p score=-6.67594, c=-0.191424, 
perm=2, hash=1000fa0d5
21 null_char score=-7.05943, c=-0.192229, perm=2, hash=1000fa0d5 
prev:label=23, uid=25=. [2e ]p score=-6.8672, c=-0.191259, perm=2, 
hash=1000fa0d5
22 null_char score=-7.2507, c=-0.191266, perm=2, hash=1000fa0d5 
prev:null_char score=-7.05943, c=-0.192229, perm=2, hash=1000fa0d5
23 label=33, uid=35=0 [30 ]0 score=-7.44305, c=-0.192357, perm=2, 
hash=6f06c6bc7c prev:null_char score=-7.2507, c=-0.191266, perm=2, 
hash=1000fa0d5
24 label=33, uid=35=0 [30 ]0 score=-7.63779, c=-0.194738, perm=2, 
hash=6f06c6bc7c prev:label=33, uid=35=0 [30 ]0 score=-7.44305, c=-0.192357, 
perm=2, hash=6f06c6bc7c
25 label=33, uid=35=0 [30 ]0 score=-7.83177, c=-0.193978, perm=2, 
hash=6f06c6bc7c prev:label=33, uid=35=0 [30 ]0 score=-7.63779, c=-0.194738, 
perm=2, hash=6f06c6bc7c
26 null_char score=-8.02303, c=-0.19126, perm=2, hash=6f06c6bc7c 
prev:label=33, uid=35=0 [30 ]0 score=-7.83177, c=-0.193978, perm=2, 
hash=6f06c6bc7c
27 null_char score=-8.21431, c=-0.191279, perm=2, hash=6f06c6bc7c 
prev:null_char score=-8.02303, c=-0.19126, perm=2, hash=6f06c6bc7c
28 label=34, uid=36=1 [31 ]0 score=-8.40869, c=-0.194379, perm=2, 
hash=3023f02bb9e6 prev:null_char score=-8.21431, c=-0.191279, perm=2, 
hash=6f06c6bc7c
29 label=34, uid=36=1 [31 ]0 score=-8.60438, c=-0.195692, perm=2, 
hash=3023f02bb9e6 prev:label=34, uid=36=1 [31 ]0 score=-8.40869, 
c=-0.194379, perm=2, hash=3023f02bb9e6
30 label=34, uid=36=1 [31 ]0 score=-8.79638, c=-0.192, perm=2, 
hash=3023f02bb9e6 prev:label=34, uid=36=1 [31 ]0 score=-8.60438, 
c=-0.195692, perm=2, hash=3023f02bb9e6
31 null_char score=-9.00085, c=-0.204469, perm=2, hash=3023f02bb9e6 
prev:label=34, uid=36=1 [31 ]0 score=-8.79638, c=-0.192, perm=2, 
hash=3023f02bb9e6
32 null_char score=-9.1921, c=-0.191251, perm=2, hash=3023f02bb9e6 
prev:null_char score=-9.00085, c=-0.204469, perm=2, hash=3023f02bb9e6

Second choice path:
2 30=P [50 ]A r=1.12845, c=-0.286834, s=0, e=0, perm=2
8 35=0 [30 ]0 r=3.98936, c=-2.11872, s=0, e=0, perm=2
13 36=1 [31 ]0 r=1.86124, c=-0.636992, s=0, e=0, perm=2
18 25=. [2e ]p r=0.765426, c=-0.191424, s=0, e=0, perm=2
23 35=0 [30 ]0 r=0.964568, c=-0.194738, s=0, e=0, perm=2
28 36=1 [31 ]0 r=1.36033, c=-0.204469, s=0, e=0, perm=2
Path total rating = 10.0694
2 30=P [50 ]A r=1.12845, c=-0.286834, s=0, e=0, perm=2
8 35=0 [30 ]0 r=1.91988, c=-1.09078, s=0, e=0, perm=2
9 23=O [4f ]A r=1.8292, c=-1.46353, s=0, e=0, perm=2
13 36=1 [31 ]0 r=1.22425, c=-0.224826, s=0, e=0, perm=2
18 25=. [2e ]p r=0.765426, c=-0.191424, s=0, e=0, perm=2
23 35=0 [30 ]0 r=0.964568, c=-0.194738, s=0, e=0, perm=2
28 36=1 [31 ]0 r=1.36033, c=-0.204469, s=0, e=0, perm=2
Path total rating = 9.1921
Best choice: accepted=0, adaptable=0, done=1 : Lang result : P0O1.01 : 
R=9.1921, C=-10.2447, F=1, Perm=2, xht=[0,3.40282e+38], ambig=0
pos NORM NORM NORM NORM NORM NORM NORM
str P 0 O 1 . 0 1
state: 1 1 1 1 1 1 1 
C -0.287 -1.091 -1.464 -0.225 -0.191 -0.195 -0.204
1 new words better than 0 old words: r: 9.1921 v 0 c: -10.2447 v 0 valid 
dict: 0 v 0


