Are you trying to train for the legacy tesseract engine?

On Sun, Nov 1, 2020, 03:29 Cailey McVay <cailey.m.mcvay...@dartmouth.edu> wrote:
> Hello!
> I am working on a project that is trying to read borehole video depths. We
> trained a new language to read these numbers called NTS. When we use
> tesseract on the images without the trained language we receive outputs
> that are accurate about 50% of the time. However when we use the new
> language, we receive no output at all. Is it possible that we overtrained
> tesseract to not recognize any of the images? I will attach below our box
> file, unicharset file, box trained file, pffmtable file, and normproto
> file. Our shapetable file processes but then returns an empty file. Could
> something be wrong with our shapetable? And if so, how could we fix that?
>
> Box File for the first five images:
> 0 3 1 14 19 0
> 9 18 0 29 20 0
> 3 33 1 46 19 0
> . 50 1 56 19 0
> 2 64 1 75 19 0
> 5 76 1 93 19 0
> 2 92 1 111 19 0
> 0 4 1 15 19 1
> 8 19 1 30 19 1
> 3 34 1 46 19 1
> . 54 1 57 5 1
> 4 65 1 77 19 1
> 1 82 1 91 19 1
> 4 96 1 107 19 1
> 0 3 1 15 19 2
> 8 19 1 30 19 2
> 6 34 1 46 19 2
> . 53 1 57 5 2
> 8 65 1 77 19 2
> 3 80 1 91 19 2
> 9 95 1 107 19 2
> 0 4 1 15 19 3
> 8 17 1 31 19 3
> 8 32 1 46 19 3
> . 52 2 58 8 3
> 1 64 0 77 20 3
> 8 80 1 91 19 3
> 5 96 1 107 19 3
> 0 3 1 15 19 4
> 8 19 1 30 19 4
> 7 34 1 47 19 4
> . 53 1 58 9 4
> 5 65 1 77 19 4
> 6 80 1 92 19 4
> 4 95 0 109 20 4
> 0 4 1 15 19 5
> 7 19 1 30 19 5
> 5 34 1 46 19 5
> . 53 1 57 5 5
> 3 65 1 76 19 5
> 1 82 1 90 19 5
> 3 96 1 107 19 5
>
>
> Unicharset:
> 14
> NULL 0 Common 0
> Joined 7 0,255,0,255,0,0,0,0,0,0 Latin 1 0 1 Joined # Joined [4a 6f 69 6e 65 64 ]a
> |Broken|0|1 21 0,255,0,255,0,0,0,0,0,0 Common 2 10 2 |Broken|0|1 # Broken
> 0 8 0,255,0,255,0,0,0,0,0,0 Common 3 2 3 0 # 0 [30 ]0
> 9 8 0,255,0,255,0,0,0,0,0,0 Common 4 2 4 9 # 9 [39 ]0
> 3 8 0,255,0,255,0,0,0,0,0,0 Common 5 2 5 3 # 3 [33 ]0
> . 22 0,255,0,255,0,0,0,0,0,0 Common 6 6 6 . # . [2e ]p
> 2 8 0,255,0,255,0,0,0,0,0,0 Common 7 2 7 2 # 2 [32 ]0
> 5 8 0,255,0,255,0,0,0,0,0,0 Common 8 2 8 5 # 5 [35 ]0
> 8 8 0,255,0,255,0,0,0,0,0,0 Common 9 2 9 8 # 8 [38 ]0
> 4 8 0,255,0,255,0,0,0,0,0,0 Common 10 2 10 4 # 4 [34 ]0
> 1 8 0,255,0,255,0,0,0,0,0,0 Common 11 2 11 1 # 1 [31 ]0
> 6 8 0,255,0,255,0,0,0,0,0,0 Common 12 2 12 6 # 6 [36 ]0
> 7 8 0,255,0,255,0,0,0,0,0,0 Common 13 2 13 7 # 7 [37 ]0
>
>
> NTS.font.exp0.tr file:
> font 0 3 1 14 19 0
> 4
> mf 16
> -0.085041896 0.30783021 0.27617577 0 0 0
> -0.25234067 0.27376649 0.089746617 0.13718249 0 0
> -0.28155157 0.0045010448 0.47040343 0.25 0 0
> -0.25234067 -0.26476437 0.08974655 0.36281759 0 0
> -0.085041896 -0.29882804 0.27617577 0.5 0 0
> -0.031931162 -0.21447986 0.1730229 0.96998096 0 0
> -0.11690831 0.020721853 0.43796182 0.75 0 0
> -0.031931162 0.23970276 0.1699543 0.5 0 0
> 0.24424461 0.072628468 0.47339222 0.76789355 0 0
> 0.1353676 0.30783021 0.16464323 0 0 0
> 0.10615671 0.18941826 0.14627755 0.37934926 0 0
> 0.15926743 -0.011719763 0.30170703 0.25 0 0
> 0.10615671 -0.19663697 0.12619166 0.090763755 0 0
> 0.1353676 -0.29882804 0.16464323 0.5 0 0
> 0.27079996 -0.26476437 0.12619169 0.59076369 0 0
> 0.29735535 -0.19663697 0.086383387 0.85538673 0 0
> cn 1
> 0.36328125 0.35781249 0.2421875 0.1484375
> if 73
> 133 69 248
> 119 72 248
> 104 75 248
> 97 82 192
> 97 95 192
> 97 107 192
> 97 120 192
> 97 132 192
> 97 145 192
> 97 157 192
> 97 170 192
> 97 182 192
> 104 188 128
> 119 188 128
> 133 188 128
> 135 206 0
> 123 206 0
> 111 206 0
> 99 206 0
> 88 206 0
> 76 206 0
> 66 201 35
> 59 193 35
> 55 182 64
> 55 168 64
> 55 155 64
> 55 142 64
> 55 128 64
> 55 115 64
> 55 101 64
> 55 88 64
> 55 75 64
> 59 64 93
> 66 55 93
> 76 51 128
> 88 51 128
> 99 51 128
> 111 51 128
> 123 51 128
> 135 51 128
> 145 184 97
> 154 175 97
> 163 167 97
> 168 156 64
> 168 143 64
> 168 130 64
> 168 118 64
> 168 105 64
> 168 92 64
> 163 82 23
> 154 77 23
> 145 71 23
> 148 51 128
> 162 51 128
> 176 51 128
> 187 53 151
> 196 59 151
> 205 65 151
> 207 72 219
> 200 81 219
> 196 92 192
> 196 105 192
> 196 118 192
> 196 130 192
> 196 143 192
> 196 156 192
> 195 168 204
> 191 179 204
> 188 190 204
> 184 200 204
> 176 206 0
> 162 206 0
> 148 206 0
> tb 1
> 64 251 114
>
>
> pffmtable:
> NULL 0
> Joined 0
> |Broken|0|1 0
> 0 0
> 9 0
> 3 0
> . 0
> 2 0
> 5 0
> 8 0
> 4 0
> 1 0
> 6 0
> 7 0
>
> NTS.normproto file:
> linear essential -0.250000 0.750000
> linear non-essential 0.000000 1.000000
> linear essential 0.000000 1.000000
> linear essential 0.000000 1.000000
>
> 0 1
> significant elliptical 34
> 0.364775 0.371404 0.241039 0.150391
> 0.000400 0.000416 0.000400 0.000400
>
> 9 1
> significant elliptical 13
> 0.372897 0.418750 0.241286 0.157752
> 0.000400 0.004734 0.000400 0.001087
>
> 3 1
> significant elliptical 16
> 0.365479 0.385596 0.247070 0.143799
> 0.000400 0.003148 0.000400 0.000702
>
> . 1
> significant elliptical 27
> 0.081019 0.055483 0.060619 0.050492
> 0.000400 0.000400 0.000400 0.000400
>
> 2 1
> significant elliptical 10
> 0.354297 0.359492 0.248828 0.138672
> 0.000400 0.000400 0.000400 0.000400
>
> 5 1
> significant elliptical 10
> 0.363672 0.350859 0.248047 0.144922
> 0.000400 0.000400 0.000400 0.000400
>
> 8 1
> significant elliptical 19
> 0.365543 0.378536 0.234786 0.141653
> 0.000400 0.000400 0.000400 0.000400
>
> 4 1
> significant elliptical 9
> 0.325521 0.274219 0.215278 0.128038
> 0.000400 0.000400 0.000400 0.000400
>
> 1 1
> significant elliptical 11
> 0.320312 0.217259 0.248580 0.091974
> 0.000400 0.000400 0.000400 0.000400
>
> 6 1
> significant elliptical 20
> 0.360156 0.370703 0.238281 0.143164
> 0.000400 0.000400 0.000400 0.000400
>
> 7 1
> significant elliptical 20
> 0.448633 0.243359 0.242969 0.113477
> 0.000400 0.000400 0.000400 0.000400
>
> --
> You received this message because you are subscribed to the Google Groups "tesseract-ocr" group.
> To unsubscribe from this group and stop receiving emails from it, send an email to tesseract-ocr+unsubscr...@googlegroups.com.
> To view this discussion on the web visit https://groups.google.com/d/msgid/tesseract-ocr/9e3a6851-0311-4148-af1f-b61999f38977n%40googlegroups.com.
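
Not a diagnosis of the empty output, but before retraining it may be worth sanity-checking the box coordinates quoted above. The sketch below assumes the usual box line layout (`<char> <left> <bottom> <right> <top> <page>`) and uses only lines copied from the post: all of page 0 plus the "." boxes from the other pages. The helper names (`parse_box_lines`, `height_outliers`, `overlaps`) and the `ratio` threshold are made up for illustration and are not part of Tesseract.

```python
from collections import defaultdict
from statistics import median

# Box lines copied from the post: every box on page 0, plus the "." boxes
# from pages 1-5 so the period has enough samples for a meaningful median.
SAMPLE = """\
0 3 1 14 19 0
9 18 0 29 20 0
3 33 1 46 19 0
. 50 1 56 19 0
2 64 1 75 19 0
5 76 1 93 19 0
2 92 1 111 19 0
. 54 1 57 5 1
. 53 1 57 5 2
. 52 2 58 8 3
. 53 1 58 9 4
. 53 1 57 5 5
"""


def parse_box_lines(text):
    """Parse lines of the form '<char> <left> <bottom> <right> <top> <page>'."""
    boxes = []
    for line in text.splitlines():
        parts = line.split()
        if len(parts) != 6:
            continue  # skip blank or malformed lines
        left, bottom, right, top, page = map(int, parts[1:])
        boxes.append((parts[0], left, bottom, right, top, page))
    return boxes


def height_outliers(boxes, ratio=2.0):
    """Boxes whose height is more than `ratio` times above or below the
    median height of the same character (hypothetical threshold)."""
    heights = defaultdict(list)
    for char, _left, bottom, _right, top, _page in boxes:
        heights[char].append(top - bottom)
    flagged = []
    for box in boxes:
        char, _left, bottom, _right, top, _page = box
        height = top - bottom
        typical = median(heights[char])
        if height > typical * ratio or height * ratio < typical:
            flagged.append(box)
    return flagged


def overlaps(boxes):
    """Pairs of consecutive boxes on the same page where the second box
    starts to the left of the first box's right edge."""
    return [
        (a, b)
        for a, b in zip(boxes, boxes[1:])
        if a[5] == b[5] and b[1] < a[3]
    ]


boxes = parse_box_lines(SAMPLE)
for box in height_outliers(boxes):
    print("unusual height:", box)
for a, b in overlaps(boxes):
    print("overlapping boxes:", a, b)
```

On this sample it flags the "." on page 0 (18 pixels tall, against 4 to 8 on the other pages) and the "5"/"2" pair on page 0, whose boxes overlap by one pixel. Whether either matters for the empty output is a separate question; the check only points at boxes worth re-drawing in a box editor.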