Font attribute recognition is a legacy-engine-only feature, i.e. it doesn't 
exist in the new LSTM engine in Tesseract 4/5.

On Monday, January 1, 2024 at 12:15:27 PM UTC-5 sco...@gmail.com wrote:


The problem is, even after training a few different ways with *tesstrain* (e.g. 
adjusting *exposure* options, *char_spacing* options, etc.), when I output 
to hocr (e.g. using the command *tesseract sherlock-holmes-example.png 
output -l ITC-New-Baskerville-Std+ITC-New-Baskerville-Std-Italic -c 
hocr_font_info=1 hocr*) it still seems to get the font info wrong (see 
attached files for a sample input and output). 

As an example, I was hoping the word "*coup-de-maitres*" would be 
recognized with *lang='ITC-New-Baskerville-Std-Italic'*, but it isn't. 
Conversely, the word "testifying" shows with 
*lang='ITC-New-Baskerville-Std-Italic'*, but it is not italic.


You appear to be training the font as a language, which is why it's getting 
output in the `lang=` attribute. That's the wrong mechanism: a font that is 
actually recognized as a font is reported as `x_font <font>` in the word's 
title attribute. The hOCR will also contain <em> tags for italic words if an 
italic font is recognized. 
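
For checking where the font info ends up, here is a minimal Python sketch that pulls `x_font`, `lang`, and the <strong>/<em> flags out of each `ocrx_word` span. It is only an illustration: the names `WordFontParser` and `extract_words` are made up for this example, the sample lines are copied from the attached output, and it assumes the attribute layout Tesseract 5.3.1 wrote in that file.

```python
import re
from html.parser import HTMLParser


class WordFontParser(HTMLParser):
    """Collect text, x_font, lang and bold/italic flags for each ocrx_word span."""

    def __init__(self):
        super().__init__()
        self.words = []
        self._current = None  # word being read, or None when outside a word span
        self._depth = 0       # span nesting depth inside the current word

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if self._current is None:
            if tag == "span" and attrs.get("class") == "ocrx_word":
                title = attrs.get("title") or ""
                # Properties in the title are separated by semicolons.
                match = re.search(r"x_font\s+([^;]+)", title)
                self._current = {
                    "text": "",
                    "font": match.group(1).strip() if match else None,
                    "lang": attrs.get("lang"),
                    "bold": False,
                    "italic": False,
                }
                self._depth = 1
        elif tag == "span":
            self._depth += 1
        elif tag == "strong":
            self._current["bold"] = True
        elif tag == "em":
            self._current["italic"] = True

    def handle_endtag(self, tag):
        if self._current is not None and tag == "span":
            self._depth -= 1
            if self._depth == 0:
                self.words.append(self._current)
                self._current = None

    def handle_data(self, data):
        if self._current is not None:
            self._current["text"] += data


def extract_words(hocr):
    """Return one dict per ocrx_word found in the hOCR string."""
    parser = WordFontParser()
    parser.feed(hocr)
    parser.close()
    return parser.words


# Three word spans copied from the attached legacy-engine output.
SAMPLE = """
<span class='ocrx_word' id='word_1_4' title='bbox 229 9 316 25; x_wconf 72; x_font Trebuchet_MS; x_fsize 11'>extended</span>
<span class='ocrx_word' id='word_1_28' title='bbox 241 97 321 114; x_wconf 79; x_font Trebuchet_MS_Bold; x_fsize 11'><strong>Francois</strong></span>
<span class='ocrx_word' id='word_1_29' title='bbox 339 98 355 114; x_wconf 86; x_font Times_New_Roman; x_fsize 11' lang='fra'>lc</span>
"""

if __name__ == "__main__":
    for word in extract_words(SAMPLE):
        print(word)
```

A word whose span has no `x_font` entry comes back with `font` set to `None`, which is what you would expect from LSTM-only output.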

I tried using `--oem 0` with the eng model 
from https://github.com/tesseract-ocr/tessdata and it did output <strong> 
and <em> tags, but in the wrong places, and its accuracy on the text wasn't 
as good as the LSTM model's. When I used eng+fra, it output language tags, 
but at the paragraph level, not the word level, and they were mostly wrong. 
I've attached the output.
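
Roughly, the run described above can be scripted like this. It is a sketch, not the exact command I used: the helper names and the optional `tessdata_dir` argument are made up for illustration, and it assumes `tesseract` is on the PATH with legacy-capable traineddata files (the ones from the tessdata repo) available.

```python
import subprocess


def legacy_hocr_command(image, output_base, langs="eng", tessdata_dir=None):
    """Build a tesseract command line for legacy-engine hOCR with font info."""
    cmd = ["tesseract", image, output_base, "--oem", "0", "-l", langs]
    if tessdata_dir:
        # Hypothetical location of the legacy-capable traineddata files.
        cmd += ["--tessdata-dir", tessdata_dir]
    # hocr_font_info=1 adds x_font / x_fsize to each word's title attribute.
    cmd += ["-c", "hocr_font_info=1", "hocr"]
    return cmd


def run_legacy_hocr(image, output_base, langs="eng", tessdata_dir=None):
    """Run tesseract; raises CalledProcessError if it exits with an error."""
    cmd = legacy_hocr_command(image, output_base, langs, tessdata_dir)
    subprocess.run(cmd, check=True)


if __name__ == "__main__":
    # Print the command rather than running it, so this works without tesseract.
    print(" ".join(legacy_hocr_command("sherlock-holmes-example.png", "output")))
```

Passing `langs="eng+fra"` gives the two-language run mentioned above.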

You can read more about the state of play of getting font attributes out of 
the current model here (it's possible, but don't look for it any time soon):
https://github.com/tesseract-ocr/tesseract/issues/1074#issuecomment-3278142444

Tom

<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"
    "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en">
 <head>
  <title></title>
  <meta http-equiv="Content-Type" content="text/html;charset=utf-8"/>
  <meta name='ocr-system' content='tesseract 5.3.1' />
  <meta name='ocr-capabilities' content='ocr_page ocr_carea ocr_par ocr_line ocrx_word ocrp_wconf ocrp_lang ocrp_dir ocrp_font ocrp_fsize'/>
 </head>
 <body>
  <div class='ocr_page' id='page_1' title='image "/Users/tfmorris/Downloads/sherlock-holmes-example.png"; bbox 0 0 368 730; ppageno 0; scan_res 144 144'>
   <div class='ocr_carea' id='block_1_1' title="bbox 10 6 356 714">
    <p class='ocr_par' id='par_1_1' lang='eng' title="bbox 11 6 356 493">
     <span class='ocr_line' id='line_1_1' title="bbox 35 6 356 28; baseline 0.012 -7; x_size 22; x_descenders 6; x_ascenders 5">
      <span class='ocrx_word' id='word_1_1' title='bbox 35 6 76 28; x_wconf 89; x_font Times_New_Roman; x_fsize 11'>“My</span>
      <span class='ocrx_word' id='word_1_2' title='bbox 93 7 167 28; x_wconf 77; x_font Times_New_Roman; x_fsize 11'>practice</span>
      <span class='ocrx_word' id='word_1_3' title='bbox 184 8 214 24; x_wconf 84; x_font Times_New_Roman; x_fsize 11'>has</span>
      <span class='ocrx_word' id='word_1_4' title='bbox 229 9 316 25; x_wconf 72; x_font Trebuchet_MS; x_fsize 11'>extended</span>
      <span class='ocrx_word' id='word_1_5' title='bbox 332 14 356 25; x_wconf 94; x_font Times_New_Roman; x_fsize 11'>re-</span>
     </span>
     <span class='ocr_line' id='line_1_2' title="bbox 12 29 355 50; baseline 0.009 -6; x_size 22; x_descenders 6; x_ascenders 6">
      <span class='ocrx_word' id='word_1_6' title='bbox 12 29 69 50; x_wconf 74; x_font Times_New_Roman; x_fsize 11'>cently</span>
      <span class='ocrx_word' id='word_1_7' title='bbox 91 33 109 45; x_wconf 81; x_font Times_New_Roman; x_fsize 11'>to</span>
      <span class='ocrx_word' id='word_1_8' title='bbox 132 29 161 45; x_wconf 88; x_font Times_New_Roman; x_fsize 11'>the</span>
      <span class='ocrx_word' id='word_1_9' title='bbox 184 30 296 49; x_wconf 81; x_font Times_New_Roman; x_fsize 11'>Continent,&quot;</span>
      <span class='ocrx_word' id='word_1_10' title='bbox 319 31 355 48; x_wconf 85; x_font Times_New_Roman; x_fsize 11'>said</span>
     </span>
     <span class='ocr_line' id='line_1_3' title="bbox 12 51 355 75; baseline 0.009 -9; x_size 19; x_descenders 3; x_ascenders 6">
      <span class='ocrx_word' id='word_1_11' title='bbox 12 51 85 67; x_wconf 80; x_font Times_New_Roman; x_fsize 10'>Holmes</span>
      <span class='ocrx_word' id='word_1_12' title='bbox 94 51 138 67; x_wconf 73; x_font Times_New_Roman; x_fsize 10'>after</span>
      <span class='ocrx_word' id='word_1_13' title='bbox 147 57 156 67; x_wconf 91; x_font Times_New_Roman; x_fsize 10'>a</span>
      <span class='ocrx_word' id='word_1_14' title='bbox 165 52 221 71; x_wconf 85; x_font Times_New_Roman; x_fsize 10'>while,</span>
      <span class='ocrx_word' id='word_1_15' title='bbox 231 52 287 75; x_wconf 80; x_font Times_New_Roman; x_fsize 10'>filling</span>
      <span class='ocrx_word' id='word_1_16' title='bbox 296 59 320 75; x_wconf 88; x_font Times_New_Roman; x_fsize 10'>up</span>
      <span class='ocrx_word' id='word_1_17' title='bbox 329 53 355 69; x_wconf 87; x_font Times_New_Roman; x_fsize 10'>his</span>
     </span>
     <span class='ocr_line' id='line_1_4' title="bbox 12 73 355 96; baseline 0.012 -8; x_size 21.605808; x_descenders 5.6058083; x_ascenders 5">
      <span class='ocrx_word' id='word_1_18' title='bbox 12 73 41 89; x_wconf 84; x_font Times_New_Roman; x_fsize 11'>old</span>
      <span class='ocrx_word' id='word_1_19' title='bbox 58 73 149 89; x_wconf 85; x_font Times_New_Roman; x_fsize 11'>brier-root</span>
      <span class='ocrx_word' id='word_1_20' title='bbox 165 74 211 96; x_wconf 81; x_font Times_New_Roman; x_fsize 11'>pipe.</span>
      <span class='ocrx_word' id='word_1_21' title='bbox 234 75 251 91; x_wconf 88; x_font Times_New_Roman; x_fsize 11'>“1</span>
      <span class='ocrx_word' id='word_1_22' title='bbox 268 80 301 91; x_wconf 82; x_font Times_New_Roman; x_fsize 11'>was</span>
      <span class='ocrx_word' id='word_1_23' title='bbox 317 81 355 91; x_wconf 65; x_font Times_New_Roman; x_fsize 11'>con»</span>
     </span>
     <span class='ocr_line' id='line_1_5' title="bbox 12 95 355 118; baseline 0.012 -8; x_size 22; x_descenders 6; x_ascenders 5">
      <span class='ocrx_word' id='word_1_24' title='bbox 12 95 69 111; x_wconf 84; x_font Times_New_Roman; x_fsize 11'>suited</span>
      <span class='ocrx_word' id='word_1_25' title='bbox 87 96 119 111; x_wconf 87; x_font Times_New_Roman; x_fsize 11'>last</span>
      <span class='ocrx_word' id='word_1_26' title='bbox 137 96 184 112; x_wconf 77; x_font Times_New_Roman; x_fsize 11'>week</span>
      <span class='ocrx_word' id='word_1_27' title='bbox 202 96 223 118; x_wconf 82; x_font Times_New_Roman; x_fsize 11'>by</span>
      <span class='ocrx_word' id='word_1_28' title='bbox 241 97 321 114; x_wconf 79; x_font Trebuchet_MS_Bold; x_fsize 11'><strong>Francois</strong></span>
      <span class='ocrx_word' id='word_1_29' title='bbox 339 98 355 114; x_wconf 86; x_font Times_New_Roman; x_fsize 11' lang='fra'>lc</span>
     </span>
     <span class='ocr_line' id='line_1_6' title="bbox 12 117 354 141; baseline 0.006 -8; x_size 22; x_descenders 6; x_ascenders 5">
      <span class='ocrx_word' id='word_1_30' title='bbox 12 117 84 136; x_wconf 74; x_font Times_New_Roman; x_fsize 11' lang='fra'>Villard,</span>
      <span class='ocrx_word' id='word_1_31' title='bbox 91 118 135 136; x_wconf 77; x_font Times_New_Roman; x_fsize 11'>who,</span>
      <span class='ocrx_word' id='word_1_32' title='bbox 142 124 159 134; x_wconf 83; x_font Times_New_Roman; x_fsize 11'>as</span>
      <span class='ocrx_word' id='word_1_33' title='bbox 165 124 198 140; x_wconf 85; x_font Times_New_Roman; x_fsize 11'>you</span>
      <span class='ocrx_word' id='word_1_34' title='bbox 206 119 291 141; x_wconf 86; x_font Times_New_Roman; x_fsize 11'>probably</span>
      <span class='ocrx_word' id='word_1_35' title='bbox 299 120 354 138; x_wconf 83; x_font DejaVu_Sans_Ultra-Light; x_fsize 11'>know,</span>
     </span>
     <span class='ocr_line' id='line_1_7' title="bbox 12 139 355 164; baseline 0.009 -9; x_size 22; x_descenders 6; x_ascenders 5">
      <span class='ocrx_word' id='word_1_36' title='bbox 12 139 42 155; x_wconf 79; x_font Times_New_Roman; x_fsize 11'>has</span>
      <span class='ocrx_word' id='word_1_37' title='bbox 52 145 100 156; x_wconf 80; x_font Times_New_Roman; x_fsize 11'>come</span>
      <span class='ocrx_word' id='word_1_38' title='bbox 110 140 168 156; x_wconf 85; x_font Times_New_Roman; x_fsize 11'>rather</span>
      <span class='ocrx_word' id='word_1_39' title='bbox 179 145 196 156; x_wconf 84; x_font Times_New_Roman; x_fsize 11'>to</span>
      <span class='ocrx_word' id='word_1_40' title='bbox 207 141 236 157; x_wconf 92; x_font Times_New_Roman; x_fsize 11'>the</span>
      <span class='ocrx_word' id='word_1_41' title='bbox 246 142 293 158; x_wconf 77; x_font Times_New_Roman; x_fsize 11'>front</span>
      <span class='ocrx_word' id='word_1_42' title='bbox 303 142 355 164; x_wconf 80; x_font Times_New_Roman; x_fsize 11'>lately</span>
     </span>
     <span class='ocr_line' id='line_1_8' title="bbox 12 162 354 181; baseline 0.012 -4; x_size 21.605808; x_descenders 5.6058083; x_ascenders 5">
      <span class='ocrx_word' id='word_1_43' title='bbox 12 162 30 177; x_wconf 91; x_font Times_New_Roman; x_fsize 11'>in</span>
      <span class='ocrx_word' id='word_1_44' title='bbox 38 162 66 178; x_wconf 85; x_font Times_New_Roman; x_fsize 11'>the</span>
      <span class='ocrx_word' id='word_1_45' title='bbox 75 163 140 179; x_wconf 79; x_font Times_New_Roman; x_fsize 11'>French</span>
      <span class='ocrx_word' id='word_1_46' title='bbox 148 163 232 179; x_wconf 76; x_font Times_New_Roman; x_fsize 11'>detective</span>
      <span class='ocrx_word' id='word_1_47' title='bbox 239 164 306 180; x_wconf 76; x_font Verdana_Bold; x_fsize 11'><strong>service.</strong></span>
      <span class='ocrx_word' id='word_1_48' title='bbox 326 165 354 181; x_wconf 89; x_font Times_New_Roman; x_fsize 11'>He</span>
     </span>
     <span class='ocr_line' id='line_1_9' title="bbox 12 184 355 208; baseline 0.009 -9; x_size 21; x_descenders 5; x_ascenders 5">
      <span class='ocrx_word' id='word_1_49' title='bbox 12 184 41 200; x_wconf 87; x_font Times_New_Roman; x_fsize 11'>has</span>
      <span class='ocrx_word' id='word_1_50' title='bbox 54 184 77 200; x_wconf 88; x_font Times_New_Roman; x_fsize 11'>all</span>
      <span class='ocrx_word' id='word_1_51' title='bbox 91 184 119 200; x_wconf 86; x_font Times_New_Roman; x_fsize 11'>the</span>
      <span class='ocrx_word' id='word_1_52' title='bbox 133 185 188 201; x_wconf 86; x_font Times_New_Roman; x_fsize 11'>Celtic</span>
      <span class='ocrx_word' id='word_1_53' title='bbox 201 190 259 206; x_wconf 86; x_font Times_New_Roman; x_fsize 11'>power</span>
      <span class='ocrx_word' id='word_1_54' title='bbox 271 186 290 202; x_wconf 82; x_font Times_New_Roman; x_fsize 11'>of</span>
      <span class='ocrx_word' id='word_1_55' title='bbox 301 186 355 208; x_wconf 79; x_font Times_New_Roman; x_fsize 11'>quick</span>
     </span>
     <span class='ocr_line' id='line_1_10' title="bbox 12 205 354 225; baseline 0.012 -4; x_size 19; x_descenders 3; x_ascenders 5">
      <span class='ocrx_word' id='word_1_56' title='bbox 12 205 102 225; x_wconf 71; x_font Trebuchet_MS; x_fsize 10'>intuition,</span>
      <span class='ocrx_word' id='word_1_57' title='bbox 112 206 144 222; x_wconf 86; x_font Times_New_Roman; x_fsize 10'>but</span>
      <span class='ocrx_word' id='word_1_58' title='bbox 154 207 175 223; x_wconf 90; x_font Times_New_Roman; x_fsize 10'>he</span>
      <span class='ocrx_word' id='word_1_59' title='bbox 185 207 198 223; x_wconf 83; x_font Times_New_Roman; x_fsize 10'>is</span>
      <span class='ocrx_word' id='word_1_60' title='bbox 207 208 288 224; x_wconf 79; x_font Times_New_Roman; x_fsize 10'>deficient</span>
      <span class='ocrx_word' id='word_1_61' title='bbox 297 208 315 224; x_wconf 87; x_font Times_New_Roman; x_fsize 10'>in</span>
      <span class='ocrx_word' id='word_1_62' title='bbox 326 209 354 225; x_wconf 84; x_font Times_New_Roman; x_fsize 10'>the</span>
     </span>
     <span class='ocr_line' id='line_1_11' title="bbox 12 228 354 252; baseline 0.009 -9; x_size 22; x_descenders 6; x_ascenders 6">
      <span class='ocrx_word' id='word_1_63' title='bbox 12 228 55 244; x_wconf 88; x_font Times_New_Roman; x_fsize 11'>wide</span>
      <span class='ocrx_word' id='word_1_64' title='bbox 75 234 128 250; x_wconf 82; x_font Times_New_Roman; x_fsize 11'>range</span>
      <span class='ocrx_word' id='word_1_65' title='bbox 147 229 165 245; x_wconf 85; x_font Times_New_Roman; x_fsize 11' lang='fra'>of</span>
      <span class='ocrx_word' id='word_1_66' title='bbox 185 233 234 245; x_wconf 78; x_font Verdana_Bold; x_fsize 11' lang='fra'><strong>exact</strong></span>
      <span class='ocrx_word' id='word_1_67' title='bbox 253 230 354 252; x_wconf 79; x_font Times_New_Roman; x_fsize 11'>knowledge</span>
     </span>
     <span class='ocr_line' id='line_1_12' title="bbox 11 250 354 273; baseline 0.009 -7; x_size 21; x_descenders 5; x_ascenders 5">
      <span class='ocrx_word' id='word_1_68' title='bbox 11 250 68 266; x_wconf 88; x_font Times_New_Roman; x_fsize 11'>which</span>
      <span class='ocrx_word' id='word_1_69' title='bbox 78 250 91 266; x_wconf 88; x_font Times_New_Roman; x_fsize 11'>is</span>
      <span class='ocrx_word' id='word_1_70' title='bbox 100 251 179 267; x_wconf 78; x_font Times_New_Roman; x_fsize 11'>essential</span>
      <span class='ocrx_word' id='word_1_71' title='bbox 190 255 207 268; x_wconf 84; x_font Times_New_Roman; x_fsize 11'>to</span>
      <span class='ocrx_word' id='word_1_72' title='bbox 218 252 247 268; x_wconf 93; x_font Times_New_Roman; x_fsize 11'>the</span>
      <span class='ocrx_word' id='word_1_73' title='bbox 257 252 318 273; x_wconf 82; x_font Times_New_Roman; x_fsize 11' lang='fra'>higher</span>
      <span class='ocrx_word' id='word_1_74' title='bbox 328 253 354 269; x_wconf 63; x_font Times_New_Roman; x_fsize 11' lang='fra'>de</span>
     </span>
     <span class='ocr_line' id='line_1_13' title="bbox 12 272 354 294; baseline 0.012 -7; x_size 21.605808; x_descenders 5.6058083; x_ascenders 5">
      <span class='ocrx_word' id='word_1_75' title='bbox 12 272 119 294; x_wconf 84; x_font Times_New_Roman; x_fsize 11'>velopments</span>
      <span class='ocrx_word' id='word_1_76' title='bbox 127 273 145 289; x_wconf 86; x_font Times_New_Roman; x_fsize 11'>of</span>
      <span class='ocrx_word' id='word_1_77' title='bbox 153 273 179 289; x_wconf 88; x_font Times_New_Roman; x_fsize 11'>his</span>
      <span class='ocrx_word' id='word_1_78' title='bbox 186 278 217 289; x_wconf 87; x_font Times_New_Roman; x_fsize 11'>art.</span>
      <span class='ocrx_word' id='word_1_79' title='bbox 230 274 270 290; x_wconf 86; x_font Times_New_Roman; x_fsize 11'>The</span>
      <span class='ocrx_word' id='word_1_80' title='bbox 277 280 313 291; x_wconf 84; x_font Times_New_Roman; x_fsize 11'>case</span>
      <span class='ocrx_word' id='word_1_81' title='bbox 321 280 354 291; x_wconf 87; x_font Times_New_Roman; x_fsize 11'>was</span>
     </span>
     <span class='ocr_line' id='line_1_14' title="bbox 11 295 354 318; baseline 0.009 -8; x_size 22; x_descenders 6; x_ascenders 5">
      <span class='ocrx_word' id='word_1_82' title='bbox 11 295 108 310; x_wconf 79; x_font Times_New_Roman; x_fsize 11'>concerned</span>
      <span class='ocrx_word' id='word_1_83' title='bbox 116 295 158 311; x_wconf 92; x_font Times_New_Roman; x_fsize 11'>with</span>
      <span class='ocrx_word' id='word_1_84' title='bbox 165 301 174 311; x_wconf 89; x_font Times_New_Roman; x_fsize 11'>a</span>
      <span class='ocrx_word' id='word_1_85' title='bbox 183 296 218 311; x_wconf 89; x_font Times_New_Roman; x_fsize 11'>will</span>
      <span class='ocrx_word' id='word_1_86' title='bbox 225 296 261 312; x_wconf 87; x_font Times_New_Roman; x_fsize 11'>and</span>
      <span class='ocrx_word' id='word_1_87' title='bbox 269 297 354 318; x_wconf 79; x_font Times_New_Roman; x_fsize 11'>possessed</span>
     </span>
     <span class='ocr_line' id='line_1_15' title="bbox 12 317 354 336; baseline 0.009 -3; x_size 21.605808; x_descenders 5.6058083; x_ascenders 5">
      <span class='ocrx_word' id='word_1_88' title='bbox 12 322 58 333; x_wconf 82; x_font Times_New_Roman; x_fsize 11'>some</span>
      <span class='ocrx_word' id='word_1_89' title='bbox 66 317 140 334; x_wconf 73; x_font Times_New_Roman; x_fsize 11'>features</span>
      <span class='ocrx_word' id='word_1_90' title='bbox 146 318 165 334; x_wconf 87; x_font Times_New_Roman; x_fsize 11'>of</span>
      <span class='ocrx_word' id='word_1_91' title='bbox 172 318 247 335; x_wconf 74; x_font Times_New_Roman; x_fsize 11'>interest.</span>
      <span class='ocrx_word' id='word_1_92' title='bbox 260 319 268 335; x_wconf 94; x_font Times_New_Roman; x_fsize 11'>I</span>
      <span class='ocrx_word' id='word_1_93' title='bbox 275 325 308 336; x_wconf 87; x_font Times_New_Roman; x_fsize 11'>was</span>
      <span class='ocrx_word' id='word_1_94' title='bbox 315 320 354 336; x_wconf 87; x_font Times_New_Roman; x_fsize 11'>able</span>
     </span>
     <span class='ocr_line' id='line_1_16' title="bbox 12 340 353 361; baseline 0.009 -6; x_size 19; x_descenders 3; x_ascenders 5">
      <span class='ocrx_word' id='word_1_95' title='bbox 12 343 30 355; x_wconf 87; x_font Times_New_Roman; x_fsize 10'>to</span>
      <span class='ocrx_word' id='word_1_96' title='bbox 41 340 85 355; x_wconf 81; x_font Times_New_Roman; x_fsize 10'>refer</span>
      <span class='ocrx_word' id='word_1_97' title='bbox 97 340 134 356; x_wconf 88; x_font Times_New_Roman; x_fsize 10'>him</span>
      <span class='ocrx_word' id='word_1_98' title='bbox 146 344 163 356; x_wconf 87; x_font Times_New_Roman; x_fsize 10'>to</span>
      <span class='ocrx_word' id='word_1_99' title='bbox 175 344 209 356; x_wconf 85; x_font Times_New_Roman; x_fsize 10'>two</span>
      <span class='ocrx_word' id='word_1_100' title='bbox 233 342 293 358; x_wconf 81; x_font Times_New_Roman; x_fsize 10'>arallel</span>
      <span class='ocrx_word' id='word_1_101' title='bbox 304 347 353 361; x_wconf 83; x_font Times_New_Roman; x_fsize 10'>cases,</span>
     </span>
     <span class='ocr_line' id='line_1_17' title="bbox 12 361 354 384; baseline 0.009 -7; x_size 21.605808; x_descenders 5.6058083; x_ascenders 5">
      <span class='ocrx_word' id='word_1_102' title='bbox 12 361 41 377; x_wconf 78; x_font Times_New_Roman; x_fsize 11'>the</span>
      <span class='ocrx_word' id='word_1_103' title='bbox 53 366 86 377; x_wconf 86; x_font Times_New_Roman; x_fsize 11'>one</span>
      <span class='ocrx_word' id='word_1_104' title='bbox 99 365 117 377; x_wconf 86; x_font Times_New_Roman; x_fsize 11'>at</span>
      <span class='ocrx_word' id='word_1_105' title='bbox 130 362 174 384; x_wconf 81; x_font Times_New_Roman; x_fsize 11'>Riga</span>
      <span class='ocrx_word' id='word_1_106' title='bbox 187 362 205 378; x_wconf 88; x_font Times_New_Roman; x_fsize 11'>in</span>
      <span class='ocrx_word' id='word_1_107' title='bbox 229 363 265 381; x_wconf 81; x_font Century_Schoolbook_L; x_fsize 11'>857.</span>
      <span class='ocrx_word' id='word_1_108' title='bbox 277 363 312 379; x_wconf 89; x_font Times_New_Roman; x_fsize 11'>and</span>
      <span class='ocrx_word' id='word_1_109' title='bbox 325 364 354 380; x_wconf 86; x_font Times_New_Roman; x_fsize 11'>the</span>
     </span>
     <span class='ocr_line' id='line_1_18' title="bbox 11 383 354 404; baseline 0.009 -5; x_size 21.605808; x_descenders 5.6058083; x_ascenders 5">
      <span class='ocrx_word' id='word_1_110' title='bbox 11 383 61 399; x_wconf 76; x_font Times_New_Roman; x_fsize 11'>other</span>
      <span class='ocrx_word' id='word_1_111' title='bbox 74 387 91 399; x_wconf 87; x_font Times_New_Roman; x_fsize 11'>at</span>
      <span class='ocrx_word' id='word_1_112' title='bbox 103 384 126 400; x_wconf 88; x_font Times_New_Roman; x_fsize 11'>St.</span>
      <span class='ocrx_word' id='word_1_113' title='bbox 139 384 192 400; x_wconf 83; x_font Times_New_Roman; x_fsize 11'>Louis</span>
      <span class='ocrx_word' id='word_1_114' title='bbox 205 384 223 400; x_wconf 86; x_font Times_New_Roman; x_fsize 11'>in</span>
      <span class='ocrx_word' id='word_1_115' title='bbox 239 386 284 404; x_wconf 85; x_font Times_New_Roman; x_fsize 11' lang='fra'>1871,</span>
      <span class='ocrx_word' id='word_1_116' title='bbox 297 386 354 402; x_wconf 88; x_font Times_New_Roman; x_fsize 11'>which</span>
     </span>
     <span class='ocr_line' id='line_1_19' title="bbox 11 405 354 427; baseline 0.012 -7; x_size 21; x_descenders 5; x_ascenders 5">
      <span class='ocrx_word' id='word_1_117' title='bbox 11 405 54 421; x_wconf 80; x_font Times_New_Roman; x_fsize 11'>have</span>
      <span class='ocrx_word' id='word_1_118' title='bbox 72 406 161 427; x_wconf 83; x_font Times_New_Roman; x_fsize 11'>suggested</span>
      <span class='ocrx_word' id='word_1_119' title='bbox 180 410 197 423; x_wconf 83; x_font Times_New_Roman; x_fsize 11'>to</span>
      <span class='ocrx_word' id='word_1_120' title='bbox 214 407 251 423; x_wconf 90; x_font Times_New_Roman; x_fsize 11'>him</span>
      <span class='ocrx_word' id='word_1_121' title='bbox 269 407 298 424; x_wconf 92; x_font Times_New_Roman; x_fsize 11'>the</span>
      <span class='ocrx_word' id='word_1_122' title='bbox 316 412 354 424; x_wconf 87; x_font Times_New_Roman; x_fsize 11'>true</span>
     </span>
     <span class='ocr_line' id='line_1_20' title="bbox 11 428 353 446; baseline 0.009 -3; x_size 22.115427; x_descenders 6.115427; x_ascenders 4">
      <span class='ocrx_word' id='word_1_123' title='bbox 11 428 93 444; x_wconf 82; x_font Times_New_Roman; x_fsize 11'>solution.</span>
      <span class='ocrx_word' id='word_1_124' title='bbox 108 429 154 444; x_wconf 89; x_font Times_New_Roman; x_fsize 11'>Here</span>
      <span class='ocrx_word' id='word_1_125' title='bbox 163 429 176 445; x_wconf 87; x_font Times_New_Roman; x_fsize 11'>is</span>
      <span class='ocrx_word' id='word_1_126' title='bbox 185 429 213 445; x_wconf 81; x_font Times_New_Roman; x_fsize 11'>the</span>
      <span class='ocrx_word' id='word_1_127' title='bbox 222 430 272 446; x_wconf 83; x_font Times_New_Roman; x_fsize 11'>letter</span>
      <span class='ocrx_word' id='word_1_128' title='bbox 281 430 337 446; x_wconf 88; x_font Times_New_Roman; x_fsize 11'>which</span>
      <span class='ocrx_word' id='word_1_129' title='bbox 346 431 353 446; x_wconf 92; x_font Times_New_Roman; x_fsize 11' lang='fra'>l</span>
     </span>
     <span class='ocr_line' id='line_1_21' title="bbox 11 449 354 474; baseline 0.009 -9; x_size 21; x_descenders 5; x_ascenders 5">
      <span class='ocrx_word' id='word_1_130' title='bbox 11 449 45 465; x_wconf 86; x_font Times_New_Roman; x_fsize 11'>had</span>
      <span class='ocrx_word' id='word_1_131' title='bbox 63 450 96 466; x_wconf 84; x_font Times_New_Roman; x_fsize 11'>this</span>
      <span class='ocrx_word' id='word_1_132' title='bbox 113 451 195 472; x_wconf 82; x_font Times_New_Roman; x_fsize 11'>morning</span>
      <span class='ocrx_word' id='word_1_133' title='bbox 211 452 354 474; x_wconf 79; x_font Times_New_Roman; x_fsize 11'>acknowledging</span>
     </span>
     <span class='ocr_line' id='line_1_22' title="bbox 11 472 153 493; baseline 0.007 -6; x_size 22; x_descenders 6; x_ascenders 6">
      <span class='ocrx_word' id='word_1_134' title='bbox 11 477 38 493; x_wconf 86; x_font Times_New_Roman; x_fsize 11'>my</span>
      <span class='ocrx_word' id='word_1_135' title='bbox 47 472 153 488; x_wconf 79; x_font Times_New_Roman; x_fsize 11'>aSsistance.&quot;</span>
     </span>
    </p>

    <p class='ocr_par' id='par_1_2' lang='eng' title="bbox 10 495 354 667">
     <span class='ocr_line' id='line_1_23' title="bbox 33 495 354 518; baseline 0.009 -8; x_size 22; x_descenders 6; x_ascenders 5">
      <span class='ocrx_word' id='word_1_136' title='bbox 33 495 61 510; x_wconf 89; x_font Times_New_Roman; x_fsize 11'>He</span>
      <span class='ocrx_word' id='word_1_137' title='bbox 76 495 131 511; x_wconf 81; x_font Times_New_Roman; x_fsize 11'>tossed</span>
      <span class='ocrx_word' id='word_1_138' title='bbox 145 500 190 514; x_wconf 85; x_font Times_New_Roman; x_fsize 11'>over,</span>
      <span class='ocrx_word' id='word_1_139' title='bbox 205 501 222 512; x_wconf 90; x_font Times_New_Roman; x_fsize 11'>as</span>
      <span class='ocrx_word' id='word_1_140' title='bbox 236 496 257 512; x_wconf 84; x_font Times_New_Roman; x_fsize 11'>he</span>
      <span class='ocrx_word' id='word_1_141' title='bbox 271 497 329 518; x_wconf 81; x_font Times_New_Roman; x_fsize 11'>spoke,</span>
      <span class='ocrx_word' id='word_1_142' title='bbox 344 502 354 513; x_wconf 82; x_font URW_Bookman_L_Bold; x_fsize 11' lang='fra'><strong>a</strong></span>
     </span>
     <span class='ocr_line' id='line_1_24' title="bbox 10 517 353 540; baseline 0.009 -8; x_size 22; x_descenders 6; x_ascenders 5">
      <span class='ocrx_word' id='word_1_143' title='bbox 10 517 101 538; x_wconf 75; x_font Times_New_Roman; x_fsize 11'>crumpled</span>
      <span class='ocrx_word' id='word_1_144' title='bbox 119 517 167 533; x_wconf 87; x_font Times_New_Roman; x_fsize 11'>sheet</span>
      <span class='ocrx_word' id='word_1_145' title='bbox 184 517 203 533; x_wconf 91; x_font Times_New_Roman; x_fsize 11'>of</span>
      <span class='ocrx_word' id='word_1_146' title='bbox 221 518 288 540; x_wconf 79; x_font Times_New_Roman; x_fsize 11'>foreign</span>
      <span class='ocrx_word' id='word_1_147' title='bbox 307 522 353 535; x_wconf 85; x_font Times_New_Roman; x_fsize 11'>note-</span>
     </span>
     <span class='ocr_line' id='line_1_25' title="bbox 11 539 353 561; baseline 0.009 -7; x_size 19; x_descenders 3; x_ascenders 5">
      <span class='ocrx_word' id='word_1_148' title='bbox 11 543 69 560; x_wconf 85; x_font Times_New_Roman; x_fsize 10'>paper.</span>
      <span class='ocrx_word' id='word_1_149' title='bbox 86 540 93 555; x_wconf 95; x_font Times_New_Roman_Bold; x_fsize 10'><strong>I</strong></span>
      <span class='ocrx_word' id='word_1_150' title='bbox 104 539 176 561; x_wconf 81; x_font Times_New_Roman; x_fsize 10'>glanced</span>
      <span class='ocrx_word' id='word_1_151' title='bbox 187 545 215 561; x_wconf 93; x_font Times_New_Roman; x_fsize 10'>my</span>
      <span class='ocrx_word' id='word_1_152' title='bbox 225 545 262 561; x_wconf 84; x_font Times_New_Roman; x_fsize 10'>eyes</span>
      <span class='ocrx_word' id='word_1_153' title='bbox 272 540 324 556; x_wconf 84; x_font Times_New_Roman; x_fsize 10'>down</span>
      <span class='ocrx_word' id='word_1_154' title='bbox 335 541 353 560; x_wconf 84; x_font Times_New_Roman; x_fsize 10'>it,</span>
     </span>
     <span class='ocr_line' id='line_1_26' title="bbox 10 561 353 583; baseline 0.009 -7; x_size 21.605808; x_descenders 5.6058083; x_ascenders 5">
      <span class='ocrx_word' id='word_1_155' title='bbox 10 561 91 583; x_wconf 76; x_font Times_New_Roman; x_fsize 11'>catching</span>
      <span class='ocrx_word' id='word_1_156' title='bbox 98 567 108 577; x_wconf 87; x_font Times_New_Roman; x_fsize 11'>a</span>
      <span class='ocrx_word' id='word_1_157' title='bbox 117 562 208 583; x_wconf 78; x_font Times_New_Roman; x_fsize 11'>profusion</span>
      <span class='ocrx_word' id='word_1_158' title='bbox 216 562 235 578; x_wconf 89; x_font Times_New_Roman; x_fsize 11'>of</span>
      <span class='ocrx_word' id='word_1_159' title='bbox 243 567 291 579; x_wconf 83; x_font Times_New_Roman; x_fsize 11'>notes</span>
      <span class='ocrx_word' id='word_1_160' title='bbox 299 563 318 579; x_wconf 81; x_font Times_New_Roman; x_fsize 11'>of</span>
      <span class='ocrx_word' id='word_1_161' title='bbox 326 563 353 579; x_wconf 81; x_font Times_New_Roman; x_fsize 11'>ad-</span>
     </span>
     <span class='ocr_line' id='line_1_27' title="bbox 10 583 353 607; baseline 0.012 -9; x_size 21; x_descenders 5; x_ascenders 5">
      <span class='ocrx_word' id='word_1_162' title='bbox 10 583 98 602; x_wconf 81; x_font Times_New_Roman; x_fsize 11'>miration,</span>
      <span class='ocrx_word' id='word_1_163' title='bbox 115 584 157 600; x_wconf 84; x_font Times_New_Roman; x_fsize 11'>with</span>
      <span class='ocrx_word' id='word_1_164' title='bbox 173 588 217 605; x_wconf 82; x_font Times_New_Roman; x_fsize 11'>stray</span>
      <span class='ocrx_word' id='word_1_165' title='bbox 234 585 353 607; x_wconf 70; x_font Times_New_Roman; x_fsize 11' lang='fra'>magnifiques,</span>
     </span>
     <span class='ocr_line' id='line_1_28' title="bbox 10 606 353 628; baseline 0.006 -7; x_size 22; x_descenders 6; x_ascenders 5">
      <span class='ocrx_word' id='word_1_166' title='bbox 10 606 160 627; x_wconf 53; x_font Times_New_Roman; x_fsize 11' lang='fra'>coup-domaines</span>
      <span class='ocrx_word' id='word_1_167' title='bbox 172 607 207 623; x_wconf 77; x_font Times_New_Roman; x_fsize 11'>and</span>
      <span class='ocrx_word' id='word_1_168' title='bbox 220 607 353 628; x_wconf 49; x_font Times_New_Roman; x_fsize 11'>tows-defame,</span>
     </span>
     <span class='ocr_line' id='line_1_29' title="bbox 10 628 352 650; baseline 0.009 -7; x_size 22; x_descenders 6; x_ascenders 5">
      <span class='ocrx_word' id='word_1_169' title='bbox 10 628 33 643; x_wconf 93; x_font Times_New_Roman; x_fsize 11'>all</span>
      <span class='ocrx_word' id='word_1_170' title='bbox 44 628 133 650; x_wconf 79; x_font Times_New_Roman; x_fsize 11'>testifying</span>
      <span class='ocrx_word' id='word_1_171' title='bbox 142 632 160 645; x_wconf 88; x_font Times_New_Roman; x_fsize 11'>to</span>
      <span class='ocrx_word' id='word_1_172' title='bbox 170 629 198 645; x_wconf 89; x_font Times_New_Roman; x_fsize 11'>the</span>
      <span class='ocrx_word' id='word_1_173' title='bbox 209 630 270 646; x_wconf 87; x_font Times_New_Roman; x_fsize 11'>ardent</span>
      <span class='ocrx_word' id='word_1_174' title='bbox 280 630 352 646; x_wconf 88; x_font Times_New_Roman; x_fsize 11'>admira-</span>
     </span>
     <span class='ocr_line' id='line_1_30' title="bbox 11 650 245 667; baseline 0.004 -1; x_size 21.605808; x_descenders 5.6058083; x_ascenders 5">
      <span class='ocrx_word' id='word_1_175' title='bbox 11 650 48 666; x_wconf 85; x_font Times_New_Roman; x_fsize 11'>tion</span>
      <span class='ocrx_word' id='word_1_176' title='bbox 60 651 79 667; x_wconf 90; x_font Times_New_Roman; x_fsize 11'>of</span>
      <span class='ocrx_word' id='word_1_177' title='bbox 91 651 120 667; x_wconf 87; x_font Times_New_Roman; x_fsize 11'>the</span>
      <span class='ocrx_word' id='word_1_178' title='bbox 132 651 245 667; x_wconf 68; x_font Times_New_Roman; x_fsize 11'>Frenchman</span>
     </span>
    </p>

    <p class='ocr_par' id='par_1_3' lang='eng' title="bbox 10 672 352 714">
     <span class='ocr_line' id='line_1_31' title="bbox 32 672 352 696; baseline 0.006 -8; x_size 22; x_descenders 6; x_ascenders 5">
      <span class='ocrx_word' id='word_1_179' title='bbox 32 672 71 688; x_wconf 82; x_font Times_New_Roman; x_fsize 11'>“He</span>
      <span class='ocrx_word' id='word_1_180' title='bbox 87 673 147 694; x_wconf 77; x_font Times_New_Roman; x_fsize 11'>speaks</span>
      <span class='ocrx_word' id='word_1_181' title='bbox 164 679 181 690; x_wconf 87; x_font Times_New_Roman; x_fsize 11'>as</span>
      <span class='ocrx_word' id='word_1_182' title='bbox 197 679 207 690; x_wconf 90; x_font Times_New_Roman; x_fsize 11'>a</span>
      <span class='ocrx_word' id='word_1_183' title='bbox 223 674 275 696; x_wconf 86; x_font Times_New_Roman; x_fsize 11'>pupil</span>
      <span class='ocrx_word' id='word_1_184' title='bbox 292 678 310 690; x_wconf 88; x_font Times_New_Roman; x_fsize 11'>to</span>
      <span class='ocrx_word' id='word_1_185' title='bbox 326 674 352 691; x_wconf 88; x_font Times_New_Roman; x_fsize 11'>his</span>
     </span>
     <span class='ocr_line' id='line_1_32' title="bbox 10 695 156 714; baseline 0.007 -3; x_size 22.605808; x_descenders 5.6058083; x_ascenders 6">
      <span class='ocrx_word' id='word_1_186' title='bbox 10 695 91 714; x_wconf 83; x_font Times_New_Roman; x_fsize 11'>master,&quot;</span>
      <span class='ocrx_word' id='word_1_187' title='bbox 97 696 134 712; x_wconf 85; x_font Times_New_Roman; x_fsize 11'>said</span>
      <span class='ocrx_word' id='word_1_188' title='bbox 142 697 156 712; x_wconf 94; x_font Times_New_Roman; x_fsize 11'>I.</span>
     </span>
    </p>
   </div>
  </div>
 </body>
</html>
