Hi Tobias,

On 20/06/2011, at 7:40 AM, Tobias Schoel wrote:
>> And both /Alt and /ActualText allow multiple values having been preceded
>> by a /Lang tag, so that the actual vocalization generated by the
>> screen-reader can be adjusted for different languages --- the document
>> author normally would provide this, but a sophisticated PDF browser
>> plug-in might be programmed to produce a translation on-the-fly.
>
> What exactly is the intention of the /Alt tagging?

Clearly one use is the same as the alt="..." attribute in HTML,
attached to an image or other visually-based layout of material.
It allows you to provide a short description of an image, e.g.

    <img alt="TUG logo" src="..." ... />

You don't describe the contents of the image, but just what its
purpose is, or what it is about overall.
The same idea can be applicable to other content, including words.

But more generally than this, you need to appreciate that Adobe's
Acrobat Pro allows you to save 2 different kinds of view of the
text in a PDF document. It has export options:

    Save As Text
    Save As Text (Accessible)

It is this 2nd view that is used by Assistive Technology, such as
(so-called) "screen-readers" for people with poor eyesight.
The name is a misnomer, since they need not be reading the screen
at all, but instead read from an underlying text view based upon
the words that are shown.

The /Alt tag allows you to substitute something else for anything
that otherwise would not read particularly well.
For example, the different uses and pronunciations of 'a';
as in "a dog" or "the letter a", or "the variable $a$".
A priori, you would expect these to be all read the same way,
but this can be altered in the latter cases by, say,

    /Span<</Alt( ay )>> BDC ... (a) ... EMC

(I think I had 'BMC' in my previous post. That was incorrect syntax,
which should have been 'BDC' for "begin dictionary-affected content",
whereas BMC is "begin marked content", used when there is no extra
dictionary affecting anything about the marked content.
Now EMC is "end marked content", used in both situations.)

>>> Actually, Roman numerals are mostly used when the numerical information is
>>> almost irrelevant as such. Nobody uses the "XIV" in "Louis XIV" to perform
>>> calculations. That's just a different way of writing "quatorze".

In this context, you might use:

    /Span<</Alt( roman numeral X I V, meaning 14, )>>BDC ... (XIV) ... EMC

explaining exactly what is meant, since a visually impaired person
cannot take the visual cue of a sequence of (specific) capital letters
to interpret the special meaning. A screen reader getting just 'XIV'
might otherwise try to pronounce this as something like "ksiv",
which would be quite confusing to the listener.
By the way, the use of the ',' in the /Alt string affects the timing
of how it is read out.

Another example is when you have words borrowed from another language;
especially names. e.g.

    /Span<</Alt( Lennard Oiler, a famous Swiss mathematician, )>>BDC ... (Leonhard) ... (Euler) ... EMC

While this extra verbosity could be useful generally, you would mostly
not expect to get it when you use "Save As Text", since the /Alt string
can deliberately contain nonsense words that just happen to be
pronounced the way you want the listener to hear.
Hence the distinction between /ActualText and /Alt for non-image content.

>> Right. So /ActualText tagging can support this distinction in meaning.
>> It is *not* intended to support calculations --- that is the domain
>> of "Content Tagging" using MathML.
>
> As nearly all roman numerals used in practice are in the range up to 5000, no
> on-the-fly calculation should be needed. That can be done by the producing
> software.

Exactly. The producing software should do all the hard work, since it
should be able to analyse using quite sophisticated algorithms, and the
results can be checked before "going to print".
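As a rough illustration of the producer doing this work for the Roman numeral case, here is a minimal Python sketch. The function names (`roman_to_int`, `pdf_literal`, `roman_alt_text`, `span_with_alt`) are hypothetical and not part of any PDF library. It assumes ASCII-only /Alt text, well-formed numerals in standard subtractive notation (no validation is done), and that the wrapped content is already a valid content-stream fragment such as `(XIV) Tj`.

```python
# Hypothetical helpers for illustration only; not taken from any PDF library.
ROMAN_VALUES = {"I": 1, "V": 5, "X": 10, "L": 50, "C": 100, "D": 500, "M": 1000}


def roman_to_int(numeral: str) -> int:
    """Convert a Roman numeral such as 'XIV' to an integer.

    Reads right to left: a letter smaller than the largest one already
    added is subtracted (the 'I' in 'IV'), otherwise it is added.
    Assumes a well-formed numeral; no validation is attempted.
    """
    total = 0
    previous = 0
    for letter in reversed(numeral.upper()):
        value = ROMAN_VALUES[letter]
        if value < previous:
            total -= value
        else:
            total += value
            previous = value
    return total


def pdf_literal(text: str) -> str:
    """Wrap ASCII text as a PDF literal string, escaping \\, ( and )."""
    escaped = text.replace("\\", "\\\\").replace("(", "\\(").replace(")", "\\)")
    return f"({escaped})"


def roman_alt_text(numeral: str) -> str:
    """Build an /Alt string that spells out the letters and gives the value."""
    spelled = " ".join(numeral.upper())
    return f" roman numeral {spelled}, meaning {roman_to_int(numeral)}, "


def span_with_alt(alt_text: str, content: str) -> str:
    """Surround a content-stream fragment with /Span<</Alt(...)>>BDC ... EMC."""
    return f"/Span<</Alt{pdf_literal(alt_text)}>>BDC {content} EMC"


if __name__ == "__main__":
    print(span_with_alt(roman_alt_text("XIV"), "(XIV) Tj"))
```

The value is computed once, at production time, so the reading software only has to speak the resulting string. Text outside ASCII would need a proper PDF text-string encoding, which this sketch does not attempt.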
Furthermore, the reading software is of unknown quality and
sophistication, so it is much better for the producer to enrich
the document with as much extra potentially useful information
as possible. Good Assistive Technology should then have settings
to be able to select whatever level of verbosity is appropriate
to the person using it.

>>> I see it just as the ability to copy "quatorze" from a text and paste it
>>> into a worksheet cell accepting numbers to get 14. In the case of Roman
>>> numerals it may be simpler, of course. But is it useful?
>>
>> Most certainly it is useful.
>> It is part of the way of the future for smart PDF documents.
>
> Exactly. It is a different representation form of numbers not the actual
> letters. It doesn't matter, when the pdf is only intended to be printed, but
> for electronic use, it does matter.
>
> bye
>
> Toscho

Hope this helps,

    Ross

------------------------------------------------------------------------
Ross Moore                                       ross.mo...@mq.edu.au
Mathematics Department                           office: E7A-419
Macquarie University                             tel: +61 (0)2 9850 8955
Sydney, Australia  2109                          fax: +61 (0)2 9850 8114
------------------------------------------------------------------------

--------------------------------------------------
Subscriptions, Archive, and List information, etc.:
  http://tug.org/mailman/listinfo/xetex