[XeTeX] Typesetting arabic and european mix encoded in utf8
Hello!

In my current project I use XeLaTeX to typeset PDF files from texts in different languages held in a separate database. This is done with a language-unaware generator that writes lines like

    \long\def\msgtext{عطل في التهيئة البنيوماتية GS}

into a .inc file, plus a manually written, language-dependent frame document that defines \msgtext{}.

I typeset a (mostly) Arabic document using XeLaTeX and \usepackage{arabxetex}[utf]

Arabxetex supports encoding Arabic in ASCII, and this interferes with the fact that our texts contain Latin characters, such as English abbreviations, location IDs and the like. The documented solution would be to enclose the Latin characters that are to be typeset verbatim in \text{LR}, which is rather hard if the text comes from a database.

Does anybody know how to switch off arabxetex's ASCII-to-Arabic conversion completely?

Or is there a package that supports Arabic (with Arabic typographic conventions) but is made for pure Unicode sources?

With best regards

Hartmut Niemann
Re: [XeTeX] Typesetting arabic and european mix encoded in utf8
Hello,

I have not used Arabic, but I have used Urdu, which is written in a modified Arabic script. I have a book written in Czech with just small parts in Hindi and Urdu, and I typeset it in XeLaTeX with the polyglossia package. A very small sample of the book is here:
http://icebearsoft.euweb.cz/bharat.php

The page contains a link to the presentation on typesetting the book. The slides are in Czech because it was a national conference, but slide #10 shows that the line break in the Urdu text is correct although the main language of the paragraph is Czech.

Zdeněk Wagner
https://www.zdenek-wagner.eu/

On Tue, 11 June 2024 at 12:07, Niemann, Hartmut via XeTeX wrote:
> Or is there a package that supports Arabic (with Arabic typographic
> conventions) but made for pure Unicode sources?
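For readers who want to try the approach described in this reply, here is a minimal sketch of a Czech document with an embedded Urdu phrase under polyglossia. It is not the code from the book; the font name "Noto Nastaliq Urdu" and the sample sentence are assumptions chosen only for illustration, and any installed font that covers the Arabic script should work in its place.

```latex
% Minimal sketch, compile with xelatex.
% Assumption: the font "Noto Nastaliq Urdu" is installed on the system.
\documentclass{article}
\usepackage{polyglossia}
\setmainlanguage{czech}          % main (left-to-right) language
\setotherlanguage{urdu}          % embedded right-to-left language
\newfontfamily\urdufont[Script=Arabic]{Noto Nastaliq Urdu}

\begin{document}
% \texturdu switches language, font and direction for its argument only
Slovo \texturdu{اردو} se píše zprava doleva.
\end{document}
```

The Urdu text is entered directly in Unicode; the only markup is the \texturdu wrapper around the right-to-left run.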
Re: [XeTeX] Typesetting arabic and european mix encoded in utf8
Hello Zdeněk,

thank you for your hints. What a wonderful book!

I'll take the TeX code of your tools package as a starting point and experiment with it. I may need to adapt it to mark the Latin characters explicitly rather than rely on automatic switching between Latin left-to-right and Arabic right-to-left text.

Hartmut

From: Zdenek Wagner
Sent: Tuesday, 11 June 2024 12:22
To: XeTeX (Unicode-based TeX) discussion.
Cc: Niemann, Hartmut (SMO RS LMC EN LM CCI FT)
Subject: Re: [XeTeX] Typesetting arabic and european mix encoded in utf8
Re: [XeTeX] Typesetting arabic and european mix encoded in utf8
Hello Hartmut Niemann,

it may be that the XeLaTeX package polyglossia would serve your purposes much better. You can use many languages in one document, including Arabic and other RTL text.

Best wishes and best regards,
Jens Bakker

> On 11 June 2024 at 12:06, Niemann, Hartmut via XeTeX wrote:
>
> Does anybody how to switch off arabxetex's ASCII-to-arabic conversion
> completely?
>
> Or is there a package that supports Arabic (with Arabic typographic
> conventions) but made for pure Unicode sources?
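As a rough illustration of this suggestion applied to the case in this thread, the following sketch shows what a polyglossia-based frame document might look like. polyglossia expects Unicode input and does not transliterate ASCII into Arabic. The Amiri font is an assumption, and the \msgtext definition is written inline here only to stand in for the generated .inc file. The Latin abbreviation is marked with \textenglish, so the sketch does not answer whether unmarked Latin runs coming straight from the database would render acceptably; that would need testing.

```latex
% Minimal sketch, compile with xelatex.
% Assumption: the Amiri font is installed.
\documentclass{article}
\usepackage{polyglossia}
\setmainlanguage{arabic}         % right-to-left main language, Unicode input
\setotherlanguage{english}
\newfontfamily\arabicfont[Script=Arabic]{Amiri}

% Stand-in for the line the generator writes into the .inc file.
% The \textenglish wrapper is added by hand here; it is not in the database text.
\long\def\msgtext{عطل في التهيئة البنيوماتية \textenglish{GS}}

\begin{document}
\msgtext
\end{document}
```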