[XeTeX] Typesetting arabic and european mix encoded in utf8

2024-06-11 Thread Niemann, Hartmut via XeTeX
Hello!

In my current project I use XeLaTeX to typeset PDF files from texts in 
different languages held in a separate database.
(This is done with a language-unaware generator that writes lines like
\long\def\msgtext{عطل في التهيئة البنيوماتية GS}
into a .inc file, and a manually written, language-dependent frame document 
that defines \msgtext{}.)

I typeset a (mostly) Arabic document using XeLaTeX and 
\usepackage{arabxetex}[utf]

Arabxetex supports encoding Arabic in ASCII, and this conflicts with the fact 
that our texts contain Latin characters, such as English abbreviations, 
location IDs and the like.
The documented solution would be to enclose the Latin characters that are to 
be typeset verbatim in \text{LR}, which is rather hard when the text comes from 
a database.

Does anybody know how to switch off arabxetex’s ASCII-to-Arabic conversion 
completely?

Or is there a package that supports Arabic (with Arabic typographic 
conventions) but made for pure Unicode sources?

With best regards

Hartmut Niemann





Re: [XeTeX] Typesetting arabic and european mix encoded in utf8

2024-06-11 Thread Zdenek Wagner
Hello,

I have not used Arabic, but I have used Urdu, which is written in a modified
Arabic script. I have a book written in Czech with just small parts in Hindi
and Urdu, and I typeset it in XeLaTeX with the polyglossia package. A very
small sample of the book is here: http://icebearsoft.euweb.cz/bharat.php

The page contains a link to the presentation of typesetting the book. The
slides are in Czech because it was a national conference but slide #10
shows that the line break in the Urdu text is correct although the main
language of the paragraph is Czech.
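
A setup of this kind (Czech main text with short Urdu insertions) might look
roughly like the sketch below. It is not taken from the book: the font name
and the sample sentence are assumptions, and any OpenType font with Arabic
script coverage should work in place of the one named here.

```latex
% Compile with xelatex. Minimal sketch, not the book's actual preamble.
\documentclass{article}
\usepackage{polyglossia}
\setmainlanguage{czech}
\setotherlanguage{urdu}
% Assumption: Noto Nastaliq Urdu is installed on the system.
\newfontfamily\urdufont[Script=Arabic]{Noto Nastaliq Urdu}

\begin{document}
% "Name of the language in Urdu:" followed by a right-to-left insertion.
Název jazyka v urdštině: \texturdu{اردو}.
\end{document}
```

The Urdu is entered directly as Unicode; \texturdu only switches language,
font and direction for its argument.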

Zdeněk Wagner
https://www.zdenek-wagner.eu/




Re: [XeTeX] Typesetting arabic and european mix encoded in utf8

2024-06-11 Thread Niemann, Hartmut via XeTeX
Hello Zdeněk,

thank you for your hints. What a wonderful book!
I’ll take the TeX code of your tools package as a start and experiment with it. 
Maybe I will need to adapt it to mark the Latin characters explicitly, rather 
than rely on automatic switching between Latin LtoR and Arabic RtoL.
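
Explicit marking of the Latin runs could be sketched as follows, assuming
polyglossia is used instead of arabxetex. \LatinRun is a hypothetical helper
name that the generator would have to emit around Latin text, and the font is
an assumption.

```latex
% Compile with xelatex. Sketch of explicit marking; \LatinRun is hypothetical.
\documentclass{article}
\usepackage{polyglossia}
\setmainlanguage{arabic}
\setotherlanguage{english}
% Assumption: Amiri is installed; any font with Arabic coverage should do.
\newfontfamily\arabicfont[Script=Arabic]{Amiri}

% Wraps a Latin run so that it is set left-to-right as English text.
\newcommand*{\LatinRun}[1]{\textenglish{#1}}

% Stand-in for a generated line in the .inc file, with the Latin part marked.
\long\def\msgtext{عطل في التهيئة البنيوماتية \LatinRun{GS}}

\begin{document}
\msgtext
\end{document}
```

Keeping the wrapper as a single macro means the frame document can redefine
it per language without touching the generator output.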

Hartmut






Re: [XeTeX] Typesetting arabic and european mix encoded in utf8

2024-06-11 Thread Jens Bakker
Hello Hartmut Niemann,

It may be that the XeLaTeX package polyglossia would serve your purposes 
much better. It lets you use many languages in one document, including Arabic 
and other RTL text.
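
As a rough illustration of that suggestion, a frame document for mostly
Arabic text in pure Unicode could start from the sketch below. The font is an
assumption, and \msgtext is defined inline here only as a stand-in for the
generated .inc file.

```latex
% Compile with xelatex. Minimal sketch under the assumptions stated above.
\documentclass{article}
\usepackage{polyglossia}
\setmainlanguage{arabic}
\setotherlanguage{english}
% Assumption: Amiri is installed; polyglossia needs \arabicfont to be defined.
\newfontfamily\arabicfont[Script=Arabic]{Amiri}

% Stand-in for the database-generated definition; input is plain UTF-8.
\long\def\msgtext{عطل في التهيئة البنيوماتية GS}

\begin{document}
\msgtext
\end{document}
```

No ASCII transliteration is involved, so Latin letters are not converted to
Arabic. Latin runs of several words may still come out in the wrong word
order inside a right-to-left paragraph unless they are marked, for example
with \textenglish{...}.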

Best wishes and best regards,
Jens Bakker


