Hi list,

Thanks a ton to @RadioNoiseE for the examples: with your help we can now export Org documents in Chinese or Japanese to PDF using XeLaTeX.
@all: Please install or refresh the feature/all-tex-fonts branch to check the results. I have added a short section to the manual documenting my current understanding of the state of the branch.

CALL FOR HELP: Is anyone fluent in Korean who could point out what is missing?

On 11/11/25 16:01, RadioNoiseE wrote:
> On Tue, 11 Nov 2025 22:05:55 +0800,
> Pedro Andres Aranda Gutierrez wrote:
>>
>> Hi
>>
>> Thanks a lot for tuning in...
>> Answers - or maybe more questions ;-) - inline...
>>
>> On Tue, 11 Nov 2025 at 14:35, RadioNoiseE <[email protected]> wrote:
>>
>> On Tue, 11 Nov 2025 02:14:11 +0800,
>> Ihor Radchenko wrote:
>> >
>> > Huang Jing <[email protected]> writes:
>> >
>> > >> How does it play with babel and polyglossia?
>> > >
>> > > It's not mentioned in the documents of xeCJK and luatex-ja, however I
>> > > believe they do work together. From my limited testing, when loaded as
>> > > packages, xeCJK and luatex-ja do no localization, thus relying on
>> > > babel. However they will override the font settings by babel, which is
>> > > totally acceptable.
>> >
>> > That actually depends. If the user of Org mode customizes fonts, it may
>> > be a surprise when xeCJK/luatex-ja override the fonts. So, we might only
>> > load these packages conditionally, when no font is explicitly selected.
>> > Or maybe we simply put font settings _after_ xeCJK/luatex-ja is loaded.
>>
>> We don't need to configure fonts for babel, and it only provides
>> localization. xeCJK provides the \setCJK...font control sequence while
>> luatex-ja provides \set...jfont, so we can use them for font
>> configuration.
>>
>> That was my understanding... I've done a couple of experiments based on what
>> overleaf.com was providing and was able to
>> start handling \setCJK...font{} with \usepackage{fontspec}.
>> If you were so kind to provide a MWE for luatex-ja, I think
>> we could have something reasonable for Japanese too.
>
> Sure. This is for Chinese under LuaTeX:
>
> \documentclass{article}
>
> \makeatletter
> \def\ltj@stdmcfont{FandolSong} % serif font
> \def\ltj@stdgtfont{FandolHei} % sans serif and monospace font (usually the same)
> \def\ltj@stdyokojfm{quanjiao} % jfm
> \makeatother
>
> \usepackage{luatexja} % load after defining \ltj@std...
> \usepackage{indentfirst} % convention
> \usepackage[chinese,provide=*]{babel} % load after luatexja
>
> \catcode`\^^^^200b=\active\let^^^^200b\relax % ignore zws
>
> \parindent=2\zw % convention
> \linespread{1.333} % 16pt/12pt
>
> \begin{document}
>
> \section{天山山脉}
>
> 位于乌鲁木齐市以东的博格达峰海拔5445米,峰上的积雪终年不化,人们称它
> “雪海”。位于博格达峰山腰的天池,清澈透明,是新疆著名的旅游胜地。目前,
> 博格达峰自然保护区已纳入联合国“人与生物圈”自然保护区网。托木尔峰,海
> 拔7439米,是天山的最高峰,登山界一般承认1956年阿巴拉科夫首次登顶成功,
> 但也有说1938年已有苏联登山队登顶;1975年7月25日首个中国登山队登顶成
> 功。
>
> \end{document}
>
> This is for Japanese under LuaTeX:
>
> \documentclass{article}
>
> \usepackage{luatexja} % OOTB Japanese support
> \usepackage{indentfirst} % conventions
> \usepackage[japanese,provide=*]{babel} % load after luatexja
>
> \catcode`\^^^^200b=\active\let^^^^200b\relax % ignore zws
>
> \parindent=\zw % convention, different from Chinese which is 2\zw
> \linespread{1.333} % 16pt/12pt
>
> \begin{document}
>
> \section{二億圓の犬}
>
> 犬はよく訓練されたフォックス・テリアで「歐洲の驚異の犬」といわれたも
> のだそうである。それを加州へ送る途中、兩會社の不注意で、途中で死んで
> しまったので、それに對して、二億二千萬圓の損害賠償をしろというのが、
> この訴えである。
>
> いくらアメリカでも、こういう話は珍しいらしく、加州の話が、シカゴの新
> 聞にまで載ったわけである。どんな犬かは知らないが、いくら名犬でも、二
> 億圓の犬というのは、われわれには一寸考えが及ばない。とにかく、とんで
> もない話が時々起る國である。
>
> \end{document}
>
>
>> > > 1. Under XeTeX and LuaTeX, xeCJK and luatex-ja will setup font support
>> > >    according to the platform (operating system) detected, and activate
>> > >    font, kinsoku, line-breaking support. They will not change the
>> > >    \baselineskip.
>> > >
>> > > 2.
>> > >    When ctex is being used, it will also configure correct
>> > >    \baselineskip (from the default 12pt to 16pt). It will also try to
>> > >    support pdfTeX.
>> > >
>> > > 3. Localization support provided by babel.
>> > >
>> > > So it's actually necessary to load babel when not using the document
>> > > classes provided. It's safer to load babel first though.
>> >
>> > Note that babel also provides rules for typography. So,
>> > xeCJK/luatex-ja do step onto babel a bit. But, as you said, they
>> > basically add missing typographical rules, so it might be reasonable.
>> >
>> > > Neither xeCJK nor luatex-ja is necessary for font configuration when
>> > > babel is being used. Since babel only supports Chinese and Japanese on
>> > > LuaTeX and XeTeX with OTF support, the CJK font can be loaded the same
>> > > way as latin fonts. See
>> > > https://latex3.github.io/babel/guides/locale-chinese.html.
>> >
>> > > However babel is hardly ever used in Chinese or Japanese community,
>> > > since their support is so, primitive. For example it does not add
>> > > xkanjiskip between latin and CJK characters. Here's a relevant
>> > > discussion on relying on babel for localization in the ctex community:
>> > > https://github.com/CTeX-org/ctex-kit/issues/626#issuecomment-1147428749.
>> >
>> > My understanding from this is that we (1) always want to load xeCJK for
>> > Chinese documents (what about luatex?); (2) always want to load
>> > luatex-ja for Japanese (what about xetex?).
>>
>> We can configure luatex-ja for Chinese documents on LuaTeX, by
>> changing the \parindent to 2\zw, change the default font (HaranoAji)
>> to FandolSong, and change the JFM (Japanese font metric). Vice versa.
>>
>> As said above... I'd like to see a MWE to check.
>
> For LuaTeX, see above.
> For XeTeX, Chinese:
>
> \documentclass{article}
>
> \usepackage{xeCJK} % OOTB Chinese support
> \usepackage{indentfirst} % convention
> \usepackage[chinese,provide=*]{babel} % load after xeCJK
>
> \catcode`\^^^^200b=\active\let^^^^200b\relax % ignore zws
>
> \parindent=2em % convention
> \linespread{1.333} % 16pt/12pt
>
> \begin{document}
>
> \section{天山山脉}
>
> 位于乌鲁木齐市以东的博格达峰海拔5445米,峰上的积雪终年不化,人们称它
> “雪海”。位于博格达峰山腰的天池,清澈透明,是新疆著名的旅游胜地。目前,
> 博格达峰自然保护区已纳入联合国“人与生物圈”自然保护区网。托木尔峰,海
> 拔7439米,是天山的最高峰,登山界一般承认1956年阿巴拉科夫首次登顶成功,
> 但也有说1938年已有苏联登山队登顶;1975年7月25日首个中国登山队登顶成
> 功。
>
> \end{document}
>
> and for Japanese:
>
> \documentclass{article}
>
> \usepackage{xeCJK} % load first
> \usepackage{indentfirst} % convention
> \usepackage[japanese,provide=*]{babel} % load after xeCJK
>
> \setCJKmainfont{HaranoAjiMincho} % serif font
> \setCJKsansfont{HaranoAjiGothic} % sans serif font
> \setCJKmonofont{HaranoAjiGothic} % monospace font
>
> \catcode`\^^^^200b=\active\let^^^^200b\relax % ignore zws
>
> \parindent=1em % convention
> \linespread{1.333} % 16pt/12pt
>
> \begin{document}
>
> \section{二億圓の犬}
>
> 犬はよく訓練されたフォックス・テリアで「歐洲の驚異の犬」といわれたも
> のだそうである。それを加州へ送る途中、兩會社の不注意で、途中で死んで
> しまったので、それに對して、二億二千萬圓の損害賠償をしろというのが、
> この訴えである。
>
> いくらアメリカでも、こういう話は珍しいらしく、加州の話が、シカゴの新
> 聞にまで載ったわけである。どんな犬かは知らないが、いくら名犬でも、二
> 億圓の犬というのは、われわれには一寸考えが及ばない。とにかく、とんで
> もない話が時々起る國である。
>
> \end{document}
>
>> > >> > For the \setCJK...font declaration, I can provide a wrapper in LaTeX
>> > >> > if needed, compatible with XeTeX, LuaTeX and probably other
>> > >> > engines. You will need xeCJK for this control sequence while other
>> > >> > engines will not compile because it is provided by the xeCJK package.
>> > >> > Under other engines, there are different control sequences used for
>> > >> > font configuration (i.e., under LuaTeX thus luatex-ja, you use
>> > >> > \set...jfont).
>> >
>> > Could you expand on "other engines will not compile"?
>> > How does it fit to
>> > "compatible with XeTeX, LuaTeX, and probably other engines"?
>> > (Note that inclusion or not inclusion of xeCJK can be controlled by us -
>> > we know which compiler is used for export during export and can
>> > conditionally include it on Elisp level)
>>
>> What I mean by ``other engines will not compile'' is when directly
>> using \setCJK...font in the exported document, even though ctex works
>> across different TeX engines, since it's xeCJK providing these
>> commands, it will not compile under, i.e., LuaTeX.
>>
>> But as we don't use ctex now, we just need to call \setCJK...font for
>> XeTeX after loading xeCJK, and \set...jfont for luatex-ja under
>> LuaTeX. Since we can access the target engine through
>> org-latex-compilers.
>>
>> Hmm... so my guess was not that wrong ;-)
>>
>> > >> Could you provide more details about these commands?
>> > >
>> > > Equivalents to \setCJK...font provided by luatex-ja are documented in
>> > > English here:
>> > > https://mirrors.ctan.org/macros/luatex/generic/luatexja/doc/luatexja-en.pdf
>> > > Search for ``Table 1: Commands of luatexja-fontspec'' in that
>> > > PDF. They are provided by luatexja-fontspec, which autoloads luatexja
>> > > and fontspec.
>> >
>> > Ok. \setmainjfont, \setsansjfont, and \setmonojfont seem to be of
>> > interest. They are direct equivalents of \setCJKmainfont,
>> > \setCJKsansfont, and \setCJKmonofont. This is probably only relevant
>> > when using bare bones fontspec or polyglossia to set fonts. When using
>> > babel, it probably makes sense to keep using \babelfont[chinese]{rm}{...}
>>
>> I think we should configure fonts through xeCJK or luatex-ja provided
>> interface, since they will override the babel font. Babel will not
>> complain about no font specified.
>>
>> I'm close to designing a strategy for this. Currently, when I detect CJK
>> fonts, I include xeCJK.
>> So, with an MWE for Japanese fonts, it would not be too difficult to get
>> this configuration right, too.
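
To make the engine split discussed just above concrete, here is a minimal sketch of a single preamble that picks the font interface by engine. It is an illustration under assumptions, not what the branch generates: I assume the iftex package for engine detection (Org would rather decide this on the Elisp side), I use luatexja-fontspec for the \set...jfont commands even though the thread also considers avoiding it, and the HaranoAji font names are simply reused from the MWEs in this thread and must be installed.

```latex
% Sketch only: select the CJK font commands according to the engine.
\documentclass{article}
\usepackage{iftex}                  % assumption: used here for \ifXeTeX / \ifLuaTeX
\ifXeTeX
  \usepackage{xeCJK}                % provides \setCJK...font
  \setCJKmainfont{HaranoAjiMincho}  % serif
  \setCJKsansfont{HaranoAjiGothic}  % sans serif
  \setCJKmonofont{HaranoAjiGothic}  % monospace
\else\ifLuaTeX
  \usepackage{luatexja-fontspec}    % autoloads luatexja and fontspec
  \setmainjfont{HaranoAjiMincho}    % serif
  \setsansjfont{HaranoAjiGothic}    % sans serif
  \setmonojfont{HaranoAjiGothic}    % monospace
\fi\fi
\usepackage[japanese,provide=*]{babel} % loaded after xeCJK/luatexja, as in the MWEs
\begin{document}
\section{二億圓の犬}
犬はよく訓練されたフォックス・テリアである。
\end{document}
```

Compiling the same file with xelatex and with lualatex should exercise one branch each; pdflatex is deliberately not covered here.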
>
> I think you need to include xeCJK even if the user does not specify
> fonts, so there's a fallback/default one. (Not necessary for Chinese
> under xeCJK, since it's OOTB; but for Japanese it's necessary, and
> same for luatexja -- need to specify default Chinese Fandol font.)
>
> Hopefully the MWEs help explain things.
>
>> > > luatexja also patches LaTeX2e's NFSS2, adding CJK font
>> > > support. However unless there's a specific reason we shouldn't use
>> > > that in Org export results.
>> >
>> > That sounds concerning. What are the potential consequences?
>>
>> I think no observable consequences for Org export. It will not
>> interfere with any existing functionality. What it does is extending
>> existing framework, providing NFSS2 like interfaces for document
>> classes, handling CJK font scaling, vertical typesetting, etc
>> features.
>>
>> However I was thinking to not use luatexja-fontspec, that is we no
>> longer have \set...jfont control sequences. Since luatexja-fontspec
>> should be loaded after fontspec as it patches fontspec. As a
>> replacement, we can use (ref. luatexja document section 8.3)
>>
>> \ltj@stdmcfont -> The default Japanese font for the mincho family (serif)
>> \ltj@stdgtfont -> The default Japanese font for the gothic family (sans
>>                   serif and monospace)
>> \ltj@stdyokojfm -> The default JFM for horizontal direction
>> \ltj@stdtatejfm -> The default JFM for vertical direction
>>
>> > > I'm currently having my mid-term exams, so I'll be able to work on
>> > > this after Tuesday.
>> >
>> > No problem. I think Pedro wanted the whole thing to be in mergeable
>> > state (not necessarily final) before EmacsConf, but we are generally not
>> > very pushy - we are all volunteers after all.
>>
>> I don't want to push... it's just that I have a talk on this in EmacsConf
>> and it would be cool to be able to say 'you have it in org-mode master'.
>>
>> > >> Org mode only supports exporting via pdflatex, xelatex, and lualatex.
>> > >
>> > > Then my idea is to drop ctex, and use xeCJK or luatex-ja with babel.
>> > > These two packages support both Chinese and Japanese, while xeCJK
>> > > comes with out-of-the-box Chinese support and luatex-ja comes with
>> > > out-of-the-box Japanese support.
>> >
>> > Good.
>> >
>> > > pdfTeX support is also feasible, through the CJK package, which is
>> > > used by ctex as well.
>> >
>> > Note that pdfTeX is something we are not certain about. I wish we could
>> > do it, but it seems tricky. We will need to work out how we want to
>> > design the pdftex support. Tentatively, we may add a field to
>> > `org-latex-language-alist' where standard per-language config will be
>> > stored and loaded according to #+LANGUAGE settings (note that there
>> > might be multiple languages in one document).
>>
>> CJK support on pdfTeX would require appropriate tfm, then we should be
>> able to use \pdfmapline to setup CJK font. It is tricky somehow.
>>
>> > --
>> > Ihor Radchenko // yantar92,
>> > Org mode maintainer,
>> > Learn more about Org mode at <https://orgmode.org/>.
>> > Support Org development at <https://liberapay.com/org-mode>,
>> > or support my work at <https://liberapay.com/yantar92>
>>
>> Best, /PA
>>
>> --
>> Fragen sind nicht da, um beantwortet zu werden,
>> Fragen sind da um gestellt zu werden
>> Georg Kreisler
>>
>> "Sagen's Paradeiser" (ORF: Als Radiohören gefährlich war) => write BE!
>> Year 1 of the New Koprocracy
