Hi all, @Christian Boitet: you are right. Marathi characters are not a subset of the Hindi ones.
Firstly, Marathi has two Ls, unlike Hindi: ल and ळ. On the InScript keyboard, the second one is treated as the "harsh" L, in the sense that it is obtained with "Shift+ल". This harsh L also appears in Vedic Sanskrit, e.g. in the Rigveda:

अहेळमानो वरुणेह बोध्युरुशंस मा न आयुः प्र मोषीः॥ १.०२४.११॥

Secondly, श is written differently in Marathi than in Hindi. Thirdly, the soft L, ल, is also written differently in Marathi. For the details, see Table 12.6 on page 461 of the Unicode Standard, version 9.0, at the link below:

http://www.unicode.org/versions/Unicode9.0.0/ch12.pdf

Finally, the fonts "Yashomudra" and "Yashovenu", prepared by CDAC and the Government's Institute for Marathi Language Development (राज्य मराठी विकास संस्था), have these characters. IIT Bombay's Girish Dalvi has the font "Mukta", and the above two fonts allow choosing the Marathi characters in (Xe)LaTeX with polyglossia by adding the parameter "StylisticSet=1".

With best regards,
-Rohit.

On Mon, Jun 19, 2017 at 4:46 AM, Christian Boitet <christian.boi...@imag.fr> wrote:

> Hi, 18/6/17
>
> Marathi (script) is surely not a subset of Hindi (script) as, for example,
> there are 2 letters "L" in Marathi and only 1 in Hindi.
>
> Maybe some colleagues from India or Pakistan could help. I put 3 in copy.
>
> IITB/CFILT has been working on Hindi, Marathi, Bengali and more for years
> under Prof. Pushpak Bhattacharyya. Ritesh Shah is finishing his PhD with
> both of us and is a Gujarati native speaker.
> Prof. Pushpak is currently President of the ACL and knows everybody in NLP
> in India. He can certainly answer many questions and give pointers to
> colleagues who know all the details of the "Indo-Pak" languages. Abbas
> Malik, who also did his PhD with us, probably knows the most about Indo-Pak
> languages and their scripts, as he did his PhD on transliteration between
> scripts of these languages (many have 2).
>
> Best,
> Christian Boitet
>
>
> On 18 June 2017 at 17:35, Zdenek Wagner <zdenek.wag...@gmail.com> wrote:
>
> 2017-06-18 16:38 GMT+02:00 Mike Maxwell <maxw...@umiacs.umd.edu>:
>
>> On 6/18/2017 4:04 AM, Zdenek Wagner wrote:
>>
>>> as far as I know the Devanagari fonts are either Sanskrit with all
>>> conjuncts that cannot be switched off or Hindi without the Sanskrit
>>> conjuncts.
>>>
>>
>> Do other languages that use Devanagari, like Gujarati, use the same
>> conjuncts as Hindi?
>>
>
> Gujarati is written in the Gujarati script. Devanagari is used in Marathi
> and Nepali. There is a Nepali Linux Group; I offered to create xindy rules
> for them, and Steve White asked me about the conjuncts so that he could
> implement the Nepali language, but I got no reply from them. I have no
> response from Marathi users either, but I have some printed documents in
> Marathi and it seems that the set of conjuncts is the same as in present-day
> Hindi (Marathi does not use characters with nuktas, thus the name of the
> Bollywood actress Priyanka Chopra is written as प्रियंका चोपड़ा in Hindi
> newspapers and as प्रियांका चोप्रा in Marathi newspapers). I have not been
> to Rajasthan so I do not know whether Rajasthani, Mewari, Marwari have
> differences, but probably not.
>
> So the result is that Marathi is most probably a subset of Hindi, hence
> Language=Hindi can also be used for Marathi. A strictly Marathi font may be
> unusable for Hindi because the characters with nuktas, and especially their
> conjuncts and half forms, need not be available in the font. I saw such a
> font a few years ago but it was fixed.
>
>
> Zdeněk Wagner
> http://ttsm.icpf.cas.cz/team/wagner.shtml
> http://icebearsoft.euweb.cz
>
>
>
>> --
>> Mike Maxwell
>> "My definition of an interesting universe is
>> one that has the capacity to study itself."
>> --Stephen Eastmond
>>
>
>
>
> --------------------------------------------------
> Subscriptions, Archive, and List information, etc.:
> http://tug.org/mailman/listinfo/xetex
>
>
> -------------------------------------------------------------------------
>
> Christian Boitet
>
> (Professor Emeritus, Université Grenoble Alpes)
> Laboratoire d'Informatique de Grenoble
> L I G
>
> Groupe d'Etude pour la Traduction Automatique
>
> et le Traitement Automatisé des Langues et de la Parole
>
> G E T A L P
>
>
> --- Postal address ---
> GETALP, LIG-campus
> Bâtiment IMAG, bureau 339
> CS 40700
> 38058 Grenoble Cedex 9
> France
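P.S. For anyone who wants to try the "StylisticSet=1" setting mentioned at the top of this message, here is a minimal sketch rather than a tested recipe. It assumes the font is installed under the family name "Mukta" and that stylistic set 1 is where the font exposes its Marathi forms of ल and श; both are assumptions, and "Yashomudra" or "Yashovenu" could be substituted for the font name. The main language (english) is also just a placeholder.

```latex
% Compile with xelatex.
\documentclass{article}
\usepackage{fontspec}
\usepackage{polyglossia}
\setmainlanguage{english}   % placeholder main language (assumption)
\setotherlanguage{marathi}

% Assumption: the font family is installed as "Mukta" and provides
% the Marathi letterforms through OpenType stylistic set 1 (ss01).
\newfontfamily\devanagarifont[Script=Devanagari,StylisticSet=1]{Mukta}

\begin{document}
% The two Ls and SHA discussed in this thread.
\textmarathi{ल ळ श}
\end{document}
```

Compiling the same file once with and once without StylisticSet=1 should show whether the chosen font actually switches the shapes of ल and श.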