Hi,

Yes, this doesn't work properly. There is an ongoing effort to address this for 3.0, but it is incomplete and supports only Bengali. See the GsubWorkerForBengali and GsubWorkerFactory classes; maybe you can use them as inspiration to write something for Devanagari.

https://issues.apache.org/jira/browse/PDFBOX-4189

Tilman

On 29.04.2023 14:08, Ruel shakya wrote:
Hello,
I am writing to request assistance with modifying the encoding rules for Devanagari script in the PDFBox library. Our team is working on a research project that involves writing a rendering engine/library for Nepali Devanagari script in PDF documents, using the Scala programming language and the PDFBox library.

We have encountered issues with the current encoding rules for Devanagari script in the library, which we are trying to solve. The script is not rendering properly in the PDF documents generated by our program, and we suspect that this may be due to the encoding rules.
For instance, when we run this program:
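import java.io.File;
import org.apache.pdfbox.pdmodel.PDDocument;
import org.apache.pdfbox.pdmodel.PDPage;
import org.apache.pdfbox.pdmodel.PDPageContentStream;
import org.apache.pdfbox.pdmodel.font.PDFont;
import org.apache.pdfbox.pdmodel.font.PDType0Font;

public class App {
    public static void main(String[] args) throws Exception {
        PDDocument document = new PDDocument();
        PDPage page = new PDPage();
        document.addPage(page);
        PDPageContentStream contentStream = new PDPageContentStream(document, page);

        // Embed a TrueType font that contains Devanagari glyphs.
        PDFont unicodeFont = PDType0Font.load(document, new File("arial-unicode-ms.ttf"));

        contentStream.setFont(unicodeFont, 14);
        contentStream.beginText();
        // newLineAtOffset/showText replace the deprecated
        // moveTextPositionByAmount/drawString calls.
        contentStream.newLineAtOffset(100, 200);
        contentStream.showText("नेपालीहरू फिनल्याण्डमा राष्ट्रिय");
        contentStream.endText();
        // Second (empty) text block, kept from the original test program.
        contentStream.beginText();
        contentStream.newLineAtOffset(100, 300);
        contentStream.endText();
        contentStream.close();
        document.save("test.pdf");
        document.close();
    }
}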
The resulting output is as follows:
image.png
PDFBox does not substitute half consonants and places the left dependent vowel in the wrong position (this is just what shows in this particular example; there are many more issues).
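
To illustrate the vowel-ordering part of the problem: in Devanagari, the dependent vowel sign i (U+093F) is stored after its consonant cluster in Unicode text but is drawn to the left of it, so some shaping step has to move it before the glyphs are written. The following is a minimal sketch of that single reordering rule in plain Java at the character level. It is not PDFBox code: the class and method names are hypothetical, it ignores nukta consonants and all other signs, and real shaping works on glyph IDs driven by the font's GSUB table rather than on characters.

```java
// Hypothetical, simplified illustration; not part of PDFBox.
class DevanagariReorderSketch {
    private static final char VOWEL_SIGN_I = '\u093F'; // left dependent vowel
    private static final char VIRAMA = '\u094D';       // joins consonants into a cluster

    // Basic Devanagari consonants KA (U+0915) through HA (U+0939).
    static boolean isConsonant(char c) {
        return c >= '\u0915' && c <= '\u0939';
    }

    // Moves each vowel sign i in front of the consonant cluster
    // (consonant, or consonant + virama + consonant ...) that precedes it.
    static String reorderPreBaseMatra(String text) {
        StringBuilder out = new StringBuilder(text.length());
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            if (c != VOWEL_SIGN_I) {
                out.append(c);
                continue;
            }
            // Walk backwards over the cluster already copied to the output.
            int pos = out.length();
            while (pos > 0 && isConsonant(out.charAt(pos - 1))) {
                pos--; // now at the consonant
                if (pos >= 2 && out.charAt(pos - 1) == VIRAMA
                        && isConsonant(out.charAt(pos - 2))) {
                    pos--; // step over the virama and keep walking
                } else {
                    break;
                }
            }
            out.insert(pos, c);
        }
        return out.toString();
    }
}
```

Calling DevanagariReorderSketch.reorderPreBaseMatra on the logical-order text would return the characters in the order in which they should be drawn; half-consonant substitution would still need separate handling.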

We would greatly appreciate any guidance on where in the PDFBox library code we can modify the encoding rules for the Devanagari script since it would save us a lot of time and effort.

Thank you very much for your help and support. We look forward to hearing back from you soon.

Regards
Ruel Shakya
