Public bug reported:

In Ubuntu 10.04 (GPL Ghostscript 8.71) cups-pdf produces searchable PDFs whose text can be copied and pasted cleanly. In Ubuntu 11.10 (GPL Ghostscript 9.04) this feature is gone: the PDFs cannot be searched, and copy/paste produces garbage text.

I experimented with capturing PostScript files from a printer that uses a very simple CUPS backend, which just streams stdin to a PostScript file. I installed and shared this printer on 11.10 and used it on 10.04. I used `ps2pdf` on 11.10 to generate the PDF files (just like cups-pdf does).

I merged the sample PDFs from the two Ubuntu versions into one PDF file and ran `pdffonts` to see which fonts were present. This is the result:

name                                 type              emb sub uni object ID
------------------------------------ ----------------- --- --- --- ---------
AGTLMZ+Webdings                      TrueType          yes yes yes     71  0
GYTYEM+DejaVuSerif                   TrueType          yes yes yes     72  0
ZXOPYM+Verdana                       TrueType          yes yes yes     73  0
PMFCNH+Verdana-Bold                  TrueType          yes yes yes     74  0
KPEIOE+WenQuanYiZenHei               TrueType          yes yes yes     90  0
KIIHFC+DejaVuSerif                   TrueType          yes yes yes     91  0
BPBDFC+UnBatang-Identity-H           CID TrueType      yes yes no      93  0
YTZIYS+Ume-P-Gothic-C4-Identity-H    CID TrueType      yes yes no      94  0
HYRHDD+DejaVuSansMono-Identity-H     CID TrueType      yes yes no      95  0
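
The font listing above can also be checked mechanically, which helps when comparing many test PDFs. The following is a minimal sketch, not part of the original report: it assumes the listing is available as text with one font per line, and the names `LISTING`, `parse_fonts` and `fonts_without_unicode` are hypothetical. It reports the fonts whose `uni` column is `no`, i.e. those without a Unicode mapping.

```python
# Sketch only: LISTING copies three rows from the table above; the helper
# names are illustrative and do not belong to any existing tool.

LISTING = """\
name                                 type              emb sub uni object ID
------------------------------------ ----------------- --- --- --- ---------
ZXOPYM+Verdana                       TrueType          yes yes yes     73  0
BPBDFC+UnBatang-Identity-H           CID TrueType      yes yes no      93  0
HYRHDD+DejaVuSansMono-Identity-H     CID TrueType      yes yes no      95  0
"""


def parse_fonts(listing):
    """Return one dict per font row, skipping the header and the dashed rule."""
    fonts = []
    for line in listing.splitlines():
        fields = line.split()
        # A data row holds: name, type (one or more words), emb, sub, uni,
        # object number, generation. That is at least seven fields, and the
        # last two are integers; header and rule lines fail this check.
        if len(fields) < 7 or not (fields[-1].isdigit() and fields[-2].isdigit()):
            continue
        fonts.append({
            "name": fields[0],
            "type": " ".join(fields[1:-5]),
            "emb": fields[-5] == "yes",
            "sub": fields[-4] == "yes",
            "uni": fields[-3] == "yes",
            "object": int(fields[-2]),
        })
    return fonts


def fonts_without_unicode(listing):
    """Names of the fonts whose 'uni' column is 'no'."""
    return [font["name"] for font in parse_fonts(listing) if not font["uni"]]


if __name__ == "__main__":
    for name in fonts_without_unicode(LISTING):
        print(name)
```

The type column is parsed from the right-hand side because a type such as `CID TrueType` contains a space, so splitting on whitespace from the left alone would misplace the remaining columns.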

The searchable part of the PDF (originating from the 10.04 PostScript) relies on the embedded 'TrueType' fonts, which have a Unicode encoding. The 11.10 PostScript, on the other hand, is responsible for the embedded 'CID TrueType' fonts. The absence of any Unicode encoding there seems to be the reason this part is not searchable.

What happened between 10.04 and 11.10? Running `ps2pdf` on 10.04 with the PostScript file produced on 11.10 also yields a non-searchable PDF. Did the printing process on 11.10 change, producing a different PostScript format? A format that `ps2pdf` can't handle? Is this a bug?

P.S. I printed from different applications (gedit, geany, Firefox, LibreOffice), all with the same result.

** Affects: ghostscript (Ubuntu)
     Importance: Undecided
         Status: New

--
https://bugs.launchpad.net/bugs/942866

Title:
  Ubuntu 11.10: printing to PDF produces unsearchable PDF (contrary to 10.04)