Well, it occurs to me that the *real* problem here is that the fonts in the individual PDF files are subsets. If they were not, then I believe you could safely and easily use MuPDF (specifically mutool clean) to remove the duplicate fonts, or at least the duplicate FontFile streams; I'm not
certain whether the Font and FontDescriptor objects could be removed as well. But that would certainly cover a good portion of the file size: the fonts are running at about 9 KB each, while the Font and FontDescriptor objects are a few tens of bytes.
The fonts in the PDFs are identical fonts constructed by ghostscript on the fly. I think
it was Ken Sharp who explained to me some years ago that the term "subset" is
wrong ;-)
One emmentaler font + three encodings + one character (scaled to invisibility) of each encoding, used prior to anything else in the PS, leads ghostscript to produce three different subsets ;-)) of the emmentaler font in every PDF. But the set of three "subsets" is identical in every PDF that is produced
this way, and so gs is (was) able to remove the duplicates. That's the --bigpdf trick.
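To illustrate why byte-identical embedded fonts can be merged while differing subsets cannot, here is a minimal Python sketch of duplicate detection by content hash. It is only an illustration of the idea, not how gs or mutool work internally; the function name and the sample byte strings are made up for the example.

```python
import hashlib


def dedupe_streams(streams):
    """Keep one copy of each byte-identical stream.

    Returns (unique, mapping), where mapping[i] is the index in `unique`
    of the stream that replaces streams[i].
    """
    first_seen = {}  # digest -> index into `unique`
    unique = []
    mapping = []
    for data in streams:
        digest = hashlib.sha256(data).hexdigest()
        if digest not in first_seen:
            first_seen[digest] = len(unique)
            unique.append(data)
        mapping.append(first_seen[digest])
    return unique, mapping


# Hypothetical stand-ins for the three font programs embedded in each of two PDFs.
pdf_a = [b"emmentaler-enc-1", b"emmentaler-enc-2", b"emmentaler-enc-3"]
pdf_b = [b"emmentaler-enc-1", b"emmentaler-enc-2", b"emmentaler-enc-3"]

unique, mapping = dedupe_streams(pdf_a + pdf_b)
```

With identical sets in both files, six streams collapse to three. If even one byte differed between two "subsets", both copies would have to be kept.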
I agree that mutool clean can be a good starting point. If I read the documentation
correctly, it does "clean" (remove) unused objects, but it is unable to subset
fonts if not all glyphs of the fonts are used?
So the question then becomes 'why are the fonts subset?' That's a really good question, and the answer is that I don't know. It's possible that there is a genuine pdfwrite bug here. The piece of information I'm missing is the step used to create the PDF files from the EPS files; I don't know how
you are doing that.
lilypond spawns ghostscript. If our --bigpdf option is used the command is e.g.:
    gs -q -dSAFER -dEPSCrop -dCompatibilityLevel=1.4 -dNOPAUSE -dBATCH -r1200
    -dSubsetFonts=false -sDEVICE=pdfwrite -dAutoRotatePages=/None
    -sOutputFile=testa.pdf -c.setpdfwrite -ftesta.eps
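For anyone who wants to replay that call outside lilypond, a small Python sketch that assembles the same argument list may help. The flags are copied from the command above; the function names and the `gs` executable name on PATH are assumptions of this sketch, not lilypond's actual code.

```python
import subprocess


def bigpdf_gs_command(eps_path, pdf_path, gs="gs"):
    """Build the argv for the gs call quoted above (flags copied verbatim)."""
    return [
        gs, "-q", "-dSAFER", "-dEPSCrop", "-dCompatibilityLevel=1.4",
        "-dNOPAUSE", "-dBATCH", "-r1200",
        "-dSubsetFonts=false", "-sDEVICE=pdfwrite", "-dAutoRotatePages=/None",
        f"-sOutputFile={pdf_path}", "-c.setpdfwrite", f"-f{eps_path}",
    ]


def run_bigpdf_gs(eps_path, pdf_path, gs="gs"):
    """Run ghostscript; raises CalledProcessError on a non-zero exit status."""
    subprocess.run(bigpdf_gs_command(eps_path, pdf_path, gs), check=True)


# Only builds the list; call run_bigpdf_gs("testa.eps", "testa.pdf") to execute.
cmd = bigpdf_gs_command("testa.eps", "testa.pdf")
```

Passing a list rather than a shell string avoids quoting problems with file names, and the `gs` parameter makes it easy to point at a specific build (e.g. a 9.21 and a 9.22 binary) when comparing output.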
My attempts to replicate the individual PDF files have been entirely
unsuccessful; I get files with three copies of the Emmentaler font embedded
instead of one, and none of the three fonts match the ones in the PDF files Knut
supplied.
I used tag ghostscript-9.21 from the git repository.
Hmm, actually, going back to the 9.21 release does produce at least similar
behaviour, whereas the 9.22 release does not. In 9.22 I get three fonts output
instead of 1. I've no idea why currently, and right at the moment I don't have
time to look.
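A quick way to compare the two releases is to count the distinct FontFile streams each output references. The following Python sketch scans the raw bytes for `/FontFile`, `/FontFile2` and `/FontFile3` references. It assumes the FontDescriptor objects are stored uncompressed, which should hold for files written with -dCompatibilityLevel=1.4 (object streams only arrived with PDF 1.5); the sample bytes and font names are invented for the example.

```python
import re

# Matches e.g. "/FontFile3 12 0 R" and captures object and generation number.
FONTFILE_REF = re.compile(rb"/FontFile[23]?\s+(\d+)\s+(\d+)\s+R")


def embedded_font_streams(pdf_bytes):
    """Return the set of (object, generation) pairs referenced as font programs."""
    return {
        (int(m.group(1)), int(m.group(2)))
        for m in FONTFILE_REF.finditer(pdf_bytes)
    }


# Hypothetical descriptors: three fonts, two of which share one stream.
sample = (
    b"<< /Type /FontDescriptor /FontName /AAAAAA+Emmentaler-20 /FontFile3 12 0 R >>\n"
    b"<< /Type /FontDescriptor /FontName /BBBBBB+Emmentaler-20 /FontFile3 15 0 R >>\n"
    b"<< /Type /FontDescriptor /FontName /CCCCCC+Emmentaler-20 /FontFile3 12 0 R >>\n"
)

streams = embedded_font_streams(sample)
```

On a real file one would pass `open("testa.pdf", "rb").read()` and compare the size of the returned set between the 9.21 and 9.22 output.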
I'll try and remember to look at it when I am not drowning under support, but
it looks like there have been changes in this area unrelated to the
PDFDontUseObjectNum bug, and that in itself may mean that your process doesn't
work any more, or works less well.
Thanks for your patience!
Knut
_______________________________________________
lilypond-devel mailing list
lilypond-devel@gnu.org
https://lists.gnu.org/mailman/listinfo/lilypond-devel