Not sure what you mean about ARGB being JPEG. The examples I provided should 
have been PNG.

The use case and reason we're doing this has to do with file sizes. These are 
insurance documents. There could be 20-50 different individual templates at 
play depending on the exact makeup of each insurance policy, and they get 
populated with data at generation time. Each policy gets a unique combination 
of documents at issuance. Each day, we take all of the policies that need to 
print for the day by type and state (which could be hundreds of pages / docs 
or more) and put them in a single batch for the outsourced print service. 
There are sometimes hundreds of individual batch files being printed daily. 
The printer then takes each resulting batch PDF file and converts it to 
PostScript (or something similar) to embed their print controls, and then 
back to PDF before sending it through their printers.

If we embed fonts in each individual document, we end up with very large file 
sizes once they're combined into large batches, and it makes the print 
production and file sharing / storage process much more cumbersome. Because 
individual policy documents are generated first and have to print in a 
specific order in the final doc, we can't know in advance what fonts will be 
involved in each final batch PDF. It also slows down the process to the point 
where we would have a hard time completing each day's print before having to 
start the next day's workload.

We're open to other options but haven't come up with a good solution yet for 
high volume clients. I know the PDF -> image -> PDF route is a little 
non-traditional, but we were trying to find something that could put the 
burden of accuracy on our system rather than ending up with garbled character 
sets in the final product, since fonts are not currently embedded. Maybe we're 
missing something obvious, but even a few misprinted documents are a huge 
liability and we're trying to reduce the likelihood of that happening to as 
close to zero as possible.


________________________________
From: Tilman Hausherr <thaush...@t-online.de>
Sent: Tuesday, August 1, 2023 11:12 PM
To: users@pdfbox.apache.org <users@pdfbox.apache.org>
Subject: Re: Border / Box around images and form elements with backgrounds

EXTERNAL: Do not click links or open attachments if you do not recognize the 
sender.

Why are the ARGB image and its mask both JPEG?

I can see the effect with Adobe Reader and Chrome, but not with PDF.js
and PDF-XChange.

The whole thing you're doing sounds weird. You're printing at a low dpi
instead of using vector fonts that will look great at every dpi. The
"font issues" are usually avoided by telling your clients that their
fonts MUST be embedded AND subsetted or else. If your printing is a mass
mailing then the fonts need to be embedded only once for the whole document.

Tilman
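
For a sense of scale on the dpi point above, here is a small back-of-the-envelope
sketch in Java. It computes the uncompressed size of one rendered page at a few
resolutions. The US Letter page size (8.5 x 11 in), the dpi values, and the class
and method names are assumptions for illustration, not figures from this thread.
The size actually stored in a PDF will be smaller and depends on the compression
filter used.

```java
// Hypothetical helper: raw (uncompressed) raster size of a single page.
class RasterPageSize {

    /** Pixel count along one side: inches times dots per inch, rounded. */
    static int pixels(double inches, int dpi) {
        return (int) Math.round(inches * dpi);
    }

    /** Uncompressed bytes for one page; each row is padded to a whole byte. */
    static long rawBytes(double widthIn, double heightIn, int dpi, int bitsPerPixel) {
        long width = pixels(widthIn, dpi);
        long height = pixels(heightIn, dpi);
        long bytesPerRow = (width * bitsPerPixel + 7) / 8;
        return bytesPerRow * height;
    }

    public static void main(String[] args) {
        // Assumed page size: US Letter, 8.5 x 11 inches.
        for (int dpi : new int[] {150, 300, 600}) {
            System.out.printf("%d dpi: RGB %d bytes, ARGB %d bytes%n",
                    dpi,
                    rawBytes(8.5, 11, dpi, 24),   // 3 bytes per pixel
                    rawBytes(8.5, 11, dpi, 32));  // 4 bytes per pixel
        }
    }
}
```

Under these assumptions a 300 dpi page is 2550 x 3300 pixels, which is
25,245,000 bytes as raw RGB and 33,660,000 bytes as raw ARGB before any
compression is applied.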

On 01.08.2023 22:30, JJ Blodgett wrote:
> It looks like the attachments were stripped out of the email. I'll try to 
> include Google doc links and hope these work:
>
> Example of bad behavior: 
> https://urldefense.com/v3/__https://drive.google.com/file/d/1ZU-vvZ1uTTDM0LTRhDJPwqVX5nY2dBL_/view?usp=drive_link__;!!I_DbfM1H!AA3gNX8OOE2YHGHy7kvG3paGSHiPOWdhUZyVGExa0KgE7WLWflgkWq8chYRmzaszJHMEuVtQQmjVGlOGKgxft-zfua8h-FGgh8g$
>
> ARGB render image: 
> https://urldefense.com/v3/__https://drive.google.com/file/d/1ZwyZejehc6AdiQJHxdJ5QrsvfJbgSq9S/view?usp=drive_link__;!!I_DbfM1H!AA3gNX8OOE2YHGHy7kvG3paGSHiPOWdhUZyVGExa0KgE7WLWflgkWq8chYRmzaszJHMEuVtQQmjVGlOGKgxft-zfua8hg68zMzs$
> RGB render image: 
> https://urldefense.com/v3/__https://drive.google.com/file/d/1m7Ikf1G65HoGJSHt9PLt6TVgT5qMhpMa/view?usp=drive_link__;!!I_DbfM1H!AA3gNX8OOE2YHGHy7kvG3paGSHiPOWdhUZyVGExa0KgE7WLWflgkWq8chYRmzaszJHMEuVtQQmjVGlOGKgxft-zfua8hNYO4wUs$
>
> ARGB output PDF: 
> https://urldefense.com/v3/__https://drive.google.com/file/d/1kb-SHEE8xS2PYTWrAgfYgmuKJMF6YUql/view?usp=drive_link__;!!I_DbfM1H!AA3gNX8OOE2YHGHy7kvG3paGSHiPOWdhUZyVGExa0KgE7WLWflgkWq8chYRmzaszJHMEuVtQQmjVGlOGKgxft-zfua8hqLR2HkI$
> RGB output PDF: 
> https://urldefense.com/v3/__https://drive.google.com/file/d/1PpHVEsSGcUltKZY0Gi-Kk1kLIx9XPLIW/view?usp=drive_link__;!!I_DbfM1H!AA3gNX8OOE2YHGHy7kvG3paGSHiPOWdhUZyVGExa0KgE7WLWflgkWq8chYRmzaszJHMEuVtQQmjVGlOGKgxft-zfua8hVb8L7hk$
>
>
> ________________________________
> From: JJ Blodgett <jj.blodg...@silvervinesoftware.com>
> Sent: Tuesday, August 1, 2023 11:49 AM
> To: users@pdfbox.apache.org <users@pdfbox.apache.org>
> Subject: Border / Box around images and form elements with backgrounds
>
>
> We're working on converting large batches of text-based PDF documents into 
> images and then back to PDF (partly to avoid font issues with certain print 
> processes down the line). But we've come across an issue that's preventing us 
> from moving forward.
>
> Both with version 2.0.29 and 3.0.0, we can generate clean images with 
> "PDFRenderer" and renderImageWithDPI() or similar methods. With RGB output, 
> we get solid images but the size is larger than we'd like. So we try to use 
> ARGB, which creates a smaller image with a transparent background, except for 
> 2 items we've found: any form field with a background and any embedded image 
> keep a non-transparent background. The images look clean and presumably are 
> exactly what we need out of the render process.
>
> But as soon as we try to convert the images back into a PDF by drawing the 
> image to a blank document page, we end up with a border around all images and 
> form fields that are non-transparent. I've included examples of both the raw 
> images and the resulting PDF (as well as the source PDF). We've tried all 
> kinds of things from render settings to draw settings and can't find a 
> combination that changes this at all. We could address all of the form fields 
> by removing backgrounds in our templates. However, we can't actually do 
> anything to get rid of company logos or other images that need to appear in 
> the documents.
>
> Because we can't figure out how to get around this issue, we're unable to use 
> ARGB and file sizes are too large to work with. If we can get ARGB to write 
> to documents without the border, I think we can move forward. Any ideas on 
> how or why this happens and whether there is a workaround or not? If it 
> matters, we're using Adobe ColdFusion to access the Java objects, but I'm 
> pretty sure that's not a limiting factor. I did notice that the built-in CF 
> functions for working with PDFs do the same thing, so it may not have a 
> workaround.
>
> If there's another way to accomplish the same thing (i.e. end up with an 
> image-based PDF rather than text to avoid text interpretation issues), that 
> would also be a possible solution. We can't embed fonts in the documents 
> because the file sizes would then be too large to work with over the 
> thousands of individual documents.
>
>
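
To make the ARGB versus RGB distinction in the message above concrete, the
following is a minimal sketch using only java.awt from the JDK, not PDFBox. It
composites an image with an alpha channel onto an opaque white background, so
that no transparency is left to be stored. The class name FlattenAlpha and the
method flatten are hypothetical, and this is not presented as a fix from the
thread: the result is an RGB image again, so it does not by itself address the
file size concern.

```java
import java.awt.Color;
import java.awt.Graphics2D;
import java.awt.image.BufferedImage;

// Hypothetical helper, JDK only: removes the alpha channel by compositing.
class FlattenAlpha {

    /** Returns an opaque TYPE_INT_RGB copy of src drawn over the given background. */
    static BufferedImage flatten(BufferedImage src, Color background) {
        BufferedImage out = new BufferedImage(
                src.getWidth(), src.getHeight(), BufferedImage.TYPE_INT_RGB);
        Graphics2D g = out.createGraphics();
        try {
            // Paint the background first, then draw the source on top.
            // Graphics2D uses source-over compositing by default, so
            // transparent source pixels let the background show through.
            g.setColor(background);
            g.fillRect(0, 0, out.getWidth(), out.getHeight());
            g.drawImage(src, 0, 0, null);
        } finally {
            g.dispose();
        }
        return out;
    }
}
```

The input could be any BufferedImage with alpha, for example one produced by an
ARGB render. Fully transparent pixels come out in the background colour and
fully opaque pixels keep their own colour.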

