[ 
https://issues.apache.org/jira/browse/SOLR-5819?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13921668#comment-13921668
 ] 

Hoss Man commented on SOLR-5819:
--------------------------------

bq. The most common of those boxes used, "note", only appears on 48 pages:

there seems to be some sort of multiplicative factor going on - based not 
just on how many _wiki_ pages the image appears on, but how many total PDF pages 
exist (whether the image is on them or not) ... for instance: 5135/395 = 13 
which is roughly how many PDF pages contain the "info" box -- the math isn't 
exact for all of the boxes, but i bet that's a factor of how many _pdf_ pages 
it's on.
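
To make the arithmetic concrete, here is a rough sketch of the guess above. It assumes (the thread does not confirm this) that 5135 is the number of embedded copies of the "info" image and 395 is the total number of PDF pages; the function name is purely illustrative.

```python
# Hypothetical reading of the two figures quoted above; neither label is
# confirmed by the thread.
embedded_copies = 5135   # assumed: copies of the "info" image in the PDF
total_pdf_pages = 395    # assumed: total number of pages in the PDF


def implied_pages_with_image(copies, total_pages):
    """If copies ~= total_pages * pages_with_image, solve for pages_with_image."""
    return copies / total_pages


ratio = implied_pages_with_image(embedded_copies, total_pdf_pages)
print(f"implied PDF pages containing the image: {ratio:.1f}")
```

If the guess holds, running the same division for the other box images should land near the number of PDF pages each one appears on; the comment already notes that the fit is not exact for all of them.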

{quote}
I can make a github.com pdfdeduper that's just a simple java -jar thing if we 
want.

But I don't think that code I pasted should go into our build, the license is 
AGPL.
{quote}

It would be nice if we could at least put the jar in dev-tools so it can be 
part of the automated ref-guide publishing we do now -- but if one of the steps 
is "please go download this jar & run it" i don't think it's the end of the 
world.

bq. The funny thing being that confluence itself uses the same library to 
generate the PDF in the first place 

I know ... I have a lot of "certi-tude" that someone in atlassian at some point 
between 3.x and 5.x found some "slow" code using PdfSmartCopy, changed it to 
use PdfConcatenate or PdfCopy and saw a big speed improvement and said "Yeah! I 
fixed it!"


> Investigate & reduce size of ref-guide PDF
> ------------------------------------------
>
>                 Key: SOLR-5819
>                 URL: https://issues.apache.org/jira/browse/SOLR-5819
>             Project: Solr
>          Issue Type: Improvement
>            Reporter: Hoss Man
>         Attachments: img-0007.png, img-0008.png, img-0009.png, img-0010.png, 
> img-0011.png, img-0012.png, img-0013.png, img-0014.png
>
>
> As noted on the solr-user mailing list in response to the ANNOUNCE about the 
> 4.7 ref guide, the size of the 4.4, 4.5 & 4.6 PDF files were all under 5MB, 
> but the 4.7 PDF was 30MB.
> opening this issue to track trying to reduce this



--
This message was sent by Atlassian JIRA
(v6.2#6252)

