[XeTeX] Python Project: PDF Optimization

Rob Oakes Fri, 04 Jun 2010 13:41:06 -0700

Dear XeTeX Mailing List,

By way of a quick introduction, my name is Rob Oakes.  I am mostly a lurker 
here, but I greatly appreciate the many individuals who patiently answer 
questions.  I've learned a great deal from many of you.


With that said, I am writing regarding a project idea that I am interested in 
pursuing.  Though it is only tangentially related to xetex, I thought this 
might be an appropriate forum to present it and solicit feedback/collaborators.

A couple of weeks ago, I was putting together an article about different 
utilities available for working with PDF documents on Linux 
(http://blog.oak-tree.us/index.php/2010/05/26/pdf-linux).  While doing so, I 
looked high and low for something that would make it easy to optimize a PDF for 
web distribution.  (Specifically, I wanted a tool to downsample images, convert 
between different color spaces, and streamline the PDF for web viewing.)

I came up (mostly) empty handed.

PDF optimization seems to be a major gap in the set of PDF-related GUI tools 
on Linux, and it is one of the holes and annoyances (referred to as "paper 
cuts" within the Ubuntu project) that I'm particularly sensitive to.  (I've 
been working on a book that claims open source tools are superior to 
proprietary ones for writing, and in the process, my eyes have been opened to 
all kinds of shortcomings.)  I've noticed that it is a complaint of others as 
well.  (The issue of PDF optimization, which Acrobat provides through its "PDF 
Optimizer" feature, seems to come up regularly on several of the mailing lists 
that I am a member of.)  So, while researching my article, I looked into what 
it would require to put together such a tool.

As it turns out, slapping together a functional (and useful) prototype probably 
wouldn't be too hard.

The GUI and framework already exist in the form of PDF-Shuffler 
(http://sourceforge.net/projects/pdfshuffler/), which is written in Python and 
relies on python-pdf and pyGTK.  The image manipulation and conversions could 
be done using any one of four or five image processing frameworks for Python.  
The only missing pieces appear to be backend code that can integrate the two 
and some GUI code to provide users with options.
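
To give a rough idea of the kind of backend logic involved, here is a minimal 
sketch of the downsampling decision: given an image's pixel dimensions and the 
size at which it is printed on the page, work out the pixel dimensions needed 
to hit a target resolution.  The names (PlacedImage, downsampled_size) and the 
150 dpi default are my own assumptions for illustration; none of this comes 
from PDF-Shuffler or any existing library.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PlacedImage:
    """An image as placed on a PDF page (hypothetical structure)."""
    width_px: int     # stored pixel width
    height_px: int    # stored pixel height
    width_in: float   # printed width on the page, in inches
    height_in: float  # printed height on the page, in inches


def downsampled_size(img: PlacedImage, target_dpi: int = 150) -> tuple[int, int]:
    """Return the pixel size needed to show img at target_dpi.

    The result never exceeds the stored size, so images that are
    already at or below the target resolution are left unchanged.
    """
    if target_dpi <= 0:
        raise ValueError("target_dpi must be positive")
    # Pixels needed on each axis = printed inches * dots per inch.
    wanted_w = max(1, round(img.width_in * target_dpi))
    wanted_h = max(1, round(img.height_in * target_dpi))
    # Only ever shrink; upsampling would add bytes without adding detail.
    return min(img.width_px, wanted_w), min(img.height_px, wanted_h)


if __name__ == "__main__":
    photo = PlacedImage(width_px=3000, height_px=2400, width_in=10.0, height_in=8.0)
    print(downsampled_size(photo, target_dpi=150))
```

The resulting size would then be handed to whichever image framework does the 
actual resampling; the color space conversion would be a separate step.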

I am writing to see if there are any students, budding Python programmers, or 
others who might be interested in collaborating on this.  I've already created 
a GUI layout and a pretty detailed spec that could serve as a starting point.  
Unfortunately, between work, a book I'm writing, and an outliner extension for 
LyX that would give it some Scrivener-like organization abilities, I can't 
take on primary responsibility for yet another project (though I would be 
happy to contribute both code and experience).  From what I've put together so 
far, I estimate that it would take about 25 to 30 hours of programming time to 
implement these features as an extension to PDF-Shuffler.

Such a program would plug a *really* big hole in the world of Linux-based 
writing and publishing and would be an enormous aid to many people.  It would 
also be a great project for people who wish to learn more about document 
manipulation or image processing.

Is anyone interested in helping to tackle this project?

Cheers,

Rob Oakes

PS, as per the GPL, any code contributions would be sent upstream to the 
maintainer of PDF-Shuffler.


--------------------------------------------------
Subscriptions, Archive, and List information, etc.:
  http://tug.org/mailman/listinfo/xetex
