On 12/13/06, Robert Landrum <[EMAIL PROTECTED]> wrote:
> Alex Beamish wrote:
> > I'll deal with multiple documents with some combination of stale timers
> > and LRU slots, but that's not really what I see as the most complicated
> > or difficult part of this problem. For this particular application, my
> > inactivity timer will probably be 10-15 minutes, and I'll expect to have
> > 6-8 documents open at any given time, so it shouldn't be a big drain on
> > memory. And I will probably be able to set something up that signals
> > that a document has been expired as well .. (this is just me thinking
> > out loud) ..
> >
> > Thanks for your feedback .. I think named pipes is my next focus.
>
> Sockets will be the way to go on this, rather than pipes. I don't want
> to say pipes are a dead technology, but by using sockets, you can move
> your gs application off your web servers (if load ever gets that high)
> without having to rewrite any code, something that isn't possible with
> a pipe (not without netcat, anyway).
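
To make the stale-timer and LRU-slot idea quoted above a bit more concrete, here is a rough sketch in Python (the same shape would carry over to Perl). Everything in it is an assumption for illustration: the `SessionCache` name, the `factory` and `on_expire` callbacks, and the defaults of 8 slots and 15 minutes, which just mirror the figures mentioned in the thread. It is not a description of any existing code.

```python
import time
from collections import OrderedDict


class SessionCache:
    """Hypothetical cache of open document sessions.

    Keeps at most max_slots sessions and drops any session that has been
    idle for longer than idle_seconds. The least recently used session is
    evicted first when a new one needs a slot.
    """

    def __init__(self, max_slots=8, idle_seconds=15 * 60,
                 clock=time.monotonic, on_expire=None):
        self.max_slots = max_slots
        self.idle_seconds = idle_seconds
        self.clock = clock
        # Called as on_expire(key, session) so the owner can shut it down.
        self.on_expire = on_expire or (lambda key, session: None)
        # key -> (session, last_used); least recently used entry comes first.
        self._slots = OrderedDict()

    def __len__(self):
        return len(self._slots)

    def _drop(self, key):
        session, _ = self._slots.pop(key)
        self.on_expire(key, session)

    def expire_stale(self):
        """Drop every session whose idle time exceeds the limit."""
        now = self.clock()
        stale = [key for key, (_, used) in self._slots.items()
                 if now - used > self.idle_seconds]
        for key in stale:
            self._drop(key)

    def get(self, key, factory):
        """Return the session for key, creating it with factory(key) if needed."""
        self.expire_stale()
        if key in self._slots:
            session, _ = self._slots.pop(key)
        else:
            # Make room by evicting the least recently used session(s).
            while len(self._slots) >= self.max_slots:
                self._drop(next(iter(self._slots)))
            session = factory(key)
        # Re-inserting moves the entry to the most recently used end.
        self._slots[key] = (session, self.clock())
        return session
```

The `on_expire` hook is where the "signal that a document has been expired" would go, for example by closing the process or socket behind that session.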

I'll think about sockets, thanks.

> There's probably a good reason for it, but why not just pre-generate all
> of your page images? Even when new documents are added, a cron job could
> be set up to come along (once a minute even) and convert those PDFs to
> images. Are these PDFs dynamically generated?

(The drawbacks of providing too little information .. or too much.)

The current solution is that we generate high-resolution page images for the documents in question, and then resize them on the fly using mod_perl. This solution works well locally, but the problem is that we also have a satellite office, and the VPN between the offices is not that great -- we get between 50K and 100K in bandwidth. When we're dealing with tens of thousands of pages, that's a fair bit of data to pass back and forth and to store. I currently have a daemon process that takes care of asynchronously rsyncing the page images to the satellite office.

A better solution would be to generate the page images as needed, by using mod_perl again, but this time running Ghostscript on the document. The problem is that Ghostscript is an interactive program, so we need two-way communication with it.

I already have technology that very efficiently spawns programs using a double fork and directs stdout and stderr into files (I use that to do the document rsyncing), but that's no good for this application -- I need to start a Ghostscript session for a document, and keep it running so that as each page image request comes in, I can just forward it to the appropriate session. And I don't want to have to start up Ghostscript fresh for each session -- I can afford some memory overhead as long as I can make a request and get a response from a live Ghostscript session.

Thanks to the discussion that my original post has engendered, I think the answer may lie in using pseudo-ttys, and I'm going to re-visit some of my original research to see if that's the case.

I didn't want to describe the existing system in too much detail in case all the detail was unnecessary and irrelevant. So now you have all of the detail after all -- I hope it answers your questions.

-- 
Alex Beamish
Toronto, Ontario
aka talexb
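
For anyone following the thread, the "keep a session running and forward each request to it" idea can be sketched roughly as below, in Python rather than Perl. This is only an illustration under stated assumptions: the `Session` class and `demo` function are made-up names, and the child process is a tiny line-echo stand-in, not Ghostscript, so nothing here models Ghostscript's actual prompts or commands. The stand-in flushes after every reply; a program that buffers its output when it is not attached to a terminal would stall this kind of pipe exchange, which is the usual reason people reach for pseudo-ttys.

```python
import subprocess
import sys

# Stand-in for the long-running interpreter (an assumption, not Ghostscript):
# it reads one request per line and answers with one line, flushing each time.
CHILD = (
    "import sys\n"
    "for line in sys.stdin:\n"
    "    print('rendered ' + line.strip(), flush=True)\n"
)


class Session:
    """Hypothetical wrapper around one long-lived child process."""

    def __init__(self, argv):
        # Two pipes give us the two-way channel: we write requests to the
        # child's stdin and read replies from its stdout.
        self.proc = subprocess.Popen(
            argv,
            stdin=subprocess.PIPE,
            stdout=subprocess.PIPE,
            text=True,
            bufsize=1,  # line-buffered on our side
        )

    def request(self, line):
        """Send one request line and block until one reply line arrives."""
        self.proc.stdin.write(line + "\n")
        self.proc.stdin.flush()
        return self.proc.stdout.readline().rstrip("\n")

    def close(self):
        """Close the child's stdin so it sees EOF, then reap it."""
        self.proc.stdin.close()
        status = self.proc.wait()
        self.proc.stdout.close()
        return status


def demo():
    # The same child answers both requests; it is started only once.
    session = Session([sys.executable, "-u", "-c", CHILD])
    try:
        return [session.request("page 1"), session.request("page 7")]
    finally:
        session.close()
```

Calling `demo()` sends two requests to one child and collects both replies. The one-line-in, one-line-out framing is the simplifying assumption doing most of the work; a real interactive program needs some reliable way to tell where each reply ends.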