On Jan 1, 4:28 pm, Piet van Oostrum <[EMAIL PROTECTED]> wrote:
> >>>>> Shriphani <[EMAIL PROTECTED]> (S) wrote:
>
> >S> I tried pyPdf for this and decided to get the pagelinks. The trouble
> >S> is that I don't know how to determine whether a particular page is the
> >S> first page of a chapter. Can someone tell me how to do this ?
>
> AFAIK PDF doesn't have the concept of "Chapter". If the document has an
> outline, you could try to use the first level of that hierarchy as the
> chapter starting points. But you don't have a guarantee that they really
> are chapters.
> --
> Piet van Oostrum <[EMAIL PROTECTED]>
> URL: http://pietvanoostrum.com [PGP 8DAE142BE17999C4]
> Private email: [EMAIL PROTECTED]
How would a PDF-to-HTML conversion work? I've seen Google's search engine do it loads of times. My only worry is that running a 500-odd-page ebook through one of those scripts might not be such a good idea.
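For what it's worth, the outline suggestion quoted above could be sketched roughly like this. It assumes the outline arrives as a nested list in which a sub-list holds the children of the entry before it, and that each entry is dict-like with a "/Title" key; I believe that is the shape pyPdf-style readers return, but treat it as an assumption. The names top_level_entries, chapter_start_pages and page_number_of are made up for illustration, and as Piet says, there is no guarantee the first-level entries really are chapters.

```python
def top_level_entries(outline):
    """Keep only first-level outline entries.

    Nested lists are assumed to hold the children (sections, subsections)
    of the preceding entry, so they are skipped.
    """
    return [item for item in outline if not isinstance(item, list)]


def chapter_start_pages(outline, page_number_of):
    """Return (title, page) pairs for the first-level outline entries.

    page_number_of is a caller-supplied function that maps an outline
    entry to its page number; how that lookup works depends on the PDF
    library in use.
    """
    return [
        (entry["/Title"], page_number_of(entry))
        for entry in top_level_entries(outline)
    ]


# Hypothetical usage with the pypdf package (an assumption, not run here):
#   from pypdf import PdfReader
#   reader = PdfReader("book.pdf")
#   starts = chapter_start_pages(reader.outline,
#                                reader.get_destination_page_number)
```

If the document has no outline at all, the list simply comes back empty and you are back to guessing from the page contents.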