Hi everyone,

I'm working on exporting LyX documents to EPUB as part of my Google Summer of Code project, and I'd like to invite you to try out my current implementation, which can be found in the "epub/master" branch of the gsoc repository (g...@git.lyx.org:gsoc.git). The export process begins by exporting the document to XHTML via LyXHTML, then converting the XHTML to EPUB with the scripts in lib/scripts/epub.

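For anyone unfamiliar with the EPUB container, here is a rough sketch of what the final packaging step amounts to. This is an illustration only, not the code in lib/scripts/epub: the function name build_epub and the file names OEBPS/content.opf and OEBPS/document.xhtml are assumptions made for the example. The one firm rule it shows is that the "mimetype" entry has to be the first file in the zip archive and has to be stored uncompressed.

```python
import io
import zipfile

# Points the reading system at the package file (path is an assumption here).
CONTAINER_XML = """<?xml version="1.0" encoding="UTF-8"?>
<container version="1.0" xmlns="urn:oasis:names:tc:opendocument:xmlns:container">
  <rootfiles>
    <rootfile full-path="OEBPS/content.opf" media-type="application/oebps-package+xml"/>
  </rootfiles>
</container>
"""


def build_epub(out, xhtml, opf, ncx):
    """Write a minimal EPUB 2 archive to `out` (a path or a file-like object).

    `xhtml`, `opf` and `ncx` are the already-generated file contents as strings.
    """
    with zipfile.ZipFile(out, "w") as zf:
        # Must be the first entry and must not be compressed.
        zf.writestr("mimetype", "application/epub+zip",
                    compress_type=zipfile.ZIP_STORED)
        # Everything else may be deflated.
        deflate = zipfile.ZIP_DEFLATED
        zf.writestr("META-INF/container.xml", CONTAINER_XML, compress_type=deflate)
        zf.writestr("OEBPS/content.opf", opf, compress_type=deflate)
        zf.writestr("OEBPS/toc.ncx", ncx, compress_type=deflate)
        zf.writestr("OEBPS/document.xhtml", xhtml, compress_type=deflate)


if __name__ == "__main__":
    buf = io.BytesIO()
    build_epub(buf, "<html/>", "<package/>", "<ncx/>")
    print(zipfile.ZipFile(io.BytesIO(buf.getvalue())).namelist())
```

Images and stylesheets would be added the same way as the XHTML file; they are left out to keep the sketch short.
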
Right now, documents will successfully export to EPUB 2.0.1, with the following caveats:

- Almost all metadata fields (author, book id, etc.) are filled in with default values. Only the title field is taken from the XHTML file from which the EPUB is converted.
- No intra-document navigation is implemented; the document is just one long page.
- MathML isn't part of the EPUB 2.0.1 standard, so the document output settings should be set to output math as images.

What I'd like to implement soon:

- Extracting other metadata fields from the document. The required fields are language, title, and identifier. The title field is taken from the document, but not the language or the identifier. I'm taking the title from the first paragraph to use the "title" inset, but there aren't corresponding insets for the other elements, so I'm not sure of the best way or ways to get the rest of the info. (There's an inset for author, but the author name is needed in both reading order and "file-as" order, and there's only one author inset.) One thought is to create custom insets, and another is to ask for the information via the document settings.
- Intra-document navigation. In order to skip around within the document, add bookmarks, etc., navigation information needs to be added to the toc.ncx file within the EPUB archive. Which locations in the document should be added to the list of navigable points is not obvious. First, I read (here at http://www.gbenthien.net/Kindle%20and%20EPUB/ncx.html) that some e-readers only work with at most one depth level--only parts, or only chapters, or only sections, or whatever. I'm not sure whether this is correct or not. Either way, we can't always assume what depth the user wants in the table of contents--this is probably something we should ask. It's probably easiest to pull the navigation info straight from the document's table of contents, but I don't know if this info is available in the exported XHTML file without appearing visibly.

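To make the navigation idea a bit more concrete, here is a minimal sketch of one possible approach: scan the exported XHTML for headings, keep only those up to a user-chosen depth, and emit the corresponding navMap for toc.ncx. It assumes the headings appear as h1 through h6 elements carrying id attributes, which I have not verified against the LyXHTML output; the names HeadingCollector and nav_map and the file name document.xhtml are made up for the example. The navPoints are written as a flat list rather than nested, which sidesteps the depth question for readers that only handle one level.

```python
from html.parser import HTMLParser
from xml.sax.saxutils import escape, quoteattr


class HeadingCollector(HTMLParser):
    """Collect (level, id, text) for h1..h6 elements that carry an id.

    Assumption: headings in the exported XHTML are h1..h6 with id attributes.
    """

    def __init__(self, max_level):
        super().__init__()
        self.max_level = max_level   # deepest heading level to include
        self.headings = []
        self._current = None         # [level, id, text parts] while in a heading

    def handle_starttag(self, tag, attrs):
        if len(tag) == 2 and tag[0] == "h" and tag[1] in "123456":
            anchor = dict(attrs).get("id")
            if anchor and int(tag[1]) <= self.max_level:
                self._current = [int(tag[1]), anchor, []]

    def handle_data(self, data):
        if self._current is not None:
            self._current[2].append(data)

    def handle_endtag(self, tag):
        if self._current is not None and tag == "h%d" % self._current[0]:
            level, anchor, parts = self._current
            text = " ".join("".join(parts).split())  # collapse whitespace
            self.headings.append((level, anchor, text))
            self._current = None


def nav_map(headings, href):
    """Return a flat NCX <navMap> with one navPoint per collected heading."""
    lines = ["<navMap>"]
    for order, (_level, anchor, text) in enumerate(headings, start=1):
        lines.append('  <navPoint id="nav%d" playOrder="%d">' % (order, order))
        lines.append("    <navLabel><text>%s</text></navLabel>" % escape(text))
        lines.append("    <content src=%s/>" % quoteattr("%s#%s" % (href, anchor)))
        lines.append("  </navPoint>")
    lines.append("</navMap>")
    return "\n".join(lines)


if __name__ == "__main__":
    collector = HeadingCollector(max_level=2)
    collector.feed('<h1 id="c1">Chapter 1</h1><h2 id="s1">Section 1.1</h2>')
    print(nav_map(collector.headings, "document.xhtml"))
```

The max_level argument is where a user-selected table-of-contents depth would plug in, whether it comes from the document settings or from somewhere else.
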
What I'd like to implement at some point:

- optional conversion of images to SVG format
  Note: Vector-based graphics scale better than raster-based graphics, making them well-suited for electronic media.
  Note: EPUB specifications require compliant e-readers to support SVG.
  Note: Older versions of some browsers (primarily IE) don't support SVG.
  Note: Preliminary searches turn up a package named dvisvgm (http://www.ctan.org/pkg/dvisvgm) that converts DVI to SVG, and it's licensed under the GPL v3 or later.
- ability to split large XHTML files into smaller ones
  Note: Splitting large XHTML files should boost the performance of the converted EPUB documents.
- allow selection of an image for front cover artwork
  Note: Amazon requires JPEG or TIFF format for front cover artwork.

I'd love to hear any thoughts, comments, and suggestions you all have, especially if you encounter any bugs or see something important I'm overlooking.

Thanks,
Josh