http://ProjectMadurai.com has tons of old Tamil literature as HTML and A4 PDF files, released under a public domain license.
There are many ebook reading devices and tablets that don't support Tamil. To read Tamil content on these devices, we can create 6-inch PDF files. Let us see how to convert all the Project Madurai ebooks into 6-inch PDF files using the utilities available in GNU/Linux.

1. Get the filenames.

http://www.projectmadurai.org/pmworks.html

This page lists all the ebooks. Copy the page content and paste it into a LibreOffice spreadsheet. Copy the column named "unicode" and save only the filenames as a separate text file.

    cat pm.txt

    pmuni0001.html
    pmuni0002.html
    ...

2. Download these files using wget and a Python script.

    cat dl-wget.py

    import os

    # pm.txt holds one file name per line
    with open("pm.txt") as books:
        for filename in books:
            filename = filename.strip()
            if not filename:
                continue  # skip blank lines
            print("Downloading " + filename)
            bookurl = "http://www.projectmadurai.org/pm_etexts/utf8/" + filename
            # fetch the page plus its images, staying on www.projectmadurai.org
            command = "wget -E -H -k -K -p --max-redirect 0 --domains www.projectmadurai.org -e robots=off " + bookurl
            os.system(command)

Running the following command

    python dl-wget.py

will download all the HTML files, with the relevant images, under the current folder. There are 593 HTML files downloaded.

3. Convert to PDF.

The utility wkhtmltopdf converts any given HTML file to a PDF file. To convert to a 6-inch PDF, the following command helps.

    wkhtmltopdf -s A6 --minimum-font-size 40 -B 5 -L 5 -R 5 -T 5 source.html destination.pdf

Now, let us convert all the downloaded HTML files to 6-inch PDF files using a small shell script.

    for i in *.html; do
        orig=$(basename "$i" .html)
        echo "Converting $orig"
        wkhtmltopdf -s A6 --minimum-font-size 40 -B 5 -L 5 -R 5 -T 5 "$i" "$orig-6-inch.pdf"
    done

Running this converts all 593 HTML files into 6-inch PDF files. Now we can read these 6-inch PDF files on a Kindle, an Android mobile, or a tablet.

4. Upload the 6-inch PDF files.

I have uploaded all the 6-inch PDF files here.

http://bit.ly/project-madurai-kindle-books

To find the real name of each book, compare its file name against the list here:
http://www.projectmadurai.org/pmworks.html

Download your favorite book and start reading.

--
Regards,
T.Shrinivasan

My Life with GNU/Linux : http://goinggnu.wordpress.com
Free E-Magazine on Free Open Source Software in Tamil : http://kaniyam.com
Get CollabNet Subversion Edge : http://www.collab.net/svnedge
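
P.S. If you prefer Python over the shell loop for the conversion in step 3, here is a rough sketch of the same idea. It only wraps the wkhtmltopdf options given above. The script name convert-6inch.py and the helper names pdf_name, build_command and convert_all are made up for illustration. Skipping files that already have a PDF is my own addition, so that an interrupted run can be restarted. I am assuming wkhtmltopdf is on the PATH.

```python
import subprocess
import sys
from pathlib import Path

# Options copied from the wkhtmltopdf command in step 3.
WK_OPTIONS = ["-s", "A6", "--minimum-font-size", "40",
              "-B", "5", "-L", "5", "-R", "5", "-T", "5"]


def pdf_name(html_file):
    """Map pmuni0001.html to pmuni0001-6-inch.pdf in the same folder."""
    html_file = Path(html_file)
    return html_file.with_name(html_file.stem + "-6-inch.pdf")


def build_command(html_file):
    """Return the wkhtmltopdf argument list for one HTML file."""
    return ["wkhtmltopdf", *WK_OPTIONS, str(html_file), str(pdf_name(html_file))]


def convert_all(folder):
    """Convert every .html file in folder, skipping ones already converted."""
    for html_file in sorted(Path(folder).glob("*.html")):
        if pdf_name(html_file).exists():
            print("Skipping", html_file.name)
            continue
        print("Converting", html_file.name)
        # check=False: one failing file should not stop the whole batch
        subprocess.run(build_command(html_file), check=False)


if __name__ == "__main__":
    # Hypothetical usage: python convert-6inch.py <folder> [<folder> ...]
    for folder in sys.argv[1:]:
        convert_all(folder)
```

Passing the arguments as a list avoids shell quoting problems with unusual file names. Run with no folder argument, the script does nothing.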