On Wed, Jun 30, 2010 at 12:42 PM, Samuel J Klein <s...@wikimedia.org> wrote:
>> PGDP has a very strict and arduous workflow... The
>> result is quality, however only the text is sent downstream.
>
> Why not send images and text downstream?

Because PGDP produces for Project Gutenberg, which publishes text and
HTML versions, not scans.

> Perhaps we have competing interfaces / workflows. but I expect we
> would be glad to share 99.99%-verified high-quality
> texts-unified-with-images if it were easy for both projects to
> identify that combination of quality and comprehensive data... and
> would be glad to share metadata so that a WS editor could quickly
> check to see if there's a PGDP effort covering an edition of the text
> she is proofing; and vice-versa.

For the PGDP side, it's possible to check at PGDP itself (one needs a
login for that, but getting one is as free and unencumbered as on
Wikimedia). There is also a useful superset at
http://www.dprice48.freeserve.co.uk/GutIP.html (warning: this is a
7-megabyte HTML file). It lists, sorted by author, all books that have
a clearance for Project Gutenberg; books by more than one author are
listed once per author.

For cooperation, one idea could be to get the PGDP material either
after the P3 stage or after the F2 stage. As long as a project is
still active, it isn't hard at all to get both the text and the scan
pages.

-- 
André Engels, andreeng...@gmail.com

_______________________________________________
foundation-l mailing list
foundation-l@lists.wikimedia.org
Unsubscribe: https://lists.wikimedia.org/mailman/listinfo/foundation-l