The files are mostly the text versions; pages without complete text versions are scraped from the HTML instead. Each day the files should be updated and a new revision committed to RCS. If you want to use diff, I recommend diff -b, which ignores changes in the amount of whitespace. The script is there and is very ugly. Feel free to mirror, and please suggest improvements.
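
For anyone comparing two revisions without diff at hand, here is a rough Python sketch of the idea behind diff -b. It is not the script mentioned above, and the function names (normalize, diff_ignoring_whitespace) are made up for illustration. It only approximates diff -b by collapsing runs of spaces and tabs and dropping trailing whitespace before comparing.

```python
import difflib
import re


def normalize(line):
    # Approximation of diff -b: drop trailing whitespace and
    # collapse each run of spaces/tabs into a single space.
    return re.sub(r"[ \t]+", " ", line.rstrip())


def diff_ignoring_whitespace(old, new):
    # Hypothetical helper: returns unified-diff lines for two text
    # blobs, comparing the whitespace-normalized lines.
    old_lines = [normalize(l) for l in old.splitlines()]
    new_lines = [normalize(l) for l in new.splitlines()]
    return list(
        difflib.unified_diff(
            old_lines, new_lines, fromfile="old", tofile="new", lineterm=""
        )
    )


if __name__ == "__main__":
    # Only the changed word shows up; the spacing changes do not.
    for line in diff_ignoring_whitespace("a  b\nc \n", "a b\nd\n"):
        print(line)
```

Note that the diff shows the normalized lines, not the original ones, so it is only good for spotting real changes.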
-- -c.