All,

I received a request to package some of the bugtracker data for easier downloading. For this request, I've zipped the PDFs and FDFs from the bugtrackers and made those zips available here: https://corpora.tika.apache.org/base/packaged/pdfs/
I don't think we'll be inundated with one-off requests, and I don't think we should be zipping large chunks of govdocs1 or commoncrawl. Are there any objections? Is there a better way to package data and/or make it available/browsable/navigable/retrievable?

Cheers,
Tim