Great.  Thank you!

-----Original Message-----
From: Chris Mattmann [mailto:mattm...@apache.org] 
Sent: Friday, September 22, 2017 1:46 PM
To: dev@tika.apache.org
Subject: Re: TikaIO concerns

[dropping Beam on this]

Tim, another thing is that you can finally download the TREC-DD Polar data 
either from the NSF Arctic Data Center (70GB zip), or from Amazon S3, as 
described here:

http://github.com/chrismattmann/trec-dd-polar/ 

In case we want to use it as part of our regression testing.

Cheers,
Chris




On 9/22/17, 10:43 AM, "Allison, Timothy B." <talli...@mitre.org> wrote:

    >>1) We've gathered a TB of data from CommonCrawl and we run regression 
tests against this TB (thank you, Rackspace for hosting our vm!) to try to 
identify these problems.
    
    And if anyone with connections at a big company doing open source + cloud 
would be interested in floating us some storage and cycles,  we'd be happy to 
move off our single vm to increase coverage and improve the speed for our 
large-scale regression tests.  
    
    :D
    
    But seriously, thank you for this discussion and collaboration!
    
    Cheers,
    
             Tim
    
    

