On Thursday, 09.01.2025 at 08:07 +0100, Tilman Hausherr wrote:
> How about a simple password protection?

Is HTTPS access needed, or would SSH do?

> This would be helpful during regression tests.

We have to restrict access. It might still be necessary to delete some
of the content where it is not public or not within the intended
use. That is, grabbing a website and storing its content still poses
some risks.

BR
Maruan 


> Tilman 
> 
> 
> -- Original Message --
> From: Andreas Lehmkühler <andr...@lehmi.de.invalid>
> Subject: Re: Turning off public access to the regression corpora?
> Date: 09.01.2025, 07:53
> To: corpora-dev@tika.apache.org
> 
> Hi,
> 
> I agree with Maruan. :-(
> 
> Just out of curiosity, the original source of those files is some
> public web server, isn't it?
> 
> Andreas
> 
> On 09.01.25 at 05:27, Maruan Sahyoun wrote:
> > Hi,
> > 
> > This is unfortunate, but as it poses a risk of legal action against
> > the ASF and also against me as host of the site, I think we should
> > stop that.
> > 
> > BR
> > Maruan
> > 
> > > On 09.01.2025 at 02:37, Tim Allison <talli...@apache.org> wrote:
> > > 
> > > All,
> > > We've gotten a handful of takedown requests recently. I had initially
> > > envisioned public sharing of files as a key component of our server.
> > > We can still use the files and offer read access to fellow file
> > > researchers. I'm not sure I want to deal with further takedown requests.
> > > As an intermediate step, we could ask robots not to crawl the data, but
> > > that's not reliable.
> > > So, in lieu of that, with heavy heart, I ask if it is time to close off
> > > public access?
> > >   WDYT?
> > > 
> > >           Best,
> > > 
> > >                     Tim
> 
> 
