On 09.01.2025 08:18, sahy...@fileaffairs.de wrote:
On Thursday, 09.01.2025 at 08:07 +0100, Tilman Hausherr wrote:
How about a simple password protection?
is https access needed or would ssh do?
We already have ssh access. Yes, http would be nice.
Tilman
This would be helpful during regression tests.
We have to restrict access. It might still be necessary to delete some
of the content where it is not public content or not within the intended
use. I.e. grabbing a website and storing its content still poses some
risks.
BR
Maruan
Tilman
-- Original Message --
From: Andreas Lehmkühler <andr...@lehmi.de.invalid>
Subject: Re: Turning off public access to the regression corpora?
Date: 09.01.2025, 07:53
To: corpora-dev@tika.apache.org
Hi,
I agree with Maruan. :-(
Just out of curiosity, the original source of those files is some
public webserver, isn't it?
Andreas
On 09.01.25 at 05:27, Maruan Sahyoun wrote:
Hi,
this is unfortunate, but since it poses a risk of legal action against
the ASF and also against me as the host of the site, I think we should
stop that.
BR
Maruan
On 09.01.2025 at 02:37, Tim Allison <talli...@apache.org> wrote:
All,
We've gotten a handful of takedown requests recently. I had initially
envisioned public sharing of files as a key component of our server. We
can still use the files and offer read access to fellow file
researchers. I'm not sure I want to deal with further takedown requests.
As an intermediate step, we could ask robots not to crawl the data, but
that's not reliable.
So, in lieu of that, with a heavy heart, I ask if it is time to close
off public access?
WDYT?
Best,
Tim