Hi,

Browsing is now available at https://corpora.tika.apache.org/base/

Let me know what you think or if it doesn't work for you.
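
For anyone who would rather check from a script than a browser, here is a minimal sketch in Python. It assumes the index page is a plain Apache-autoindex-style HTML listing (sort links starting with "?", parent link as an absolute path or "../"); the helper names are made up for illustration and only the URL comes from this mail.

```python
from html.parser import HTMLParser
from urllib.request import urlopen

# URL taken from this mail; everything else here is illustrative.
BASE_URL = "https://corpora.tika.apache.org/base/"


class LinkCollector(HTMLParser):
    """Collect the href values of all <a> tags in an HTML page."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.hrefs.append(value)


def listing_entries(html):
    """Return the file and folder links of a directory index page.

    Assumption: column-sort links start with '?', and the parent
    directory link is an absolute path or '../'; both are skipped.
    """
    parser = LinkCollector()
    parser.feed(html)
    return [h for h in parser.hrefs if not h.startswith(("?", "/", "../"))]


def fetch_entries(url=BASE_URL, timeout=30):
    """Download one index page and return its entries (needs network access)."""
    with urlopen(url, timeout=timeout) as resp:
        charset = resp.headers.get_content_charset() or "utf-8"
        return listing_entries(resp.read().decode(charset, errors="replace"))
```

Calling fetch_entries() should return the names shown on the index page; an empty list or an HTTP error would suggest browsing is not working from your side.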

Proposed additions are to:

a) have a nice landing page on https://corpora.tika.apache.org/
b) make the file browsing a little nicer by tweaking the layout

BR
Maruan

> Added fontconfig and ttf-dejavu.  Let me know if you need anything else.
> 
> Cheers,
> 
>             Tim
> 
> On Thu, Jun 11, 2020 at 11:44 AM Dominik Stadler <dominik.stad...@gmx.at>
> wrote:
> 
> > Hi,
> > 
> > I tried a quick run to see how far my regression-test scripts get when
> > deployed to the new VM. It stopped at a point where it needs the
> > "fontconfig" package to be installed to avoid running into
> > https://github.com/AdoptOpenJDK/openjdk-build/issues/693
> > 
> > Can you add that to the installation?
> > 
> > Thanks... Dominik.
> > 
> > 
> > On Thu, Jun 11, 2020 at 12:38 AM Maruan Sahyoun <sahy...@fileaffairs.de>
> > wrote:
> > 
> > > Could be that it will be Monday before I get to it.
> > > 
> > > BR
> > > Maruan
> > > 
> > > > Gah...typo...sorry.
> > > > 
> > > > > > > > /usr/share/corpora/docs
> > > > > > > > /usr/share/corpora/metadata
> > > > 
> > > > That's what I did. :D  They should both be searchable.
> > > > 
> > > > On Wed, Jun 10, 2020 at 5:25 PM Maruan Sahyoun <sahy...@fileaffairs.de>
> > > > wrote:
> > > > 
> > > > > > For now, I moved /home/work to /home/.work and I bind mounted that to
> > > > > > /data1.  I soft linked /data1/docs and /data1/metadata to
> > > > > > /usr/share/corpora/docs and /usr/share/metadata.
> > > > > > 
> > > > > > I chgrp to collab and set permissions to 755.
> > > > > > 
> > > > > > I _think_ we're good?
> > > > > 
> > > > > And should browsing be possible for /usr/share/corpora/docs only? Or for
> > > > > /usr/share/metadata too?
> > > > > 
> > > > > BR
> > > > > Maruan
> > > > > > On Wed, Jun 10, 2020 at 2:44 PM Maruan Sahyoun <sahy...@fileaffairs.de>
> > > > > > wrote:
> > > > > > 
> > > > > > > > Separate question...
> > > > > > > > 
> > > > > > > > The 6TB drive is mounted to /home, which is why I initially put the
> > > > > > > > data there even though that is, um, non-traditional.
> > > > > > > > 
> > > > > > > > Should we move /home to / and mount the 6TB drive to, say, /data?
> > > > > > > > Then we could link the docs under /usr/share/corpora.
> > > > > > > 
> > > > > > > feel free to move to what best suits your needs.
> > > > > > > 
> > > > > > > BR
> > > > > > > Maruan
> > > > > > > 
> > > > > > > > On Wed, Jun 10, 2020 at 2:01 PM Tim Allison <talli...@apache.org>
> > > > > > > > wrote:
> > > > > > > > > Thank you, Maruan!
> > > > > > > > > 
> > > > > > > > > I'm moving the data over now.
> > > > > > > > > 
> > > > > > > > > We should add some other folders, e.g. metadata/.
> > > > > > > > > 
> > > > > > > > > Do we want
> > > > > > > > > 
> > > > > > > > > /usr/share/corpora/docs
> > > > > > > > > /usr/share/corpora/metadata
> > > > > > > > > 
> > > > > > > > > or
> > > > > > > > > 
> > > > > > > > > /usr/share/corpora
> > > > > > > > > /usr/share/metadata
> > > > > > > > > 
> > > > > > > > > 
> > > > > > > > > 
> > > > > > > > > On Wed, Jun 10, 2020 at 1:47 PM Maruan Sahyoun <sahy...@fileaffairs.de>
> > > > > > > > > wrote:
> > > > > > > > > 
> > > > > > > > > > Hi,
> > > > > > > > > > 
> > > > > > > > > > I've done the following steps
> > > > > > > > > > 
> > > > > > > > > > - upgraded Ubuntu to use the latest packages
> > > > > > > > > > - installed Apache HTTP Server
> > > > > > > > > > - created & installed certificate (from Let's Encrypt)
> > > > > > > > > > - set up a redirect so all http traffic is forwarded to https
> > > > > > > > > > - created a very basic landing page so that the default Ubuntu
> > > > > > > > > >   page is gone (very basic!)
> > > > > > > > > > - set up a cron job to handle the certificate renewal
> > > > > > > > > > 
> > > > > > > > > > Now we need to decide where to move the files. By default, Ubuntu
> > > > > > > > > > expects these below either /var/www or /usr/share. I'd go for
> > > > > > > > > > /usr/share/corpora. Please move these if you are happy with that.
> > > > > > > > > > After that I can enable the file browsing.
> > > > > > > > > > 
> > > > > > > > > > BR
> > > > > > > > > > Maruan
> > > > > > > > > > 
> > > > > > > > > > 
> > > > > > > > > > 
> > > > > > > > > > 
> > > > > --
> > > > > Maruan Sahyoun
> > > > > 
> > > > > FileAffairs GmbH
> > > > > Josef-Schappe-Straße 21
> > > > > 40882 Ratingen
> > > > > 
> > > > > Tel: +49 (2102) 89497 88
> > > > > Fax: +49 (2102) 89497 91
> > > > > sahy...@fileaffairs.de
> > > > > www.fileaffairs.de
> > > > > 
> > > > > Geschäftsführer: Maruan Sahyoun
> > > > > Handelsregister: AG Düsseldorf, HRB 53837
> > > > > UST.-ID: DE248275827
> > > > > 
> > > > > 