Let me know if these are of any use...
https://github.com/centic9/CommonCrawlDocumentDownload
http://openpreservation.org/blog/2016/10/04/apache-tikas-regression-corpus-tika-1302/
https://events.static.linuxfound.org/sites/events/files/slides/ApacheConMiami2017_tallison_v2.pdf
https://wiki.apac
Hi Nick,
Sit at BarCamp 2 Monday morning or do a BOF later?
Would someone point me to the Common Crawl information?
Regards,
Dave
> On Sep 17, 2018, at 8:07 AM, Nick Burch wrote:
>
>> On Sat, 15 Sep 2018, Dave Fisher wrote:
>> I’ll be at Apachecon Montreal, anyone else?
> On Sat, 15 Sep 2018, Dave Fisher wrote:
> I’ll be at Apachecon Montreal, anyone else?
I'll be there! Happy to look at draft slides then, and offer advice :)
Nick
Hi -
These are all great ideas. Thanks!
I’ll be at Apachecon Montreal, anyone else?
Regards,
Dave
> On Sep 15, 2018, at 2:12 PM, Tim Allison wrote:
>
> Looks great! If at all possible, I’d appreciate a bullet or two on
> Dominik’s and my large scale regression tests... Mo
Looks great! If at all possible, I’d appreciate a bullet or two on
Dominik’s and my large scale regression tests... More input on test files
for the corpus would be useful. Completely understand if this is off topic.
Thank you!
On Fri, Sep 14, 2018 at 5:27 PM Dave Fisher wrote:
> Hi Team,
>
> I’ve
Hi Dave,
thank you for spreading the word - I've already noticed that the latest
release was noticed in China [1] :)
If I understand you correctly, you want to present some kind of history list
and
I guess the talk won't have so much interaction with the audience, right?
In case it would be int
Hi Team,
I’ve been invited to speak at the COSCON in Shenzhen, China on October 20-21.
I’ll have two talks. One is about the Incubator, but one was my choice and I
chose POI due to a few interesting things happening here plus our long history
as a small project that show some important points a