Nice overview, would be interesting to watch, especially the slides about the "old days"!
Ad (9): there were a few other bugs that popped up regularly (Unicode handling, duplicate classes), which forced us to unattic it.

Ad (7): results for the POI regression tests are at http://people.apache.org/~centic/poi_regression/reports/ if you want to add a link.

Dominik

On Mon, Oct 15, 2018, 03:50 Dave Fisher <[email protected]> wrote:

> Hi -
>
> I’ve come up with the plan for my POI talk next weekend. I need to finalize
> my slides tomorrow so that some Chinese translation can be done. I have
> some questions that I’ll mark as “—>”. If you can answer you’ll save me
> some research.
>
> I plan to tell the story of POI, including Tika interactions and Common
> Crawler. In the end I want to give people two places to contribute along
> with motivation.
>
> (1) Title
>         Name of presentation
>         About Dave
> (2) POI
>         When it started in Jakarta, the simple use case.
>         End of Jakarta
> (3) OOXML and the Microsoft Open Specification Promise
>         The OSP
>         The flame war
>         OpenXML4J -
>         http://incubator.apache.org/ip-clearance/openxml4j.html
>         XSSF, XSLF, and SS
> (4) Tika and OOXML lite
>         Apachecon Oakland 2009 - Jukka asked Nick, Yegor and me during
>         BarCamp if we could do something about the 13MB ooxml jar. Yegor
>         came up with a solution in a day.
>         Unit Test and your Beans are included
>         —> Anyone: anything to add? XMLBeans impacts?
> (5) Graphics2D
>         Discuss output techniques developed.
>         —> Yegor - is there some sample code you might share?
> (6) Tika Text Extraction
>         —> Could use pointers to the basic tutorial.
> (7) Common Crawler - 1TB of samples
>         Common Crawler - commoncrawl.org
>         Common Crawler Download - centic9
>         Regression sets for POI, Tika and PDFBox
>         —> Are there other Apache projects that use these documents?
> (8) The POI Toolbox
>         A table of the various formats with input, output, and remarks.
> (9) XMLBeans 3
>         Bringing the product out of the attic.
>         —> Any reasons besides better control of Entity Expansion attacks?
> (10) Contributing to POI and Tika Will Improve Your Solr Search Results
>         How Solr and similar architectures depend on Tika and Tika depends
>         on POI
>         Example is Headers and Footers choices on Word documents on the
>         Tika List this past week.
>
> Thanks for your help and feedback!
>
> Regards,
> Dave
