Um...Ok. If no one else is concerned... off we go?

-----Original Message-----
From: Julien Nioche [mailto:lists.digitalpeb...@gmail.com]
Sent: Monday, April 20, 2015 10:56 AM
To: dev@tika.apache.org
Subject: Re: [VOTE] Apache Tika 1.8 Release Candidate #2

and I haven't tested it with Nutch either...

On 20 April 2015 at 15:46, Julien Nioche <lists.digitalpeb...@gmail.com> wrote:

> I haven't tested the RC with Behemoth, it will probably have the same
> issue but I'll do like you and defer the update if that's the case.
>
> On 20 April 2015 at 15:23, Ken Krugler <kkrugler_li...@transpac.com>
> wrote:
>
>>
>> > From: Allison, Timothy B.
>> > Sent: April 20, 2015 5:11:04am PDT
>> > To: dev@tika.apache.org
>> > Subject: RE: [VOTE] Apache Tika 1.8 Release Candidate #2
>> >
>> > If I understand correctly, if we release rc2, Tika 1.8 will break in
>> > Hadoop clusters across the land?!
>> > Or, Hadoop folks will have to apply a classloading workaround or
>> > rebuild 1.8/trunk with small version mod in TIKA-1606 to get Tika to work.
>> >
>> > For most Hadoopites, this will be a straightforward fix, and I'm
>> > assuming that's why Ken is not more outspoken against releasing rc2 as is
>> > (Ken, let me know if I'm wrong!).
>>
>> Usually it's straightforward. Though whenever you start manipulating the
>> classloader logic, you can get odd results.
>>
>> E.g. by forcing your job jar's dependencies to show up first, now you can
>> have an issue where one of your jars masks an older/newer version that
>> Hadoop needs, so the job fails for some other reason.
>>
>> But yes, I don't feel strongly enough about this to vote -1, as I don't
>> think there are that many people using Tika with Hadoop.
>>
>> For Bixo, I'd defer updating the Tika dependency until another version is
>> released.
>>
>> Don't know about Behemoth - Julien?
>>
>> -- Ken
>>
>>
>> > For other users, though, say, in healthcare, where code security review
>> > is stringent, this could be a real pain, no?
>> >
>> > Am I understanding correctly what will happen? If so, do we really
>> > want to do this?
>> >
>> >
>> > -----Original Message-----
>> > From: Mattmann, Chris A (3980) [mailto:chris.a.mattm...@jpl.nasa.gov]
>> > Sent: Saturday, April 18, 2015 11:48 PM
>> > To: dev@tika.apache.org
>> > Subject: Re: [VOTE] Apache Tika 1.8 Release Candidate #2
>> >
>> > +1 to pushing on Monday - if we have to roll a 1.9 quickly
>> > after, we can :)
>> >
>> > ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
>> > Chris Mattmann, Ph.D.
>> > Chief Architect
>> > Instrument Software and Science Data Systems Section (398)
>> > NASA Jet Propulsion Laboratory Pasadena, CA 91109 USA
>> > Office: 168-519, Mailstop: 168-527
>> > Email: chris.a.mattm...@nasa.gov
>> > WWW: http://sunset.usc.edu/~mattmann/
>> > ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
>> > Adjunct Associate Professor, Computer Science Department
>> > University of Southern California, Los Angeles, CA 90089 USA
>> > ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
>> >
>> > -----Original Message-----
>> > From: Tyler Palsulich <tpalsul...@gmail.com>
>> > Reply-To: "dev@tika.apache.org" <dev@tika.apache.org>
>> > Date: Saturday, April 18, 2015 at 11:29 PM
>> > To: "dev@tika.apache.org" <dev@tika.apache.org>
>> > Subject: RE: [VOTE] Apache Tika 1.8 Release Candidate #2
>> >
>> >> Hi Folks,
>> >>
>> >> If there are no blocking complaints (OSGi?) by Monday (a little longer
>> >> than 3 days, I realize), I'll mark this as passed and finish the release
>> >> process.
>> >>
>> >> Of course, it's no problem for me to cut another RC, if it's needed.
>> >>
>> >> Have a great weekend!
>> >> Tyler
>> >>
>> >> I've run into one problem while testing Tika 1.8 with Bixo.
>> >>
>> >> It involves a dependency issue involving (of course) Guava, since that
>> >> project loves to break their API :(
>> >>
>> >> The bixo-core jar has these transitive dependencies on various versions of
>> >> Guava:
>> >>
>> >> Hadoop - 11.0.2
>> >> Cascading - 14.0.1
>> >> Tika-parsers - 10.0.1
>> >> cdm - 17.0
>> >>
>> >> Everyone winds up using version 10.0.1 (note that Tika has a dependency on
>> >> cdm, which wants to use 17.0)
>> >>
>> >> The problem is that Hadoop (for any recent version) uses an API from
>> >> Guava's cache implementation that no longer exists:
>> >>
>> >> com.google.common.cache.CacheBuilder.build(Lcom/google/common/cache/CacheLoader;)Lcom/google/common/cache/LoadingCache;
>> >> java.lang.NoSuchMethodError: com.google.common.cache.CacheBuilder.build(Lcom/google/common/cache/CacheLoader;)Lcom/google/common/cache/LoadingCache;
>> >>     at org.apache.hadoop.io.compress.CodecPool.createCache(CodecPool.java:62)
>> >>     at org.apache.hadoop.io.compress.CodecPool.<clinit>(CodecPool.java:74)
>> >>     at org.apache.hadoop.io.SequenceFile$Writer.close(SequenceFile.java:1272)
>> >>     at org.apache.hadoop.mapred.SequenceFileOutputFormat$1.close(SequenceFileOutputFormat.java:79)
>> >>
>> >> So what this means is that anyone trying to use Tika with Hadoop will need
>> >> to play games with the class loader to get the older version of Guava -
>> >> though that can cause other issues if Hadoop (or Cascading, etc) rely on
>> >> anything that's only in the newer Guava API.
>> >>
>> >> Guava 10.0.1 was released about 3.5 years ago; 11.0.2 was from about 3
>> >> years ago. So it seems like we should upgrade to at least 11.0.2.
>> >>
>> >> But I don't know if this is enough of an issue to require another RC.
>> >>
>> >> -- Ken
>> >>
>> >> PS - I've created https://issues.apache.org/jira/browse/TIKA-1606 to track
>> >> this.
>> >>
>> >>
>> >>> From: Tyler Palsulich
>> >>> Sent: April 13, 2015 10:56:29am PDT
>> >>> To: dev@tika.apache.org, u...@tika.apache.org
>> >>> Subject: [VOTE] Apache Tika 1.8 Release Candidate #2
>> >>>
>> >>> Hi Folks,
>> >>>
>> >>> A candidate for the Tika 1.8 release is available at:
>> >>> https://dist.apache.org/repos/dist/dev/tika/
>> >>>
>> >>> The release candidate is a zip archive of the sources in:
>> >>> http://svn.apache.org/repos/asf/tika/tags/1.8-rc2/
>> >>>
>> >>> The SHA1 checksum of the archive is
>> >>> 5e22fee9079370398472e59082d171ae2d7fdd31.
>> >>>
>> >>> In addition, a staged maven repository is available here:
>> >>> https://repository.apache.org/content/repositories/orgapachetika-1009
>> >>>
>> >>> Please vote on releasing this package as Apache Tika 1.8. The vote is
>> >>> open for the next 72 hours and passes if a majority of at least three +1
>> >>> Tika PMC votes are cast.
>> >>>
>> >>> [ ] +1 Release this package as Apache Tika 1.8
>> >>> [ ] ±0 I don't object to this release, but I haven't checked it
>> >>> [ ] -1 Do not release this package because...
>> >>>
>> >>> Thanks,
>> >>> Tyler
>> >>
>> >>
>> >> --------------------------
>> >> Ken Krugler
>> >> +1 530-210-6378
>> >> http://www.scaleunlimited.com
>> >> custom big data solutions & training
>> >> Hadoop, Cascading, Cassandra & Solr
>
> --
>
> Open Source Solutions for Text Engineering
>
> http://digitalpebble.blogspot.com/
> http://www.digitalpebble.com
> http://twitter.com/digitalpebble
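
For anyone who wants to check whether a given classpath is affected by the Guava clash tracked in TIKA-1606, here is a rough sketch. The NoSuchMethodError in Ken's trace is a linkage failure: Hadoop's CodecPool expects CacheBuilder.build(CacheLoader) to return LoadingCache, and the Guava that actually got loaded has no method with that exact descriptor. The class GuavaCheck and its describe method are hypothetical names made up for this illustration; they are not part of Tika, Hadoop, or Bixo. The code only uses reflection, so it compiles without Guava present.

```java
import java.lang.reflect.Method;
import java.security.CodeSource;

/** Hypothetical diagnostic helper; not part of Tika, Hadoop, or Bixo. */
final class GuavaCheck {

    /** Reports which Guava (if any) the given class loader resolves, and what build(CacheLoader) returns. */
    static String describe(ClassLoader loader) {
        try {
            // Load without initializing, so nothing in Guava actually runs.
            Class<?> builder = Class.forName("com.google.common.cache.CacheBuilder", false, loader);
            Class<?> cacheLoader = Class.forName("com.google.common.cache.CacheLoader", false, loader);

            // The code source tells us which jar won the classpath ordering.
            CodeSource source = builder.getProtectionDomain().getCodeSource();
            String where = (source == null) ? "unknown location" : String.valueOf(source.getLocation());

            // The stack trace in this thread shows Hadoop linking against a
            // build(CacheLoader) whose return type is LoadingCache.
            Method build = builder.getMethod("build", cacheLoader);
            String returns = build.getReturnType().getName();
            boolean matches = returns.equals("com.google.common.cache.LoadingCache");

            return "CacheBuilder loaded from " + where
                    + "; build(CacheLoader) returns " + returns
                    + (matches ? " (matches the descriptor in the trace)"
                               : " (does NOT match the descriptor in the trace)");
        } catch (ClassNotFoundException e) {
            return "Guava cache classes not found: " + e.getMessage();
        } catch (NoSuchMethodException e) {
            return "CacheBuilder has no build(CacheLoader) method at all";
        }
    }

    public static void main(String[] args) {
        System.out.println(describe(GuavaCheck.class.getClassLoader()));
    }
}
```

To be meaningful this has to run with the same classpath ordering as the failing job (for example from inside the job itself), since the whole problem is which Guava jar gets picked first.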