and I haven't tested it with Nutch either...

On 20 April 2015 at 15:46, Julien Nioche <lists.digitalpeb...@gmail.com>
wrote:

> I haven't tested the RC with Behemoth, it will probably have the same
> issue but I'll do like you and defer the update if that's the case.
>
> On 20 April 2015 at 15:23, Ken Krugler <kkrugler_li...@transpac.com>
> wrote:
>
>>
>> > From: Allison, Timothy B.
>> > Sent: April 20, 2015 5:11:04am PDT
>> > To: dev@tika.apache.org
>> > Subject: RE: [VOTE] Apache Tika 1.8 Release Candidate #2
>> >
>> > If I understand correctly, if we release rc2, Tika 1.8 will break in
>> > Hadoop clusters across the land?!
>> > Or, Hadoop folks will have to apply a classloading workaround or
>> > rebuild 1.8/trunk with small version mod in TIKA-1606 to get Tika to work.
>> >
>> > For most Hadoopites, this will be a straightforward fix, and I'm
>> > assuming that's why Ken is not more outspoken against releasing rc2 as is
>> > (Ken, let me know if I'm wrong!).
>>
>> Usually it's straightforward. Though whenever you start manipulating the
>> classloader logic, you can get odd results.
>>
>> E.g. by forcing your job jar's dependencies to show up first, now you can
>> have an issue where one of your jars masks an older/newer version that
>> Hadoop needs, so the job fails for some other reason.
>>
>> But yes, I don't feel strongly enough about this to vote -1, as I don't
>> think there are that many people using Tika with Hadoop.
>>
>> For Bixo, I'd defer updating the Tika dependency until another version is
>> released.
>>
>> Don't know about Behemoth - Julien?
>>
>> -- Ken
>>
>>
>> > For other users, though, say, in healthcare, where code security review
>> > is stringent, this could be a real pain, no?
>> >
>> > Am I understanding correctly what will happen?  If so, do we really
>> > want to do this?
>> >
>> >
>> > -----Original Message-----
>> > From: Mattmann, Chris A (3980) [mailto:chris.a.mattm...@jpl.nasa.gov]
>> > Sent: Saturday, April 18, 2015 11:48 PM
>> > To: dev@tika.apache.org
>> > Subject: Re: [VOTE] Apache Tika 1.8 Release Candidate #2
>> >
>> > +1 to pushing on Monday - if we have to roll a 1.9 quickly
>> > after, we can :)
>> >
>> > ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
>> > Chris Mattmann, Ph.D.
>> > Chief Architect
>> > Instrument Software and Science Data Systems Section (398)
>> > NASA Jet Propulsion Laboratory Pasadena, CA 91109 USA
>> > Office: 168-519, Mailstop: 168-527
>> > Email: chris.a.mattm...@nasa.gov
>> > WWW:  http://sunset.usc.edu/~mattmann/
>> > ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
>> > Adjunct Associate Professor, Computer Science Department
>> > University of Southern California, Los Angeles, CA 90089 USA
>> > ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
>> >
>> >
>> >
>> >
>> >
>> >
>> > -----Original Message-----
>> > From: Tyler Palsulich <tpalsul...@gmail.com>
>> > Reply-To: "dev@tika.apache.org" <dev@tika.apache.org>
>> > Date: Saturday, April 18, 2015 at 11:29 PM
>> > To: "dev@tika.apache.org" <dev@tika.apache.org>
>> > Subject: RE: [VOTE] Apache Tika 1.8 Release Candidate #2
>> >
>> >> Hi Folks,
>> >>
>> >> If there are no blocking complaints (OSGi?) by Monday (a little longer
>> >> than 3 days, I realize), I'll mark this as passed and finish the release
>> >> process.
>> >>
>> >> Of course, it's no problem for me to cut another RC, if it's needed.
>> >>
>> >> Have a great weekend!
>> >> Tyler
>> >>
>> >> I've run into one problem while testing Tika 1.8 with Bixo.
>> >>
>> >> It's a dependency issue involving (of course) Guava, since that
>> >> project loves to break their API :(
>> >>
>> >> The bixo-core jar has these transitive dependencies on various versions of
>> >> Guava:
>> >>
>> >> Hadoop - 11.0.2
>> >> Cascading - 14.0.1
>> >> Tika-parsers - 10.0.1
>> >>       cdm - 17.0
>> >>
>> >> Everyone winds up using version 10.0.1 (note that Tika has a dependency on
>> >> cdm, which wants to use 17.0)
>> >>
>> >> The problem is that Hadoop (for any recent version) uses an API from
>> >> Guava's cache implementation that doesn't exist in 10.0.1:
>> >>
>> >>
>> >> com.google.common.cache.CacheBuilder.build(Lcom/google/common/cache/CacheLoader;)Lcom/google/common/cache/LoadingCache;
>> >> java.lang.NoSuchMethodError: com.google.common.cache.CacheBuilder.build(Lcom/google/common/cache/CacheLoader;)Lcom/google/common/cache/LoadingCache;
>> >>       at org.apache.hadoop.io.compress.CodecPool.createCache(CodecPool.java:62)
>> >>       at org.apache.hadoop.io.compress.CodecPool.<clinit>(CodecPool.java:74)
>> >>       at org.apache.hadoop.io.SequenceFile$Writer.close(SequenceFile.java:1272)
>> >>       at org.apache.hadoop.mapred.SequenceFileOutputFormat$1.close(SequenceFileOutputFormat.java:79)
>> >>
>> >> So what this means is that anyone trying to use Tika with Hadoop will need
>> >> to play games with the class loader to get the older version of Guava -
>> >> though that can cause other issues if Hadoop (or Cascading, etc) rely on
>> >> anything that's only in the newer Guava API.
>> >>
>> >> Guava 10.0.1 was released about 3.5 years ago; 11.0.2 was from about 3
>> >> years ago. So it seems like we should upgrade to at least 11.0.2.
>> >>
>> >> But I don't know if this is enough of an issue to require another RC.
>> >>
>> >> -- Ken
>> >>
>> >> PS - I've created https://issues.apache.org/jira/browse/TIKA-1606 to track
>> >> this.
>> >>
>> >>
>> >>> From: Tyler Palsulich
>> >>> Sent: April 13, 2015 10:56:29am PDT
>> >>> To: dev@tika.apache.org, u...@tika.apache.org
>> >>> Subject: [VOTE] Apache Tika 1.8 Release Candidate #2
>> >>>
>> >>> Hi Folks,
>> >>>
>> >>> A candidate for the Tika 1.8 release is available at:
>> >>>  https://dist.apache.org/repos/dist/dev/tika/
>> >>>
>> >>> The release candidate is a zip archive of the sources in:
>> >>>  http://svn.apache.org/repos/asf/tika/tags/1.8-rc2/
>> >>>
>> >>> The SHA1 checksum of the archive is
>> >>>  5e22fee9079370398472e59082d171ae2d7fdd31.
>> >>>
>> >>> In addition, a staged maven repository is available here:
>> >>>  https://repository.apache.org/content/repositories/orgapachetika-1009
>> >>>
>> >>> Please vote on releasing this package as Apache Tika 1.8. The vote is
>> >>> open for the next 72 hours and passes if a majority of at least three +1
>> >>> Tika PMC votes are cast.
>> >>>
>> >>> [ ] +1 Release this package as Apache Tika 1.8
>> >>> [ ] ±0 I don't object to this release, but I haven't checked it
>> >>> [ ] -1 Do not release this package because...
>> >>>
>> >>> Thanks,
>> >>> Tyler
>> >>
>> >>
>> >> --------------------------
>> >> Ken Krugler
>> >> +1 530-210-6378
>> >> http://www.scaleunlimited.com
>> >> custom big data solutions & training
>> >> Hadoop, Cascading, Cassandra & Solr
>> >
>>
>> --------------------------
>> Ken Krugler
>> +1 530-210-6378
>> http://www.scaleunlimited.com
>> custom big data solutions & training
>> Hadoop, Cascading, Cassandra & Solr
>>
>
>
> --
>
> Open Source Solutions for Text Engineering
>
> http://digitalpebble.blogspot.com/
> http://www.digitalpebble.com
> http://twitter.com/digitalpebble
>



-- 

Open Source Solutions for Text Engineering

http://digitalpebble.blogspot.com/
http://www.digitalpebble.com
http://twitter.com/digitalpebble
