Re: Metadata situation and XMP support in Tika

2012-04-24 Thread Ingo Renner
On 24.04.2012 at 13:43, Nick Burch wrote: > I certainly see there are cases when someone may want the full set of > metadata back from a file, in quite a low level way, but we should make sure > we don't lose the ability of users to say "give me the title of that > document, no matter what t

RE: Metadata situation and XMP support in Tika

2012-04-24 Thread Joerg Ehrlich
inggroup.com/specs/ [2] http://www.w3.org/TR/2012/REC-mediaont-10-20120209/ -Original Message- From: Ray Gauss II [mailto:ray.ga...@alfresco.com] Sent: Tuesday, 24 April 2012 15:10 To: dev@tika.apache.org Subject: Re: Metadata situation and XMP support in Tika I think the aliasin

RE: Metadata situation and XMP support in Tika

2012-04-24 Thread Joerg Ehrlich
ple of ideas I have also been discussing with Jukka so far and the steps we see on a roadmap that should help us to improve the metadata situation for Tika. Regards Jörg -Original Message- From: Ray Gauss II [mailto:ray.ga...@alfresco.com] Sent: Tuesday, 24 April 2012 15:10 To: dev@

Re: Metadata situation and XMP support in Tika

2012-04-24 Thread Ray Gauss II
I think the aliasing approach supports both use cases nicely, i.e.: Metadata.java: ... Property TITLE = DublinCore.DC_TITLE; ... Users then only have to concern themselves with "give me the metadata that best fits the idea of Title, as defined by Tika", and not even have to know about Dubl
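To make the aliasing idea in the message above concrete, here is a minimal sketch. It assumes nothing about Tika's real classes: the Property, DublinCore and Metadata types below are hypothetical stand-ins written only for this illustration, and only the line `TITLE = DublinCore.DC_TITLE` comes from the message itself. The point is that a parser can write a value under the namespace-specific property while a client reads it back through the generic alias.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical typed metadata key; NOT Tika's actual Property class.
final class Property {
    private final String name;
    Property(String name) { this.name = name; }
    String getName() { return name; }
}

// Hypothetical holder for namespace-specific (Dublin Core) properties.
final class DublinCore {
    static final Property DC_TITLE = new Property("dc:title");
    private DublinCore() {}
}

// Hypothetical metadata container that exposes a generic alias.
final class Metadata {
    // The alias from the message: the generic TITLE is the same object as DC_TITLE.
    static final Property TITLE = DublinCore.DC_TITLE;

    private final Map<String, String> values = new HashMap<>();

    void set(Property property, String value) {
        values.put(property.getName(), value);
    }

    String get(Property property) {
        return values.get(property.getName());
    }
}

public class AliasSketch {
    public static void main(String[] args) {
        Metadata metadata = new Metadata();
        // A parser stores the value under the explicit Dublin Core property.
        metadata.set(DublinCore.DC_TITLE, "Annual Report");
        // A client asks for the generic title without knowing about Dublin Core.
        System.out.println(metadata.get(Metadata.TITLE));
    }
}
```

Because the alias is just a second reference to the same property object, no separate mapping table is needed in this sketch; whether a real implementation would want one (for example, to alias one generic name to several namespaces) is left open here.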

RE: Metadata situation and XMP support in Tika

2012-04-24 Thread Nick Burch
On Fri, 13 Apr 2012, Joerg Ehrlich wrote: I think it would be more clear if parsers/clients would use the namespace or standard properties explicitly instead of the metadata one. But your idea of having a set of "standard" properties available in the Metadata class would be a good help for clie

Re: Metadata situation and XMP support in Tika

2012-04-13 Thread Ray Gauss II
>> A parser can then map properties to those namespaces it sees fit or several >> at the same time and the client can then decide which semantic (i.e. >> properties) it would like to use. >> >> Regards >> Jörg >> >> [1] >> http://docs.oasis-ope

RE: Metadata situation and XMP support in Tika

2012-04-13 Thread Joerg Ehrlich
enhancements :) Jörg -Original Message- From: Ray Gauss II [mailto:ray.ga...@alfresco.com] Sent: Wednesday, 11 April 2012 00:04 To: dev@tika.apache.org Subject: Re: Metadata situation and XMP support in Tika Hi Jörg, As you've seen from TIKA-859 and TIKA-842 I've had to deal wi

RE: Metadata situation and XMP support in Tika

2012-04-13 Thread Joerg Ehrlich
ns/standards/Ecma-376.htm > (in part 2) > > --- > Jörg Ehrlich | Computer Scientist | XMP Technology | Adobe Systems | > joerg.ehrl...@adobe.com | work: +49(40)306360 > > -Original Message- > From: Mattmann, Chris A (388J) [mailto:chris.a.mattm...@jpl.nasa.gov] >

Re: Metadata situation and XMP support in Tika

2012-04-13 Thread Ray Gauss II
w.ecma-international.org/publications/standards/Ecma-376.htm > (in part 2) > > --- > Jörg Ehrlich | Computer Scientist | XMP Technology | Adobe Systems | > joerg.ehrl...@adobe.com | work: +49(40)306360 > > -Original Message- > From: Mattmann, Chris A (388J) [mailto:chris.a.mat

RE: Metadata situation and XMP support in Tika

2012-04-13 Thread Joerg Ehrlich
ache.org Subject: Re: Metadata situation and XMP support in Tika Hi Jörg, Great summary! I would be in favor of option #2 as well, with the caveat that if we take it slow, I think there might be a way to not really have as much of a client/API impact, using deprecations and other techniques as you

Re: Metadata situation and XMP support in Tika

2012-04-10 Thread Ray Gauss II
Hi Jörg, As you've seen from TIKA-859 and TIKA-842 I've had to deal with similar issues. Those issues were needed by TIKA-774 which itself contains another mapping that converts the data output by ExifTool to the proper IPTC metadata defined in TIKA-842. The code for the ExifTool parser is now

Re: Metadata situation and XMP support in Tika

2012-04-05 Thread Mattmann, Chris A (388J)
Hi Jörg, Great summary! I would be in favor of option #2 as well, with the caveat that if we take it slow, I think there might be a way to not really have as much of a client/API impact, using deprecations and other techniques as you suggested. Looking forward to your participation! Cheers, C

Metadata situation and XMP support in Tika

2012-04-05 Thread Joerg Ehrlich
Hi everyone, I am an engineer in the XMP/Metadata team at Adobe and we would like to leverage Tika in current projects for metadata extraction (and mimetype detection). Our current systems primarily use the XMP data model to manage and interact with metadata. As far as I can see, the support fo