Hi Joe,

Awesome! Thanks for picking this up and taking an interest in this work. So far, the only use case we've had is representing lats and lons (WGS84). It would be great to extract more information and come up with a policy for representing more WKTs and so forth.

We should probably start by coming up with a scheme for encoding the extracted information in the Tika metadata object and in its output XHTML. Do you have any ideas about how to do that? In the existing patch on TIKA-605, I simply intended to use the met object and its key-multi-value structure to represent the extracted information, but to take advantage of streaming and of content handlers, we ought to encode this information in the output XHTML.
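
To make the question a bit more concrete, here is a rough sketch of the two encodings side by side. It is plain Java and only illustrative: `GeoMetadataSketch` is a hypothetical stand-in for a key-multi-value metadata object (it is not Tika's own class and not the TIKA-605 patch), and the `geo:lat` / `geo:long` key names and the `<meta>` rendering are assumptions made for the example.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for a key-multi-value metadata object.
class GeoMetadataSketch {
    // Each key maps to an ordered list of values; insertion order is kept.
    private final Map<String, List<String>> values = new LinkedHashMap<>();

    // Append a value under a key without replacing earlier values.
    public void add(String key, String value) {
        values.computeIfAbsent(key, k -> new ArrayList<>()).add(value);
    }

    // Return all values stored under a key (empty list if none).
    public List<String> getValues(String key) {
        return values.getOrDefault(key, new ArrayList<>());
    }

    // Render every key/value pair as an XHTML <meta> element, one per line.
    public String toXhtmlMeta() {
        StringBuilder sb = new StringBuilder();
        for (Map.Entry<String, List<String>> entry : values.entrySet()) {
            for (String value : entry.getValue()) {
                sb.append("<meta name=\"").append(escape(entry.getKey()))
                  .append("\" content=\"").append(escape(value))
                  .append("\"/>\n");
            }
        }
        return sb.toString();
    }

    // Escape the characters that are unsafe inside an XML attribute value.
    private static String escape(String s) {
        return s.replace("&", "&amp;")
                .replace("<", "&lt;")
                .replace(">", "&gt;")
                .replace("\"", "&quot;");
    }
}
```

Adding `geo:lat` twice keeps both values under the one key, and `toXhtmlMeta()` emits one `<meta>` element per value. Whether something this flat is enough for WKTs and projected extents is exactly the open question.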

Thoughts?

Cheers,
Chris

On Feb 26, 2012, at 9:39 AM, Joe White wrote:

> Hi,
> I'm looking into implementing a bridge/link between Tika and GDAL so that
> geospatial information can be saved from georeferenced images and vector
> types. One thing that I have noticed while going through the code is that
> the code only defines geographic coordinate types, using latitudes and
> longitudes. Is this by design? If GDAL is wrapped into Tika, and a
> projected image is imported, are the geospatial extents meant to be held in
> the metadata as geographic points, possibly as WGS 84?
>
> Thanks
>
> Joe White

++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
Chris Mattmann, Ph.D.
Senior Computer Scientist
NASA Jet Propulsion Laboratory Pasadena, CA 91109 USA
Office: 171-266B, Mailstop: 171-246
Email: chris.a.mattm...@nasa.gov
WWW: http://sunset.usc.edu/~mattmann/
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
Adjunct Assistant Professor, Computer Science Department
University of Southern California, Los Angeles, CA 90089 USA
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++