Hey Rahul,

This is great, and I'm happy to work with you to shepherd this in. The first step is to create a JIRA issue for your parser, and then to submit a patch that incorporates it into the tika-parsers module. As part of that patch, you can start by changing the package to org.apache.* (from its current au.edu.anu.* package).

Then, it would be nice to create a unit test for the parser, and to include a sample FITS file that the unit test can run against. There are a number of existing examples under test-resources within tika-parsers.

While you are doing all this, you might want to file an Apache Individual Contributor License Agreement (ICLA) and submit it to secret...@apache.org to cover your contributions:

http://www.apache.org/licenses/icla.txt

Again, I'd be happy to help, and thanks for wanting to contribute to the project!

Cheers,
Chris

On 12/4/12 3:18 PM, "Rahul Khanna" <rahul.kha...@anu.edu.au> wrote:

>Hi,
>
>I'm a developer who has used Apache Tika in a Research Data Repository
>System at The Australian National University. As part of the
>requirements of the project we extended the functionality of Apache Tika
>by creating a parser that extracts the headers of files in the FITS
>format
>(http://www.nationalarchives.gov.uk/PRONOM/Format/proFormatSearch.aspx?status=detailReport&id=657)
>using the nom.tam.fits library available at
>http://heasarc.gsfc.nasa.gov/docs/heasarc/fits/java/v1.0/ .
>
>Apache Tika already has the ability to identify FITS files (without
>parsing them) as per https://issues.apache.org/jira/browse/TIKA-874 . Is
>your team willing to review and potentially incorporate the parser into
>Tika? The parser in its current form is available at
>https://github.com/anu-doi/anudc/blob/master/DcShared/src/main/java/au/edu/anu/dcbag/metadata/FitsParser.java .
>
>Thank you,
>
>Rahul Khanna
>
>rahul.kha...@anu.edu.au