RE: Custom parser error

2012-07-31 Thread Uwe Schindler
Hi,
> Hi Nick, sorry to bother again but I'm not quite sure of what you have said.
>
> Nick Burch-2 wrote
> > On Tue, 31 Jul 2012, 122jxgcn wrote:
> > If your TikaInputStream lacks a file, and getFile is called, one will
> > automatically be created for you. (That's part of the point!)

Re: Custom parser error

2012-07-31 Thread 122jxgcn
Hi Nick, sorry to bother again but I'm not quite sure of what you have said.
Nick Burch-2 wrote
> On Tue, 31 Jul 2012, 122jxgcn wrote:
> If your TikaInputStream lacks a file, and getFile is called, one will
> automatically be created for you. (That's part of the point!)
I believe created f

[jira] [Commented] (TIKA-885) Possible ConcurrentModificationException while accessing Metadata produced by ParsingReader

2012-07-31 Thread Luis Filipe Nassif (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-885?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13426257#comment-13426257 ] Luis Filipe Nassif commented on TIKA-885: But how to track updates to the metadata

[jira] [Commented] (TIKA-966) org.apache.tika.Tika missing from tika-bundle-1.2.jar

2012-07-31 Thread Gary Karasiuk (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-966?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13426142#comment-13426142 ] Gary Karasiuk commented on TIKA-966: Using the 1.2 version of Tika, here is the stack tr

[jira] [Commented] (TIKA-966) org.apache.tika.Tika missing from tika-bundle-1.2.jar

2012-07-31 Thread Gary Karasiuk (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-966?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13426046#comment-13426046 ] Gary Karasiuk commented on TIKA-966: >> You should be able to get your deployment workin

[jira] [Comment Edited] (TIKA-965) Text Detection Fails on Mostly Non-ASCII UTF-8 Files

2012-07-31 Thread Ray Gauss II (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-965?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13425981#comment-13425981 ] Ray Gauss II edited comment on TIKA-965 at 7/31/12 6:18 PM: That

[jira] [Commented] (TIKA-965) Text Detection Fails on Mostly Non-ASCII UTF-8 Files

2012-07-31 Thread Ray Gauss II (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-965?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13425981#comment-13425981 ] Ray Gauss II commented on TIKA-965: That's the solution I was looking into and I wanted t

RE: [ANNOUNCE] Welcome Jörg Ehrlich as new Tika PMC member and committer

2012-07-31 Thread Joerg Ehrlich
Hi everyone, First of all thank you very much. I am really looking forward to working with all of you on this interesting project! I am an engineer at Adobe located in Hamburg, Germany and I am working in a larger team which provides components and solutions for metadata management and automat

[ANNOUNCE] Welcome Jörg Ehrlich as new Tika PMC member and committer

2012-07-31 Thread Mattmann, Chris A (388J)
Hi Folks, The Tika PMC has VOTEd to elect Jörg Ehrlich to our ranks as a PMC member and committer. Welcome Jörg! Feel free to mention a bit about yourself. Cheers, Chris ++ Chris Mattmann, Ph.D. Senior Computer Scientist NASA Jet P

[jira] [Commented] (TIKA-965) Text Detection Fails on Mostly Non-ASCII UTF-8 Files

2012-07-31 Thread Jukka Zitting (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-965?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13425797#comment-13425797 ] Jukka Zitting commented on TIKA-965: In the {{TextDetector}} we could also look for the

[jira] [Commented] (TIKA-966) org.apache.tika.Tika missing from tika-bundle-1.2.jar

2012-07-31 Thread Jukka Zitting (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-966?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13425794#comment-13425794 ] Jukka Zitting commented on TIKA-966: In 1.0 we excluded tika-core from tika-bundle as it

[jira] [Commented] (TIKA-965) Text Detection Fails on Mostly Non-ASCII UTF-8 Files

2012-07-31 Thread Ray Gauss II (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-965?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13425790#comment-13425790 ] Ray Gauss II commented on TIKA-965: I do have a test file and it's more than a few bytes

[jira] [Commented] (TIKA-965) Text Detection Fails on Mostly Non-ASCII UTF-8 Files

2012-07-31 Thread Nick Burch (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-965?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13425784#comment-13425784 ] Nick Burch commented on TIKA-965: Do you have a sample file that shows this problem? And is

[jira] [Created] (TIKA-966) org.apache.tika.Tika missing from tika-bundle-1.2.jar

2012-07-31 Thread Gary Karasiuk (JIRA)
Gary Karasiuk created TIKA-966:
Summary: org.apache.tika.Tika missing from tika-bundle-1.2.jar
Key: TIKA-966
URL: https://issues.apache.org/jira/browse/TIKA-966
Project: Tika
Issue Type: Bug

Re: Custom parser error

2012-07-31 Thread Nick Burch
On Tue, 31 Jul 2012, 122jxgcn wrote:
> I tried TikaInputStream.get() and tstream is no longer null. But it seems that tstream.hasFile() is null.
If you create a TikaInputStream with an InputStream, then initially hasFile will be false. If you create it with a file, it'll be true. If your TikaInpu
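The behaviour described in this reply (no file when created from a stream, a file when created from a file, and a file created on demand when getFile is called) can be pictured with a small stand-in class. This is only a sketch under assumptions: `SpoolingStream` and `SpoolingDemo` are hypothetical names built on the JDK alone, not the real TikaInputStream, and the temp-file handling is a guess at how such lazy spooling might be done.

```java
import java.io.ByteArrayInputStream;
import java.io.File;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

// Hypothetical model of a stream with an optional backing file.
// NOT the real TikaInputStream; it only mirrors the behaviour described above.
class SpoolingStream {
    private final InputStream in; // original source; null when built from a file
    private Path file;            // backing file, created lazily on first getFile()

    private SpoolingStream(InputStream in, Path file) {
        this.in = in;
        this.file = file;
    }

    // Created from a plain stream: no backing file yet.
    static SpoolingStream get(InputStream in) {
        return new SpoolingStream(in, null);
    }

    // Created from a file: the backing file is known from the start.
    static SpoolingStream get(File file) {
        return new SpoolingStream(null, file.toPath());
    }

    boolean hasFile() {
        return file != null;
    }

    // Returns the backing file, spooling the stream to a temp file if needed.
    File getFile() {
        if (file == null) {
            try {
                Path tmp = Files.createTempFile("spool-", ".tmp");
                tmp.toFile().deleteOnExit();
                Files.copy(in, tmp, StandardCopyOption.REPLACE_EXISTING);
                file = tmp;
            } catch (IOException e) {
                throw new UncheckedIOException(e);
            }
        }
        return file.toFile();
    }
}

class SpoolingDemo {
    public static void main(String[] args) {
        SpoolingStream s = SpoolingStream.get(new ByteArrayInputStream(new byte[] {104, 105}));
        System.out.println("before getFile: hasFile = " + s.hasFile());
        File f = s.getFile();
        System.out.println("after getFile: hasFile = " + s.hasFile() + ", bytes = " + f.length());
    }
}
```

In this model, a false hasFile right after wrapping a stream is the expected state rather than an error; the file only exists once something asks for it.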

[ANNOUNCE] Welcome Ingo Renner as Tika PMC member and committer

2012-07-31 Thread Mattmann, Chris A (388J)
Hi Folks, The Tika PMC VOTEd to add Ingo Renner to our ranks as a PMC member and committer. Welcome, Ingo! Please feel free to say a bit about yourself. Cheers, Chris ++ Chris Mattmann, Ph.D. Senior Computer Scientist NASA Jet Prop

[jira] [Created] (TIKA-965) Text Detection Fails on Mostly Non-ASCII UTF-8 Files

2012-07-31 Thread Ray Gauss II (JIRA)
Ray Gauss II created TIKA-965:
Summary: Text Detection Fails on Mostly Non-ASCII UTF-8 Files
Key: TIKA-965
URL: https://issues.apache.org/jira/browse/TIKA-965
Project: Tika
Issue Type: Bug

Re: Custom parser error

2012-07-31 Thread 122jxgcn
Hi Nick, I tried TikaInputStream.get() and tstream is no longer null. But it seems that tstream.hasFile() is null. I'm pretty sure I'm loading the file right, as I did the same thing with the parser for pdf.

Re: Custom parser error

2012-07-31 Thread Nick Burch
On Tue, 31 Jul 2012, 122jxgcn wrote:
> try { TikaInputStream tstream = TikaInputStream.cast(stream);
You probably want TikaInputStream.get rather than cast. Cast casts it if possible, get wraps it.
Nick
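The cast-versus-get distinction in this reply can be illustrated with a minimal sketch. It assumes a hypothetical wrapper class, `WrappedStream`, written against the JDK only; it is not the real TikaInputStream, and it simply models "cast if possible, otherwise nothing" against "reuse or wrap".

```java
import java.io.ByteArrayInputStream;
import java.io.FilterInputStream;
import java.io.InputStream;

// Hypothetical stand-in for a wrapper stream; NOT the real TikaInputStream.
class WrappedStream extends FilterInputStream {
    private WrappedStream(InputStream in) {
        super(in);
    }

    // "cast": succeeds only if the stream already is a WrappedStream, else null.
    static WrappedStream cast(InputStream in) {
        return (in instanceof WrappedStream) ? (WrappedStream) in : null;
    }

    // "get": reuse an existing wrapper, otherwise wrap the plain stream.
    static WrappedStream get(InputStream in) {
        WrappedStream existing = cast(in);
        return (existing != null) ? existing : new WrappedStream(in);
    }
}

class CastVersusGetDemo {
    public static void main(String[] args) {
        InputStream plain = new ByteArrayInputStream(new byte[] {1, 2, 3});
        // A plain stream cannot be cast, so cast yields null here.
        System.out.println("cast(plain) is null: " + (WrappedStream.cast(plain) == null));
        // get wraps the plain stream, so the result is usable.
        WrappedStream wrapped = WrappedStream.get(plain);
        System.out.println("get(plain) is null: " + (wrapped == null));
        // An already-wrapped stream is returned as-is by both methods.
        System.out.println("get(wrapped) same instance: " + (WrappedStream.get(wrapped) == wrapped));
    }
}
```

Under this model, a stream obtained from getResourceAsStream is a plain InputStream, which is why the cast route gives null while the get route gives a wrapper.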

Custom parser error

2012-07-31 Thread 122jxgcn
Hi, I'm continuing my question from this post: http://lucene.472066.n3.nabble.com/Convert-file-before-Tika-processes-it-td3990629.html So, I wrote some code and a test, but it's not passing. In the test, I did something like InputStream stream = HWPParserTest.class.getResourceAsStream( "/t