Use your parser to extract the strings from the binary file and index them with Lucene.

If a string is small, store it as is; otherwise store the file path and the record's offset, so that the content can be retrieved later.
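
A rough sketch of the path-plus-offset idea, in plain Java rather than Lucene's API. The names OffsetIndexSketch, Location, add, find and fetch are made up for illustration, the HashMap only stands in for the index, and the sample records are ASCII text standing in for whatever your parser produces from the real binary events. With Lucene, the path and offset would be stored with each indexed record instead of the record text.

```java
import java.io.IOException;
import java.io.RandomAccessFile;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Plain-Java stand-in for an index that keeps record locations, not record text. */
class OffsetIndexSketch {

    /** Where one record lives: file path, byte offset, and byte length. */
    static final class Location {
        final Path file;
        final long offset;
        final int length;

        Location(Path file, long offset, int length) {
            this.file = file;
            this.offset = offset;
            this.length = length;
        }
    }

    // Maps "FIELD=value" keys to the locations of the records containing them.
    private final Map<String, List<Location>> postings = new HashMap<>();

    /** Records that a field/value pair occurs in the record at the given location. */
    void add(String field, String value, Location where) {
        postings.computeIfAbsent(field + "=" + value, k -> new ArrayList<>()).add(where);
    }

    /** Returns the locations of all records where field equals value exactly. */
    List<Location> find(String field, String value) {
        return postings.getOrDefault(field + "=" + value, new ArrayList<>());
    }

    /** Seeks to the stored offset and reads only that record's bytes. */
    static byte[] fetch(Location where) {
        try (RandomAccessFile raf = new RandomAccessFile(where.file.toFile(), "r")) {
            byte[] record = new byte[where.length];
            raf.seek(where.offset);
            raf.readFully(record);
            return record;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Writes a two-record sample file, indexes it, and returns the matching record. */
    static String demo() {
        try {
            // Hypothetical sample data: ASCII text in place of real binary events.
            byte[] first = "ACCOUNT=Apple;QTY=5".getBytes(StandardCharsets.US_ASCII);
            byte[] second = "ACCOUNT=Microsoft;QTY=7".getBytes(StandardCharsets.US_ASCII);
            byte[] all = new byte[first.length + second.length];
            System.arraycopy(first, 0, all, 0, first.length);
            System.arraycopy(second, 0, all, first.length, second.length);

            Path file = Files.createTempFile("events", ".bin");
            file.toFile().deleteOnExit();
            Files.write(file, all);

            // Index only the field values and where each record starts.
            OffsetIndexSketch index = new OffsetIndexSketch();
            index.add("ACCOUNT", "Apple", new Location(file, 0, first.length));
            index.add("ACCOUNT", "Microsoft", new Location(file, first.length, second.length));

            // Equivalent of: where ACCOUNT = 'Microsoft'
            Location hit = index.find("ACCOUNT", "Microsoft").get(0);
            return new String(fetch(hit), StandardCharsets.US_ASCII);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

The index itself stays small because it holds only field values and locations; the record bytes are read from the original file on demand, which matters when the files run to tens of gigabytes.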

Regards
Ganesh


----- Original Message -----
From: "Paul Feuer" <paul...@gmail.com>
To: <java-user@lucene.apache.org>
Sent: Friday, January 30, 2009 10:00 AM
Subject: Re: indexing binary files?


we have parsers for these files.

to index them, do the string representations need to be stored (aside
from sitting in the index file)? or can the reader simply provide the
string in order to record the location of the record in the binary
file?

if i need to convert the binary file into text fields, the files will
get VERY large.

the binary data are well-formed events, so queries would be like
"where ACCOUNT = 'Microsoft'"

./paul


On Thu, Jan 29, 2009 at 11:00 PM, Anshum <ansh...@gmail.com> wrote:
Hi Paul,
Lucene is a 'text only' search lib, i.e. as long as you feed in anything as
a string, you'd be able to use Lucene; otherwise I don't think there's a way.
How do you even intend to search in those binary files? As in, what would
be the keyword/phrase? Asking out of curiosity!

--
Anshum Gupta
Naukri Labs!
http://ai-cafe.blogspot.com

The facts expressed here belong to everybody, the opinions to me. The
distinction is yours to draw............


On Fri, Jan 30, 2009 at 9:13 AM, Paul Feuer <paul...@gmail.com> wrote:

Hi -

I've looked on the FAQ, the Java Docs, and searched a little in
google, but haven't been able to figure out if Lucene can index binary
files.

Our binary files can get up into the 20-30 gigabyte range.

If it is possible, anyone have any pointers to what interfaces I should
look at?

Thanks,

./paul

---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscr...@lucene.apache.org
For additional commands, e-mail: java-user-h...@lucene.apache.org

