Hi Shashi,

What would be the point of that? Base64-encoded documents cannot be tokenized and searched in any meaningful way. To be searchable, the content must be indexed as plain text. If you want to keep the original binary values as document data in the index, you can additionally store them as byte[] in raw binary form. You must differentiate between *indexed* and *stored* fields.
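To illustrate why the Base64 form is not searchable, here is a minimal sketch that uses only the JDK's java.util.Base64 and no Lucene at all. The class name Base64SearchDemo and the sample strings are made up for illustration. It shows that the encoded form of a word depends on its byte offset within the document, so a query term has no stable counterpart in the encoded text.

```java
import java.nio.charset.StandardCharsets;
import java.util.Base64;

public class Base64SearchDemo {

    /** Encodes text as Base64, the way a binary payload would be represented. */
    static String encode(String text) {
        return Base64.getEncoder().encodeToString(text.getBytes(StandardCharsets.UTF_8));
    }

    public static void main(String[] args) {
        String term = "lucene";
        String encodedTerm = encode(term);

        // Same word, but it starts at byte offset 0 in one document
        // and at byte offset 2 in the other.
        String aligned = encode("lucene index");
        String shifted = encode("a lucene index");

        System.out.println("encoded term:    " + encodedTerm);
        System.out.println("aligned doc:     " + aligned);
        System.out.println("shifted doc:     " + shifted);

        // The plain term never appears in the encoded documents.
        System.out.println("plain term in aligned doc?   " + aligned.contains(term));
        System.out.println("plain term in shifted doc?   " + shifted.contains(term));

        // Even the encoded term only matches when the word happens to start
        // on a 3-byte boundary, because Base64 encodes 3 bytes as 4 characters.
        System.out.println("encoded term in aligned doc? " + aligned.contains(encodedTerm));
        System.out.println("encoded term in shifted doc? " + shifted.contains(encodedTerm));
    }
}
```

An analyzer would also see each encoded document as one long token with no word boundaries, which is a second reason to index extracted plain text instead.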
But as Paul said, just *index* the text parts from the binary file using a parser and also *store* the offset value to get a pointer to the original data.

Uwe

-----
Uwe Schindler
H.-H.-Meier-Allee 63, D-28213 Bremen
http://www.thetaphi.de
eMail: u...@thetaphi.de

> -----Original Message-----
> From: Shashi Kant [mailto:shashi_k...@yahoo.com]
> Sent: Friday, January 30, 2009 3:32 PM
> To: java-user@lucene.apache.org
> Subject: Re: indexing binary files?
>
> Hi Paul, have you tried persisting the binaries in Base64 format and then
> indexing them?
> As you are aware, Base64 is a robust representation used in email
> attachments for example.
>
> Thanks
> Shashi
>
> ----- Original Message ----
> From: Paul Feuer <paul...@gmail.com>
> To: java-user@lucene.apache.org
> Sent: Thursday, January 29, 2009 10:43:36 PM
> Subject: indexing binary files?
>
> Hi -
>
> I've looked on the FAQ, the Java Docs, and searched a little in
> google, but haven't been able to figure out if Lucene can index binary
> files.
>
> Our binary files can get up into the 20-30 gigabyte range.
>
> If it is possible, anyone have any pointers to what interfaces I should
> look at?
>
> Thanks,
>
> ./paul
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: java-user-unsubscr...@lucene.apache.org
> For additional commands, e-mail: java-user-h...@lucene.apache.org