Thanks. I think I will follow the advice. But just for the sack of curiosity, can what I suggest be done ?
Yonik Seeley wrote: > > On Thu, Jul 10, 2008 at 1:42 AM, blazingwolf7 <[EMAIL PROTECTED]> > wrote: >> Well, I am trying to extract the URL and contentLength from the ".fdt" >> file. >> I am planning to use both of these values in a filter to remove certain >> links to be display in the search result. The problem is, I am told not >> to >> use the IndexReader to retrieve these values for each document found >> matching with the query. >> >> So now, instead, I will have to retrieve the entire .fdt file, extract >> both >> the values and store it into an arraylist which will be use later. I am >> having problem extracting the entire file without using all the seek() >> method to determine the position of the document. >> >> Any suggestion? > > You're trying to do things at too low of a level (bypassing Lucene's > public APIs) > I suggested earlier that you index the URL untokenized, and then use > the FieldCache. That will allow you to easily retrieve a String[] of > all the URLs. > > -Yonik > > >> Yonik Seeley wrote: >>> >>> On Wed, Jul 9, 2008 at 11:13 PM, blazingwolf7 <[EMAIL PROTECTED]> >>> wrote: >>>> Sorry,but I am still quite new to Lucene. What exactly is "cp"? >>> >>> The unix command for copy (hence the smiley). >>> >>> Some of your recent questions seem to be suffering from an XY problem: >>> http://www.perlmonks.org/index.pl?node_id=542341 >>> You may get more help by explaining what you are trying to do. >>> >>> -Yonik >>> >>>> Yonik Seeley wrote: >>>>> >>>>> On Wed, Jul 9, 2008 at 9:01 PM, blazingwolf7 <[EMAIL PROTECTED]> >>>>> wrote: >>>>>> I had recently found out that Lucene will retrieve the content of a >>>>>> document >>>>>> from a file ".fdt". I am trying to retrieve the entire file in one go >>>>>> instead of retrieving it based on document number. can it be done? >>>>> >>>>> "cp" can retrieve the file on one go ;-) >>>>> >>>>> Other than that, the format is documented here: >>>>> http://lucene.apache.org/java/docs/fileformats.html >>>>> But I'm not sure why retrieving by document number won't work for you. >>>>> >>>>> -Yonik >>> >>> --------------------------------------------------------------------- >>> To unsubscribe, e-mail: [EMAIL PROTECTED] >>> For additional commands, e-mail: [EMAIL PROTECTED] >>> >>> >>> >> >> -- >> View this message in context: >> http://www.nabble.com/.fdt-file-tp18373913p18376301.html >> Sent from the Lucene - Java Users mailing list archive at Nabble.com. >> >> >> --------------------------------------------------------------------- >> To unsubscribe, e-mail: [EMAIL PROTECTED] >> For additional commands, e-mail: [EMAIL PROTECTED] >> >> > > --------------------------------------------------------------------- > To unsubscribe, e-mail: [EMAIL PROTECTED] > For additional commands, e-mail: [EMAIL PROTECTED] > > > -- View this message in context: http://www.nabble.com/.fdt-file-tp18373913p18394786.html Sent from the Lucene - Java Users mailing list archive at Nabble.com. --------------------------------------------------------------------- To unsubscribe, e-mail: [EMAIL PROTECTED] For additional commands, e-mail: [EMAIL PROTECTED]