Re: indexing records in hierarchy

2005-11-01 Thread Otis Gospodnetic
Hi, In Lucene individual documents stand alone and can't be related to other Lucene documents in the index automatically. In other words, this is something you have to do through clever document/field design. Otis --- Urvashi Gadi <[EMAIL PROTECTED]> wrote: > Hi, > > Is there a way to link i
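One possible reading of "clever document/field design" is to denormalize the hierarchy: when a record is indexed, every collection that contains it is written into a field on that record's own document, so a search hit already carries its collection chain. The sketch below is plain Java and deliberately makes no Lucene calls; the class name `CollectionPath`, its methods, and the idea of a multi-valued "collection" field are all assumptions for illustration and do not come from the thread.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper, not part of Lucene: computes the values one might
// store in a multi-valued "collection" field on each record's document.
class CollectionPath {
    // Maps each record or collection id to its direct parent collection id.
    private final Map<String, String> parentOf = new HashMap<>();

    void setParent(String child, String parent) {
        parentOf.put(child, parent);
    }

    // Walks up the parent chain and returns every enclosing collection,
    // nearest first. Stops when a collection repeats, so a cyclic mapping
    // cannot loop forever.
    List<String> ancestorsOf(String id) {
        List<String> ancestors = new ArrayList<>();
        String current = parentOf.get(id);
        while (current != null && !ancestors.contains(current)) {
            ancestors.add(current);
            current = parentOf.get(current);
        }
        return ancestors;
    }
}
```

The trade-off of this design is that moving a collection under a different super collection means re-indexing every record beneath it, since each document holds its own copy of the chain.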

indexing records in hierarchy

2005-11-01 Thread Urvashi Gadi
Hi, Is there a way to link indexed records in Lucene? I am working with collections and a record might be a part of a collection and this collection can again be a part of super collection. So if the search result has the record, is there a way to find out all the collections under which the reco

RE: Help requested

2005-11-01 Thread Daniel . Clark
You're right, Peter. I did some thorough testing today and I was very wrong. I'm now using MultiFieldQueryParser as Erik suggested. ~ Daniel Clark, Senior Consultant Sybase Federal Professional Services 6550 Rock Spring Drive, Suite 800 Bethesda, MD 20817

Re: Analysis

2005-11-01 Thread Erik Hatcher
On 1 Nov 2005, at 11:02, Malcolm wrote:
> Hi, I've been reading my new project bible 'Lucene in Action'
Amen! ;)
> about Analysis in Chapter 4 and wondered what others are doing for indexing XML (if anyone else is, that is!). Are you folks just writing your own or utilising the current Lucene

Re: lock file race conditions

2005-11-01 Thread Chris Hostetter
: IndexWriter writer = new IndexWriter(directory, new StandardAnalyzer(), : true); : : so if I try to close() it in a finally or something it throws a null : pointer exception since the exception was thrown in the constructor. : : I'm simulating the exception by hand-creating the index directory and

Indexing Help

2005-11-01 Thread msftblows
Hey- I currently have an index with user information...one of the fields that I have in a document is CurrentJob...I was asked to add in past jobs as well...but the rules are that I can configure the weighting (mainly boosting) for absolute jobs (that this is a current job or a not current jo

Re: lock file race conditions

2005-11-01 Thread Dan Adams
Well, I'm running on Linux and I thought the problem was that the writer was not being closed, but the IOException is thrown at: IndexWriter writer = new IndexWriter(directory, new StandardAnalyzer(), true); so if I try to close() it in a finally or something it throws a null pointer exception sin
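The null pointer exception described above is the usual result of calling close() on a variable that was never assigned because the constructor threw. A minimal sketch of the guarded pattern follows; `FakeWriter` and `SafeClose` are hypothetical stand-ins written for this illustration, not Lucene classes, and the sketch only addresses the NullPointerException, not the leftover lock file.

```java
import java.io.IOException;

// Hypothetical stand-in for a writer whose constructor can fail;
// it is not Lucene's IndexWriter.
class FakeWriter {
    static int closeCalls = 0;

    FakeWriter(boolean failOnOpen) throws IOException {
        if (failOnOpen) {
            throw new IOException("simulated failure while opening");
        }
    }

    void close() {
        closeCalls++;
    }
}

class SafeClose {
    // Returns true if the writer opened, false if the constructor threw.
    static boolean openAndClose(boolean failOnOpen) {
        FakeWriter writer = null;
        try {
            writer = new FakeWriter(failOnOpen);
            return true;
        } catch (IOException e) {
            return false;
        } finally {
            // writer is still null when the constructor threw, so guard the call.
            if (writer != null) {
                writer.close();
            }
        }
    }
}
```

With the guard in place, a failed open simply skips close(), while a successful open still closes exactly once.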

Re: lock file race conditions

2005-11-01 Thread Chris Hostetter
1) how do you simulate the exception? 2) you didn't say you got a lock timeout error, you said you got a "couldn't delete the lock file" exception ... is your second test forcibly trying to unlock the index? 3) are you running this test on a Windows machine? 4) can you post your unit test?

lock file race conditions

2005-11-01 Thread Dan Adams
I have 2 JUnit tests. The first opens an index writer and then simulates having an IOException thrown when trying to add a document. The test that runs after that is just a normal test of the search. After the first test completes a lock file is left in /tmp. Now, if I run the test suite normally

Re: Analysis

2005-11-01 Thread Malcolm
I'm currently indexing the INEX collection and then performing queries on the Format features within the text. I am using a wide range of the XML features. The reason I asked about the XML Analysis is I am interested in opinions and reasons for adding a wide range of discussion to my dissertat

Re: Analysis

2005-11-01 Thread Grant Ingersoll
Not sure I am understanding your question correctly, but I think you want to pick your Analyzer based on what is in your content (i.e. language, usage of special symbols, etc.), not based on what the format of your content is (i.e. XML). Malcolm wrote: Hi, I'm just asking for opinions on Ana

RE: Analysis

2005-11-01 Thread Peter Kim
Ok... just got confused because you mentioned XML. Unless you're actually indexing the raw XML in some of your fields, the fact that you're indexing XML documents as your source content is irrelevant to your choice of Analyzer. Choice of indexer really depends on your specific project requirements

Re: Analysis

2005-11-01 Thread Malcolm
Hi, I'm just asking for opinions on Analyzers for the indexing. For example Otis in his article uses the WhitespaceAnalyzer and the Sandbox program uses the StandardAnalyzer. I am just gauging opinions on the subject with regard to XML. I'm using a mix of the Sandbox XMLDocumentHandlerSAX and a

RE: Analysis

2005-11-01 Thread Peter Kim
Not exactly sure what you're asking with regards to Analyzers and parsing XML... But for parsing and indexing XML documents with Lucene, you can find a lot of material out there by searching the list archives and using google. However, the document I found most helpful was this piece written by Ot

Analysis

2005-11-01 Thread Malcolm
Hi, I've been reading my new project bible 'Lucene in Action' about Analysis in Chapter 4 and wondered what others are doing for indexing XML (if anyone else is, that is!). Are you folks just writing your own or utilising the current Lucene analysis libraries? thanks, Malcolm Clark

StandardAnalyzer and thread safety

2005-11-01 Thread Sharma, Siddharth
Is using a QueryParser to parse a query using the same, single instance of Analyzer thread-safe? Or should I create a new Analyzer each time?

Re: Search problems

2005-11-01 Thread Steven Rowe
Such an analyzer already exists, in Lucene's Subversion repository, under contrib/analyzers/: KeywordAnalyzer. Robert Watkins wrote: One approach for matching your queries with Luke would be to write a custom Analyzer that does absolutely nothing to the terms. Then, if you put this Analyzer in

Re: Search problems

2005-11-01 Thread Robert Watkins
One approach for matching your queries with Luke would be to write a custom Analyzer that does absolutely nothing to the terms. Then, if you put this Analyzer in your classpath when running Luke you can select it as the Analyzer you want Luke to use to tokenize your query. This is not, of course,

Re: Java Indexer + DotLucene + IIS question

2005-11-01 Thread msftblows
One more question...if I do use a tool for replication and I only have the indexer running on one machine...say it creates 10 *.cfs files...the tool replicates all the files...then the machine with the indexer compresses those files and they are now all gone...how will they be removed from the sl

Re: Search problems

2005-11-01 Thread Miles Barr
On Thu, 2005-10-27 at 16:35 -0400, Sharma, Siddharth wrote: > My index has 4 keyword fields and one unindexed field. > I want to search by the 4 keyword fields and return the one unindexed field. > > I can iterate over the documents via Luke. > But when I search for the same values that I see via

Re: BooleanQuery

2005-11-01 Thread Erik Hatcher
On 1 Nov 2005, at 08:17, Michael D. Curtin wrote: tcorbet wrote: I have an index over the titles to .mp3 songs. It is not unreasonable for the user to want to see the results from: "Show me Everything". I understand that title:* is not a valid wildcard query. I understand that title:[a* TO z

Re: BooleanQuery

2005-11-01 Thread Michael D. Curtin
tcorbet wrote: I have an index over the titles to .mp3 songs. It is not unreasonable for the user to want to see the results from: "Show me Everything". I understand that title:* is not a valid wildcard query. I understand that title:[a* TO z*] is a valid wildcard query. What I cannot underst

Re: what is the best way to sort by document ids

2005-11-01 Thread Michael D. Curtin
Oren Shir wrote: My documents contain a field called SORT_ID, which contains an int that increases with every document added to the index. I want my results to be sorted by it. Which approach will prove the best performance: 1) Zero pad SORT_ID field and sort by it as plain text. 2) Sort using

Re: what is the best way to sort by document ids

2005-11-01 Thread Erik Hatcher
On 1 Nov 2005, at 06:03, Oren Shir wrote: Hi, My documents contain a field called SORT_ID, which contains an int that increases with every document added to the index. I want my results to be sorted by it. Which approach will prove the best performance: 1) Zero pad SORT_ID field and sor

Re: BooleanQuery

2005-11-01 Thread Erik Hatcher
On 1 Nov 2005, at 01:43, tcorbet wrote: I have an index over the titles to .mp3 songs. It is not unreasonable for the user to want to see the results from: "Show me Everything". I understand that title:* is not a valid wildcard query. I understand that title:[a* TO z*] is a valid wildcard query

what is the best way to sort by document ids

2005-11-01 Thread Oren Shir
Hi, My documents contain a field called SORT_ID, which contains an int that increases with every document added to the index. I want my results to be sorted by it. Which approach will prove the best performance: 1) Zero pad SORT_ID field and sort by it as plain text. 2) Sort using SortField for
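Option 1 relies on the fact that fixed-width, zero-padded numbers sort the same way as text and as integers, whereas unpadded numbers do not. The sketch below shows only that property in plain Java; it makes no Lucene calls and says nothing about which option performs better. The class `SortIdPadding`, its methods, and the width of 10 digits are assumptions chosen for illustration.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Locale;

class SortIdPadding {
    // Pads a non-negative id with leading zeros to a fixed width so that
    // lexicographic (plain text) order matches numeric order.
    static String pad(int id, int width) {
        if (id < 0) {
            throw new IllegalArgumentException("id must be non-negative");
        }
        // Locale.ROOT keeps the digits ASCII regardless of the default locale.
        return String.format(Locale.ROOT, "%0" + width + "d", id);
    }

    // Returns a copy of the keys sorted as plain text.
    static List<String> sortedAsText(List<String> keys) {
        List<String> copy = new ArrayList<>(keys);
        Collections.sort(copy);
        return copy;
    }
}
```

Sorting the unpadded strings "9", "10", "2" as text puts "10" first, while the padded forms come out in numeric order. The width has to be large enough for the biggest id that will ever be indexed, since changing it later means re-indexing.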

BooleanQuery

2005-11-01 Thread tcorbet
I have an index over the titles to .mp3 songs. It is not unreasonable for the user to want to see the results from: "Show me Everything". I understand that title:* is not a valid wildcard query. I understand that title:[a* TO z*] is a valid wildcard query. What I cannot understand is this behavi