Re: how to match a term within digital strings?

2009-11-08 Thread Wenbo Zhao
Thanks. 2009/11/9 AHMET ARSLAN : > > >> Thanks a lot.  I'm such a fool >> >> BTW, where can I find better doc other than javadoc ? >> Or how do you get into lucene docs ?   I'm >> really a little crazy >> reading javadoc, all concepts are split into fragments. > > Lucene in Action [1] Second Editi

Re: how to match a term within digital strings?

2009-11-08 Thread Wenbo Zhao
Thanks. But I'm indexing number strings like phone numbers and card numbers, and all kinds of searches are possible. And fortunately my application is not strict on fast response. A search in several seconds is acceptable :-) 2009/11/9 Uwe Schindler : > If you *only* want to do wildcard queries on tha

RE: how to match a term within digital strings?

2009-11-08 Thread Uwe Schindler
If you *only* want to do wildcard queries on that field with a * in front, I would suggest reversing the string so the query uses the * at the end. Wildcards at the beginning are very slow, because every term from this field has to be enumerated. If the wildcard is at the end, because you made all
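The reversal idea can be sketched without Lucene at all. The following is a minimal illustration, assuming a sorted in-memory set stands in for the term dictionary; the class `ReversedTermIndex` and its methods are hypothetical names, not Lucene API. It shows why a trailing wildcard is cheap: the lookup seeks to the first candidate and stops at the first non-match instead of enumerating every term.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeSet;

// Hypothetical helper, not part of Lucene: a sorted set stands in for the term dictionary.
class ReversedTermIndex {
    private final TreeSet<String> reversedTerms = new TreeSet<>();

    private static String reverse(String s) {
        return new StringBuilder(s).reverse().toString();
    }

    // Store each term reversed, e.g. "123456789" is stored as "987654321".
    void add(String term) {
        reversedTerms.add(reverse(term));
    }

    // Answers the query "*suffix" as a prefix lookup on the reversed terms.
    List<String> endsWith(String suffix) {
        String prefix = reverse(suffix);
        List<String> hits = new ArrayList<>();
        // tailSet seeks to the first term >= prefix; the loop ends at the first non-match.
        for (String t : reversedTerms.tailSet(prefix, true)) {
            if (!t.startsWith(prefix)) {
                break;
            }
            hits.add(reverse(t)); // return the term in its original orientation
        }
        return hits;
    }
}
```

Note that this only covers queries of the form "*56789"; a query with wildcards on both sides still needs a different approach.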

Re: how to match a term within digital strings?

2009-11-08 Thread AHMET ARSLAN
> Thanks a lot.  I'm such a fool > > BTW, where can I find better doc other than javadoc ? > Or how do you get into lucene docs ?   I'm > really a little crazy > reading javadoc, all concepts are split into fragments. Lucene in Action [1] Second Edition is excellent. [1] http://www.manning.com

Index individual digital strings

2009-11-08 Thread Wenbo Zhao
Hi all, What's the best way to index digital strings? Currently I'm using doc.add(new Field("id", docid, Field.Store.YES, Field.Index.NOT_ANALYZED)); doc.add(new Field("field", str, Field.Store.NO, Field.Index.ANALYZED)); where str is the concatenation of the digital strings for this document. I guess there

Re: how to match a term within digital strings?

2009-11-08 Thread Wenbo Zhao
Thanks a lot. I'm such a fool BTW, where can I find better doc other than javadoc ? Or how do you get into lucene docs ? I'm really a little crazy reading javadoc, all concepts are split into fragments. 2009/11/9 AHMET ARSLAN : >> Hi all, >> I want to query part of a digital string: >> say in

Re: how to match a term within digital strings?

2009-11-08 Thread Wenbo Zhao
Hi all, I think I got an approach; it may not be the best, but it works. My code is as follows and works like a query for "*19810919*": IndexSearcher isearcher = new IndexSearcher(directory, true); IndexReader ir = isearcher.getIndexReader(); TermEnum te = ir.terms(); List result = new Array
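The message above is cut off, so the following is not the poster's code. It is a minimal sketch of the general idea (walk every term and keep those containing the fragment), assuming a plain `Iterable<String>` stands in for Lucene's term enumeration; `TermScan` and `termsContaining` are hypothetical names.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for scanning a term enumeration; not Lucene API.
class TermScan {
    // Emulates the query "*fragment*" by testing every term, so cost grows with the number of terms.
    static List<String> termsContaining(Iterable<String> allTerms, String fragment) {
        List<String> result = new ArrayList<>();
        for (String term : allTerms) {
            if (term.contains(fragment)) {
                result.add(term);
            }
        }
        return result;
    }
}
```

The matching terms would then presumably be turned into a query over the documents that contain them. The full scan is linear in the number of distinct terms, which fits the earlier remark that a search taking several seconds is acceptable.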

Re: how to match a term within digital strings?

2009-11-08 Thread AHMET ARSLAN
> Hi all, > I want to query part of a digital string: > say indexed token is "123456789" > I want to query 56789 to match this token > The "Query Parser Syntax" says wildcard search can not > be the first char.  So "*56789" is not allowed > How can I do that ? > Thanks. With org.apache.lucene.quer

how to match a term within digital strings?

2009-11-08 Thread Wenbo Zhao
Hi all, I want to query part of a digital string: say indexed token is "123456789" I want to query 56789 to match this token The "Query Parser Syntax" says wildcard search can not be the first char. So "*56789" is not allowed How can I do that ? Thanks. -- Best Regards, ZHAO, Wenbo ===

Re: IndexWriter.close() no longer seems to close everything

2009-11-08 Thread Daniel Noll
On Mon, Nov 9, 2009 at 14:41, John Wang wrote: > I am seeing the same thing, but only when IndexWriter.getReader is called at > a high rate. > > from lsof, I see file handles growing. This hint turned out to help. :-) Turns out we had an IndexReader hanging around from a previous index state (bef

Re: IndexWriter.close() no longer seems to close everything

2009-11-08 Thread John Wang
I am seeing the same thing, but only when IndexWriter.getReader is called at a high rate. From lsof, I see file handles growing. -John On Sun, Nov 8, 2009 at 7:29 PM, Daniel Noll wrote: > Hi all. > > We updated to Lucene 2.9, and now we find that after closing our text > index, it is not possib

IndexWriter.close() no longer seems to close everything

2009-11-08 Thread Daniel Noll
Hi all. We updated to Lucene 2.9, and now we find that after closing our text index, it is not possible to rename the directory in which it resides (we are actually renaming a directory further up the hierarchy.) We discovered that the following files were still open by the process: _0.tis, _0

Re: Indexing domain names?

2009-11-08 Thread Chris Were
Thanks for the tips guys, got it working now. Cheers, Chris On Sun, Nov 8, 2009 at 6:43 AM, Erick Erickson wrote: > << indexed > are free form text. >>> > > If you try Ahmet's suggestion, PerFieldAnalyzerWrapper is your friend. The > snippet > above makes me wonder if you've seen this class.

Re: 2 phase commit with external data

2009-11-08 Thread Michael McCandless
OK, thanks for the tests... this test also reproduces it: public void testPrepareCommitIsCurrent() throws Throwable { Directory dir = new MockRAMDirectory(); IndexWriter writer = new IndexWriter(dir, new WhitespaceAnalyzer(), IndexWriter.MaxFieldLength.UNLIMITED); Document doc = new

Re: similarity function

2009-11-08 Thread Chris Hostetter
: "how do i set the score of each document result to be the score of that : of the field that best matches the search terms"? You'll want something like this pseudo code... DisjunctionMaxQuery dq = new DMQ foreach fieldname in list_of_fields { BooleanQuery bq = new BQ foreach word in l
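To make the scoring idea concrete, here is a small Lucene-free sketch of how a disjunction-max combination behaves, as far as I understand it: the document gets the score of its best-matching field, plus an optional tie-breaker fraction of the remaining field scores. The class `BestFieldScore`, the method `combine`, and the sample field names are hypothetical; the per-field scores are assumed to be computed elsewhere.

```java
import java.util.Map;

// Hypothetical illustration of max-over-fields scoring; not Lucene API.
class BestFieldScore {
    // perFieldScores: score of the same query run against each field separately (assumed non-negative).
    // tieBreaker: 0.0 means "best field only"; larger values let the other fields contribute.
    static double combine(Map<String, Double> perFieldScores, double tieBreaker) {
        double max = 0.0;
        double sum = 0.0;
        for (double s : perFieldScores.values()) {
            sum += s;
            max = Math.max(max, s);
        }
        return max + tieBreaker * (sum - max);
    }
}
```

With a tie-breaker of 0.0 the result is exactly the best field's score, which is what the question asks for.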

Re: 2 phase commit with external data

2009-11-08 Thread Peter Keegan
>Are you using Lucene 2.9? Yes Peter On Sun, Nov 8, 2009 at 6:23 PM, Peter Keegan wrote: > Here is some stand-alone code that reproduces the problem. There are 2 > classes. jvm1 creates the index, jvm2 reads the index. The system console > input is used to synchronize the 4 steps. > > jvm1: > -

Re: 2 phase commit with external data

2009-11-08 Thread Peter Keegan
Here is some stand-alone code that reproduces the problem. There are 2 classes. jvm1 creates the index, jvm2 reads the index. The system console input is used to synchronize the 4 steps. jvm1: -- import java.io.File; import java.util.Scanner; import org.apache.lucene.index.IndexWriter;

Re: Directory.list() deprecation

2009-11-08 Thread Daniel Noll
On Fri, Nov 6, 2009 at 20:26, Michael McCandless wrote: > Well... you can use oal.index.IndexFileNameFilter.getFilter() to > filter for only the Lucene index files, or, you could filter for the > additional files you know you've placed in the index directory? This is the workaround we're currentl

Re: Indexing domain names?

2009-11-08 Thread Erick Erickson
<<>> If you try Ahmet's suggestion, PerFieldAnalyzerWrapper is your friend. The snippet above makes me wonder if you've seen this class.. Best Erick On Sun, Nov 8, 2009 at 5:54 AM, AHMET ARSLAN wrote: > > Hi, > > > > How do I go about indexing domain names? I currently index > > the domain

RE: lucene 2.9+ numeric indexing

2009-11-08 Thread Uwe Schindler
That's indeed strange. The problem has nothing to do with NumericField/NumericUtils and corresponding FieldCache parsing at all, it is more the autodetection falling back to NumericField parser, if the first term is not parseable as old-style numeric. Because of that you get this error message, bec

Re: synonym payload boosting

2009-11-08 Thread AHMET ARSLAN
Additionally, you need to modify your queryparser to return BoostingTermQuery, PayloadTermQuery, PayloadNearQuery etc. With these query types, the scorePayload method is invoked. Hope this helps. --- On Sun, 11/8/09, David Ginzburg wrote: > From: David Ginzburg > Subject: synonym payload boosting

Re: synonym payload boosting

2009-11-08 Thread Simon Willnauer
You might get an answer on the solr list. This is the lucene users list. Simon On Nov 8, 2009 2:24 PM, "David Ginzburg" wrote: Hi, I have a field and a weighted synonym map. I have indexed the synonyms with the weight as payload. My code snippet from my filter: *public Token next(final Token reu

synonym payload boosting

2009-11-08 Thread David Ginzburg
Hi, I have a field and a weighted synonym map. I have indexed the synonyms with the weight as payload. My code snippet from my filter: public Token next(final Token reusableToken) throws IOException . . . Payload boostPayload; for (Synonym sy

Re: Indexing domain names?

2009-11-08 Thread AHMET ARSLAN
> Hi, > > How do I go about indexing domain names? I currently index > the domain, but > it only works if I put the exact full domain in. For > example: > > site:www.youtube.com (this works) > site:youtube.com (this doesn't work) > > I am using the StandardAnalyzer as most of the other fields >

lucene 2.9+ numeric indexing

2009-11-08 Thread John Wang
Hi guys: Running into a strange problem: I am indexing into a field a numeric string: int n = Math.abs(rand.nextInt(100)); Field myField = new Field(MY_FIELD, String.valueOf(n), Store.NO, Index.NOT_ANALYZED_NO_NORMS); myField.setOmitTermFreqAndPositions(true); doc.add(myFi