Re: Doc Caching

2010-04-19 Thread Chris B
Mike, cheers for the reply. Is it worth setting up your own caching or letting the OS do it? I've set up a caching system, but if the OS is doing it, that seems pointless. Chris - Original Message - From: "Michael McCandless" To: Sent: Tuesday, April 20, 2010 2:17 AM Subject: Re: Doc C

RE: NumericRangeQuery in BooleanQuery

2010-04-19 Thread Murdoch, Paul
Yes and no. I extended the QueryParser and overrode the getRangeQuery method to let it build NRQ. When parsing a BooleanQuery containing NRQ using the extended QueryParser the overridden getRangeQuery was never being called. I think it was still using the QueryParser's. In the end I'm buildi

Re: Doc Caching

2010-04-19 Thread Michael McCandless
No, Lucene doesn't. But the OS usually does (in its IO cache), assuming there is spare RAM. The "only" things that are explicitly held in memory by Lucene are the norms ("boost bytes"), terms dict index, deletions bit vector and field cache (used e.g. when you sort by a field), I think. Mike On Fr

Re: Term offsets for highlighting

2010-04-19 Thread Koji Sekiguchi
Stephen Greene wrote: Hi Koji, An additional question. Is it possible to access the FieldTermStack from the FastVectorHighlighter after it has been populated with matching terms from the field? I think this would provide an ideal solution for this problem, as ultimately I am only concerned

Re: How to search by numbers

2010-04-19 Thread Erick Erickson
You might also want to think about which analyzer you're using for this field then. KeywordAnalyzer may suit your purpose if you're using some other one, or possibly even WhitespaceAnalyzer. PerFieldAnalyzerWrapper may be your friend too. Best Erick On Mon, Apr 19, 2010 at 2:39 PM, Andy wrot

RE: How to search by numbers

2010-04-19 Thread Andy
That works, and now that I re-test my original code, it also works. > Date: Mon, 19 Apr 2010 10:52:45 -0700 > From: iori...@yahoo.com > Subject: Re: How to search by numbers > To: java-user@lucene.apache.org > > > > Hi, I have indexed the following two fields: > > org_id - NOT_ANALYZED org_name

Re: Different index per user

2010-04-19 Thread Erick Erickson
If you're thinking about sharding etc, have you looked at SOLR which manages a lot of the hard stuff for you? Best Erick On Mon, Apr 19, 2010 at 1:34 PM, Erdinc Yilmazel wrote: > Thanks Erick, > I don't have any access control constraints. The index won't be exposed to > the users. I am just con

Re: How to search by numbers

2010-04-19 Thread Ahmet Arslan
> Hi, I have indexed the following two fields: > org_id - NOT_ANALYZED org_name - ANALYZED > However when I try to search by org_id, for example, 12345, > I get no hits.  > I am using the StandardAnalyzer to index and search.  > > And I am using:  Query query = > queryParser.parse("org_id:12345")

How to search by numbers

2010-04-19 Thread Andy
Hi, I have indexed the following two fields: org_id - NOT_ANALYZED org_name - ANALYZED However when I try to search by org_id, for example, 12345, I get no hits. I am using the StandardAnalyzer to index and search. And I am using: Query query = queryParser.parse("org_id:12345"); Any ideas? Th

Re: Different index per user

2010-04-19 Thread Erdinc Yilmazel
Thanks Erick, I don't have any access control constraints. The index won't be exposed to the users. I am just concerned about scalability issues. I'll probably start with the approach of adding a user identifier field, and I'll try sharding based on that value when my index grows. Erdinc On Mon, Apr 19, 2
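
[Editor's sketch] The message above mentions sharding on the user identifier but does not say how users would be routed to shards. One common option is hash-based routing, sketched below in plain Java. The class name UserShardSketch, the method shardFor, and the choice of String.hashCode are illustrative assumptions, not anything proposed in the thread.

```java
// Hypothetical sketch: route each user to one of numShards indexes by hashing the user id.
final class UserShardSketch {

    // Returns a shard number in [0, numShards). The same userId always yields the same shard.
    static int shardFor(String userId, int numShards) {
        if (numShards <= 0) {
            throw new IllegalArgumentException("numShards must be positive");
        }
        // floorMod keeps the result non-negative even when hashCode() is negative.
        return Math.floorMod(userId.hashCode(), numShards);
    }

    public static void main(String[] args) {
        System.out.println(shardFor("user-42", 4));
    }
}
```

With a scheme like this, changing the number of shards moves most users to a different shard, so it suits a setup where the shard count is fixed up front or a re-index is acceptable.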

RE: Combining PrefixQuery and FuzzyQuery

2010-04-19 Thread Uwe Schindler
I am sorry, I don't understand what you are trying to say. It's confusing to me. If you want a Fuzzy query, where the first 3 chars ("the") always match (a PrefixQuery) and the rest of the term is fuzzy, use this constructor and pass 3 as the prefix length: http://lucene.apache.org/java/3_0_1/api/all/org
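
[Editor's sketch] In Lucene 3.0 the advice above corresponds, as far as I know, to the FuzzyQuery(Term, float minimumSimilarity, int prefixLength) constructor. The plain-Java sketch below does not call Lucene; it only illustrates the idea of a fuzzy match whose first characters must match exactly. The class and method names are hypothetical, and the similarity formula (one minus the edit distance divided by the prefix length plus the shorter remainder) is my approximation of Lucene's scoring, not a copy of it.

```java
// Plain-Java illustration of prefix-constrained fuzzy matching; not Lucene code.
final class PrefixFuzzySketch {

    // Classic Levenshtein edit distance (insert, delete, substitute) between a and b.
    static int editDistance(String a, String b) {
        int[] prev = new int[b.length() + 1];
        int[] curr = new int[b.length() + 1];
        for (int j = 0; j <= b.length(); j++) {
            prev[j] = j;
        }
        for (int i = 1; i <= a.length(); i++) {
            curr[0] = i;
            for (int j = 1; j <= b.length(); j++) {
                int cost = a.charAt(i - 1) == b.charAt(j - 1) ? 0 : 1;
                curr[j] = Math.min(Math.min(curr[j - 1] + 1, prev[j] + 1), prev[j - 1] + cost);
            }
            int[] tmp = prev;
            prev = curr;
            curr = tmp;
        }
        return prev[b.length()];
    }

    // Returns 0 if the candidate does not share the query's first prefixLength characters;
    // otherwise a similarity in [0, 1] computed on the remainder of both terms.
    static float similarity(String query, String candidate, int prefixLength) {
        int p = Math.min(prefixLength, query.length());
        if (!candidate.startsWith(query.substring(0, p))) {
            return 0f;
        }
        String queryRest = query.substring(p);
        String candidateRest = candidate.substring(p);
        int denominator = p + Math.min(queryRest.length(), candidateRest.length());
        if (denominator == 0) {
            return 1f; // both terms are empty
        }
        float sim = 1f - (float) editDistance(queryRest, candidateRest) / denominator;
        return Math.max(0f, sim);
    }

    static boolean matches(String query, String candidate, int prefixLength, float minSimilarity) {
        return similarity(query, candidate, prefixLength) >= minSimilarity;
    }

    public static void main(String[] args) {
        System.out.println(matches("these", "there", 3, 0.79f));
        System.out.println(matches("theater", "heater", 3, 0.5f));
    }
}
```

Under this approximation, a five-character term with one edit after the prefix scores 1 - 1/5 = 0.8, which is consistent with the "0.79 means 4 out of 5 characters" reading elsewhere in the thread, while a candidate that does not start with "the" is rejected regardless of how similar the rest is.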

Re: Combining PrefixQuery and FuzzyQuery

2010-04-19 Thread Lukas Österreicher
Update to my last response with a sample of what I thought you might mean: This does not work. Original query up till now: +(item.name:the* item.name:the) New query would look like this (which states: match item.name where a term exists that is either exactly "the" or starts with "the"): +(item.name:t

Re: Combining PrefixQuery and FuzzyQuery

2010-04-19 Thread Lukas Österreicher
Thanks. However, I do not fully understand, specifically "how many characters are the prefix". As far as I understand, you tell the prefix query how fuzzy the search shall be. In my example below 0.79 means 4 out of 5 characters must match. Do you mean I should change code to this: BooleanQuer

Big problem with solr in an official server.

2010-04-19 Thread Ariel
Hi everybody: I have a big problem with the amount of memory Solr is using on a server. I would like to know how to configure it to use a limited amount of memory. I am starting Solr with the "java -jar start.jar" command on an Ubuntu server, and the start.jar process is using 7 GB of memory on the server

RE: Combining PrefixQuery and FuzzyQuery

2010-04-19 Thread Uwe Schindler
Don't use PrefixQuery, only FuzzyQuery. There you pass in the whole term (with prefix) and define how many characters are the prefix. - Uwe Schindler H.-H.-Meier-Allee 63, D-28213 Bremen http://www.thetaphi.de eMail: u...@thetaphi.de > -Original Message- > From: Lukas Österreicher [m

Re: Combining PrefixQuery and FuzzyQuery

2010-04-19 Thread Lukas Österreicher
Well, what would this look like in code? Currently I have the prefix query like this: BooleanQuery bQuery = new BooleanQuery(); PrefixQuery prefixQuery = new PrefixQuery(new Term("item.name", termText)); bQuery.add( prefixQuery, Occur.MUST); I don't see any class named PrefixTerm. I'd appreciate it

RE: Combining PrefixQuery and FuzzyQuery

2010-04-19 Thread Uwe Schindler
How about a fuzzy query with a prefix term? It's configurable. - Uwe Schindler H.-H.-Meier-Allee 63, D-28213 Bremen http://www.thetaphi.de eMail: u...@thetaphi.de > -Original Message- > From: Lukas Österreicher [mailto:lukas.oesterreic...@austria.real.com] > Sent: Monday, April 19, 2

Combining PrefixQuery and FuzzyQuery

2010-04-19 Thread Lukas Österreicher
Hello. Is it possible to combine PrefixQuery and FuzzyQuery? The search on a term should be fuzzy but should also match results that just begin with that token (or an approximation of that token). If it is possible, can you give me an example of how to achieve this? Currently I only use the Pr

[ANN] Carrot2 3.3.0 released

2010-04-19 Thread Stanislaw Osinski
Dear All, We're pleased to announce the 3.3.0 release of Carrot2 which significantly improves the scalability of the clustering algorithms (up to 7 times faster clustering in case of the STC algorithm) and fixes a number of minor issues. Release notes: http://project.carrot2.org/release-3.3.0-no

Re: Payload Example for Lucene 3.0.0

2010-04-19 Thread Ajay_978
Hi, I am able to get the payload value using the TermPositions class, but now I have one issue. I have a document with doc_id and Title, and I want to get the payload for a given term in a given document, but the current APIs take only a term as input. There is no parameter for document id. API example: Term

RE: Term offsets for highlighting

2010-04-19 Thread Stephen Greene
Hi Koji, An additional question. Is it possible to access the FieldTermStack from the FastVectorHighlighter after it has been populated with matching terms from the field? I think this would provide an ideal solution for this problem, as ultimately I am only concerned with returning positiona

RE: Term offsets for highlighting

2010-04-19 Thread Stephen Greene
Hi Koji, SpanScorer was not highlighting correctly for me in 2.4.x. I have upgraded to 3.1 in the hopes of having better luck! Thank you, Steve -Original Message- From: Koji Sekiguchi [mailto:k...@r.email.ne.jp] Sent: Sunday, April 18, 2010 10:42 AM To: java-user@lucene.apache.org Sub

Re: about analyzer for searching location

2010-04-19 Thread Samarendra Pratap
Well... you are 50% right. When you write Query q = qp.parse("\"united states\""); it does search for two separate tokens, "united" and "states", but checks whether they are written sequentially. So the above search will find documents where the token "states" is written directly after "united". *Note* th
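
[Editor's sketch] The behaviour described above (separate tokens in the index, matched only when they occur next to each other) can be illustrated without Lucene. The plain-Java sketch below builds a tiny positional index and checks a phrase against it. The class name, the method names, and the whitespace-and-lowercase tokenization are assumptions made for illustration and are much simpler than what StandardAnalyzer actually does.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;

// Plain-Java illustration of phrase matching over token positions; not Lucene code.
final class PhraseMatchSketch {

    // Maps each lower-cased, whitespace-separated token to the positions where it occurs.
    static Map<String, List<Integer>> indexPositions(String text) {
        Map<String, List<Integer>> positions = new HashMap<>();
        String[] tokens = text.toLowerCase(Locale.ROOT).trim().split("\\s+");
        for (int i = 0; i < tokens.length; i++) {
            positions.computeIfAbsent(tokens[i], k -> new ArrayList<>()).add(i);
        }
        return positions;
    }

    // True if the terms occur at consecutive positions, in the given order.
    static boolean containsPhrase(Map<String, List<Integer>> positions, String... terms) {
        if (terms.length == 0) {
            return false;
        }
        List<Integer> starts = positions.get(terms[0]);
        if (starts == null) {
            return false;
        }
        for (int start : starts) {
            boolean allFollow = true;
            for (int k = 1; k < terms.length && allFollow; k++) {
                List<Integer> termPositions = positions.get(terms[k]);
                allFollow = termPositions != null && termPositions.contains(start + k);
            }
            if (allFollow) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        Map<String, List<Integer>> index = indexPositions("The United States of America");
        System.out.println(containsPhrase(index, "united", "states"));
        System.out.println(containsPhrase(index, "states", "united"));
    }
}
```

No "united states" token is ever stored in this sketch; the phrase matches only because "states" is recorded at the position right after "united".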

Re: about analyzer for searching location

2010-04-19 Thread Ian.huang
Does a "united states" token exist in the index when using the standard analyzer? My understanding is that "united" and "states" are stored separately in the index, not as "united states". So, if I build a query like Query q = qp.parse("\"united states\""); it would not return any results. Am I right? Ian --