Re: How to get a apache public license

2009-12-23 Thread Jake Mannix
Merry Christmas to you, Weiwei. If you want to release your software under *exactly* the Apache License (version 2.0 is the most current form of it), you may do so very easily - just read the appendix at the end of this page: http://www.apache.org/licenses/LICENSE-2.0 In particular, note that

Re: Lucene memory usage

2009-12-23 Thread tsuraan
> This (very large number of unique terms) is a problem for Lucene currently. > > There are some simple improvements we could make to the terms dict > format to not require so much RAM per term in the terms index... > LUCENE-1458 (flexible indexing) has these improvements, but > unfortunately tied

Re: How to get a apache public license

2009-12-23 Thread N Hira
To you as well, Weiwei Wang. You can theoretically release your project under a license that is very similar to the Apache license at any time, presuming you are licensing rights related to your project. To create a project that is maintained by the Apache Software Foundation, you should proba

Re: solution for parent-child relationship

2009-12-23 Thread Erick Erickson
First, if you don't need to distinguish between posts and comments, you don't care about the increment gap. But if you do... It's kind of arcane, but here's the general idea. You override your Analyzer of choice and implement getPositionIncrementGap. Say your getPositionIncrementGap returns 100. W
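The effect of a position increment gap can be sketched without Lucene. The class below is a simplified model, not Lucene code: `PositionGapSketch` and `assignPositions` are hypothetical names, tokens are split on whitespace, and the exact offsets a real Analyzer produces may differ. It only shows the idea that each additional value added to the same field starts `gap` positions further along.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Simplified model (NOT Lucene code) of how token positions are assigned when
// one field receives several values and the analyzer reports a position gap.
final class PositionGapSketch {

    // Returns the position of every token, one inner list per field value.
    static List<List<Integer>> assignPositions(List<String> values, int gap) {
        List<List<Integer>> result = new ArrayList<>();
        int next = 0;               // position the next token will receive
        boolean seenTokens = false; // the gap applies only after earlier tokens
        for (String value : values) {
            if (seenTokens) {
                next += gap;        // jump ahead before each further value
            }
            List<Integer> positions = new ArrayList<>();
            for (String token : value.trim().split("\\s+")) {
                if (token.isEmpty()) {
                    continue;       // skip the empty token of a blank value
                }
                positions.add(next++);
                seenTokens = true;
            }
            result.add(positions);
        }
        return result;
    }

    public static void main(String[] args) {
        // Hypothetical data: two comments stored as two values of one field.
        List<String> values = Arrays.asList("great post thanks", "thanks for sharing");
        System.out.println(assignPositions(values, 100));
        // prints [[0, 1, 2], [103, 104, 105]]
    }
}
```

In this model the last token of the first value and the first token of the second are 101 positions apart, so a phrase or proximity query with a slop well below the gap cannot match across two values.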

RES: spellchecker

2009-12-23 Thread Mário André
Thanks! - Mário André Instituto Federal de Educação, Ciência e Tecnologia de Sergipe - IFS Mestrando em MCC - Universidade Federal de Alagoas - UFAL http://www.marioandre.com.br/ Skype: mario-fa

Re: Need help with XML Query Parser example in Lucene 3.0

2009-12-23 Thread syedfa
Ignore the previous message. I realized that I just needed to choose the right combination to get a result! Thanks again for your time and patience. Take care. Sincerely, Fayyaz

Re: Need help with XML Query Parser example in Lucene 3.0

2009-12-23 Thread syedfa
Thanks very much for your kind reply, and for pointing out my mistake. I made the correction (I can't believe I left that line out!). This was a total oversight on my part. Having said that, after making the change I re-ran the application, but now no results appear. I've tried

Re: spellchecker

2009-12-23 Thread Simon Willnauer
Hi Mário, PlainTextDictionary expects a text file with one word per line, like:
hello
world
foo
bar
simon
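A word list in that one-word-per-line layout can be produced with the standard library alone. This is a minimal sketch: `DictionaryFileSketch` and `writeWordList` are hypothetical names, UTF-8 is an assumed encoding, and the step of handing the file to PlainTextDictionary is left out.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Arrays;
import java.util.List;

// Writes a plain-text word list, one word per line, using only the JDK.
final class DictionaryFileSketch {

    // Each element of 'words' becomes one line of the target file.
    static Path writeWordList(Path target, List<String> words) throws IOException {
        return Files.write(target, words, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) throws IOException {
        // A temp file stands in for "myfile.txt" so the sketch has no side effects.
        Path file = Files.createTempFile("myfile", ".txt");
        writeWordList(file, Arrays.asList("hello", "world", "foo", "bar", "simon"));
        System.out.println(Files.readAllLines(file, StandardCharsets.UTF_8));
    }
}
```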

spellchecker

2009-12-23 Thread Mário André
Hello friends, I'm new here and to the Lucene project. I'm trying to use the "spellchecker" according to the example below:
// To index a file containing words:
spellchecker.indexDictionary(new PlainTextDictionary(new File("myfile.txt")));
String[] suggestions = spellchecker.suggestSimilar(

Re: solution for parent-child relationship

2009-12-23 Thread Deve
@Erick, you are right, I will have to stop thinking in terms of databases; that's why I wanted to discuss this. I don't understand how I can use getPositionIncrementGap. Could you provide a few more details? Thanks,

Re: solution for parent-child relationship

2009-12-23 Thread Erick Erickson
Ya just gotta stop thinking like a database guy, man. Lucene searches lots and lots of text very well. It doesn't do joins worth a darn. The moment you start thinking in terms of sub-queries, you're probably starting down the wrong track. Here's a possibility. Index each post and all associated co

Re: Lucene Index size v/s available memory

2009-12-23 Thread Erick Erickson
The size of your index isn't a very useful number without knowing a significant amount about the structure of your index. Depending upon what's stored, what's indexed and what kind of searching you're doing (e.g. sorting?) it varies. About all we can say is that you'll probably need less than 100G.

RE: Lucene going non-responsive under heavy load

2009-12-23 Thread Uwe Schindler
Your behaviour really sounds like a GC issue. I would switch the GC to verbose and see what's happening (like Mark in his blog post). - Uwe Schindler H.-H.-Meier-Allee 63, D-28213 Bremen http://www.thetaphi.de eMail: u...@thetaphi.de > -Original Message- > From: Siraj Haider [mailto:s

Re: Lucene going non-responsive under heavy load

2009-12-23 Thread Michael McCandless
Are you using IndexReader.reopen to open a new reader, from an existing one? That's much more efficient than opening a new reader. I think a good next step is to run with IndexWriter.setInfoStream on, and run your JRE with verbose GC, to see more details. Mike On Wed, Dec 23, 2009 at 9:12 AM, S

Re: Lucene going non-responsive under heavy load

2009-12-23 Thread Siraj Haider
We have dual-CPU Intel Xeon machines running "Red Hat Enterprise Linux ES release 3 (Taroon Update 6)". We have 4GB of memory on these machines, with 2GB allocated to Tomcat. After modifying the index we open a new one, warm it up, make it live and then close the old one. -siraj Michael McCandle

RE: solution for parent-child relationship

2009-12-23 Thread Rao, Vaijanath
Hi Shahid, This is just one of the ways to do it. You can assign an ID to each post and give every comment on that post a sub-ID: for example, post 1 gets ID 2566 and comment 1 on that post gets ID 2566-64. You can use various methods to generate this ID. Then you can write your own Hit Collector in whic
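The ID scheme itself can be illustrated in plain Java. This is a sketch under assumptions: the `postId-commentId` layout is taken from the 2566-64 example, `CompositeIdSketch`, `postIdOf` and `groupByPost` are hypothetical names, and the collector mentioned above is not shown because the message is cut off. It only demonstrates recovering the parent post from a hit's ID and grouping hits by post.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Groups document IDs of the assumed form "postId" or "postId-commentId"
// under their parent post. Plain Java, no Lucene classes involved.
final class CompositeIdSketch {

    // "2566-64" -> "2566"; an ID without a dash is already a post ID.
    static String postIdOf(String id) {
        int dash = id.indexOf('-');
        return dash < 0 ? id : id.substring(0, dash);
    }

    // Keeps the order in which posts are first seen among the hits.
    static Map<String, List<String>> groupByPost(List<String> hitIds) {
        Map<String, List<String>> grouped = new LinkedHashMap<>();
        for (String id : hitIds) {
            grouped.computeIfAbsent(postIdOf(id), k -> new ArrayList<>()).add(id);
        }
        return grouped;
    }

    public static void main(String[] args) {
        // Hypothetical hit list mixing posts and comments.
        List<String> hits = Arrays.asList("2566-64", "2566", "3001-2", "2566-70");
        System.out.println(groupByPost(hits));
        // prints {2566=[2566-64, 2566, 2566-70], 3001=[3001-2]}
    }
}
```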

solution for parent-child relationship

2009-12-23 Thread Shahid Faiz
Hi, Following are the details of my problem and the possible solutions I can think of. Please suggest which one I should choose, or whether there is a better approach. I want to index blog posts and their comments; in my database, posts and comments are stored in two different tables. Current

Re: Lucene Index size v/s available memory

2009-12-23 Thread Ian Lea
Hi, 24GB of RAM for a 100GB index is likely to be plenty. You don't have a huge amount of control over what Lucene loads in memory, but take a look at termInfosIndexDivisor in IndexReader. And I believe that omitting field norms (Field.setOmitNorms) may help too. Googling for "lucene memory usage"

Re: Need help with XML Query Parser example in Lucene 3.0

2009-12-23 Thread mark harwood
Hi Fayyaz, >>I have found an error in the web.xml file, Good job! I found an error in your code so that makes us even :) It looks like you removed the line in the "openExampleIndex" method which opens the searcher. That explains your null pointer. The problem you found in the web.xml isn't a