Re: Using Lucene for user query parsing

2009-03-05 Thread Anshum
Hi Srinivas, Perhaps what you need here is a query formation logic which assigns the right keywords to the right fields. Let me know in case I got it wrong. One way to do that could be by using index time boost for fields and then running a query (so that a particular field is preferred over the o

Questions about analyzer

2009-03-05 Thread Ganesh
Hello all 1) Which is best to use, the Snowball analyzer or the Lucene contrib analyzer? Is there no built-in stop word list for the Snowball analyzer? 2) Are Analyzer and QueryParser thread-safe? Can they be created once and used in as many threads as needed? 3) I am using Snowball Analyzer to do index a

Re: indexing but not tokenizing

2009-03-05 Thread John Marks
Thank you Ian, > If you want a direct suggestion: use PerFieldAnalyzerWrapper, > specifying a different analyzer for field B. > > > -- > Ian. this makes a lot of sense. -John

Using Lucene for user query parsing

2009-03-05 Thread Srinivas Bharghav
I am trying to evaluate whether Lucene is the right candidate for the problem at hand. Say I have 3 indexes: Index 1 has street names. Index 2 has business names. Index 3 has area names. All these names can be single words or a combination of words like woodward street or marks and spencer

Re: error in code

2009-03-05 Thread Ganesh
Hello gopi, My comments. if(textFiles[i].isFile() > textFiles[i].getName().endsWith(".txt")){ && should be used. *document.add(Field.Text("content",textReader)); document.add(new Field("content", textReader)); document.add(Field.Text("path",textFiles[i].getPath()));* document.ad

error in code

2009-03-05 Thread nitin gopi
Hi all, I am getting error in running this code. Can somebody please tell me what is the problem? The code is given below. The bold lines were giving error as *cannot find symbol * import java.io.File; import java.io.FileReader; import java.io.Reader; import java.util.Date; import org.apache.

deletion of index-files fails

2009-03-05 Thread rolarenfan
So, I have a (small) Lucene index, all fine; I use it a bit, and then (on app shutdown) want to delete its files and the containing directory (the index is intended as a temp object). At some earlier time this was working just fine, using java.io.File.delete(). Now however, some of the files get

Re: execute on server and read from file

2009-03-05 Thread Erick Erickson
Uhhhm, this is the Lucene user's list, not a general Java programming thread, so unless this has something to do with Lucene I doubt you'll get much help. I'd suggest one of the Java programming language lists rather than this one. Best Erick On Thu, Mar 5, 2009 at 6:32 PM, futurpc wrote: > >

Re: Confidence scores at search time

2009-03-05 Thread Chris Hostetter
: That being said, I could see maybe determining a delta value such that if the : distance between any two scores is more than the delta, you cut off the rest : of the docs. This takes into account the relative state of scores and is not : some arbitrary value (although, the delta is, of course)
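A minimal sketch of the delta idea quoted above, in plain Java with no Lucene dependency. It assumes the scores are already sorted in descending order and reads "distance between any two scores" as the gap between neighbouring hits; the class name `ScoreCutoff`, the method `keepCount`, and the threshold value are made up for illustration and are not part of any Lucene API.

```java
public class ScoreCutoff {

    /**
     * Returns how many leading hits to keep from scores sorted in descending order.
     * Hits are kept until the drop between two neighbouring scores exceeds maxDelta.
     * (Hypothetical helper, not a Lucene method.)
     */
    public static int keepCount(float[] sortedScores, float maxDelta) {
        if (sortedScores.length == 0) {
            return 0;
        }
        for (int i = 1; i < sortedScores.length; i++) {
            float gap = sortedScores[i - 1] - sortedScores[i];
            if (gap > maxDelta) {
                return i; // cut off everything from position i onward
            }
        }
        return sortedScores.length; // no large gap found, keep all hits
    }

    public static void main(String[] args) {
        // Example scores are invented; the delta itself remains an arbitrary choice.
        float[] scores = {0.92f, 0.88f, 0.85f, 0.40f, 0.38f};
        int keep = keepCount(scores, 0.2f);
        System.out.println("keeping " + keep + " of " + scores.length + " hits");
    }
}
```

With the sample scores the first gap larger than 0.2 sits between the third and fourth hit, so three hits are kept. The cutoff follows the relative shape of the score list, but the delta still has to be picked by hand.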

RE: Confidence scores at search time

2009-03-05 Thread Chris Hostetter
: > Hmm, bugzilla has moved to JIRA. I'm not sure where the mapping is : > anymore. There used to be a Bugzilla Id in JIRA, I think. Sorry. FYI... by default the jira homepage has a form for searching by legacy bugzilla ID... https://issues.apache.org/jira/ ...if you create a Jira account

execute on server and read from file

2009-03-05 Thread futurpc
Hello. I have data files on a web server that contain some values (I need to build a chart from them). I made an applet that reads information from a file and builds the chart, but when I upload the applet to the server, it doesn't find the files. Can you please suggest how I can make a Java program that will be execu

Re: Query against newly created index.. Do not work

2009-03-05 Thread Chris Hostetter
: I can now create indexes with Nutch, and see them in Luke.. this is : fantastic news, well for me it is beyond fantastic.. : Now I would like to (need to) query them, and to that end I wrote the : following code segment. : : int maxHits = 1000; : NutchBean nutchBean

Re: similarity function

2009-03-05 Thread patrick o'leary
Sounds like your most difficult part will be the question parser using POS. This is kind of old school but use something like the AliceBot AIML library http://en.wikipedia.org/wiki/AIML Where the subjective terms can be extracted from the questions, and indexed separately. Or as Grant and others

Re: similarity function

2009-03-05 Thread Grant Ingersoll
Hi Seid, Do you have a reference for the article? I've done some QA in my day, but don't recall reading that one. At any rate, I do think it is possible to do what you are after. See below. On Mar 5, 2009, at 9:49 AM, Seid Mohammed wrote: For my work, I have read an article stating th

Re: public apology for company spam

2009-03-05 Thread Shashi Kant
Yes, it is good to learn that Yonik, Erik et al are also human-beings. :-) Thanks for all your contributions to Lucene/Solr, this list and the OSS community in general. Best, Shashi On Thu, Mar 5, 2009 at 11:36 AM, Erick Erickson wrote: > Let's see, you guys generously contributed your time and

Re: similarity function

2009-03-05 Thread Vasudevan Comandur
Hi, Since you are trying to answer factoid questions to start with, it is better to use OpenNLP components for NER (Named Entity Recognition) in the document and use those tags as part of your indexing process. Regards Vasu On Thu, Mar 5, 2009 at 8:19 PM, Seid Mohamm

Instantiating a RAMDirectory from a mutating directory

2009-03-05 Thread Kieran Topping
Hello, I would like to be able to instantiate a RAMDirectory from a directory that an IndexWriter in another process might currently be modifying. Ideally, I would like to do this without any synchronizing or locking. Kind-of like the way in which an IndexReader can open an index in a direct

Re: public apology for company spam

2009-03-05 Thread Erick Erickson
Let's see, you guys generously contributed your time and saved my butt way more than once. I *think* I can stand an inadvertent message or two ... Best Erick On Thu, Mar 5, 2009 at 10:12 AM, Glen Newton wrote: > Yonik, > > Thank-you for your email. I appreciated and accept your apology. > > Ind

Re: Re: Re: Lucene in large database contexts

2009-03-05 Thread Patrick Turcotte
mkjjyy On 8/10/07, Askar Zaidi wrote: Hey Guys, I am trying to do something similar. Make the content search-able as soon as it is added to the website. The way it can work in my scenario is that I create the Index for every new user account created. Then, whenever a new document is

Re: indexing but not tokenizing

2009-03-05 Thread Ian Lea
Hi I think that the SimpleAnalyzer you are passing to the query parser will be downcasing the X. You can fix it using an analyzer that doesn't convert to lower case, creating the query directly in code, or by using PerFieldAnalyzerWrapper, and no doubt other ways too. If you want a direct sugge
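The case mismatch described above can be simulated in plain Java, without Lucene on the classpath. This is only a sketch of the idea behind per-field analysis: a default lowercasing step for most fields and a different step for field B. The class `PerFieldAnalysisSketch` and its methods are invented for illustration and are not the PerFieldAnalyzerWrapper API.

```java
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;
import java.util.function.UnaryOperator;

public class PerFieldAnalysisSketch {

    /** Stand-in for an analyzer that lowercases its input. */
    public static final UnaryOperator<String> LOWERCASE = s -> s.toLowerCase(Locale.ROOT);

    /** Stand-in for an analyzer that leaves the text untouched. */
    public static final UnaryOperator<String> KEEP_AS_IS = s -> s;

    private final UnaryOperator<String> defaultAnalysis;
    private final Map<String, UnaryOperator<String>> perField = new HashMap<>();

    public PerFieldAnalysisSketch(UnaryOperator<String> defaultAnalysis) {
        this.defaultAnalysis = defaultAnalysis;
    }

    /** Registers a field-specific analysis step that overrides the default. */
    public void addAnalysis(String field, UnaryOperator<String> analysis) {
        perField.put(field, analysis);
    }

    /** Returns the term a query on the given field would actually search for. */
    public String queryTerm(String field, String text) {
        return perField.getOrDefault(field, defaultAnalysis).apply(text);
    }

    public static void main(String[] args) {
        String indexedTerm = "X"; // field B was indexed without analysis, so the term keeps its case

        PerFieldAnalysisSketch analysis = new PerFieldAnalysisSketch(LOWERCASE);
        // Default analysis lowercases the query text, so the lookup misses the indexed term.
        System.out.println(indexedTerm.equals(analysis.queryTerm("B", "X")));

        // With a field-specific step for B the query term keeps its case and matches.
        analysis.addAnalysis("B", KEEP_AS_IS);
        System.out.println(indexedTerm.equals(analysis.queryTerm("B", "X")));
    }
}
```

The point of the sketch is that the query text has to go through the same treatment as the indexed value for that field; the other fixes mentioned (a non-lowercasing analyzer, or building the query in code) achieve the same thing by other means.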

indexing but not tokenizing

2009-03-05 Thread John Marks
Hi all, I'm not able to see what's wrong in the following sample code. I'm indexing a document with 5 fields, using five different indexing strategies. I'm fine with the results for 4 of them, but field B is causing me some trouble in understanding what's going on. The value of field B is X (upper

Re: public apology for company spam

2009-03-05 Thread Glen Newton
Yonik, Thank-you for your email. I appreciated and accept your apology. Indeed the spam was annoying, but I think that you and your colleagues have significant social capital in the Lucene and Solr communities, so this minor but unfortunate incident should have minimal impact. That said, you and

Re: Learning Lucene

2009-03-05 Thread Erik Hatcher
On Mar 5, 2009, at 9:24 AM, Tuztuz T wrote: Dear all, I am really new to Lucene. Is there anyone who can guide me in learning Lucene? I have Lucene in Action, the old book, but I have a hard time understanding the syntax in the book and in the new Lucene release (2.4). Can anyone give me a copy of the new lu

similarity function

2009-03-05 Thread Seid Mohammed
For my work, I have read an article stating that "Answer type can be automatically constructed by Indexing Different Questions and Answer types. Later, when an unseen question appears, answer type for this question will be found with the help of 'similarity function' computation" so I am clear wit

public apology for company spam

2009-03-05 Thread Yonik Seeley
This morning, an apparently over-zealous marketing firm, on behalf of the company I work for, sent out a marketing email to a large number of subscribers of the Lucene email lists. This was done without my knowledge or approval, and I can assure you that I'll make all efforts to prevent it from ha

RE: Learning Lucene

2009-03-05 Thread Sudarsan, Sithu D.
Hi Tuztuz, Please visit the book's website and the forum. You will get most queries cleared. Sincerely, Sithu D Sudarsan -Original Message- From: Tuztuz T [mailto:tuztu...@yahoo.com] Sent: Thursday, March 05, 2009 9:24 AM To: java-user@lucene.apache.org Subject: Learning Lucene dear a

Learning Lucene

2009-03-05 Thread Tuztuz T
Dear all, I am really new to Lucene. Is there anyone who can guide me in learning Lucene? I have Lucene in Action, the old book, but I have a hard time understanding the syntax in the book and in the new Lucene release (2.4). Can anyone give me a copy of the new Lucene in Action book or any other material that i

Re: IndexSearcher

2009-03-05 Thread Erick Erickson
I think your root problem is that you're indexing UN_TOKENIZED, which means that the tokens you're adding to your index are NOT run through the analyzer. So your terms are exactly "111", "222 333" and "111 222 333", none of which match "222". I expect you wanted your tokens to be "111", "222", and
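To make the point about untokenized fields concrete, here is a small plain-Java simulation (no Lucene involved). It assumes the three field values named in the reply and treats "analysis" as a simple whitespace split, which is a simplification of what a real analyzer does; the class and method names are invented for illustration.

```java
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

public class TokenizationSketch {

    /** Untokenized: each field value becomes exactly one term. */
    public static Set<String> untokenizedTerms(List<String> values) {
        return new LinkedHashSet<>(values);
    }

    /** Tokenized: each field value is split on whitespace into separate terms. */
    public static Set<String> tokenizedTerms(List<String> values) {
        Set<String> terms = new LinkedHashSet<>();
        for (String value : values) {
            terms.addAll(Arrays.asList(value.trim().split("\\s+")));
        }
        return terms;
    }

    public static void main(String[] args) {
        List<String> values = Arrays.asList("111", "222 333", "111 222 333");

        // Terms are "111", "222 333", "111 222 333": a lookup for "222" finds nothing.
        System.out.println(untokenizedTerms(values).contains("222")); // false

        // Terms are "111", "222", "333": the same lookup now succeeds.
        System.out.println(tokenizedTerms(values).contains("222"));   // true
    }
}
```

A term query is an exact lookup against the term set, which is why the untokenized index cannot answer a search for a single word inside a longer value.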

IndexSearcher

2009-03-05 Thread liat oren
Hi, I would like to do a search that will return documents that contain a given word. For example, I created the following index: IndexWriter writer = new IndexWriter("C:/TryIndex", new StandardAnalyzer()); Document doc = new Document(); doc.add(new Field(WordIndex.FIELD_WORLDS, "111 222 333", F

Re: crawler questions..

2009-03-05 Thread adasal
That's interesting. I've been working in python recently, not crawling though. But, as ever, the more you get into it the more curious you get. Did you come up with a solution to a node error? Are you really talking about a broken link, or are you just saying the bottom of the tree has been reached

Re: Tomcat Threads are BLOCKED after some time

2009-03-05 Thread Varun Dhussa
Hi, I think it might be a case of hitting the limit on allowed open files at the OS level. Try setting a higher ulimit and run the program again. Also, what are the GC parameters you have set on the JVM? Regards Varun Dhussa Product Architect CE InfoSystems (P) Ltd http://www.mapmyindia.com damu_verse wrote: Hi Than