date:20080322

Re: factor in stopwords when searching

2008-03-22 Thread Chris Lu

Hi, Erik, I understand your rant. :) Well, the solution I settled on is this, as suggested by Jake and Grant. For those stop words, when indexing content, I will treat them as normal words. When processing the user query, there will be a normal query with stop words skipped, and another part tha
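A minimal sketch of the first query part described above, the "normal query with stop words skipped". It uses plain Java collections rather than Lucene's analyzers, and the class name, method name, and stop-word list are hypothetical, chosen only for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;

// Illustration only: not Lucene code and not the poster's implementation.
class StopWordQuery {
    // Hypothetical stop-word list; a real deployment would use its own.
    static final Set<String> STOP_WORDS =
            new HashSet<>(Arrays.asList("a", "an", "the", "of", "to", "in"));

    // Lower-cases the query, splits on whitespace, and drops stop words.
    static List<String> withoutStopWords(String query) {
        List<String> kept = new ArrayList<>();
        for (String token : query.trim().toLowerCase(Locale.ROOT).split("\\s+")) {
            if (!token.isEmpty() && !STOP_WORDS.contains(token)) {
                kept.add(token);
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        // Terms that would feed the stop-word-free part of the query.
        System.out.println(withoutStopWords("The Lord of the Rings"));
    }
}
```

Because the stop words are indexed as normal words in this scheme, the original, unfiltered query text is still available for any additional matching the application wants to do.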

RE: Field values ...

2008-03-22 Thread Chris Hostetter

: I want to do something like:
:
: List infoList = new ArrayList ();
: foreach (Document doc in LuceneIndex)
: {
:   String id = doc.get ("Id");
:   String phone = doc.get ("Phone");
:   infoList.add (new Info (id, phone));
: }

If "Id" and "Phone" are stored value

Re: factor in stopwords when searching

2008-03-22 Thread Erick Erickson

Well, whether it's a good user experience is exactly the question. I've spent far too much time satisfying customer (or product manager) requests that add zero value to the product *in the user's eyes*. And I quote: "This is asked by some customer, who may not know what's "stop words" at all." wh

Re: factor in stopwords when searching

2008-03-22 Thread Chris Lu

This is asked by some customer, who may not know what's "stop words" at all. Jake's approach should be quite similar to what some search engine companies are doing. It'll cost some storage, but can achieve a good user experience. The benefit is kind of obvious in the real world. When users enter some

Re: Access Denied in opening IndexSearcher

2008-03-22 Thread Erick Erickson

Two things: 1> get a copy of Luke and try to navigate to your dir and open it. That'll tell you if you are looking in the right place. 2> Post the code snippets where you open your index for writing and where you open it for reading. That'll give folks something to analyze. Best Erick On Sat, M
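Alongside opening the directory in Luke, a quick programmatic check can confirm that the path the code uses is the one intended. The following is a sketch using only the standard java.nio.file API, not Lucene; the class and method names are hypothetical.

```java
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

// Illustration only: checks the directory itself, not the index contents.
class IndexDirCheck {
    // Returns a short status string for the candidate index directory.
    static String describe(Path dir) {
        if (!Files.exists(dir)) return "missing";
        if (!Files.isDirectory(dir)) return "not a directory";
        if (!Files.isReadable(dir)) return "not readable";
        return "ok";
    }

    public static void main(String[] args) {
        // Pass the index directory as the first argument; defaults to ".".
        Path dir = Paths.get(args.length > 0 ? args[0] : ".");
        System.out.println(dir.toAbsolutePath() + ": " + describe(dir));
    }
}
```

Printing the absolute path helps catch the case where a relative path resolves somewhere other than expected.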

Re: factor in stopwords when searching

2008-03-22 Thread Erick Erickson

What's your reason for trying? The whole point of stop words is that they should be considered "no ops". That is, they add nothing to the semantics of whatever is being processed. I don't understand the use case for why you want to go outside that assumption. Another way of asking this is "what t

Re: Field name size and index size

2008-03-22 Thread Michael McCandless

Summary: I think there will be no real impact if you use longer field names. Details: Index size will be just a tiny bit bigger. There is a single file per segment (*.fnm) that resolves the field names into integer IDs, then the rest of the index uses these integer IDs. So only that
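The idea of resolving field names to integer IDs once can be pictured with a toy table. This is a sketch in plain Java, not Lucene's actual *.fnm format or API; the class and method names are hypothetical.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Toy model only: NOT Lucene's .fnm file format or any Lucene class.
class FieldNameTable {
    private final Map<String, Integer> ids = new LinkedHashMap<>();

    // Returns the existing ID for a field name, or assigns the next free one.
    int idFor(String fieldName) {
        Integer id = ids.get(fieldName);
        if (id == null) {
            id = ids.size();
            ids.put(fieldName, id);
        }
        return id;
    }

    // Number of distinct field names seen so far.
    int size() {
        return ids.size();
    }

    // Characters spent on names: each distinct name is counted once.
    int nameChars() {
        int total = 0;
        for (String name : ids.keySet()) {
            total += name.length();
        }
        return total;
    }

    public static void main(String[] args) {
        FieldNameTable table = new FieldNameTable();
        String[] fields = {"AA", "BB", "CC", "DD"};
        // 1000 records reuse the same four names; the table does not grow.
        for (int record = 0; record < 1000; record++) {
            for (String f : fields) {
                table.idFor(f);
            }
        }
        System.out.println(table.size() + " names, " + table.nameChars() + " chars");
    }
}
```

In this model, the cost of a longer name is paid once per distinct field, not once per record, which matches the summary above.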

Field name size and index size

2008-03-22 Thread John

Hi, Let's say my data source consists of records like so (the example is Field=Value): AA=Value1, BB=Value2, CC=Value3, DD=Value4. And let's say I have a second copy of my data, but this time it looks like so: A=Value1, B=Value2, C=Value3, D=Value4. I.e., same

Access Denied in opening IndexSearcher

2008-03-22 Thread Jeet Singh

Hi, This is my first post to this group. I'm using Lucene 2.3 on an XP machine. I have an index of 3000 pages in a dir named 'wcrawl' that I want to search through. In the code, when I try to open an IndexSearcher at a specific 'Directory' location, it gives a FileNotFound error: c:\raw\wcraw


9 matches
