Hi, Erik, I understand your rant. :) Well, the solution I settled on
is this, as suggested by Jake and Grant.
For those stop words, when indexing content, I will treat them as normal words.
When processing the user query, there will be a normal query with stop
words skipped, and another part tha
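
A minimal sketch of the first half of that query-side step (the normal query with stop words skipped), in plain Java with no Lucene dependency. The class name, method name, and stop list are hypothetical and chosen only for illustration; a real setup would get the same effect from whichever analyzer it uses at query time.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;

public class StopWordQuerySketch {
    // Hypothetical stop list, for illustration only.
    static final Set<String> STOP_WORDS = new HashSet<String>(
            Arrays.asList("a", "an", "the", "of", "to", "be", "or", "not"));

    // Returns the lowercased query terms with stop words skipped.
    public static List<String> withoutStopWords(String userQuery) {
        List<String> terms = new ArrayList<String>();
        String normalized = userQuery.toLowerCase(Locale.ROOT).trim();
        for (String token : normalized.split("\\s+")) {
            if (!token.isEmpty() && !STOP_WORDS.contains(token)) {
                terms.add(token);
            }
        }
        return terms;
    }

    public static void main(String[] args) {
        System.out.println(withoutStopWords("The history of the web"));
        System.out.println(withoutStopWords("to be or not to be"));
    }
}
```

Note that a query made up entirely of stop words leaves no terms at all after this step, so it cannot be answered by the stop-word-free query alone.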
: I want to do something like:
:
: List infoList = new ArrayList();
: foreach (Document doc in LuceneIndex)
: {
:     String id = doc.get("Id");
:     String phone = doc.get("Phone");
:     infoList.add(new Info(id, phone));
: }
If "Id" and "Phone" are stored value
Well, whether it's a good user experience is exactly the question. I've
spent far too much time satisfying customer (or product manager)
requests that add zero value to the product *in the user's eyes*.
And I quote:
"This is asked by some customer, who may not know what's "stop words" at
all."
wh
This is asked by some customer, who may not know what's "stop words" at all.
Jake's approach should be quite similar to what some search engine
companies are doing. It'll cost some storage, but can achieve a good
user experience.
The benefit is fairly obvious in the real world. When users enter some
Two things:
1> get a copy of Luke and try to navigate to your dir and open it.
That'll tell you if you are looking in the right place.
2> Post the code snippets where you open your index for
writing and where you open it for reading. That'll give folks
something to analyze.
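
If Luke is not at hand, a rough version of check 1> can be done from plain Java. This is only a sketch: the class and method names are made up, and it assumes that a Lucene index directory contains at least one file whose name starts with "segments". Pass the same path string your code hands to the searcher.

```java
import java.io.File;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class IndexDirCheck {
    // Returns the names in dir that start with "segments".
    // The list is empty if dir is missing or is not a directory.
    public static List<String> segmentsFiles(File dir) {
        List<String> found = new ArrayList<String>();
        String[] names = dir.list(); // null when dir cannot be listed
        if (names == null) {
            return found;
        }
        Arrays.sort(names);
        for (String name : names) {
            if (name.startsWith("segments")) {
                found.add(name);
            }
        }
        return found;
    }

    public static void main(String[] args) {
        File dir = new File(args.length > 0 ? args[0] : ".");
        // The absolute path shows where a relative path really points.
        System.out.println("Resolved path: " + dir.getAbsolutePath());
        System.out.println("Is a directory: " + dir.isDirectory());
        System.out.println("Segments files: " + segmentsFiles(dir));
    }
}
```

An empty list, or a resolved path that is not the one you expected, suggests the reader and the writer are not pointed at the same place.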
Best
Erick
On Sat, M
What's your reason for trying? The whole point of stop words is that
they should be considered "no ops". That is, they add nothing to the
semantics of whatever is being processed. I don't understand the use
case for why you want to go outside that assumption.
Another way of asking this is "what t
Summary: I think there will be no real impact if you use longer field
names.
Details:
Index size will be just a tiny bit bigger. There is a single file
per segment (*.fnm) that resolves the field names into integer IDs,
then the rest of the index uses these integer IDs. So only that
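
To make the idea concrete, here is a toy model in plain Java of a name-to-ID table. It is not Lucene's actual file format or API, and the class and method names are invented for the example. It only illustrates why name length is paid for once: each distinct field name is stored a single time, and everything else refers to it by a small integer.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class FieldNameTableSketch {
    // Each distinct field name is stored once, keyed to a small integer ID.
    private final Map<String, Integer> idsByName = new LinkedHashMap<String, Integer>();

    // Returns the ID for a field name, assigning the next free ID on first use.
    public int idFor(String fieldName) {
        Integer id = idsByName.get(fieldName);
        if (id == null) {
            id = idsByName.size();
            idsByName.put(fieldName, id);
        }
        return id;
    }

    // Total characters spent on field names, independent of record count.
    public int nameTableChars() {
        int total = 0;
        for (String name : idsByName.keySet()) {
            total += name.length();
        }
        return total;
    }

    public static void main(String[] args) {
        FieldNameTableSketch longNames = new FieldNameTableSketch();
        FieldNameTableSketch shortNames = new FieldNameTableSketch();
        for (int record = 0; record < 3000; record++) {
            // Every record refers to its fields by ID, never by name.
            longNames.idFor("AA");
            longNames.idFor("BB");
            shortNames.idFor("A");
            shortNames.idFor("B");
        }
        System.out.println("Long names cost:  " + longNames.nameTableChars() + " chars");
        System.out.println("Short names cost: " + shortNames.nameTableChars() + " chars");
    }
}
```

In this model the difference between the two tables stays at a couple of characters no matter how many records are added.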
Hi,
Let's say my data source consists of records like so (the example is
Field=Value):
- AA=Value1
- BB=Value2
- CC=Value3
- DD=Value4
And let's say I have a second copy of my data, but this time it looks like so:
- A=Value1
- B=Value2
- C=Value3
- D=Value4
I.e., same
Hi,
This is my first post to this group.
I'm using Lucene 2.3 on an XP machine. I have an index of 3000 pages in a
dir named 'wcrawl' that I want to search through.
In the code, when I try to open an IndexSearcher at a specific
'Directory' location, it gives the error
FileNotFound: c:\raw\wcraw