> The JDK comes with some classes that will let you get to
> that elegantly.
You mean clumsily :-).
Bill
On Mar 8, 2005, at 5:17 PM, Chris Hostetter wrote:
Earlier in this thread...
: >>> +a -> a
: >>
: >> Hmmm this is a debatable one. It's returning a TermQuery in this
: >> case for "a". Is that appropriate? Or should it return a
: >> BooleanQuery
: >> with a single TermQuery as required?
:
Your memory is serving you well.
http://www.lucenebook.com/search?query=%22range+query%22+performance
Note the hit in section 6.5.1 - the fact that we used range queries in
the performance section is an indicator that one can really mess things
up if using range queries injudiciously. :) In parti
The version information should be included in the Manifest file inside
the Jar. The JDK comes with some classes that will let you get to
that elegantly.
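A minimal sketch of that approach, assuming the jar's manifest actually carries an Implementation-Version entry (not every build does). `ManifestVersion` is a hypothetical helper name; `Class.getPackage()` and `Package.getImplementationVersion()` are the standard JDK calls. `String.class` is only a stand-in so the example runs without Lucene on the classpath.

```java
import java.util.Optional;

// Hypothetical helper, not part of Lucene or the JDK.
final class ManifestVersion {

    /**
     * Returns the Implementation-Version recorded for the package of the
     * given class, or an empty Optional when no version is available.
     */
    static Optional<String> of(Class<?> clazz) {
        Package pkg = clazz.getPackage();   // may be null for some classes
        if (pkg == null) {
            return Optional.empty();
        }
        // getImplementationVersion() returns null when the manifest has no entry
        return Optional.ofNullable(pkg.getImplementationVersion());
    }

    public static void main(String[] args) {
        // Assumption: swap String.class for a class loaded from the jar of
        // interest, for example org.apache.lucene.index.IndexWriter.
        System.out.println("version: " + of(String.class).orElse("unknown"));
    }
}
```

Printing this once at startup would show which jar was actually picked up; "unknown" means the manifest does not carry the entry, not that the jar is missing.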
Otis
--- Paul Mellor <[EMAIL PROTECTED]> wrote:
> Hi guys,
>
> Just a quick query - is there any way that I can determine at runtime
> the
>
I needed to return my hits list in date/time order (instead of
relevancy). So, I implemented a class that converted dates to an int
and stored the integer as a field in my index. I passed a Sort object
to the IndexSearcher (indicating that the sort field was convertible to
int) to get things back
I have the need to create an index which will potentially have a
million+ documents. I know Lucene can accomplish this. However, the
other requirement is that I need to be continually updating it during
the day (adding 1-30 documents/minute). I guess I had thought that I
might try to have an ac
Earlier in this thread...
: >>> +a -> a
: >>
: >> Hmmm this is a debatable one. It's returning a TermQuery in this
: >> case for "a". Is that appropriate? Or should it return a
: >> BooleanQuery
: >> with a single TermQuery as required?
: > Ok.
: > The question how to handle BooleanQuerie
So this is just the old problem of avoiding reading large, less
frequently accessed fields when you are trying to read just the smaller
more frequently accessed fields eg titles.
You can achieve this by:
a) Modifying Lucene using something like the code I originally posted
which stops reading
works like a charm,
thanks!
as a side note, the latest patch with properly
disabled coord helped me a lot as well, made coord
usable.
--- Doug Cutting <[EMAIL PROTECTED]> wrote:
> eks dev wrote:
> > When I reindex with the lucene from the latest svn
> > snapshot, a lot of .tii files that are dele
On Tue, 8 Mar 2005 18:10:26 + (GMT), mark harwood wrote:
"to be able" != "able to be"
> OK, I thought you wanted to count terms within the
> title field. If you want to group counts on the whole
> field value change the loop in my last post to this:
>
> for(int i=0;i {
> String fiel
Chris,
Thank you - will take a look at nutch and let you/the list know
if it was a good fit for us.
On Fri, Mar 04, 2005 at 03:02:03PM -0800, Chris Hostetter wrote:
>
> If your goal is to setup a web based search interface that queries a
> lucene index containing all of the documents from your
I've been playing with the webapp and attempting to search over two
indexes that I've created. The first was 700M the second is 2.3G.
When the webapp attempts to search the second I get a
"ArrayIndexOutOfBoundsException":
java.lang.ArrayIndexOutOfBoundsException: -1
at java.util.ArrayList.get(A
On Mar 8, 2005, at 12:38 PM, Morus Walter wrote:
That reminds me of a remark Doug made in the discussion of bug
25820 (http://issues.apache.org/bugzilla/show_bug.cgi?id=25820#c7),
that it would be useful if an empty query string parses to an empty
query. So probably a check for that should be added
>>> "to be able" != "able to be"
OK, I thought you wanted to count terms within the
title field. If you want to group counts on the whole
field value change the loop in my last post to this:
for(int i=0;i
Erik Hatcher writes:
> >> I think you must have tried this in a transient state when I forgot
> >> to
> >> check in some JavaCC generated files. Try again. This one now
> >> returns
> >> an empty BooleanQuery.
> >>
> > ok.
> > I'm a bit puzzled, since I called javacc myself, so generated files
Hey Mark, thanks for the code sample. I did look into this, but for a book's
title field, for example,
"to be able" != "able to be"
and
"java programmer" != "programmer (java)" - tokenizer will remove the
parentheses
so in my use case at least, a field value isn't simply an array of its terms.
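To illustrate the distinction, here is a rough sketch in plain Java. It does not use Lucene: `terms` is a crude, hypothetical stand-in for an analyzer, and `countWholeValues` simply groups on the exact stored string. Both names are assumptions made for this example.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical illustration, not Lucene code.
final class FieldValueCounts {

    /** Counts how many hits carry each exact field value (no tokenizing). */
    static Map<String, Integer> countWholeValues(List<String> values) {
        Map<String, Integer> counts = new HashMap<>();
        for (String value : values) {
            counts.merge(value, 1, Integer::sum);
        }
        return counts;
    }

    /** Crude analyzer stand-in: lowercases and splits on non-alphanumerics. */
    static Set<String> terms(String value) {
        Set<String> result = new TreeSet<>();
        for (String token : value.toLowerCase(Locale.ROOT).split("[^a-z0-9]+")) {
            if (!token.isEmpty()) {
                result.add(token);
            }
        }
        return result;
    }
}
```

Under this stand-in, `terms("to be able")` and `terms("able to be")` produce the same set, while `countWholeValues` keeps the two titles apart, which is the behaviour the use case needs.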
eks dev wrote:
When I reindex with the lucene from the latest svn
snapshot, a lot of .tii files that are deletable
appear (checked with luke).
This is a bug I introduced yesterday. Thanks for catching it!
The term index (.tii) was not closed, and on Windows this makes it
undeleteable. I just com
sergiu gordea wrote:
So .. here is an example of how I parse a simple query string provided
by a user ...
the user checks a few flags and writes "test ko AND NOT bo"
and the resulting query.toString() is saved in the database:
+(+(subject:test description:test keywordsTerms:test koProperties:test
Your requirement was clear but I guess my suggested
solution wasn't.
Here it is in detail:
public class CountTest
{
public static void main(String[] args) throws Exception
{
RAMDirectory tempDir = new RAMDirectory();
Analyzer analyzer=new WhitespaceAnalyze
When I reindex with the lucene from the latest svn
snapshot, a lot of .tii files that are deletable
appear (checked with luke).
This was not happening with previous version using
exactly the same code for indexing.
At the end of indexing Optimize was successfully
finished.
Is this a bug?
WinXP,
Ah, I apologize. My use of the word "frequency" was misleading. By that, I
meant, the number of hits/documents, whose fields have that value. Once again:
doc a=title:1,keyword:a,contents:somelongmemoryhoggingstring
doc b=title:1,keyword:a,contents:somelongmemoryhoggingstring
doc c=title:1,keyword
Daniel Naber writes:
> On Tuesday 08 March 2005 14:46, Erik Hatcher wrote:
>
> > > Right. `a AND (NOT b)' parses to `a'
> >
> > Is this what we want to happen for a general purpose next generation
> > Lucene QueryParser though? I'm not sure. Perhaps this should be a
> > ParseException instead?
Erik Hatcher writes:
>
> On Mar 8, 2005, at 4:38 AM, Morus Walter wrote:
> >> I created a modified Query->String converter for my current day time
> >> project (as I use a String representation for the most recently used
> >> drop-down that is stored as a client-side cookie) that explicitly puts
>
On Tuesday 08 March 2005 14:46, Erik Hatcher wrote:
> > Right. `a AND (NOT b)' parses to `a'
>
> Is this what we want to happen for a general purpose next generation
> Lucene QueryParser though? I'm not sure. Perhaps this should be a
> ParseException instead?
As we have no concept of a "warnin
Hi guys,
Just a quick query - is there any way that I can determine at runtime the
version of Lucene that I am using? I'm upgrading a system from v1.3 to
v1.4.3 and I would like to be able to print out the version at startup so
that I can be sure that I have got my paths all correct and haven't
Erik Hatcher wrote:
On Mar 8, 2005, at 4:11 AM, sergiu gordea wrote:
In our project I save search strings, generated with query.toString
in the database and I reconstruct the Query at runtime.
I would appreciate if the new QueryParser will pass the following
assert:
Query query = QueryParser.pa
On Mar 8, 2005, at 4:38 AM, Morus Walter wrote:
I created a modified Query->String converter for my current day time
project (as I use a String representation for the most recently used
drop-down that is stored as a client-side cookie) that explicitly puts
in "OR" between SHOULD BooleanClauses.
You
On Mar 8, 2005, at 4:11 AM, sergiu gordea wrote:
In our project I save search strings, generated with query.toString in
the database and I reconstruct the Query at runtime.
I would appreciate if the new QueryParser will pass the following
assert:
Query query = QueryParser.parse(queryString, ana
The new TermFreqVector code sounds like what you need
here. This gives you fast access to precomputed totals
of term frequencies for each document.
See IndexReader.getTermFreqVector
Neither. :-)
4) Top 10 fieldvalues (for some fields) returned in search results
So, let's say the results of a search were:
doc a=title:1,keyword:a,contents:somelongmemoryhoggingstring
doc b=title:1,keyword:a,contents:somelongmemoryhoggingstring
doc c=title:1,keyword:b,contents:somelongmemoryhog
Why not use the MultiFieldQueryParser (look at
http://jakarta.apache.org/lucene/docs/api/org/apache/lucene/queryParser/MultiFieldQueryParser.html)?
This one allows you to specify on which field the search will be done. I
think that for your example 'lucene AND jakarta' will be transformed by the
parse
Not sure I get what the requirement is yet:
>>Here's my requirement, ..I need to perform a simple
>>"Top 10 most frequent occurring " from a
search.
Does this mean:
1)Top 10 fieldnames present in each of your matching
documents?
2)Top 10 most frequent terms found in a choice of
field?
3)Top 10
Mark,
On Tue, 8 Mar 2005 09:56:37 + (GMT), mark harwood wrote:
>> But I suppose for Document
>> has to be further subclassed so that the other
>> non-initialized fields can be obtained as well, or
>>
> I don't think Document would be the right place for
> this - as a design pattern it is cast
Hello,
Just store it in two separate fields, and prepare a query:
(title1: myquery ) OR (title2: myquery )
Substitution of myquery by your expressions will work fine.
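As a rough sketch, the substitution could be done with a small helper like the one below. `TwoTitleQuery` is a hypothetical name, and `title1`/`title2` are the field names from the query above. The inner parentheses are an assumption: they rely on the query parser's field grouping syntax so that the field prefix applies to the whole expression rather than only its first term.

```java
// Hypothetical helper, not part of Lucene.
final class TwoTitleQuery {

    /** Wraps the user's expression so it is searched in each title field separately. */
    static String build(String userExpression) {
        String grouped = "(" + userExpression + ")";
        return "(title1:" + grouped + ") OR (title2:" + grouped + ")";
    }

    public static void main(String[] args) {
        // The resulting string would then be handed to the query parser.
        System.out.println(build("lucene AND jakarta"));
    }
}
```

Because each clause is evaluated inside a single field, 'lucene AND jakarta' can only match when both words occur in the same title.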
Regards,
Jose Miguel
Romain Laboisse wrote:
>Hello,
>
>I am indexing documents which may have more than one title and I wo
Hello,
I am indexing documents which may have more than one title and I would like
to be able to search these titles separately.
For example, a document may have two titles, "Jakarta Lucene" and "Powerful
search engine".
A search on 'lucene AND jakarta' should return this document but a search on
> But I suppose for Document
> has to be further subclassed so that the other
> non-initialized fields can be obtained as well, or
I don't think Document would be the right place for
this - as a design pattern it is cast as a "value
object" or "transfer object" which is passed to
(potentially remo
Erik Hatcher writes:
> > ok.
> > I'm a bit puzzled, since I called javacc myself, so generated files
> > should
> > not matter, but if it's fixed, I don't care about what went wrong.
>
> Let me know if there is still an issue, though I added this exact case
> to TestPrecedenceQueryParser and its
2) Single term queries using +/- flags are parsed to a query without a
flag
+a -> a
Hmmm this is a debatable one. It's returning a TermQuery in this
case for "a". Is that appropriate? Or should it return a BooleanQuery
with a single TermQuery as required?
I'd prefer, if query parser parses qu
On Mar 8, 2005, at 2:29 AM, Morus Walter wrote:
Erik Hatcher writes:
Your changes look great in general, though I find some issues:
1) 'stop OR stop AND stop' where stop is a stopword gives a parse
error:
Encountered "" at line 1, column 0.
Was expecting one of:
...
...
I think you must have
39 matches