Thanks Chris, I'll try that.
Bye
- Original Message -
From: "Chris Hostetter" <[EMAIL PROTECTED]>
To:
Sent: Tuesday, July 25, 2006 3:46 AM
Subject: Re: queryParser and sorting question
1) subclass DefaultSimilarity so that tf/idf always return either
0 or 1 (more info on this can
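A minimal sketch of option (1), written against the Lucene 1.9/2.0 Similarity API; the class name FlatSimilarity is made up for this example.

```java
import org.apache.lucene.search.DefaultSimilarity;

public class FlatSimilarity extends DefaultSimilarity {
    // A term either matches (1) or it doesn't (0); frequency is ignored.
    public float tf(float freq) {
        return freq > 0 ? 1.0f : 0.0f;
    }
    // Term rarity no longer boosts the score.
    public float idf(int docFreq, int numDocs) {
        return 1.0f;
    }
}
```

Install it with searcher.setSimilarity(new FlatSimilarity()), and on the IndexWriter as well if you want index-time norms to match.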
The query "foo NOT bar AND baz" seems to be interpreted as "+foo -(+bar
+baz)" (using default operator AND). Is this a bug, or a feature?
The query "foo bar OR baz" seems to be interpreted as "+foo bar baz", even
when using default operator AND! "foo AND bar OR baz" on the other hand is
interpreted as "(+foo +bar) baz", as expected.
On Tuesday 25 July 2006 04:05, Namit Yadav wrote:
> 1 List SMSIDs of all the SMSes that a phone number had sent (Each SMS
> message will have a globally unique ID)
> 2 List SomeData1, SomeData2, SomeData3 and SomeData4 for a given SMSID.
>
> How can I do this efficiently?
Short answer: use a relational database
On Jul 25, 2006, at 3:11 AM, Eric Jain wrote:
The query "foo NOT bar AND baz" seems to be interpreted as "+foo -
(+bar +baz)" (using default operator AND). Is this a bug, or a
feature?
It's been a while since I've touched PrecedenceQueryParser, but I
recall there still being some issues wit
Hi All,
Can anybody help me out on this ..?
I have to search for a particular value over multiple fields and need to
know if grouping is allowed over multiple fields
eg.
AND ( AUTHOR_NAME:krish OR EMPLOYEE_NAME:krish )
Introducing a parenthesis "(" is giving me a lexical error
Thanks and
Krishnendra Nandi wrote:
Can anybody help me out on this ..?
I have to search for a particular value over multiple fields and need to
know if grouping is allowed over multiple fields
eg.
AND ( AUTHOR_NAME:krish OR EMPLOYEE_NAME:krish )
Introducing a parenthesis "(" is giving me a lexical error
On Mon, 2006-07-24 at 21:16 -0400, Yonik Seeley wrote:
> > > I can't figure out what the parameters do. ;)
>
> Hopefully the wiki link I gave before will explain the parameters.
Oh, I so totally missed that. Do you want me to java-doc it up and send
you the patch?
--
Hi,
I went through the IndexModifier class. It says that - Although an instance
of this class can be used from more than one thread, you will not get the best
performance. You might want to use IndexReader and IndexWriter directly for
that (but you will need to care about synchronization yourself).
Indexing 1M of logs shouldn't take minutes, so you're probably right.
A problem I've seen is opening/indexing/closing your index writer too often.
You should do something like... (rough pseudo code here: open the writer
once, add everything, close once)
IndexWriter writer = new IndexWriter(indexDir, analyzer, false);
for (each of the lots and lots of records) {
    writer.addDocument(doc);
}
writer.close();
Hello,
I am looking for a way to limit the number of search results I retrieve when
searching.
I am only interested in (let's say) the first ten hits of a query.. maybe I
want to look at hits ten..twenty to, but usually only the first results are
important.
Right now lucene searches through th
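A sketch of one way to do this with the Lucene 1.9/2.0 API: ask the searcher for only the top N hits up front. The path, field names, and query are illustrative.

```java
IndexSearcher searcher = new IndexSearcher("/path/to/index");
Query query = new TermQuery(new Term("body", "lucene"));

TopDocs top = searcher.search(query, (Filter) null, 10); // at most 10 results
for (int i = 0; i < top.scoreDocs.length; i++) {
    Document doc = searcher.doc(top.scoreDocs[i].doc);
    System.out.println(doc.get("title"));
}
searcher.close();
```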
On 7/25/06, karl wettin <[EMAIL PROTECTED]> wrote:
On Mon, 2006-07-24 at 21:16 -0400, Yonik Seeley wrote:
> > > I can't figure out what the parameters do. ;)
>
> Hopefully the wiki link I gave before will explain the parameters.
Oh, I so totally missed that. Do you want me to java-doc it up a
headhunter wrote:
I am looking for a way to limit the number of search results I retrieve when
searching.
I am only interested in (let's say) the first ten hits of a query.. maybe I
want to look at hits ten..twenty to, but usually only the first results are
important.
Right now lucene search
Hi Yonik,
>> I can't figure out what the parameters do. ;)
>
> Yes, it will fail without slop... I don't think there is a practical
> way around that.
I am trying to analyze your WordDelimiterFilter.
If I have x-men, after analyzing (with catenateAll) I get this:
Analyzing "The x-men story
If I use IndexReader and IndexWriter class for inserts/updates, then I need
to handle the threading issues myself. Is there any other class (even in
nightly build) that I can use without having to take care of synchronization.
All this means is your code must ensure only one "writer" (Ind
Hi all,
I need some help from the Lucene experts because I couldn't find the
best solution for a problem...
The problem: we have article entities which can have multiple keywords:
- article #1: keyword #1, keyword#2, keyword#3
- article #2: keyword#2, keyword#3
- article #3: keyword#3
- article
Hi All
I am trying to match accented characters with non-accented characters in
French/Spanish and other Western European languages. The use case is that the
users may type letters without accents in error and we still want to be able to
retrieve valid matches. The one idea, albeit naïve,
On 7/25/06, Martin Braun <[EMAIL PROTECTED]> wrote:
Hi Yonik,
>> I can't figure out what the parameters do. ;)
>
> Yes, it will fail without slop... I don't think there is a practical
> way around that.
I am trying to analyze your WordDelimiterFilter.
If I have x-men, after analyzing (with c
Thanks Mike for the reply. I will look into Lucene in Action.
I am not very good at threading. So I was looking if there is any api class
(even in nightly builds) on top of the IndexReader/IndexWriter that takes care
of concurrency rules.
Every developer must be facing this problem o
On Tue, 2006-07-25 at 11:42 -0400, Yonik Seeley wrote:
> > > Yes, it will fail without slop... I don't think there is a
> > > practical way around that.
It would of course be much easier if Lucene supported multiple token
dimensions instead of position increment only.
> the x-men are here
>
I am not very good at threading. So I was looking if there is any api class (even in nightly builds) on top of the IndexReader/IndexWriter that takes care of concurrency rules.
This is exactly why IndexModifier was created (so you wouldn't have to
worry about the details of closing/opening In
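A sketch of the typical update pattern with IndexModifier (Lucene 1.9/2.0), which is internally synchronized; the path and field names are illustrative.

```java
IndexModifier modifier =
    new IndexModifier("/path/to/index", new StandardAnalyzer(), false);

modifier.deleteDocuments(new Term("id", "42"));   // drop the stale copy
Document doc = new Document();
doc.add(new Field("id", "42", Field.Store.YES, Field.Index.UN_TOKENIZED));
doc.add(new Field("body", "updated text", Field.Store.YES, Field.Index.TOKENIZED));
modifier.addDocument(doc);                        // add the replacement
modifier.close();
```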
I want to copy a selection of documents from one index to another. I can
get the Document objects from the IndexReader and write them to the
target index using the IndexWriter. The problem I have is this loses
fields that have not been stored, is there a way round this.
Thanks
Mike
www.
Thanks Mike. Your explanation was really helpful.
I would use the IndexModifier class till the new IndexWriter class comes up.
Thanks once again.
-Vasu
Michael McCandless <[EMAIL PROTECTED]> wrote:
> I am not very good at threading. So I was looking if there is any api class
(eve
Thanks for the suggestion, Erick!
As for why we can't use a relational database, we get all the logs
from an external application. And due to the nature of the business,
we need to continue maintaining the logs. Moreover, the search
requests are very infrequent .. so it doesn't make sense to (alm
Rajan, Renuka wrote:
I am trying to match accented characters with non-accented characters in French/Spanish and
> other Western European languages.
ISOLatin1AccentFilter should do the job, though it works with single
characters only, so "a umlaut" will match "a" but not "ae".
--
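A sketch of an analyzer chain ending in ISOLatin1AccentFilter (available since Lucene 1.9), so that "école" and "ecole" index to the same term; the class name AccentFoldingAnalyzer is made up, and the same analyzer must be used at index and query time.

```java
public class AccentFoldingAnalyzer extends Analyzer {
    public TokenStream tokenStream(String fieldName, Reader reader) {
        TokenStream stream = new StandardTokenizer(reader);
        stream = new StandardFilter(stream);
        stream = new LowerCaseFilter(stream);
        return new ISOLatin1AccentFilter(stream); // "é" -> "e", "ü" -> "u", ...
    }
}
```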
(Seems 1.9 javadoc could be just a bit more clear on this.)
The following should do the work:
QueryParser qp = new MultiFieldQueryParser(fields, analyzer);
Query q = qp.parse(qtext);
Notice the difference in semantics as explained in the "deprecated" comment
in 1.9.
Also see the setDefaultOpe
Rajan, Renuka wrote:
> I am trying to match accented characters with non-accented characters
> in French/Spanish and other Western European languages. The use case
> is that the users may type letters without accents in error and we
> still want to be able to retrieve valid matches. The one idea,
I think the problem might be in the part of the query before the "AND".
At least with Lucene 2.0, parsing result is as expected -
String qtxt = "some text AND ( AUTHOR_NAME:krish OR EMPLOYEE_NAME:krish )";
Query q = new QueryParser("field", new WhitespaceAnalyzer()).parse(qtxt);
System.out.println(q);
--> field:some +
Hi,
I have 3 index files which I access over a ParallelReader.
When I make a search, everything works fine, but when I want to make a
search and sort by a particular column I get an error. Here is my code:
Snip
Dim field As SortField = New SortField("Streetname")
Dim sortByName As
Just realized that the free-text part should also be grouped, so checked
that this variation also works:
qtxt = "some text AND ( AUTHOR_NAME:krish OR EMPLOYEE_NAME:krish )";
---> field:some +field:text +(AUTHOR_NAME:krish EMPLOYEE_NAME:krish)
qtxt = "(some text) AND ( AUTHOR_NAME:krish OR EMPLOYEE_NAM
The code looks good, *assuming* that the IndexWriter you pass in isn't
closed/opened between files (this would be a problem if you have lots of
files to index..). I've had the IndexWriter.optimize method take a
very long time to complete, so I typically don't do this until I'm entirely
done...
: I want to copy a selection of documents from one index to another. I can
: get the Document objects from the IndexReader and write them to the
: target index using the IndexWriter. The problem I have is this loses
: fields that have not been stored, is there a way round this.
there is no easy way.
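What you can do is copy the stored fields only, under the caveat that anything indexed but not stored is lost, and stored text gets re-analyzed on the way back in. A sketch with illustrative paths:

```java
IndexReader reader = IndexReader.open("/path/to/source");
IndexWriter writer =
    new IndexWriter("/path/to/target", new StandardAnalyzer(), true);
for (int i = 0; i < reader.maxDoc(); i++) {
    if (reader.isDeleted(i)) continue;        // skip deleted slots
    writer.addDocument(reader.document(i));   // stored fields only
}
writer.close();
reader.close();
```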
1) please do not cross post to more than one lucene mailing list. the
appropriate place for questions about using the Java Lucene library is
"java-user"
2) if you want the counts of all documents matching each keyword, then the
TermEnum.docFreq method can solve all of your problems.
if you want to know the counts in the context of a search that has
narrowed the document space, the problem becomes trickier
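Point (2) above can be sketched like this: enumerate every term in the keyword field and print its document frequency. The path and field name are illustrative.

```java
IndexReader reader = IndexReader.open("/path/to/index");
TermEnum terms = reader.terms(new Term("keyword", ""));
try {
    do {
        Term t = terms.term();
        if (t == null || !t.field().equals("keyword")) break; // past the field
        System.out.println(t.text() + ": " + terms.docFreq());
    } while (terms.next());
} finally {
    terms.close();
    reader.close();
}
```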
Hi Chris,
Sorry for the cross posting, but I'm a newbie in the lucene community --
thanks for your instructions!
if you want to know the counts in the context of a search that has
narrowed the document space, the problem becomes trickier -- you'll find
discussions on it if you search the archives
neils wrote:
> Hi,
>
> i have 3 indexfiles which i access over a parallelreader.
>
> When I make a search, everything works fine, but when I want to make a
> search and sort by a particular
> column I get an error.
You need to say exactly what the error is, right? Or else we won't
know, hm
Steven Rowe wrote:
> And, por supuesto, posting what appears to be Visual Basic code
> (presumably to be used with Lucene.Net) to an explicitly *Java* list
> (dude, the name of the list is "java-user") may be prove fruitful than
> you might hope
That should read: ... may prove *less* fruitful
Rajan, Renuka wrote:
I am trying to match accented characters with non-accented characters in French/Spanish and other Western European languages. The use case is that the users may type letters without accents in error and we still want to be able to retrieve valid matches. The one idea, albeit
Few comments -
> (from first posting in this thread)
> The indexing was taking much more than minutes for a 1 MB log file. ...
> I would expect to be able to index at least a of GB of logs within 1 or 2
minutes.
1-2 minutes per GB would be 30-60 GB/Hour, which for a single machine/jvm
is a lot -
Hi Steve,
thanks a lot for your help. I think the problem will be that no single terms are
stored in that field. So I will take a look and make some further tests.
Regarding your advice to the "java-user" list, I think it is no problem to
send a short vb code to describe the problem. This group is (
hey doron, I solved the problem with
for (String field : fields) {
QueryParser qp = new QueryParser(field, SearchEngine.ANALYZER);
fieldsQuery.add(qp.parse(string), BooleanClause.Occur.SHOULD);
}
that seems to have the exact same effect as your suggestion
MultiFieldQuery
On Jul 25, 2006, at 7:35 PM, Paulo Silveira wrote:
hey doron, I solved the problem with
for (String field : fields) {
QueryParser qp = new QueryParser(field, SearchEngine.ANALYZER);
fieldsQuery.add(qp.parse(string), BooleanClause.Occur.SHOULD);
}
I believe that this will cause difficulties with prohibited terms.
Hello Miles,
thanks for your answer.
I guess the recommended way to implement paging of results is to do your own
query-results caching, right? Or does lucene also do this for me?
Johannes
On Wednesday 26 July 2006 07:55, headhunter wrote:
> I guess the recommended way to implement paging of results is to do your
> own query-results caching, right?
http://wiki.apache.org/jakarta-lucene/LuceneFAQ#head-81ddcb6ef8573197a77e0c7b56b44cb27e6d7f09
--
http://www.danielnaber.de
--
Marvin Humphrey wrote:
I believe that this will cause difficulties with prohibited terms. Say
you have these two documents...
Doc 1:
title: a
body: foo
Doc 2:
title: b
body: bar
It's not just prohibited terms. Happens for required terms too. A
search
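To make the pitfall concrete, a sketch against the 1.9/2.0 QueryParser (field names and query string are illustrative; ParseException handling omitted):

```java
String q = "foo -bar";
String[] fields = {"title", "body"};

BooleanQuery combined = new BooleanQuery();
for (int i = 0; i < fields.length; i++) {
    QueryParser qp = new QueryParser(fields[i], new WhitespaceAnalyzer());
    combined.add(qp.parse(q), BooleanClause.Occur.SHOULD);
}
// combined is roughly: (title:foo -title:bar) (body:foo -body:bar)
// A document with "foo" in its body but "bar" in its TITLE still matches
// through the body clause -- the prohibition silently became per-field.
```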
Hello,
this really doesn't answer my question ;)
I've indeed read the FAQ (though I couldn't believe this point ;) .
Is it recommended to do the search again - discarding the uninteresting
values - because lucene caches the results, or just because lucene is so
damn fast?
Johannes
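For what it's worth, the usual paging sketch just re-runs the query and skips; Hits fetches documents lazily and in batches, so skipping the first ten is cheap. Field name is illustrative.

```java
Hits hits = searcher.search(query);
int start = 10;
int end = Math.min(start + 10, hits.length());
for (int i = start; i < end; i++) {
    Document doc = hits.doc(i);   // fetched only when asked for
    System.out.println(doc.get("title"));
}
```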