Sorry, I misunderstood.
As Karsten suggested, perform a search for each term and apply the logical
operation to the collected hits.
Regards
Ganesh
- Original Message -
From: <[EMAIL PROTECTED]>
To:
Sent: Tuesday, October 14, 2008 6:20 PM
Subject: Re: Searching sets of documents
The p
Yes, StringIndex's public fields make life awkward. Re initialization - I did
think you could try using arrays of byte arrays. The first 256 terms can be
addressed using just one byte array; on encountering a 257th term, an extra byte
array is allocated. References to terms then require indexing into
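A minimal sketch of the "arrays of byte arrays" idea described above, in plain Java. This is not Lucene code: the class BytePlaneOrdinals and its methods are hypothetical names, and it assumes term ordinals are non-negative ints stored per document, one byte of the ordinal per array.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical per-document term-ordinal store that grows one byte array at a time. */
class BytePlaneOrdinals {
    private final int maxDoc;
    // planes.get(k)[doc] holds byte k (least significant first) of that doc's ordinal
    private final List<byte[]> planes = new ArrayList<byte[]>();

    BytePlaneOrdinals(int maxDoc) {
        this.maxDoc = maxDoc;
        planes.add(new byte[maxDoc]); // a single array covers ordinals 0..255
    }

    void set(int doc, int ordinal) {
        if (ordinal < 0) {
            throw new IllegalArgumentException("ordinal must be non-negative");
        }
        // Allocate an extra array only when the ordinal no longer fits,
        // e.g. when the 257th term (ordinal 256) is first seen.
        while (planes.size() < 4 && (ordinal >>> (8 * planes.size())) != 0) {
            planes.add(new byte[maxDoc]);
        }
        for (int k = 0; k < planes.size(); k++) {
            planes.get(k)[doc] = (byte) (ordinal >>> (8 * k));
        }
    }

    int get(int doc) {
        // Reading a reference means indexing into every allocated array.
        int ordinal = 0;
        for (int k = 0; k < planes.size(); k++) {
            ordinal |= (planes.get(k)[doc] & 0xFF) << (8 * k);
        }
        return ordinal;
    }

    int planeCount() {
        return planes.size();
    }
}
```

The trade-off is that lookups touch one array per allocated byte, in exchange for not paying for a full int per document when the field has few distinct terms.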
http://www.blardone.org/2008/10/12/lucene-query-accented-character/
It is specific to PHP, but can easily be used to solve the same problem
in Java.
I had the same problem as "Christophe from paris", and changing the query to
its HTML-encoded equivalent makes my search queries work.
So Perh
Hi Spring,
If I got your question correctly, you want to search for Folders/Docs
depending on the condition, right?
Why don't you index the folder name as well, so you could fire a query
saying
Folder:A AND (TEXT:x AND TEXT:y)
So here the search would run only on folder A for the keywords.
In
Hi All,
I am a beginner to Lucene and I am trying to use Lucene 2.4.
I have set lucene-core-2.4.0.jar & lucene-demos-2.4.0.jar in my CLASSPATH.
When I try to run:
java org.apache.lucene.demo.IndexFiles E:\prabina\lucene-2.4demo\src
it shows the error:
caught a class java.io.FileNotFou
Akanksha Baid wrote:
I have indexed multiple documents - each of them has 3 fields (id, tag,
text). Is there an easy way to determine the set of tags for a given
query without iterating through all the hits?
For example if I have 100 documents in my index and my set of tag = {A,
B, C}. Query
You could go through this implementation. I have been using it (with some
improvisation) for a while now. There might be better ways to do this too, so
you could check!
http://www.gossamer-threads.com/lists/lucene/java-user/35704?search_string=categorycounts;#35704
--
Anshum Gupta
Naukri Labs!
http://ai-cafe.blo
Hi, You may also want to take a look at Carrot2:
http://demo.carrot2.org/demo-stable/main
Lucene documentation references them, but I was disappointed to see that
they had an open source version (really old) and one that you can buy. It
may work for you.
Also, take a look at SOLR's implementatio
: 3> maybe you could provide a custom sorter by using
: SortComparator, although you should look at the warnings
: in the API.
:
: Now I'll wait for Hoss to say "Isn't that what XXX provides" ...
I can't think of anything that would solve this problem directly, mainly
because I can't think of a
: Could one of you point me to an example of code for querying without using
: the deprecated class Hits ?
The demo code included with Lucene releases was updated in Lucene 2.4 so
that it does not use the Hits class.
-Hoss
: For example if I have 100 documents in my index and my set of tag = {A, B, C}.
: Query Q on the text field returns 15 docs with tag A , 10 with tag B and none
: with tag C (total of 25 hits). Is there a way to determine that the set of
: tags for query Q = {A, B} without iterating through all 25
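One common way to approach the question quoted above is to keep a bit set of document ids per tag and intersect each with the bit set of hits, so the cost depends on the number of tags rather than the number of hits. The sketch below is plain Java, not Lucene API: the class TagFacets and its methods are hypothetical, and it assumes the hits for query Q are already available as a java.util.BitSet of doc ids.

```java
import java.util.BitSet;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

/** Hypothetical helper: one bit set of doc ids per tag, built once at startup. */
class TagFacets {
    private final Map<String, BitSet> docsByTag = new LinkedHashMap<String, BitSet>();

    /** Record that the document with this id carries this tag. */
    void addDocument(int docId, String tag) {
        BitSet bits = docsByTag.get(tag);
        if (bits == null) {
            bits = new BitSet();
            docsByTag.put(tag, bits);
        }
        bits.set(docId);
    }

    /** Return every tag that has at least one document in the hit set. */
    Set<String> tagsFor(BitSet hits) {
        Set<String> tags = new TreeSet<String>();
        for (Map.Entry<String, BitSet> e : docsByTag.entrySet()) {
            if (e.getValue().intersects(hits)) {
                tags.add(e.getKey());
            }
        }
        return tags;
    }
}
```

With tags {A, B, C} this does three intersection checks per query regardless of whether Q matches 25 or 25,000 documents; whether that beats walking the hits depends on how many distinct tags there are.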
Hello, I am using the reopen method in the IndexReader class. In the case
of the IndexReader being updated, I would like to create a new IndexSearcher
and close the old IndexReader. When closing an instance of IndexReader, do I
have to wait for currently executing searches (through an IndexSearche
So, it appears to me that the criteria for a "good suggestion" is the n-gram
overlap of a given term, not the edit distance.
Thus, if we're looking for "britney", but we mess up and type "birtney",
"kortney" will come up before "birtney."
Is there a way to force the SpellChecker to use the edit
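As an illustration of ranking by edit distance, here is a plain-Java sketch that re-sorts a list of suggestions after they come back. It does not use or modify the SpellChecker class; EditDistanceRanker is a hypothetical helper. Note one assumption: with plain Levenshtein distance both "britney" and "kortney" are 2 edits away from "birtney", so the sketch uses a transposition-aware variant (optimal string alignment) instead.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.Comparator;
import java.util.List;

/** Hypothetical helper that re-sorts suggestions by edit distance to the typed word. */
class EditDistanceRanker {

    /** Optimal string alignment distance: insert, delete, substitute, swap neighbours. */
    static int distance(String a, String b) {
        int[][] d = new int[a.length() + 1][b.length() + 1];
        for (int i = 0; i <= a.length(); i++) d[i][0] = i;
        for (int j = 0; j <= b.length(); j++) d[0][j] = j;
        for (int i = 1; i <= a.length(); i++) {
            for (int j = 1; j <= b.length(); j++) {
                int cost = a.charAt(i - 1) == b.charAt(j - 1) ? 0 : 1;
                int best = Math.min(d[i - 1][j] + 1, d[i][j - 1] + 1);
                best = Math.min(best, d[i - 1][j - 1] + cost);
                // adjacent transposition, e.g. "ir" <-> "ri"
                if (i > 1 && j > 1
                        && a.charAt(i - 1) == b.charAt(j - 2)
                        && a.charAt(i - 2) == b.charAt(j - 1)) {
                    best = Math.min(best, d[i - 2][j - 2] + 1);
                }
                d[i][j] = best;
            }
        }
        return d[a.length()][b.length()];
    }

    /** Return a copy of the suggestions, closest to the typed word first. */
    static List<String> rerank(final String typed, List<String> suggestions) {
        List<String> sorted = new ArrayList<String>(suggestions);
        Collections.sort(sorted, new Comparator<String>() {
            public int compare(String x, String y) {
                return distance(typed, x) - distance(typed, y);
            }
        });
        return sorted;
    }
}
```

The sort is stable, so suggestions at the same distance keep the order they arrived in.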
: http://www.blardone.org/2008/10/12/lucene-query-accented-character/
that post appears to be specifically about a PHP function to convert UTF-8
characters to their HTML equivalents ... which doesn't really seem relevant
to the poster's question ...
: > I'm use FrenchAnalyzer for index
..
: Have a look at SOLR (*lucene.apache.org/solr*). It is based on Lucene and
: provides additional functionalities including faceted search.
to more generally answer your question: while you may not find a lot of
info searching for "lucene histogram" you should find lots of info about
achieving
: Actually looking at this a little deeper maybe Lucene could/should
: automatically be doing this "short" optimisation here?
At the moment it can't, the arrays in StringIndex are public.
The other thing that would be a bit tricky is the initialization ... I
can't think of any easy way to kn
On 15/10/2008, at 7:37 AM, Chris Gilliam wrote:
Hello Everyone,
New to Lucene..
We currently have roughly 100 GB of log files. We need to build a search
application that can return rows of data from the files and combine the
results.
Does Lucene index the content in the files?
Will i
I'm working on indexing JSON documents via Lucene and I've run into a
bit of a snag. Currently, I'm indexing JSON documents by adding fields
that are path/value pairs. For example, given a JSON document like:
{
"name":
{
"first": "Paul",
"last": "Davis"
}
"jobs": ["hotdog v
Hello Everyone,
New to Lucene..
We currently have roughly 100 GB of log files. We need to build a search
application that can return rows of data from the files and combine the
results.
Does Lucene index the content in the files?
Will it be able to find matching criteria say a date and then
Is there something I could do to index the documents differently to
accomplish this? Currently I am looking at all the hits to generate the
set of tags for the query.
If I need to implement the same thing within Lucene, I am not sure if I
will gain anything performance wise. Or am I wrong about
Related:
https://issues.apache.org/jira/browse/LUCENE-725
I was wondering if the Lucene SpellChecker class was threadsafe,
specifically, indexDictionary().
Such that:
for (int i = 0; i < numReaders; i++) {
    // spawn new thread to run:
    spellchecker.indexDictionary(new LuceneDictionary(readers[i],
        myField));
}
Would work.
Thanks,
Matt
--
Vie
You're on the right track I think... perhaps try using RangeFilter
directly rather than creating your own class. Something like:
Filter filter = RangeFilter.More("lastUpdatedDate","");
searcher.search(query, filter);
If that works for you, then the next step would be to look at
CachingWrapperFilt
Hi Yonik,
Thanks for your reply.
In my case I don't want those documents that have a null value for the field
that I want to sort on.
I tried writing my own filter using RangeFilter, but it doesn't work.
I used something like the following in my custom filter.
public class NotNullRangeFilter ext
Hello,
Is there an analyzer that will tokenize the stream such that there are no
repeated tokens in the stream? I have a keyword-field on my document,
so if one keyword already appears on the list there's no point in
having it shown again. Does it make sense having that analyzer? Or
indexing
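One way to get the effect asked about above without a special analyzer is to drop repeated keywords before the field value is handed to the indexer. The sketch below is plain Java and makes two assumptions that are not in the original post: keywords are separated by whitespace, and they are compared case-insensitively. KeywordDeduper is a hypothetical helper, not a Lucene class.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Locale;

/** Hypothetical helper: removes repeated keywords while keeping first-seen order. */
class KeywordDeduper {
    static List<String> dedupe(String keywords) {
        // LinkedHashSet ignores repeats and remembers insertion order
        LinkedHashSet<String> seen = new LinkedHashSet<String>();
        for (String token : keywords.trim().split("\\s+")) {
            if (token.length() > 0) {
                seen.add(token.toLowerCase(Locale.ROOT));
            }
        }
        return new ArrayList<String>(seen);
    }
}
```

Keep in mind that removing repeats changes term frequency for the field, which affects scoring; for a pure keyword field that may be exactly what is wanted.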
On Tue, Oct 14, 2008 at 4:35 AM, Reetha Hariharan <[EMAIL PROTECTED]> wrote:
> I am searching using one field, say X and want to sort the results using
> another, say Y (Which can have null values). But I am expecting Sort to
> ignore all the null values and just sort only records that has values i
Hi Michael,
Michael McCandless wrote:
Also, this issue was just opened:
https://issues.apache.org/jira/browse/LUCENE-1419
which would make it possible for classes in the same package
(oal.index) to use their own indexing chain. With that fix, if you
make your own classes in oal.index pa
Well, this is what I thought too.
Thank you.
Original Message
> Date: Tue, 14 Oct 2008 02:11:53 -0700 (PDT)
> From: "Karsten F." <[EMAIL PROTECTED]>
> To: java-user@lucene.apache.org
> Subject: RE: Searching sets of documents
>
> Hi spring,
> unit of retrieval in lucene is a
The problem is the logical combination of documents in folders not of terms in
documents.
See original post.
Original Message
> Date: Tue, 14 Oct 2008 16:29:15 +0530
> From: "Ganesh" <[EMAIL PROTECTED]>
> To: java-user@lucene.apache.org
> Subject: Re: Searching sets of documen
I don't know how tight your result must be, but here's a couple
of ideas
1> you could boost your target by a huge amount, although forming
your query might be "interesting". If you somehow worked the clause
fieldA:5^1 say. I suspect that some of your results wouldn't be on
top, but it might
What is your problem?
If the folder names are already stored, then they can be retrieved from the
search results. Use DuplicateFilter on field "foldername" to get the unique
list of folders.
Hope this helps.
Regards
Ganesh
- Original Message -
From: <[EMAIL PROTECTED]>
To:
Sent: Tuesday, Octob
The folder name and the document name are stored for each document.
Original Message
> Date: Tue, 14 Oct 2008 14:11:09 +0530
> From: "Ganesh" <[EMAIL PROTECTED]>
> To: java-user@lucene.apache.org
> Subject: Re: Searching sets of documents
> You should have stored the foldernam
Hi spring,
unit of retrieval in lucene is a document.
There are no joins between document sets like in sql.
What you can do is to collect all hits for each term query at the level of
folders and then implement the logical "and" or "or" on your own.
For this you could reuse the existing implementation
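The folder-level combination described above can be sketched in plain Java as ordinary set operations. This assumes you have already run one search per term and collected the folder name of every hit into a Set<String>; FolderSetLogic is a hypothetical helper and not part of Lucene.

```java
import java.util.Set;
import java.util.TreeSet;

/** Hypothetical helper: logical AND / OR over sets of folder names. */
class FolderSetLogic {

    /** Folders that contain hits for both terms (possibly in different documents). */
    static Set<String> and(Set<String> foldersForX, Set<String> foldersForY) {
        Set<String> result = new TreeSet<String>(foldersForX);
        result.retainAll(foldersForY);
        return result;
    }

    /** Folders that contain a hit for at least one of the terms. */
    static Set<String> or(Set<String> foldersForX, Set<String> foldersForY) {
        Set<String> result = new TreeSet<String>(foldersForX);
        result.addAll(foldersForY);
        return result;
    }
}
```

Both methods copy their first argument, so the per-term sets can be reused for further combinations.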
2.4.0 does have the workaround for that JRE bug.
Mike
Michael Bell wrote:
this is the issue with Java 6's server VM.
Yes I know it's fixed in Sun's beta update to Java 1.6, but did the
workaround get committed to 2.4? It is not documented in the
CHANGELOG.
Thanks
Hello David,
Use TopDocs or TopFieldDocs to collect only required hits.
TopDocs topDocs = searcher.search(query, 10);
int docID = topDocs.scoreDocs[index].doc;
Document doc = searcher.doc(docID);
Regards
Ganesh
- Original Message -
From: "David Massart" <[EMAIL PROTECTED]>
You should have stored the folder name or full path of the file as part of
the Lucene document; otherwise it is difficult to retrieve.
Regards
Ganesh
- Original Message -
From: "叶双明" <[EMAIL PROTECTED]>
To:
Sent: Tuesday, October 14, 2008 6:13 AM
Subject: Re: Searching sets of documents
Hi,
I am a newbie.
I just configured lucene using hibernate search. But I find that the sorting
doesn't ignore null values.
I am searching using one field, say X and want to sort the results using
another, say Y (Which can have null values). But I am expecting Sort to
ignore all the null values
Hi,
You could try changing (or extending) TopFieldDocCollector and do your
processing there (that is what I tried... and it worked fine). But that
would mean changing lucene code a little bit.
--
Anshum Gupta
Naukri Labs!
http://ai-cafe.blogspot.com
The facts expressed here belong to everybody,
this is the issue with Java 6's server VM.
Yes I know it's fixed in Sun's beta update to Java 1.6, but did the workaround
get committed to 2.4? It is not documented in the CHANGELOG.
Thanks
I have indexed multiple documents - each of them has 3 fields (id, tag,
text). Is there an easy way to determine the set of tags for a given
query without iterating through all the hits?
For example if I have 100 documents in my index and my set of tag = {A,
B, C}. Query Q on the text field r