Lucene's IndexWriter allows users to update documents by Term via this
method signature:
void updateDocument(Term term, Document doc)
But what about updating them by Query? Like so:
void updateDocument(Query query, Document doc)
1) How can this be done? As far as I know there is no such method
si
Is there some sort of default limit imposed on the Lucene indexes?
I try to index 50k or 60k documents but when I use Luke to go inside
the index and check the total # of entries indexed, it shows that
there are only 32768 entries.
It seems like some sort of limit ... what should I look at to adjus
Hello Folks,
I'm using Lucene 3.0. My code runs fine on Windows, but when I test it
on Linux, I run into the following stack trace:
java.io.FileNotFoundException:
/opt/apache/tomcat/webapps/myapp/luceneData/backend_IP/en_US/_1.fdt
(No such file or directory)
at java.io.RandomAccessFile.ope
Hello,
What's a good source to get dictionaries (for spelling corrections) and/or
thesauri (for synonyms) that can be used with Lucene for non-English
languages such as French, Chinese, Korean etc?
For example, the wordnet contrib module is based on the data set
provided by the Princeton based wordne
Hello,
I was wondering if anyone on this mailing list has ever compiled a
list of algorithms for various non-English languages that work well
with the lucene-spellchecker contrib module?
For example, with English using a spellchecker index built using
ngrams and then searched using LevensteinDi
Hello,
I heard Yonik talk about a better dismax query parser for Solr so I
was wondering if Lucene already has this functionality contributed to
its contrib modules?
- Pulkit
e
a match at all and therefore present it in the results?
Just a theory (a bad one perhaps) ... but one which can be easily
blown away by using ANALYZED in your indexer and then trying again.
- Pulkit
On Thu, Nov 18, 2010 at 12:55 PM, Pulkit Singhal
wrote:
> Wow, you live in a really gr
Wow, you live in a really great country and attend an awesome
university where they have classes like "Text Analytics". I'm gonna
send my kid there to study :)
In all seriousness I think the problem may be with how you are
collecting your results.
I find this very amusing:
> 80. 896889 phrase occu
Hello,
I was wondering if there is any API call in Lucene that allows
something like the following:
Step 1: Take the user input
"hello world" you are beautiful
Step 2: QueryParser does its thing
defaultField:hello world defaultField:you defaultField:are
defaultField:beautiful
Step 3: And someho
, 2010 at 4:10 AM, Ian Lea wrote:
> Have you tried explicitly setting norms on/off the way you want with
> Field.setOmitNorms(boolean)?
>
>
> --
> Ian.
>
> On Thu, Nov 18, 2010 at 12:54 AM, Pulkit Singhal
> wrote:
>> Based on my experimentation and what it sa
rdAnalyzer during indexing
and get NORMS.
So much for being elegant. If someone has a way to make it happen,
please let me know.
Thanks.
On Wed, Nov 17, 2010 at 7:09 PM, Pulkit Singhal wrote:
> Greetings!
>
> When using KeywordAnalyzer for indexing a field which has the
> Field
Greetings!
When using KeywordAnalyzer for indexing a field which has the
Field.Index.ANALYZED option selected, does the use of
KeywordAnalyzer automatically mean that there is no
point in trying to set the index-time boosts on that field in the
document because it will be treated as a full token
I looked at the 2.2 API and those methods should be there. So the
NoSuchMethodException makes no sense.
Are you absolutely sure that your integration between PHP & Java is set up
properly and you really are using 2.2?
Could there be multiple versions of the Lucene jars in your classpath, such that
older ones
>
>
> > -Original Message-
> > From: Pulkit Singhal [mailto:pulkitsing...@gmail.com]
> > Sent: Wednesday, November 10, 2010 2:55 PM
> > To: java-user@lucene.apache.org
> > Subject: Re: IndexWriters and write locks
> >
> > You know that really
, Uwe Schindler wrote:
> Are you using NFS as the filesystem? NFS is incompatible with Lucene :-)
>
> -
> Uwe Schindler
> H.-H.-Meier-Allee 63, D-28213 Bremen
> http://www.thetaphi.de
> eMail: u...@thetaphi.de
>
>
> > -----Original Message-
> > From: Pu
k file itself
> is just a placeholder which is not cleaned up on Ctrl-C. The lock is not the
> file itself, it's *on* the file.
>
> -
> Uwe Schindler
> H.-H.-Meier-Allee 63, D-28213 Bremen
> http://www.thetaphi.de
> eMail: u...@thetaphi.de
>
> > -
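To make the "placeholder file vs. lock on the file" distinction above concrete, here is a minimal sketch using only java.nio. It is not Lucene's code; the class name LockFileDemo and the helper canLock are made up for illustration, and it assumes the lock in question is an OS-level file lock of the kind FileChannel.tryLock() takes.

```java
import java.io.IOException;
import java.nio.channels.FileChannel;
import java.nio.channels.FileLock;
import java.nio.channels.OverlappingFileLockException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Hypothetical demo class, not part of Lucene.
public class LockFileDemo {

    /** Returns true if an OS-level lock on the file could be taken (it is released again). */
    static boolean canLock(Path lockFile) throws IOException {
        try (FileChannel ch = FileChannel.open(lockFile,
                StandardOpenOption.CREATE, StandardOpenOption.WRITE)) {
            FileLock lock;
            try {
                lock = ch.tryLock();
            } catch (OverlappingFileLockException e) {
                return false; // a lock on this file is already held inside this JVM
            }
            if (lock == null) {
                return false; // a lock on this file is held by another process
            }
            lock.release();
            return true;
        }
    }

    public static void main(String[] args) throws IOException {
        Path dir = Files.createTempDirectory("lockdemo");
        Path lockFile = dir.resolve("write.lock");
        // Simulate the placeholder left behind after the previous process was killed.
        Files.createFile(lockFile);
        System.out.println("leftover file exists: " + Files.exists(lockFile));
        System.out.println("lock obtainable: " + canLock(lockFile));
    }
}
```

Under these assumptions, a leftover write.lock file with no live process holding a lock on it should not by itself prevent a new lock from being taken; only a lock that is actually held does.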
[correctly] release the lock on process
> exit.
>
> Mike
>
> On Wed, Nov 10, 2010 at 9:38 AM, Pulkit Singhal
> wrote:
> > Hello,
> >
> > 1) On Windows, I often shut down my application server (which has active
> > IndexWriters open) using the ctrl+c keys.
Hello,
1) On Windows, I often shut down my application server (which has active
IndexWriters open) using the ctrl+c keys.
2) I inspect my directories on the file system I see that the write.lock
file is still there.
3) I start the app server again, and do some operations that would require
IndexWr
1) You can attach byte array "Payloads" to every occurrence of a term
during indexing. Each payload is stored at its term position and can then
be retrieved during searching. You may want to consider taking this
approach rather than writing bitvectors to a text file. If you feel that
or that document then it would list fields
> from index1 and index2.
>
> On Wed, Oct 27, 2010 at 9:04 PM, Pulkit Singhal wrote:
>
> > Looks interesting, what is the merit in having a second index in order to
> > keep the document id the same? Perhaps I have misund
Looks interesting, what is the merit in having a second index in order to
keep the document id the same? Perhaps I have misunderstood. Just want to
understand your motivation here.
On Wed, Oct 20, 2010 at 2:57 PM, Nilesh Vijaywargiay wrote:
> I've written a blog regarding a work around for updati
When you ask:
a) will each feed form a Lucene document, or
b) will each database row form a Lucene document
I'm inclined to say that really depends on what type of aggregation
tool or logic you are using.
I don't know if "Tika" does it but if there is a tool out there that
can be point
dexExceptions, NPE or
> array out-of-bounds exceptions. There is no checksumming of the index files.
>
> Lance
>
> Pulkit Singhal wrote:
>>
>> Hello Everyone,
>>
>> What happens if:
>> a) lucene index gets written half-way to the disk and then somethin
Is using IndexReader.numDocs() on the Directory instance the only way
to count the indexed entries?
On Fri, Sep 24, 2010 at 9:40 AM, Pulkit Singhal wrote:
> Hello Everyone,
>
> I want to load the indexed data from the file system using FSDirectory.
> But I also want to be sure if s
Hello Everyone,
I want to load the indexed data from the file system using FSDirectory.
But I also want to be sure if something was actually loaded or if a
new empty directory was created and returned to me.
How can I count the # of entries in the Directory object returned to me?
Thanks!
- Pulkit
Hello Everyone,
What happens if:
a) lucene index gets written half-way to the disk and then something goes wrong?
b) the index gets corrupted on the file system?
When we open that directory location again using FSDirectory implementations:
a) Is there any provision for the code to clean out the p
With RAMDirectory we have the option of providing another Directory
implementation such as FSDirectory that can be wrapped and loaded into
memory:
Directory directory = new RAMDirectory(FSDirectory.open(new
File(fileDirectoryName)));
But after building the index, if I close the IndexWriter then t