I have changed my logic as suggested.
boolean create = !(new File("indexdir/segments").exists());
System.out.println("Number of objects writing to index=" + count);
index = new IndexWriter(d, new StandardAnalyzer(), create);
But one problem still persists. When I do a huge number of documents,
th
I ran into the following exception during an index update process (1.4.3
on Windows)... and was wondering if anyone had an idea what might cause
this. It occurred after a new IndexWriter was opened, a document added, and
then IndexWriter.close() was called. This occurs after many successful
updates
Anna Bing wrote:
Firstly the Lucene in Action Book is great. It really helped me with
implementing search for a project.
Sorry if this is the wrong forum but as you are all search people. I
wondered if you could recommend any good books about search
theory/algorithms, readable if that is possible
Hi,
I have two questions regarding the search capability of Lucene.
1) Lucene builds an inverted index, by which we mean it keeps how many
times a particular token is used. Is there a way to get the list of the
most frequently used words in descending order?
For example:- Suppose I have
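As a rough illustration of the "most frequent words in descending order" idea asked about above, here is a minimal sketch in plain Java. It does not use Lucene's API at all; the class name `TermFrequencies`, the method `topTerms`, and the simple split on non-word characters are assumptions made for the example, not how Lucene tokenizes or stores counts.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper, not part of Lucene: counts tokens over a few
// in-memory "documents" and lists them from most to least frequent.
public class TermFrequencies {

    /** Returns up to n (term, count) pairs, highest count first. */
    public static List<Map.Entry<String, Integer>> topTerms(List<String> docs, int n) {
        Map<String, Integer> counts = new HashMap<>();
        for (String doc : docs) {
            // Assumption: lower-case and split on runs of non-word (ASCII) characters.
            for (String token : doc.toLowerCase().split("\\W+")) {
                if (!token.isEmpty()) {
                    counts.merge(token, 1, Integer::sum);
                }
            }
        }
        List<Map.Entry<String, Integer>> entries = new ArrayList<>(counts.entrySet());
        // Sort by count descending; break ties alphabetically so the order is stable.
        entries.sort((a, b) -> {
            int byCount = Integer.compare(b.getValue(), a.getValue());
            return byCount != 0 ? byCount : a.getKey().compareTo(b.getKey());
        });
        return new ArrayList<>(entries.subList(0, Math.min(n, entries.size())));
    }

    public static void main(String[] args) {
        List<String> docs = Arrays.asList("the quick brown fox", "the lazy dog", "the fox");
        for (Map.Entry<String, Integer> e : topTerms(docs, 3)) {
            System.out.println(e.getKey() + " " + e.getValue());
        }
    }
}
```

Against a real index the counts would come from the index's own term statistics rather than from re-tokenizing the text, so treat this only as a picture of the sort-by-frequency step.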
Ramya, I don't have an answer to your specific lock file question, but
a couple of thoughts.
You say you're using multiple threads to index 50,000 documents. Have
you tried a single thread version first? I'd try that, and then scale
out to multiple threads as needed. We index over ten times that
1. I am trying to pump in a large number of documents (to the tune of
5) ... I use multiple threads and I depend on the internal locks of
Lucene to synchronize the write access to the index.
try
{
index = new IndexWriter(d, new StandardAnalyzer(), false);
}
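One alternative to having every thread open its own writer and rely on lock files is to let all threads share a single writer-like object. The sketch below shows that pattern in plain Java only. `ListSink` is a hypothetical stand-in for an index writer (it is not a Lucene class), and `indexAll` and the thread count are assumptions made for the example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

public class SharedWriterSketch {

    /** Hypothetical stand-in for an index writer; NOT a Lucene class. */
    public static class ListSink {
        private final List<String> docs = new ArrayList<>();

        // synchronized: only one thread at a time may append.
        public synchronized void add(String doc) {
            docs.add(doc);
        }

        public synchronized int size() {
            return docs.size();
        }
    }

    /** Feeds every document to ONE shared sink from a pool of worker threads. */
    public static int indexAll(List<String> docs, int threads) {
        ListSink sink = new ListSink();
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        for (String doc : docs) {
            pool.submit(() -> sink.add(doc));
        }
        pool.shutdown(); // accept no new tasks, let the queued ones finish
        try {
            if (!pool.awaitTermination(1, TimeUnit.MINUTES)) {
                throw new IllegalStateException("indexing did not finish in time");
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting", e);
        }
        return sink.size();
    }

    public static void main(String[] args) {
        List<String> docs = new ArrayList<>();
        for (int i = 0; i < 1000; i++) {
            docs.add("doc-" + i);
        }
        System.out.println("documents written: " + indexAll(docs, 4));
    }
}
```

Whether this helps with the lock problem described here is untested; it only illustrates the design choice of opening the writer once and closing it once, instead of once per thread or per document.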
How about:
http://www.dcs.gla.ac.uk/Keith/Preface.html
quite an old one but a recognized one, I think.
Also, browse http://www.lt-world.org/ I think.
paul
On 12 May 05, at 14:04, Pasha Bizhan wrote:
Hi,
Managing Gigabytes
http://www.amazon.com/exec/obidos/tg/detail/-/1558605703/
qid=1115
I believe there are links to several Information Retrieval books on
Lucene's Wiki.
Otis
--- Anna Bing <[EMAIL PROTECTED]> wrote:
> Firstly the Lucene in Action Book is great. It really helped me with
> implementing search for a project.
>
> Sorry if this is the wrong forum but as you are all sea
Firstly the Lucene in Action Book is great. It really helped me with
implementing search for a project.
Sorry if this is the wrong forum but as you are all search people. I
wondered if you could recommend any good books about search
theory/algorithms (readable if that is possible in an algorithm
Hi,
Managing Gigabytes
http://www.amazon.com/exec/obidos/tg/detail/-/1558605703/qid=1115898416/sr=8-1/ref=pd_csp_1/104-0210366-8377506?v=glance&s=books&n=507846
Pasha Bizhan
http://lucenedotnet.com
> -Original Message-
> From: Anna Bing [mailto:[EMAIL PROTECTED]
> Sent: Thursday, May 1
Salton, Gerald and McGill, Michael J. /Introduction to Modern
Information Retrieval/. McGraw-Hill, 1983.
-Gary
Anna Bing wrote:
Firstly the Lucene in Action Book is great. It really helped me with
implementing search for a project.
Sorry if this is the wrong forum but as you are all search peo
Hi,
Yeah, I just added it to Simpy when I read René's post ;)
Congrats on Simpy.
Sven
On Thursday, 12 May 2005 at 09:59:18, you wrote:
OG> Somebody asked about this today, and I just found this through Simpy:
OG> http://www.unine.ch/info/clef/
OG> Scroll half-way through the page, look on th
Firstly the Lucene in Action Book is great. It really helped me with
implementing search for a project.
Sorry if this is the wrong forum but as you are all search people. I
wondered if you could recommend any good books about search
theory/algorithms, readable if that is possible in an algorithm
Hi John,
> >from a slightly skewed source -- newspapers in a fixed interval
> perhaps. (I don't think "Los Angeles" makes it into every day parlance
You're right there. Most probably the frequencies in that list are based on
a volume of the Los Angeles Times, which is one of the standard
CLEF-Co
Hi John,
I haven't investigated the sources yet, but you might be right.
However, as you stated, those types of lists depend directly on the
subject and the source.
Anyway, it is not very important for my study, and I'm sure it will help
me very much.
I will prepare optimized lists if I can obtai
>I suppose this question has been asked before but there is no way to
>search such a thing in the archive.
You can search the lists on:
http://www.mail-archive.com/java-user%40lucene.apache.org/
http://www.mail-archive.com/java-dev%40lucene.apache.org/
...but a lucene search on the mailing l
Hi Otis,
Thank you for the source address.
Best regards,
Ahmet
Otis Gospodnetic wrote:
Somebody asked about this today, and I just found this through Simpy:
http://www.unine.ch/info/clef/
Scroll half-way through the page, look on the right side: 1,000 most
frequent words for several languages.
Ot
Hi René,
That is an excellent source! There is more there than I wanted.
Thanks a lot.
Best regards,
Ahmet
René Hackl wrote:
Hi Ahmet,
Now I have only Turkish, English, German, and Finnish lists.
At http://www.unine.ch/info/clef/ you can find some more lists you might
find useful.
Best regard
Otis Gospodnetic wrote:
Somebody asked about this today, and I just found this through Simpy:
http://www.unine.ch/info/clef/
Scroll half-way through the page, look on the right side: 1,000 most
frequent words for several languages.
Hmm. I'm not sure how valuable that is. For English "los" a
Somebody asked about this today, and I just found this through Simpy:
http://www.unine.ch/info/clef/
Scroll half-way through the page, look on the right side: 1,000 most
frequent words for several languages.
Otis
Simpy -- si
Hi,
I suppose this question has been asked before but there is no way to
search such a thing in the archive.
Anyway, I need to merge two different types of search, but I am not really
sure that the score is calculated the same way. So if someone can
tell me how Lucene computes it or
Hi Ahmet,
> Now I have only Turkish, English, German, and Finnish lists.
At http://www.unine.ch/info/clef/ you can find some more lists you might
find useful.
Best regards,
René
Sorry Ahmet for the double post, I meant to address the mailing list...
Yes, try minMergeDocs = 1 but keep in mind that you'll have to be
re-opening a new IndexReader every time your index changes, and it
sounds like this will be very frequent/constant. This may cost you...
Otis
--- Rifflard Mickaël <[EMAIL PROTECTED]> wrote:
> Hi Otis,
>
> If I swap these tw
Hi Otis,
If I swap these two lines, the result is the same.
I want to build an indexing process that is as fast as possible, and I can't use
a greater minMergeDocs because I need all documents in my index to be
visible in real time.
Do you think that an FSDirectory with minMergeDocs equal to 1 can
be th
24 matches