Hi Wenhao,
It's generally better to build your index incrementally and at the same
time.
Since by now you are probably somewhat familiar with using the Lucene
API, here is what you could do.
Open the existing index with 'createnew' set to false:
*IndexWriter(Directory d, Analyzer a, boolean
Hello, Wenhao Xu!
IndexWriter.addDocument(Document doc) can do the job. After that,
remember to commit the index.
=== 2009-11-10 14:40:03 You wrote in your message: ===
>Hi, everybody,
> I am new to Lucene and have a question about how to update my index. The
>following is my situation:
>
Hi, everybody,
I am new to Lucene and have a question about how to update my index. The
following is my situation:
1) I create an index for each text (or varchar) field of a relational
database;
2) New records are continuously inserted into this database, and I
need to add indexes of
Right - I followed the release wiki and took it out for 2.9.0 - but then
before 2.9.1 some discussion arose about not taking it out.
Peter Keegan wrote:
> I get your points (btw, I built with 1.6), and I like the easy override.
> But, my build of 2.9.0 didn't produce a dev jar, which is inconsiste
I get your points (btw, I built with 1.6), and I like the easy override.
But, my build of 2.9.0 didn't produce a dev jar, which is inconsistent with
2.9.1. I guess that's the flux you referred to.
Peter
On Mon, Nov 9, 2009 at 8:13 PM, Mark Miller wrote:
> Yeah - it's a debatable point. You can
well i suppose we should do this as a last resort.
the sen code is pretty nice, it's a lot less complex than smartcn for
example.
also, if you can't modify the internals (just linking to a lib) you are
limited in some regard, like smartcn it looks like this one represents the
hmm with an object gr
Yeah - it's a debatable point. You can have issues when building though -
did you build with java 1.5? Then it's not like the official build. This
keeps you from confusing yourself about what artifacts are what. You can
override it, but this way you know what you have done. Just because you
have the
Marvin Humphrey wrote:
> On Mon, Nov 09, 2009 at 04:07:55PM -0500, Robert Muir wrote:
>
>> Mark, I think my concern is that Sen itself is LGPL (
>> https://sen.dev.java.net/).
>>
>> this lucene-ja is just a lucene interface to this LGPL library.
>>
>> I think this dependency might be a problem,
The -dev version is confusing when it's the target of a build from an
official release.
A build with patches from an official release might warrant a '-dev'
version, I suppose.
(just my 2 cents.)
Peter
On Mon, Nov 9, 2009 at 7:57 PM, Mark Miller wrote:
> The build/release formula is always in f
On Mon, Nov 09, 2009 at 07:30:40PM -0500, Robert Muir wrote:
> Marvin, in this case it's the same folks:
> https://sen.dev.java.net/servlets/ProjectDocumentList?folderID=755&expandFolder=755&folderID=0
> ... dunno if that matters
Not much -- my example still stands. We still can't distribute code
The build/release formula is always in flux - we likely hard coded the
change in 2.9.0 when releasing - we likely won't again in the future.
Some discussion about it came up recently on the list.
--
- Mark
http://www.lucidimagination.com
Peter Keegan wrote:
> OK. I just downloaded the 2.9.0 s
OK. I just downloaded the 2.9.0 sources from
http://mirror.candidhosting.com/pub/apache/lucene/java/lucene-2.9.0-src.zip
to a clean directory. 'ant jar-core' produced:
'build/lucene-core-2.9.jar'
(no -dev version suffix and I changed nothing). Are you saying that it
should have produced 'build/lucen
On Tue, Nov 10, 2009 at 00:44, Michael McCandless
wrote:
> Stepping back, since presumably your app knows what it's storing in
> the directory, can't you filter for files you know you've created?
> What's the larger use case here?
The exact use case where we were using list() is to determine whet
Marvin, in this case it's the same folks:
https://sen.dev.java.net/servlets/ProjectDocumentList?folderID=755&expandFolder=755&folderID=0
... dunno if that matters
On Mon, Nov 9, 2009 at 7:02 PM, Marvin Humphrey wrote:
> On Mon, Nov 09, 2009 at 04:07:55PM -0500, Robert Muir wrote:
> > Mark, I think
If you build from sources, it automatically assumes a dev version (you could
have changed it). If you want to override the automatically set version (as
we do it during build), use "ant -Dversion=2.9.1"
Uwe
-
Uwe Schindler
H.-H.-Meier-Allee 63, D-28213 Bremen
http://www.thetaphi.de
eMail: u..
On Mon, Nov 09, 2009 at 04:07:55PM -0500, Robert Muir wrote:
> Mark, I think my concern is that Sen itself is LGPL (
> https://sen.dev.java.net/).
>
> this lucene-ja is just a lucene interface to this LGPL library.
>
> I think this dependency might be a problem, but I am not the expert:
> http://
I know this has been asked before, but I couldn't find the thread.
The jar file produced from a build of 2.9.0 is 'lucene-core-2.9.jar'. For
2.9.1, it is 'lucene-core-2.9.1-dev.jar'. When does the '-dev' get removed?
Peter
I think this entire thread is welcome/best-served on the dev list, because
you are talking about submitting a patch to change the internals of lucene.
On Mon, Nov 9, 2009 at 6:16 PM, Mark Bennett wrote:
> Thanks Robert,
>
> At what point would this whole subject be better served on the dev list?
Thanks Robert,
At what point would this whole subject be better served on the dev list?
I've been a bit confused about that in the past (on the similarly named Solr
lists)
Mark
--
Mark Bennett / New Idea Engineering, Inc. / mbenn...@ideaeng.com
Direct: 408-733-0387 / Main: 866-IDEA-ENG / Cell: 408
Mark,
If he agrees, maybe you can bring this up on the java-dev list?
I think other lucene developers could assist to make sure we do the proper
procedures with minimal hassle.
On Mon, Nov 9, 2009 at 6:05 PM, Mark Bennett wrote:
> I have emailed one of the authors. I have also asked about the
I have emailed one of the authors. I have also asked about the other
authors and the other packages you mentioned.
What is the procedure for him, assuming he agrees? Does he have to sign
physical paper, or can this be done electronically?
Also, I suspect he doesn't reside in the US, I don't kno
if he is ok with it i think we need to setup a software grant, etc
in my opinion though, this would be a great feature to have in lucene.
(we have similar support for chinese now, but no japanese)
On Mon, Nov 9, 2009 at 5:51 PM, Mark Bennett wrote:
> I'll ask the author.
>
> --
> Mark Ben
I'll ask the author.
--
Mark Bennett / New Idea Engineering, Inc. / mbenn...@ideaeng.com
Direct: 408-733-0387 / Main: 866-IDEA-ENG / Cell: 408-829-6513
On Mon, Nov 9, 2009 at 2:49 PM, Robert Muir wrote:
> Hi Mark,
>
> I think apache 2.0 would be easiest. But I think BSD also works.
> Its a lit
Hi Mark,
I think apache 2.0 would be easiest. But I think BSD also works.
It's a little strange that Sen is LGPL when the underlying dictionaries,
chasen, and mecab (what it was ported from) are all BSD/BSD-like,
or they are multi-licensed with BSD being one of them.
I also agree and hear you about how the gl
Hi Robert,
Thank you for helping sort through this, and for the Wiki link.
A few thoughts here:
1: I think the author will change the license if I ask, he's been very
supportive (though seems to be working on other things these days).
If you ran the Universe, which specific license would yo
Mark, I think my concern is that Sen itself is LGPL (
https://sen.dev.java.net/).
this lucene-ja is just a lucene interface to this LGPL library.
I think this dependency might be a problem, but I am not the expert:
http://www.apache.org/legal/resolved.html#category-a
On Mon, Nov 9, 2009 at 4:01
Hello Robert,
On Mon, Nov 9, 2009 at 12:34 PM, Robert Muir wrote:
> Mark, has there been any change to the LGPL dependency?
>
> On Mon, Nov 9, 2009 at 2:55 PM, Mark Bennett wrote:
>
>
The only code I'm modifying at the moment is the lucene-ja section, which is
the integration between core SEN a
Mark, has there been any change to the LGPL dependency?
On Mon, Nov 9, 2009 at 2:55 PM, Mark Bennett wrote:
> As some of you may recall I've been working on getting the SEN Japanese
> morphological analyzer working with 2.9. (and also with Solr 1.4, but
> that's not for this list)
>
> I'm getti
As some of you may recall I've been working on getting the SEN Japanese
morphological analyzer working with 2.9. (and also with Solr 1.4, but
that's not for this list)
I'm getting close to having a patch for JIRA. However, a couple items:
1: The code is not currently hosted on Apache (it's over
If all you do is exact match, you can create non-unique indexes on
columns, or functional indexes.
If the database index is optimal, there should not be much performance
difference between database approach vs Lucene approach.
Lucene's inverted index is just one kind of data structure for qui
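
For readers new to the term, a toy inverted index can be sketched in plain Java as below. This is only an illustration of the general data structure (a map from each term to the sorted set of document ids containing it), not Lucene's implementation. The class name ToyInvertedIndex and its methods are hypothetical, and the tokenization (lowercasing, splitting on non-word characters) is an assumption chosen for brevity.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;
import java.util.SortedSet;
import java.util.TreeSet;

// Hypothetical toy class: NOT Lucene's index format or API.
final class ToyInvertedIndex {
    // term -> sorted ids of the documents that contain the term
    private final Map<String, SortedSet<Integer>> postings = new HashMap<>();
    private int nextDocId = 0;

    // Adds one document incrementally and returns the id assigned to it.
    int add(String text) {
        int docId = nextDocId++;
        for (String term : text.toLowerCase(Locale.ROOT).split("\\W+")) {
            if (!term.isEmpty()) {
                postings.computeIfAbsent(term, t -> new TreeSet<>()).add(docId);
            }
        }
        return docId;
    }

    // Exact-match lookup: ids of all documents containing the term.
    SortedSet<Integer> lookup(String term) {
        SortedSet<Integer> ids = postings.get(term.toLowerCase(Locale.ROOT));
        return ids == null
                ? Collections.<Integer>emptySortedSet()
                : Collections.unmodifiableSortedSet(ids);
    }
}
```

New documents only touch the posting sets of the terms they contain, which is why adding records one at a time is cheap in this kind of structure.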
On Mon, Nov 9, 2009 at 12:19 PM, Benjamin Heilbrunn wrote:
> After making my post i found this (without taking a deeper look):
>
> http://issues.apache.org/jira/browse/LUCENE-1260
>
> Looks like a solution for that problem.
Indeed the most recent patch there looks almost exactly like what
you're
Hi Mike,
thanks for your reply.
After making my post i found this (without taking a deeper look):
http://issues.apache.org/jira/browse/LUCENE-1260
Looks like a solution for that problem.
Why wasn't it applied to lucene?
Benjamin
-
On Mon, Nov 9, 2009 at 11:04 AM, Benjamin Heilbrunn wrote:
> i've got a problem concerning encoding of norms.
> I want to use int values (0-255) instead of float interpreted bytes.
>
> In my own Similarity-Class, which I use for indexing and searching, I
> implemented the static methods encodeNor
Hi,
i've got a problem concerning encoding of norms.
I want to use int values (0-255) instead of float interpreted bytes.
In my own Similarity-Class, which I use for indexing and searching, I
implemented the static methods encodeNorms, decodeNorms and
getNormDecoder.
But because they are static a
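
The idea of storing an int in the range 0-255 in a single norm byte can be sketched in plain Java as follows. This shows only the byte arithmetic and is independent of Lucene's Similarity class; the class IntNormCodec and its method names are hypothetical and are not the encodeNorm/decodeNorm methods mentioned above.

```java
// Hypothetical helper: stores an int in [0, 255] in one byte and reads it back.
final class IntNormCodec {
    private IntNormCodec() {}

    // Narrow the int to a byte; values above 127 wrap to negative bytes.
    static byte encode(int value) {
        if (value < 0 || value > 255) {
            throw new IllegalArgumentException("value out of range 0-255: " + value);
        }
        return (byte) value;
    }

    // Mask with 0xFF so the byte is read as unsigned, undoing the wrap-around.
    static int decode(byte b) {
        return b & 0xFF;
    }
}
```

The mask in decode is the important step: without it, Java's signed bytes would turn 200 into -56 on the way back.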
There is one on Salmon Run that I am using.. it seems to work pretty
well.. add the words "Salmon Run" to your Google search..
-Original Message-
From: shashi@gmail.com [mailto:shashi@gmail.com] On Behalf Of
Shashi Kant
Sent: Monday, November 09, 2009 10:41 AM
To: java-user@lucene
Hi All,
For those who are interested, the official Lucid Solr trainings are now
available in Europe. The first training - "Introduction to Solr" is a 3-day
training covering the basics and some of the more advanced features
of Solr. It is scheduled for 30th November (till 2nd December) and wil
Take a look at Bayesian text classification, which might be more
efficient for your needs. Google it.
There are several other text classification methods - depending on your
needs, you can dig into them.
On Mon, Nov 9, 2009 at 10:33 AM, lucenenew wrote:
>
> i want to classify sentences stored as s
On Sun, Nov 8, 2009 at 4:58 PM, Daniel Noll wrote:
>> Well... you can use oal.index.IndexFileNameFilter.getFilter() to
>> filter for only the Lucene index files, or, you could filter for the
>> additional files you know you've placed in the index directory?
>
> This is the workaround we're curren
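
The second suggestion quoted above, filtering for the additional files the application itself placed in the directory, could look roughly like this in plain Java. It uses only java.io.FilenameFilter; the class name KnownFilesFilter and the file names in the usage note are made up for illustration, and this is not the IndexFileNameFilter mentioned in the quote.

```java
import java.io.File;
import java.io.FilenameFilter;
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

// Hypothetical filter: accepts only file names the application knows it created.
final class KnownFilesFilter implements FilenameFilter {
    private final Set<String> knownNames;

    KnownFilesFilter(String... names) {
        this.knownNames = new HashSet<>(Arrays.asList(names));
    }

    @Override
    public boolean accept(File dir, String name) {
        return knownNames.contains(name);
    }
}
```

A call such as `new File("index-dir").list(new KnownFilesFilter("app.properties"))` would then return only the application's own files (or null if the path is not a directory).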
So many questions..
>>Which one will be better
As in.
* Faster to implement?
* Faster to search?
* Faster to update?
* Cheaper in licenses?
* More robust?
* Easier to maintain?
* Easier to backup?
Are results sorted by :
* quality (e.g. when using fuzzy text matching)?
* distance?
* pric
Does this look like a real leak John? You're definitely closing every
reader you get back from getReader?
Mike
On Sun, Nov 8, 2009 at 10:41 PM, John Wang wrote:
> I am seeing the same thing, but only when IndexWriter.getReader is called at
> a high rate.
>
> from lsof, I see file handles growing
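
The question above about closing every reader comes down to the usual close-in-finally pattern. The sketch below shows that pattern with a stand-in resource that counts how many instances are open; TrackedResource and ReaderUsage are hypothetical names, and nothing here calls Lucene. With a real reader, the object obtained from getReader would take the place of TrackedResource.

```java
import java.io.Closeable;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical stand-in for a reader: counts how many instances are still open.
final class TrackedResource implements Closeable {
    static final AtomicInteger OPEN = new AtomicInteger();
    private boolean closed = false;

    TrackedResource() {
        OPEN.incrementAndGet();
    }

    @Override
    public void close() {
        if (!closed) {          // make close() safe to call more than once
            closed = true;
            OPEN.decrementAndGet();
        }
    }
}

final class ReaderUsage {
    private ReaderUsage() {}

    // Opens a resource, uses it, and always closes it, even if the work throws.
    static int useOnce() {
        TrackedResource r = new TrackedResource();
        try {
            return TrackedResource.OPEN.get(); // the "work": report the open count
        } finally {
            r.close();
        }
    }
}
```

If the open count keeps growing under load even with this pattern in place, the leak is more likely somewhere other than the calling code.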