Dear all,
I would like to extract feature vectors for each document that is relevant
to a query and write them out to a file.
Is there a way in Lucene where I can specify a parameter to do this?
Or, which part of the code deals with the feature vectors related to the
documents, so that I can modify that part?
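There is no single Lucene parameter for this, but if term vectors were stored at index time, the vectors for the hits of a query can be dumped by hand. A minimal sketch (Lucene 4.x), assuming a hypothetical "contents" field indexed with term vectors enabled and a hypothetical output file name; dir and query are placeholders:

import java.io.PrintWriter;
import org.apache.lucene.index.DirectoryReader;
import org.apache.lucene.index.IndexReader;
import org.apache.lucene.index.Terms;
import org.apache.lucene.index.TermsEnum;
import org.apache.lucene.search.IndexSearcher;
import org.apache.lucene.search.Query;
import org.apache.lucene.search.ScoreDoc;
import org.apache.lucene.search.TopDocs;
import org.apache.lucene.store.Directory;
import org.apache.lucene.util.BytesRef;

static void dumpFeatureVectors(Directory dir, Query query) throws Exception {
  IndexReader reader = DirectoryReader.open(dir);
  IndexSearcher searcher = new IndexSearcher(reader);
  TopDocs hits = searcher.search(query, 100);              // top 100 relevant docs
  PrintWriter out = new PrintWriter("feature-vectors.txt");
  for (ScoreDoc sd : hits.scoreDocs) {
    Terms vector = reader.getTermVector(sd.doc, "contents"); // null if no vector stored
    if (vector == null) continue;
    TermsEnum te = vector.iterator(null);                  // 4.x API: pass a reusable enum or null
    BytesRef term;
    while ((term = te.next()) != null) {
      // one line per (docID, term, frequency) triple
      out.println(sd.doc + "\t" + term.utf8ToString() + "\t" + te.totalTermFreq());
    }
  }
  out.close();
  reader.close();
}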
Strange. That's all I got from the log, besides the first line I wrote to
mark the start of merging with a timestamp.
On Sun, Apr 14, 2013 at 4:58 PM, Robert Muir wrote:
> Your stack trace is incomplete: it doesn't even show where the OOM
> occurred.
>
> On Sun, Apr 14, 2013 at 7:48 PM, Wei Wang wrote:
Your stack trace is incomplete: it doesn't even show where the OOM occurred.
On Sun, Apr 14, 2013 at 7:48 PM, Wei Wang wrote:
> Unfortunately, I got another problem. My index has 9 segments (9 dvdd
> files) with a total size of about 22GB. The merging step eventually failed
> and I saw an error message:
Unfortunately, I got another problem. My index has 9 segments (9 dvdd
files) with a total size of about 22GB. The merging step eventually failed
and I saw an error message:
Exception in thread "main" java.lang.IllegalStateException: this writer hit
an OutOfMemoryError; cannot complete forceMerge
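One hedged mitigation sketch, not from the thread: raise the JVM heap and merge down in stages instead of going from 9 segments straight to 1. The analyzer and the numbers are placeholders, and whether staging actually lowers peak memory depends on the DocValues format involved.

import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.index.IndexWriter;
import org.apache.lucene.index.IndexWriterConfig;
import org.apache.lucene.util.Version;

// Run the JVM with a larger heap, e.g. java -Xmx8g ...
IndexWriterConfig cfg =
    new IndexWriterConfig(Version.LUCENE_42, new StandardAnalyzer(Version.LUCENE_42));
cfg.setRAMBufferSizeMB(256);   // keep indexing buffers bounded while merging
IndexWriter iw = new IndexWriter(directory, cfg);
iw.forceMerge(3);              // intermediate step: down to at most 3 segments
iw.commit();
iw.forceMerge(1);              // final merge to a single segment
iw.close();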
That makes sense.
BTW, I checked the jar file. Exactly as you pointed out, the services files
only contain info from lucene-core, without the codecs from lucene-codecs.
After adding the Maven plugin, it is now running.
Thanks!
On Sun, Apr 14, 2013 at 3:26 PM, Uwe Schindler wrote:
> Hi,
>
> > Thanks for the hint. I will double check the jar file.
Hi,
> Thanks for the hint. I will double check the jar file.
>
> I am just a bit puzzled: if the indexing step recognizes the 'Disk' codec and
> creates the index properly, the merge step that immediately follows indexing
> should also recognize the 'Disk' codec.
This is easy to explain: By creating the index, the codec instance from your
IndexWriterConfig is used directly; when merging, the codec is looked up by
its name via SPI (META-INF/services), and that lookup fails if those files
are incomplete.
Thanks for the hint. I will double check the jar file.
I am just a bit puzzled: if the indexing step recognizes the 'Disk' codec
and creates the index properly, the merge step that immediately follows
indexing should also recognize the 'Disk' codec.
On Sun, Apr 14, 2013 at 3:03 PM, Uwe Schindler wrote:
Are you sure that you use the ServicesResourceTransformer in your shade config?
http://maven.apache.org/plugins/maven-shade-plugin/examples/resource-transformers.html#ServicesResourceTransformer
The problem is: lucene-core.jar and lucene-codecs.jar both contain codec
components, and their classes are listed in META-INF/services files with the
same names; combined naively, one file overwrites the other instead of being
merged.
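For context, this is the name-based lookup involved at merge/read time (a minimal illustration, not code from the thread):

import org.apache.lucene.codecs.Codec;
import org.apache.lucene.codecs.DocValuesFormat;

// Lucene resolves formats by *name* via SPI (META-INF/services), not via
// the instance passed to IndexWriterConfig at index time. If the shaded jar
// kept only lucene-core's services file, the first lookup below fails even
// though DiskDocValuesFormat is on the classpath.
DocValuesFormat disk = DocValuesFormat.forName("Disk");  // registered by lucene-codecs
Codec core = Codec.forName("Lucene42");                  // registered by lucene-core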
Yes, I used the Maven Shade plugin, but I still have this problem. Here is
the Maven output during packaging:
[INFO] --- maven-shade-plugin:2.0:shade (default) @
audience-profile-indexer ---
[INFO] Including commons-collections:commons-collections:jar:3.2.1 in the
shaded jar.
[INFO] Including org.mockit
If you create a single JAR file out of multiple Lucene JAR files, use a tool
like the Maven Shade plugin; otherwise, the required metadata properties
(META-INF/services) files in the JAR files are not correctly merged together.
-
Uwe Schindler
H.-H.-Meier-Allee 63, D-28213 Bremen
http://www.thetaphi.
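A hypothetical sanity check, not from the thread: run this with only the shaded jar on the classpath. If the services files were merged correctly, the printed set includes "Disk".

import org.apache.lucene.codecs.DocValuesFormat;

// Lists every DocValuesFormat name visible through SPI on the current classpath.
System.out.println(DocValuesFormat.availableDocValuesFormats());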
Hi Adrien,
The Lucene42Codec works well for generating the index with
DiskDocValuesFormat. But when I tried to merge the index segments by
calling:
IndexWriter iw = new IndexWriter(directory, iw_config);
...
iw.forceMerge(1);
I got the following error message:
Caused by: java.lang.IllegalArgumentException
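For reference, a minimal sketch of the kind of indexing setup the thread implies (Lucene 4.2): Lucene42Codec with DiskDocValuesFormat wired in per field. The analyzer and the blanket per-field routing are assumptions, not Wei's actual code.

import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.codecs.DocValuesFormat;
import org.apache.lucene.codecs.diskdv.DiskDocValuesFormat;
import org.apache.lucene.codecs.lucene42.Lucene42Codec;
import org.apache.lucene.index.IndexWriterConfig;
import org.apache.lucene.util.Version;

// Route every DocValues field to the on-disk format; everything else
// stays at the Lucene42 defaults.
final DocValuesFormat disk = new DiskDocValuesFormat();
IndexWriterConfig iw_config =
    new IndexWriterConfig(Version.LUCENE_42, new StandardAnalyzer(Version.LUCENE_42));
iw_config.setCodec(new Lucene42Codec() {
  @Override
  public DocValuesFormat getDocValuesFormatForField(String field) {
    return disk;   // needs lucene-codecs (and its SPI entry) at merge/read time
  }
});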
Hi community,
I am looking for a description or paper about the SmartChineseAnalyzer and
the JapaneseAnalyzer.
Does the SmartChineseAnalyzer use (Hierarchical?) Hidden Markov Models?
Does the JapaneseAnalyzer (Kuromoji) use Conditional Random Fields?
Thx. :)
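Whichever models they use internally, here is a small way to compare the two segmenters' output directly (Lucene 4.x); the field name and sample strings are arbitrary:

import java.io.StringReader;
import org.apache.lucene.analysis.Analyzer;
import org.apache.lucene.analysis.TokenStream;
import org.apache.lucene.analysis.cn.smart.SmartChineseAnalyzer;
import org.apache.lucene.analysis.ja.JapaneseAnalyzer;
import org.apache.lucene.analysis.tokenattributes.CharTermAttribute;
import org.apache.lucene.util.Version;

// Print one token per line, as segmented by the given analyzer.
static void printTokens(Analyzer analyzer, String text) throws Exception {
  TokenStream ts = analyzer.tokenStream("f", new StringReader(text));
  CharTermAttribute term = ts.addAttribute(CharTermAttribute.class);
  ts.reset();
  while (ts.incrementToken()) {
    System.out.println(term.toString());
  }
  ts.end();
  ts.close();
}

// printTokens(new SmartChineseAnalyzer(Version.LUCENE_42), "我是中国人");
// printTokens(new JapaneseAnalyzer(Version.LUCENE_42), "関西国際空港");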