Hello,
just a guess: have you tried escaping the space in your multi-word terms
with a backslash?
isoweek,iso\ week
Regards
Bernd
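A minimal sketch of the escaping idea above, assuming a synonyms line is split on commas and that a backslash makes the following character literal. The class and method names are hypothetical, and this is not Lucene's own parser; it only shows how "iso\ week" could survive as one multi-word entry.

```java
import java.util.ArrayList;
import java.util.List;

class SynonymLineSketch {
    // Splits a line on unescaped commas and removes the escaping backslashes,
    // so that "iso\ week" becomes the single entry "iso week".
    static List<String> parse(String line) {
        List<String> entries = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        for (int i = 0; i < line.length(); i++) {
            char c = line.charAt(i);
            if (c == '\\' && i + 1 < line.length()) {
                current.append(line.charAt(++i)); // keep the escaped character literally
            } else if (c == ',') {
                entries.add(current.toString().trim());
                current.setLength(0);
            } else {
                current.append(c);
            }
        }
        entries.add(current.toString().trim());
        return entries;
    }

    public static void main(String[] args) {
        // The Java literal "isoweek,iso\\ week" is the file content isoweek,iso\ week
        System.out.println(parse("isoweek,iso\\ week"));
    }
}
```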
On 14.03.22 at 15:54, Trevor Nicholls wrote:
I have technical data which I am querying with Lucene; one of the features
of the content is that a large number of te
While writing some tools to build and maintain lucene indexes I noticed
some strange behavior during testing.
A doc disappears from lucene index while using IndexWriter updateDocument.
The API of lucene 6.4.2 states:
"Updates a document by first deleting the document(s) containing term and
then ad
ongField, isn't it?
Regards
Bernd
On 25.10.2017 at 12:17, Alan Woodward wrote:
> Hi Bernd,
>
> You add a separate StoredField with the same name.
>
>> On 25 Oct 2017, at 11:11, Bernd Fehling
>> wrote:
>>
>> With Lucene 6.6.2 I'm tryin
With Lucene 6.6.2 I'm trying to get a LongPoint value indexed and stored.
Old code:
LegacyLongField dateField = new LegacyLongField("modified", lastModified,
Field.Store.YES);
Because LegacyLongField is deprecated I tried LongPoint.
New code:
LongPoint dateField = new LongPoint("modified", last
les (of the downloaded
> conda package and the one in the source package) and they are identical.
> So, I think the issue should be somewhere else, otherwise I would face the
> same error while trying with conda-forge. No?
>
> Amin
>
>
> On Tue, Oct 24, 2017 at 8:05 AM, Bern
Hi Amin,
PRIxMAX is a C format-specifier macro for the integer type uintmax_t.
It looks like a bug in jcc3.
The original code is:
sprintf(buffer, "%0*"PRIxMAX, (int) hexdig, hash);
It could be that a space between '"' and PRIxMAX is missing.
A quick fix for testing could be to either enter a spac
Now we have a SynonymGraphFilter but also need other filters
to be graph-aware. I was already thinking about a ShingleGraphFilter.
But if a ShingleGraphFilter outputs a graph and is located before
SynonymGraphFilter where is the advantage?
The SynonymGraphFilter cannot consume arbitrary graphs.
Do
, the API docs say "...Injecting synonyms – here,
synonyms of a token should be added after that token..."
But as I already mentioned the synonyms are added before the token.
Are the docs outdated?
Regards
Bernd
On 13.02.2017 at 17:31, Michael McCandless wrote:
> On Mon,
moving the SPF filters in your test? Or otherwise
> simplify your test so it's closer to what my test case is doing?
>
> Mike McCandless
>
> http://blog.mikemccandless.com
>
> On Mon, Feb 13, 2017 at 7:52 AM, Michael McCandless
> wrote:
>> Thanks Bernd; I'
If you use only
> whitespace tokenizer and SGF does the issue reproduce?
>
> Mike McCandless
>
> http://blog.mikemccandless.com
>
>
> On Fri, Feb 10, 2017 at 10:07 AM, Bernd Fehling
> wrote:
>> Example for position end and positionLength of SGF.
>>
>>
cCandless:
> On Thu, Feb 9, 2017 at 2:40 AM, Bernd Fehling
> wrote:
>> I tried SynonymGraphFilter with my setup and it works right away.
>> It paid off that I did some modifications on my filters while
>> testing 6.3 with my setup.
>
> Good!
>
>>
> SynonymGraphFilter will produce a correct graph (unlike SynonymFilter)
> and the Lucene query parsers (not sure about Solr's query parser fork)
> will correctly detect the graph and create the right query.
>
> Mike McCandless
>
> http://blog.mikemccandless.com
>
>
>
ep doing what you are doing today, you should switch
> to SynonymGraphFilter followed by FlattenGraphFilter: it will make the
> same tokens as the current SynonymFilter, but will necessarily be
> buggy in the multi-token case.
>
> Mike McCandless
>
> http://blog.mikemccandless.
I just tried Solr 6.4.1 and noticed that SynonymFilterFactory is
deprecated, as reported in the logs.
I hope that this is just to note that there is also an alternative
SynonymGraphFilterFactory now available.
And _not_ that SynonymFilterFactory will disappear, because it runs my
multi-word Synon
On 18.11.2016 at 08:58, Bernd Fehling wrote:
> Hi Mike,
>
> let me explain.
>
> First, after looking deeper inside I noticed that the Filters are used
> like a stack and called backwards. So the first incrementToken goes
> to the last filter in the chain. That one also us
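The "called backwards" observation can be illustrated with a small pull-based chain. This is a stdlib-only sketch with hypothetical names (Stage, Source, PassThrough); it is not Lucene's TokenStream API, it only mimics the idea that the consumer calls the last stage, which in turn pulls from the stage before it.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

class FilterChainSketch {
    // Records the order in which the stages are entered.
    static final List<String> calls = new ArrayList<>();

    interface Stage {
        String next(); // returns the next token, or null when exhausted
    }

    // Stands in for the tokenizer at the head of the chain.
    static class Source implements Stage {
        private final Iterator<String> tokens;

        Source(String... tokens) {
            this.tokens = Arrays.asList(tokens).iterator();
        }

        public String next() {
            calls.add("tokenizer");
            return tokens.hasNext() ? tokens.next() : null;
        }
    }

    // Stands in for a filter that simply pulls from its input.
    static class PassThrough implements Stage {
        private final String name;
        private final Stage input;

        PassThrough(String name, Stage input) {
            this.name = name;
            this.input = input;
        }

        public String next() {
            calls.add(name);
            return input.next();
        }
    }

    public static void main(String[] args) {
        Stage chain = new PassThrough("filter2",
                new PassThrough("filter1", new Source("wow", "that's", "funny")));
        System.out.println(chain.next()); // first token pulled through the chain
        System.out.println(calls);        // order in which the stages were entered
    }
}
```

The consumer only ever touches the last stage, so the last filter is entered first and the tokenizer last, even though the token itself flows from the tokenizer forward.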
ou know it
> spanned "wow", "that's", "funny".
>
> Mike McCandless
>
> http://blog.mikemccandless.com
>
>
> On Thu, Nov 17, 2016 at 10:22 AM, Bernd Fehling
> wrote:
>> Currently I'm tackling a problem with SynonymFilter wh
Currently I'm tackling a problem with SynonymFilter while going from 4.10.4 to
6.3.0.
For a special solution I need to know if a word (or multiword) is producing
synonyms in SynonymFilter.
Therefore I suggest the enhancement of "hasSynonyms" for SynonymFilter.
A workaround would be to buffer all
why. I think you should ask
>> on the solr-user list?
>>
>> Or maybe try to change your deletes to be by Term instead of Query?
>>
>> Mike McCandless
>>
>> http://blog.mikemccandless.com
>>
>> On Thu, Aug 4, 2016 at 7:03 AM, Bernd Fehling
While increasing the indexing load of version 5.5.3 I see
threads where one merging thread is blocking other merging threads.
But is this concurrent merging?
Bernd
"Lucene Merge Thread #6" - Thread t@40280
   java.lang.Thread.State: BLOCKED
at org.apache.lucene.index.IndexWriter.mergeMiddle(I
eted queries are when you delete by query, but I don't think DIH would
> be doing that unless you asked it to ... maybe a Solr user/dev knows better?
>
> Mike McCandless
>
> http://blog.mikemccandless.com
>
> On Fri, Jul 29, 2016 at 3:21 AM, Bernd Fehling <
> bern
Can you revert that and re-test?
>
> I'm not sure why DIH is using updateDocument instead of addDocument ...
> maybe ask on the solr-user list?
>
> Mike McCandless
>
> http://blog.mikemccandless.com
>
> On Thu, Jul 28, 2016 at 10:07 AM, Bernd Fehling <
> bernd.feh
g
> https://issues.apache.org/jira/browse/LUCENE-6161
>
> Have you changed any IndexWriterConfig settings from defaults?
>
> What are your unique id fields like? How many bytes in length?
>
> Mike McCandless
>
> http://blog.mikemccandless.com
>
> On Thu, Jul 28,
While trying to get higher performance for indexing it turned out that
BufferedUpdateStreams is breaking indexing performance.
public synchronized ApplyDeletesResult applyDeletesAndUpdates(...)
At IndexWriterConfig I have setRAMBufferSizeMB=1024 and the Lucene 4.10.4 API
states:
"Determines the a
Hi list,
can anyone give some hints about removing duplicates from a MultiPhraseQuery?
I have the list with:
List termarray = ((MultiPhraseQuery) myquery).getTermArrays();
But the lucene javadocs have only add, no remove or delete.
Only idea so far is to build a temporary MultiPhraseQuery and it
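A rough sketch of the "temporary copy" idea, assuming "duplicates" means identical term arrays appearing more than once in the list. To stay free of library dependencies it uses String[] in place of Lucene's Term[]; the class and method names are hypothetical. The surviving arrays could then be passed to add on a fresh MultiPhraseQuery.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

class TermArrayDedupSketch {
    // Keeps the first occurrence of each term array and preserves the order.
    // Arrays are compared by content via Arrays.asList, because arrays
    // themselves only compare by identity.
    static List<String[]> dedupe(List<String[]> termArrays) {
        Set<List<String>> seen = new LinkedHashSet<>();
        List<String[]> result = new ArrayList<>();
        for (String[] terms : termArrays) {
            if (seen.add(Arrays.asList(terms))) {
                result.add(terms);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        List<String[]> input = new ArrayList<>();
        input.add(new String[] {"quick", "fast"});
        input.add(new String[] {"fox"});
        input.add(new String[] {"quick", "fast"}); // duplicate of the first entry
        System.out.println(dedupe(input).size());
    }
}
```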
This question might be stupid, but why are there different type attributes?
We have , , , ... but also "word", "shingle", ...
Why not , , ...???
Is there a deeper logic behind this, or has it just grown historically and not yet
been unified?
Regards
Bernd
--
Hi Tom,
I just see that you have Linux with 2.6 kernel.
Have you already enabled -XX:+UseLargePages as a performance option?
Solaris 9 has it on by default, but on Linux HugePages must be enabled.
http://www.oracle.com/technetwork/java/javase/tech/largememory-jsp-137182.html
Just an id
Hi list,
a stupid question about the naming of the index files.
While using lucene (and solr) 4.2 I still see files with "Lucene41" in the name.
This is somewhat confusing if lucene 4.x produces files with "Lucene4y".
Does this also mean that indexes built with 4.2 or 4.3 are fully compatible with 4.1?
R
/4123628/com-sun-jdi-invocationexception-occurred-invoking-method
Regards
Bernd
On 14.11.2012 14:19, Robert Muir wrote:
> On Wed, Nov 14, 2012 at 4:04 AM, Bernd Fehling
> wrote:
>> Hi list,
>> while walking through the code with debugger (eclipse juno) I get
-Allee 63, D-28213 Bremen
> http://www.thetaphi.de
> eMail: u...@thetaphi.de
>
>
>> -Original Message-
>> From: Bernd Fehling [mailto:bernd.fehl...@uni-bielefeld.de]
>> Sent: Wednesday, November 14, 2012 1:18 PM
>> To: java-user@lucene.apache.org
>> Su
While inspecting the content of topDocs.ScoreDoc I see 4 variables:
- doc
- fields
- score
- shardIndex
But ScoreDoc knows only about 3 (doc, score, shardIndex) is this the problem?
Regards
Bernd
On 14.11.2012 13:04, Bernd Fehling wrote:
> Hi list,
> while walking through the cod
Hi list,
while walking through the code with debugger (eclipse juno) I get the following:
com.sun.jdi.InvocationException occurred invoking method.
This is while trying to see org.apache.lucene.search.ScoreDoc
So the debugger seems to have a problem with the toString() of ScoreDoc.java
which looks
> easy custom filter to create though
>>
>> FWIW,
>> Erick
>>
>>
>> On Tue, Nov 13, 2012 at 7:02 AM, Robert Muir wrote:
>>
>>> On Mon, Nov 12, 2012 at 10:47 PM, Bernd Fehling
>>> wrote:
>>>> By the way, why does Tri
what I want.
>
> Found in a fortune cookie according to legend:
> "A programmer had a problem. He solved it with regular expressions. Now he
> has two problems".
>
>
>
>
> On Mon, Nov 12, 2012 at 9:04 AM, Bernd Fehling <
> bernd.fehl...@uni-bielefeld.de> wr
moen, Ingar ; Hauklien, Øystein ; Hedalen, Trond ; Kvam, Erik" -->
"brennmoeningarhauk"
Now this explains the sorting (shit in --> shit out).
But why is the first string reduced to "a", wrong regular expression?
Bernd
Am 12.11.2012 14:51, schrieb Bernd Feh
box and you should see which of the
> steps does the translation. Although changing it to "a" is really weird,
> it's almost certainly something you've defined in the indexing analysis
> chain.
>
> FWIW,
> Erick
>
>
> On Mon, Nov 12, 2012 at 8:19 A
Hi list,
a user reported wrong sorting of our search service running on solr.
While chasing this issue I traced it back through lucene into the index.
I have a text field for sorting
(stored,indexed,tokenized,omitNorms,sortMissingLast)
and three docs with author names.
If I trace at org.apache.lu
post from docs or create a copy of the
> page inside lucene's distribution!
>
> Uwe
>
> -
> Uwe Schindler
> H.-H.-Meier-Allee 63, D-28213 Bremen
> http://www.thetaphi.de
> eMail: u...@thetaphi.de
>
>
>> -Original Message-
>> Fr
se-lucenes-mmapdirectory-on-64bit.html
>
> -
> Uwe Schindler
> H.-H.-Meier-Allee 63, D-28213 Bremen
> http://www.thetaphi.de
> eMail: u...@thetaphi.de
>
>
>> -Original Message-
>> From: Bernd Fehling [mailto:bernd.fehl...@uni-bielefeld.de]
>
I'm trying to run CheckIndex as a separate tool on a large index to get
useful information about the number of terms, number of tokens, ... but I always get an OOM
exception.
I already have JAVA_OPTS -d64 -Xmx25g -Xms25g -Xmn6g
Any idea how to use CheckIndex on a huge index?
Opening index @ /srv/www/solr/sol
LUCENE-4237 - add ant task to generate optionally ALL javadocs
https://issues.apache.org/jira/browse/LUCENE-4237
On 19.07.2012 07:59, Robert Muir wrote:
> On Thu, Jul 19, 2012 at 1:53 AM, Bernd Fehling
> wrote:
>> ...
>> Robert Muir added a comment - 12/Apr/12 16:24
>
...
Robert Muir added a comment - 12/Apr/12 16:24
We can save 10MB with this patch, which nukes the 'index'.
I guarantee you nobody will miss it. Just click this thing and see how
useless it is (since its every method etc in all of lucene).
...
Yeah, "nobody will miss it" and "see how useless it i
Dear developers,
while upgrading from 3.6.x to 4.x I have to rewrite some of my code and
search for the new methods and/or classes. In 3.6.x and older versions
the API Javadoc interface had an "Index" which made it easy to find the
appropriate methods. The button to call the "Index" was located in
Dear list,
I need query-switched filters (to turn filters on and off by query
parameter).
I've already sent my idea to the solr list and asked for opinions, but got no
complaints from there.
http://lucene.472066.n3.nabble.com/skipping-parts-of-query-analysis-for-some-queries-td3382239.h
Hi Vineet,
nice site and documentation, but what is the sense of "sign-up" and "login"?
Regards
Bernd
On 30.08.2011 22:28, Vineet Sinha wrote:
Hey guys,
We have been working hard on building a helpful site for Lucene Architecture
and Documentation. We have been updating the content and work
. You could try
attaching to the Solr instance with jConsole and use that to trigger
garbage collections to see what that could tell you...
Best
Erick
On Tue, Jun 21, 2011 at 8:39 AM, Bernd Fehling
wrote:
Currently I'm using version 3.2.
I already used 4.x some months ago but there was to
c).
Best
Erick
On Tue, Jun 21, 2011 at 5:32 AM, Bernd Fehling
wrote:
I'm trying to understand the logic of/behind fieldCache.
Who has written this piece of code or has good knowledge about it?
Why is it under the hood of jetty?
I see FieldCache$StringIndex with
- f_dccollection
- f_d
I'm trying to understand the logic of/behind fieldCache.
Who has written this piece of code or has good knowledge about it?
Why is it under the hood of jetty?
I see FieldCache$StringIndex with
- f_dccollection
- f_dcyear
- f_dctype
but also
- dctitle --> f_dctitle --> f_dccreator
- title --> f_
punctuation, and try again
3) rewrite the query, quoting all punctuation, and try again
would that work for you?
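One possible reading of the "quoting all punctuation" step, sketched under the assumption that quoting means backslash-escaping every character that is neither a letter, a digit, nor whitespace. The class and method names are hypothetical. Lucene's QueryParser also offers a static escape method for its own special characters; the stdlib-only version below just illustrates the retry idea.

```java
class QueryEscapeSketch {
    // Puts a backslash in front of every code point that is not a letter,
    // digit or whitespace. Iterating by code point keeps characters outside
    // the BMP (surrogate pairs) intact.
    static String escapePunctuation(String query) {
        StringBuilder sb = new StringBuilder();
        query.codePoints().forEach(cp -> {
            if (!Character.isLetterOrDigit(cp) && !Character.isWhitespace(cp)) {
                sb.append('\\');
            }
            sb.appendCodePoint(cp);
        });
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(escapePunctuation("title:(foo+bar)"));
    }
}
```

A validator could try the raw query first and fall back to the escaped form only if parsing fails, so that users who know the query syntax keep its full power.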
On 5/5/2011 3:26 AM, Bernd Fehling wrote:
Dear list,
I need a QueryValidator and don't mind writing one but don't want
to reinvent the wheel in case there is already some
Dear list,
I need a QueryValidator and don't mind writing one but don't want
to reinvent the wheel in case there is already something.
Is this the right list for talking about a QueryValidator or
should it belong to SOLR?
What do I mean by a QueryValidator?
I am thinking of something like valida
o removing replicateAfter startup removes the write.lock when starting
with an optimized index and replication on a master.
To solve this tiny issue I would recommend also sending an optimize
after sending a commit if the index has state optimize=true.
Bernd
On 03.05.2011 09:22, Bernd Fehling wrote
be somewhere around the DeletionPolicy...
Regards, Bernd
On 02.05.2011 17:45, Michael McCandless wrote:
On Mon, May 2, 2011 at 9:17 AM, Bernd Fehling
wrote:
Dear list,
some questions about the index.
(questions go to the lucene list because it is more about the index itself)
First my
Dear list,
some questions about the index.
(questions go to the lucene list because it is more about the index itself)
First my results from CheckIndex:
Segments file=segments_l6 numSegments=1 version=FORMAT_3_1 [Lucene 3.1]
Checking only these segments: _79s:
1 of 1: name=_79s docCount=28146
rt the bugs in Jetty to Jetty itself
>
> Thanks,
> Uwe!
>
> -
> Uwe Schindler
> H.-H.-Meier-Allee 63, D-28213 Bremen
> http://www.thetaphi.de
> eMail: u...@thetaphi.de
>
> > -Original Message-
> > From: Bernd Fehling [mailto:bernd.fehl...@uni-b
Hi Robert,
thanks to you and Yonik for looking into this.
As soon as Apache jira is back online I will try your jetty version
and give feedback.
Regards,
Bernd
> On Fri, Feb 25, 2011 at 9:09 AM, Bernd Fehling
> wrote:
> > Hi Yonik,
> >
> > good point, yes we are using J
:09 AM, Bernd Fehling
> wrote:
>> Hi Yonik,
>>
>> good point, yes we are using Jetty.
>> Do you know if Tomcat has this limitation?
>
> Tomcat's defaults are worse - you need to configure it to use UTF-8 by
> default for URLs.
> Once you do, it passes
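Why the container's URL charset matters can be shown with the JDK alone. This is a sketch, not Jetty or Tomcat code: it percent-encodes one character above the BMP as UTF-8 and then decodes it once as UTF-8 and once as ISO-8859-1, which is an assumed stand-in for a container that does not decode URLs as UTF-8. The class and helper names are hypothetical.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;
import java.net.URLEncoder;

class UrlCharsetSketch {
    static String encode(String s) {
        try {
            return URLEncoder.encode(s, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException(e); // UTF-8 is always available
        }
    }

    static String decode(String s, String charset) {
        try {
            return URLDecoder.decode(s, charset);
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // U+1D11E (MUSICAL SYMBOL G CLEF) lies above the BMP.
        String clef = new String(Character.toChars(0x1D11E));
        String encoded = encode(clef);
        System.out.println(encoded);
        System.out.println(decode(encoded, "UTF-8").equals(clef));      // round trip works
        System.out.println(decode(encoded, "ISO-8859-1").equals(clef)); // wrong charset breaks it
    }
}
```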
Hi Yonik,
good point, yes we are using Jetty.
Do you know if Tomcat has this limitation?
Regards,
Bernd
On 25.02.2011 14:54, Yonik Seeley wrote:
> On Fri, Feb 25, 2011 at 8:48 AM, Bernd Fehling
> wrote:
>> So Solr trunk should already handle Unicode above BMP for field
3 Bremen
> http://www.thetaphi.de
> eMail: u...@thetaphi.de
>
>> -Original Message-
>> From: Bernd Fehling [mailto:bernd.fehl...@uni-bielefeld.de]
>> Sent: Friday, February 25, 2011 2:19 PM
>> To: simon.willna...@gmail.com
>> Cc: java-user@lucene.apache.org
>
utf-8 code.
Regards,
Bernd
On 25.02.2011 13:43, Simon Willnauer wrote:
> On Fri, Feb 25, 2011 at 1:02 PM, Bernd Fehling
> wrote:
>> Hi Simon,
>>
>> thanks for the details.
>>
>> My platform supports and uses code above BMP (0x10000 and up).
>> So t
ll be available?
Regards,
Bernd
On 25.02.2011 12:04, Simon Willnauer wrote:
> Hey Bernd,
>
> On Fri, Feb 25, 2011 at 11:23 AM, Bernd Fehling
> wrote:
>> Dear list,
>>
>> a very basic question about lucene, which version of
>> unicode can be handled (indexed and s
Dear list,
a very basic question about lucene, which version of
unicode can be handled (indexed and searched) with lucene?
It looks like lucene can only handle the very old Unicode 2.0
but not the newer 3.1 version (4 byte utf-8 unicode).
Is that true?
Regards,
Bernd
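For readers unsure what "4 byte utf-8 unicode" refers to, here is a small JDK-only sketch (the class and method names are hypothetical). It shows that a code point above the BMP takes two UTF-16 code units in a Java String and four bytes in UTF-8, while BMP characters take one to three bytes. It says nothing about what a particular Lucene version does with such characters.

```java
import java.nio.charset.StandardCharsets;

class SupplementaryCharSketch {
    // Number of bytes the given code point occupies when encoded as UTF-8.
    static int utf8Length(int codePoint) {
        String s = new String(Character.toChars(codePoint));
        return s.getBytes(StandardCharsets.UTF_8).length;
    }

    // Number of UTF-16 code units (Java chars) needed for the code point.
    static int utf16Units(int codePoint) {
        return Character.toChars(codePoint).length;
    }

    public static void main(String[] args) {
        int clef = 0x1D11E; // MUSICAL SYMBOL G CLEF, above the BMP
        System.out.println(utf16Units(clef)); // surrogate pair
        System.out.println(utf8Length(clef)); // four-byte UTF-8 sequence
        System.out.println(utf8Length(0x20AC)); // EURO SIGN, inside the BMP
    }
}
```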
--
Hi Simon,
thanks a lot for your good explanation.
Best wishes,
Bernd
On 03.01.2011 13:51, Simon Willnauer wrote:
> Hey Bernd,
>
> On Mon, Jan 3, 2011 at 1:35 PM, Bernd Fehling
> wrote:
>> Dear list,
>>
>> some questions about the names of the index files.
>
Dear list,
some questions about the names of the index files.
With an older Lucene/Solr 4.x version from trunk my index looks like:
_2t1.fdt
_2t1.fdx
_2t1.fnm
_2t1.frq
_2t1.nrm
_2t1.prx
_2t1.tii
_2t1.tis
segments_2
segments.gen
With a most recent version from trunk it looks like:
_3a9.fdt
_3a9.fd
t any result!
>
> I'd suggest to read a book about Lucene/Solr first :-)
>
> Uwe
>
> -
> Uwe Schindler
> H.-H.-Meier-Allee 63, D-28213 Bremen
> http://www.thetaphi.de
> eMail: u...@thetaphi.de
>
>> -Original Message-
>> From: Bernd
ng the encoded value,
> just don't
> store it.
> You can still search on the encoded value if in that case
>
> Which is a way of saying that I don't know, off the top of my
> head, how
> you'd
> index one thing and store the result of analysis...
>
>
y",
> your
> display would be something like "run on empti".
>
> And if you're doing pure lucene, you can see this by enumerating the terms
> in your
> dcdocid field.
>
> Best
> Erick
>
> On Fri, Nov 26, 2010 at 2:10 AM, Bernd Fehling <
>
tokens go of course through your analyzer and the returned tokens
> are indexed as terms. Where is the problem?
>
> -
> Uwe Schindler
> H.-H.-Meier-Allee 63, D-28213 Bremen
> http://www.thetaphi.de
> eMail: u...@thetaphi.de
>
>
>> -Original Message--
irst, I'd be sure the value in question is in the document just before
> sending it to be added to your index to see if the value you think
> is in there really is. Something like Document.get() and see if
>
> Best
> Erick
>
> On Thu, Nov 25, 2010 at 8:08 AM, Bernd Feh
I used KeywordAnalyzer and KeywordTokenizer as templates for
a new analyzer.
The analyzer works fine but the result never reaches the index.
My analyzer is called in "DocInverterPerField.processFields"
with "stream.incrementToken()".
...
try {
boolean hasMoreTokens = stream.incrementToken();