RE: Performance decrease with NRT use-case in 8.8.x (coming from 8.3.0)

2021-05-19 Thread Gietzen, Markus
Hi again, I found the difference causing the slow down. It's NRTCachingDirectory#doCacheWrite method. With the implementation of 8.8 it's slow. With the version of 8.3 it's fast. Hope it helps, Markus -Original Message- From: Gietzen, Markus Sent: Wednesday, 19

RE: Performance decrease with NRT use-case in 8.8.x (coming from 8.3.0)

2021-05-19 Thread Gietzen, Markus
fine. Now 8.8 performs as fast as 8.3! I will check the differences and put them in step by step to find out which change causes the slow-down. I’ll report here. Bye, Markus From: Michael McCandless Sent: Wednesday, 19 May 2021 13:39 To: Lucene Users ; Gietzen, Markus Subject: Re

Performance decrease with NRT use-case in 8.8.x (coming from 8.3.0)

2021-05-19 Thread Gietzen, Markus
WindowsNativeDispatcher.CreateFile0 At the end of the mail I added two example stack traces that show this behavior. Does anyone have an idea what change might cause this, or whether I need to do something different in 8.8 compared to 8.3? Thanks for any help, Markus Here is an example stacktrace that is causing such a try

RE: umlauts / diacritic expansion

2019-04-16 Thread Markus Jelsma
Hello Michael, For the case of normalizing ü to ue, take a look at the german normalizer [1]. Regards, Markus [1] https://lucene.apache.org/core/7_6_0/analyzers-common/org/apache/lucene/analysis/de/GermanNormalizationFilter.html -Original message- > From:Ralf Heyde >
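A standalone sketch of the umlaut expansion discussed in this thread, for readers who want to see the idea without pulling in Lucene. It is not the implementation of GermanNormalizationFilter, whose exact rules are described in the linked Javadoc; the class name UmlautExpander and its mapping table are assumptions made for illustration only.

```java
import java.util.Map;

/** Toy umlaut expander (hypothetical helper, NOT Lucene's GermanNormalizationFilter). */
final class UmlautExpander {
    // Assumed mapping: each umlaut becomes its two-letter transcription.
    // Unicode escapes keep the source file pure ASCII.
    private static final Map<Character, String> EXPANSIONS = Map.of(
        '\u00e4', "ae",  // a-umlaut
        '\u00f6', "oe",  // o-umlaut
        '\u00fc', "ue",  // u-umlaut
        '\u00c4', "Ae",  // capital a-umlaut
        '\u00d6', "Oe",  // capital o-umlaut
        '\u00dc', "Ue",  // capital u-umlaut
        '\u00df', "ss"); // sharp s

    /** Returns the input with every mapped character replaced by its expansion. */
    static String expand(String input) {
        StringBuilder out = new StringBuilder(input.length() + 4);
        for (int i = 0; i < input.length(); i++) {
            char c = input.charAt(i);
            String replacement = EXPANSIONS.get(c);
            if (replacement != null) {
                out.append(replacement);
            } else {
                out.append(c);
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(expand("M\u00fcller")); // prints "Mueller"
        System.out.println(expand("Stra\u00dfe")); // prints "Strasse"
    }
}
```

Applying the same expansion at index time and at query time would make both spellings meet on one form; in a real analysis chain this belongs in a token filter rather than in ad hoc string code.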

RE: 8.0.0 ClassCastException in ValueSource

2019-03-27 Thread Markus Jelsma
Hello Adrian, I opened LUCENE-8741 ClassCastException in ValueSource$ScoreAndDoc. Thanks, Markus https://issues.apache.org/jira/browse/LUCENE-8741 -Original message- > From:Adrien Grand > Sent: Tuesday 26th March 2019 18:58 > To: Lucene Users Mailing List > Subjec

8.0.0 ClassCastException in ValueSource

2019-03-20 Thread Markus Jelsma
his a known issue? Thanks! Markus - To unsubscribe, e-mail: java-user-unsubscr...@lucene.apache.org For additional commands, e-mail: java-user-h...@lucene.apache.org

RE: Query-of-Death Lucene/Solr 7.6

2019-02-08 Thread Markus Jelsma
Please let me know. Thanks, Markus -Original message- > From:Markus Jelsma > Sent: Friday 8th February 2019 11:08 > To: java-user@lucene.apache.org > Subject: Query-of-Death Lucene/Solr 7.6 > > Hello, > > While working on SOLR-12743, using 7.6 on two nodes

Query-of-Death Lucene/Solr 7.6

2019-02-08 Thread Markus Jelsma
produces just a 9 MB toString() for the query. I could not find anything like this in Jira. I did think of LUCENE-8479 and LUCENE-8531 but they were about graphs, this problem looked related though. Existing issue? New bug? Many thanks, Markus ps. in Solr i even got an

RE: An example for creating SynonymMap Object?

2018-10-15 Thread Markus Jelsma
Hello Baris, The expand parameter defaults to true, so you should not have to add both rules. If you are using Solr, you can easily check it in the analysis tab. If not, printing the resulting Query object works as well. Regards, Markus -Original message- > From:baris

RE: An example for creating SynonymMap Object?

2018-10-15 Thread Markus Jelsma
Hello Baris, Check out the filter factory and the map parser for a more low level example: https://github.com/apache/lucene-solr/blob/master/lucene/analysis/common/src/java/org/apache/lucene/analysis/synonym/SynonymGraphFilterFactory.java https://github.com/apache/lucene-solr/blob/master/lucene/a

RE: Lucene same search result for worlds with and without spaces

2018-06-20 Thread Markus Jelsma
larissues" with "similar issues" (and vice versa) you might want to check out DictionaryCompoundWordTokenFilter and/or HyphenationCompoundWordTokenFilter. Although English hardly uses compound words, the token filters still do their job quite nicely. Regards, Markus -Origin

RE: Rewrite SynonymQuery to support payloads

2018-05-24 Thread Markus Jelsma
Query could i borrow, and what not? Or, if there is a better way, should i instead try to add payload support to an extended SynonymQuery, would that be easier? And how should i do that? What would be the best to tackle this issue? Many thanks, Markus -Original message- > From:Al

Rewrite SynonymQuery to support payloads

2018-05-23 Thread Markus Jelsma
would also cause both clauses to score if they match. So, how can i transform a SynonymQuery into something that i can wrap into PayloadScoreQuery on Lucene/Solr 7.x? Many thanks, Markus

Multiple languages, boosting and, stemming and KeywordRepeat

2018-05-14 Thread Markus Jelsma
ere any real solutions to this problem? Removing the RemoveDuplicates filter looks really silly. Many thanks! Markus

RE: German decompounding/tokenization with Lucene?

2017-09-16 Thread Markus Jelsma
Sorry, i would if i were on Github, but i am not. Thanks again! Markus -Original message- > From:Uwe Schindler > Sent: Saturday 16th September 2017 12:45 > To: java-user@lucene.apache.org > Subject: RE: German decompounding/tokenization with Lucene? > > Send a pull re

RE: German decompounding/tokenization with Lucene?

2017-09-16 Thread Markus Jelsma
Hello Uwe, Thanks for getting rid of the compounds. The dictionary can be smaller, it still has about 1500 duplicates. It is also unsorted. Regards, Markus -Original message- > From:Uwe Schindler > Sent: Saturday 16th September 2017 12:16 > To: java-user@lucene.apache.org

RE: Using POS payloads for chunking

2017-06-14 Thread Markus Jelsma
which CharFilter provides. But that won't allow you to set TypeAttribute. Perhaps i am missing something completely and am stupid, probably :) Thanks, Markus -Original message- > From:Tommaso Teofili > Sent: Wednesday 14th June 2017 23:49 > To: java-user@lucene.apache.org

RE: Using POS payloads for chunking

2017-06-14 Thread Markus Jelsma
Hello Erick, no worries, i recognize you two. I will take a look at your references tomorrow. Although i am still fine with eight bits, i cannot spare any more but one. If Lucene allows us to pass longer bitsets to the BytesRef, it would be awesome and easy to encode. Thanks! Markus

RE: Using POS payloads for chunking

2017-06-14 Thread Markus Jelsma
mited to 8 bits. Although we can easily fit our reduced treebank in there, we also use single bits to signal for compound/subword, and stemmed/unstemmed and some others. Hope this helps. Regards, Markus -Original message- > From:Erik Hatcher > Sent: Wednesday 14th June 2017
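The bit budget described in this message (a POS tag plus single-bit flags inside one 8-bit payload) can be illustrated with plain bit masks. The layout below, five bits for the tag id and three flag bits, is an assumption for illustration; the thread does not state the actual layout, and the class and constant names are hypothetical.

```java
/** Sketch of packing a POS tag id and boolean flags into one payload byte (assumed layout). */
final class PosPayload {
    // Assumed layout: bits 0-4 hold the tag id (0..31), bits 5-7 hold flags.
    static final int TAG_MASK = 0x1F;
    static final int FLAG_SUBWORD = 1 << 5; // token is a compound part / subword
    static final int FLAG_STEMMED = 1 << 6; // token was stemmed
    static final int FLAG_SPARE = 1 << 7;   // the single bit left to spare
    private static final int ALL_FLAGS = FLAG_SUBWORD | FLAG_STEMMED | FLAG_SPARE;

    /** Combines a tag id and a set of flags into a single byte. */
    static byte encode(int tagId, int flags) {
        if (tagId < 0 || tagId > TAG_MASK) {
            throw new IllegalArgumentException("tag id does not fit in 5 bits: " + tagId);
        }
        if ((flags & ~ALL_FLAGS) != 0) {
            throw new IllegalArgumentException("unknown flag bits: " + flags);
        }
        return (byte) (tagId | flags);
    }

    /** Extracts the tag id from the low five bits. */
    static int tagId(byte payload) {
        return payload & TAG_MASK;
    }

    /** Tests a single flag; the int promotion of the byte keeps the low 8 bits intact. */
    static boolean hasFlag(byte payload, int flag) {
        return (payload & flag) != 0;
    }

    public static void main(String[] args) {
        byte payload = encode(12, FLAG_STEMMED | FLAG_SPARE);
        System.out.println(tagId(payload));                 // prints 12
        System.out.println(hasFlag(payload, FLAG_STEMMED)); // prints true
        System.out.println(hasFlag(payload, FLAG_SUBWORD)); // prints false
    }
}
```

The resulting byte could then be stored as a one-byte token payload; a larger tag set would need either fewer flags or a second byte.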

RE: Using POS payloads for chunking

2017-06-14 Thread Markus Jelsma
use spans and phrase queries to find chunks of multiple POS-tags. This would be the first approach i can think of. Treating them as regular tokens enables you to use regular search for them. Regards, Markus -Original message- > From:José Tomás Atria > Sent: Wednesday 14t

RE: Term no longer matches if PositionLengthAttr is set to two

2017-05-04 Thread Markus Jelsma
Ok, we decided not to implement PositionLengthAttribute for now: either it is badly applied (how could one even misapply that attribute?) or Solr's QueryBuilder has a weird way of dealing with it, or.. well. Thanks, Markus -Original message- > From:Markus Jelsma > S

RE: Term no longer matches if PositionLengthAttr is set to two

2017-05-01 Thread Markus Jelsma
Hello again, apologies for cross-posting and having to get back to this unsolved problem. Initially i thought this was a problem i have with, or in, Lucene. Maybe not, so is this a problem in Solr? Is there anyone here who has seen this problem before? Many thanks, Markus -Original message

Term no longer matches if PositionLengthAttr is set to two

2017-04-25 Thread Markus Jelsma
query time seems to be a problem. Any thoughts on this issue? Is it a bug? Do i not understand PositionLengthAttribute? Why does it affect term/document matching? At query time but not at index time? Many thanks, Markus

RE: Lucene

2017-02-08 Thread Markus Jelsma
official, second is old but maybe still relevant. Please note this is usually not to be used in production. Regards, Markus -Original message- > From:Anthony Van > Sent: Wednesday 8th February 2017 22:51 > To: java-user@lucene.apache.org > Subject: Lucene > > Good

RE: question

2017-01-16 Thread Markus Jelsma
Yes, they should be the same unless the field is indexed with shingles, in that case order matters. Markus -Original message- > From:Julius Kravjar > Sent: Monday 16th January 2017 18:20 > To: java-user@lucene.apache.org > Subject: question > > May I have one que

Offset bug in WordDelimiterFilter?

2016-12-06 Thread Markus Jelsma
ays the length of the original term. So if a user queries for a singular term, the whole plural (original) is highlighted. Am i missing something? Bug? Thanks, Markus

Range query on date field

2016-11-24 Thread Markus Jelsma
s T and Z are somehow lowercased by the query parser. I feel incredibly stupid so many thanks in advance! Markus

RE: Upgrade 6.2.x Char* API's

2016-09-21 Thread Markus Jelsma
does? What has the CharacterUtils class been renamed to? Is it still usable for extending parties? Thanks, Markus -Original message- > From:Uwe Schindler > Sent: Wednesday 21st September 2016 13:30 > To: java-user@lucene.apache.org > Subject: RE: Upgrade 6.2

Upgrade 6.2.x Char* API's

2016-09-21 Thread Markus Jelsma
.util Is there a Jira I have missed? Many thanks, Markus

RE: LowerCaseFilter gone in 6.2.0

2016-08-31 Thread Markus Jelsma
Thanks for pointing to that issue. It also explains other errors. Markus -Original message- > From:Uwe Schindler > Sent: Wednesday 31st August 2016 11:32 > To: java-user@lucene.apache.org > Cc: 'Michael McCandless' > Subject: RE: LowerCaseFilter gone in 6.2

LowerCaseFilter gone in 6.2.0

2016-08-31 Thread Markus Jelsma
Hello - i'm upgrading a project that uses Lucene to 6.2.0 and get the compile error that LowerCaseFilter does not exist. And, so it seems, the JavaDoc is gone too. I've checked CHANGES.txt and there is no mention of it, not even in the API changes section. Any ideas? Thanks, Mar

RE: BlendedTermQuery causing negative IDF?

2016-04-19 Thread Markus Jelsma
doesn't exceed docCount. I'd like to try DFISimilarity and ClassicSimilarity as well, but for some reason the unit tests do not accept the similarity defined in the test's schema.xml?! Thanks! Markus -Original message- > From:Ahmet Arslan > Sent: Tuesday

BlendedTermQuery causing negative IDF?

2016-04-19 Thread Markus Jelsma
uted from: 1.0 = termFreq=1.0 1.2 = parameter k1 0.75 = parameter b 2.0 = avgFieldLength 2.56 = fieldLength What am i doing wrong? Or did i catch a bug? Thanks, Markus

RE: Problem with porter stemming

2016-03-14 Thread Markus Jelsma
Hi - if you don't want specific words passed through a stemmer, you need to supply a CharArraySet with exclusions as the second argument to its constructor. Markus -Original message- > From:Dwaipayan Roy > Sent: Monday 14th March 2016 15:31 > To: java-user@lucene.apache

RE: Jira issue for possibly transient resource issue, or a Lucene or JVM bug?

2016-01-21 Thread Markus Jelsma
Thanks, i missed that! Glad it's already resolved. Markus -Original message- > From:Ishan Chattopadhyaya > Sent: Thursday 21st January 2016 12:01 > To: java-user@lucene.apache.org > Subject: Re: Jira issue for possibly transient resource issue, or a Lucene or > JVM b

Jira issue for possibly transient resource issue, or a Lucene or JVM bug?

2016-01-21 Thread Markus Jelsma
Hi - we get the above issue as well sometimes. I've noticed Lucene-dev mails on this issue [1] but i couldn't find a corresponding Jira issue. Any pointer to that one? Many thanks, Markus [1] http://mail-archives.apache.org/mod_mbox/lucene-dev/201601.mbox/%3CCAPsWd+OWZpRLXCyX

RE: propagate Query.rewrite call to super.rewrite after 5.4 upgrade

2015-12-17 Thread Markus Jelsma
ition()? Is rewrite going to be called at some point where i can return a new Query object with decreased boost? Thanks, Markus -Original message- > From:Adrien Grand > Sent: Thursday 17th December 2015 14:40 > To: solr-user ; java-user@lucene.apache.org > Subject: Re: propag

propagate Query.rewrite call to super.rewrite after 5.4 upgrade

2015-12-17 Thread Markus Jelsma
unit test at the point i want to retrieve docs and assert their positions in the result set: ScoreDoc[] docs = searcher.search(spanfirstquery, 10).scoreDocs; I am probably missing something but any ideas to share? Many thanks! Markus

Re: Lucene Query to String

2015-11-10 Thread Markus Boese
Kind regards, Markus Boese > Hi Markus, what is the logic behind your query parser? How the query is expected to be rewritten ? I've never seen that kind of rewritten query, but if you tell us what you are expecting to rewrite, maybe would be easier to hel

Lucene Query to String

2015-11-10 Thread Markus Boese
abcd[ , 1] +f:1' Could anyone explain what lucene wants to tell me with '[ ,1]' ? I know lucene supports range queries but those contain something like this '[1 TO 4]', thus no comma included... -- Regards, Markus Boese

FileNotFoundException in recovery

2015-08-04 Thread Markus Heiden
Hi, I sometimes get FileNotFoundExceptions from the recovery of a core in my log. Does anyone know the reason for this? As I understand Solr this may (or should) not happen. Markus 2015-08-04 15:06:07,646|INFO|mpKPXpbUwp|org.apache.solr.update.UpdateLog|Starting to buffer updates. FSUpdateLog

Re: Index-boosting not working in 5.2.1?

2015-07-01 Thread Markus Hegi - Nagavkar
" - scoring works fine "Tetra*" - here, I get all the same scores. I am building an auto-suggest, based on ontology terms. Scoring is crucial there, and also, that I find parts of words. Markus Simplified test code: public void simple(String inp) throws IOException { try {

Index-boosting not working in 5.2.1?

2015-07-01 Thread Markus Hegi - Nagavkar
t an identical score of: 1.4142135 What could be the problem? Some of my code: ... FieldType ft=new FieldType(); ft.setIndexOptions(IndexOptions.DOCS_AND_FREQS_AND_POSITIONS); ft.setStored(true); ft.setTokenized(true); Field f=new Field(name, value, ft); f.setBoost(0.001f); doc.add(f); ... Markus

RE: LUCENE-5388 AbstractMethodError

2014-01-30 Thread Markus Jelsma
Hi Uwe, You're right. Although using the analysis package won't hurt the index, this case is evidence that it's a bad thing, especially if no backport is made. I'll port my code to use the updated API of 5.0. Thanks guys, Markus -Original message- > Fro

RE: LUCENE-5388 AbstractMethodError

2014-01-30 Thread Markus Jelsma
Maven against the most recent release of Solr and/or Lucene. If that stays a problem we may have to build stuff against branch_4x instead. Thanks, Markus -Original message- > From:Uwe Schindler > Sent: Thursday 30th January 2014 11:18 > To: java-user@lucene.apache.org > Su

LUCENE-5388 AbstractMethodError

2014-01-30 Thread Markus Jelsma
.x we must override that specific method: analyzer is not abstract and does not override abstract method createComponents(String,Reader) in Analyzer :) So, any hints on how to deal with this thing? Wait for 4.x backport of 5388, or do something clever like <...> fill in the blanks. Man

Coordination factor disabled for BM25 and other new scoring models

2013-08-22 Thread Markus Jelsma
nyone here that can shed some light on this? Thanks, Markus

Final token filters

2013-08-19 Thread Markus Jelsma
bits without copying the rest of the stuff around? Thanks, Markus

RE: "read past EOF" when merge

2012-11-05 Thread Markus Jelsma
https://issues.apache.org/jira/browse/SOLR-4032 -Original message- > From:Mark Miller > Sent: Sat 03-Nov-2012 14:20 > To: java-user@lucene.apache.org > Subject: Re: "read past EOF" when merge > > Can you file a JIRA Markus? This is probably related

RE: "read past EOF" when merge

2012-11-02 Thread Markus Jelsma
No this is not using NFS but EXT3 on SSD. Thanks -Original message- > From:Michael McCandless > Sent: Fri 02-Nov-2012 16:22 > To: java-user@lucene.apache.org > Subject: Re: "read past EOF" when merge > > On Fri, Nov 2, 2012 at 6:53 AM, Markus Jelsma

RE: "read past EOF" when merge

2012-11-02 Thread Markus Jelsma
nHandler$3.write(ReplicationHandler.java:932) Markus -Original message- > From:Michael McCandless > Sent: Fri 02-Nov-2012 11:46 > To: java-user@lucene.apache.org > Subject: Re: "read past EOF" when merge > > Are you able to reproduce the corruption?

RE: Highlighter IOOBE with modified HyphenationCompoundWordTokenFilter

2012-10-05 Thread Markus Jelsma
- > From:Thomas Matthijs > Sent: Thu 04-Oct-2012 15:55 > To: java-user@lucene.apache.org > Subject: Re: Highlighter IOOBE with modified > HyphenationCompoundWordTokenFilter > > And to include the code > > On Thu, Oct 4, 2012 at 3:52 PM, Markus Jelsma > wrote: > &

RE: Highlighter IOOBE with modified HyphenationCompoundWordTokenFilter

2012-10-04 Thread Markus Jelsma
eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1337) > . > > Anyone to point me in the right direction? I've checked the LIA book on how > to manipulate the tokenstream and thought it

Highlighter IOOBE with modified HyphenationCompoundWordTokenFilter

2012-10-04 Thread Markus Jelsma
to manipulate the tokenstream and thought it should be alright. My analysis tests also yield good results, nothing strange to be found. Or could it be an error in the highlighter that only now shows up? Thanks, Markus

Re: what's the status of droids project(http://incubator.apache.org/droids/)?

2011-08-23 Thread Markus Jelsma
You should ask on the Droids list but there's some activity in Jira. And did you consider Apache Nutch? On Tuesday 23 August 2011 10:17:50 Li Li wrote: > hi all > I am interested in vertical crawler. But it seems this project is not > very active. It's last update time is 11/16/2009 ---

Re: [POLL] Where do you get Lucene/Solr from? Maven? ASF Mirrors?

2011-01-18 Thread Markus Jelsma
> [X] ASF Mirrors (linked in our release announcements or via the Lucene > website) > > [] Maven repository (whether you use Maven, Ant+Ivy, Buildr, etc.) > > [X] I/we build them from source via an SVN/Git checkout. > > [] Other (someone in your company mirrors them internally or via a > downst

Reply: Re: Re: Highlighter wildcard problems: NoClassDefFoundError in Linux/CentOS 5.4, works in Windows XP

2010-07-30 Thread Markus Roth
and greetings, Markus Ian Lea To: java-user

Reply: Re: Highlighter wildcard problems: NoClassDefFoundError in Linux/CentOS 5.4, works in Windows XP

2010-07-30 Thread Markus Roth
First of all, thanks for your response. But how can that be true if a search-term without a wildcard (and the highlighting of the results) works fine? Greetings, Markus Ian Lea

Highlighter wildcard problems: NoClassDefFoundError in Linux/CentOS 5.4, works in Windows XP

2010-07-30 Thread Markus Roth
adClass(ClassLoader.java:323) at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:294) at java.lang.ClassLoader.loadClass(ClassLoader.java:268) at java.lang.ClassLoader.loadClassInternal(ClassLoader.java:336) ... 19 more Anyone got

Stopwords and Wildcards

2010-06-30 Thread Markus Mehrwald
? Thanks, Markus

Re: Exact match with fuzzy query

2010-06-12 Thread Markus Mehrwald
On 12.06.2010 13:57, Ahmet Arslan wrote: I am using lucene 3.0.1. I use a MultiFieldQueryParser with a GermanAnalyzer. In my index are some values among others one document with the title "bauer". I append to every word in my query a ~0.8 (here I am not sure if this is the way to do it). If I

Exact match with fuzzy query

2010-06-11 Thread Markus Mehrwald
the "er" if I am not using the fuzzy parameter. Can someone please tell me in a few words why? How can I do a fuzzy search which also finds exact matches? Thanks, Markus

Restricting the result set with hierarchical ACL

2009-03-02 Thread Markus Malkusch
index was searched. I tried to get all allowed document ids (there's a field for the id) and put them into a BooleanQuery (id1 or id2, ...), but then I get a BooleanQuery$TooManyClauses: maxClauseCount is set to 1024 So how can I restrict my search results with lucene?

Restricting the result set with hierarchical ACL

2009-03-02 Thread markus
index was searched. I tried to get all allowed document ids (there's a field for the id) and put them into a BooleanQuery (id1 or id2, ...), but then I get a BooleanQuery$TooManyClauses: maxClauseCount is set to 1024 So how can I restrict my search results with lucene?

Re: Terms with different boosts

2008-09-11 Thread Markus Lux
Hi Guy, I think that isn't a problem related to fields. I experienced this kind of error caused by a limitation of the underlying file system. The problem was that I had too many InputStreams open that had never been closed. Please check that in your code and tell us if it worked. M

Injecting additional tokens

2008-09-01 Thread Markus Lux
"z4" at indexing time. There may also be several other characters that could be deleted in a new token. How could I manage that? Is there any predefined Tokenizer/Filter for this? Or am I wrong and there is a better way to get this done? Thanks. -- Markus

Re: Alternate spelling suggestion (was [Resent] Document boosting based on .. semantics? )

2008-02-29 Thread Markus Fischer
'm also providing the second best results as alternative (did you mean x or y?). The results have been very good so far, thanks again! - Markus

[Resent] Document boosting based on .. semantics?

2008-02-19 Thread Markus Fischer
r I should properly start a separate thread ... Does anyone have advice on how to approach this kind of problem? Is it appropriate/can it be solved with Lucene? Am I right here on this list anyway? :) thanks for any feedback, - Markus

Different fields in the same and index and query boosting

2006-02-26 Thread Markus Fischer
eadings in HTML documents, e.g. title:(term)^8 h1:(term)^7 ... h6:(term)^2 content:(term)^1 . I was wondering if this is actually necessary. The number of existing h1 to h6 fields with content decreases with the amount of documents. To give the fields title and h1, which are the most used ones anyway,
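A small sketch of how the boosted per-field query string quoted in this message could be assembled. The boosts for title (8), h1 (7), h6 (2) and content (1) come from the message; the values for h2 to h5 are elided there, so the step of one per heading level used below is an assumption, as are the class and method names. The term is inserted as-is and is assumed to need no escaping.

```java
/** Builds a per-field boosted query string (sketch; intermediate boosts are assumed). */
final class BoostedFieldQuery {
    /** Returns e.g. "title:(x)^8 h1:(x)^7 ... h6:(x)^2 content:(x)^1" for term x. */
    static String build(String term) {
        StringBuilder query = new StringBuilder();
        query.append("title:(").append(term).append(")^8");
        for (int level = 1; level <= 6; level++) {
            // Assumption: boosts fall by one per heading level, h1 -> 7 down to h6 -> 2.
            int boost = 8 - level;
            query.append(" h").append(level)
                 .append(":(").append(term).append(")^").append(boost);
        }
        query.append(" content:(").append(term).append(")^1");
        return query.toString();
    }

    public static void main(String[] args) {
        System.out.println(build("lucene"));
    }
}
```

The resulting string is meant for a query parser that understands the `field:(term)^boost` syntax shown in the message.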

Re: Stemming german words

2006-01-31 Thread Markus Fischer
Jonathan, what should I say, I'm feeling like an idiot now. Of course you're right. This actually solves the issue ;) thanks and sorry for wasting time, - Markus Jonathan O'Connor wrote: Markus, As I'm sure you know, "sucht" is also an inflection of "suc

Stemming german words

2006-01-31 Thread Markus Fischer
eanings in German (Suche = the Search, Sucht => addiction). Is there a way to tune the stemmer or are there alternatives available or should I look for another stemmer for the German language? thanks for any pointers, - Markus

Re: Using one physical lucene index for multiple projects

2005-09-01 Thread Markus Fischer
and "AND" all queries for that key too. I'm just wondering whether all in all that's a good idea or not and what else I could do. - Markus
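The approach described in this thread, one shared index where every query is ANDed with the client's key, can be sketched as a query-string wrapper in the classic query parser syntax. The field name `tenant` and the class TenantQuery are hypothetical; the thread does not name the field.

```java
/** Wraps a user query so it only matches documents carrying the client's key (sketch). */
final class TenantQuery {
    /**
     * Makes both the user query and the key clause mandatory.
     * The key is emitted as a quoted phrase with backslashes and quotes escaped.
     */
    static String restrict(String userQuery, String tenantKey) {
        String quotedKey = tenantKey.replace("\\", "\\\\").replace("\"", "\\\"");
        return "+(" + userQuery + ") +tenant:\"" + quotedKey + "\"";
    }

    public static void main(String[] args) {
        // prints: +(title:lucene OR body:search) +tenant:"project42"
        System.out.println(restrict("title:lucene OR body:search", "project42"));
    }
}
```

String wrapping like this depends on the user query being well-formed (for example, balanced parentheses). Combining the parsed user query and a term query on the key field programmatically, as two mandatory clauses, is likely the more robust variant.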

Re: Using one physical lucene index for multiple projects

2005-09-01 Thread Markus Fischer
would only add a new parameter to the Vector and then dispatch it to the method based on its signature. thanks, - Markus

Re: Using one physical lucene index for multiple projects

2005-08-31 Thread Markus Fischer
which ones. - Markus

Using one physical lucene index for multiple projects

2005-08-31 Thread Markus Fischer
is key and the client can only access documents with his key. The goal is not the ultimate security solution but to avoid running multiple Lucene instances on the machines. Is it a good idea to do it that way or would someone recommend another practice? t

Creating parser query "by hand"

2005-08-29 Thread Markus Fischer
is site again and I can't find an example on how it works to actually create the tokens myself and pass them to the searcher. Any help would be appreciated. thanks, - Markus

Public access to the stemmer (germanstemmer in my case)

2005-08-13 Thread Markus Fischer
er idea and maybe I just overlooked a public interface to the stemmer output? Or I'm approaching the whole highlight search term from the wrong direction? thanks for any pointers, - Markus

Re: Question for Wildcard Search:

2005-06-22 Thread Markus Atteneder
> > > Sure. Simply index reversed words. > Since I do not have much experience with lucene can you explain it more exactly for me? THX!
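The "index reversed words" suggestion quoted in this message can be sketched as follows: each term is additionally indexed in reversed form, and a query with a leading wildcard is rewritten into a trailing-wildcard (prefix) query against the reversed terms. The class and method names are hypothetical, and only a single leading "*" is handled to keep the sketch short.

```java
/** Sketch of the reversed-term trick for leading-wildcard searches. */
final class ReversedTerms {
    /** Reverses a term, as it would be stored in an additional "reversed" field. */
    static String reverse(String term) {
        return new StringBuilder(term).reverse().toString();
    }

    /** Rewrites a pattern like "*ing" into the prefix pattern "gni*" for the reversed field. */
    static String toReversedPrefix(String pattern) {
        if (pattern.length() < 2 || pattern.charAt(0) != '*') {
            throw new IllegalArgumentException("expected a leading '*': " + pattern);
        }
        String suffix = pattern.substring(1);
        if (suffix.indexOf('*') >= 0 || suffix.indexOf('?') >= 0) {
            throw new IllegalArgumentException("only one leading '*' is supported here");
        }
        return reverse(suffix) + "*";
    }

    public static void main(String[] args) {
        String indexed = reverse("searching");     // "gnihcraes" goes into the reversed field
        String rewritten = toReversedPrefix("*ing"); // "gni*"
        String prefix = rewritten.substring(0, rewritten.length() - 1);
        System.out.println(indexed.startsWith(prefix)); // prints true
    }
}
```

The cost of this design is a second copy of every term in the index; in exchange, a leading wildcard becomes an ordinary prefix lookup.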

Question for Wildcard Search:

2005-06-22 Thread Markus Atteneder
It is possible to search with the "*" and "?" wildcards at the end and in the middle of a search string, but not at the beginning. Is there a way to do this?

Updateing Documents:

2005-06-21 Thread Markus Atteneder
I am looking for a search engine for our intranet, so I am evaluating Lucene. I have read the FAQ and some postings, gained some first experience with it, and now have some questions. 1. Is Lucene a suitable search engine for an intranet search? I've experimented with poi and pdfbox for indexing Word/Ex

Re: Hypenated word

2005-06-13 Thread Markus Wiederkehr
On 6/13/05, Andy Roberts <[EMAIL PROTECTED]> wrote: > On Monday 13 Jun 2005 13:18, Markus Wiederkehr wrote: > > I see, the list of exceptions makes this a lot more complicated than I > > thought... Thanks a lot, Erik! > > > > I expect you'll need to do some

Re: Hypenated word

2005-06-13 Thread Markus Wiederkehr
I see, the list of exceptions makes this a lot more complicated than I thought... Thanks a lot, Erik! Markus On 6/13/05, Erik Hatcher <[EMAIL PROTECTED]> wrote: > > On Jun 13, 2005, at 7:08 AM, Markus Wiederkehr wrote: > > I work on an application that has to index OCR text

Re: OutOfMemory when indexing

2005-06-13 Thread Markus Wiederkehr
may be completely wrong... Markus On 6/13/05, Stanislav Jordanov <[EMAIL PROTECTED]> wrote: > High guys, > Building some huge index (about 500,000 docs totaling to 10megs of plain > text) we've run into the following problem: > Most of the time the IndexWriter process

Updating documents

2005-06-13 Thread Markus Wiederkehr
not stored get lost. So is there any way to preserve fields that were not stored? Reconstructing these fields is too expensive in my application. Thanks in advance, Markus

Hypenated word

2005-06-13 Thread Markus Wiederkehr
something like that at http://www.lucenebook.com/. Thanks in advance, Markus

Re: managing docids for ParallelReader (was Augmenting an existing index)

2005-06-03 Thread Markus Wiederkehr
f you add the documents in the same order to both indexes and perform > the same deletions on both indexes then they'll have the same numbers. Would it be possible to write an IndexReader that combines two indexes by a common field, for example a document ID? And how performant wo

Re: managing docids for ParallelReader (was Augmenting an existing index)

2005-05-31 Thread Markus Wiederkehr
" IndexReader subclass that generates termDoc > lists on the fly by looking in an external database. This would require > a mapping between Lucene document ids and external document IDs. A > FieldCache, as described above, could serve that purpo

Re: ACLs and Lucene

2005-05-30 Thread Markus Wiederkehr
not in the other(s) the link between them gets lost. How do I prevent this? Markus

ACLs and Lucene

2005-05-30 Thread Markus Wiederkehr
in advance, Markus