Scoring question - Get Score of Best Query in BooleanQuery

2007-05-06 Thread Thomas Thomas
Hello everyone, Whenever I search a word in my web application, I search in some default fields, e.g. I search the word "hello", I generate these queries : title:hello headlines:hello summary:hello content:hello Which I add in a BooleanQuery (BooleanClause.Occur.SHOULD) What I want to achieve

problem understanding the documentation for the TieredMergePolicy class

2012-06-12 Thread thomas
ePolicy.html#findMerges%28org.apache.lucene.index.SegmentInfos%29> Would somebody be so kind to explain it to me? Thanks, thanks a lot Thomas

Lucene and XML Architecture

2007-07-19 Thread Thomas
ombination of native XML stores and Lucene? Are there any problems that could arise from this combination? - Thomas

Re: Lucene and XML Architecture

2007-07-20 Thread Thomas
Thanks a lot Patrick! That's exactly what I was hoping for. I'll give it a shot. -Thomas Patrick Turcotte wrote: Hi, There is a Lucene-eXist trigger that allows you to do just that. Take a look at patch http://sourceforge.net/tracker/index.php?func=detail&aid=1654205&g

Scoring similarity by the position of the terms

2012-03-22 Thread Thomas Rewig
Similarity. Lucene has been developed and grown and I was wondering if you can now do the same thing in a simpler and more straightforward way. Maybe with some of the newer SpanQuery classes or another use of payloads. Does anyone have any idea where to start? Regards Thomas

Re: Highlighter IOOBE with modified HyphenationCompoundWordTokenFilter

2012-10-04 Thread Thomas Matthijs
And to include the code On Thu, Oct 4, 2012 at 3:52 PM, Markus Jelsma wrote: > I forgot to add that this is with today's build of trunk. > > -Original message- >> From:Markus Jelsma >> Sent: Thu 04-Oct-2012 15:42 >> To: java-user@lucene.apache.org >> Subject: Highlighter IOOBE with modif

lucene-4.0: QueryWrapperFilter & docBase

2012-10-08 Thread Thomas Matthijs
Hello, I have some custom queries & scorers that need to be able to construct the "global" docIds (doc + docBase). But when I use these in a QueryWrapperFilter they no longer work, because QueryWrapperFilter.getDocIdSet uses a "private context" (context.reader().getContext();) which always has a docB

Re: lucene-4.0: QueryWrapperFilter & docBase

2012-10-08 Thread Thomas Matthijs
On Mon, Oct 8, 2012 at 11:28 AM, Uwe Schindler wrote: > Hi, > > This is a known problem currently. I think there is already an issue open, so > this was not solved for 4.0 (I don't have the issue no available at the > moment). > > My plan to fix this is to make Filters behave like queries (with

Re: lucene-4.0: QueryWrapperFilter & docBase

2012-10-08 Thread Thomas Matthijs
On Mon, Oct 8, 2012 at 2:29 PM, Thomas Matthijs wrote: > On Mon, Oct 8, 2012 at 11:28 AM, Uwe Schindler wrote: >> Hi, >> >> This is a known problem currently. I think there is already an issue open, >> so this was not solved for 4.0 (I don't have the issu

Lucene-MoreLikethis

2013-01-15 Thread Thomas Keller
Hey, I have a question about "MoreLikeThis" in Lucene, Java. I built up an index and want to find similar documents. But I always get no results for my query, mlt.like(1) is always empty. Can anyone find my mistake? Here is an example. (I use Lucene 4.0) public class HelloLucene { public

updateDocument question

2013-02-06 Thread Becker, Thomas
I've built a search prototype feature for my application using Lucene, and it works great. The application monitors a remote system and currently indexes just a few core attributes of the objects on that system. I get notifications when objects change, and I then update the Lucene index to kee

RE: updateDocument question

2013-02-07 Thread Becker, Thomas
estion Hi Thomas, On Wed, Feb 6, 2013 at 2:50 PM, Becker, Thomas wrote: > I've built a search prototype feature for my application using Lucene, and it > works great. The application monitors a remote system and currently indexes > just a few core attributes of the object

Re: Not getting matches for analyzers using CharMappingFilter with Lucene 4.1

2013-02-25 Thread Thomas Matthijs
On Mon, Feb 25, 2013 at 11:24 AM, Paul Taylor wrote: > On 20/02/2013 11:28, Paul Taylor wrote: > >> Just updating codebase from Lucene 3.6 to Lucene 4.1 and seems my tests >> that use NormalizeCharMap for replacing characters in the analyzers are not >> working. >> >> bump, anybody I thought a s

Re: Not getting matches for analyzers using CharMappingFilter with Lucene 4.1

2013-02-25 Thread Thomas Matthijs
On Mon, Feb 25, 2013 at 11:30 AM, Thomas Matthijs wrote: > > On Mon, Feb 25, 2013 at 11:24 AM, Paul Taylor wrote: > >> On 20/02/2013 11:28, Paul Taylor wrote: >> >>> Just updating codebase from Lucene 3.6 to Lucene 4.1 and seems my tests >>> that use Norma

Re: Not getting matches for analyzers using CharMappingFilter with Lucene 4.1

2013-02-25 Thread Thomas Matthijs
On Mon, Feb 25, 2013 at 12:19 PM, Thomas Matthijs wrote: > On Mon, Feb 25, 2013 at 11:30 AM, Thomas Matthijs wrote: > >> >> On Mon, Feb 25, 2013 at 11:24 AM, Paul Taylor wrote: >> >>> On 20/02/2013 11:28, Paul Taylor wrote: >>> >>>> Just upd

Detecting when an index was not closed properly

2013-04-05 Thread Becker, Thomas
We are doing some crash resiliency testing of our application. One of the things we found is that the Lucene index seems to get out of sync with the database pretty easily. I suspect this is because we are using near real time readers and never actually calling IndexWriter.commit(). I'm tryin

RE: Detecting when an index was not closed properly

2013-04-09 Thread Becker, Thomas
ginal Message- From: Becker, Thomas [mailto:thomas.bec...@netapp.com] Sent: Friday, April 05, 2013 1:33 PM To: java-user@lucene.apache.org Subject: Detecting when an index was not closed properly We are doing some crash resiliency testing of our application. One of the things we found is tha

Re: Taking backup of a Lucene index

2013-04-17 Thread Thomas Matthijs
On Wed, Apr 17, 2013 at 12:57 PM, Ashish Sarna wrote: > I want to take back-up of a Lucene index. I need to ensure that index files > would not change when I take their backup. > > > I am concerned about the housekeeping/merge/optimization activities which > Lucene performs internally. I am not

Re: RAMDirectory and expungeDeletes()/optimize()

2013-05-21 Thread Thomas Matthijs
On Tue, May 21, 2013 at 3:12 PM, Konstantyn Smirnov wrote: > I want to refresh the topic a bit. > > Using Lucene 4.3.0, I couldn't find a method like expungeDeletes() in > the > IW anymore. http://lucene.apache.org/core/4_3_0/core/org/apache/lucene/index/IndexWriter.html#forceMergeDeletes()

Re: Taking backup of a Lucene index

2013-06-06 Thread Thomas Matthijs
On Thu, Jun 6, 2013 at 7:38 AM, Lance Norskog wrote: > The simple answer (that somehow nobody gave) is that you can make a copy > of an index directory at any time. Indexes are changed in "generations". > The segment* files describe the current generation of files. All active > indexing goes on i

What to do with Lucene Version parameter on upgrade

2013-06-20 Thread Becker, Thomas
I'm relatively new to Lucene and am in the process of upgrading from 4.0 to 4.3.1. I'm trying to figure out if I need to leave my version at LUCENE_40 or if it is safe to change it to LUCENE_43. Does this parameter directly determine the index format? I have some existing indexes from 4.0 but

Re: A SPI class of type org.apache.lucene.codecs.Codec with name 'Lucene42' does not exist

2013-07-13 Thread Thomas Matthijs
On Sat, Jul 13, 2013 at 10:25 AM, VIGNESH S wrote: > Hi, > > I tried indexing in Desktop..It works fine. > The above error loading error comes only in android.. > Any comments.. Don't strip META-INF/services files out of the jars

RE: query on exact match in lucene

2013-07-17 Thread Becker, Thomas
Sounds like you need a PhraseQuery. -Original Message- From: madan mp [mailto:madan20...@gmail.com] Sent: Wednesday, July 17, 2013 7:40 AM To: java-user@lucene.apache.org Subject: query on exact match in lucene how to get exact string match ex- i am searching for file which consist of s

Partial word match using n-grams

2013-07-18 Thread Becker, Thomas
One of our main use-cases for search is to find objects based on partial name matches. I've implemented this using n-grams and it works pretty well. However we're currently using trigrams and that causes an interesting problem when searching for things like "abc ab" since we first split on whi
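A minimal sketch of the trigram problem described in this post, assuming the query is split on whitespace and each word is then cut into character trigrams. The class and method names are hypothetical, and this is plain Java rather than Lucene's own n-gram filters. It only illustrates why a two-character word such as "ab" contributes nothing to the query:

```java
import java.util.ArrayList;
import java.util.List;

public class TrigramSketch {
    // Returns every character n-gram of the given size.
    // A word shorter than n yields an empty list.
    static List<String> ngrams(String word, int n) {
        List<String> grams = new ArrayList<>();
        for (int i = 0; i + n <= word.length(); i++) {
            grams.add(word.substring(i, i + n));
        }
        return grams;
    }

    // Splits the query on whitespace first, then n-grams each word separately.
    static List<String> queryGrams(String query, int n) {
        List<String> grams = new ArrayList<>();
        for (String word : query.trim().split("\\s+")) {
            grams.addAll(ngrams(word, n));
        }
        return grams;
    }

    public static void main(String[] args) {
        // "ab" is shorter than a trigram, so only "abc" survives.
        System.out.println(queryGrams("abc ab", 3));
        System.out.println(ngrams("abcd", 3));
    }
}
```

With trigrams, the second word of "abc ab" silently drops out of the query, which matches the behaviour the poster describes.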

RE: Partial word match using n-grams

2013-07-18 Thread Becker, Thomas
dataset you might consider allowing leading wildcards so that you could easily find all words, for example, containing abc with *abc*. If your dataset is larger, you might consider something like ReversedWildcardFilterFactory (Solr) to speed this type of matching. I look forward to other opinion

RE: Partial word match using n-grams

2013-07-19 Thread Becker, Thomas
.almost. Y. You're right. FuzzyQuery is not at all what you want. Don't know if your data is actually as simple as this example. Do you need to tokenize on whitespace? Would it make sense to replace spaces in the query with underscores and then trigramify the whole query as i

RE: Partial word match using n-grams

2013-07-19 Thread Becker, Thomas
ad the entire string. If the string is broken on _ already, then NGramFilter already receives the individual terms and you can put a Filter in front that will pass through a padded token? Shai On Fri, Jul 19, 2013 at 3:45 PM, Becker, Thomas wrote: > In general the data for this field is tha

RE: Partial word match using n-grams

2013-07-30 Thread Becker, Thomas
other implications, of course, but you get the idea There are a zillion possibilities here in terms of combining various filterFactories Best Erick On Fri, Jul 19, 2013 at 9:06 AM, Becker, Thomas wrote: > Sorry, at indexing time it's not broken on anything. In other words > qu

IndexSearcher - open file handles by deleted files

2010-05-26 Thread Thomas Rewig
s not automatically if I close searcher.close()? Do I have to close something other than all IndexSearchers and Directories? Or am I wrong with my assumption, and the problem is somewhere else? Best Thomas

Introduction to flexible indexing?

2010-06-14 Thread Thomas Koch
understand this page and help to get it in shape. Best regards, Thomas Koch, http://www.koch.ro

Fielded Queries Question

2010-07-06 Thread Thomas Nguyen
Hello All, Can someone explain to me how fielded queries work with phrases? My first thought is that the phrase is broken down into terms and those terms are then fielded and separated with the AND operator. An example would be the following: name:"Tom Jones" --> name:"Tom" AND name:"Jones" I

Need help in understanding output of searcher.explain() function

2010-08-07 Thread Soby Thomas
Hello Guys, I'm trying to understand how the Lucene score is calculated, so I'm using the searcher.explain() function. But the output it gives is really confusing for me. Below are the details of the query that I gave and the output it gave me Query: *It is definitely a CES deal that will be over in Sep or Oct

Re: Need help in understanding output of searcher.explain() function

2010-08-07 Thread Soby Thomas
m frequency, idf and field norm > > 0.07028562 = (MATCH) fieldWeight(payload:ces in 550), product of: > > 1.0 = *tf(*termFreq(payload:ces)=1) > > 2.2491398 = *idf(*docFreq=157, maxDocs=551) > > 0.03125 = *fieldNorm*(field=payload, doc=550) > >
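The numbers in this explain() fragment can be checked by hand. The sketch below assumes the classic TF-IDF formulas of Lucene's default similarity in that era, tf = sqrt(termFreq) and idf = 1 + ln(maxDocs / (docFreq + 1)); the class and method names are hypothetical, and Lucene itself computes in float, so the last digit may differ slightly:

```java
public class FieldWeightSketch {
    // Assumed classic tf: square root of the within-document term frequency.
    static double tf(int termFreq) {
        return Math.sqrt(termFreq);
    }

    // Assumed classic idf: 1 + ln(maxDocs / (docFreq + 1)).
    static double idf(int docFreq, int maxDocs) {
        return 1.0 + Math.log((double) maxDocs / (docFreq + 1));
    }

    // fieldWeight is the product of the three factors listed by explain().
    static double fieldWeight(double tf, double idf, double fieldNorm) {
        return tf * idf * fieldNorm;
    }

    public static void main(String[] args) {
        double tf = tf(1);                  // termFreq(payload:ces)=1
        double idf = idf(157, 551);         // docFreq=157, maxDocs=551
        double weight = fieldWeight(tf, idf, 0.03125);
        // Should land very close to idf 2.2491398 and fieldWeight 0.07028562.
        System.out.println("idf=" + idf + " fieldWeight=" + weight);
    }
}
```

In other words, 1.0 × 2.2491398 × 0.03125 ≈ 0.07028562, which is the fieldWeight line at the top of the fragment.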

Restore documents marked as deleted

2010-10-06 Thread Philippe Thomas
Hi, I was indexing some documents, but my program crashed after several days of work. If I reopen this index it is empty. I guess the reason is that auto-commit was not set and I never performed a commit. (Lesson learned) So probably all documents are marked as "deleted" and re-opening the i

Re: File Handle Leaks During Lucene 3.0.2 Merge

2010-11-10 Thread Thomas Rewig
amount of the deleted file handles will be stable - but first at a amount of 500 or so. Thanks in advance Thomas I integrated your SearchManager class into our code, but I am still seeing file handles marked deleted in the index directory. I am running the following command on Linux: sudo watch

Deleted File Handles - Index Writer

2010-11-12 Thread Thomas Rewig
used" by the indexwriter) grows. Is that possible and if yes why does the indexwriter do it? Is there a max value of deleted handles an IndexWriter could own, because I don't want to crash the system because of too many open file handles? Thanks in advance. Thomas

Re: Deleted File Handles - Index Writer

2010-11-18 Thread Thomas Rewig
help. Thomas I've found a case, only with compound file, where IndexWriter holds open a SegmentReader on the pre-compound-file files... I'm working on a test case& fix. Mike On Fri, Nov 12, 2010 at 5:49 AM, Thomas Rewig wrote: Hello, I use the searcherManager for LiveIndexin

Re: Deleted File Handles - Index Writer

2010-11-19 Thread Thomas Rewig
.0.2 Release version or do I have to wait for a future release? Thanks for your help. Thomas

Check Numeric Fields

2011-03-11 Thread Thomas Rewig
t the NumericRangeQuery query does not work? I use lucene v. 3.0.2. Thanks in advance! Thomas

name matching / mapping

2011-07-06 Thread Thomas Rewig
s all names of the second id-space and the first id-space is used for the queries. String[] suggestions = spellchecker.suggestSimilar("john w.", 5); But is there a better approach? Can someone point me in the right direction for an effective approach? Thanks in

TermQuery - ExactMatching, Lucene 3.1.0 vs. 3.3.0, special character behavior

2011-07-15 Thread Thomas Rewig
I would expect if I do a 'exact matching' Term Query. Each index was indexed with its associated LuceneVersion. I tested it with luke and with my own Code - the result was always the same. Is it a new feature in Lucene 3.3.0 or a b

Re: TermQuery - ExactMatching, Lucene 3.1.0 vs. 3.3.0, special character behavior

2011-07-18 Thread Thomas Rewig
Doc.Id=227606 id=716893 name=aim Is there a way to guarantee the inner sorting of same scores? Or how can I avoid that documents with special characters have the same score as documents of exact matches? Thanks in advance! Thomas On 18.07.2011 10:08, Ian Lea wrote: I'm not su

Re: TermQuery - ExactMatching, Lucene 3.1.0 vs. 3.3.0, special character behavior

2011-07-19 Thread Thomas Rewig
re=12,2324 Doc.Id=8060id=709579name=aim溝脇しほみ 1Score=12,2324 Doc.Id=227606id=716893name=aim To avoid these problems right from the start, I need to use a different analyser for indexing? (So that the docs 'aim溝脇しほみ' and 'aim' have different scores) Thank

Do duplicate documents affect term scoring?

2011-11-27 Thread Stephen Thomas
List, I am indexing a subset of Wikipedia. I have 4 years worth of data, and have taken snapshots of each document at each month in the 4 year span. Thus, I have 4*12=36 versions of each document. (I keep track of the timestamp in a field.) I have noticed that in many cases, a Wikipedia document d

Scoring a document using LDA topics

2011-11-28 Thread Stephen Thomas
List, I am trying to incorporate the Latent Dirichlet Allocation (LDA) topic model into Lucene. Briefly, the LDA model extracts topics (distribution over words) from a set of documents, and then represents each document with topic vectors. For example, documents could be represented as: d1 = (0,

Re: Scoring a document using LDA topics

2011-11-29 Thread Stephen Thomas
ote about this sometime back...maybe this would help you. > http://sujitpal.blogspot.com/2011/01/payloads-with-solr.html > > -sujit > > On Mon, 2011-11-28 at 12:29 -0500, Stephen Thomas wrote: >> List, >> >> I am trying to incorporate the Latent Dirichlet Allocation

Custom Filter for Splitting CamelCase?

2011-11-29 Thread Stephen Thomas
List, I have written my own CustomAnalyzer, as follows: public TokenStream tokenStream(String fieldName, Reader reader) { // TODO: add calls to RemovePuncation, and SplitIdentifiers here // First, convert to lower case TokenStream

Re: Custom Filter for Splitting CamelCase?

2011-11-29 Thread Stephen Thomas
>> -Original Message- >> From: stephen.warner.tho...@gmail.com >> [mailto:stephen.warner.tho...@gmail.com] On Behalf Of Stephen Thomas >> Sent: Tuesday, November 29, 2011 5:20 PM >> To: java-user@lucene.apache.org >> Subject: Custom Filter for Splitting CamelCase?

Re: Search in a specific ScoreDiopoc result

2013-09-17 Thread Thomas Guttesen
Kkkutterujjjbbb hgggja On 17/09/2013 12.55, "David Miranda" wrote: > > Hi, > > I want to do a kind of 'facet search', that initial research in a field of > all documents in the Lucene index, and second search in other field of the > documents returned to the first research. > > Currently I'm do th

org.apache.lucene.search.TopScoreDocCollector throws NullPointerException

2013-11-01 Thread Thomas Fuchs
.run(Thread.java:695) I don't think that's expected behavior, and I believe it is a bug in org.apache.lucene.search.TopScoreDocCollector. Am I wrong? Regards - Thomas

Re: org.apache.lucene.search.TopScoreDocCollector throws NullPointerException

2013-11-02 Thread Thomas Fuchs
Hi, I couldn't reproduce the problem in the following test case, so let's drop this. Regards - Thomas -- import org.apache.lucene.analysis.*; import org.apache.lucene.analysis.standard.*; import org.apache.lucene.document.*; import org.apache.lucene.index

How to ignore a ,

2016-11-28 Thread Thomas Johnson
; when we search for "Doe*" Thank you. Thomas W. Johnson, Senior Programmer

java 17 and older lucene (4.x)

2022-09-26 Thread Thomas Matthijs
Hello, Just wondering if anyone has patched lucene 4.x for usage with java 17+ and willing to share their work? anything would be appreciated. No we cannot upgrade lucene, and will likely spend time to try to backport/patch it ourselves, but maybe someone already has? if anyone has interest in

is there an histogram feature in lucene ak Magelan

2008-10-13 Thread Thomas Birnbaum
350 damage unrepaired 30 metallic 60 something like this... is there a way to do the same with lucene? thx thomas.

Filtering accents

2008-12-30 Thread legrand thomas
Dear all, I'd like my lucene searches to be insensitive to (French) accents. For example, considering an indexed term "métal", I want to get it when searching for "metal" or "métal". I use lucene-2.3.2 and the searches are performed with: IndexSearcher.search(query,filter,sorter), Another filte
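One way to picture accent-insensitive matching is to fold accents away from both the indexed text and the query before comparing them. The sketch below uses only the JDK's java.text.Normalizer; it is a stand-in for illustration, not Lucene's own accent filter, and the class and method names are hypothetical:

```java
import java.text.Normalizer;

public class AccentFoldSketch {
    // Decomposes accented characters (NFD) and then removes the combining marks,
    // so that "métal" and "metal" end up as the same string.
    static String fold(String s) {
        String decomposed = Normalizer.normalize(s, Normalizer.Form.NFD);
        return decomposed.replaceAll("\\p{M}+", "");
    }

    public static void main(String[] args) {
        String indexed = fold("m\u00e9tal"); // "métal", written with an escape to avoid encoding issues
        String query = fold("metal");
        System.out.println(indexed + " matches " + query + ": " + indexed.equals(query));
    }
}
```

The important design point is that the same folding has to be applied at index time and at query time; folding only one side would make "métal" unreachable from one of the two spellings.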

Creating document fields by providing termvector directly (bypassing the analyzing/tokenizing stage)

2009-04-21 Thread Thomas Pönitz
] b[2] c[1]. The old discussion had no real solution but it is also a bit outdated, maybe someone has a better idea now. Greets, Thomas

Index and search terms containing character "-"

2009-05-31 Thread legrand thomas
Hi, I have a problem using TermQuery and FuzzyQuery for terms containing the character "-". Considering I've indexed "jack" and "jack-bauer" as 2 tokenized captions, I get no result when searching for "jack-bauer". Moreover, "jack" with a TermQuery returns the two captions.   What should I do t
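A possible explanation, sketched under the assumption that the analyzer used at index time splits text on the hyphen (the post does not say which analyzer is in use). The tokenizer below is a hypothetical plain-Java stand-in, not a Lucene class. If "jack-bauer" is indexed as the two tokens "jack" and "bauer", an exact term "jack-bauer" never exists in the index, while "jack" appears in both captions:

```java
import java.util.Arrays;
import java.util.List;
import java.util.Locale;

public class HyphenTokenSketch {
    // Hypothetical tokenizer: lower-cases and splits on anything
    // that is not a letter or a digit, including "-".
    static List<String> tokenize(String text) {
        return Arrays.asList(text.toLowerCase(Locale.ROOT).split("[^\\p{L}\\p{Nd}]+"));
    }

    public static void main(String[] args) {
        List<String> indexed = tokenize("jack-bauer");
        System.out.println(indexed);                         // tokens stored for the caption
        System.out.println(indexed.contains("jack-bauer"));  // the unanalyzed term is absent
        System.out.println(indexed.contains("jack"));        // but "jack" is present
    }
}
```

A TermQuery is not passed through an analyzer, so the term it looks up has to match a token exactly as it was written to the index.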

Re: Index and search terms containing character "-"

2009-06-02 Thread legrand thomas
d strongly recommend you get a copy of Luke, it's invaluable for questions like this because it lets you look at what's actually in your index. It'll also show you how queries get broken down when pushed through various analyzers... BTW, nice test case for demonstrating what you w

Re: Loading an index into memory

2009-07-24 Thread Thomas Becker
/www.windowslive.com/Online/Hotmail/Campaign/QuickAdd?ocid=TXT_TAGLM >>>> _WL_QA_HM_sports_photos_072009&cat=sports >>>> >>> - >>> To unsubscribe, e-mail: java-user-u

2.9 - leftover (deleted) filehandles after upgrade

2009-07-29 Thread Thomas Becker
mpDir); with IndexSearcher indexSearcherTmp = new IndexSearcher(tmpDir, true); No errors in the logfiles, no caught exceptions, etc. I'm kind of out of ideas at the moment. I googled and tried a couple of things (IndexWriter.setUseCompoundFile(true), etc.) but didn't find a solution. A

lucene 2.9.0RC4 slower than 2.4.1?

2009-09-15 Thread Thomas Becker
y took {} ms", durationMillis); } return docs; } I'm wondering why others are experiencing better performance with 2.9 and why our implementation's performance is getting worse. Maybe our way of using the 2.9 api is not the best and sorting is definitely

Re: lucene 2.9.0RC4 slower than 2.4.1?

2009-09-15 Thread Thomas Becker
Missed the attachment, sorry. Thomas Becker wrote: > Hi all, > > I'm experiencing a performance degradation after migrating to 2.9 and running > some tests. I'm getting out of ideas and any help to identify the reasons why > 2.9 is slower than 2.4 are highly appreci

Re: lucene 2.9.0RC4 slower than 2.4.1?

2009-09-15 Thread Thomas Becker
Urm and uploaded here: http://ankeschwarzer.de/tmp/graph.jpg Sorry. Thomas Becker wrote: > Missed the attachment, sorry. > > Thomas Becker wrote: >> Hi all, >> >> I'm experiencing a performance degradation after migrating to 2.9 and running >> some tests.

Re: lucene 2.9.0RC4 slower than 2.4.1?

2009-09-15 Thread Thomas Becker
nerCache is a Map containing field + parser * (contracttocontentgroup prefix) as the key and as a value yet another map. * The latter map finally contains the docIds as key and positionvalue for this * prefix as value. * * @author Thomas Becker (thomas.bec...@net-m.de) * */ pub

Re: lucene 2.9.0RC4 slower than 2.4.1?

2009-09-15 Thread Thomas Becker
with lucene 2.4. I will now try a freshly build 2.9 index and see if performance improves. Maybe that already solves the issue...stupid me... We're updating the index every 30 min. at the moment and it gets optimized after each update. Mark Miller wrote: > Thomas Becker wrote: >> Hey Mar

Re: lucene 2.9.0RC4 slower than 2.4.1?

2009-09-15 Thread Thomas Becker
ry as well?! Will check that. Thanks a lot for your support! Cheers, Thomas Mark Miller wrote: > A few quick notes - > > Lucene 2.9 old api doesn't appear much worse than Lucene 2.4? > > You save a lot with the new Intern impl, because thats not a hotspot > anymore. But t

Re: lucene 2.9.0RC4 slower than 2.4.1?

2009-09-15 Thread Thomas Becker
org/jira/browse/LUCENE-753 > -- Thomas Becker

Re: lucene 2.9.0RC4 slower than 2.4.1?

2009-09-15 Thread Thomas Becker
Mark Miller wrote: > Thomas Becker wrote: >> Hey Mark, >> >> yes. I'm running the app on unix. You see the difference between 2.9 and 2.4 >> here: >> >> http://ankeschwarzer.de/tmp/graph.jpg >> > Right - I know your measurements showed

Re: lucene 2.9.0RC4 slower than 2.4.1?

2009-09-15 Thread Thomas Becker
Hi Uwe, already done. See my last message. Cheers, Thomas Uwe Schindler wrote: > On 2.9. NIOFS is only used, if you use FSDirectory.open() instead of > FSDirectory.getDirectory (Deprecated). Can you compare when you use instead > of FSDirectory.open() the direct ctor of SimpleFSDir vs.

Problems with ItemBasedRecommender with Lucene

2009-09-16 Thread Thomas Rewig
e fields... I'm using lucene 2.4.1 and java version "1.6.0_16". Does anyone have an idea how to avoid the growing memory? Or does somebody know another approach for a "realtime Item based Recommender" with Lucene? Regards Thomas

Re: lucene 2.9.0RC4 slower than 2.4.1?

2009-09-16 Thread Thomas Becker
> https://issues.apache.org/jira/browse/LUCENE-753 > -- Thomas Becker

Re: lucene 2.9.0RC4 slower than 2.4.1?

2009-09-16 Thread Thomas Becker
g, can you fill us > in on the query types you are using as well? (eg qualities) > > And grab invocations if its possible. > -- Thomas Becker

Re: lucene 2.9.0RC4 slower than 2.4.1?

2009-09-16 Thread Thomas Becker
ry types you are using as well? (eg qualities) >> >> And grab invocations if its possible. >> >> -- >> - Mark >> >> http://www.lucidimagination.com >> >> >> >> Thomas Becker wrote: >>> Tests run on tmpfs: >>> config: impl=Sepa

Re: lucene 2.9.0RC4 slower than 2.4.1?

2009-09-16 Thread Thomas Becker
guess, based on > the 2.9 new api profiling, is that your queries may not be agreeing with > some of the changes somehow. Along with the profiling, can you fill us > in on the query types you are using as well? (eg qualities) > > And grab invocations if its possible. > -- Thomas B

Re: lucene 2.9.0RC4 slower than 2.4.1?

2009-09-16 Thread Thomas Becker
IndexSearcher.search was called only > once. > > Uwe > -- Thomas Becker

Re: lucene 2.9.0RC4 slower than 2.4.1?

2009-09-16 Thread Thomas Becker
t was only > one search, you must have two segments and therefore no optimized index for > this to be correct? > > Uwe

Re: Problems with ItemBasedRecommender with Lucene

2009-09-17 Thread Thomas Rewig
You use Lucene 2.9 is there a way to do this with Lucene 2.4.1 because I can't find e.g. the "PayloadEncoder" or do I have to wait for the release? Regards Thomas You might want to ask on mahout-user, but I'm guessing Ted didn't mean a new field for every item-item,

Using TermVectorMapper to compute term frequency across documents

2009-10-12 Thread Thomas D'Silva
getTermFreqVector(). I do not require the term frequency within a document. Thanks, Thomas HashMap termDocCount = new HashMap(); TermQuery tagQuery = new TermQuery(tagTerm); TopDocs docs = searcher.search(tagQuery, numDocs); for (int i=0 ; i public void map(String term, int frequency

Re: Using TermVectorMapper to compute term frequency across documents

2009-10-15 Thread Thomas D'Silva
while to compute the document,tag probabilities. Thanks, Thomas On Wed, Oct 14, 2009 at 8:15 AM, Grant Ingersoll wrote: > > On Oct 12, 2009, at 10:46 PM, Thomas D'Silva wrote: > >> Hi, >> >> I am trying to compute the counts of terms of the documents return

Re: Performance tips when creating a large index from database.

2009-10-22 Thread Thomas Becker
be careful. Load on the DB Server will surely increase. Hope that helps. Cheers, Thomas Paul Taylor wrote: > I'm building a lucene index from a database, creating about 1 million > documents, unsurprisingly this takes quite a long time. > I do this by sending a query to the db o

Concurrent access IndexReader / IndexWriter - FileNotFoundException

2010-01-08 Thread legrand thomas
Hi, I often get a FileNotFoundException when my single IndexWriter commits while the IndexReader also tries to read. My application is multithreaded (Tomcat uses the business APIs); I first thought the read/write access was thread-safe but I am probably forgetting something. Please help me to unde

Re: Concurrent access IndexReader / IndexWriter - FileNotFoundException

2010-01-09 Thread legrand thomas
xWriter is committing) is perfectly fine.  The reader searches the point-in-time snapshot of the index as of when it was opened. But: what filesystem are you using?  NFS presents challenges, for example. Mike On Fri, Jan 8, 2010 at 5:35 AM, legrand thomas wrote: > Hi, > > I often get a Fi

Re: Concurrent access IndexReader / IndexWriter - FileNotFoundException

2010-01-09 Thread legrand thomas
McCandless wrote: From: Michael McCandless Subject: Re: Concurrent access IndexReader / IndexWriter - FileNotFoundException To: java-user@lucene.apache.org Date: Saturday, January 9, 2010, 14:51 Can you post the full FNFE stack trace? Mike On Fri, Jan 8, 2010 at 5:35 AM, legrand thomas wrote: >

Re: Concurrent access IndexReader / IndexWriter - FileNotFoundException

2010-01-09 Thread legrand thomas
ginal questions...: commit/read does not require any external synchronization or locking.  You should generally keep your IW open indefinitely and just periodically commit and/or get a new reader (IndexWriter.getReader()) as needed. Mike On Sat, Jan 9, 2010 at 10:06 AM, legrand thomas wrote: > &g

Re: If you could have one feature in Lucene...

2010-02-25 Thread Thomas Guttesen
> > -- Kind regards, Thomas Guttesen

google's index layout, lucene on hbase(?)

2010-03-11 Thread Thomas Koch
; ( or http://tinyurl.com/yjr45ut ) The mail is about a lucene index{reader|writer} on top of cassandra and whether something like this could also be done with hbase. Best regards, Thomas Koch, http://www.koch.ro

[ANN] Eclipse GIT plugin beta version released

2010-03-31 Thread Thomas Koch
http://www.infoq.com/news/2010/03/egit-released http://aniszczyk.org/2010/03/22/the-start-of-an-adventure-egitjgit-0-7-1/ Maybe, one day, some apache / hadoop projects will use GIT... :-) (Yes, I know git.apache.org.) Best regards, Thomas Koch, http://www.ko

Indexing PDF documents with structure information

2007-08-13 Thread Thomas Arni
like the page or the chapter, where the relevant information is. Does anyone have similar requirements? Which of these tools best fits my requirements? Thanks for your help Thomas

Re: Searching Diacritics

2007-08-27 Thread thomas arni
You can extend the DefaultAnalyzer. The only thing you have to do is rewrite the method tokenStream like this: /** Constructs a {@link StandardTokenizer} filtered by a {@link StandardFilter}, a {@link LowerCaseFilter} and a {@link StopFilter}.

Sorting with MultiSearcher

2007-11-08 Thread WATHELET Thomas
Hi, I have a few indexes with the same structure. I'm using MultiSearcher to search those indexes, and when I try to sort the result by field, the result is sorted by field and by index (we have all results from index1 and then index2,...) but I would like to have the result sorted on the all result

RE: Sorting with MultiSearcher

2007-11-08 Thread WATHELET Thomas
EMAIL PROTECTED] Sent: 08 November 2007 13:22 To: java-user@lucene.apache.org Subject: Re: Sorting with MultiSearcher Any other info or code snippets? I sort on multisearchers all the time and have never seen that behavior. - Mark (sorting on multisearchers since Lucene 1.4 ) WATHELET Thomas w

Date sorting problem [ IndexSearcher | Hits | Sort | Float ]

2008-03-08 Thread legrand thomas
Dear all, I'm trying to sort query results using a date criterion. My dates are stored as "long" in the database (I cannot change this) and indexed as untokenized. The sorted results I get aren't consistent. This problem does not occur if the numbers are "smaller". Am I doing something wrong? I
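One possible cause, offered only as an assumption based on the "Float" in the subject line: if the long values are compared as 32-bit floats during sorting, large values such as millisecond timestamps lose their low-order bits, so distinct dates can compare as equal. The sketch below is plain Java with hypothetical names and does not touch the Lucene sort API:

```java
public class FloatSortSketch {
    // True when two distinct longs become indistinguishable after conversion to float.
    static boolean collideAsFloat(long a, long b) {
        return (float) a == (float) b;
    }

    public static void main(String[] args) {
        // 2^40 ms since the epoch falls in late 2004; a float has only a 24-bit
        // significand, so near this magnitude adjacent floats are 131072 ms apart.
        long base = 1L << 40;
        long earlier = base + 1_000;  // one second later
        long later = base + 2_000;    // two seconds later

        System.out.println(collideAsFloat(earlier, later)); // both round to the same float
        System.out.println(collideAsFloat(1_000L, 2_000L)); // small values stay distinct
    }
}
```

This would be consistent with the observation that the problem disappears when the numbers are smaller, but the truncated post does not confirm how the sort field was declared.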

Max length

2008-04-15 Thread WATHELET Thomas
iddle. Can anybody help me? Thanks, Thomas WATHELET

lucene can't find segments file

2008-04-17 Thread Hoelzl, Thomas
file. It contains the following files. master:/home/thomas/keywordsearch/etc # ls /usr/local/jboss-3.2.7/server/default/conf/index/ .. _2.cfs segments.gen segments_9 I have checked the index using luke and it is good. In addition it works on Windows. Can anybody tell me why it is se

AW: lucene can't find segments file

2008-04-17 Thread Hoelzl, Thomas
e.org Subject: Re: lucene can't find segments file It seems likely you are using an older version of Lucene to access an index created by a newer version of Lucene? Mike Hoelzl, Thomas wrote: > Hi all! > > I have some problems running my lucene application on linux (suse). > > l

Exact string

2008-04-30 Thread WATHELET Thomas
et something special to my search query? I need help... Thanks in advance. Thomas WATHELET

Search for long titles - wildcard queries

2008-05-10 Thread legrand thomas
Dear all, I'm a recent Lucene user and I'm looking for the best way to perform searches over long titles (ad titles on a website). For example, if the following documents exist: - TITLE, "Fender telecaster" - TITLE, "Land rover defender" - TITLE, "I sale a wonderful fender st

Re: Question about indexing (BrazilianAnalyzer)

2008-06-04 Thread Thomas Arni
; c). Probably the problem is with these accents. You can check this if you adapt the method tokenStream() in the BrazilianAnalyzer by including the ISOLatin1AccentFilter in the filter chain. Thomas Vinicius Carvalho said the following on 03/06/08 20:51: Hello there! I'm indexing documents u

advanced WildcardQuery

2008-07-16 Thread legrand thomas
ardQuery with the term "pretty*car". I also want to get this document when searching for "pretty*sale*". How should I do this? Is it really possible? I use lucene 2.3.1. Thanks in advance, Thomas Legrand
