Re: MappedByteBuffer duplicates

2017-02-24 Thread Greg Bowyer
You may need to enable this: https://github.com/apache/lucene-solr/blob/master/lucene/core/src/java/org/apache/lucene/store/MMapDirectory.java#L167 if you are on a Sun^H^H^H Oracle JVM. On Fri, Feb 24, 2017, at 11:31 AM, Kameron Cole wrote: > Actually, at a certain point, they have crashed the mac

Re: docid is just a signed int32

2016-08-18 Thread Greg Bowyer
What are you trying to index that has more than 3 billion documents per shard / index and cannot be split as Adrien suggests? On Thu, Aug 18, 2016, at 07:35 AM, Cristian Lorenzetto wrote: > Maybe lucene has maxsize 2^31 because result set are java array where > length is a int type. > A suggest

Re: Why we need org.apache.lucene.codecs.Codec

2016-08-04 Thread Greg Bowyer
loading of Codecs > > > > > On Thu, 04 Aug 2016 20:39:46 +0530 Greg Bowyer > <gbow...@fastmail.co.uk>wrote > > > > > Codecs are loaded with the java service loader interface. That file is > > the hook used to tell the service loader that this jar

Re: Why we need org.apache.lucene.codecs.Codec

2016-08-04 Thread Greg Bowyer
Codecs are loaded with the Java service loader interface. That file is the hook used to tell the service loader that this jar implements Codec. Lucene internally calls the service loader and asks which codecs are available. On Wed, Aug 3, 2016, at 11:23 PM, aravinth thangasami wrote: > I don't understand
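As a rough illustration of the service loader mechanism described above, here is a minimal sketch that uses only the JDK's java.util.ServiceLoader. The ToyCodec interface and the ServiceLoaderSketch class are hypothetical stand-ins invented for this example; they are not Lucene classes, and Lucene's own lookup code may differ in detail. The sketch assumes no provider has been registered, so the lookup simply comes back empty.

```java
import java.util.Optional;
import java.util.ServiceLoader;

final class ServiceLoaderSketch {
    /** Hypothetical stand-in for an SPI type such as a codec; not a Lucene class. */
    public interface ToyCodec {
        String name();
    }

    /** Classpath resource that ServiceLoader reads to discover providers of a service type. */
    static String registrationFile(Class<?> service) {
        return "META-INF/services/" + service.getName();
    }

    /** Ask the service loader for every registered ToyCodec and pick one by name. */
    static Optional<ToyCodec> forName(String name) {
        for (ToyCodec codec : ServiceLoader.load(ToyCodec.class)) {
            if (codec.name().equals(name)) {
                return Optional.of(codec);
            }
        }
        return Optional.empty();
    }

    public static void main(String[] args) {
        // Each line of the registration file names one implementing class.
        System.out.println(registrationFile(ToyCodec.class));
        // Empty unless some jar on the classpath ships a matching registration file.
        System.out.println(forName("toy").isPresent());
    }
}
```

A jar that wanted to contribute an implementation would ship a text file at the path returned by registrationFile, listing the fully qualified name of its implementing class on a line of its own.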

Re: get enumeration of all terms starting at a given term after lucene 4

2016-07-28 Thread Greg Bowyer
I am confused by your example; MultiFields.get allows you to ask for a specific field. On Thu, Jul 28, 2016, at 09:00 AM, Mukul Ranjan wrote: > Hi All, > > How to get enumeration of all terms starting at a given term. I have > upgrade lucene version from lucene 3.6 to lucene 5.5.2. After 3.6, > i

Re: Compression algorithm for posting lists

2016-03-28 Thread Greg Bowyer
The posting list is compressed using a specialised technique aimed at pure numbers. Currently the codec uses a variant of Patched Frame of Reference coding to perform this compression. A good survey of such techniques can be found in the good IR books (https://mitpress.mit.edu/books/information-r
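To make the idea concrete, here is a minimal sketch of plain Frame of Reference coding over doc-ID gaps. It is deliberately simpler than the patched variant mentioned above: there is no exception handling for outliers, and the block size, class name and method names are all assumptions made for this example rather than anything taken from the Lucene codec. Ascending doc IDs are turned into gaps, the bit width of the largest gap is found, and every gap is packed at that fixed width.

```java
import java.util.Arrays;

final class ForSketch {
    /** Turn ascending doc IDs into gaps; the first gap is relative to 0. */
    static int[] toDeltas(int[] sortedDocIds) {
        int[] deltas = new int[sortedDocIds.length];
        int prev = 0;
        for (int i = 0; i < sortedDocIds.length; i++) {
            deltas[i] = sortedDocIds[i] - prev;
            prev = sortedDocIds[i];
        }
        return deltas;
    }

    /** Undo toDeltas with a running sum. */
    static int[] fromDeltas(int[] deltas) {
        int[] docIds = new int[deltas.length];
        int sum = 0;
        for (int i = 0; i < deltas.length; i++) {
            sum += deltas[i];
            docIds[i] = sum;
        }
        return docIds;
    }

    /** Bits needed for the largest non-negative value in the block (at least 1). */
    static int bitsRequired(int[] values) {
        int or = 0;
        for (int v : values) {
            or |= v; // the OR has the same highest set bit as the maximum
        }
        return Math.max(1, 32 - Integer.numberOfLeadingZeros(or));
    }

    /** Pack each value into exactly `bits` bits, laid out back to back in 64-bit words. */
    static long[] pack(int[] values, int bits) {
        long[] packed = new long[(values.length * bits + 63) / 64];
        for (int i = 0; i < values.length; i++) {
            long bitPos = (long) i * bits;
            int word = (int) (bitPos >>> 6);
            int shift = (int) (bitPos & 63);
            packed[word] |= ((long) values[i]) << shift;
            if (shift + bits > 64) { // value straddles two words
                packed[word + 1] |= ((long) values[i]) >>> (64 - shift);
            }
        }
        return packed;
    }

    /** Read back `count` values of `bits` bits each. */
    static int[] unpack(long[] packed, int bits, int count) {
        int[] values = new int[count];
        long mask = (1L << bits) - 1;
        for (int i = 0; i < count; i++) {
            long bitPos = (long) i * bits;
            int word = (int) (bitPos >>> 6);
            int shift = (int) (bitPos & 63);
            long v = packed[word] >>> shift;
            if (shift + bits > 64) {
                v |= packed[word + 1] << (64 - shift);
            }
            values[i] = (int) (v & mask);
        }
        return values;
    }

    public static void main(String[] args) {
        int[] docIds = {3, 7, 8, 20, 21, 150};
        int[] deltas = toDeltas(docIds);
        int bits = bitsRequired(deltas);
        long[] packed = pack(deltas, bits);
        int[] decoded = fromDeltas(unpack(packed, bits, docIds.length));
        System.out.println(bits + " bits per gap, round trip ok: " + Arrays.equals(docIds, decoded));
    }
}
```

The weakness of the plain scheme is that a single large gap forces the whole block to the wider bit width. The patched variants address this by storing such outliers separately as exceptions.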

Re: Spaces in regular expressions

2016-02-25 Thread Greg Bowyer
Possibly not helpful but some time ago Russ Cox implemented a code search at Google. His design is documented here https://swtch.com/~rsc/regexp/regexp4.html On Wed, Feb 24, 2016, at 08:01 AM, Kudrettin Güleryüz wrote: > I appreciate the pointers Jack. More on that, where can I read more on > ena

Re: Wikipedia Index

2012-06-19 Thread Greg Bowyer
It depends on what you want, but the wikipedia data dumps can be found here http://en.wikipedia.org/wiki/Wikipedia:Database_download On 19/06/12 17:03, Elshaimaa Ali wrote: I only have the source text on a mysql database Do you know where I can download it in xml and is it possible to split the

Index pruning

2012-05-29 Thread Greg Bowyer
Hi all I am playing about with the index pruning contrib package, I want to see if it will make a faster and slightly smaller index for me. However when I try either Carmel or RIDF methods it just ends up deleting all my postings for the two fields of interest. My command line for RIDF is as

Re: [MAVEN] Heads up: build changes

2012-05-09 Thread Greg Bowyer
Sorry this was my fault, I found that my bsf jars were broken in my ant install. On 08/05/12 14:32, Greg Bowyer wrote: greg@localhost ~ $ java -version java version "1.7.0_04" Java(TM) SE Runtime Environment (build 1.7.0_04-b20) Java HotSpot(TM) 64-Bit Server VM (build 23.0-b21,

Re: [MAVEN] Heads up: build changes

2012-05-08 Thread Greg Bowyer
nality used in the lucene/site/xsl/index.xsl stylesheet, which is invoked from the 'process-webpages' target. What JDK/version/vendor/platform are you using? Steve -----Original Message- From: Greg Bowyer [mailto:gbow...@fastmail.co.uk] Sent: Tuesday, May 08, 2012 4:54 PM To: java

Re: [MAVEN] Heads up: build changes

2012-05-08 Thread Greg Bowyer
For me ant generate-maven-artifacts is giving me this error, any thoughts? -- %< -- process-webpages: [xslt] Processing /home/greg/projects/lucene-solr/lucene/build.xml to /home/greg/projects/lucene-solr/lucene/build/docs/index.html [xslt] Loading stylesheet /home/greg/projects/luce

Re: PyLucene Error Message

2012-03-29 Thread Greg Bowyer
I seem to be able to reproduce this; it looks like there might be a race condition here, though it's quite hard to reproduce. I have filed a defect with pylucene (https://issues.apache.org/jira/browse/PYLUCENE-17). On 29/03/12 12:36, Greg Bowyer wrote: There seems to be a bug in pylucene where it races if

Re: PyLucene Error Message

2012-03-29 Thread Greg Bowyer
already tried a sleep(1) when populating the queue, with no results. From: Greg Bowyer [mailto:gbow...@fastmail.co.uk] Sent: 29 March 2012 18:09 To: David Mosca Cc: java-user@lucene.apache.org Subject: Re: PyLucene Error Message It's a bit crap, but can you stick a time.sleep(0.5) just after the

Re: PyLucene Error Message

2012-03-29 Thread Greg Bowyer
self.queue = queue self.jvm = jvm def run(self): self.jvm.attachCurrentThread() I have tried lucene.getVMEnv().attachCurrentThread() instead but I still get the same error message. Thanks, David *From:*Greg Bowyer [mailto:gbow...@fastmail.co.uk] *Sent:* 29 March 2012 17:30 *To:* Da

Re: PyLucene Error Message

2012-03-29 Thread Greg Bowyer
: I have re-attached the log. Thanks, David -Original Message- From: Greg Bowyer [mailto:gbow...@fastmail.co.uk] Sent: 29 March 2012 16:55 To: java-user@lucene.apache.org Subject: Re: PyLucene Error Message I don't see any attached log, can you attach the log please. -- Greg On 29/03

Re: PyLucene Error Message

2012-03-29 Thread Greg Bowyer
I don't see any attached log, can you attach the log please. -- Greg On 29/03/2012 07:35, David Mosca wrote: Hello, I am using Lucene version 3.4 through the Python extension (pylucene) in a multi-threaded script. When I launch the script I sometimes get a fatal error message (log attached) a

Re: Concurrency and multiple merge threads

2012-02-18 Thread Greg Bowyer
You're not very clear about where you see the specific slow operations: at search or re-index time. I am going to go out on a limb here and suggest that maybe it's at index time, and maybe the YourKit trace showing the 5 merge threads awaiting the monitor is the cause of your issues. You claim

Re: Merging results from two searches on two separate Searchers

2012-02-14 Thread Greg Bowyer
Out of sheer curiosity, what makes scores different across queries? I am not suggesting they should be the same, just filling in terrible gaps in my knowledge that I have not quite fathomed yet during source diving. On 14/02/12 16:46, Uwe Schindler wrote: Scores are only compatible if the query

Re: Adding metadata to Lucene indexes?

2011-11-03 Thread Greg Bowyer
I would look at the metadata for this. The magic document is something that I did previously for exactly this problem, and two weeks later we removed it, as so much of the code started having to check whether the document was the magic document. The only thing with lucene metadata is that solr, cur

Re: Strange bug when we enable faceting

2011-11-02 Thread Greg Bowyer
given, since this is true for most queries this causes the caching to occur at (2). Sorry for asking redundant questions on the mailing list :S On 02/11/11 17:09, Greg Bowyer wrote: Ignore this !! I discovered through testing and code review today just what things the filter cache is us

Re: Strange bug when we enable faceting

2011-11-02 Thread Greg Bowyer
Ignore this!! I discovered through testing and code review today just what things the filter cache is used for and why my previous thinking was wrong: I had the cache set too large to accommodate all of the other things the filter cache stores. On 02/11/11 11:17, Greg Bowyer wrote: When I

Strange bug when we enable faceting

2011-11-02 Thread Greg Bowyer
cache fills rapidly with cache items that are unlikely to ever be required again. Does anyone have any ideas on what could be causing this? -- Greg Bowyer

Re: Case insensitive sortable column

2011-10-11 Thread Greg Bowyer
I might be missing something here, but can't you just lowercase during indexing? On 11/10/11 09:48, Senthil V S wrote: Hi, I'm new to Lucene. I have records and I wanna sort them by fields. I've created indexes for those fields with 'not_analyzed'. The sort is case sensitive. In a sense, *A...*
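The "lowercase during indexing" suggestion amounts to sorting on a normalised copy of the value while still displaying the original. A minimal sketch of that idea in plain Java follows; it does not use any Lucene API, and the class and method names are hypothetical. In a real index the lowercased key would presumably be stored in its own not-analyzed sort field alongside the original field.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.Locale;

final class LowercaseSortKeySketch {
    /** Key computed once at index time; Locale.ROOT keeps it independent of the machine's locale. */
    static String sortKey(String value) {
        return value.toLowerCase(Locale.ROOT);
    }

    /** Order the original values by their lowercased key; ties keep their input order. */
    static List<String> sortBySortKey(List<String> values) {
        List<String> sorted = new ArrayList<>(values);
        sorted.sort(Comparator.comparing(LowercaseSortKeySketch::sortKey));
        return sorted;
    }

    public static void main(String[] args) {
        System.out.println(sortBySortKey(List.of("banana", "Apple", "cherry", "apple")));
    }
}
```

Because the key is computed once per record when it is written, the cost of normalising is paid at index time rather than on every sort.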