hithighlighter bug

2007-01-09 Thread Jason
before? Is this a genuine new bug or something of which the Lucene folk (or at least whoever wrote the highlighter) are aware? Can anyone think of a way to fix this without scanning every element in my result text for rogue spaces? Thanks in advance, Jason.

how can I filter my search to not include items containing a particular field and value?

2007-01-10 Thread Jason
field name/value pairs. I'm sure it must be simple - I just can't see how to do it. Thanks. Jason. - To unsubscribe, e-mail: [EMAIL PROTECTED] For additional commands, e-mail: [EMAIL PROTECTED]

Re: how can I filter my search to not include items containing a particular field and value?

2007-01-11 Thread Jason
earch the mail archive for one of several explications if you are thinking of the NOT operator like a boolean logic operator. It's not, quite. On 1/10/07, Jason <[EMAIL PROTECTED]> wrote: how can I filter my search to not include items containing a particular field and value? I want e

where is the proper place to report lucene bugs?

2007-01-11 Thread Jason
can someone please tell me where the most appropriate place to report bugs might be - in this case for the hit-highlighter contribution Thanks Jason.

Re: where is the proper place to report lucene bugs?

2007-01-11 Thread Jason
ween '< e' is not a good thing. Thanks for the response. Jason. Grant Ingersoll wrote: From the resources section of the website, the Issue Tracking link is: http://issues.apache.org/jira/browse/LUCENE Also, it is helpful if you have done a preliminary search on the topic and som

about the wordnet program.

2006-01-12 Thread jason
Hi, I am trying to use the Lucene WordNet program for my application. However, I ran into some problems. When I incorporate these files, Syns2Index.java, SynLookup.java, and SynExpand.java, I find some variables are not defined. For instance, in Syns2Index.java, writer.setMergeFactor( writ

Re: about the wordnet program.

2006-01-13 Thread jason
.NO)); --> doc.add( Field.UnIndexed( F_SYN , cur)); For SynLookup and SynExpand, tmp.add( tq, BooleanClause.Occur.SHOULD); --> tmp.add(tq, true, false); On 1/13/06, Daniel Naber <[EMAIL PROTECTED]> wrote: > > On Donnerstag 12 Januar 2006 16:25, jason wrote: > > >

One problem of using the lucene

2006-01-16 Thread jason
Hi, I have a problem using Lucene. I wrote a SynonymFilter which can add synonyms from WordNet. Meanwhile, I used the SnowballFilter for term stemming. However, I ran into a problem when combining the two filters. For instance, I got 17 documents containing the Term "support" and the follo

Re: One problem of using the lucene

2006-01-16 Thread jason
the picture for > the time being while trouble shooting. > > If you are using QueryParser, are you using the same analyzer? If > this is the case, what is the .toString of the generated Query? > >Erik > > > On Jan 16, 2006, at 3:54 AM, jason wrote: > > > H

Re: One problem of using the lucene

2006-01-17 Thread jason
gain. yours truly Jiang Xing On 1/17/06, Erik Hatcher <[EMAIL PROTECTED]> wrote: > > > On Jan 17, 2006, at 12:14 AM, jason wrote: > > It is adding tokens into the same position as the original token. > > And then, > > I used the QueryParser for searching and the snow

Re: One problem of using the lucene

2006-01-17 Thread jason
Ok, i will try it. On 1/17/06, Erik Hatcher <[EMAIL PROTECTED]> wrote: > > > On Jan 17, 2006, at 5:58 AM, jason wrote: > > I have test the snowballFilter and it does not stem the term > > "support". It > > means the term "support"

Use the lucene for searching in the Semantic Web.

2006-01-17 Thread jason
Hi friends, What do you think about using Lucene for searching in the Semantic Web? I am trying to use Lucene for searching documents with ontological annotation, but I have not found a good model to combine the keyword information and the ontological information. Regards, Jiang Xing

Re: Use the lucene for searching in the Semantic Web.

2006-01-17 Thread jason
> Erik > > > On Jan 17, 2006, at 9:34 AM, jason wrote: > > > Hi friends, > > > > How do you think use the lucene for searching in the Semantic Web? > > I am > > trying using the lucene for searching documents with ontological > > a

Re: Question.

2006-02-05 Thread jason
You can build the term frequency matrix first. Then, select the most frequent terms. An earlier message on the list explained how to build the term frequency matrix. Regards, Jiang Xing On 2/6/06, Pranay Jain <[EMAIL PROTECTED]> wrote: > > I have earlier used lucene and I must say it has performed bug free for > the >
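The two steps in this reply (build a term frequency matrix, then pick the most frequent terms) can be sketched roughly as follows. This is only an illustration in plain Python, assuming whitespace tokenization and lowercasing; `term_frequency_matrix` and `most_frequent_terms` are hypothetical helper names, not Lucene APIs, and a real index would supply the counts itself.

```python
from collections import Counter


def term_frequency_matrix(docs):
    """Return (vocabulary, rows); rows[i][j] counts vocabulary[j] in docs[i]."""
    # Assumption: naive whitespace tokenization, lowercased.
    counts = [Counter(doc.lower().split()) for doc in docs]
    vocabulary = sorted(set().union(*counts))
    rows = [[c[term] for term in vocabulary] for c in counts]
    return vocabulary, rows


def most_frequent_terms(docs, n):
    """Return the n terms with the highest total frequency across all docs."""
    vocabulary, rows = term_frequency_matrix(docs)
    # Sum each column of the matrix to get one total per term.
    totals = [sum(column) for column in zip(*rows)]
    # Sort by descending total, breaking ties alphabetically.
    ranked = sorted(zip(vocabulary, totals), key=lambda pair: (-pair[1], pair[0]))
    return ranked[:n]


docs = ["lucene index search", "lucene search", "lucene"]
top_terms = most_frequent_terms(docs, 2)
```

Keeping the matrix as a separate step makes it easy to swap the ranking rule (for example document frequency instead of total frequency) without touching the counting code.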

Re: two problems of using the lucene.

2006-02-05 Thread jason
Hi, I tried to read the Lucene source code, but I could only find the tf/idf measure actually implemented in TermScorer.java. I guess the QueryParser class converts each word into a TermQuery first, and then queries such as the BooleanQuery are built. The source code o

understand the queryNorm and the fieldNorm.

2006-02-06 Thread jason
Hi, I have a problem understanding the queryNorm and fieldNorm. The following is an example. I tried to follow what is said in the Javadoc "Computes the normalization value for a query given the sum of the squared weights of each of the query terms". But the result is different. ID:0 C:/PDF2Text/S

To understand the queryNorm and fieldNorm

2006-02-06 Thread jason
Hi, I have a problem understanding the queryNorm and fieldNorm. The following is an example. I tried to follow what is said in the Javadoc "Computes the normalization value for a query given the sum of the squared weights of each of the query terms". But the result is different. ID:0 C:/PDF2Text/S

Re: understand the queryNorm and the fieldNorm.

2006-02-06 Thread jason
Hi, thanks. I think I forgot the ^0.5. Cheers, Jason On 2/6/06, Yonik Seeley <[EMAIL PROTECTED]> wrote: > > Hi Jason, > I get the same thing for the queryNorm when I calculate it by hand: > 1/((1.7613963**2 + 1.326625**2)**.5) = 0.45349488111693986 > > -Yonik > > On 2
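The hand calculation quoted in this reply can be checked with a few lines of Python. This is only a sketch of the arithmetic (one over the square root of the sum of the squared term weights); `query_norm` is a hypothetical helper name, not a Lucene API, and the two weights are simply the values from Yonik's message.

```python
import math


def query_norm(term_weights):
    """Return 1 / sqrt(sum of squared weights), as in the quoted hand calculation."""
    sum_of_squared_weights = sum(w * w for w in term_weights)
    return 1.0 / math.sqrt(sum_of_squared_weights)


def query_norm_without_sqrt(term_weights):
    """The mistaken variant that forgets the ^0.5 step; shown only for comparison."""
    return 1.0 / sum(w * w for w in term_weights)


weights = [1.7613963, 1.326625]  # term weights taken from the quoted reply
norm = query_norm(weights)       # about 0.4535, matching the value in the reply
wrong = query_norm_without_sqrt(weights)
```

Comparing `norm` with `wrong` shows how large the discrepancy is when the square root is left out, which is the slip described in this message.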

Re: Stemmer algorithms

2006-02-13 Thread jason
Hi, I have tested some stemmer algorithms in my application. However, I think we'd better write a weaker algorithm. I mean, Porter and some other algorithms are too strong. Maybe an algorithm which can convert plural nouns to singular is enough. On 2/14/06, Yilmazel, Sibel <[EMAIL PROTECTED]> wro

Add a module to the lucene

2006-03-14 Thread jason
class to read the vectors and use our own defined measure to calculate their similarity. What do you think of it? Regards, Jason

Add a module to the lucene!!!

2006-03-14 Thread jason
class to read the vectors and use our own defined measure to calculate their similarity. What do you think of it? Regards, Jason

Add more module to the lucene

2006-03-14 Thread jason
the vectors of documents from the index structure. Then, we can use our own similarity measures. FYI. Regards jason.

Add a module to the lucene

2006-03-15 Thread jason
class to read the vectors and use our own defined measure to calculate their similarity. What do you think of it? Regards, Jason

Re: how to cluster documents

2006-03-21 Thread jason
I guess you should use some text mining tools. You can use Google to find them. I remember UIUC recently released one tool. It is very good. On 3/21/06, Valerio Schiavoni <[EMAIL PROTECTED]> wrote: > > Hello, > not sure if the term 'cluster' is the correct one, but here what i would > like to do: > gi

for the similarity measure

2006-04-27 Thread jason
Hi, After reading the code, I found the similarity measure in Lucene is not the same as the commonly used cosine coefficient measure. I don't know whether that is correct. And I wonder whether I can use the cosine coefficient measure in Lucene or maybe the Dice's coefficient, Jaccard's coefficient and overla
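For readers comparing the measures named in this message, here is a rough sketch of the cosine, Dice and Jaccard coefficients over term sets. It assumes binary term weights (a term is either present or absent) and says nothing about how Lucene itself scores; the function names are hypothetical and exist only for illustration.

```python
import math


def cosine_coefficient(a, b):
    """|A & B| / sqrt(|A| * |B|) for binary term sets."""
    if not a or not b:
        return 0.0
    return len(a & b) / math.sqrt(len(a) * len(b))


def dice_coefficient(a, b):
    """2 * |A & B| / (|A| + |B|)."""
    if not a and not b:
        return 0.0
    return 2 * len(a & b) / (len(a) + len(b))


def jaccard_coefficient(a, b):
    """|A & B| / |A | B|."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


doc_a = {"lucene", "search", "index"}
doc_b = {"lucene", "search", "engine", "java"}
scores = (
    cosine_coefficient(doc_a, doc_b),
    dice_coefficient(doc_a, doc_b),
    jaccard_coefficient(doc_a, doc_b),
)
```

All three share the same numerator (the number of shared terms) and differ only in how they normalize by document size, which is why they usually rank documents similarly but not identically.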

Re: Vector space model

2006-04-28 Thread jason
Hi, I am also interested in this problem. Regards Jason On 4/28/06, trupti mulajkar <[EMAIL PROTECTED]> wrote: > > hi > > i am trying to implement the vector space model for lucene. > i did find some code for generating the vectors, but can any1 suggest a > bett

Re: frequent keyword computation within a search ( and timeinterval )

2012-01-05 Thread Jason Rutherglen
> Short answer is that no, there isn't an aggregate > function. And you shouldn't even try If that is the case why does a 'stats' component exist for Solr with the SUM function built in? http://wiki.apache.org/solr/StatsComponent On Thu, Jan 5, 2012 at 1:37 PM, Erick Erickson wrote: > You will

Re: frequent keyword computation within a search ( and timeinterval )

2012-01-05 Thread Jason Rutherglen
red > SUM, stats would do it. > > Erick > > On Thu, Jan 5, 2012 at 7:23 PM, Jason Rutherglen > wrote: >>> Short answer is that no, there isn't an aggregate >>> function. And you shouldn't even try >> >> If that is the case why does a 'st

date issues

2012-02-22 Thread Jason Toy
n the mailing list and on google and not sure what to use, I would appreciate any pointers. Thanks. Jason

Re: date issues

2012-02-22 Thread Jason Toy
ya > www.findbestopensource.com > > > On Thu, Feb 23, 2012 at 11:55 AM, Jason Toy wrote: > >> I have a solr instance with about 400m docs. For text searches it is >> perfectly fine. When I do searches that calculate the amount of times a >> word appeared in the doc

Re: RAMDirectory unexpectedly slows

2012-06-04 Thread Jason Rutherglen
If you want the index to be stored completely in RAM, there is the ByteBuffer directory [1]. Though I do not see the point in putting an index in RAM, it will be cached in RAM regardless in the OS system IO cache. 1. https://github.com/elasticsearch/elasticsearch/blob/master/src/main/java/org/ap

Re: RAMDirectory unexpectedly slows

2012-06-04 Thread Jason Rutherglen
t. Is that right? > > What about the ByteBufferDirectory? Can this specific directory utilize the > 2GB memory I grant to the app? > > On Mon, Jun 4, 2012 at 10:58 PM, Jason Rutherglen < > jason.rutherg...@gmail.com> wrote: > >> If you want the index to be stored

Looking for case studies for 'Lucene and Solr: The Definitive Guide' from O'Reilly

2012-12-17 Thread Jason Rutherglen
Cloud * Hadoop integration Thanks, Jason Rutherglen, Jack Krupansky, and Ryan Tabora http://shop.oreilly.com/product/0636920028765.do

Lucene VSM scoring

2013-07-09 Thread Jason Z.
Hi, In the Lucene docs it mentions that Lucene implements a tf-idf weighting scheme for scoring. Is there any way to modify Lucene to implement a custom weighting scheme for the VSM? Thank you.

Monitoring low level IO

2010-06-03 Thread Jason Rutherglen
This is more of a unix related question than Lucene specific however because Lucene is being used, I'm asking here as perhaps other people have run into a similar issue. On an Amazon EC2 merge, read, and write operations are possibly blocking due to underlying IO. Is there a tool that you have use

CFP for Surge Scalability Conference 2010

2010-06-14 Thread Jason Dixon
n Surge is just what you've been waiting for. For more information, including CFP, sponsorship of the event, or participating as an exhibitor, please contact us at su...@omniti.com. Thanks, -- Jason Dixon OmniTI Computer Consulting, Inc. jdi...@omniti.co

Re: Last Call: Lucene Revolution CFP Closes Tomorrow Wednesday, June 23, 2010, 12 Midnight PDT

2010-06-22 Thread Jason Rutherglen
Grant, I can probably do the 3 billion document one from Prague, or a realtime search one... I spaced on submitting for ApacheCon. Are there cool places in the Carolinas to hang? Cheers bro, Jason On Tue, Jun 22, 2010 at 10:51 AM, Grant Ingersoll wrote: > Lucene Revolution Call

CFP for Surge Scalability Conference 2010

2010-07-02 Thread Jason Dixon
icipating as an exhibitor, please visit the Surge website or contact us at su...@omniti.com. Thanks, -- Jason Dixon OmniTI Computer Consulting, Inc. jdi...@omniti.com 443.325.1357 x.241

Last day to submit your Surge 2010 CFP!

2010-07-09 Thread Jason Dixon
your business sponsor/exhibit at Surge 2010, please contact us at su...@omniti.com. Thanks! -- Jason Dixon OmniTI Computer Consulting, Inc. jdi...@omniti.com 443.325.1357 x.241

Register now for Surge 2010

2010-08-02 Thread Jason Dixon
your seat to this year's event! http://omniti.com/surge/2010/register Thanks, -- Jason Dixon OmniTI Computer Consulting, Inc. jdi...@omniti.com 443.325.1357 x.241

Surge 2010 Early Registration ends Tuesday!

2010-08-27 Thread Jason Dixon
t and guarantee your seat to this year's event! -- Jason Dixon OmniTI Computer Consulting, Inc. jdi...@omniti.com 443.325.1357 x.241

Recreate segment infos

2010-10-04 Thread Jason Rutherglen
Lets say the segment infos file is missing, and I'm aware of CheckIndex, however is there a tool to recreate a segment infos file?

Re: Recreate segment infos

2010-10-05 Thread Jason Rutherglen
egment is given the same name as the first segment that > shares it.  However, unfortunately, because of merging, it's possible > that this mapping is not easy (maybe not possible, depending on the > merge policy...) to reconstruct.  I think this'll be the hardest part > :) >

Re: API access to in-memory tii file (3.x not flex).

2010-11-10 Thread Jason Rutherglen
In a word, no. You'd need to customize the Lucene source to accomplish this. On Wed, Nov 10, 2010 at 1:02 PM, Burton-West, Tom wrote: > Hello all, > > We have an extremely large number of terms in our indexes.  I want to be able > to extract a sample of the terms, say something like every 128th

Re: API access to in-memory tii file (3.x not flex).

2010-11-10 Thread Jason Rutherglen
Yeah that's customizing the Lucene source. :) I should have gone into more detail, I will next time. On Wed, Nov 10, 2010 at 2:10 PM, Michael McCandless wrote: > Actually, the .tii file pre-flex (3.x) is nearly identical to the .tis > file, just that it only contains every 128th term. > > If you
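The idea discussed in this thread, keeping only every 128th entry of a sorted term list, can be illustrated outside Lucene with a simple slice. This is a sketch under stated assumptions: `sample_terms` is a hypothetical helper, the input is an already sorted in-memory list, and nothing here reads a real .tii or .tis file.

```python
def sample_terms(sorted_terms, interval=128):
    """Return every `interval`-th term, starting with the first one."""
    if interval < 1:
        raise ValueError("interval must be at least 1")
    return sorted_terms[::interval]


# Hypothetical term dictionary with 1000 sorted entries.
all_terms = [f"term{i:05d}" for i in range(1000)]
sampled = sample_terms(all_terms)
```

With 1000 terms and the default interval, the sample holds the terms at positions 0, 128, 256 and so on, which is a small fraction of the full dictionary.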

Storing an ID alongside a document

2011-02-02 Thread Jason Rutherglen
I'm curious if there's a new way (using flex or term states) to store IDs alongside a document and retrieve the IDs of the top N results? The goal would be to minimize HD seeks, and not use field caches (because they consume too much heap space) or the doc stores (which require two seeks). One pos

Re: Storing an ID alongside a document

2011-02-02 Thread Jason Rutherglen
s branch) > > -Yonik > http://lucidimagination.com > > > On Wed, Feb 2, 2011 at 1:03 PM, Jason Rutherglen > wrote: > >> I'm curious if there's a new way (using flex or term states) to store >> IDs alongside a document and retrieve the IDs of the top N resul

Re: Storing an ID alongside a document

2011-02-03 Thread Jason Rutherglen
> there is a entire RAM resident part and a Iterator API that reads / > streams data directly from disk. > look at DocValuesEnum vs, Source Nice, thanks! On Thu, Feb 3, 2011 at 12:20 AM, Simon Willnauer wrote: > On Thu, Feb 3, 2011 at 3:23 AM, Jason Rutherglen > wrote: >>

Last/max term in Lucene 4.x

2011-02-18 Thread Jason Rutherglen
This could be a rhetorical question. The way to find the last/max term that is unique per document is to use TermsEnum to seek to the first term of a field, then call seek to the docFreq-1 for the last ord, then get the term, or is there a better/faster way?

Re: Last/max term in Lucene 4.x

2011-02-19 Thread Jason Rutherglen
h the existing) to automatically store the max term? On Sat, Feb 19, 2011 at 3:33 AM, Michael McCandless wrote: > I don't quite understand your question Jason... > > Seeking to the first term of the field just gets you the smallest term > (in unsigned byte[] order, ie Unicode order

Re: Last/max term in Lucene 4.x

2011-02-20 Thread Jason Rutherglen
rd. How would I seek to the last term in the index using VarGaps? Or do I need to interact directly with the FST class (and if so I'm not sure what to do there either). Thanks Mike. On Sun, Feb 20, 2011 at 2:51 PM, Michael McCandless wrote: > On Sat, Feb 19, 2011 at 8:42 AM, Jason Rutherg

Re: Last/max term in Lucene 4.x

2011-02-21 Thread Jason Rutherglen
ordered IDs stored in the index, so that remaining documents (that lets say were left in RAM prior to process termination) can be indexed. It's an inferred transaction checkpoint. On Mon, Feb 21, 2011 at 5:31 AM, Michael McCandless wrote: > On Sun, Feb 20, 2011 at 8:47 PM, Jason Rutherglen

Proper way to deal with shared indexer exception

2011-02-25 Thread Jason Tesser
AlreadyClosedException ace OR ClosedChannelException OR IOException what would be the best to do with my shared searcher? 2. Is reopen enough, or should I get a brand new searcher? Thanks, Jason Tesser dotCMS Lead Development Manager 1-305-858-1422

Is ConcurrentMergeScheduler useful for multiple running IndexWriter's?

2011-03-04 Thread Jason Rutherglen
ConcurrentMergeScheduler is tied to a specific IndexWriter, however if we're running in an environment (such as Solr's multiple cores, and other similar scenarios) then we'd have a CMS per IW. I think this effectively disables CMS's max thread merge throttling feature?

Append Codec random testing

2011-03-21 Thread Jason Rutherglen
I'm seeing an error when using the misc Append codec. java.lang.AssertionError at org.apache.lucene.store.ByteArrayDataInput.readBytes(ByteArrayDataInput.java:107) at org.apache.lucene.index.codecs.BlockTermsReader$FieldReader$SegmentTermsEnum._next(BlockTermsReader.java:661) at org.apache.luce

Re: DocIdSet to represent small number of hits in large Document set

2011-04-05 Thread Jason Rutherglen
I think Solr has a HashDocSet implementation? On Tue, Apr 5, 2011 at 3:19 AM, Michael McCandless wrote: > Can we simply factor out (poach!) those useful-sounding classes from > Nutch into Lucene? > > Mike > > http://blog.mikemccandless.com > > On Tue, Apr 5, 2011 at 2:24 AM, Antony Bowesman > w

Lucene Util question

2011-04-08 Thread Jason Rutherglen
Is http://code.google.com/a/apache-extras.org/p/luceneutil/ designed to replace or augment the contrib benchmark? For example it looks like SearchPerfTest would be useful for executing queries over a pre-built index. Though there's no indexing tool in the code tree?

found a bug, not sure if its lucene or solr

2011-06-03 Thread Jason Toy
in the document. For that reason I believe the bug is in solr and not in lucene, but I'm not certain. Jason Toy socmetrics http://socmetrics.com @jtoy

Re: Index size and performance degradation

2011-06-13 Thread Jason Rutherglen
> I don't think we'd do the post-filtering solution, but instead maybe > resolve the deletes "live" and store them in a transactional data I think Michael B. aptly described the sequence ID approach for 'live' deletes? On Mon, Jun 13, 2011 at 3:00 PM, Michael McCandless wrote: > Yes, adding dele

Re: Index size and performance degradation

2011-06-13 Thread Jason Rutherglen
> deletions made by readers merely mark it for > deletion, and once a doc has been marked for deletions it is deleted for all > intents and purposes, right? There's the point-in-timeness of a reader to consider. > Does the N in NRT represent only the cost of reopening a searcher? Aptly put, and

how to approach phrase queries and term grouping

2011-06-22 Thread Jason Guild
mentioned in the text if that helps. Thanks for any help you can provide. Jason

RE: i'm having some trouble with class FSDirectory

2011-08-24 Thread Sendros, Jason
Hi Mostafa, Try looking through the API for help with these types of questions: http://lucene.apache.org/java/3_3_0/api/all/org/apache/lucene/store/FSDirectory.html You can use a number of FSDirectory subclasses depending on your circumstances. Hope this helps! Jason -Original Message

RE: Lucene scoring and random result order

2011-08-25 Thread Sendros, Jason
You can sort on multiple values. Keep the primary sort as a relevancy sort, and choose something else to sort on to keep the rest of the responses fairly static. http://lucene.apache.org/java/3_3_0/api/core/org/apache/lucene/search/Sort.html Example: Sort sortBy = new Sort(new SortField[] { Sort

RE: deleting with sorting and max document

2011-09-14 Thread Sendros, Jason
Vincent, I think you may be looking for the following method: http://lucene.apache.org/java/2_9_2/api/all/org/apache/lucene/index/IndexWriter.html#deleteDocuments%28org.apache.lucene.search.Query%29 Jason -Original Message- From: v.se...@lombardodier.com [mailto:v.se

RE: searching / sorting on timestamp and update efficiency

2011-09-22 Thread Sendros, Jason
n to avoid memory leaks. Jason -Original Message- From: Sam Jiang [mailto:sam.ji...@karoshealth.com] Sent: Thursday, September 22, 2011 10:18 AM To: java-user@lucene.apache.org Subject: searching / sorting on timestamp and update efficiency Hi all I have some questions about how I sh

RE: Case insensitive sortable column

2011-10-11 Thread Sendros, Jason
If that's not an option, create another column with the same data lowercased and search on the new column while displaying the original column. Jason -Original Message- From: Greg Bowyer [mailto:gbow...@shopzilla.com] Sent: Tuesday, October 11, 2011 10:43 PM To: java
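The shadow-column idea in this reply can be sketched in plain Python: keep the original value for display and add a lowercased copy used only for ordering. The field name `title` and the `_sort` suffix are assumptions made for this example; in an index the lowercased copy would be a separate field populated at indexing time.

```python
def add_sort_column(rows, field):
    """Return copies of rows with an extra lowercased '<field>_sort' entry."""
    return [dict(row, **{field + "_sort": row[field].lower()}) for row in rows]


rows = [{"title": "cherry"}, {"title": "Banana"}, {"title": "apple"}]

# Sorting on the original column is case sensitive: uppercase letters sort first.
case_sensitive = [r["title"] for r in sorted(rows, key=lambda r: r["title"])]

# Sorting on the lowercased shadow column, while still displaying the original.
with_sort_column = add_sort_column(rows, "title")
case_insensitive = [
    r["title"] for r in sorted(with_sort_column, key=lambda r: r["title_sort"])
]
```

The cost is one extra stored value per row, in exchange for not having to lowercase anything at query time.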

Re: ElasticSearch

2011-11-16 Thread Jason Rutherglen
> even high complexity as ES supports lucene-like query nesting via JSON That sounds interesting. Where is it described in the ES docs? Thanks. On Wed, Nov 16, 2011 at 1:36 PM, Peter Karich wrote: >  Hi, > > its not really fair to compare NRT of Solr to ElasticSearch. > ElasticSearch provides

Re: ElasticSearch

2011-11-16 Thread Jason Rutherglen
The docs are slim on examples. On Wed, Nov 16, 2011 at 3:35 PM, Peter Karich wrote: > >>> even high complexity as ES supports lucene-like query nesting via JSON >> That sounds interesting.  Where is it described in the ES docs?  Thanks. > > "Think of the Query DSL as an AST of queries" > http://w

BigInteger usage in numeric Trie range queries

2011-11-28 Thread Jason Rutherglen
Even though the NumericRangeQuery.new* methods do not support BigInteger, the underlying recursive algorithm supports any sized number. Has this been explored?

deleteDocuments(Term... terms) takes a long time to do nothing.

2013-12-13 Thread Jason Corekin
or Lucene 4.6. If anyone has any ideas as to what I might be doing wrong, I would really appreciate reading what you have to say. Thanks in advance. Jason private void cloneDB() throws QueryNodeException { Document doc

Re: deleteDocuments(Term... terms) takes a long time to do nothing.

2013-12-14 Thread Jason Corekin
",filename, Field.Store.YES)); On Sat, Dec 14, 2013 at 1:28 AM, Jason Corekin wrote: > Let me start by stating that I almost certain that I am doing something > wrong, and that I hope that I am because if not there is a VERY large bug > in Lucene. What I am trying to d

Re: deleteDocuments(Term... terms) takes a long time to do nothing.

2013-12-14 Thread Jason Corekin
Mike, Thanks for the input, it will take me some time to digest and try everything you wrote about. I will post back the answers to your questions and the results from the suggestions you made once I have gone over everything. Thanks for the quick reply, Jason On Sat, Dec 14, 2013 at 5:13

Re: deleteDocuments(Term... terms) takes a long time to do nothing.

2013-12-16 Thread Jason Corekin
tried to search by query I used the filenames stored in each document as the query, which was essentially equivalent to deleting by term. Your email helped me to realize this and in turn change my query to be time range based, which now takes seconds to run. Thank You Jason Corekin >It sou

codec mismatch

2014-02-14 Thread Jason Wee
Hello, This is my first question to the Lucene mailing list, sorry if the question sounds funny. I have been experimenting with storing Lucene index files on Cassandra, unfortunately the exception got overwhelmed. Below is the stack trace. org.apache.lucene.index.CorruptIndexException: codec mismatch: a

Re: codec mismatch

2014-02-17 Thread Jason Wee
le name? > > Mike McCandless > > http://blog.mikemccandless.com > > > On Fri, Feb 14, 2014 at 3:13 AM, Jason Wee wrote: > > Hello, > > > > This is my first question to lucene mailing list, sorry if the question > > sounds funny. > > > > I have

Re: codec mismatch

2014-03-06 Thread Jason Wee
wrongly. It was set from 0 all the time when it should be set based on lucene called seek(position). Thank you again. Jack, it is educational purpose and we think lucene is a fantastic software and we would like to learn it in details. Jason On Mon, Feb 17, 2014 at 10:31 PM, Jack Krupansky

background merge hit exception

2014-04-03 Thread Jason Wee
eScheduler$MergeThread.run(ConcurrentMergeScheduler.java:482) We do not know what is wrong as our understanding of Lucene is limited. Can someone explain what is happening, or what the possible source of the error might be? Thank you and any advice is appreciated. /Jason

Re: background merge hit exception

2014-04-08 Thread Jason Wee
terConfig(Version.LUCENE_46, > analyzer); yes, we were and still referencing lucene_46 in our analyzer. /Jason On Sat, Apr 5, 2014 at 9:01 PM, Jose Carlos Canova < jose.carlos.can...@gmail.com> wrote: > Seems that you want to force a max number of segments to 1, > On a previous threa

Re: background merge hit exception

2014-04-09 Thread Jason Wee
time with large merge segments, that is 50. if (writer != null && forceMerge) { writer.forceMerge(50); writer.commit(); } With these changes, the exceptions reported initially are no longer happening. Thank you again. Jason On Tue, Apr 8, 2014 at 8:50 PM, Jose Carlo

make data search as index progress.

2014-04-14 Thread Jason Wee
needed to pass in IndexWriter and DirectoryReader to make it searchable. Thanks and appreciate any advice. /Jason

Re: make data search as index progress.

2014-04-15 Thread Jason Wee
dex speed get very very slow (like 10-20doc per second) unfortunately and at times, after index on N files, it just stalled forever, am not sure what went wrong. /Jason On Mon, Apr 14, 2014 at 9:01 PM, Jose Carlos Canova < jose.carlos.can...@gmail.com> wrote: > Hello, >

Re: make data search as index progress.

2014-05-02 Thread Jason Wee
different settings for the index writer config and merge policy. Thanks for the lengthy information; we have also made our code reachable via github.com /Jason On Wed, Apr 16, 2014 at 10:55 AM, Jose Carlos Canova < jose.carlos.can...@gmail.com> wrote: > No, the index remains, you c

Lucene Indexing performance issue

2014-10-22 Thread Jason Wu
pplication. Can you give me some suggestions about my issue? Thank you, Jason

Re: Making lucene indexing multi threaded

2014-10-27 Thread Jason Wu
Hi Nischal, I had a similar indexing issue. My Lucene indexing took 22 mins for 70 MB of docs. When I debugged the problem, I found that indexWriter.addDocument(doc) was taking a really long time. Have you already found a solution for it? Thank you, Jason -- View this message in context: http

RE: Making lucene indexing multi threaded

2014-10-27 Thread Jason Wu
n and take 22 mins. Did you have any similar experience like the above before? Thank you, Jason -- View this message in context: http://lucene.472066.n3.nabble.com/Making-lucene-indexing-multi-threaded-tp4087830p4166116.html Sent from the Lucene - Java Users mailing list archive at Nabbl

Re: Making lucene indexing multi threaded

2014-10-27 Thread Jason Wu
DB result set - When I loop the result set, I reuse the same Document instance. - At the end of each loop, I call indexWriter.addDocument(doc) 4. After all docs are added, call IndexWriter.commit() 5. IndexWriter.close(); Thank you, Jason -- View this message in context

Re: Java8 and lucene version

2015-05-06 Thread Jason Wee
) immediately. hth jason On Thu, May 7, 2015 at 4:19 AM, Pushyami Gundala wrote: > Hi, We are using lucene 2.9.4 version for our application that has search. > We are planning on upgrading our application to run on java 8. My Question > is when we move to java 8 does the lucene-2.9.

Re: Global ordinal based query time join documentation

2015-06-06 Thread Jason Wee
https://svn.apache.org/viewvc/lucene/dev/branches/branch_5x/lucene/join/src/test/org/apache/lucene/search/join/TestJoinUtil.java?view=markup&pathrev=1671777 https://svn.apache.org/viewvc?view=revision&revision=1671777 https://issues.apache.org/jira/browse/LUCENE-6352 hth jason On Fr

Re: Request for help with Lucene search engine

2015-06-26 Thread Jason Wee
maybe start with this? https://svn.apache.org/repos/asf/lucene/dev/trunk/lucene/core/src/test/org/apache/lucene/search/TestDocValuesScoring.java hth jason On Fri, Jun 26, 2015 at 7:40 PM, Rim REKIK wrote: > Dear, > I m trying Lucene to work with Lucene search engine. But I m asking if

Lucene IndexSearcher PrefixQuery search getting really slow after a while

2016-11-03 Thread Jason Wu
Hi Team, We have been using Lucene 4.8.1 to do some info searches every day for years. However, recently we have encountered some performance issues which greatly slow down the Lucene search. After the application has been running for a while, we face the issues below, in which IndexSearcher PrefixQuery is taking much lon

Re: term frequency

2016-11-24 Thread Jason Wee
The exception line does not match the code you pasted, but do make sure your object is actually not null before accessing its method. On Thu, Nov 24, 2016 at 5:42 PM, huda barakat wrote: > I'm using SOLRJ to find term frequency for each term in a field, I wrote > this code but it is not working: > >

Re: [VOTE] Lucene logo contest

2020-06-16 Thread Jason Gerlowski
Option "A" On Tue, Jun 16, 2020 at 8:37 PM Man with No Name wrote: > > A, clean and modern. > > On Mon, Jun 15, 2020 at 6:08 PM Ryan Ernst wrote: >> >> Dear Lucene and Solr developers! >> >> In February a contest was started to design a new logo for Lucene [1]. That >> contest concluded, and I

Re: [VOTE] Lucene logo contest, third time's a charm

2020-09-02 Thread Jason Gerlowski
A1, A2, D (binding) On Wed, Sep 2, 2020 at 10:47 AM Michael McCandless wrote: > > A2, A1, C5, D (binding) > > Thank you to everyone for working so hard to make such cool looking possible > future Lucene logos! And to Ryan for the challenging job of calling this > VOTE :) > > Mike McCandless >

[ANNOUNCE] Apache Lucene 8.6.3 released

2020-10-08 Thread Jason Gerlowski
The Lucene PMC is pleased to announce the release of Apache Lucene 8.6.3. Apache Lucene is a high-performance, full-featured text search engine library written entirely in Java. It is a technology suitable for nearly any application that requires full-text search, especially cross-platform. This

Re: Replicating Lucene Index with out SOLR

2008-08-28 Thread Jason Rutherglen
. This way each id generated can be traced back to a server but still increments. This is helpful with conflict resolution. Right now I am writing code to use this id for the Ocean conflict resolution. Cheers, Jason On Thu, Aug 28, 2008 at 12:57 PM, Otis Gospodnetic <[EMAIL PROTECTED]> wro

Realtime Search for Social Networks Collaboration

2008-09-03 Thread Jason Rutherglen
mplementation is perfect and there is a lot that can be improved on. It might be helpful to figure out together what helpful things can be added. If this sounds like something of interest to anyone feel free to send your input. Take care, Jason

Re: Realtime Search for Social Networks Collaboration

2008-09-03 Thread Jason Rutherglen
for social networks interested in realtime search to get involved as it may be something that is difficult for one company to have enough resources to implement to a production level. I think this is where open source collaboration is particularly useful. Cheers, Jason Rutherglen [EMAIL PROTECTED] On W

Re: Realtime Search for Social Networks Collaboration

2008-09-04 Thread Jason Rutherglen
interested in trying it out. Take care, Jason On Thu, Sep 4, 2008 at 9:08 AM, Cam Bazz <[EMAIL PROTECTED]> wrote: > Hello Jason, > I have been trying to do this for a long time on my own. keep up the good > work. > > What I tried was a document cache using apache coll

Re: How can we know if 2 lucene indexes are same?

2008-09-05 Thread Jason Rutherglen
In Ocean I had to use a transaction log and execute everything that way like SQL database replication. Then let each node handle its own merging process. Syncing the indexes is used to get a new node up to speed, otherwise it's avoided for the reasons mentioned in the previous email. On Fri, Se

Re: Incremental Indexing.

2008-09-08 Thread Jason Rutherglen
lized things to allow updating of parts of the inverted index. If you're interested in working on it, feel free to let me know. Cheers, Jason 2008/9/8 장용석 <[EMAIL PROTECTED]>: > Hi~. > I hava a question about lucene incremental indexing. > > I want to do incremental indexing my
