Re: An incorrect sentence in Javadoc at o.a.l.queryparser.surround.parser?

2014-12-04 Thread Shinichiro Abe
Thank you for the reply! I've fixed the solr wiki page in advance. Shinichiro Abe On 2014/12/04, at 22:36, Mike Drob wrote: > I believe this is already filed as > https://issues.apache.org/jira/browse/SOLR-4572 > > Getting the wiki page fixed would be great as well, though! > > On Wed, Dec 3

Re: How best to compare tow sentences

2014-12-04 Thread Chris Hostetter
: For a number of years I've been doing this by creating a : RAMDirectory, creating a document for one of the sentences and then doing a : search using the other sentence and seeing if we get a good match. This has : worked reasonably well but since improving the performance of other

Re: Index replication strategy

2014-12-04 Thread Vijay B
Looks very promising. I will check out your blog. On Thu, Dec 4, 2014 at 9:38 AM, Shai Erera wrote: > Ooops, didn't notice that :). > > So you'll need to upgrade to Lucene 4.4.0 in order to use it. You can read > some details as well as example code here: > http://shaierera.blogspot.com/2013/05/

Compiling and running Lucene/Solr based on github does not seem to work

2014-12-04 Thread Michael Wechner
Hi I have cloned the github version of Lucene/Solr yesterday https://github.com/apache/lucene-solr and was running ant compile ant test successfully. Also Jetty seems to startup fine, but when I access http://localhost:8983/solr/ then I receive HTTP ERROR: 503 Problem accessing /solr

Re: Index replication strategy

2014-12-04 Thread Michael Sokolov
There are also Solr replication options - older snapshot-style replication, and newer Solr Cloud, but if you are not using solr now, you will incur some transitional costs since you would need to alter your indexing and possibly querying code to use it -Mike On 12/04/2014 09:38 AM, Shai Erera

Re: Index replication strategy

2014-12-04 Thread Shai Erera
Ooops, didn't notice that :). So you'll need to upgrade to Lucene 4.4.0 in order to use it. You can read some details as well as example code here: http://shaierera.blogspot.com/2013/05/the-replicator.html. Shai On Thu, Dec 4, 2014 at 4:36 PM, Vijay B wrote: > As indicated in my post, we use L

Re: Index replication strategy

2014-12-04 Thread Vijay B
As indicated in my post, we use Lucene 4.2.1. On Thu, Dec 4, 2014 at 9:29 AM, Shai Erera wrote: > Do you use Lucene or Solr? Lucene also has a replication module, which will > allow you to replicate index changes. > > On Thu, Dec 4, 2014 at 4:19 PM, Vijay B wrote: > > > Hello, > > > > We index

Re: Index replication strategy

2014-12-04 Thread Shai Erera
Do you use Lucene or Solr? Lucene also has a replication module, which will allow you to replicate index changes. On Thu, Dec 4, 2014 at 4:19 PM, Vijay B wrote: > Hello, > > We index docs coming from database nightly. Current index is sitting on > NFS. Due to obvious performance reasons, we are

RE: How best to compare tow sentences

2014-12-04 Thread Oliver Christ
Conceptually this use case is similar to what translation memories do. For an open-source TM engine, have a look at http://okapi.opentag.com/, and its default TM engine (Pensieve TM). Cheers, Oli -Original Message- From: Barry Coughlan [mailto:b.coughl...@gmail.com] Sent: Wednesday,

Index replication strategy

2014-12-04 Thread Vijay B
Hello, We index docs coming from database nightly. Current index is sitting on NFS. Due to obvious performance reasons, we are planning to switch to local index. We have cluster of 4 servers and with NFS it was not a problem for us until now to share the index. But going forward, we a

Re: An incorrect sentence in Javadoc at o.a.l.queryparser.surround.parser?

2014-12-04 Thread Mike Drob
I believe this is already filed as https://issues.apache.org/jira/browse/SOLR-4572 Getting the wiki page fixed would be great as well, though! On Wed, Dec 3, 2014 at 7:44 PM, Shinichiro Abe wrote: > Hi, > > That Javadoc says "N is ordered, and W is unordered." > > https://github.com/apache/luce

Re: How best to compare tow sentences

2014-12-04 Thread Barry Coughlan
There are various implementations of Damerau-Levenshtein online. I don't know how much it will improve your results however. Why are you not indexing all of the strings? If you don't have to compute all possible pairs, then you are better off without Lucene. Note that the cosine similarity calcul
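
For readers who have not met the metric mentioned above, here is a minimal sketch of the restricted form of Damerau-Levenshtein distance (often called optimal string alignment). It is an illustration only, not code from the thread; the function name `osa_distance` is hypothetical, and the unrestricted variant would give smaller values for some inputs.

```python
def osa_distance(a: str, b: str) -> int:
    """Optimal string alignment distance (restricted Damerau-Levenshtein).

    Counts insertions, deletions, substitutions, and transpositions of
    two adjacent characters, each at a cost of 1.
    """
    rows, cols = len(a) + 1, len(b) + 1
    # d[i][j] holds the distance between a[:i] and b[:j].
    d = [[0] * cols for _ in range(rows)]
    for i in range(rows):
        d[i][0] = i  # delete all i characters of a
    for j in range(cols):
        d[0][j] = j  # insert all j characters of b

    for i in range(1, rows):
        for j in range(1, cols):
            cost = 0 if a[i - 1] == b[j - 1] else 1
            d[i][j] = min(
                d[i - 1][j] + 1,         # deletion
                d[i][j - 1] + 1,         # insertion
                d[i - 1][j - 1] + cost,  # substitution or match
            )
            # Adjacent transposition, e.g. "ow" <-> "wo".
            if (i > 1 and j > 1
                    and a[i - 1] == b[j - 2]
                    and a[i - 2] == b[j - 1]):
                d[i][j] = min(d[i][j], d[i - 2][j - 2] + 1)
    return d[rows - 1][cols - 1]
```

With this definition a swapped pair of letters such as "tow" versus "two" costs one edit, where plain Levenshtein would count two.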

Re: How best to compare tow sentences

2014-12-04 Thread parnab kumar
Hi, If you are comparing two song titles which are usually very short you are better off using a custom set of several features rather than using one of cosine or Levenshtein or Jaccard. You may use the combination of the following: 1. cosine sim score 2. Jaccard overlap coeff 3. how many words in th
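
The first two features in the list above could be sketched as follows. This is an illustration under stated assumptions, not the poster's code: lowercase whitespace tokenization, raw term counts for the cosine vectors, and an equal-weight combination are all assumptions, and the function names are hypothetical. The third feature is cut off in the archive snippet and is left out.

```python
import math
from collections import Counter


def tokenize(text: str) -> list[str]:
    # Assumption: lowercase and split on whitespace; real titles may
    # need punctuation stripping or stemming.
    return text.lower().split()


def cosine_similarity(a: str, b: str) -> float:
    """Cosine of the angle between raw term-count vectors."""
    va, vb = Counter(tokenize(a)), Counter(tokenize(b))
    dot = sum(va[t] * vb[t] for t in va.keys() & vb.keys())
    norm = (math.sqrt(sum(c * c for c in va.values()))
            * math.sqrt(sum(c * c for c in vb.values())))
    return dot / norm if norm else 0.0


def jaccard_overlap(a: str, b: str) -> float:
    """Size of the shared vocabulary divided by the combined vocabulary."""
    sa, sb = set(tokenize(a)), set(tokenize(b))
    union = sa | sb
    return len(sa & sb) / len(union) if union else 0.0


def combined_score(a: str, b: str,
                   w_cos: float = 0.5, w_jac: float = 0.5) -> float:
    # Assumption: a simple weighted sum; the weights are placeholders
    # that would need tuning on real data.
    return w_cos * cosine_similarity(a, b) + w_jac * jaccard_overlap(a, b)
```

Further features can be added to `combined_score` as extra weighted terms, which makes it easy to see how much each signal contributes for very short strings.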

An incorrect sentence in Javadoc at o.a.l.queryparser.surround.parser?

2014-12-04 Thread Shinichiro Abe
Hi, That Javadoc says "N is ordered, and W is unordered." https://github.com/apache/lucene-solr/blob/trunk/lucene/queryparser/src/java/org/apache/lucene/queryparser/surround/parser/QueryParser.java#L39 "W is ordered, and N is unordered." I think this is correct because WQuery() returns ordered S