RE: # of daily/weekly/monthly Solr downloads?

2014-12-09 Thread Alexey Kozhemiakin
Hi, according to slide #3, it's 250,000+ monthly downloads. http://www.slideshare.net/anshumg/ease-of-use-in-apache-solr -Original Message- From: Otis Gospodnetic [mailto:otis.gospodne...@gmail.com] Sent: Wednesday, December 10, 2014 01:25 To: solr-user@lucene.apache.org Subject: # of da

RE: SegmentInfos exposed to /admin/luke

2014-12-03 Thread Alexey Kozhemiakin
our proposal anyway. Personal: http://www.outerthoughts.com/ and @arafalov Solr resources and newsletter: http://www.solr-start.com/ and @solrstart Solr popularizers community: https://www.linkedin.com/groups?gid=6713853 On 3 December 2014 at 06:35, Alexey Kozhemiakin wrote: > Dear All,

SegmentInfos exposed to /admin/luke

2014-12-03 Thread Alexey Kozhemiakin
ppy to push the changes to Solr afterwards. Thank you, Alexey Kozhemiakin

RE: Empty documents in Solr\lucene 3.6

2014-04-15 Thread Alexey Kozhemiakin
The system was up and running for a long time (months) without any updates. There were no crashes for sure, at least the support team says so. Logs indicate that at some point there was not enough disk space (caused by weekend index optimization). Were there any other similar cases, or is it unique to us

Empty documents in Solr\lucene 3.6

2014-04-15 Thread Alexey Kozhemiakin
Dear Community, We've faced a strange data corruption issue with one of our client's old Solr setups (3.6). When we do a query (id:X OR id:Y) we get 2 nodes: one contains normal doc data, the other is empty (). We've looked inside the Lucene index using Luke and it's the same story, one of the documents is empty. Wh

RE: Grouping performance improvement

2014-02-20 Thread Alexey Kozhemiakin
Consider faceting on the category field instead of grouping. It will be faster, and categorization can be done against multiple category fields. Try different facet methods. If you don't need the number of documents in each category and the number of unique categories is relatively low, you mig
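
A minimal sketch of the suggestion above, assuming a plain HTTP query against a Solr select handler. The field names `category` and `brand`, the helper name `build_facet_params`, and the choice of `rows=0` (return facet counts only) are illustrative assumptions, not taken from the thread; `facet`, `facet.field`, `facet.method`, and `facet.mincount` are standard Solr faceting parameters.

```python
from urllib.parse import urlencode


def build_facet_params(category_fields, query="*:*", method="enum"):
    """Build Solr request parameters that facet on category fields.

    Hypothetical helper for illustration: it returns a list of
    (name, value) pairs so that facet.field can be repeated.
    """
    params = [
        ("q", query),
        ("rows", "0"),              # assumption: only facet counts are needed
        ("facet", "true"),
        ("facet.method", method),   # try "enum" or "fc" and compare timings
        ("facet.mincount", "1"),    # skip categories with no matching documents
    ]
    # One facet.field entry per category field (field names are assumptions).
    params.extend(("facet.field", field) for field in category_fields)
    return params


params = build_facet_params(["category", "brand"])
query_string = urlencode(params)
print(query_string)
```

The resulting query string can be appended to a select URL of your own; switching the `method` argument makes it easy to compare facet methods on the same query.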

RE: Facet optimization for facet.method=enum and "exists" case

2014-02-13 Thread Alexey Kozhemiakin
not > ready for commit". Also, including comments in the code like > //nocommit will cause it to fail the "ant precommit" step. This is > often useful to get other eyeballs on the code early. > > But it's up to you. > > Best, > Erick > > > On M

Facet optimization for facet.method=enum and "exists" case

2014-02-10 Thread Alexey Kozhemiakin
Dear All, Background: We have a dataset containing hundreds of millions of records. We facet by dozens of fields with many facet excludes, and the fields have a relatively small number of unique values, on the order of thousands. Before executing a search, our users work with "advanced search" and the goal is t

RE: Lowering query time

2014-02-04 Thread Alexey Kozhemiakin
Btw, "timing" for distributed requests is broken at the moment: it doesn't combine values from requests to shards. I'm working on a patch. https://issues.apache.org/jira/browse/SOLR-3644 -Original Message- From: Jack Krupansky [mailto:j...@basetechnology.com] Sent: Tuesday, February

RE: Internal shard communication - performance?

2013-08-11 Thread Alexey Kozhemiakin
Hi Tim, Torsten, Please review the following threads, which cover chatty shard-shard and shard-replica conversations. Since you index large volumes of data, this can be a potential bottleneck in your case. http://lucene.472066.n3.nabble.com/Sharding-and-Replication-td4071614.html http://lucene.4

RE: Sharding and Replication

2013-08-09 Thread Alexey Kozhemiakin
+1 I'd like to vote for this issue https://issues.apache.org/jira/browse/SOLR-4956 It would be useful to have these parameters configurable. When we index hundreds of millions of documents to a 4-shard SolrCloud in batches of 20K, the overhead of this chatty conversation with replicas and other sh

RE: Document Similarity Algorithm at Solr/Lucene

2013-08-05 Thread Alexey Kozhemiakin
We considered the MLT component to implement a sort of "near exact duplicate detection", which is probably very similar to your task. http://wiki.apache.org/solr/MoreLikeThis You may think of MoreLikeThis as a two-phase process (transform a document to a query and run it): 1a) it tokeniz

SolrCloud requires fixed ip address?

2013-08-05 Thread Alexey Kozhemiakin
Dear All, Our SolrCloud cluster (4 nodes, 4 shards, Embedded Zookeeper) failed to start after the VMs were restarted after the weekend. We shut down the 4 VMs in our private cloud for the weekend and started Solr in the same order as they were initialized: first the Zookeeper-hosting node and then the 3 other nodes. Unfor

SolrCloud RemoteSolrException: We are not the leader

2013-08-05 Thread Alexey Kozhemiakin
Dear All, We are facing a strange issue with SolrCloud (4.4 with Embedded Zookeeper). The cluster consists of 2 shards and 4 nodes. The 4th node cannot be added to the cluster and stays in the "recovering" state with the following error in the logs. Picture from the admin cloud interface http://imageshack.us/photo/my-images

RE: Delete all documents in the index

2012-09-06 Thread Alexey Kozhemiakin
One more thanks for posting this! I struggled with the same issue yesterday and solved it with the _version_ hint from the mailing list. Alex. -Original Message- From: Mark Mandel [mailto:mark.man...@gmail.com] Sent: Thursday, September 06, 2012 1:53 AM To: solr-user@lucene.apache.org Subject