Re: Could not load collection from ZK:

2018-09-20 Thread Matteo Grolla
Hi everybody, I'm facing the same problem on solr 7.3. Requesting a longer session timeout from zk (the default 10s seems too short) will probably solve the problem, but I'm puzzled by the fact that solrj reports this error as a SolrException with status code 400 (BAD_REQUEST). in ZkStateReader

analyzing infix suggester building in near real time LUCENE-5477

2018-05-21 Thread Matteo Grolla
Hi everyone, I'm evaluating suggesters that can be built in near real time and I came across https://issues.apache.org/jira/browse/LUCENE-5477. Is there a way to use this functionality from solr? Thanks very much Matteo Grolla

Uninverting stats on solr 5 and beyond

2018-04-09 Thread Matteo Grolla
Hi, on solr 4 the log contained information about the time spent and memory consumed uninverting a field. Where can I find this information in the current version of solr? Thanks --excerpt from solr 4.10 log-- INFO - 2018-04-09 15:57:58.720; org.apache.solr.request.UnInvertedField; UnInverted mul

Re: size-estimator-lucene-solr.xls error in disk space estimator

2017-04-27 Thread Matteo Grolla
Right Alessandro that's another bug Cheers 2017-04-27 12:30 GMT+02:00 alessandro.benedetti : > +1 > I would add that what is called : "Avg. Document Size (KB)" seems more to > me > "Avg. Field Size (KB)". > Cheers > > > > - > --- > Alessandro Benedetti > Search Consultant, R&D Sof

size-estimator-lucene-solr.xls error in disk space estimator

2017-04-27 Thread Matteo Grolla
It seems to me that the estimate in MB is in fact an estimate in GB: the formula includes the avg doc size, which is in KB, so the result is in KB and should be divided by 1024 to obtain the result in MB. But it's divided by 1024*1024
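The unit argument above can be checked with a few lines of arithmetic. This is only a sketch: the real spreadsheet formula is not reproduced here, so `estimate_kb` is a hypothetical stand-in that multiplies a document count by an average size in KB, and the input numbers are made up.

```python
KB_PER_MB = 1024
KB_PER_GB = 1024 * 1024


def estimate_kb(num_docs, avg_doc_size_kb):
    """Hypothetical stand-in for the spreadsheet formula: the result is in KB."""
    return num_docs * avg_doc_size_kb


def kb_to_mb(kb):
    # KB -> MB needs a single division by 1024.
    return kb / KB_PER_MB


def kb_to_gb(kb):
    # Dividing by 1024*1024 yields GB, not MB.
    return kb / KB_PER_GB


total_kb = estimate_kb(1_000_000, 2)  # made-up inputs: 1M docs of 2 KB each
print("MB:", kb_to_mb(total_kb))
print("GB:", kb_to_gb(total_kb))
```

If the spreadsheet labels the second figure as MB, the reported size is understated by a factor of 1024.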

Re: matchAllDocsQuery instead of WildCardQuery from lucene qp with df and *

2016-07-28 Thread Matteo Grolla
Hi Alessandro, your shot in the dark was interesting, but the behaviour doesn't depend on the field being mandatory; it works like this for every field. So it just seems wrong: df=field&q=* should be translated as field:*, not as *:* 2016-07-28 10:32 GMT+02:00 Matteo Grolla : >

Re: matchAllDocsQuery instead of WildCardQuery from lucene qp with df and *

2016-07-28 Thread Matteo Grolla
(field); // *:* -> MatchAllDocsQuery if ("*".equals(termStr)) { if ("*".equals(field) || getExplicitField() == null) { return newMatchAllDocsQuery(); } } 2016-07-28 9:40 GMT+02:00 Matteo Grolla : > I noticed the behaviour in solr 4.10 and 5.4.1 >
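A toy Python model of the check quoted in this message may make the reported behaviour easier to follow. It is a sketch, not the parser's code: `wildcard_query` and its string return values are invented for illustration, and `explicit_field` stands in for whatever `getExplicitField()` returns in the Java snippet.

```python
def wildcard_query(field, term, explicit_field):
    """Model of the quoted check; returns a query string instead of a Query object.

    field          -- the field the parser resolved (e.g. the df default)
    term           -- the term text, e.g. "*"
    explicit_field -- the field written in the query itself, or None
    """
    if term == "*" and (field == "*" or explicit_field is None):
        return "*:*"  # corresponds to newMatchAllDocsQuery()
    return f"{field}:{term}"


# df=id&q=*  -> no explicit field in the query text, so the model matches all docs
print(wildcard_query("id", "*", None))
# q=id:*     -> the field is explicit, so the model keeps the wildcard on the field
print(wildcard_query("id", "*", "id"))
```

Under this model, a default field supplied through df never counts as an explicit field, which is consistent with df=id&q=* being reported as MatchAllDocsQuery(*:*).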

Re: matchAllDocsQuery instead of WildCardQuery from lucene qp with df and *

2016-07-28 Thread Matteo Grolla
I noticed the behaviour in solr 4.10 and 5.4.1 2016-07-28 9:36 GMT+02:00 Matteo Grolla : > Hi, > I'm surprised by lucene query parser translating this query > > http://localhost:8983/solr/collection1/select?df=id&q=* > > in > > MatchAllDocsQuery(*:*) >

matchAllDocsQuery instead of WildCardQuery from lucene qp with df and *

2016-07-28 Thread Matteo Grolla
Hi, I'm surprised by the lucene query parser translating this query http://localhost:8983/solr/collection1/select?df=id&q=* into MatchAllDocsQuery(*:*). I was expecting it to execute "id:*". Is it a bug or the desired behaviour? If desired, can you explain why?

solr-8258

2016-07-07 Thread Matteo Grolla
Hi, the export handler returns 0 for null numeric values. Can someone explain to me why it doesn't leave the field off the record, as it does for string or multivalue fields? thanks Matteo

export handler date fields

2016-07-07 Thread Matteo Grolla
Hi, is there a reason why the export handler doesn't support date fields? thanks Matteo Grolla

Re: problems with nested queries

2016-05-23 Thread Matteo Grolla
white cat" > > Can you open a JIRA for this? > > -Yonik > > > On Mon, May 16, 2016 at 10:23 AM, Matteo Grolla > wrote: > > Hi everyone, > > I have a problem with nested queries > > If the order is: > > 1) query > > 2) nested query (embedded in _q

problems with nested queries

2016-05-16 Thread Matteo Grolla
Hi everyone, I have a problem with nested queries. If the order is: 1) query 2) nested query (embedded in _query_:"...") everything works fine. If it is the opposite, like this http://localhost:8983/solr/test/select?q=_query_:%22{!lucene%20df=name_t}(\%22black%20dog\%22)%22%20OR%20name_t:%22whi

Re: query logging using query rest api

2016-05-02 Thread Matteo Grolla
ot use GET > HTTP method ( -XGET ) and pass parameters in POST (-d). > > Try to remove the -XGET parameter. > > On Thu, Apr 28, 2016 at 11:18 AM, Matteo Grolla > wrote: > > > Hi, > > I'm experimenting the query rest api with solr 5.4 and I'm noticing >

query logging using query rest api

2016-04-28 Thread Matteo Grolla
Hi, I'm experimenting with the query rest api on solr 5.4 and I'm noticing that query parameters are not logged in solr.log. Here are the query and the log line: curl -XGET 'localhost:8983/solr/test/query' -d '{"query":"*:*"}' 2016-04-28 09:16:54.008 INFO (qtp668849042-17) [ x:test] o.a.s.c.S.Request
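The reply in this thread suggests dropping -XGET so that the JSON body is sent with POST. A rough Python equivalent of that request is sketched below. The URL and body come from the curl command above; `build_query_request` is a name made up for this example. The request is only built, not sent, and whether the parameters then appear in solr.log is not something this sketch establishes.

```python
import json
import urllib.request


def build_query_request(url, query):
    """Build (but do not send) a JSON query request.

    urllib switches the method to POST automatically when a body is supplied.
    """
    body = json.dumps({"query": query}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
    )


req = build_query_request("http://localhost:8983/solr/test/query", "*:*")
print(req.get_method(), req.full_url)
# To actually send it against a running Solr: urllib.request.urlopen(req)
```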

Re: optimize requests that fetch 1000 rows

2016-02-12 Thread Matteo Grolla
bulk of qtime. > > -- Jack Krupansky > > On Thu, Feb 11, 2016 at 11:33 AM, Matteo Grolla > wrote: > > > virtual hardware, 200ms is taken on the client until response is written > to > > disk > > qtime on solr is ~90ms > > not great but acceptable >

Re: optimize requests that fetch 1000 rows

2016-02-11 Thread Matteo Grolla
-- Jack Krupansky > > On Thu, Feb 11, 2016 at 10:36 AM, Matteo Grolla > wrote: > > > Hi Jack, > > response time scale with rows. Relationship doens't seem linear but > > Below 400 rows times are much faster, > > I view query times from solr logs and

Re: optimize requests that fetch 1000 rows

2016-02-11 Thread Matteo Grolla
like? Is it complex or use wildcards or function > queries, or is it very simple keywords? How many operators? > > Have you used the debugQuery=true parameter to see which search components > are taking the time? > > -- Jack Krupansky > > On Thu, Feb 11, 2016 at 9:42 AM, Matt

Re: optimize requests that fetch 1000 rows

2016-02-11 Thread Matteo Grolla
Is this a scenario that was working fine and suddenly deteriorated, or has > it always been slow? > > -- Jack Krupansky > > On Thu, Feb 11, 2016 at 4:33 AM, Matteo Grolla > wrote: > > > Hi, > > I'm trying to optimize a solr application. > > The b

Re: optimize requests that fetch 1000 rows

2016-02-11 Thread Matteo Grolla
[image: Inline image 1] 2016-02-11 16:05 GMT+01:00 Matteo Grolla : > I see a lot of time spent in splitOnTokens > > which is called by (last part of stack trace) > > BinaryResponseWriter$Resolver.writeResultsBody() > ... > solr.search.Re

Re: optimize requests that fetch 1000 rows

2016-02-11 Thread Matteo Grolla
Matteo Grolla : > Hi Yonic, > after the first query I find 1000 docs in the document cache. > I'm using curl to send the request and requesting javabin format to mimic > the application. > gc activity is low > I managed to load the entire 50GB index in the filesystem cach

Re: optimize requests that fetch 1000 rows

2016-02-11 Thread Matteo Grolla
k activity anymore. Time improves: queries that took ~30s now take <10s. But I hoped for better. I'm going to use jvisualvm's sampler to analyze where time is spent 2016-02-11 15:25 GMT+01:00 Yonik Seeley : > On Thu, Feb 11, 2016 at 7:45 AM, Matteo Grolla > wrote: > > Thanks

Re: optimize requests that fetch 1000 rows

2016-02-11 Thread Matteo Grolla
; On Thu, 2016-02-11 at 11:53 +0100, Matteo Grolla wrote: > > I'm working with solr 4.0, sorting on score (default). > > I tried setting the document cache size to 2048, so all docs of a single > > request fit (2 requests fit actually) > > If I execute a qu

Re: optimize requests that fetch 1000 rows

2016-02-11 Thread Matteo Grolla
nd it takes 15s; execute it with rows = 400 and it takes 3s. It seems that below rows = 400 times are acceptable, beyond that they get slow 2016-02-11 11:27 GMT+01:00 Upayavira : > > > On Thu, Feb 11, 2016, at 09:33 AM, Matteo Grolla wrote: > > Hi, > > I'm trying to optim

optimize requests that fetch 1000 rows

2016-02-11 Thread Matteo Grolla
Hi, I'm trying to optimize a solr application. The bottleneck is queries that request 1000 rows from solr. Unfortunately the application can't be modified at the moment; can you suggest what could be done on the solr side to increase the performance? The bottleneck is just on fetching the re

Re: realtime get requirements

2016-01-12 Thread Matteo Grolla
ok, suggester was responsible for the long time to load. Thanks 2016-01-12 15:47 GMT+01:00 Matteo Grolla : > Thanks Shawn, > On a production solr instance some cores take a long time to load > while other of similar size take much less. One of the differences between > th

Re: realtime get requirements

2016-01-12 Thread Matteo Grolla
Thanks Shawn, On a production solr instance some cores take a long time to load while others of similar size take much less. One of the differences between these cores is the directoryFactory. 2016-01-12 15:34 GMT+01:00 Shawn Heisey : > On 1/12/2016 2:50 AM, Matteo Grolla wrote: > >

realtime get requirements

2016-01-12 Thread Matteo Grolla
Hi, can you confirm that realtime get requirements are just: true json true ${solr.ulog.dir:}

Re: enable disable filter query caching based on statistics

2016-01-05 Thread Matteo Grolla
that some of your clauses are very restrictive, I > wonder what happens if > you add a cost in. fq's are evaluated in cost order (when > cache=false), so what happens > in this case? > &fq={!cache=false cost=101}n_rea:xxx&fq={!cache=false > cost=102}provincia:&f

Re: enable disable filter query caching based on statistics

2016-01-05 Thread Matteo Grolla
gt;> > a) Use the LeastFrequentlyUsed or LFU eviction policy. > >> > b) Set the size to whatever number of fqs you find suitable. > >> > You can do this like so: > >> > >> > autoWarmCount="10"/> > >> > You shoul

Re: enable disable filter query caching based on statistics

2016-01-05 Thread Matteo Grolla
//yonik.com/advanced-filter-caching-in-solr/ > > > On Tue, Jan 5, 2016 at 7:28 PM Matteo Grolla > wrote: > > > Hi, > > after looking at the presentation of cloudsearch from lucene > revolution > > 2014 > > > > > https://www.youtube.com/wa

Re: SOLR replicas performance

2016-01-05 Thread Matteo Grolla
Hi Luca, not sure if I understood well. Your question is "Why are index times on a solr cloud collection with 2 replicas higher than on solr cloud with 1 replica", right? Well, with 2 replicas all docs have to be separately indexed in 2 places and solr has to confirm that both indexing went well

enable disable filter query caching based on statistics

2016-01-05 Thread Matteo Grolla
Hi, after looking at the presentation of cloudsearch from lucene revolution 2014 https://www.youtube.com/watch?v=RI1x0d-yO8A&list=PLU6n9Voqu_1FM8nmVwiWWDRtsEjlPqhgP&index=49 (min 17:08) I realized I'd love to be able to remove the burden of disabling filter query caching from developers the p

Re: add and then delete same document before commit,

2015-11-18 Thread Matteo Grolla
ommit happened between the original > insert and the delete? Just askin'... > > Best, > Erick > > On Wed, Nov 18, 2015 at 8:21 AM, Matteo Grolla > wrote: > > Thanks Shawn, > >I'm aware that solr isn't transactional and I don't need this > pro

Re: add and then delete same document before commit,

2015-11-18 Thread Matteo Grolla
maintained by successive solr version. 2015-11-18 16:51 GMT+01:00 Shawn Heisey : > On 11/18/2015 8:21 AM, Matteo Grolla wrote: > > On Solr 4.10.3 I'm noting a different (desired) behaviour > > > > 1) add document x > > 2) delete document x > > 3) commit > &

Re: add and then delete same document before commit,

2015-11-18 Thread Matteo Grolla
On Solr 4.10.3 I'm noting a different (desired) behaviour: 1) add document x 2) delete document x 3) commit. Document x doesn't get indexed. The question now is: Can I count on this behaviour or is it just incidental? 2014-11-05 22:21 GMT+01:00 Matteo Grolla : > Perfectly clear, >

Re: restore quorum after majority of zk nodes down

2015-10-30 Thread Matteo Grolla
eper nodes > down. > > -- Pushkar Raste > On Oct 29, 2015 4:33 PM, "Matteo Grolla" wrote: > > > Hi Walter, > > it's not a problem to take down zk for a short (1h) time and > > reconfigure it. Meanwhile solr would go in readonly mode. > &g

Re: restore quorum after majority of zk nodes down

2015-10-29 Thread Matteo Grolla
d.org/ (my blog) > > > > On Oct 29, 2015, at 10:08 AM, Matteo Grolla > wrote: > > > > I'm designing a solr cloud installation where nodes from a single cluster > > are distributed on 2 datacenters which are close and very well connected. > > let's

restore quorum after majority of zk nodes down

2015-10-29 Thread Matteo Grolla
I'm designing a solr cloud installation where nodes from a single cluster are distributed on 2 datacenters which are close and very well connected. Let's say that zk nodes zk1, zk2 are on DC1 and zk3 is on DC2, and let's say that DC1 goes down and the cluster is left with zk3. How can I restore a zk
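The scenario in this message comes down to majority arithmetic: a ZooKeeper ensemble needs more than half of its configured servers to form a quorum. The sketch below works through the three-node case described above; the function names are made up for the example.

```python
def quorum_size(ensemble_size):
    """Smallest number of servers that is a strict majority of the ensemble."""
    return ensemble_size // 2 + 1


def has_quorum(ensemble_size, live_servers):
    return live_servers >= quorum_size(ensemble_size)


# zk1 and zk2 in DC1, zk3 in DC2: a 3-node ensemble.
ENSEMBLE = 3
print("needed for quorum:", quorum_size(ENSEMBLE))
print("DC2 down (zk1, zk2 alive):", has_quorum(ENSEMBLE, 2))
print("DC1 down (only zk3 alive):", has_quorum(ENSEMBLE, 1))
```

With two datacenters, whichever one hosts the minority of the ensemble cannot keep a quorum on its own when the other goes down, however the nodes are split.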

Re: simple test on solr 5.2.1 wrong leader elected on startup

2015-10-15 Thread Matteo Grolla
2 and using those as solrhome for the nodes created the collection with bin/solr create -c test so it's using the builtin schemaless configuration there's nothing custom, should be all pretty standard 2015-10-15 17:42 GMT+02:00 Alessandro Benedetti : > Hi Matteo, > > On 15 Octob

simple test on solr 5.2.1 wrong leader elected on startup

2015-10-15 Thread Matteo Grolla
Hi, I'm doing this test: collection test is replicated on two solr nodes running on 8983, 8984 using external zk 1) turn on solr 8984 2) add, commit a doc x on solr 8983 3) turn off solr 8983 4) turn on solr 8984 5) shortly after (leader still not elected) turn on solr 8983 6) 8984 is elected as le

Re: error reporting during indexing

2015-09-29 Thread Matteo Grolla
t > at a time when the batch has errors and rely on Solr overwriting > any docs in the batch that were indexed the first time. > > Best, > Erick > > On Mon, Sep 28, 2015 at 2:27 PM, Matteo Grolla > wrote: > > Hi, > > if I need fine grained error reporting I use Ht
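The approach suggested in this reply (send batches, and only fall back to one document at a time when a batch fails, relying on Solr overwriting documents that were already indexed) could look roughly like the sketch below. It is not SolrJ code: `send` is a hypothetical callable that indexes a list of documents and raises on failure, and `fake_send` only simulates a server for the demo.

```python
def index_with_fallback(docs, send, batch_size=100):
    """Index docs in batches; on a batch error, retry that batch doc by doc.

    Returns a list of (doc, exception) pairs for the documents that failed.
    Re-sending docs from a partially indexed batch is assumed to be harmless
    because documents with the same id overwrite each other.
    """
    failures = []
    for start in range(0, len(docs), batch_size):
        batch = docs[start:start + batch_size]
        try:
            send(batch)
        except Exception:
            # The batch failed somewhere: find the culprits one at a time.
            for doc in batch:
                try:
                    send([doc])
                except Exception as exc:
                    failures.append((doc, exc))
    return failures


def fake_send(batch):
    """Simulated server: rejects any batch containing a doc flagged as bad."""
    for doc in batch:
        if doc.get("bad"):
            raise ValueError(f"cannot index doc {doc['id']}")


docs = [{"id": 1}, {"id": 2, "bad": True}, {"id": 3}, {"id": 4}]
failed = index_with_fallback(docs, fake_send, batch_size=2)
print([doc["id"] for doc, _ in failed])
```

In the common case where every batch succeeds, this costs one request per batch instead of one request per document.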

error reporting during indexing

2015-09-28 Thread Matteo Grolla
Hi, if I need fine grained error reporting I use HttpSolrServer and send 1 doc per request using the add method. I report errors on exceptions of the add method. I'm using autocommit so I'm not seeing errors related to commit. Am I losing some errors? Is there a better way? Thanks

splitshard on live node: performance impact

2015-07-06 Thread Matteo Grolla
Hi, what is the performance impact of issuing a splitshard on a live node used for searches?

Re: optimal shard assignment with low shard key cardinality using compositeId to enable shard splitting

2015-05-30 Thread Matteo Grolla
lso_ does is force all of the work for a query onto one >> node and all indexing for a particular producer ditto. And will cause you to >> manually monitor your shards to see if some of them grow out of proportion >> to others. And >> >> I think it would be much le

optimal shard assignment with low shard key cardinality using compositeId to enable shard splitting

2015-05-21 Thread Matteo Grolla
Hi, I'd like some feedback on how I'd like to solve the following sharding problem. I have a collection that will eventually become big. Average document size is 1.5kb. Every year 30 Million documents will be indexed. Data come from different document producers (a person, owner of his documents) an
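The two figures given in this message (1.5kb per document, 30 million documents per year) allow a quick back-of-the-envelope estimate of raw data growth. This is only arithmetic on the stated numbers. It says nothing about the on-disk index size, which depends on the schema and is not covered here.

```python
DOCS_PER_YEAR = 30_000_000  # from the message
AVG_DOC_KB = 1.5            # from the message


def raw_size_gb(years):
    """Raw document volume after `years`, in GB (1 GB = 1024 * 1024 KB)."""
    total_kb = DOCS_PER_YEAR * years * AVG_DOC_KB
    return total_kb / (1024 * 1024)


for years in (1, 5, 10):
    print(years, "year(s):", round(raw_size_gb(years), 1), "GB of raw documents")
```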

Re: please confirm: pseudo join queries can only be performed on fields of exactly the same type

2015-05-18 Thread Matteo Grolla
> When you used the keywordTokenizer, was there other analysis such as > lowercasing going on? > > -Yonik > > > On Mon, May 18, 2015 at 10:26 AM, Matteo Grolla > wrote: >> Hi, >>I tried performing a join query >>{!join from=fA to=fB}

please confirm: pseudo join queries can only be performed on fields of exactly the same type

2015-05-18 Thread Matteo Grolla
Hi, I tried performing a join query {!join from=fA to=fB} where fA was string and fB was text using keywordTokenizer. It doesn't work, but it does if the fields are both string or both text. If you confirm this is the correct behavior I'll upda

stats component performance

2015-04-27 Thread Matteo Grolla
Hi, is there any public benchmark or description of how the solr stats component works? Matteo

Re: scanning all documents in the collection

2015-02-02 Thread Matteo Grolla
Wow!!! thanks Joe! On 02 Feb 2015, at 15:05, Joseph Obernberger wrote: > I have a similar use-case. Check out the export capability and using > cursorMark. > > -Joe > > On 2/2/2015 8:14 AM, Matteo Grolla wrote: >> Hi, >> I'm t
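For readers unfamiliar with the cursorMark approach mentioned in this reply, the paging loop is sketched below. Solr's cursor paging starts with cursorMark=*, requires a sort that includes the uniqueKey field, and ends when the returned nextCursorMark equals the one that was sent. Everything else is an assumption for illustration: `fetch_page` is a hypothetical callable that runs one query and returns the parsed JSON response, the uniqueKey is assumed to be `id`, and `fake_fetch` only imitates a server in memory.

```python
def iter_all_docs(fetch_page, rows=500, sort="id asc"):
    """Yield every document by following cursorMark until it stops changing."""
    cursor = "*"
    while True:
        page = fetch_page(
            {"q": "*:*", "rows": rows, "sort": sort, "cursorMark": cursor}
        )
        yield from page["response"]["docs"]
        next_cursor = page["nextCursorMark"]
        if next_cursor == cursor:
            break  # same cursor returned: nothing left to read
        cursor = next_cursor


# In-memory stand-in for a Solr core, used only to exercise the loop.
DATA = [{"id": str(i)} for i in range(5)]


def fake_fetch(params):
    start = 0 if params["cursorMark"] == "*" else int(params["cursorMark"])
    docs = DATA[start:start + params["rows"]]
    next_cursor = str(start + len(docs)) if docs else params["cursorMark"]
    return {"response": {"docs": docs}, "nextCursorMark": next_cursor}


all_ids = [doc["id"] for doc in iter_all_docs(fake_fetch, rows=2)]
print(all_ids)
```

Real cursor marks are opaque strings; the integer offsets used by `fake_fetch` are just a convenience for the demo.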

scanning all documents in the collection

2015-02-02 Thread Matteo Grolla
Hi, I'm thinking about having an instance of solr (SolrA) with all fields stored and just id indexed in addition to a normal production instance of solr (SolrB) that is used for the searches. This would allow me to read only what changed from previous crawl, update SolrA and send the f

Re: solrcloud nodes registering as 127.0.1.1

2015-01-12 Thread Matteo Grolla
Solved! Ubuntu has an entry like this in /etc/hosts: 127.0.1.1. To properly run solrcloud one must substitute 127.0.1.1 with a real (possibly permanent) ip address. On 12 Jan 2015, at 12:47, Matteo Grolla wrote: > Hi, > hope someone can h

solrcloud nodes registering as 127.0.1.1

2015-01-12 Thread Matteo Grolla
Hi, hope someone can help me troubleshoot this issue. I'm trying to set up a solrcloud cluster with -zookeeper on 192.168.1.8 (osx mac) -solr1 on 192.168.1.10 (virtualized ubuntu running on mac) -solr2 on 192.168.1.3 (ubuntu on another pc) the problem is th

Re: add and then delete same document before commit,

2014-11-05 Thread Matteo Grolla
; document. > > -- Jack Krupansky > > -----Original Message- From: Matteo Grolla > Sent: Wednesday, November 5, 2014 4:47 AM > To: solr-user@lucene.apache.org > Subject: add and then delete same document before commit, > > Can anyone tell me the behavior of solr (and if i

add and then delete same document before commit,

2014-11-05 Thread Matteo Grolla
Can anyone tell me the behavior of solr (and if it's consistent) when I do what follows: 1) add document x 2) delete document x 3) commit. I've tried with solr 4.5.0 and document x gets indexed Matteo

Re: order of updates

2014-11-03 Thread Matteo Grolla
Thanks a lot, Yonik! On 03 Nov 2014, at 15:51, Yonik Seeley wrote: > On Mon, Nov 3, 2014 at 8:53 AM, Matteo Grolla wrote: >> HI, >>can anybody give me a confirm? >> If I add multiple document with the same id but differing on other fields

order of updates

2014-11-03 Thread Matteo Grolla
Hi, can anybody confirm this? If I add multiple documents with the same id but differing on other fields and then issue a commit (no commits before this) the last added document gets indexed, right? Assuming solr 4 and default settings for optimistic locking. Matteo

how to fully test a response writer

2014-07-23 Thread Matteo Grolla
Hi, I developed a new SolResponseWriter but I'm not happy with how I wrote tests. My problem is that I need to test it both with local requests and with distributed requests, since the solr response objects (input to the response writer) are different. a) I tested the local request case

Re: query(subquery, default) filters results

2014-05-15 Thread Matteo Grolla
Thanks very much, I realized too late that I skipped an important part of the wiki documentation: "this example assumes defType=func". Thanks a lot. On 06 May 2014, at 21:05, Yonik Seeley wrote: > On Tue, May 6, 2014 at 5:08 AM, Matteo Grolla

query(subquery, default) filters results

2014-05-06 Thread Matteo Grolla
Hi everybody, I'm having trouble with the function query "query(subquery, default)" http://wiki.apache.org/solr/FunctionQuery#query. Running this http://localhost:8983/solr/select?q=query($qq,1)&qq={!dismax qf=text}hard drive on collection1 gives me no results but

Re: interpretation of cat_rank in http://people.apache.org/~hossman/ac2012eu/

2014-05-06 Thread Matteo Grolla
Thanks a lot and thanks for pointing me at the video. I missed it. Matteo On 05 May 2014, at 20:44, Chris Hostetter wrote: > : Hi everybody > : can anyone give me a suitable interpretation for cat_rank in > : http://people.apache.org/~hossman/ac2012eu/ slide 15 > >

interpretation of cat_rank in http://people.apache.org/~hossman/ac2012eu/

2014-05-05 Thread Matteo Grolla
Hi everybody can anyone give me a suitable interpretation for cat_rank in http://people.apache.org/~hossman/ac2012eu/ slide 15 thanks

How to size document cache

2013-10-25 Thread Matteo Grolla
Hi, I'd really appreciate it if you could give me some help understanding how to tune the document cache. My thoughts: min value: max_results * max_concurrent_queries, as stated by http://wiki.apache.org/solr/SolrCaching. How can I estimate max_concurrent_queries?
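The lower bound quoted in this message is a single multiplication, shown below with made-up inputs. The function name and the sample numbers are assumptions for illustration. How to estimate max_concurrent_queries, which is the actual question, is not addressed by this sketch.

```python
def min_document_cache_size(max_results, max_concurrent_queries):
    """Lower bound from the message: every doc of every in-flight request fits."""
    return max_results * max_concurrent_queries


# Made-up example: requests fetching up to 1000 rows, at most 2 at the same time.
print(min_document_cache_size(1000, 2))
```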

Re: Improving indexing performance

2013-10-08 Thread Matteo Grolla
'll be best to specify openSearcher=false for max indexing throughput > BTW. You should be able to do this quite frequently, 15 seconds seems > quite reasonable. > > Best, > Erick > > On Sun, Oct 6, 2013 at 12:19 PM, Matteo Grolla > wrote: >> I'd like

Improving indexing performance

2013-10-06 Thread Matteo Grolla
I'd like to have some suggestions on how to improve the indexing performance in the following scenario: I'm uploading 1M docs to solr, every doc has id: sequential number title: small string date: date body: 1kb of text Here are my benchmarks (they are all single

how soft-commit works

2013-09-16 Thread Matteo Grolla
searcher? -Is it a good idea to set openSearcher=false in auto commit and rely on soft auto commit to see new data in searches? thanks Matteo Grolla