Solr Admin Page Metrics

2021-03-11 Thread Dwane Hall
Hey Solr community. I started playing around with the 8.8.1 docker image today 
and noticed issues with the JVM and host memory 'Bar' graphs from the dashboard 
page of the Solr Admin interface. It also appears the "JVM" parameters were 
not listed here, but they were definitely configured as expected since they 
were visible under the "Java Properties" tab. From a quick inspection of the 
Javascript console it appears some objects were undefined (it looks to be an 
early Angular application). Has anyone else noticed this behaviour? This worked 
as expected on the 7.x branch of Solr.

Thanks,

Dwane


DevTools failed to load SourceMap: Could not load content for 
https://myhost/solr/libs/angular-resource.min.js.map: HTTP error: status code 
404, net::ERR_HTTP_RESPONSE_CODE_FAILURE

DevTools failed to load SourceMap: Could not load content for 
https://myhost/solr/libs/angular.min.js.map: HTTP error: status code 404, 
net::ERR_HTTP_RESPONSE_CODE_FAILURE

DevTools failed to load SourceMap: Could not load content for 
https://myhost/solr/libs/angular-route.min.js.map: HTTP error: status code 404, 
net::ERR_HTTP_RESPONSE_CODE_FAILURE

DevTools failed to load SourceMap: Could not load content for 
https://myhost/solr/libs/angular-cookies.min.js.map: HTTP error: status code 
404, net::ERR_HTTP_RESPONSE_CODE_FAILURE

angular.min.js:146 TypeError: Cannot read property 'match' of undefined
    at parse_memory_value (index.js:80)
    at index.js:43
    at I (angular-resource.min.js:31)
    at angular.min.js:159
    at m.$digest (angular.min.js:170)
    at m.$apply (angular.min.js:174)
    at k (angular.min.js:125)
    at v (angular.min.js:130)
    at XMLHttpRequest.y.onload (angular.min.js:131)
"Possibly unhandled rejection: {}"


RE: zk upconfig does not recognize ZK_HOST style url

2021-03-11 Thread Subhajit Das
Hi Jan,

Opened SOLR-15231

Thanks,
Subhajit

From: Jan Høydahl
Sent: 09 March 2021 05:05 PM
To: users@solr.apache.org
Subject: Re: zk upconfig does not recognize ZK_HOST style url

This sounds like a bug. Please open a JIRA issue for it.

The bug appears when parsing zkHost in SolrCLI.java, here is one place 
https://github.com/apache/lucene-solr/blob/master/solr/core/src/java/org/apache/solr/util/SolrCLI.java#L212
 but there are others. We need to parse the string, pull out zkChRoot part and 
pass that to CloudSolrClient.Builder()

Jan

> 9. mar. 2021 kl. 08:36 skrev Subhajit Das :
>
>
> Hi There,
>
> When I was trying to upload new configset to zookeeper using Solr control 
> script, the -z parameter is not recognizing ZK_HOST style string.
>
> Say, I use ,,/solr, then the config is uploaded to  
> directly, instead of /solr znode.
>
> Can anyone please help on this matter? Is this how it is supposed to work?
>
> Note: /solr,/solr,/solr seems to work.
>



Distributed group query: renaming the unique key field in ` fl ` causes a NullPointerException.

2021-03-11 Thread Dawn
Hi:
In a distributed group query, if the unique key field is renamed in ` fl `, a 
java.lang.NullPointerException is thrown.

This happens because StoredFieldsShardResponseProcessor does not handle the rename.

Can I create an issue to fix it?


Sincerely yours
Dawn





Re: Solr not distributing search requests among replicas

2021-03-11 Thread Chris Hostetter

: >> 2) a single "extra" solr node in the cluster can be used as a "self
: configuring" load balancer
: 
: I’ve thought about this a bunch before, are there mechanisms to instruct
: Solr to not host shards for this purpose? Maybe it deserves its own
: discussion.

Rules based replica placement can prevent any replica (of any shard, of 
any collection) from being put on a particular host by ip (or by sys prop 
set when the node is started) ... 

https://solr.apache.org/guide/8_6/rule-based-replica-placement.html#do-not-create-any-replicas-in-host-192-45-67-3

...similar restrictions can easily be imposed by autoscaling policy rules 
(although AFAIK you have to flip the logic: "all replicas of all 
collections should only live on nodes with prop X") ...

https://solr.apache.org/guide/8_6/solrcloud-autoscaling-policy-preferences.html#node-selector

[ { "replica": "#ALL", "nodeset": {"sysprop.use_for_replicas": "search"} } ]


... but I don't know how much of this functionality has survived "The 
Great Autoscaling Purge of 9.x"




-Hoss
http://www.lucidworks.com/

Re: Increase in response time in case of collapse queries.

2021-03-11 Thread Gajendra Dadheech
@florin

Great advice. Null key for unique documents is really helpful. Any other
such tricks that you are using to improve collapse performance ?
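For reference, the combination this thread converges on might look like the 
sketch below (the field name `parentglusrid` and the `expand` values come from 
Parshant's earlier messages; `nullPolicy=expand` keeps documents whose collapse 
field the client nulled out at index time because they are unique):

```
fq={!collapse field=parentglusrid nullPolicy=expand}
expand=true
expand.rows=4
```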

On Tue, Mar 9, 2021, 2:45 PM Parshant Kumar
 wrote:

> Hi Joel,
>
> 1) What are the response times for both methods. Saying one is faster is
> not specific enough.
>
> Response time for the grouped method is 167 ms for 0.65 million requests.
> Response time for the collapsed method is 177 ms for 0.65 million requests.
>
> 2) What is the cardinality of the collapse field, saying it's high is not
> specific enough. What is the actual cardinality?
>
> Cardinality of the collapse field is around 6.2 Million
>
> [image: image.png]
> 3) Is ngroups used in the grouping query
>
> Yes, ngroups is used in grouping query.
>
> Thanks
> Parshant Kumar
>
>
>
>
> On Tue, Mar 9, 2021 at 12:30 AM Joel Bernstein  wrote:
>
>> Collapse is designed to outperform grouping in the following scenario:
>>
>> There is high cardinality on the group field and group.ngroups is needed.
>> If either of these conditions is not satisfied grouping will typically be
>> faster.
>>
>> You will need to provide some more information about your setup to get an
>> answer to the collapse performance question.
>>
>> 1) What are the response times for both methods. Saying one is faster is
>> not specific enough.
>> 2) What is the cardinality of the collapse field, saying it's high is not
>> specific enough. What is the actual cardinality?
>> 3) Is ngroups used in the grouping query.
>>
>>
>> Joel Bernstein
>> http://joelsolr.blogspot.com/
>>
>>
>> On Mon, Mar 8, 2021 at 11:30 AM Gajendra Dadheech 
>> wrote:
>>
>> > @prashant Florin means to put null for parentglusrid in documents where
>> > this field-value is only present in one document [Group has only one
>> > document]. and then use nullPolicy to include/expand.
>> >
>> >
>> >
>> > On Mon, Mar 8, 2021 at 6:55 PM Parshant Kumar
>> >  wrote:
>> >
>> > > client should set to null the field if it's unique.
>> > >
>> > > @florin @Gajendra can you please explain more .I am not clear how to
>> > > perform this.
>> > >
>> > > On Mon, Mar 8, 2021 at 6:09 PM Florin Babes 
>> > wrote:
>> > >
>> > > > @Gajendra Our response time dropped by 36% and our rps increased
>> with
>> > > 27%.
>> > > >
>> > > > You have to reindex the core and the client should set to null the
>> > field
>> > > if
>> > > > it's unique.
>> > > >
>> > > > În lun., 8 mar. 2021 la 13:18, Parshant Kumar
>> > > >  a scris:
>> > > >
>> > > > > How can we make group_field null? Using nullPolicy=expand ?
>> > > > >
>> > > > > On Mon, Mar 8, 2021 at 4:41 PM Florin Babes <
>> babesflo...@gmail.com>
>> > > > wrote:
>> > > > >
>> > > > > > We improved the performance of collapse by making the
>> group_field
>> > > null
>> > > > > for
>> > > > > > the documents that have an unique value for group_field. This
>> might
>> > > > help/
>> > > > > >
>> > > > > >
>> > > > > > În lun., 8 mar. 2021 la 12:40, Parshant Kumar
>> > > > > >  a scris:
>> > > > > >
>> > > > > > > yes,group_field is having high cardinality.
>> > > > > > >
>> > > > > > >
>> > > > > > > Thanks
>> > > > > > > Parshant Kumar
>> > > > > > >
>> > > > > > > On Mon, Mar 8, 2021 at 4:06 PM Florin Babes <
>> > babesflo...@gmail.com
>> > > >
>> > > > > > wrote:
>> > > > > > >
>> > > > > > > > Your group_field has a high cardinality?
>> > > > > > > > Thanks,
>> > > > > > > > Florin Babes
>> > > > > > > >
>> > > > > > > > În lun., 8 mar. 2021 la 10:35, Parshant Kumar
>> > > > > > > >  a scris:
>> > > > > > > >
>> > > > > > > > > Hi florin,
>> > > > > > > > >
>> > > > > > > > > I am using below.
>> > > > > > > > >
>> > > > > > > > > 1) fq={!collapse field=parentglusrid}
>> > > > > > > > > 2) expand.rows=4
>> > > > > > > > > 3) expand=true
>> > > > > > > > >
>> > > > > > > > > Size of index is around 100GB.
>> > > > > > > > > Solr version is 6.5
>> > > > > > > > >
>> > > > > > > > > On Mon, Mar 8, 2021 at 1:46 PM Florin Babes <
>> > > > babesflo...@gmail.com
>> > > > > >
>> > > > > > > > wrote:
>> > > > > > > > >
>> > > > > > > > > > Hello,
>> > > > > > > > > > First let's call the field you collapse on group_field
>> > > > > > > > > > If group_field has a high cardinality you should make
>> > > > group_field
>> > > > > > > null
>> > > > > > > > > for
>> > > > > > > > > > those documents that have a unique group_field and set
>> > > > > > > > nullPolicy=expand.
>> > > > > > > > > > By doing that solr will use less memory for it's
>> internal
>> > > maps
>> > > > > (so
>> > > > > > > > faster
>> > > > > > > > > > gc) and the head selecting will be faster.
>> > > > > > > > > > What is your head selecting strategy? Can you share
>> your fq
>> > > > which
>> > > > > > you
>> > > > > > > > use
>> > > > > > > > > > for collapsing?
>> > > > > > > > > >
>> > > > > > > > > > Thanks,
>> > > > > > > > > > Florin Babes
>> > > > > > > > > >
>> > > > > > > > > >
>> > > > > > > > > >
>> > > > > > > > > > În lun., 8 mar.

Best throttling / push-back strategy for updates?

2021-03-11 Thread Jan Høydahl
Hi,

When sending updates to Solr, you often need to run multi threaded to utilize 
the CPU on the solr side.
But how can the client (whether it is pure HTTP POST or SolrJ) know whether 
Solr is happy with the indexing speed or not?

I'm thinking of a feedback mechanism where Solr can check its load level, 
indexing queue filling rate or other metrics as desired, and respond to the 
caller with an HTTP 503, or a custom Solr HTTP code "533 Slow down".
Clients will then know that they should pause for a while and retry. Clients 
can then implement an exponential backoff strategy to adjust their indexing 
rate.
A bonus with such a system would be that Solr could "tell" indexing to slow 
down during periods with heavy query traffic, background merge activity, 
recovery, replication, if warming is too slow (max warming searchers) etc etc.
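A client-side sketch of the exponential backoff Jan describes. This assumes a 
hypothetical setup where Solr answers overload with a 503; the update URL, 
function names, and retry parameters are all illustrative, not part of any 
real Solr API:

```javascript
// Pure helper: capped exponential backoff delay for a given retry attempt.
function computeBackoffMs(attempt, baseMs = 500, maxMs = 30000) {
  return Math.min(maxMs, baseMs * Math.pow(2, attempt));
}

// Hypothetical indexing loop: POST a batch of documents, back off on 503
// (or a future "slow down" status), give up after maxRetries.
// fetch() is assumed available (Node 18+ or a browser).
async function sendBatchWithBackoff(url, batch, maxRetries = 8) {
  for (let attempt = 0; attempt <= maxRetries; attempt++) {
    const resp = await fetch(url, {
      method: "POST",
      headers: { "Content-Type": "application/json" },
      body: JSON.stringify(batch),
    });
    if (resp.ok) return resp;                // Solr accepted the batch
    if (resp.status !== 503) {
      throw new Error("update failed: " + resp.status);
    }
    // Solr asked us to slow down: wait, then retry with a longer delay.
    await new Promise((resolve) => setTimeout(resolve, computeBackoffMs(attempt)));
  }
  throw new Error("giving up after repeated 503s");
}
```

With `baseMs = 500` the delays grow 500 ms, 1 s, 2 s, 4 s, ... up to the 30 s cap.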

I know Elastic has something similar. Is there already something in our APIs 
that I don't know about?

Jan

Re: Best throttling / push-back strategy for updates?

2021-03-11 Thread Walter Underwood
In a master/slave system, it is OK to run as fast as possible to the master.
In a cloud system, we want to keep the indexing load at a level that doesn’t
interfere with queries.

I do this by matching the number of indexing threads to the number of CPUs.
Very roughly, two threads will keep one CPU busy, that is one thread waiting 
for 
the CPU to finish the batch and another sending the next batch.

With an 8 CPU machine, use 16 threads to use 100%. Or use 4 threads 
to use 25% (2 CPUs).
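Walter's rule of thumb (roughly two indexing threads per CPU you want to keep 
busy) can be written as a tiny helper; the function name is mine, not from any 
Solr or SolrJ API:

```javascript
// Rule of thumb from the post: ~2 indexing threads keep 1 CPU busy
// (one thread waiting for Solr to finish a batch, one sending the next).
function indexingThreads(totalCpus, targetUtilization) {
  return Math.round(2 * totalCpus * targetUtilization);
}
// 8 CPUs at 100% -> 16 threads; 8 CPUs at 25% (2 CPUs) -> 4 threads.
```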

In a sharded system, the indexing is spread over the leaders. For example,
in our system with 8 shards, 64 threads will keep 2 CPUs busy on each 
leader. That number of threads runs at nearly a half-million updates per
minute, so we don’t need further tuning. 2  busy CPUs is just fine on hosts
with 72 CPUs.

Also, we don’t use the cloud-sensitive stuff, we just throw update batches
at the load balancer. One loader is a simple Python program, so that sends
it all in JSON. That is the one doing 480k/min with 64 threads.

Finally, we use a separate load balancer for indexing. That lets us set 
different
response time alert levels for query traffic and update traffic. It also allows 
us
to see anomalous bursts of query traffic separate from updates.

wunder
Walter Underwood
wun...@wunderwood.org
http://observer.wunderwood.org/  (my blog)

> On Mar 11, 2021, at 10:51 AM, Jan Høydahl  wrote:
> 
> Hi,
> 
> When sending updates to Solr, you often need to run multi threaded to utilize 
> the CPU on the solr side.
> But how can the client (whether it is pure HTTP POST or SolrJ know whether 
> Solr is happy with the indexing speed or not?
> 
> I'm thinking of a feedback mechanism where Solr can check its load level, 
> indexing queue filling rate or other metrics as desired, and respond to the 
> caller with a HTTP 503, or a custom Solr HTTP code "533 Slow down".
> Clients will then know that they should pause for a while and retry. Clients 
> can then implement an exponential backoff strategy to adjust their indexing 
> rate.
> A bonus with such a system would be that Solr could "tell" indexing to slow 
> down during periods with heavy query traffic, background merge activity, 
> recovery, replication, if warming is too slow (max warming searchers) etc etc.
> 
> I know Elastic has something similar. Is there already something in our APIs 
> that I don't know about?
> 
> Jan



Re: Increase in response time in case of collapse queries.

2021-03-11 Thread Joel Bernstein
So here are the response times:

Response time for the grouped method is 167 ms for 0.65 million requests.
Response time for the collapsed method is 177 ms for 0.65 million requests.

If group.ngroups is used with 6+ million cardinality, then the result set
size before grouping must have been small. With a large result set grouping
would have been quite slow in this scenario.

With small result sets, grouping is just fine also.




Joel Bernstein
http://joelsolr.blogspot.com/


On Thu, Mar 11, 2021 at 1:40 PM Gajendra Dadheech 
wrote:

> @florin
>
> Great advice. Null key for unique documents is really helpful. Any other
> such tricks that you are using to improve collapse performance ?
>
> On Tue, Mar 9, 2021, 2:45 PM Parshant Kumar
>  wrote:
>
> > Hi Joel,
> >
> > 1) What are the response times for both methods. Saying one is faster is
> > not specific enough.
> >
> > Response time for the grouped method is 167 ms for 0.65 million requests.
> > Response time for the collapsed method is 177 ms for 0.65 million
> requests.
> >
> > 2) What is the cardinality of the collapse field, saying it's high is not
> > specific enough. What is the actual cardinality?
> >
> > Cardinality of the collapse field is around 6.2 Million
> >
> > [image: image.png]
> > 3) Is ngroups used in the grouping query
> >
> > Yes, ngroups is used in grouping query.
> >
> > Thanks
> > Parshant Kumar
> >
> >
> >
> >
> > On Tue, Mar 9, 2021 at 12:30 AM Joel Bernstein 
> wrote:
> >
> >> Collapse is designed to outperform grouping in the following scenario:
> >>
> >> There is high cardinality on the group field and group.ngroups is
> needed.
> >> If either of these conditions is not satisfied grouping will typically
> be
> >> faster.
> >>
> >> You will need to provide some more information about your setup to get
> an
> >> answer to the collapse performance question.
> >>
> >> 1) What are the response times for both methods. Saying one is faster is
> >> not specific enough.
> >> 2) What is the cardinality of the collapse field, saying it's high is
> not
> >> specific enough. What is the actual cardinality?
> >> 3) Is ngroups used in the grouping query.
> >>
> >>
> >> Joel Bernstein
> >> http://joelsolr.blogspot.com/
> >>
> >>
> >> On Mon, Mar 8, 2021 at 11:30 AM Gajendra Dadheech 
> >> wrote:
> >>
> >> > @prashant Florin means to put null for parentglusrid in documents
> where
> >> > this field-value is only present in one document [Group has only one
> >> > document]. and then use nullPolicy to include/expand.
> >> >
> >> >
> >> >
> >> > On Mon, Mar 8, 2021 at 6:55 PM Parshant Kumar
> >> >  wrote:
> >> >
> >> > > client should set to null the field if it's unique.
> >> > >
> >> > > @florin @Gajendra can you please explain more .I am not clear how to
> >> > > perform this.
> >> > >
> >> > > On Mon, Mar 8, 2021 at 6:09 PM Florin Babes 
> >> > wrote:
> >> > >
> >> > > > @Gajendra Our response time dropped by 36% and our rps increased
> >> with
> >> > > 27%.
> >> > > >
> >> > > > You have to reindex the core and the client should set to null the
> >> > field
> >> > > if
> >> > > > it's unique.
> >> > > >
> >> > > > În lun., 8 mar. 2021 la 13:18, Parshant Kumar
> >> > > >  a scris:
> >> > > >
> >> > > > > How can we make group_field null? Using nullPolicy=expand ?
> >> > > > >
> >> > > > > On Mon, Mar 8, 2021 at 4:41 PM Florin Babes <
> >> babesflo...@gmail.com>
> >> > > > wrote:
> >> > > > >
> >> > > > > > We improved the performance of collapse by making the
> >> group_field
> >> > > null
> >> > > > > for
> >> > > > > > the documents that have an unique value for group_field. This
> >> might
> >> > > > help/
> >> > > > > >
> >> > > > > >
> >> > > > > > În lun., 8 mar. 2021 la 12:40, Parshant Kumar
> >> > > > > >  a scris:
> >> > > > > >
> >> > > > > > > yes,group_field is having high cardinality.
> >> > > > > > >
> >> > > > > > >
> >> > > > > > > Thanks
> >> > > > > > > Parshant Kumar
> >> > > > > > >
> >> > > > > > > On Mon, Mar 8, 2021 at 4:06 PM Florin Babes <
> >> > babesflo...@gmail.com
> >> > > >
> >> > > > > > wrote:
> >> > > > > > >
> >> > > > > > > > Your group_field has a high cardinality?
> >> > > > > > > > Thanks,
> >> > > > > > > > Florin Babes
> >> > > > > > > >
> >> > > > > > > > În lun., 8 mar. 2021 la 10:35, Parshant Kumar
> >> > > > > > > >  a scris:
> >> > > > > > > >
> >> > > > > > > > > Hi florin,
> >> > > > > > > > >
> >> > > > > > > > > I am using below.
> >> > > > > > > > >
> >> > > > > > > > > 1) fq={!collapse field=parentglusrid}
> >> > > > > > > > > 2) expand.rows=4
> >> > > > > > > > > 3) expand=true
> >> > > > > > > > >
> >> > > > > > > > > Size of index is around 100GB.
> >> > > > > > > > > Solr version is 6.5
> >> > > > > > > > >
> >> > > > > > > > > On Mon, Mar 8, 2021 at 1:46 PM Florin Babes <
> >> > > > babesflo...@gmail.com
> >> > > > > >
> >> > > > > > > > wrote:
> >> > > > > > > > >
> >> > > > > > > > > > Hello,
> >> > > > > > > > >

Re: Solr Admin Page Metrics

2021-03-11 Thread Dwane Hall
I dug into this a little deeper and it looks like some of the metrics 
reported from the Metrics API have changed between Solr 7 and 8.  The main 
culprits seem to be os.totalPhysicalMemorySize not being calculated in Solr 8 
and two missing metrics, os.totalSwapSpaceSize and os.freeSwapSpaceSize, which 
are all used in the Dashboard view page.  Below is an extract of the javascript 
used on the Admin Dashboard, and a comparison between the metrics reported in 
Solr 7 and 8.  The function "parse_memory_value" is where the javascript error 
appears to be thrown when the metrics are missing.

Thanks,

Dwane


Solr 8

"os.totalPhysicalMemorySize":0, (Not calculated)

"os.freePhysicalMemorySize":792087998464,

"os.totalSwapSpaceSize" (Metric not present)

"os.freeSwapSpaceSize": (Metric not present)

"memory.heap.committed":8589934592,

"memory.heap.init":8589934592,

"memory.heap.max":8589934592,

"memory.heap.usage":0.006413557566702366,

"memory.heap.used":55092040,

"memory.non-heap.committed":97910784,

"memory.non-heap.init":7667712,

"memory.non-heap.max":-1,

"memory.non-heap.usage":-9.2249552E7,

"memory.non-heap.used":92249712,



Solr 7

"os.totalPhysicalMemorySize":810586099712,

"os.freePhysicalMemorySize":756665888768,

"os.totalSwapSpaceSize":0,

"os.freeSwapSpaceSize":0

"memory.heap.committed":12348030976,

"memory.heap.init":12884901888,

"memory.heap.max":12348030976,

"memory.heap.usage":0.313836514301922,

"memory.heap.used":3875263000,

"memory.non-heap.committed":145039360,

"memory.non-heap.init":7667712,

"memory.non-heap.max":-1,

"memory.non-heap.usage":-1.30145664E8,

"memory.non-heap.used":130145824,







main.js (Metrics Dashboard)



// physical memory
var memoryMax = parse_memory_value(data.system.totalPhysicalMemorySize);
$scope.memoryTotal = parse_memory_value(data.system.totalPhysicalMemorySize - data.system.freePhysicalMemorySize);
$scope.memoryPercentage = ($scope.memoryTotal / memoryMax * 100).toFixed(1) + "%";
$scope.memoryMax = pretty_print_bytes(memoryMax);
$scope.memoryTotalDisplay = pretty_print_bytes($scope.memoryTotal);

// swap space
var swapMax = parse_memory_value(data.system.totalSwapSpaceSize);
$scope.swapTotal = parse_memory_value(data.system.totalSwapSpaceSize - data.system.freeSwapSpaceSize);
$scope.swapPercentage = ($scope.swapTotal / swapMax * 100).toFixed(1) + "%";
$scope.swapMax = pretty_print_bytes(swapMax);
$scope.swapTotalDisplay = pretty_print_bytes($scope.swapTotal);

// file handles
$scope.fileDescriptorPercentage = (data.system.openFileDescriptorCount / data.system.maxFileDescriptorCount * 100).toFixed(1) + "%";

// java memory
var javaMemoryMax = parse_memory_value(data.jvm.memory.raw.max || data.jvm.memory.max);
$scope.javaMemoryTotal = parse_memory_value(data.jvm.memory.raw.total || data.jvm.memory.total);
$scope.javaMemoryUsed = parse_memory_value(data.jvm.memory.raw.used || data.jvm.memory.used);
$scope.javaMemoryTotalPercentage = ($scope.javaMemoryTotal / javaMemoryMax * 100).toFixed(1) + "%";
$scope.javaMemoryUsedPercentage = ($scope.javaMemoryUsed / $scope.javaMemoryTotal * 100).toFixed(1) + "%";
$scope.javaMemoryPercentage = ($scope.javaMemoryUsed / javaMemoryMax * 100).toFixed(1) + "%";
$scope.javaMemoryTotalDisplay = pretty_print_bytes($scope.javaMemoryTotal);
$scope.javaMemoryUsedDisplay = pretty_print_bytes($scope.javaMemoryUsed);  // @todo These should really be an AngularJS Filter: {{ javaMemoryUsed | bytes }}
$scope.javaMemoryMax = pretty_print_bytes(javaMemoryMax);

var parse_memory_value = function( value ) {
  if( value !== Number( value ) )
  {
    var units = 'BKMGTPEZY';
    var match = value.match( /^(\d+([,\.]\d+)?) (\w).*$/ );
    var value = parseFloat( match[1] ) * Math.pow( 1024, units.indexOf( match[3].toUpperCase() ) );
  }

  return value;
};
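A defensive variant is sketched below. This is my illustration of how the crash 
could be avoided, not the actual fix adopted in Solr: it returns 0 when a metric 
is absent instead of calling .match on undefined:

```javascript
// Sketch of a defensive parse_memory_value: tolerates missing metrics
// (e.g. the absent Solr 8 swap metrics) instead of throwing
// "Cannot read property 'match' of undefined".
function parseMemoryValue(value) {
  if (value === undefined || value === null) return 0; // metric not reported
  if (typeof value === "number") return value;         // already numeric
  var units = "BKMGTPEZY";
  var match = String(value).match(/^(\d+([,\.]\d+)?) (\w).*$/);
  if (!match) return 0;                                // unparseable string
  return parseFloat(match[1]) * Math.pow(1024, units.indexOf(match[3].toUpperCase()));
}
```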



From: Dwane Hall 
Sent: Thursday, 11 March 2021 7:40 PM
To: users@solr.apache.org 
Subject: Solr Admin Page Metrics

Hey Solr community. I started playing around with the 8.8.1 docker image today 
and noticed issues with the JVM and host memory 'Bar' graphs from the dashboard 
page of the Solr Admin interface. It also appeares the "JVM" parameters were 
not listed here but definitely configured as expected as they were visible 
under the "Java Properties" tab. Form a quick inspection of the Javascript 
console it appears some objects were undefined (looks to be an early Angular 
application). Has anyone else noticed this behaviour as well this worked as 
expected on the 7.x branch of Solr?

Thanks,

Dwane


DevTools failed to load SourceMap: Could not load content for 
https://myhost/solr/libs/angular-resource.min.js.map: HTTP error: status code 
404, net::ERR_HTTP_RESPONSE_CODE_FAILURE

DevTools failed to load SourceMap: Could not load content for 
https://myhost/solr/libs/angular.min.js.map: HTTP error: status code 404, 
net::ERR_HTTP_RESPONSE_CODE_FAILURE

DevTools failed to load SourceMap: Could not l

Re: Solr Admin Page Metrics

2021-03-11 Thread Eric Pugh
I’d love to see a Jira issue created and a PR opened against 
https://github.com/apache/solr  for this.  Tag 
me and I’ll review it.

> On Mar 11, 2021, at 6:13 PM, Dwane Hall  wrote:
> 
> I dug into this a l little deeper and it looks like some of the metrics 
> reported from the Metrics API have changed between Solr 7 and 8.  The main 
> culprits seem to be os.totalPhysicalMemorySize not being calucated in Solr 8 
> and two missing metrics os.totalSwapSpaceSize and os.freeSwapSpaceSize which 
> are all used in the Dashboard view page.  Below is an extract of the 
> javascrpit used on the Admin Dashboard,  and a comparison between metrics 
> reported in Solr 7 and 8.  The function "parse_memory_value" is where the 
> javascript error appears to be thrown with the missing metrics.
> 
> Thanks,
> 
> Dwane
> 
> 
> Solr 8
> 
> "os.totalPhysicalMemorySize":0, (Not calculated)
> 
> "os.freePhysicalMemorySize":792087998464,
> 
> "os.totalSwapSpaceSize" (Metric not present)
> 
> "os.freeSwapSpaceSize": (Metric not present)
> 
> "memory.heap.committed":8589934592,
> 
> "memory.heap.init":8589934592,
> 
> "memory.heap.max":8589934592,
> 
> "memory.heap.usage":0.006413557566702366,
> 
> "memory.heap.used":55092040,
> 
> "memory.non-heap.committed":97910784,
> 
> "memory.non-heap.init":7667712,
> 
> "memory.non-heap.max":-1,
> 
> "memory.non-heap.usage":-9.2249552E7,
> 
> "memory.non-heap.used":92249712,
> 
> 
> 
> Solr 7
> 
> "os.totalPhysicalMemorySize":810586099712,
> 
> "os.freePhysicalMemorySize":756665888768,
> 
> "os.totalSwapSpaceSize":0,
> 
> "os.freeSwapSpaceSize":0
> 
> "memory.heap.committed":12348030976,
> 
> "memory.heap.init":12884901888,
> 
> "memory.heap.max":12348030976,
> 
> "memory.heap.usage":0.313836514301922,
> 
> "memory.heap.used":3875263000,
> 
> "memory.non-heap.committed":145039360,
> 
> "memory.non-heap.init":7667712,
> 
> "memory.non-heap.max":-1,
> 
> "memory.non-heap.usage":-1.30145664E8,
> 
> "memory.non-heap.used":130145824,
> 
> 
> 
> 
> 
> 
> 
> main.js (Metrics Dashboard)
> 
> 
> 
> // physical memory
> 
> var memoryMax = parse_memory_value(data.system.totalPhysicalMemorySize);
> 
> $scope.memoryTotal = parse_memory_value(data.system.totalPhysicalMemorySize - 
> data.system.freePhysicalMemorySize);
> 
> $scope.memoryPercentage = ($scope.memoryTotal / memoryMax * 100).toFixed(1)+ 
> "%";
> 
> $scope.memoryMax = pretty_print_bytes(memoryMax);
> 
> $scope.memoryTotalDisplay = pretty_print_bytes($scope.memoryTotal);
> 
> 
> 
> // swap space
> 
> var swapMax = parse_memory_value(data.system.totalSwapSpaceSize);
> 
> $scope.swapTotal = parse_memory_value(data.system.totalSwapSpaceSize - 
> data.system.freeSwapSpaceSize);
> 
> $scope.swapPercentage = ($scope.swapTotal / swapMax * 100).toFixed(1)+ "%";
> 
> $scope.swapMax = pretty_print_bytes(swapMax);
> 
> $scope.swapTotalDisplay = pretty_print_bytes($scope.swapTotal);
> 
> 
> 
> // file handles
> 
> $scope.fileDescriptorPercentage = (data.system.openFileDescriptorCount / 
> data.system.maxFileDescriptorCount *100).toFixed(1) + "%";
> 
> 
> 
> // java memory
> 
> var javaMemoryMax = parse_memory_value(data.jvm.memory.raw.max || 
> data.jvm.memory.max);
> 
> $scope.javaMemoryTotal = parse_memory_value(data.jvm.memory.raw.total || 
> data.jvm.memory.total);
> 
> $scope.javaMemoryUsed = parse_memory_value(data.jvm.memory.raw.used || 
> data.jvm.memory.used);
> 
> $scope.javaMemoryTotalPercentage = ($scope.javaMemoryTotal / javaMemoryMax 
> *100).toFixed(1) + "%";
> 
> $scope.javaMemoryUsedPercentage = ($scope.javaMemoryUsed / 
> $scope.javaMemoryTotal *100).toFixed(1) + "%";
> 
> $scope.javaMemoryPercentage = ($scope.javaMemoryUsed / javaMemoryMax * 
> 100).toFixed(1) + "%";
> 
> $scope.javaMemoryTotalDisplay = pretty_print_bytes($scope.javaMemoryTotal);
> 
> $scope.javaMemoryUsedDisplay = pretty_print_bytes($scope.javaMemoryUsed);  // 
> @todo These should really be an AngularJS Filter: {{ javaMemoryUsed | bytes }}
> 
> $scope.javaMemoryMax = pretty_print_bytes(javaMemoryMax);
> 
> 
> 
> 
> 
> var parse_memory_value = function( value ) {
> 
>  if( value !== Number( value ) )
> 
>  {
> 
>var units = 'BKMGTPEZY';
> 
>var match = value.match( /^(\d+([,\.]\d+)?) (\w).*$/ );
> 
>var value = parseFloat( match[1] ) * Math.pow( 1024, units.indexOf( 
> match[3].toUpperCase() ) );
> 
>  }
> 
> 
> 
>  return value;
> 
> };
> 
> 
> 
> From: Dwane Hall 
> Sent: Thursday, 11 March 2021 7:40 PM
> To: users@solr.apache.org 
> Subject: Solr Admin Page Metrics
> 
> Hey Solr community. I started playing around with the 8.8.1 docker image 
> today and noticed issues with the JVM and host memory 'Bar' graphs from the 
> dashboard page of the Solr Admin interface. It also appeares the "JVM" 
> parameters were not listed here but definitely configured as expected as they 
> were visible under the "Java Properties" tab. Form a quick inspection of the 
> Javascript console it appears some 

Re: Solr Admin Page Metrics

2021-03-11 Thread Dwane Hall
Hi Eric I've raised a jira ticket 
(https://issues.apache.org/jira/browse/SOLR-15251) for the topic mentioned in 
this thread.  I was unable to assign it to you or find your username to tag you 
in the ticket.

Thanks,

Dwane

From: Eric Pugh 
Sent: Friday, 12 March 2021 10:15 AM
To: users@solr.apache.org 
Subject: Re: Solr Admin Page Metrics

I’d love to see a Jira issue created and a PR opened against 
https://github.com/apache/solr  for this.  Tag 
me and I’ll review it.

> On Mar 11, 2021, at 6:13 PM, Dwane Hall  wrote:
>
> I dug into this a l little deeper and it looks like some of the metrics 
> reported from the Metrics API have changed between Solr 7 and 8.  The main 
> culprits seem to be os.totalPhysicalMemorySize not being calucated in Solr 8 
> and two missing metrics os.totalSwapSpaceSize and os.freeSwapSpaceSize which 
> are all used in the Dashboard view page.  Below is an extract of the 
> javascrpit used on the Admin Dashboard,  and a comparison between metrics 
> reported in Solr 7 and 8.  The function "parse_memory_value" is where the 
> javascript error appears to be thrown with the missing metrics.
>
> Thanks,
>
> Dwane
>
>
> Solr 8
>
> "os.totalPhysicalMemorySize":0, (Not calculated)
>
> "os.freePhysicalMemorySize":792087998464,
>
> "os.totalSwapSpaceSize" (Metric not present)
>
> "os.freeSwapSpaceSize": (Metric not present)
>
> "memory.heap.committed":8589934592,
>
> "memory.heap.init":8589934592,
>
> "memory.heap.max":8589934592,
>
> "memory.heap.usage":0.006413557566702366,
>
> "memory.heap.used":55092040,
>
> "memory.non-heap.committed":97910784,
>
> "memory.non-heap.init":7667712,
>
> "memory.non-heap.max":-1,
>
> "memory.non-heap.usage":-9.2249552E7,
>
> "memory.non-heap.used":92249712,
>
>
>
> Solr 7
>
> "os.totalPhysicalMemorySize":810586099712,
>
> "os.freePhysicalMemorySize":756665888768,
>
> "os.totalSwapSpaceSize":0,
>
> "os.freeSwapSpaceSize":0
>
> "memory.heap.committed":12348030976,
>
> "memory.heap.init":12884901888,
>
> "memory.heap.max":12348030976,
>
> "memory.heap.usage":0.313836514301922,
>
> "memory.heap.used":3875263000,
>
> "memory.non-heap.committed":145039360,
>
> "memory.non-heap.init":7667712,
>
> "memory.non-heap.max":-1,
>
> "memory.non-heap.usage":-1.30145664E8,
>
> "memory.non-heap.used":130145824,
>
>
>
>
>
>
>
> main.js (Metrics Dashboard)
>
>
>
> // physical memory
>
> var memoryMax = parse_memory_value(data.system.totalPhysicalMemorySize);
>
> $scope.memoryTotal = parse_memory_value(data.system.totalPhysicalMemorySize - 
> data.system.freePhysicalMemorySize);
>
> $scope.memoryPercentage = ($scope.memoryTotal / memoryMax * 100).toFixed(1)+ 
> "%";
>
> $scope.memoryMax = pretty_print_bytes(memoryMax);
>
> $scope.memoryTotalDisplay = pretty_print_bytes($scope.memoryTotal);
>
>
>
> // swap space
>
> var swapMax = parse_memory_value(data.system.totalSwapSpaceSize);
>
> $scope.swapTotal = parse_memory_value(data.system.totalSwapSpaceSize - 
> data.system.freeSwapSpaceSize);
>
> $scope.swapPercentage = ($scope.swapTotal / swapMax * 100).toFixed(1)+ "%";
>
> $scope.swapMax = pretty_print_bytes(swapMax);
>
> $scope.swapTotalDisplay = pretty_print_bytes($scope.swapTotal);
>
>
>
> // file handles
>
> $scope.fileDescriptorPercentage = (data.system.openFileDescriptorCount / 
> data.system.maxFileDescriptorCount *100).toFixed(1) + "%";
>
>
>
> // java memory
>
> var javaMemoryMax = parse_memory_value(data.jvm.memory.raw.max || 
> data.jvm.memory.max);
>
> $scope.javaMemoryTotal = parse_memory_value(data.jvm.memory.raw.total || 
> data.jvm.memory.total);
>
> $scope.javaMemoryUsed = parse_memory_value(data.jvm.memory.raw.used || 
> data.jvm.memory.used);
>
> $scope.javaMemoryTotalPercentage = ($scope.javaMemoryTotal / javaMemoryMax 
> *100).toFixed(1) + "%";
>
> $scope.javaMemoryUsedPercentage = ($scope.javaMemoryUsed / 
> $scope.javaMemoryTotal *100).toFixed(1) + "%";
>
> $scope.javaMemoryPercentage = ($scope.javaMemoryUsed / javaMemoryMax * 
> 100).toFixed(1) + "%";
>
> $scope.javaMemoryTotalDisplay = pretty_print_bytes($scope.javaMemoryTotal);
>
> $scope.javaMemoryUsedDisplay = pretty_print_bytes($scope.javaMemoryUsed);  // 
> @todo These should really be an AngularJS Filter: {{ javaMemoryUsed | bytes }}
>
> $scope.javaMemoryMax = pretty_print_bytes(javaMemoryMax);
>
>
>
>
>
> var parse_memory_value = function( value ) {
>
>  if( value !== Number( value ) )
>
>  {
>
>var units = 'BKMGTPEZY';
>
>var match = value.match( /^(\d+([,\.]\d+)?) (\w).*$/ );
>
>var value = parseFloat( match[1] ) * Math.pow( 1024, units.indexOf( 
> match[3].toUpperCase() ) );
>
>  }
>
>
>
>  return value;
>
> };
>
>
> 
> From: Dwane Hall 
> Sent: Thursday, 11 March 2021 7:40 PM
> To: users@solr.apache.org 
> Subject: Solr Admin Page Metrics
>
> Hey Solr community. I started playing around with the 8.8.1 docker image 
> today and noticed issues with the JVM and host memory 'Bar' graphs from the 
> dashboard page of the Solr Admin interface.

Re: Solr Admin Page Metrics

2021-03-11 Thread Eric Pugh
I assigned it to me (on Jira I am a David), and I’d love to see your PR.   Let 
me know on the ticket if you need some help!

> On Mar 11, 2021, at 7:04 PM, Dwane Hall wrote:
> 
> Hi Eric, I've raised a Jira ticket 
> (https://issues.apache.org/jira/browse/SOLR-15251) for the topic mentioned 
> in this thread.  I was unable to assign it to you or find your username to 
> tag you in the ticket.
> 
> Thanks,
> 
> Dwane
> From: Eric Pugh
> Sent: Friday, 12 March 2021 10:15 AM
> To: users@solr.apache.org
> Subject: Re: Solr Admin Page Metrics
>  
> I’d love to see a Jira issue created and a PR opened against 
> https://github.com/apache/solr for this.  Tag me and I’ll review it.
> 
> > On Mar 11, 2021, at 6:13 PM, Dwane Hall wrote:
> > 
> > I dug into this a little deeper and it looks like some of the metrics 
> > reported from the Metrics API have changed between Solr 7 and 8.  The main 
> > culprits seem to be os.totalPhysicalMemorySize not being calculated in Solr 
> > 8, and two missing metrics, os.totalSwapSpaceSize and os.freeSwapSpaceSize, 
> > which are all used in the Dashboard view page.  Below is an extract of the 
> > javascript used on the Admin Dashboard, and a comparison between metrics 
> > reported in Solr 7 and 8.  The function "parse_memory_value" is where the 
> > javascript error appears to be thrown with the missing metrics.
> > 
> > Thanks,
> > 
> > Dwane
> > 
> > 
> > Solr 8
> > 
> > "os.totalPhysicalMemorySize":0, (Not calculated)
> > 
> > "os.freePhysicalMemorySize":792087998464,
> > 
> > "os.totalSwapSpaceSize" (Metric not present)
> > 
> > "os.freeSwapSpaceSize": (Metric not present)
> > 
> > "memory.heap.committed":8589934592,
> > 
> > "memory.heap.init":8589934592,
> > 
> > "memory.heap.max":8589934592,
> > 
> > "memory.heap.usage":0.006413557566702366,
> > 
> > "memory.heap.used":55092040,
> > 
> > "memory.non-heap.committed":97910784,
> > 
> > "memory.non-heap.init":7667712,
> > 
> > "memory.non-heap.max":-1,
> > 
> > "memory.non-heap.usage":-9.2249552E7,
> > 
> > "memory.non-heap.used":92249712,
> > 
> > 
> > 
> > Solr 7
> > 
> > "os.totalPhysicalMemorySize":810586099712,
> > 
> > "os.freePhysicalMemorySize":756665888768,
> > 
> > "os.totalSwapSpaceSize":0,
> > 
> > "os.freeSwapSpaceSize":0
> > 
> > "memory.heap.committed":12348030976,
> > 
> > "memory.heap.init":12884901888,
> > 
> > "memory.heap.max":12348030976,
> > 
> > "memory.heap.usage":0.313836514301922,
> > 
> > "memory.heap.used":3875263000,
> > 
> > "memory.non-heap.committed":145039360,
> > 
> > "memory.non-heap.init":7667712,
> > 
> > "memory.non-heap.max":-1,
> > 
> > "memory.non-heap.usage":-1.30145664E8,
> > 
> > "memory.non-heap.used":130145824,
> > 

Re: Best throttling / push-back strategy for updates?

2021-03-11 Thread Jan Høydahl
Yes, that is what I'm recommending to customers right now: manually match 
indexing threads with CPUs. That is the "manual" way.

My question was rather whether we have or want to add some dynamic backoff 
system so that clients can just go full speed until told to back off, and thus 
adjust perfectly to what Solr can swallow.

I had a client the other day who ingested too fast into a system with other 
query load going on at the same time, and it caused some serious slowdown and 
even GC pauses.

Jan

> 11. mar. 2021 kl. 20:55 skrev Walter Underwood:
> 
> In a master/slave system, it is OK to run as fast as possible to the master.
> In a cloud system, we want to keep the indexing load at a level that doesn’t
> interfere with queries.
> 
> I do this by matching the number of indexing threads to the number of CPUs.
> Very roughly, two threads will keep one CPU busy, that is one thread waiting 
> for 
> the CPU to finish the batch and another sending the next batch.
> 
> With an 8 CPU machine, use 16 threads to use 100%. Or use 4 threads 
> to use 25% (2 CPUs).
> 
> In a sharded system, the indexing is spread over the leaders. For example,
> in our system with 8 shards, 64 threads will keep 2 CPUs busy on each 
> leader. That number of threads runs at nearly a half-million updates per
> minute, so we don’t need further tuning. 2  busy CPUs is just fine on hosts
> with 72 CPUs.
> 
> Also, we don’t use the cloud-sensitive stuff, we just throw update batches
> at the load balancer. One loader is a simple Python program, so that sends
> it all in JSON. That is the one doing 480k/min with 64 threads.
> 
> Finally, we use a separate load balancer for indexing. That lets us set 
> different
> response time alert levels for query traffic and update traffic. It also 
> allows us
> to see anomalous bursts of query traffic separate from updates.
> 
> wunder
> Walter Underwood
> wun...@wunderwood.org
> http://observer.wunderwood.org/  (my blog)
> 
>> On Mar 11, 2021, at 10:51 AM, Jan Høydahl wrote:
>> 
>> Hi,
>> 
>> When sending updates to Solr, you often need to run multi-threaded to 
>> utilize the CPU on the Solr side.
>> But how can the client (whether it is pure HTTP POST or SolrJ) know whether 
>> Solr is happy with the indexing speed or not?
>> 
>> I'm thinking of a feedback mechanism where Solr can check its load level, 
>> indexing queue filling rate or other metrics as desired, and respond to the 
>> caller with a HTTP 503, or a custom Solr HTTP code "533 Slow down".
>> Clients will then know that they should pause for a while and retry. Clients 
>> can then implement an exponential backoff strategy to adjust their indexing 
>> rate.
>> A bonus with such a system would be that Solr could "tell" indexing to slow 
>> down during periods with heavy query traffic, background merge activity, 
>> recovery, replication, if warming is too slow (max warming searchers) etc 
>> etc.
>> 
>> I know Elastic has something similar. Is there already something in our APIs 
>> that I don't know about?
>> 
>> Jan
> 
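The backoff scheme Jan describes above (clients pause and retry when told to slow down) could look roughly like this on the client side — a hedged sketch with a made-up endpoint, assuming Solr answers a plain HTTP 503 when overloaded and that a fetch-capable runtime (browser or Node 18+) is used:

```javascript
// Sketch of a client-side exponential-backoff loop for update batches.
// Assumes the server answers 503 when it wants clients to back off;
// the URL and payload below are illustrative only.
async function sendWithBackoff(url, docs, maxRetries = 5, initialDelayMs = 500) {
  let delayMs = initialDelayMs;
  for (let attempt = 0; attempt <= maxRetries; attempt++) {
    const res = await fetch(url, {
      method: 'POST',
      headers: { 'Content-Type': 'application/json' },
      body: JSON.stringify(docs),
    });
    if (res.status !== 503) {
      return res; // indexed, or a non-throttling error for the caller
    }
    // Told to back off: sleep with jitter, then double the delay.
    await new Promise(r => setTimeout(r, delayMs + Math.random() * delayMs));
    delayMs *= 2;
  }
  throw new Error('gave up after ' + maxRetries + ' retries');
}
```

The jitter keeps many parallel loader threads from retrying in lockstep; the doubling gives the exponential part Jan mentions.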



Re: Best throttling / push-back strategy for updates?

2021-03-11 Thread Mike Drob
The new circuit breakers might be able to offer some rate limiting.
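For reference, the 8.x-era circuit breakers are enabled in solrconfig.xml roughly like this — a sketch from memory, so check the ref guide for your exact version, and note they act on incoming search requests, not updates:

```xml
<!-- solrconfig.xml: illustrative only; names per the 8.x ref guide -->
<circuitBreaker class="solr.CircuitBreakerManager" enabled="true">
  <!-- trip when JVM heap usage crosses 75% -->
  <str name="memEnabled">true</str>
  <str name="memThreshold">75</str>
  <!-- trip when system load average crosses the threshold -->
  <str name="cpuEnabled">true</str>
  <str name="cpuThreshold">75</str>
</circuitBreaker>
```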



Re: Best throttling / push-back strategy for updates?

2021-03-11 Thread Walter Underwood
Circuit breakers only cancel searches. They don’t touch updates.
I was in that code a few weeks ago and have a patch waiting for approval.

wunder
Walter Underwood
wun...@wunderwood.org
http://observer.wunderwood.org/  (my blog)

> On Mar 11, 2021, at 4:33 PM, Mike Drob wrote:
> 
> The new circuit breakers might be able to offer some rate limiting.
> 



Re: Best throttling / push-back strategy for updates?

2021-03-11 Thread Dwane Hall
I really like the idea. I too have had instances in the past where (some) 
updates fail because of long(ish) GC pause times due to overloading, and having 
the option to pause indexing and give Solr a chance to catch up would be very 
useful.  I typically have a retry clause managing these issues, but I'm 
generally catching generic errors, so a specific error code that you could 
catch, sleep on, and try again at future intervals has some merit in my opinion.

Thanks,

Dwane

From: Jan Høydahl 
Sent: Friday, 12 March 2021 11:18 AM
To: users@solr.apache.org 
Subject: Re: Best throttling / push-back strategy for updates?
