Hi Joshua,

Can you confirm that you are now seeing only 3 lines instead of the 9 that you showed previously? Also, the base_url should now be different. Is that the case?
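If it helps, here is a quick way to double-check on the exporter side. This is only a rough sketch on my end: I am assuming the exporter serves its output on /metrics, and I took the host and port from the instance label in your error message, so adjust both to your setup:

  curl -s http://fqdn.for.solr.server:8984/metrics \
    | grep -F 'solr_metrics_core_time_seconds_total{' \
    | cut -d'}' -f1 | sort | uniq -c | sort -rn | head

This strips the sample values and counts how many times each label set appears. Any label set with a count greater than 1 is still a duplicate; after the change each one should show 1, and the base_url label should point at the single node the exporter now talks to.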
Mathieu

On Wed, Aug 11, 2021 at 3:50 PM Joshua Hendrickson <jhendrick...@tripadvisor.com> wrote:

> Mathieu,
>
> We have changed our Prometheus configuration to scrape only from one pod
> in the cluster, but we still see the error given below. Is there anything
> else we can try?
>
> On 2021/08/11 08:58:34, Mathieu Marie <m...@salesforce.com.INVALID> wrote:
> > It happens because you use *-z zk-url* to connect to Solr.
> > When you do that, the prometheus-exporter assumes that it connects to a
> > SolrCloud environment and will collect the metrics from all nodes.
> > Given you have started 3 prometheus-exporters, each one of them will
> > collect all metrics from the cluster.
> >
> > You can fix this in two different ways:
> > 1. use *-h <your-local-solr-url>* instead of *-z <zk-url>*
> > 2. have only one instance of the prometheus-exporter in the cluster
> >
> > Note that solution 1 will not retrieve the metrics you have configured in
> > the *<collections>* tag in your configuration, as *-h* assumes a
> > non-SolrCloud instance.
> >
> > Regards,
> > Mathieu
> >
> > On Wed, Aug 11, 2021 at 9:32 AM Joshua Hendrickson <jhendrick...@tripadvisor.com> wrote:
> >
> > > Hello,
> > >
> > > Our organization has implemented Solr 8.9.0 for a production use case. We
> > > have standardized on Prometheus for metrics collection and storage. We
> > > export metrics from our Solr cluster by deploying the public Solr image
> > > for version 8.9.0 to an EC2 instance and using Docker to run the exporter
> > > binary against Solr (which is running in a container on the same host).
> > > Our Prometheus scraper (hosted in Kubernetes and configured via a Helm
> > > chart) reports errors like the following on every scrape:
> > >
> > > ts=2021-08-10T16:44:13.929Z caller=dedupe.go:112 component=remote
> > > level=error remote_name=11d3d0 url=https://our.endpoint/push
> > > msg="non-recoverable error" count=500 err="server returned HTTP status 400
> > > Bad Request: user=nnnnn: err: duplicate sample for timestamp.
> > > timestamp=2021-08-10T16:44:13.317Z,
> > > series={__name__=\"solr_metrics_core_time_seconds_total\",
> > > aws_account=\"our-account\",
> > > base_url=\"http://fqdn.for.solr.server:32080/solr\", category=\"QUERY\",
> > > cluster=\"our-cluster\", collection=\"a-collection\",
> > > core=\"a_collection_shard1_replica_t13\", dc=\"aws\", handler=\"/select\",
> > > instance=\"fqdn.for.solr.server:8984\", job=\"solr\",
> > > replica=\"replica_t13\", shard=\"shard1\"}"
> > >
> > > We have confirmed that there are indeed duplicate time series when we
> > > query our prometheus exporter.
> > > Here is a sample that shows the duplicate time series:
> > >
> > > solr_metrics_core_time_seconds_total{category="QUERY",handler="/select",core="a_collection_shard1_replica_t1",collection="a_collection",shard="shard1",replica="replica_t1",base_url="http://fqdn3.for.solr.server:32080/solr",} 1.533471301599E9
> > > solr_metrics_core_time_seconds_total{category="QUERY",handler="/select",core="a_collection_shard1_replica_t1",collection="a_collection",shard="shard1",replica="replica_t1",base_url="http://fqdn3.for.solr.server:32080/solr",} 8.89078653472891E11
> > > solr_metrics_core_time_seconds_total{category="QUERY",handler="/select",core="a_collection_shard1_replica_t1",collection="a_collection",shard="shard1",replica="replica_t1",base_url="http://fqdn3.for.solr.server:32080/solr",} 8.9061212477449E11
> > > solr_metrics_core_time_seconds_total{category="QUERY",handler="/select",core="a_collection_shard1_replica_t3",collection="a_collection",shard="shard1",replica="replica_t3",base_url="http://fqdn2.for.solr.server:32080/solr",} 1.63796914645E9
> > > solr_metrics_core_time_seconds_total{category="QUERY",handler="/select",core="a_collection_shard1_replica_t3",collection="a_collection",shard="shard1",replica="replica_t3",base_url="http://fqdn2.for.solr.server:32080/solr",} 9.05314998357273E11
> > > solr_metrics_core_time_seconds_total{category="QUERY",handler="/select",core="a_collection_shard1_replica_t3",collection="a_collection",shard="shard1",replica="replica_t3",base_url="http://fqdn2.for.solr.server:32080/solr",} 9.06952967503723E11
> > > solr_metrics_core_time_seconds_total{category="QUERY",handler="/select",core="a_collection_shard1_replica_t5",collection="a_collection",shard="shard1",replica="replica_t5",base_url="http://fqdn1.for.solr.server:32080/solr",} 1.667842814432E9
> > > solr_metrics_core_time_seconds_total{category="QUERY",handler="/select",core="a_collection_shard1_replica_t5",collection="a_collection",shard="shard1",replica="replica_t5",base_url="http://fqdn1.for.solr.server:32080/solr",} 9.1289401347629E11
> > > solr_metrics_core_time_seconds_total{category="QUERY",handler="/select",core="a_collection_shard1_replica_t5",collection="a_collection",shard="shard1",replica="replica_t5",base_url="http://fqdn1.for.solr.server:32080/solr",} 9.14561856290722E11
> > >
> > > This is the systemd unit file that runs the exporter container:
> > >
> > > [Unit]
> > > Description=Solr Exporter Docker
> > > After=network.target
> > > Wants=network.target
> > > Requires=docker.service
> > > After=docker.service
> > >
> > > [Service]
> > > Type=simple
> > > ExecStart=/usr/bin/docker run --rm \
> > >   --name=solr-exporter \
> > >   --net=host \
> > >   --user=solr \
> > >   solr:8.9.0 \
> > >   /opt/solr/contrib/prometheus-exporter/bin/solr-exporter \
> > >   -p 8984 -z the-various-zookeeper-endpoints -f /opt/solr/contrib/prometheus-exporter/conf/solr-exporter-config.xml -n 4
> > >
> > > ExecStop=/usr/bin/docker stop -t 2 solr-exporter
> > > Restart=on-failure
> > >
> > > [Install]
> > > WantedBy=multi-user.target
> > >
> > > I looked into the XML configurations for prometheus-exporter between 8.6.2
> > > (the previous version we used) and latest, and it looks like at some point
> > > recently there was a major refactoring in how this works.
> > > Is there something we are missing? Can anyone reproduce this issue on 8.9?
> > >
> > > Thanks in advance,
> > > Joshua Hendrickson
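For completeness, if you end up going with option 1 from my earlier mail (pointing the exporter at the local Solr node instead of ZooKeeper), only the last arguments of the ExecStart in the unit file above need to change. This is just a sketch on my side; http://localhost:32080/solr is my assumption for the local base URL (your base_url labels show port 32080), so substitute whatever your Solr container actually listens on:

  ExecStart=/usr/bin/docker run --rm \
    --name=solr-exporter \
    --net=host \
    --user=solr \
    solr:8.9.0 \
    /opt/solr/contrib/prometheus-exporter/bin/solr-exporter \
    -p 8984 -h http://localhost:32080/solr -f /opt/solr/contrib/prometheus-exporter/conf/solr-exporter-config.xml -n 4

The caveat from my earlier mail still applies: with *-h* the metrics configured under the *<collections>* tag in solr-exporter-config.xml are not collected, because the exporter treats the target as a non-SolrCloud node.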