Yeah, there are a few undesirable behaviors there. https://bugs.launchpad.net/swift/+bug/1583305
#willfix

On Tue, May 17, 2016 at 11:04 PM, Mark Kirkwood <mark.kirkw...@catalyst.net.nz> wrote:

> On 17/05/16 17:43, Mark Kirkwood wrote:
>
>> I'm seeing some replication errors in the object server log:
>>
>> May 17 05:27:36 markir-dev-ostor001 object-server: Starting object replication pass.
>> May 17 05:27:36 markir-dev-ostor001 object-server: 1/1 (100.00%) partitions replicated in 0.03s (38.19/sec, 0s remaining)
>> May 17 05:27:36 markir-dev-ostor001 object-server: 2 successes, 0 failures
>> May 17 05:27:36 markir-dev-ostor001 object-server: 1 suffixes checked - 0.00% hashed, 0.00% synced
>> May 17 05:27:36 markir-dev-ostor001 object-server: Partition times: max 0.0210s, min 0.0210s, med 0.0210s
>> May 17 05:27:36 markir-dev-ostor001 object-server: Object replication complete. (0.00 minutes)
>> May 17 05:27:36 markir-dev-ostor001 object-server: Replication sleeping for 30 seconds.
>> May 17 05:27:40 markir-dev-ostor001 object-server: Begin object audit "forever" mode (ALL)
>> May 17 05:27:40 markir-dev-ostor001 object-server: Begin object audit "forever" mode (ZBF)
>> May 17 05:27:40 markir-dev-ostor001 object-server: Object audit (ZBF). Since Tue May 17 05:27:40 2016: Locally: 1 passed, 0 quarantined, 0 errors, files/sec: 83.24, bytes/sec: 0.00, Total time: 0.01, Auditing time: 0.00, Rate: 0.00
>> May 17 05:27:40 markir-dev-ostor001 object-server: Object audit (ZBF) "forever" mode completed: 0.01s. Total quarantined: 0, Total errors: 0, Total files/sec: 66.89, Total bytes/sec: 0.00, Auditing time: 0.01, Rate: 0.75
>> May 17 05:27:45 markir-dev-ostor001 object-server: ::ffff:10.0.3.242 - - [17/May/2016:05:27:45 +0000] "REPLICATE /1/899" 200 56 "-" "-" "object-replicator 18131" 0.0014 "-" 29108 0
>> May 17 05:27:45 markir-dev-ostor001 object-server: ::ffff:10.0.3.242 - - [17/May/2016:05:27:45 +0000] "REPLICATE /1/899" 200 56 "-" "-" "object-replicator 18131" 0.0016 "-" 29109 0
>> May 17 05:28:06 markir-dev-ostor001 object-server: Starting object replication pass.
>> May 17 05:28:06 markir-dev-ostor001 object-server: 1/1 (100.00%) partitions replicated in 0.02s (49.85/sec, 0s remaining)
>> May 17 05:28:06 markir-dev-ostor001 object-server: 2 successes, 6 failures <==============================
>> May 17 05:28:06 markir-dev-ostor001 object-server: 1 suffixes checked - 0.00% hashed, 0.00% synced
>> May 17 05:28:06 markir-dev-ostor001 object-server: Partition times: max 0.0155s, min 0.0155s, med 0.0155s
>> May 17 05:28:06 markir-dev-ostor001 object-server: Object replication complete. (0.00 minutes)
>> May 17 05:28:06 markir-dev-ostor001 object-server: Replication sleeping for 30 seconds.
>
> The other case is (bit more debugging, but trivial so will inline it):
>
> Log:
> May 18 05:59:50 markir-dev-ostor002 object-server: object replication failure 1 detail no error
> May 18 05:59:50 markir-dev-ostor002 object-server: object replication failure 1 detail no error
> May 18 05:59:50 markir-dev-ostor002 object-server: 2/2 (100.00%) partitions replicated in 0.04s (47.15/sec, 0s remaining)
> May 18 05:59:50 markir-dev-ostor002 object-server: 4 successes, 12 failures
>
> Code (around line 492 of replication.py):
>
>         except (Exception, Timeout):
>             trace = traceback.format_exc()
>             failure_devs_info.update(target_devs_info)
>             self.logger.exception(_("Error syncing partition"))
>         else:
>             trace = "no error"
>         finally:
>             self.stats['success'] += len(target_devs_info - failure_devs_info)
>             self.logger.warning('object replication failure 1 detail %s', trace)
>             self._add_failure_stats(failure_devs_info)  <===============
>             self.partition_times.append(time.time() - begin)
>             self.logger.timing_since('partition.update.timing', begin)
>
> That 'finally' is gonna increment the error count even if there is no
> exception I think, probably should check if an exception actually occurred!
>
> Cheers
>
> Mark
>
> _______________________________________________
> Mailing list: http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack
> Post to     : openstack@lists.openstack.org
> Unsubscribe : http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack
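
For anyone following along, here is a small standalone sketch of the try/except/else/finally shape in the quoted excerpt. It is not the Swift replicator code: sync_partition, failing_sync, the stats dict and the device names are all made up for illustration. It only shows that the 'finally' branch runs on the success path too, which would explain the "detail no error" lines being logged. Note that in this toy the failure count only grows when the failure set is non-empty; whether failure_devs_info gets populated somewhere above the quoted excerpt is not visible here.

```python
import traceback


def sync_partition(sync, target_devs, stats):
    """Toy stand-in for one partition update; every name here is hypothetical."""
    failure_devs = set()
    try:
        sync()
    except Exception:
        # Runs only when sync() raises.
        trace = traceback.format_exc()
        failure_devs.update(target_devs)
    else:
        # Runs only when sync() returns normally.
        trace = "no error"
    finally:
        # Runs in BOTH cases, so anything logged or counted here
        # also happens when nothing went wrong.
        stats["finally_runs"] += 1
        stats["success"] += len(target_devs - failure_devs)
        stats["failure"] += len(failure_devs)
    return trace


def failing_sync():
    raise RuntimeError("simulated sync error")


stats = {"success": 0, "failure": 0, "finally_runs": 0}
targets = {"dev-a", "dev-b"}

ok_trace = sync_partition(lambda: None, targets, stats)    # success path
bad_trace = sync_partition(failing_sync, targets, stats)   # failure path
```

After both calls, finally_runs is 2 even though only one call raised, so a log line placed in 'finally' fires once per partition regardless of outcome. Moving failure-only bookkeeping into the 'except' branch (or guarding it on a non-empty failure set) is one way to keep the two paths apart.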