The admin stop command issues a graceful shutdown to Accumulo for that
tserver. There is a force option you can try (-f / --force) that will
remove the lock. Either is more graceful than a Linux kill -9, which you
may have to use if the admin command doesn't stop the process entirely.

On Wed, Aug 18, 2021 at 7:31 AM Ligade, Shailesh [USA] <[email protected]> wrote:

> Thank you for the good explanation! I really appreciate it.
>
> Yes, I need to remove the hardware, meaning I need to stop everything on
> the server (tserver and datanode).
>
> One quick question:
>
> What is the difference between accumulo admin stop <tserver>:9997 and
> stopping the tserver Linux service?
>
> When I issue admin stop, I can see from the monitor that the hosted
> tablet count for the tserver in question goes down to 0; however, it
> doesn't stop the tserver process or service.
>
> In your steps, you stop the datanode service first (adding it to the
> exclude file, then running refreshNodes, and then stopping the service).
> I was thinking of stopping the Accumulo tserver and letting it hand off
> its hosted tablets first, before touching the datanode. Will there be any
> difference? Just trying to understand the relationship between Accumulo
> and Hadoop.
>
> Thank you!
>
> -S
> ------------------------------
> *From:* [email protected] <[email protected]>
> *Sent:* Tuesday, August 17, 2021 2:39 PM
> *To:* [email protected] <[email protected]>
> *Subject:* [External] RE: how to decommission tablet server
>
> Maybe you could clarify. Decommissioning tablet servers and HDFS
> replication are separate and distinct issues. Accumulo is generally
> unaware of HDFS replication, and tablet assignment does not change the
> HDFS replication. You can set the replication factor for a table, but
> that is used on writes to HDFS; Accumulo assumes that on any successful
> write, HDFS is managing the details.
>
> When a tablet is assigned / migrated, the underlying files in HDFS are
> not changed. The file references are reassigned in a metadata operation,
> but the files themselves are not modified. They will keep whatever
> replication factor was assigned and whatever the namenode decides.
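The stop escalation described at the top of this reply might be scripted roughly as below. This is a dry-run sketch: the host:port is a placeholder, and the script only builds the command lines for review rather than talking to a real cluster.

```shell
#!/bin/sh
# Dry-run sketch of the tserver stop escalation. Placeholder host:port --
# nothing here is executed against a cluster; we only assemble the commands.
TSERVER="tserver1.example.com:9997"

graceful="accumulo admin stop $TSERVER"      # graceful: unloads hosted tablets
forced="accumulo admin stop -f $TSERVER"     # force option: also removes the lock
last_resort="kill -9 <tserver-jvm-pid>"      # only if the process still lingers

printf '%s\n' "$graceful" "$forced" "$last_resort"
```

Per the thread, the graceful form drains hosted tablets to 0 but may leave the JVM running, which is why the force option and, finally, kill -9 are listed as fallbacks.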
>
> If you are removing servers that run both datanode and tserver processes:
>
> If you stop / kill the tserver, the tablets assigned to that server will
> be reassigned rather quickly; it is only a metadata update. The exact
> timing will depend on your ZooKeeper timeout setting, but the "dead"
> tserver should be detected and its tablets reassigned in short order. The
> reassignment may cause some churn of assignments if the cluster becomes
> unbalanced. The manager (master) will select tablets from tservers that
> are over-subscribed and assign them to tservers that have fewer tablets;
> you can monitor the manager (master) debug log to see the migration
> progress. If you want to be gentle, stop a tserver, wait for the number
> of unassigned tablets to hit zero and migration to settle, and then
> repeat.
>
> If you want to stop the datanodes, you can do that independently of
> Accumulo; just follow the Hadoop datanode decommission process. Hadoop
> will move the data blocks assigned to the datanode so that it is "safe"
> to then stop the datanode process. This is independent of Accumulo, and
> Accumulo will not be aware that the blocks are moving. If you are running
> compactions, Accumulo may try to write blocks locally, but if the
> datanode is rejecting new block assignments (which I rather assume it
> would when in decommission mode) then Accumulo still would not care. If
> somehow new blocks were written, it may just delay the Hadoop datanode
> decommissioning.
>
> If you are running ingest while killing tservers, things should mostly
> work. There may be ingest failures, but normally things would get retried
> and the subsequent attempt should succeed. The issue is that if, by bad
> luck, the work keeps getting assigned to tservers that are then killed,
> you could exceed the number of retries and the ingest would fail
> outright. If you can pause ingest, that limits the chance.
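The Hadoop datanode decommission flow mentioned above can be sketched as a dry run. The exclude-file path and hostname are placeholders (the real file is whatever dfs.hosts.exclude points at in hdfs-site.xml); the script only prints the commands so nothing is applied to a live namenode.

```shell
#!/bin/sh
# Dry-run sketch of the Hadoop datanode decommission steps. Placeholder
# path and hostname; the commands are assembled and printed, not executed.
EXCLUDE_FILE="/etc/hadoop/conf/dfs.exclude"
NODE="dn1.example.com"

step1="echo $NODE >> $EXCLUDE_FILE"     # 1. list the node in the exclude file
step2="hdfs dfsadmin -refreshNodes"     # 2. have the namenode re-read the file
step3="hdfs dfsadmin -report"           # 3. poll until the node shows Decommissioned

printf '%s\n' "$step1" "$step2" "$step3"
```

Once the report shows the node as decommissioned, its blocks have been re-replicated elsewhere and the datanode process can be stopped, exactly as the thread describes: Accumulo never sees the block movement.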
> If you can monitor your ingest and know when an ingest failed, you could
> just reschedule the ingest (for bulk import). If you are doing continuous
> ingest, it may be harder to determine whether a specific ingest failed,
> so you may need to select an appropriate range for replay. Overall it may
> mostly work; it will depend on your processes and your tolerance for any
> particular data loss on an ingest.
>
> The modest approach (if you can accept transient errors):
>
> 1 Start the datanode decommission process.
> 2 Pause ingest and cancel any running user compactions.
> 3 Stop a tserver and wait for unassigned tablets to go back to 0. Wait
> for the tablet migration (if any) to quiet down.
> 4 Repeat 3 until all tserver processes have been stopped on the nodes
> you are removing.
> 5 Restart ingest; rerun any user compactions if you stopped any.
> 6 Wait for the HDFS decommission process to finish moving / replicating
> blocks.
> 7 Stop the datanode process.
> 8 Do what you want with the node.
>
> You do not need to schedule downtime if you can accept transient errors.
> Say a client scan is running and that tserver is stopped; the client may
> receive an error for the scan. If the scan is resubmitted and the tablet
> has been reassigned, it should work. It may pause for the reassignment
> and / or time out if the assignment takes some time. You are basically
> playing a numbers game here: the number of tablets, the number of
> unassigned tablets, the odds that a scan would be using a particular
> tablet for the duration that it is unavailable. It's not guaranteed that
> it will fail; it's just that there is a greater-than-zero chance that it
> could. If that is unacceptable, then:
>
> 1 Stop ingest; wait for all to finish, or mark which ones will need to
> be rescheduled.
> 2 Stop Accumulo.
> 3 Remove the tservers from the servers list.
> 4 Start Accumulo without starting the decommissioned tserver nodes.
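Steps 3-4 of the modest approach above, stopping tservers one at a time, could be scripted roughly as follows. The hostnames are placeholders and the commands are only echoed, since the wait condition (unassigned tablets back to 0, migrations quiet) has to be checked against your own monitor or manager debug log.

```shell
#!/bin/sh
# Dry-run sketch of stopping tservers one at a time (steps 3-4 above).
# Placeholder hostnames; echo keeps this a preview rather than a live run.
NODES="node1.example.com node2.example.com"

for node in $NODES; do
    echo "accumulo admin stop ${node}:9997"
    # In a real run, poll the monitor (or manager/master debug log) here and
    # proceed only once unassigned tablets are 0 and migration has settled.
done
```

Going one node at a time keeps the number of simultaneously unassigned tablets small, which is the "numbers game" the reply describes.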
>
> Do what you want with the datanode decommissioning.
>
> The latter approach removes possible transient issues. It is up to you to
> weigh your tolerance for possible transient issues while tservers are
> being stopped against a complete outage for the duration that Accumulo is
> down. If it is a large cluster and just a few tservers, the odds of a
> specific tablet being offline for a short duration may be very low. If it
> is a small cluster, or the percentage of tservers you are stopping is
> large, then the odds increase, but the issues will still be transient.
> You need to decide which is acceptable to you and your circumstances.
>
> *From:* Shailesh Ligade <[email protected]>
> *Sent:* Tuesday, August 17, 2021 11:26 AM
> *To:* [email protected]
> *Subject:* RE: how to decommission tablet server
>
> It would be helpful to know: when you are decommissioning tablets (one at
> a time, for the underlying HDFS to replicate), do we need Accumulo
> downtime? Can Accumulo be ingesting while we are decommissioning tablets?
>
> Thanks
>
> -S
>
> *From:* Shailesh Ligade <[email protected]>
> *Sent:* Tuesday, August 17, 2021 8:52 AM
> *To:* [email protected]
> *Subject:* [EXTERNAL EMAIL] - how to decommission tablet server
>
> Hello,
>
> I am using Accumulo 1.10 and want to remove a few tablet servers.
>
> I saw in the documentation that I need to run
>
> accumulo admin stop <tserver>:9997
>
> That command comes back quickly. I am not sure how long, if at all, I
> have to wait before I stop the tserver service. When is the time to stop
> the datanode service (running on the same tablet server)? And when should
> I update the slaves files (for Accumulo and HDFS)?
>
> Any guidelines on this?
>
> Thanks
>
> -S
