Extending the same thought from Stephen. If you are going to add a small
delay, it is better to do it via a Lease.

So SCM could offer a lease for 60 seconds, with a provision to reacquire
the lease one more time.
This does mean that a single container inside the data node technically
could become larger than 5GB (but that is possible even today).
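
To make the idea concrete, here is a minimal sketch of a 60-second lease
that can be reacquired once. CloseLease and its methods are hypothetical
names for illustration only; this is not the existing Ozone lease code, and
the caller is assumed to supply the clock value.

```java
import java.time.Duration;

/** Hypothetical sketch of a renewable lease; not an existing Ozone class. */
final class CloseLease {
    private final long durationMillis;
    private final int maxRenewals;
    private long expiresAtMillis;
    private int renewalsUsed;

    CloseLease(Duration duration, int maxRenewals, long nowMillis) {
        this.durationMillis = duration.toMillis();
        this.maxRenewals = maxRenewals;
        this.expiresAtMillis = nowMillis + durationMillis;
    }

    /**
     * Reacquire the lease. Succeeds only if the lease has not yet expired
     * and the renewal budget is not used up.
     */
    boolean renew(long nowMillis) {
        if (isExpired(nowMillis) || renewalsUsed >= maxRenewals) {
            return false;
        }
        renewalsUsed++;
        expiresAtMillis = nowMillis + durationMillis;
        return true;
    }

    /** Once this returns true, the holder would go ahead and close the container. */
    boolean isExpired(long nowMillis) {
        return nowMillis >= expiresAtMillis;
    }
}
```

With a 60-second duration and one renewal, the close is held back for just
under 120 seconds in the worst case, which bounds how far a container can
grow past its size limit.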

I do think a lease or a timeout-based approach (as suggested by Stephen)
might be easier than pre-allocating blocks.

Thanks
Anu


On Fri, Sep 9, 2022 at 12:47 AM Stephen O'Donnell
<sodonn...@cloudera.com.invalid> wrote:

> > 4. Datanode waits until there are no write commands to this container,
> > then closes it.
>
> This could be done on SCM, with a simple delay, i.e. hold back the close
> commands for a "normal" close for some configurable amount of time. E.g. if
> we hold for 60 seconds, it is likely almost all blocks will get written. If
> a very small number fail, it is likely OK.
>
> On Fri, Sep 9, 2022 at 5:01 AM Kaijie Chen <c...@apache.org> wrote:
>
> > Thanks Ethan. Yes this could be a simpler solution.
> > The main idea is allowing container size limit to be exceeded,
> > to ensure all allocated blocks can be finished.
> >
> > We can change it to something like this:
> > 1. Datanode notices the container is near full.
> > 2. Datanode sends close container action to SCM immediately.
> > 3. SCM closes the container and stops allocating new blocks in it.
> > 4. Datanode waits until there are no write commands to this container,
> > then closes it.
> >
> > It's still okay to wait for the next heartbeat in step 2.
> > Step 4 is a little bit tricky: we need a lease or timeout to determine
> > how long to wait.
> >
> > Kaijie
> >
> >  ---- On Fri, 09 Sep 2022 08:54:34 +0800  Ethan Rose  wrote ---
> >  > I believe the flow is:
> >  > 1. Datanode notices the container is near full.
> >  > 2. Datanode sends close container action to SCM on its next heartbeat.
> >  > 3. SCM closes the container and sends a close container command on the
> >  > heartbeat response.
> >  > 4. Datanodes get the response and close the container. If it is a Ratis
> >  > container, the leader will send the close via Ratis.
> >  >
> >  > There is a "grace period" of sorts between steps 1 and 2, but this does
> >  > not help the situation because SCM does not stop issuing blocks to this
> >  > container until after step 3. Perhaps some amount of pause between steps 3
> >  > and 4 would help, either on the SCM or datanode side. This would provide a
> >  > "grace period" between when SCM stops allocating blocks for the container
> >  > and when the container is actually closed. I'm not sure exactly how this
> >  > would be implemented in the code given the current setup, but it seems like
> >  > a simple option we should try before other more complicated solutions.
> >  >
> >  > Ethan
> >
> >
> > ---------------------------------------------------------------------
> > To unsubscribe, e-mail: dev-unsubscr...@ozone.apache.org
> > For additional commands, e-mail: dev-h...@ozone.apache.org
> >
> >
>
