Thanks Ethan. Yes, this could be a simpler solution.
The main idea is to allow the container size limit to be exceeded,
so that all allocated blocks can be finished.
We can change it to something like this:
1. Datanode notices the container is near full.
2. Datanode sends close container action to SC
I believe the flow is:
1. Datanode notices the container is near full.
2. Datanode sends close container action to SCM on its next heartbeat.
3. SCM closes the container and sends a close container command in the
heartbeat response.
4. Datanodes get the response and close the container. If it is a
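To make the flow above concrete, here is a rough sketch of steps 1 to 4 as a toy simulation. It is not Ozone code: the class names, the action and command strings, and the 0.9 near-full ratio are all assumptions made up for illustration, and SCM's side is reduced to recording which containers it has been asked to close.

```python
from dataclasses import dataclass, field

# Assumed threshold for illustration only; not Ozone's actual setting.
NEAR_FULL_RATIO = 0.9


@dataclass
class Container:
    container_id: int
    max_size: int
    used: int = 0
    state: str = "OPEN"

    def is_near_full(self) -> bool:
        return self.used >= NEAR_FULL_RATIO * self.max_size


@dataclass
class Scm:
    # Hypothetical stand-in for SCM: tracks containers it was asked to close.
    closing: set = field(default_factory=set)

    def handle_heartbeat(self, actions):
        # Step 3: record each reported container and reply with a close
        # command in the heartbeat response.
        commands = []
        for action, container_id in actions:
            if action == "CLOSE_CONTAINER_ACTION":
                self.closing.add(container_id)
                commands.append(("CLOSE_CONTAINER_COMMAND", container_id))
        return commands


@dataclass
class Datanode:
    containers: dict
    pending_actions: list = field(default_factory=list)

    def write(self, container_id, nbytes):
        container = self.containers[container_id]
        if container.state != "OPEN":
            raise IOError(f"container {container_id} is {container.state}")
        container.used += nbytes
        # Step 1: notice the container is near full and queue one action.
        action = ("CLOSE_CONTAINER_ACTION", container_id)
        if container.is_near_full() and action not in self.pending_actions:
            self.pending_actions.append(action)

    def heartbeat(self, scm):
        # Step 2: send queued actions on the next heartbeat.
        actions, self.pending_actions = self.pending_actions, []
        # Step 4: apply the close commands carried by the response.
        for command, container_id in scm.handle_heartbeat(actions):
            if command == "CLOSE_CONTAINER_COMMAND":
                self.containers[container_id].state = "CLOSED"
```

In this toy model a container that crosses the threshold still accepts writes until the next heartbeat round trip completes; after that, `write` raises. For example, with `dn = Datanode({1: Container(1, max_size=100)})`, calling `dn.write(1, 95)` leaves the container open, and only `dn.heartbeat(Scm())` closes it.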
> Are you seeing this for Ratis writes or only EC? Have you changed the EC
> pipeline limit to a higher value than 5? I wonder if a lesser number of
> open write pipelines could contribute to this problem too.
This exception is reproducible in both RATIS and EC.
https://paste.ubuntu.com/p/NjpQ
Are you seeing this for Ratis writes or only EC? Have you changed the EC
pipeline limit to a higher value than 5? I wonder if a lesser number of
open write pipelines could contribute to this problem too.
On Thu, Sep 8, 2022 at 3:35 AM Kaijie Chen wrote:
> Thanks Stephen for explaining,
>
> > I