[
https://issues.apache.org/jira/browse/CASSANDRA-18781?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17763718#comment-17763718
]
Stefan Miklosovic edited comment on CASSANDRA-18781 at 9/11/23 1:30 PM:
------------------------------------------------------------------------
I think we have a problem. When a stream fails, sstableloader logs this to the
console:
{code}
ERROR [Stream-Deserializer-/127.0.0.1:7000-4476be2a] 2023-09-11 15:19:41,234 StreamSession.java:1128 - [Stream #dd5cdaa0-50a5-11ee-be1f-097a59d811d7] Remote peer /127.0.0.1:7000 failed stream session.
INFO [NonPeriodicTasks:1] 2023-09-11 15:19:41,236 StreamResultFuture.java:201 - [Stream #dd5cdaa0-50a5-11ee-be1f-097a59d811d7] Session with /127.0.0.1:7000 is failed
progress: total: 100% 0.000B/s (avg: 0.000B/s)
WARN [NonPeriodicTasks:1] 2023-09-11 15:19:41,241 StreamResultFuture.java:250 - [Stream #dd5cdaa0-50a5-11ee-be1f-097a59d811d7] Stream failed:
Session peer /127.0.0.1:7000 Remote peer /127.0.0.1:7000 failed stream session
ERROR [NonPeriodicTasks:1] 2023-09-11 15:21:19,086 JVMStabilityInspector.java:70 - Exception in thread Thread[NonPeriodicTasks:1,5,NonPeriodicTasks]
java.lang.AssertionError: for sstable = BigTableReader:big(path='/tmp/load/ks/tb/nc-5-big-Data.db'), ref count = 1
    at org.apache.cassandra.io.sstable.SSTableLoader.releaseReferences(SSTableLoader.java:249)
    at org.apache.cassandra.io.sstable.SSTableLoader.onFailure(SSTableLoader.java:236)
    at org.apache.cassandra.utils.concurrent.ListenerList$CallbackListener.run(ListenerList.java:213)
    at org.apache.cassandra.concurrent.ImmediateExecutor.execute(ImmediateExecutor.java:140)
    at org.apache.cassandra.utils.concurrent.ListenerList.safeExecute(ListenerList.java:166)
    at org.apache.cassandra.utils.concurrent.ListenerList.notifyListener(ListenerList.java:157)
    at org.apache.cassandra.utils.concurrent.ListenerList$CallbackListener.notifySelf(ListenerList.java:219)
    at org.apache.cassandra.utils.concurrent.ListenerList.lambda$notifyExclusive$0(ListenerList.java:124)
    at org.apache.cassandra.utils.concurrent.IntrusiveStack.forEach(IntrusiveStack.java:195)
    at org.apache.cassandra.utils.concurrent.ListenerList.notifyExclusive(ListenerList.java:124)
    at org.apache.cassandra.utils.concurrent.ListenerList.notify(ListenerList.java:96)
    at org.apache.cassandra.utils.concurrent.AsyncFuture.trySet(AsyncFuture.java:104)
    at org.apache.cassandra.utils.concurrent.AbstractFuture.tryFailure(AbstractFuture.java:148)
    at org.apache.cassandra.streaming.StreamResultFuture.maybeComplete(StreamResultFuture.java:251)
    at org.apache.cassandra.streaming.StreamResultFuture.handleSessionComplete(StreamResultFuture.java:205)
    at org.apache.cassandra.streaming.StreamSession.lambda$closeSession$2(StreamSession.java:545)
    at org.apache.cassandra.concurrent.FutureTask$1.call(FutureTask.java:96)
    at org.apache.cassandra.concurrent.FutureTask.call(FutureTask.java:61)
    at org.apache.cassandra.concurrent.FutureTask.run(FutureTask.java:71)
    at java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:515)
    at java.base/java.util.concurrent.FutureTask.run$$$capture(FutureTask.java:264)
    at java.base/java.util.concurrent.FutureTask.run(FutureTask.java)
    at java.base/java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:304)
    at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
    at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
    at io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30)
    at java.base/java.lang.Thread.run(Thread.java:829)
{code}
Notice
{code}
java.lang.AssertionError: for sstable = BigTableReader:big(path='/tmp/load/ks/tb/nc-5-big-Data.db'), ref count = 1
{code}
This is the code it executes on failure:
https://github.com/instaclustr/cassandra/blob/trunk/src/java/org/apache/cassandra/io/sstable/SSTableLoader.java#L232-L248
So the problem is that, for some reason, one more object still references that
SSTable, and that reference has not been cleaned up by the time we throw that
exception. I am not sure why this is happening. On the happy path (the same
method is called on success to clean up the references, if you notice),
sstable.selfRef().globalCount() returns 0, but not on failure.
While this is "harmless" when sstableloader is run as a tool, I do not think it
is safe if it is called programmatically; we would basically leak the
references.
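To illustrate the kind of check that fires here, below is a minimal sketch of a
release-then-assert cleanup over a reference-counted resource. It is not
Cassandra's code: CountedResource, releaseAndCheck and simulate are hypothetical
names, and the "second holder" that keeps a reference on the failure path is
purely an assumption for the sake of the example, since the actual cause is not
known yet.

```java
import java.util.concurrent.atomic.AtomicInteger;

public class RefLeakSketch {
    /** Hypothetical stand-in for a reference-counted SSTable; not Cassandra's Ref or SSTableReader. */
    static final class CountedResource {
        // Starts at 1: the owner's own ("self") reference.
        private final AtomicInteger globalCount = new AtomicInteger(1);
        private final String name;

        CountedResource(String name) { this.name = name; }

        void acquire() { globalCount.incrementAndGet(); }

        void release() {
            if (globalCount.decrementAndGet() < 0)
                throw new IllegalStateException("over-released " + name);
        }

        int globalCount() { return globalCount.get(); }
    }

    /** Releases the owner's reference, then requires that nobody else still holds one. */
    static void releaseAndCheck(CountedResource r) {
        r.release();
        int remaining = r.globalCount();
        if (remaining != 0)
            throw new AssertionError("for sstable = " + r.name + ", ref count = " + remaining);
    }

    /** Returns the error message if the cleanup check fails, or null if it passes. */
    static String simulate(boolean secondHolderReleases) {
        CountedResource sstable = new CountedResource("nc-5-big-Data.db");
        sstable.acquire(); // assumption: some other component took a reference
        if (secondHolderReleases)
            sstable.release(); // happy path: the other holder gives its reference back
        try {
            releaseAndCheck(sstable);
            return null;
        } catch (AssertionError e) {
            return e.getMessage();
        }
    }

    public static void main(String[] args) {
        System.out.println("success path: " + simulate(true));
        System.out.println("failure path: " + simulate(false));
    }
}
```

In this sketch the failure path ends with a count of 1, which is the same shape
as the assertion message in the log above. Whoever calls the cleanup
programmatically would be left with a resource that is never fully released.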
> Add the ability to disable bulk loading of SSTables on a node
> -------------------------------------------------------------
>
> Key: CASSANDRA-18781
> URL: https://issues.apache.org/jira/browse/CASSANDRA-18781
> Project: Cassandra
> Issue Type: Improvement
> Components: Tool/bulk load
> Reporter: Runtian Liu
> Assignee: Runtian Liu
> Priority: Normal
> Fix For: 5.x
>
> Time Spent: 1h 50m
> Remaining Estimate: 0h
>
> Currently, Cassandra database users can use sstableloader to bulk load data
> into Cassandra. However, for a Cassandra operator, there is no way to
> forcibly block this behavior. Additionally, there is no metric indicating
> whether the bulk load is being used on the server side. If a client is using
> sstableloader, they will also need to upgrade the sstableloader code to the
> new major version. This lack of control and visibility can become a blocker
> during a major version upgrade.
>
> 1. Can we add a config to disable the bulk load feature? Or does it fall under
> https://issues.apache.org/jira/browse/CASSANDRA-8303
> 2. Can we add metrics for bulk load usage on the server end?