[jira] [Commented] (KUDU-2233) Check failure during compactions: pv_delete_redo != nullptr

2018-02-26 Thread Oleksandra Klevets (JIRA)

[ 
https://issues.apache.org/jira/browse/KUDU-2233?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16377093#comment-16377093
 ] 

Oleksandra Klevets commented on KUDU-2233:
--

I am seeing the same issue with Kudu 1.5.

> Check failure during compactions: pv_delete_redo != nullptr
> ---
>
> Key: KUDU-2233
> URL: https://issues.apache.org/jira/browse/KUDU-2233
> Project: Kudu
> Issue Type: Bug
> Components: tablet, tserver
> Affects Versions: 1.4.0
> Reporter: Andrew Wong
> Assignee: David Alves
> Priority: Major
>
> There have been a couple of reports of a check failure during compactions, 
> going back to at least 1.4. The log output is pasted below:
> {noformat}
> F1201 14:55:37.052140 10508 compaction.cc:756] Check failed: pv_delete_redo != nullptr
>  *** Check failure stack trace: ***
>  Wrote minidump to /var/log/kudu/minidumps/kudu-tserver/215cde39-7795-0885-0b51038d-771d875e.dmp
>  *** Aborted at 1512161737 (unix time) try "date -d @1512161737" if you are using GNU date ***
>  PC: @ 0x3ec3632625 (unknown)
>  *** SIGABRT (@0x3b98eec028e3) received by PID 10467 (TID 0x7f8b02c58700) from PID 10467; stack trace: ***
>  @ 0x3ec3a0f7e0 (unknown)
>  @ 0x3ec3632625 (unknown)
>  @ 0x3ec3633e05 (unknown)
>  @ 0x1b53f59 (unknown)
>  @ 0x8b9f6d google::LogMessage::Fail()
>  @ 0x8bbe2d google::LogMessage::SendToLog()
>  @ 0x8b9aa9 google::LogMessage::Flush()
>  @ 0x8bc8cf google::LogMessageFatal::~LogMessageFatal()
>  @ 0x9db0fe kudu::tablet::FlushCompactionInput()
>  @ 0x9a056a kudu::tablet::Tablet::DoMergeCompactionOrFlush()
>  @ 0x9a372d kudu::tablet::Tablet::Compact()
>  @ 0x9bd8d1 kudu::tablet::CompactRowSetsOp::Perform()
>  @ 0x1b4145f kudu::MaintenanceManager::LaunchOp()
>  @ 0x1b8da06 kudu::ThreadPool::DispatchThread()
>  @ 0x1b888ea kudu::Thread::SuperviseThread()
>  @ 0x3ec3a07aa1 (unknown)
>  @ 0x3ec36e893d (unknown)
>  @ 0x0 (unknown)
> {noformat}
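
For anyone checking whether their own tserver logs show the same crash, a minimal Python sketch follows. It assumes the usual glog line prefix (severity letter, MMDD, time with microseconds, thread id, file:line]) that the fatal line quoted above appears to follow. The names `parse_glog_line` and `is_pv_delete_redo_crash` are hypothetical helpers for illustration and are not part of Kudu. The last lines convert the "Aborted at" unix time from the trace, which is the same thing the suggested `date -d @1512161737` does.

```python
import re
from datetime import datetime, timezone

# Assumed glog prefix: severity, MMDD, HH:MM:SS.micros, thread id, file:line]
GLOG_RE = re.compile(
    r"^(?P<severity>[IWEF])(?P<month>\d{2})(?P<day>\d{2}) "
    r"(?P<time>\d{2}:\d{2}:\d{2}\.\d{6})\s+(?P<tid>\d+) "
    r"(?P<file>[^:\s]+):(?P<line>\d+)\] (?P<message>.*)$"
)

def parse_glog_line(line):
    """Return a dict of fields for a glog-formatted line, or None if no match."""
    m = GLOG_RE.match(line.strip())
    if m is None:
        return None
    fields = m.groupdict()
    fields["line"] = int(fields["line"])
    fields["tid"] = int(fields["tid"])
    return fields

def is_pv_delete_redo_crash(line):
    """True if the line is a FATAL entry carrying this ticket's check failure."""
    fields = parse_glog_line(line)
    return (fields is not None
            and fields["severity"] == "F"
            and "pv_delete_redo != nullptr" in fields["message"])

# The fatal line from the report, joined back onto a single line.
sample = ("F1201 14:55:37.052140 10508 compaction.cc:756] "
          "Check failed: pv_delete_redo != nullptr")
fields = parse_glog_line(sample)
print(fields["file"], fields["line"], is_pv_delete_redo_crash(sample))

# The "Aborted at" unix time from the trace, rendered in UTC.
abort_time = datetime.fromtimestamp(1512161737, tz=timezone.utc)
print(abort_time.isoformat())
```

To scan a real log, one could call `is_pv_delete_redo_crash` on each line of the tserver's FATAL or INFO log file; the log location depends on the deployment and is not assumed here.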



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (KUDU-2278) Improve IO for writing deltas

2018-02-26 Thread Adar Dembo (JIRA)

[ 
https://issues.apache.org/jira/browse/KUDU-2278?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16377387#comment-16377387
 ] 

Adar Dembo commented on KUDU-2278:
--

The same is true for flushing an MRS, right? I would imagine that like MRS 
flushes, DMS flushes are deferred until the DMS is sufficiently "meaty". Is the 
DMS flush policy different than the MRS flush policy?


> Improve IO for writing deltas
> -
>
> Key: KUDU-2278
> URL: https://issues.apache.org/jira/browse/KUDU-2278
> Project: Kudu
> Issue Type: Improvement
> Components: cfile, tablet
> Reporter: Andrew Wong
> Priority: Major
>
> Today, writing new deltas entails rewriting entire tablet metadata files in 
> order to track the newly-created block ids. Even if the delta were on the 
> order of kilobytes, the tablet metadata files could be on the order of 
> megabytes, so the cost of persisting this small amount of data is high 
> relative to its size.
> This could be improved by batching such delta flushes, or by revamping tablet 
> metadata entirely to batch any operations that require metadata updates.
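
To make the relative cost in the description concrete, here is a rough back-of-the-envelope sketch in Python. The sizes (a 4 KiB delta and a 2 MiB metadata file) are illustrative assumptions in the "kilobytes" and "megabytes" ranges the description mentions, not measurements, and `write_amplification` is a hypothetical helper rather than anything in Kudu. The batching model, in which several equally sized delta flushes share one metadata rewrite, is also a simplification.

```python
KIB = 1024
MIB = 1024 * KIB

def write_amplification(delta_bytes, metadata_bytes, flushes_per_rewrite=1):
    """Bytes written to disk per byte of delta data.

    Assumes that recording new block ids rewrites the whole metadata file,
    and that `flushes_per_rewrite` equally sized delta flushes share a
    single rewrite when batched.
    """
    if delta_bytes <= 0 or flushes_per_rewrite < 1:
        raise ValueError("delta_bytes must be > 0 and flushes_per_rewrite >= 1")
    delta_total = delta_bytes * flushes_per_rewrite
    return (delta_total + metadata_bytes) / delta_total

# Illustrative sizes only: 4 KiB of deltas against a 2 MiB metadata file.
unbatched = write_amplification(4 * KIB, 2 * MIB)
batched = write_amplification(4 * KIB, 2 * MIB, flushes_per_rewrite=16)
print(f"unbatched: {unbatched:.0f}x, batched by 16: {batched:.0f}x")
```

Under these assumed sizes, batching sixteen flushes per metadata rewrite cuts the bytes written per byte of delta from 513x to 33x, which is the kind of saving the proposal is after.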





[jira] [Commented] (KUDU-2278) Improve IO for writing deltas

2018-02-26 Thread Todd Lipcon (JIRA)

[ 
https://issues.apache.org/jira/browse/KUDU-2278?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16377424#comment-16377424
 ] 

Todd Lipcon commented on KUDU-2278:
---

The difference is that a random insert workload will just populate one MRS per 
tablet, so it can reach a large size before having to flush. A random 
update/upsert workload, however, will spread over hundreds of DMS (one per 
rowset), so they'll often have to start flushing when individual DMS are quite 
small.
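
A small sketch of this arithmetic in Python, assuming a fixed memory budget at which flushing starts and writes spread evenly across the in-memory stores. The 128 MiB budget and the 256 rowsets are made-up numbers for illustration, and `avg_store_size_at_flush` is a hypothetical helper; Kudu's actual flush policy is more involved than this.

```python
MIB = 1024 * 1024

def avg_store_size_at_flush(memory_budget_bytes, num_stores):
    """Average size of each in-memory store once all stores together reach
    the budget, assuming writes are spread evenly across `num_stores`."""
    if num_stores < 1:
        raise ValueError("num_stores must be >= 1")
    return memory_budget_bytes / num_stores

budget = 128 * MIB  # assumed budget, for illustration only

# Random inserts: a single MRS per tablet absorbs the whole budget.
mrs_size = avg_store_size_at_flush(budget, 1)

# Random updates/upserts: one DMS per rowset, here 256 rowsets.
dms_size = avg_store_size_at_flush(budget, 256)

print(f"MRS at flush: {mrs_size / MIB:.1f} MiB")
print(f"each DMS at flush: {dms_size / MIB:.1f} MiB")
```

With these assumed numbers the single MRS reaches 128 MiB before flushing, while each DMS holds only about 0.5 MiB when flushing has to begin.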



[jira] [Commented] (KUDU-2233) Check failure during compactions: pv_delete_redo != nullptr

2018-02-26 Thread Todd Lipcon (JIRA)

[ 
https://issues.apache.org/jira/browse/KUDU-2233?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16377429#comment-16377429
 ] 

Todd Lipcon commented on KUDU-2233:
---

[~elisska] - if you are building from source, you can try patching in 
https://gerrit.cloudera.org/c/9436/ . If you are able to try it, please let us 
know whether it fixes your issue.



[jira] [Commented] (KUDU-2233) Check failure during compactions: pv_delete_redo != nullptr

2018-02-26 Thread Oleksandra Klevets (JIRA)

[ 
https://issues.apache.org/jira/browse/KUDU-2233?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16377451#comment-16377451
 ] 

Oleksandra Klevets commented on KUDU-2233:
--

I am using the Cloudera 5.13 distribution. Are there any other possible 
workarounds for this issue?



[jira] [Commented] (KUDU-2233) Check failure during compactions: pv_delete_redo != nullptr

2018-02-26 Thread Todd Lipcon (JIRA)

[ 
https://issues.apache.org/jira/browse/KUDU-2233?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16377939#comment-16377939
 ] 

Todd Lipcon commented on KUDU-2233:
---

Unfortunately, we're not aware of any workaround other than patching the code 
once the problem has happened. Setting --log-min-segments-to-retain=2 instead 
of the default 1 may help avoid the issue, but once it has occurred, a patch is 
required. We can't provide support for downstream vendors in the context of the 
Apache JIRA, so please get in touch with Cloudera to find out more about their 
releases.
