[jira] [Commented] (KUDU-2984) memory_gc-itest is flaky

2019-10-24 Thread Yingchun Lai (Jira)


[ 
https://issues.apache.org/jira/browse/KUDU-2984?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16959357#comment-16959357
 ] 

Yingchun Lai commented on KUDU-2984:


I'll take a look.

> memory_gc-itest is flaky
> 
>
> Key: KUDU-2984
> URL: https://issues.apache.org/jira/browse/KUDU-2984
> Project: Kudu
>  Issue Type: Bug
>Affects Versions: 1.11.0, 1.12.0
>Reporter: Alexey Serbin
>Priority: Minor
> Attachments: memory_gc-itest.txt.xz
>
>
> The {{memory_gc-itest}} fails from time to time with the following error 
> message (DEBUG build):
> {noformat}
> src/kudu/integration-tests/memory_gc-itest.cc:117: Failure
> Expected: (ratio) >= (0.1), actual: 0.0600604 vs 0.1
> tserver-2
> src/kudu/util/test_util.cc:339: Failure
> Failed
> Timed out waiting for assertion to pass.
> {noformat}
> The full log is attached.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (KUDU-2984) memory_gc-itest is flaky

2019-10-25 Thread Yingchun Lai (Jira)


[ 
https://issues.apache.org/jira/browse/KUDU-2984?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16960240#comment-16960240
 ] 

Yingchun Lai commented on KUDU-2984:


I found 2 problems that may make this test flaky:
 # The test table has only 1 tablet (by default) and 1 replica, so data is 
written to only 1 tserver; the other 2 may not consume much memory when we run 
the scan workload (see the sketch below).
 # For tserver-1, the scan workload runs together with the periodic memory GC; 
in a corner case, GC may always take place after the CHECK, so the CHECK always 
fails.
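
For illustration, a minimal sketch of the kind of table setup that would avoid 
problem #1 (hypothetical test code, not the actual patch; "client" is an 
existing kudu::client::KuduClient and "schema" a KuduSchema whose key column 
is named "key"):
{code:java}
// Create the test table with several hash partitions and 3 replicas so
// that the write/scan workload touches all three tservers.
using kudu::client::KuduTableCreator;

std::unique_ptr<KuduTableCreator> creator(client->NewTableCreator());
kudu::Status s = creator->table_name("memory_gc_test_table")  // illustrative name
    .schema(&schema)
    .add_hash_partitions({ "key" }, 3)  // spread rows across 3 tablets
    .num_replicas(3)                    // place one replica on each tserver
    .Create();
{code}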

I'm sorry that this test is still flaky after 2 fix patches :(. Now I'm trying 
to fix it again in [https://gerrit.cloudera.org/c/14553/], repeating the test 
thousands of times on Ubuntu 18.04. I'll do more testing before I try to merge 
it.

> memory_gc-itest is flaky
> 
>
> Key: KUDU-2984
> URL: https://issues.apache.org/jira/browse/KUDU-2984
> Project: Kudu
>  Issue Type: Bug
>Affects Versions: 1.11.0, 1.12.0
>Reporter: Alexey Serbin
>Assignee: Yingchun Lai
>Priority: Minor
> Attachments: memory_gc-itest.txt.xz
>
>
> The {{memory_gc-itest}} fails from time to time with the following error 
> message (DEBUG build):
> {noformat}
> src/kudu/integration-tests/memory_gc-itest.cc:117: Failure
> Expected: (ratio) >= (0.1), actual: 0.0600604 vs 0.1
> tserver-2
> src/kudu/util/test_util.cc:339: Failure
> Failed
> Timed out waiting for assertion to pass.
> {noformat}
> The full log is attached.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Comment Edited] (KUDU-2984) memory_gc-itest is flaky

2019-10-25 Thread Yingchun Lai (Jira)


[ 
https://issues.apache.org/jira/browse/KUDU-2984?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16960240#comment-16960240
 ] 

Yingchun Lai edited comment on KUDU-2984 at 10/26/19 3:01 AM:
--

I found 2 problems that may make this test flaky:
 # The test table has only 1 tablet (by default) and 1 replica, so data is 
written to only 1 tserver; the other 2 may not consume much memory when we run 
the scan workload.
 # For tserver-1, the scan workload runs together with the periodic memory GC; 
in a corner case, GC may always take place after the CHECK, so the CHECK always 
fails.

I'm sorry that this test is still flaky after 2 fix patches :(. Now I'm trying 
to fix it again in [https://gerrit.cloudera.org/c/14553/]; I have repeated the 
test thousands of times on Ubuntu 18.04 and all runs passed. However, I'll do 
more testing before I try to merge it.


was (Author: acelyc111):
I found 2 problems that may make this test flaky:
 # The test table has only 1 tablet (by default) and 1 replica, so data is 
written to only 1 tserver; the other 2 may not consume much memory when we run 
the scan workload.
 # For tserver-1, the scan workload runs together with the periodic memory GC; 
in a corner case, GC may always take place after the CHECK, so the CHECK always 
fails.

I'm sorry that this test is still flaky after 2 fix patches :(. Now I'm trying 
to fix it again in [https://gerrit.cloudera.org/c/14553/], repeating the test 
thousands of times on Ubuntu 18.04. I'll do more testing before I try to merge 
it.

> memory_gc-itest is flaky
> 
>
> Key: KUDU-2984
> URL: https://issues.apache.org/jira/browse/KUDU-2984
> Project: Kudu
>  Issue Type: Bug
>Affects Versions: 1.11.0, 1.12.0
>Reporter: Alexey Serbin
>Assignee: Yingchun Lai
>Priority: Minor
> Attachments: memory_gc-itest.txt.xz
>
>
> The {{memory_gc-itest}} fails from time to time with the following error 
> message (DEBUG build):
> {noformat}
> src/kudu/integration-tests/memory_gc-itest.cc:117: Failure
> Expected: (ratio) >= (0.1), actual: 0.0600604 vs 0.1
> tserver-2
> src/kudu/util/test_util.cc:339: Failure
> Failed
> Timed out waiting for assertion to pass.
> {noformat}
> The full log is attached.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (KUDU-2879) Build hangs in DEBUG type on Ubuntu 18.04

2019-11-02 Thread Yingchun Lai (Jira)


[ 
https://issues.apache.org/jira/browse/KUDU-2879?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16965333#comment-16965333
 ] 

Yingchun Lai commented on KUDU-2879:


I'm building on an x86 machine.

I did nothing specific about this, but after some OS upgrades, I can now build 
it.

> Build hangs in DEBUG type on Ubuntu 18.04
> -
>
> Key: KUDU-2879
> URL: https://issues.apache.org/jira/browse/KUDU-2879
> Project: Kudu
>  Issue Type: Improvement
>Reporter: Yingchun Lai
>Priority: Major
> Attachments: config.diff, config.log
>
>
> A few months ago, I reported this issue on Slack: 
> [https://getkudu.slack.com/archives/C0CPXJ3CH/p1549942641041600]
> I switched to the RELEASE build type from then on, and hadn't tried a DEBUG 
> build in my Ubuntu environment since.
> Now, when I tried a DEBUG build to check 1.10.0-RC2, this issue occurred 
> again.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Closed] (KUDU-2879) Build hangs in DEBUG type on Ubuntu 18.04

2019-11-02 Thread Yingchun Lai (Jira)


 [ 
https://issues.apache.org/jira/browse/KUDU-2879?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yingchun Lai closed KUDU-2879.
--
Resolution: Fixed

> Build hangs in DEBUG type on Ubuntu 18.04
> -
>
> Key: KUDU-2879
> URL: https://issues.apache.org/jira/browse/KUDU-2879
> Project: Kudu
>  Issue Type: Improvement
>Reporter: Yingchun Lai
>Priority: Major
> Attachments: config.diff, config.log
>
>
> A few months ago, I reported this issue on Slack: 
> [https://getkudu.slack.com/archives/C0CPXJ3CH/p1549942641041600]
> I switched to the RELEASE build type from then on, and hadn't tried a DEBUG 
> build in my Ubuntu environment since.
> Now, when I tried a DEBUG build to check 1.10.0-RC2, this issue occurred 
> again.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (KUDU-2992) Limit concurrent alter request of a table

2019-11-04 Thread Yingchun Lai (Jira)
Yingchun Lai created KUDU-2992:
--

 Summary: Limit concurrent alter request of a table
 Key: KUDU-2992
 URL: https://issues.apache.org/jira/browse/KUDU-2992
 Project: Kudu
  Issue Type: Improvement
  Components: master
Reporter: Yingchun Lai


One of our production clusters had an incident some days ago. A user has a 
table whose partition schema looks like:
{code:java}
HASH (uuid) PARTITIONS 80,RANGE (date_hour) (
PARTITION 2019102900 <= VALUES < 2019102901,
PARTITION 2019102901 <= VALUES < 2019102902,
PARTITION 2019102902 <= VALUES < 2019102903,
PARTITION 2019102903 <= VALUES < 2019102904,
PARTITION 2019102904 <= VALUES < 2019102905,
...)
{code}
He tried to remove many outdated partitions at once via SparkSQL, but it 
returned a timeout error at first; then he tried again and again, and SparkSQL 
failed again and again. The cluster then became unstable, with memory usage and 
CPU load increasing.

 

I found many logs like the following:
{code:java}
W1030 17:29:53.382287  7588 rpcz_store.cc:259] Trace:
1030 17:26:19.714799 (+ 0us) service_pool.cc:162] Inserting onto call queue
1030 17:26:19.714808 (+ 9us) service_pool.cc:221] Handling call
1030 17:29:53.382204 (+213667396us) ts_tablet_manager.cc:874] Deleting tablet 
c52c5f43f7884d08b07fd0005e878fed
1030 17:29:53.382205 (+ 1us) ts_tablet_manager.cc:794] Acquired tablet 
manager lock
1030 17:29:53.382208 (+ 3us) inbound_call.cc:162] Queueing success response
Metrics: {"tablet-delete.queue_time_us":213667360}
W1030 17:29:53.382300  7586 rpcz_store.cc:253] Call 
kudu.tserver.TabletServerAdminService.DeleteTablet from 10.152.49.21:55576 
(request call id 1820316) took 213667 ms (3.56 min). Client timeout 2 ms 
(30 s)
W1030 17:29:53.382292 10623 rpcz_store.cc:253] Call 
kudu.tserver.TabletServerAdminService.DeleteTablet from 10.152.49.21:55576 
(request call id 1820315) took 213667 ms (3.56 min). Client timeout 2 ms 
(30 s)
W1030 17:29:53.382297 10622 rpcz_store.cc:259] Trace:
1030 17:26:19.714825 (+ 0us) service_pool.cc:162] Inserting onto call queue
1030 17:26:19.714833 (+ 8us) service_pool.cc:221] Handling call
1030 17:29:53.382239 (+213667406us) ts_tablet_manager.cc:874] Deleting tablet 
479f8c592f16408c830637a0129359e1
1030 17:29:53.382241 (+ 2us) ts_tablet_manager.cc:794] Acquired tablet 
manager lock
1030 17:29:53.382244 (+ 3us) inbound_call.cc:162] Queueing success response
Metrics: {"tablet-delete.queue_time_us":213667378}
{code}
That means the 'Acquired tablet manager lock' step cost a lot of time, right?
{code:java}
Status TSTabletManager::BeginReplicaStateTransition(
    const string& tablet_id,
    const string& reason,
    scoped_refptr<TabletReplica>* replica,
    scoped_refptr<TransitionInProgressDeleter>* deleter,
    TabletServerErrorPB::Code* error_code) {
  // Acquire the lock in exclusive mode as we'll add an entry to the
  // transition_in_progress_ map.
  std::lock_guard<RWMutex> lock(lock_);
  TRACE("Acquired tablet manager lock");
  RETURN_NOT_OK(CheckRunningUnlocked(error_code));
  ...
}{code}
But I think the root cause is that the Kudu master sends too many duplicate 
'alter table/delete tablet' requests to the tserver. I found more info in the 
master's log:
{code:java}
$ grep "Scheduling retry of 8f8b354490684bf3a54e49a1478ec99d" 
kudu_master.zjy-hadoop-prc-ct01.bj.work.log.INFO.20191030-204137.62788 | egrep 
"attempt = 1\)"
I1030 20:41:42.207222 62821 catalog_manager.cc:2971] Scheduling retry of 
8f8b354490684bf3a54e49a1478ec99d Delete Tablet RPC for 
TS=d50ddd2e763e4d5e81828a3807187b2e with a delay of 43 ms (attempt = 1)
I1030 20:41:42.207556 62821 catalog_manager.cc:2971] Scheduling retry of 
8f8b354490684bf3a54e49a1478ec99d Delete Tablet RPC for 
TS=d50ddd2e763e4d5e81828a3807187b2e with a delay of 40 ms (attempt = 1)
I1030 20:41:42.260052 62821 catalog_manager.cc:2971] Scheduling retry of 
8f8b354490684bf3a54e49a1478ec99d Delete Tablet RPC for 
TS=d50ddd2e763e4d5e81828a3807187b2e with a delay of 31 ms (attempt = 1)
I1030 20:41:42.278609 62821 catalog_manager.cc:2971] Scheduling retry of 
8f8b354490684bf3a54e49a1478ec99d Delete Tablet RPC for 
TS=d50ddd2e763e4d5e81828a3807187b2e with a delay of 19 ms (attempt = 1)
I1030 20:41:42.312175 62821 catalog_manager.cc:2971] Scheduling retry of 
8f8b354490684bf3a54e49a1478ec99d Delete Tablet RPC for 
TS=d50ddd2e763e4d5e81828a3807187b2e with a delay of 48 ms (attempt = 1)
I1030 20:41:42.318933 62821 catalog_manager.cc:2971] Scheduling retry of 
8f8b354490684bf3a54e49a1478ec99d Delete Tablet RPC for 
TS=d50ddd2e763e4d5e81828a3807187b2e with a delay of 62 ms (attempt = 1)
I1030 20:41:42.340060 62821 catalog_manager.cc:2971] Scheduling retry of 
8f8b354490684bf3a54e49a1478ec99d Delete Tablet RPC for 
TS=d50ddd2e763e4d5e81828a3807187b2e with a delay of 30 ms (attempt = 1)
I1030 20:41:42.475689 62821 catalog_manager.cc:2971] Scheduling retry of 
8f8b354490684bf3a54e49a1478ec99d Delete Tablet RPC for 
TS=d50ddd2e763e4d5e81828a380

[jira] [Updated] (KUDU-2992) Limit concurrent alter request of a table

2019-11-04 Thread Yingchun Lai (Jira)


 [ 
https://issues.apache.org/jira/browse/KUDU-2992?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yingchun Lai updated KUDU-2992:
---
Description: 
One of our production clusters had an incident some days ago. A user has a 
table whose partition schema looks like:
{code:java}
HASH (uuid) PARTITIONS 80,RANGE (date_hour) (
PARTITION 2019102900 <= VALUES < 2019102901,
PARTITION 2019102901 <= VALUES < 2019102902,
PARTITION 2019102902 <= VALUES < 2019102903,
PARTITION 2019102903 <= VALUES < 2019102904,
PARTITION 2019102904 <= VALUES < 2019102905,
...)
{code}
He tried to remove many outdated partitions at once via SparkSQL, but it 
returned a timeout error at first; then he tried again and again, and SparkSQL 
failed again and again. The cluster then became unstable, with memory usage and 
CPU load increasing.

 

I found many logs like the following:
{code:java}
W1030 17:29:53.382287  7588 rpcz_store.cc:259] Trace:
1030 17:26:19.714799 (+ 0us) service_pool.cc:162] Inserting onto call queue
1030 17:26:19.714808 (+ 9us) service_pool.cc:221] Handling call
1030 17:29:53.382204 (+213667396us) ts_tablet_manager.cc:874] Deleting tablet 
c52c5f43f7884d08b07fd0005e878fed
1030 17:29:53.382205 (+ 1us) ts_tablet_manager.cc:794] Acquired tablet 
manager lock
1030 17:29:53.382208 (+ 3us) inbound_call.cc:162] Queueing success response
Metrics: {"tablet-delete.queue_time_us":213667360}
W1030 17:29:53.382300  7586 rpcz_store.cc:253] Call 
kudu.tserver.TabletServerAdminService.DeleteTablet from 10.152.49.21:55576 
(request call id 1820316) took 213667 ms (3.56 min). Client timeout 2 ms 
(30 s)
W1030 17:29:53.382292 10623 rpcz_store.cc:253] Call 
kudu.tserver.TabletServerAdminService.DeleteTablet from 10.152.49.21:55576 
(request call id 1820315) took 213667 ms (3.56 min). Client timeout 2 ms 
(30 s)
W1030 17:29:53.382297 10622 rpcz_store.cc:259] Trace:
1030 17:26:19.714825 (+ 0us) service_pool.cc:162] Inserting onto call queue
1030 17:26:19.714833 (+ 8us) service_pool.cc:221] Handling call
1030 17:29:53.382239 (+213667406us) ts_tablet_manager.cc:874] Deleting tablet 
479f8c592f16408c830637a0129359e1
1030 17:29:53.382241 (+ 2us) ts_tablet_manager.cc:794] Acquired tablet 
manager lock
1030 17:29:53.382244 (+ 3us) inbound_call.cc:162] Queueing success response
Metrics: {"tablet-delete.queue_time_us":213667378}
...{code}
That means the 'Acquired tablet manager lock' step cost a lot of time, right?
{code:java}
Status TSTabletManager::BeginReplicaStateTransition(
    const string& tablet_id,
    const string& reason,
    scoped_refptr<TabletReplica>* replica,
    scoped_refptr<TransitionInProgressDeleter>* deleter,
    TabletServerErrorPB::Code* error_code) {
  // Acquire the lock in exclusive mode as we'll add an entry to the
  // transition_in_progress_ map.
  std::lock_guard<RWMutex> lock(lock_);
  TRACE("Acquired tablet manager lock");
  RETURN_NOT_OK(CheckRunningUnlocked(error_code));
  ...
}{code}
But I think the root cause is that the Kudu master sends too many duplicate 
'alter table/delete tablet' requests to the tserver. I found more info in the 
master's log:
{code:java}
$ grep "Scheduling retry of 8f8b354490684bf3a54e49a1478ec99d" 
kudu_master.zjy-hadoop-prc-ct01.bj.work.log.INFO.20191030-204137.62788 | egrep 
"attempt = 1\)"
I1030 20:41:42.207222 62821 catalog_manager.cc:2971] Scheduling retry of 
8f8b354490684bf3a54e49a1478ec99d Delete Tablet RPC for 
TS=d50ddd2e763e4d5e81828a3807187b2e with a delay of 43 ms (attempt = 1)
I1030 20:41:42.207556 62821 catalog_manager.cc:2971] Scheduling retry of 
8f8b354490684bf3a54e49a1478ec99d Delete Tablet RPC for 
TS=d50ddd2e763e4d5e81828a3807187b2e with a delay of 40 ms (attempt = 1)
I1030 20:41:42.260052 62821 catalog_manager.cc:2971] Scheduling retry of 
8f8b354490684bf3a54e49a1478ec99d Delete Tablet RPC for 
TS=d50ddd2e763e4d5e81828a3807187b2e with a delay of 31 ms (attempt = 1)
I1030 20:41:42.278609 62821 catalog_manager.cc:2971] Scheduling retry of 
8f8b354490684bf3a54e49a1478ec99d Delete Tablet RPC for 
TS=d50ddd2e763e4d5e81828a3807187b2e with a delay of 19 ms (attempt = 1)
I1030 20:41:42.312175 62821 catalog_manager.cc:2971] Scheduling retry of 
8f8b354490684bf3a54e49a1478ec99d Delete Tablet RPC for 
TS=d50ddd2e763e4d5e81828a3807187b2e with a delay of 48 ms (attempt = 1)
I1030 20:41:42.318933 62821 catalog_manager.cc:2971] Scheduling retry of 
8f8b354490684bf3a54e49a1478ec99d Delete Tablet RPC for 
TS=d50ddd2e763e4d5e81828a3807187b2e with a delay of 62 ms (attempt = 1)
I1030 20:41:42.340060 62821 catalog_manager.cc:2971] Scheduling retry of 
8f8b354490684bf3a54e49a1478ec99d Delete Tablet RPC for 
TS=d50ddd2e763e4d5e81828a3807187b2e with a delay of 30 ms (attempt = 1)
...{code}
That means the master received too many duplicate 'delete tablet' requests from 
the client, and then dispatched these requests to the tservers.

I think we should limit the concurrent alter requests of a table: when an alter 
request is ongoing and hasn't finished, the following requests should be 
re

[jira] [Updated] (KUDU-2992) Limit concurrent alter request of a table

2019-11-04 Thread Yingchun Lai (Jira)


 [ 
https://issues.apache.org/jira/browse/KUDU-2992?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yingchun Lai updated KUDU-2992:
---
Description: 
One of our production clusters had an incident some days ago. A user has a 
table whose partition schema looks like:
{code:java}
HASH (uuid) PARTITIONS 80,RANGE (date_hour) (
PARTITION 2019102900 <= VALUES < 2019102901,
PARTITION 2019102901 <= VALUES < 2019102902,
PARTITION 2019102902 <= VALUES < 2019102903,
PARTITION 2019102903 <= VALUES < 2019102904,
PARTITION 2019102904 <= VALUES < 2019102905,
...)
{code}
He tried to remove many outdated partitions at once via SparkSQL, but it 
returned a timeout error at first; then he tried again and again, and SparkSQL 
failed again and again. The cluster then became unstable, with memory usage and 
CPU load increasing.

 

I found many logs like the following:
{code:java}
W1030 17:29:53.382287  7588 rpcz_store.cc:259] Trace:
1030 17:26:19.714799 (+ 0us) service_pool.cc:162] Inserting onto call queue
1030 17:26:19.714808 (+ 9us) service_pool.cc:221] Handling call
1030 17:29:53.382204 (+213667396us) ts_tablet_manager.cc:874] Deleting tablet 
c52c5f43f7884d08b07fd0005e878fed
1030 17:29:53.382205 (+ 1us) ts_tablet_manager.cc:794] Acquired tablet 
manager lock
1030 17:29:53.382208 (+ 3us) inbound_call.cc:162] Queueing success response
Metrics: {"tablet-delete.queue_time_us":213667360}
W1030 17:29:53.382300  7586 rpcz_store.cc:253] Call 
kudu.tserver.TabletServerAdminService.DeleteTablet from 10.152.49.21:55576 
(request call id 1820316) took 213667 ms (3.56 min). Client timeout 2 ms 
(30 s)
W1030 17:29:53.382292 10623 rpcz_store.cc:253] Call 
kudu.tserver.TabletServerAdminService.DeleteTablet from 10.152.49.21:55576 
(request call id 1820315) took 213667 ms (3.56 min). Client timeout 2 ms 
(30 s)
W1030 17:29:53.382297 10622 rpcz_store.cc:259] Trace:
1030 17:26:19.714825 (+ 0us) service_pool.cc:162] Inserting onto call queue
1030 17:26:19.714833 (+ 8us) service_pool.cc:221] Handling call
1030 17:29:53.382239 (+213667406us) ts_tablet_manager.cc:874] Deleting tablet 
479f8c592f16408c830637a0129359e1
1030 17:29:53.382241 (+ 2us) ts_tablet_manager.cc:794] Acquired tablet 
manager lock
1030 17:29:53.382244 (+ 3us) inbound_call.cc:162] Queueing success response
Metrics: {"tablet-delete.queue_time_us":213667378}
...{code}
That means the 'Acquired tablet manager lock' step cost a lot of time, right?
{code:java}
Status TSTabletManager::BeginReplicaStateTransition(
    const string& tablet_id,
    const string& reason,
    scoped_refptr<TabletReplica>* replica,
    scoped_refptr<TransitionInProgressDeleter>* deleter,
    TabletServerErrorPB::Code* error_code) {
  // Acquire the lock in exclusive mode as we'll add an entry to the
  // transition_in_progress_ map.
  std::lock_guard<RWMutex> lock(lock_);
  TRACE("Acquired tablet manager lock");
  RETURN_NOT_OK(CheckRunningUnlocked(error_code));
  ...
}{code}
But I think the root cause is that the Kudu master sends too many duplicate 
'alter table/delete tablet' requests to the tserver. I found more info in the 
master's log:
{code:java}
$ grep "Scheduling retry of 8f8b354490684bf3a54e49a1478ec99d" 
kudu_master.zjy-hadoop-prc-ct01.bj.work.log.INFO.20191030-204137.62788 | egrep 
"attempt = 1\)"
I1030 20:41:42.207222 62821 catalog_manager.cc:2971] Scheduling retry of 
8f8b354490684bf3a54e49a1478ec99d Delete Tablet RPC for 
TS=d50ddd2e763e4d5e81828a3807187b2e with a delay of 43 ms (attempt = 1)
I1030 20:41:42.207556 62821 catalog_manager.cc:2971] Scheduling retry of 
8f8b354490684bf3a54e49a1478ec99d Delete Tablet RPC for 
TS=d50ddd2e763e4d5e81828a3807187b2e with a delay of 40 ms (attempt = 1)
I1030 20:41:42.260052 62821 catalog_manager.cc:2971] Scheduling retry of 
8f8b354490684bf3a54e49a1478ec99d Delete Tablet RPC for 
TS=d50ddd2e763e4d5e81828a3807187b2e with a delay of 31 ms (attempt = 1)
I1030 20:41:42.278609 62821 catalog_manager.cc:2971] Scheduling retry of 
8f8b354490684bf3a54e49a1478ec99d Delete Tablet RPC for 
TS=d50ddd2e763e4d5e81828a3807187b2e with a delay of 19 ms (attempt = 1)
I1030 20:41:42.312175 62821 catalog_manager.cc:2971] Scheduling retry of 
8f8b354490684bf3a54e49a1478ec99d Delete Tablet RPC for 
TS=d50ddd2e763e4d5e81828a3807187b2e with a delay of 48 ms (attempt = 1)
I1030 20:41:42.318933 62821 catalog_manager.cc:2971] Scheduling retry of 
8f8b354490684bf3a54e49a1478ec99d Delete Tablet RPC for 
TS=d50ddd2e763e4d5e81828a3807187b2e with a delay of 62 ms (attempt = 1)
I1030 20:41:42.340060 62821 catalog_manager.cc:2971] Scheduling retry of 
8f8b354490684bf3a54e49a1478ec99d Delete Tablet RPC for 
TS=d50ddd2e763e4d5e81828a3807187b2e with a delay of 30 ms (attempt = 1)
...{code}
That means the master received too many duplicate 'delete tablet' requests from 
the client, and then dispatched these requests to the tservers.

I think we should limit the concurrent alter requests of a table: when an alter 
request is ongoing and hasn't finished, the following requests should be 
r
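
For illustration only, a minimal sketch of what per-table limiting could look 
like (hypothetical code, not Kudu's actual catalog manager; the class and 
method names are made up):
{code:java}
#include <mutex>
#include <string>
#include <unordered_set>

// Hypothetical helper the master could consult before starting an alter,
// so duplicate requests for the same table are rejected (or merged)
// instead of fanning out more DeleteTablet RPCs to tservers.
class AlterRequestLimiter {
 public:
  // Returns false if an alter of 'table_id' is already in flight.
  bool TryBegin(const std::string& table_id) {
    std::lock_guard<std::mutex> l(mu_);
    return in_flight_.insert(table_id).second;
  }

  // Called when the alter completes, successfully or not.
  void Finish(const std::string& table_id) {
    std::lock_guard<std::mutex> l(mu_);
    in_flight_.erase(table_id);
  }

 private:
  std::mutex mu_;
  std::unordered_set<std::string> in_flight_;
};
{code}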

[jira] [Commented] (KUDU-2453) kudu should stop creating tablet infinitely

2019-11-19 Thread Yingchun Lai (Jira)


[ 
https://issues.apache.org/jira/browse/KUDU-2453?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16977296#comment-16977296
 ] 

Yingchun Lai commented on KUDU-2453:


We also happened to see this issue. I created another Jira to track it, and 
also gave some ideas to resolve it.

> kudu should stop creating tablet infinitely
> ---
>
> Key: KUDU-2453
> URL: https://issues.apache.org/jira/browse/KUDU-2453
> Project: Kudu
>  Issue Type: Bug
>  Components: master, tserver
>Affects Versions: 1.4.0, 1.7.2
>Reporter: LiFu He
>Priority: Major
>
> I have met this problem again on 2018/10/26. And now the kudu version is 
> 1.7.2.
> -
> We modified the flag 'max_create_tablets_per_ts' (2000) in master.conf, and 
> there was some load on the kudu cluster. Then someone else created a big 
> table with tens of thousands of tablets from impala-shell (that was a 
> mistake). 
> {code:java}
> CREATE TABLE XXX(
> ...
>PRIMARY KEY (...)
> )
> PARTITION BY HASH (...) PARTITIONS 100,
> RANGE (...)
> (
>   PARTITION "2018-10-24" <= VALUES < "2018-10-24\000",
>   PARTITION "2018-10-25" <= VALUES < "2018-10-25\000",
>   ...
>   PARTITION "2018-12-07" <= VALUES < "2018-12-07\000"
> )
> STORED AS KUDU
> TBLPROPERTIES ('kudu.master_addresses'= '...');
> {code}
> Here are the logs after creating the table (picking only one tablet as an 
> example):
> {code:java}
> --Kudu-master log
> ==e884bda6bbd3482f94c07ca0f34f99a4==
> W1024 11:40:51.914397 180146 catalog_manager.cc:2664] TS 
> 39f15fcf42ef45bba0c95a3223dc25ee (kudu2.lt.163.org:7050): Create Tablet RPC 
> failed for tablet e884bda6bbd3482f94c07ca0f34f99a4: Remote error: Service 
> unavailable: CreateTablet request on kudu.tserver.TabletServerAdminService 
> from 10.120.219.118:50247 dropped due to backpressure. The service queue is 
> full; it has 512 items.
> I1024 11:40:51.914412 180146 catalog_manager.cc:2700] Scheduling retry of 
> CreateTablet RPC for tablet e884bda6bbd3482f94c07ca0f34f99a4 on TS 
> 39f15fcf42ef45bba0c95a3223dc25ee with a delay of 42 ms (attempt = 1)
> ...
> ==Be replaced by 0b144c00f35d48cca4d4981698faef72==
> W1024 11:41:22.114512 180202 catalog_manager.cc:3949] T 
>  P f6c9a09da7ef4fc191cab6276b942ba3: Tablet 
> e884bda6bbd3482f94c07ca0f34f99a4 (table quasi_realtime_user_feature 
> [id=946d6dd03ec544eab96231e5a03bed59]) was not created within the allowed 
> timeout. Replacing with a new tablet 0b144c00f35d48cca4d4981698faef72
> ...
> I1024 11:41:22.391916 180202 catalog_manager.cc:3806] T 
>  P f6c9a09da7ef4fc191cab6276b942ba3: Sending 
> DeleteTablet for 3 replicas of tablet e884bda6bbd3482f94c07ca0f34f99a4
> ...
> I1024 11:41:22.391927 180202 catalog_manager.cc:2922] Sending 
> DeleteTablet(TABLET_DATA_DELETED) for tablet e884bda6bbd3482f94c07ca0f34f99a4 
> on 39f15fcf42ef45bba0c95a3223dc25ee (kudu2.lt.163.org:7050) (Replaced by 
> 0b144c00f35d48cca4d4981698faef72 at 2018-10-24 11:41:22 CST)
> ...
> W1024 11:41:22.428129 180146 catalog_manager.cc:2892] TS 
> 39f15fcf42ef45bba0c95a3223dc25ee (kudu2.lt.163.org:7050): delete failed for 
> tablet e884bda6bbd3482f94c07ca0f34f99a4 with error code TABLET_NOT_RUNNING: 
> Already present: State transition of tablet e884bda6bbd3482f94c07ca0f34f99a4 
> already in progress: creating tablet
> ...
> I1024 11:41:22.428143 180146 catalog_manager.cc:2700] Scheduling retry of 
> e884bda6bbd3482f94c07ca0f34f99a4 Delete Tablet RPC for 
> TS=39f15fcf42ef45bba0c95a3223dc25ee with a delay of 35 ms (attempt = 1)
> ...
> W1024 11:41:22.683702 180145 catalog_manager.cc:2664] TS 
> b251540e606b4863bb576091ff961892 (kudu1.lt.163.org:7050): Create Tablet RPC 
> failed for tablet 0b144c00f35d48cca4d4981698faef72: Remote error: Service 
> unavailable: CreateTablet request on kudu.tserver.TabletServerAdminService 
> from 10.120.219.118:59735 dropped due to backpressure. The service queue is 
> full; it has 512 items.
> I1024 11:41:22.683717 180145 catalog_manager.cc:2700] Scheduling retry of 
> CreateTablet RPC for tablet 0b144c00f35d48cca4d4981698faef72 on TS 
> b251540e606b4863bb576091ff961892 with a delay of 46 ms (attempt = 1)
> ...
> ==Be replaced by c0e0acc448fc42fc9e48f5025b112a75==
> W1024 11:41:52.775420 180202 catalog_manager.cc:3949] T 
>  P f6c9a09da7ef4fc191cab6276b942ba3: Tablet 
> 0b144c00f35d48cca4d4981698faef72 (table quasi_realtime_user_feature 
> [id=946d6dd03ec544eab96231e5a03bed59]) was not created within the allowed 
> timeout. Replacing with a new tablet c0e0acc448fc42fc9e48f5025b112a75
> ...
> --Kudu-tserver log
> I1024 11:40:52.014571 137

[jira] [Comment Edited] (KUDU-2453) kudu should stop creating tablet infinitely

2019-11-19 Thread Yingchun Lai (Jira)


[ 
https://issues.apache.org/jira/browse/KUDU-2453?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16977296#comment-16977296
 ] 

Yingchun Lai edited comment on KUDU-2453 at 11/19/19 9:25 AM:
--

We also happened to see this issue. I created another Jira to track it, and 
also gave some ideas to resolve it.


was (Author: acelyc111):
We also happened to see this issue. I created another Jira to track it, and 
also give some ideas to resolve it.

> kudu should stop creating tablet infinitely
> ---
>
> Key: KUDU-2453
> URL: https://issues.apache.org/jira/browse/KUDU-2453
> Project: Kudu
>  Issue Type: Bug
>  Components: master, tserver
>Affects Versions: 1.4.0, 1.7.2
>Reporter: LiFu He
>Priority: Major
>
> I have met this problem again on 2018/10/26. And now the kudu version is 
> 1.7.2.
> -
> We modified the flag 'max_create_tablets_per_ts' (2000) in master.conf, and 
> there was some load on the kudu cluster. Then someone else created a big 
> table with tens of thousands of tablets from impala-shell (that was a 
> mistake). 
> {code:java}
> CREATE TABLE XXX(
> ...
>PRIMARY KEY (...)
> )
> PARTITION BY HASH (...) PARTITIONS 100,
> RANGE (...)
> (
>   PARTITION "2018-10-24" <= VALUES < "2018-10-24\000",
>   PARTITION "2018-10-25" <= VALUES < "2018-10-25\000",
>   ...
>   PARTITION "2018-12-07" <= VALUES < "2018-12-07\000"
> )
> STORED AS KUDU
> TBLPROPERTIES ('kudu.master_addresses'= '...');
> {code}
> Here are the logs after creating the table (picking only one tablet as an 
> example):
> {code:java}
> --Kudu-master log
> ==e884bda6bbd3482f94c07ca0f34f99a4==
> W1024 11:40:51.914397 180146 catalog_manager.cc:2664] TS 
> 39f15fcf42ef45bba0c95a3223dc25ee (kudu2.lt.163.org:7050): Create Tablet RPC 
> failed for tablet e884bda6bbd3482f94c07ca0f34f99a4: Remote error: Service 
> unavailable: CreateTablet request on kudu.tserver.TabletServerAdminService 
> from 10.120.219.118:50247 dropped due to backpressure. The service queue is 
> full; it has 512 items.
> I1024 11:40:51.914412 180146 catalog_manager.cc:2700] Scheduling retry of 
> CreateTablet RPC for tablet e884bda6bbd3482f94c07ca0f34f99a4 on TS 
> 39f15fcf42ef45bba0c95a3223dc25ee with a delay of 42 ms (attempt = 1)
> ...
> ==Be replaced by 0b144c00f35d48cca4d4981698faef72==
> W1024 11:41:22.114512 180202 catalog_manager.cc:3949] T 
>  P f6c9a09da7ef4fc191cab6276b942ba3: Tablet 
> e884bda6bbd3482f94c07ca0f34f99a4 (table quasi_realtime_user_feature 
> [id=946d6dd03ec544eab96231e5a03bed59]) was not created within the allowed 
> timeout. Replacing with a new tablet 0b144c00f35d48cca4d4981698faef72
> ...
> I1024 11:41:22.391916 180202 catalog_manager.cc:3806] T 
>  P f6c9a09da7ef4fc191cab6276b942ba3: Sending 
> DeleteTablet for 3 replicas of tablet e884bda6bbd3482f94c07ca0f34f99a4
> ...
> I1024 11:41:22.391927 180202 catalog_manager.cc:2922] Sending 
> DeleteTablet(TABLET_DATA_DELETED) for tablet e884bda6bbd3482f94c07ca0f34f99a4 
> on 39f15fcf42ef45bba0c95a3223dc25ee (kudu2.lt.163.org:7050) (Replaced by 
> 0b144c00f35d48cca4d4981698faef72 at 2018-10-24 11:41:22 CST)
> ...
> W1024 11:41:22.428129 180146 catalog_manager.cc:2892] TS 
> 39f15fcf42ef45bba0c95a3223dc25ee (kudu2.lt.163.org:7050): delete failed for 
> tablet e884bda6bbd3482f94c07ca0f34f99a4 with error code TABLET_NOT_RUNNING: 
> Already present: State transition of tablet e884bda6bbd3482f94c07ca0f34f99a4 
> already in progress: creating tablet
> ...
> I1024 11:41:22.428143 180146 catalog_manager.cc:2700] Scheduling retry of 
> e884bda6bbd3482f94c07ca0f34f99a4 Delete Tablet RPC for 
> TS=39f15fcf42ef45bba0c95a3223dc25ee with a delay of 35 ms (attempt = 1)
> ...
> W1024 11:41:22.683702 180145 catalog_manager.cc:2664] TS 
> b251540e606b4863bb576091ff961892 (kudu1.lt.163.org:7050): Create Tablet RPC 
> failed for tablet 0b144c00f35d48cca4d4981698faef72: Remote error: Service 
> unavailable: CreateTablet request on kudu.tserver.TabletServerAdminService 
> from 10.120.219.118:59735 dropped due to backpressure. The service queue is 
> full; it has 512 items.
> I1024 11:41:22.683717 180145 catalog_manager.cc:2700] Scheduling retry of 
> CreateTablet RPC for tablet 0b144c00f35d48cca4d4981698faef72 on TS 
> b251540e606b4863bb576091ff961892 with a delay of 46 ms (attempt = 1)
> ...
> ==Be replaced by c0e0acc448fc42fc9e48f5025b112a75==
> W1024 11:41:52.775420 180202 catalog_manager.cc:3949] T 
>  P f6c9a09da7ef4fc191cab6276b942ba3: Tablet 
> 0b144c00f35d48cca4d4981698faef72 (table quasi_realtime_user_feature 
> [id=946d6dd03ec544eab96231e5a03bed59]) was not created within t

[jira] [Created] (KUDU-3001) Multi-thread to load containers in a data directory

2019-11-19 Thread Yingchun Lai (Jira)
Yingchun Lai created KUDU-3001:
--

 Summary: Multi-thread to load containers in a data directory
 Key: KUDU-3001
 URL: https://issues.apache.org/jira/browse/KUDU-3001
 Project: Kudu
  Issue Type: Improvement
Reporter: Yingchun Lai
Assignee: Yingchun Lai


As [~tlipcon] mentioned in 
https://issues.apache.org/jira/browse/KUDU-2014, we can improve tserver startup 
time by loading the containers in a data directory with multiple threads.
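
A rough sketch of the idea (hypothetical code, not the actual LogBlockManager 
implementation; LoadContainer and the path list are assumed to exist):
{code:java}
#include <future>
#include <string>
#include <vector>

// Assumed per-container loader; in Kudu this would be the existing
// single-threaded container-open logic.
void LoadContainer(const std::string& container_path);

// Load every container in a data directory concurrently instead of one by
// one. std::async is used here for brevity; a bounded thread pool would be
// more appropriate in practice.
void LoadAllContainers(const std::vector<std::string>& container_paths) {
  std::vector<std::future<void>> pending;
  pending.reserve(container_paths.size());
  for (const auto& path : container_paths) {
    pending.emplace_back(std::async(std::launch::async, LoadContainer, path));
  }
  for (auto& f : pending) {
    f.get();  // wait for completion and propagate any exception
  }
}
{code}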



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (KUDU-3102) tabletserver coredump in jsonwriter

2020-04-01 Thread Yingchun Lai (Jira)
Yingchun Lai created KUDU-3102:
--

 Summary: tabletserver coredump in jsonwriter
 Key: KUDU-3102
 URL: https://issues.apache.org/jira/browse/KUDU-3102
 Project: Kudu
  Issue Type: Bug
  Components: tserver
Affects Versions: 1.10.1
Reporter: Yingchun Lai


A tserver coredump happened; the backtrace looks like the following:
{code:java}
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib64/libthread_db.so.1".
Missing separate debuginfo for 
/home/work/app/kudu/c3tst-dev/master/package/libcrypto.so.10
Try: yum --enablerepo='*debug*' install 
/usr/lib/debug/.build-id/35/93fa778645a59ea272dbbb59d318c60940e792.debug
Core was generated by `/home/work/app/kudu/c3tst-dev/master/package/kudu_master 
-default_num_replicas='.
Program terminated with signal 11, Segmentation fault.
#0  GetStackTrace_x86 (result=0x7fbf7232fa00, max_depth=31, skip_count=0) at 
/home/laiyingchun/kudu_xm/thirdparty/src/gperftools-2.6.90/src/stacktrace_x86-inl.h:328
328
/home/laiyingchun/kudu_xm/thirdparty/src/gperftools-2.6.90/src/stacktrace_x86-inl.h:
 No such file or directory.
Missing separate debuginfos, use: debuginfo-install 
cyrus-sasl-gssapi-2.1.26-20.el7_2.x86_64 cyrus-sasl-lib-2.1.26-20.el7_2.x86_64 
cyrus-sasl-md5-2.1.26-20.el7_2.x86_64 cyrus-sasl-plain-2.1.26-20.el7_2.x86_64 
glibc-2.17-157.el7_3.1.x86_64 keyutils-libs-1.5.8-3.el7.x86_64 
krb5-libs-1.14.1-27.el7_3.x86_64 libcom_err-1.42.9-9.el7.x86_64 
libdb-5.3.21-19.el7.x86_64 libgcc-4.8.5-11.el7.x86_64 
libselinux-2.5-6.el7.x86_64 ncurses-libs-5.9-13.20130511.el7.x86_64 
nss-softokn-freebl-3.16.2.3-14.4.el7.x86_64 
openssl-libs-1.0.1e-60.el7_3.1.x86_64 pcre-8.32-15.el7_2.1.x86_64 
zlib-1.2.7-17.el7.x86_64
(gdb) bt
#0  GetStackTrace_x86 (result=0x7fbf7232fa00, max_depth=31, skip_count=0) at 
/home/laiyingchun/kudu_xm/thirdparty/src/gperftools-2.6.90/src/stacktrace_x86-inl.h:328
#1  0x00b9992b in GetStackTrace (result=result@entry=0x7fbf7232fa00, 
max_depth=max_depth@entry=31, skip_count=skip_count@entry=1) at 
/home/laiyingchun/kudu_xm/thirdparty/src/gperftools-2.6.90/src/stacktrace.cc:295
#2  0x00b8c14d in DoSampledAllocation (size=size@entry=16385) at 
/home/laiyingchun/kudu_xm/thirdparty/src/gperftools-2.6.90/src/tcmalloc.cc:1169
#3  0x0289f151 in do_malloc (size=16385) at 
/home/laiyingchun/kudu_xm/thirdparty/src/gperftools-2.6.90/src/tcmalloc.cc:1361
#4  do_allocate_full (size=16385) at 
/home/laiyingchun/kudu_xm/thirdparty/src/gperftools-2.6.90/src/tcmalloc.cc:1751
#5  tcmalloc::allocate_full_cpp_throw_oom (size=16385) at 
/home/laiyingchun/kudu_xm/thirdparty/src/gperftools-2.6.90/src/tcmalloc.cc:1765
#6  0x0289f2a7 in dispatch_allocate_full (size=<optimized out>) at 
/home/laiyingchun/kudu_xm/thirdparty/src/gperftools-2.6.90/src/tcmalloc.cc:1774
#7  malloc_fast_path (size=<optimized out>) at 
/home/laiyingchun/kudu_xm/thirdparty/src/gperftools-2.6.90/src/tcmalloc.cc:1845
#8  tc_new (size=<optimized out>) at 
/home/laiyingchun/kudu_xm/thirdparty/src/gperftools-2.6.90/src/tcmalloc.cc:1969
#9  0x7fbf79c785cd in std::__cxx11::basic_string<char, std::char_traits<char>, 
std::allocator<char> >::reserve (this=this@entry=0x7fbf7232fbb0, 
__res=<optimized out>) at 
/home/laiyingchun/gcc-7.4.0-build/x86_64-pc-linux-gnu/libstdc++-v3/include/bits/basic_string.tcc:293
#10 0x7fbf79c6be0b in std::__cxx11::basic_stringbuf<char, 
std::char_traits<char>, std::allocator<char> >::overflow (this=0x7fbf72330668, 
__c=83) at 
/home/laiyingchun/gcc-7.4.0-build/x86_64-pc-linux-gnu/libstdc++-v3/include/bits/sstream.tcc:133
#11 0x7fbf79c76b89 in std::basic_streambuf<char, std::char_traits<char> 
>::xsputn (this=0x7fbf72330668,
__s=0x6929232 
"Service_RequestConsensusVote\",\"total_count\":1,\"min\":104,\"mean\":104.0,\"percentile_75\":104,\"percentile_95\":104,\"percentile_99\":104,\"percentile_99_9\":104,\"percentile_99_99\":104,\"max\":104,\"total_sum\":104}"...,
 __n=250) at 
/home/laiyingchun/gcc-7.4.0-build/x86_64-pc-linux-gnu/libstdc++-v3/include/bits/streambuf.tcc:98
#12 0x7fbf79c66b62 in sputn (__n=250, __s=<optimized out>, 
this=<optimized out>) at 
/home/laiyingchun/gcc-7.4.0-build/x86_64-pc-linux-gnu/libstdc++-v3/include/streambuf:451
#13 _M_write (__n=250, __s=<optimized out>, this=0x7fbf72330660) at 
/home/laiyingchun/gcc-7.4.0-build/x86_64-pc-linux-gnu/libstdc++-v3/include/ostream:313
#14 std::ostream::write (this=0x7fbf72330660,
__s=0x6929200 
",{\"name\":\"handler_latency_kudu_consensus_ConsensusService_RequestConsensusVote\",\"total_count\":1,\"min\":104,\"mean\":104.0,\"percentile_75\":104,\"percentile_95\":104,\"percentile_99\":104,\"percentile_99_9\":104"...,
 __n=250) at 
/home/laiyingchun/gcc-7.4.0-build/x86_64-pc-linux-gnu/libstdc++-v3/include/bits/ostream.tcc:196
#15 0x0265bb1c in Flush (this=0x58c75f8) at 
/home/laiyingchun/kudu_xm/src/kudu/util/jsonwriter.cc:307
#16 kudu::JsonWriterImpl, rapidjson::UTF8, rapidjson::CrtAllocator, 0u> 
>::EndObject (this=0x58c75f0) at 
/home/laiyingchun/kudu_xm/src/kudu/util/jsonwriter.cc:345

[jira] [Updated] (KUDU-3102) tabletserver coredump in jsonwriter

2020-04-01 Thread Yingchun Lai (Jira)


 [ 
https://issues.apache.org/jira/browse/KUDU-3102?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yingchun Lai updated KUDU-3102:
---
Description: 
A tserver coredump happened; the backtrace looks like the following:
{code:java}
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib64/libthread_db.so.1".
Missing separate debuginfo for 
/home/work/app/kudu/c3tst-dev/master/package/libcrypto.so.10
Try: yum --enablerepo='*debug*' install 
/usr/lib/debug/.build-id/35/93fa778645a59ea272dbbb59d318c60940e792.debug
Core was generated by `/home/work/app/kudu/c3tst-dev/master/package/kudu_master 
-default_num_replicas='.
Program terminated with signal 11, Segmentation fault.
#0  GetStackTrace_x86 (result=0x7fbf7232fa00, max_depth=31, skip_count=0) at 
/home/laiyingchun/kudu_xm/thirdparty/src/gperftools-2.6.90/src/stacktrace_x86-inl.h:328
328
/home/laiyingchun/kudu_xm/thirdparty/src/gperftools-2.6.90/src/stacktrace_x86-inl.h:
 No such file or directory.
Missing separate debuginfos, use: debuginfo-install 
cyrus-sasl-gssapi-2.1.26-20.el7_2.x86_64 cyrus-sasl-lib-2.1.26-20.el7_2.x86_64 
cyrus-sasl-md5-2.1.26-20.el7_2.x86_64 cyrus-sasl-plain-2.1.26-20.el7_2.x86_64 
glibc-2.17-157.el7_3.1.x86_64 keyutils-libs-1.5.8-3.el7.x86_64 
krb5-libs-1.14.1-27.el7_3.x86_64 libcom_err-1.42.9-9.el7.x86_64 
libdb-5.3.21-19.el7.x86_64 libgcc-4.8.5-11.el7.x86_64 
libselinux-2.5-6.el7.x86_64 ncurses-libs-5.9-13.20130511.el7.x86_64 
nss-softokn-freebl-3.16.2.3-14.4.el7.x86_64 
openssl-libs-1.0.1e-60.el7_3.1.x86_64 pcre-8.32-15.el7_2.1.x86_64 
zlib-1.2.7-17.el7.x86_64
(gdb) bt
#0  GetStackTrace_x86 (result=0x7fbf7232fa00, max_depth=31, skip_count=0) at 
/home/laiyingchun/kudu_xm/thirdparty/src/gperftools-2.6.90/src/stacktrace_x86-inl.h:328
#1  0x00b9992b in GetStackTrace (result=result@entry=0x7fbf7232fa00, 
max_depth=max_depth@entry=31, skip_count=skip_count@entry=1) at 
/home/laiyingchun/kudu_xm/thirdparty/src/gperftools-2.6.90/src/stacktrace.cc:295
#2  0x00b8c14d in DoSampledAllocation (size=size@entry=16385) at 
/home/laiyingchun/kudu_xm/thirdparty/src/gperftools-2.6.90/src/tcmalloc.cc:1169
#3  0x0289f151 in do_malloc (size=16385) at 
/home/laiyingchun/kudu_xm/thirdparty/src/gperftools-2.6.90/src/tcmalloc.cc:1361
#4  do_allocate_full (size=16385) at 
/home/laiyingchun/kudu_xm/thirdparty/src/gperftools-2.6.90/src/tcmalloc.cc:1751
#5  tcmalloc::allocate_full_cpp_throw_oom (size=16385) at 
/home/laiyingchun/kudu_xm/thirdparty/src/gperftools-2.6.90/src/tcmalloc.cc:1765
#6  0x0289f2a7 in dispatch_allocate_full (size=<optimized out>) at 
/home/laiyingchun/kudu_xm/thirdparty/src/gperftools-2.6.90/src/tcmalloc.cc:1774
#7  malloc_fast_path (size=<optimized out>) at 
/home/laiyingchun/kudu_xm/thirdparty/src/gperftools-2.6.90/src/tcmalloc.cc:1845
#8  tc_new (size=<optimized out>) at 
/home/laiyingchun/kudu_xm/thirdparty/src/gperftools-2.6.90/src/tcmalloc.cc:1969
#9  0x7fbf79c785cd in std::__cxx11::basic_string<char, std::char_traits<char>, 
std::allocator<char> >::reserve (this=this@entry=0x7fbf7232fbb0, 
__res=<optimized out>)
at 
/home/laiyingchun/gcc-7.4.0-build/x86_64-pc-linux-gnu/libstdc++-v3/include/bits/basic_string.tcc:293
#10 0x7fbf79c6be0b in std::__cxx11::basic_stringbuf<char, 
std::char_traits<char>, std::allocator<char> >::overflow (this=0x7fbf72330668, 
__c=83) at 
/home/laiyingchun/gcc-7.4.0-build/x86_64-pc-linux-gnu/libstdc++-v3/include/bits/sstream.tcc:133
#11 0x7fbf79c76b89 in std::basic_streambuf<char, std::char_traits<char> 
>::xsputn (this=0x7fbf72330668,
__s=0x6929232 
"Service_RequestConsensusVote\",\"total_count\":1,\"min\":104,\"mean\":104.0,\"percentile_75\":104,\"percentile_95\":104,\"percentile_99\":104,\"percentile_99_9\":104,\"percentile_99_99\":104,\"max\":104,\"total_sum\":104}"...,
 __n=250)
at 
/home/laiyingchun/gcc-7.4.0-build/x86_64-pc-linux-gnu/libstdc++-v3/include/bits/streambuf.tcc:98
#12 0x7fbf79c66b62 in sputn (__n=250, __s=<optimized out>, 
this=<optimized out>) at 
/home/laiyingchun/gcc-7.4.0-build/x86_64-pc-linux-gnu/libstdc++-v3/include/streambuf:451
#13 _M_write (__n=250, __s=<optimized out>, this=0x7fbf72330660) at 
/home/laiyingchun/gcc-7.4.0-build/x86_64-pc-linux-gnu/libstdc++-v3/include/ostream:313
#14 std::ostream::write (this=0x7fbf72330660,
__s=0x6929200 
",{\"name\":\"handler_latency_kudu_consensus_ConsensusService_RequestConsensusVote\",\"total_count\":1,\"min\":104,\"mean\":104.0,\"percentile_75\":104,\"percentile_95\":104,\"percentile_99\":104,\"percentile_99_9\":104"...,
 __n=250)
at 
/home/laiyingchun/gcc-7.4.0-build/x86_64-pc-linux-gnu/libstdc++-v3/include/bits/ostream.tcc:196
#15 0x0265bb1c in Flush (this=0x58c75f8) at 
/home/laiyingchun/kudu_xm/src/kudu/util/jsonwriter.cc:307
#16 kudu::JsonWriterImpl, rapidjson::UTF8, rapidjson::CrtAllocator, 0u> 
>::EndObject (this=0x58c75f0) at 
/home/laiyingchun/kudu_xm/src/kudu/util/jsonwriter.cc:345
#17 0x0265a968 in EndObject (this=0x7fbf72330190) at 
/home/laiyingchun/kudu_xm/src/kudu/util/jsonwriter.cc:151
#18 kudu::JsonWriter::Protobuf (this=this@entry=0x7fbf72330190, pb=...) at 
/hom

[jira] [Commented] (KUDU-3102) tabletserver coredump in jsonwriter

2020-04-01 Thread Yingchun Lai (Jira)


[ 
https://issues.apache.org/jira/browse/KUDU-3102?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17073308#comment-17073308
 ] 

Yingchun Lai commented on KUDU-3102:


[~adar] The server has 128GB of physical memory, and when the Kudu master 
process core dumped, only about 16GB was in use in total. But another thing I 
have to point out is that there are several processes running on the server: 
some Kudu masters and some Java applications. And one of the masters' clusters 
is running a YCSB benchmark, but I don't think it introduces much load on the 
master.

[~tlipcon] Not frequently, just once since the last restart about 3 months ago; 
the Kudu master version is 1.10.1.

> tabletserver coredump in jsonwriter
> ---
>
> Key: KUDU-3102
> URL: https://issues.apache.org/jira/browse/KUDU-3102
> Project: Kudu
>  Issue Type: Bug
>  Components: tserver
>Affects Versions: 1.10.1
>Reporter: Yingchun Lai
>Priority: Major
>
> A tserver coredump happened; the backtrace looks like the following:
> {code:java}
> [Thread debugging using libthread_db enabled]
> Using host libthread_db library "/lib64/libthread_db.so.1".
> Missing separate debuginfo for 
> /home/work/app/kudu/c3tst-dev/master/package/libcrypto.so.10
> Try: yum --enablerepo='*debug*' install 
> /usr/lib/debug/.build-id/35/93fa778645a59ea272dbbb59d318c60940e792.debug
> Core was generated by 
> `/home/work/app/kudu/c3tst-dev/master/package/kudu_master 
> -default_num_replicas='.
> Program terminated with signal 11, Segmentation fault.
> #0  GetStackTrace_x86 (result=0x7fbf7232fa00, max_depth=31, skip_count=0) at 
> /home/laiyingchun/kudu_xm/thirdparty/src/gperftools-2.6.90/src/stacktrace_x86-inl.h:328
> 328
> /home/laiyingchun/kudu_xm/thirdparty/src/gperftools-2.6.90/src/stacktrace_x86-inl.h:
>  No such file or directory.
> Missing separate debuginfos, use: debuginfo-install 
> cyrus-sasl-gssapi-2.1.26-20.el7_2.x86_64 
> cyrus-sasl-lib-2.1.26-20.el7_2.x86_64 cyrus-sasl-md5-2.1.26-20.el7_2.x86_64 
> cyrus-sasl-plain-2.1.26-20.el7_2.x86_64 glibc-2.17-157.el7_3.1.x86_64 
> keyutils-libs-1.5.8-3.el7.x86_64 krb5-libs-1.14.1-27.el7_3.x86_64 
> libcom_err-1.42.9-9.el7.x86_64 libdb-5.3.21-19.el7.x86_64 
> libgcc-4.8.5-11.el7.x86_64 libselinux-2.5-6.el7.x86_64 
> ncurses-libs-5.9-13.20130511.el7.x86_64 
> nss-softokn-freebl-3.16.2.3-14.4.el7.x86_64 
> openssl-libs-1.0.1e-60.el7_3.1.x86_64 pcre-8.32-15.el7_2.1.x86_64 
> zlib-1.2.7-17.el7.x86_64
> (gdb) bt
> #0  GetStackTrace_x86 (result=0x7fbf7232fa00, max_depth=31, skip_count=0) at 
> /home/laiyingchun/kudu_xm/thirdparty/src/gperftools-2.6.90/src/stacktrace_x86-inl.h:328
> #1  0x00b9992b in GetStackTrace (result=result@entry=0x7fbf7232fa00, 
> max_depth=max_depth@entry=31, skip_count=skip_count@entry=1) at 
> /home/laiyingchun/kudu_xm/thirdparty/src/gperftools-2.6.90/src/stacktrace.cc:295
> #2  0x00b8c14d in DoSampledAllocation (size=size@entry=16385) at 
> /home/laiyingchun/kudu_xm/thirdparty/src/gperftools-2.6.90/src/tcmalloc.cc:1169
> #3  0x0289f151 in do_malloc (size=16385) at 
> /home/laiyingchun/kudu_xm/thirdparty/src/gperftools-2.6.90/src/tcmalloc.cc:1361
> #4  do_allocate_full (size=16385) at 
> /home/laiyingchun/kudu_xm/thirdparty/src/gperftools-2.6.90/src/tcmalloc.cc:1751
> #5  tcmalloc::allocate_full_cpp_throw_oom (size=16385) at 
> /home/laiyingchun/kudu_xm/thirdparty/src/gperftools-2.6.90/src/tcmalloc.cc:1765
> #6  0x0289f2a7 in dispatch_allocate_full (size=<optimized out>) at 
> /home/laiyingchun/kudu_xm/thirdparty/src/gperftools-2.6.90/src/tcmalloc.cc:1774
> #7  malloc_fast_path (size=<optimized out>) at 
> /home/laiyingchun/kudu_xm/thirdparty/src/gperftools-2.6.90/src/tcmalloc.cc:1845
> #8  tc_new (size=<optimized out>) at 
> /home/laiyingchun/kudu_xm/thirdparty/src/gperftools-2.6.90/src/tcmalloc.cc:1969
> #9  0x7fbf79c785cd in std::__cxx11::basic_string<char, 
> std::char_traits<char>, std::allocator<char> >::reserve 
> (this=this@entry=0x7fbf7232fbb0, __res=<optimized out>)
> at 
> /home/laiyingchun/gcc-7.4.0-build/x86_64-pc-linux-gnu/libstdc++-v3/include/bits/basic_string.tcc:293
> #10 0x7fbf79c6be0b in std::__cxx11::basic_stringbuf<char, 
> std::char_traits<char>, std::allocator<char> >::overflow 
> (this=0x7fbf72330668, __c=83) at 
> /home/laiyingchun/gcc-7.4.0-build/x86_64-pc-linux-gnu/libstdc++-v3/include/bits/sstream.tcc:133
> #11 0x7fbf79c76b89 in std::basic_streambuf<char, std::char_traits<char> 
> >::xsputn (this=0x7fbf72330668,
> __s=0x6929232 
> "Service_RequestConsensusVote\",\"total_count\":1,\"min\":104,\"mean\":104.0,\"percentile_75\":104,\"percentile_95\":104,\"percentile_99\":104,\"percentile_99_9\":104,\"percentile_99_99\":104,\"max\":104,\"total_sum\":104}"...,
>  __n=250)
> at 
> /home/laiyingchun/gcc-7.4.0-build/x86_64-pc-linux-gnu/libstdc++-v3/include/bits/streambuf.tcc:98
> #12 0x7fbf79c66b62 in sputn (__n=250, __s=<optimized out>, 
> this=<optimized out>) at 
> /home/laiyingchun/gcc-7.4.0-build/x86_64-

[jira] [Updated] (KUDU-2824) Make some tables in high priority in MM compaction

2020-05-09 Thread Yingchun Lai (Jira)


 [ 
https://issues.apache.org/jira/browse/KUDU-2824?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yingchun Lai updated KUDU-2824:
---
Fix Version/s: 1.10.0

> Make some tables in high priority in MM compaction
> --
>
> Key: KUDU-2824
> URL: https://issues.apache.org/jira/browse/KUDU-2824
> Project: Kudu
>  Issue Type: Improvement
>  Components: tserver
>Affects Versions: 1.9.0
>Reporter: Yingchun Lai
>Assignee: Yingchun Lai
>Priority: Minor
>  Labels: MM, compaction, maintenance, priority
> Fix For: 1.10.0
>
>
> In a Kudu cluster with thousands of tables, it's hard for a specified 
> tablet's maintenance OPs to be launched when their scores are not the 
> highest, even if the table the tablet belongs to is high priority for Kudu 
> users.
> For example, table A has 10 tablets and has total size of 1G, table B has 
> 1000 tablets and has total size of 100G. Both of them have similar update 
> writes, i.e. DRSs have similar overlaps, similar redo/undo logs, so they have 
> similar compaction scores. However, table A has much more reads than table B, 
> but tables A and B are equal in MM, so their DRS compactions are launched 
> equally, and we have to suffer a long time until most of the tablets in the 
> cluster have been compacted to achieve a fast scan.
> So, maybe we can introduce some algorithm to detect high priority tables and 
> speed up compaction of these tables?



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Comment Edited] (KUDU-2824) Make some tables in high priority in MM compaction

2020-05-09 Thread Yingchun Lai (Jira)


[ 
https://issues.apache.org/jira/browse/KUDU-2824?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17103238#comment-17103238
 ] 

Yingchun Lai edited comment on KUDU-2824 at 5/9/20, 10:23 AM:
--

[maintenance] Support priorities for tables in MM compaction

{{This commit adds a feature to specify different priorities for table 
compaction. In a Kudu cluster with thousands of tables, it's hard for a 
specified tablet's maintenance OPs to be launched when their scores are not the 
highest, even if the table the tablet belongs to is high priority for Kudu 
users. This patch allows administrators to specify different priorities for 
tables by gflags; the maintenance OPs of these high-priority tables have a 
greater chance to be launched. }}
{{ Change-Id: I3ea3b73505157678a8fb551656123b64e6bfb304 }}
{{Reviewed-on: [http://gerrit.cloudera.org:8080/12852]}}
{{Tested-by: Adar Dembo  }}
{{Reviewed-by: Adar Dembo }}


was (Author: acelyc111):
[maintenance] Support priorities for tables in MM compaction
This commit adds a feature to specify different priorities for table 
compaction. In a Kudu cluster with thousands of tables, it's hard for a 
specified tablet's maintenance OPs to be launched when their scores are not the 
highest, even if the table the tablet belongs to is high priority for Kudu 
users. This patch allows administrators to specify different priorities for 
tables by gflags; the maintenance OPs of these high-priority tables have a 
greater chance to be launched. Change-Id: 
I3ea3b73505157678a8fb551656123b64e6bfb304 Reviewed-on: 
[http://gerrit.cloudera.org:8080/12852] Tested-by: Adar Dembo 
 Reviewed-by: Adar Dembo 

> Make some tables in high priority in MM compaction
> --
>
> Key: KUDU-2824
> URL: https://issues.apache.org/jira/browse/KUDU-2824
> Project: Kudu
>  Issue Type: Improvement
>  Components: tserver
>Affects Versions: 1.9.0
>Reporter: Yingchun Lai
>Assignee: Yingchun Lai
>Priority: Minor
>  Labels: MM, compaction, maintenance, priority
>
> In a Kudu cluster with thousands of tables, it's hard for a specified 
> tablet's maintenance OPs to be launched when their scores are not the 
> highest, even if the table the tablet belongs to is high priority for Kudu 
> users.
> For example, table A has 10 tablets and has total size of 1G, table B has 
> 1000 tablets and has total size of 100G. Both of them have similar update 
> writes, i.e. DRSs have similar overlaps, similar redo/undo logs, so they have 
> similar compaction scores. However, table A has much more reads than table B, 
> but tables A and B are equal in MM, so their DRS compactions are launched 
> equally, and we have to suffer a long time until most of the tablets in the 
> cluster have been compacted to achieve a fast scan.
> So, maybe we can introduce some algorithm to detect high priority tables and 
> speed up compaction of these tables?



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (KUDU-2824) Make some tables in high priority in MM compaction

2020-05-09 Thread Yingchun Lai (Jira)


[ 
https://issues.apache.org/jira/browse/KUDU-2824?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17103238#comment-17103238
 ] 

Yingchun Lai commented on KUDU-2824:


[maintenance] Support priorities for tables in MM compaction
This commit adds a feature to specify different priorities for table 
compaction. In a Kudu cluster with thousands of tables, it's hard for a 
specified tablet's maintenance OPs to be launched when their scores are not the 
highest, even if the table the tablet belongs to is high priority for Kudu 
users. This patch allows administrators to specify different priorities for 
tables by gflags; the maintenance OPs of these high-priority tables have a 
greater chance to be launched. Change-Id: 
I3ea3b73505157678a8fb551656123b64e6bfb304 Reviewed-on: 
[http://gerrit.cloudera.org:8080/12852] Tested-by: Adar Dembo 
 Reviewed-by: Adar Dembo 
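
To illustrate the effect of a table priority on op scheduling, a toy sketch 
(hypothetical; the patch's real scoring logic lives in the maintenance manager 
and may differ, and the exponential form and clamp range here are assumptions):
{code:java}
#include <algorithm>
#include <cmath>

// Scale a maintenance op's raw score by a per-table priority so that ops
// from high-priority tables are picked first when raw scores are similar.
double AdjustedScore(double raw_score, int table_priority) {
  int p = std::max(-5, std::min(5, table_priority));  // clamp the priority
  return raw_score * std::pow(2.0, p);
}
{code}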

> Make some tables in high priority in MM compaction
> --
>
> Key: KUDU-2824
> URL: https://issues.apache.org/jira/browse/KUDU-2824
> Project: Kudu
>  Issue Type: Improvement
>  Components: tserver
>Affects Versions: 1.9.0
>Reporter: Yingchun Lai
>Assignee: Yingchun Lai
>Priority: Minor
>  Labels: MM, compaction, maintenance, priority
>
> In a Kudu cluster with thousands of tables, it's hard for a specified 
> tablet's maintenance OPs to be launched when their scores are not the 
> highest, even if the table the tablet belongs to is high priority for Kudu 
> users.
> For example, table A has 10 tablets and has total size of 1G, table B has 
> 1000 tablets and has total size of 100G. Both of them have similar update 
> writes, i.e. DRSs have similar overlaps, similar redo/undo logs, so they have 
> similar compaction scores. However, table A has much more reads than table B, 
> but tables A and B are equal in MM, so their DRS compactions are launched 
> equally, and we have to suffer a long time until most of the tablets in the 
> cluster have been compacted to achieve a fast scan.
> So, maybe we can introduce some algorithm to detect high priority tables and 
> speed up compaction of these tables?



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Resolved] (KUDU-2824) Make some tables in high priority in MM compaction

2020-05-09 Thread Yingchun Lai (Jira)


 [ 
https://issues.apache.org/jira/browse/KUDU-2824?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yingchun Lai resolved KUDU-2824.

Resolution: Fixed

> Make some tables in high priority in MM compaction
> --
>
> Key: KUDU-2824
> URL: https://issues.apache.org/jira/browse/KUDU-2824
> Project: Kudu
>  Issue Type: Improvement
>  Components: tserver
>Affects Versions: 1.9.0
>Reporter: Yingchun Lai
>Assignee: Yingchun Lai
>Priority: Minor
>  Labels: MM, compaction, maintenance, priority
> Fix For: 1.10.0
>
>
> In a Kudu cluster with thousands of tables, it's hard for a specified 
> tablet's maintenance OPs to be launched when their scores are not the 
> highest, even if the table the tablet belongs to is high priority for Kudu 
> users.
> For example, table A has 10 tablets and has total size of 1G, table B has 
> 1000 tablets and has total size of 100G. Both of them have similar update 
> writes, i.e. DRSs have similar overlaps, similar redo/undo logs, so they have 
> similar compaction scores. However, table A has much more reads than table B, 
> but tables A and B are equal in MM, so their DRS compactions are launched 
> equally, and we have to suffer a long time until most of the tablets in the 
> cluster have been compacted to achieve a fast scan.
> So, maybe we can introduce some algorithm to detect high priority tables and 
> speed up compaction of these tables?



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (KUDU-2984) memory_gc-itest is flaky

2020-06-03 Thread Yingchun Lai (Jira)


[ 
https://issues.apache.org/jira/browse/KUDU-2984?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17125073#comment-17125073
 ] 

Yingchun Lai commented on KUDU-2984:


Has been fixed by [https://gerrit.cloudera.org/c/14553/].

> memory_gc-itest is flaky
> 
>
> Key: KUDU-2984
> URL: https://issues.apache.org/jira/browse/KUDU-2984
> Project: Kudu
>  Issue Type: Bug
>  Components: test
>Affects Versions: 1.11.0, 1.12.0
>Reporter: Alexey Serbin
>Assignee: Yingchun Lai
>Priority: Minor
> Attachments: memory_gc-itest.txt.xz
>
>
> The {{memory_gc-itest}} fails from time to time with the following error 
> message (DEBUG build):
> {noformat}
> src/kudu/integration-tests/memory_gc-itest.cc:117: Failure
> Expected: (ratio) >= (0.1), actual: 0.0600604 vs 0.1
> tserver-2
> src/kudu/util/test_util.cc:339: Failure
> Failed
> Timed out waiting for assertion to pass.
> {noformat}
> The full log is attached.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Resolved] (KUDU-2984) memory_gc-itest is flaky

2020-06-03 Thread Yingchun Lai (Jira)


 [ 
https://issues.apache.org/jira/browse/KUDU-2984?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yingchun Lai resolved KUDU-2984.

Fix Version/s: 1.12.0
   Resolution: Fixed

> memory_gc-itest is flaky
> 
>
> Key: KUDU-2984
> URL: https://issues.apache.org/jira/browse/KUDU-2984
> Project: Kudu
>  Issue Type: Bug
>  Components: test
>Affects Versions: 1.11.0, 1.12.0
>Reporter: Alexey Serbin
>Assignee: Yingchun Lai
>Priority: Minor
> Fix For: 1.12.0
>
> Attachments: memory_gc-itest.txt.xz
>
>
> The {{memory_gc-itest}} fails from time to time with the following error 
> message (DEBUG build):
> {noformat}
> src/kudu/integration-tests/memory_gc-itest.cc:117: Failure
> Expected: (ratio) >= (0.1), actual: 0.0600604 vs 0.1
> tserver-2
> src/kudu/util/test_util.cc:339: Failure
> Failed
> Timed out waiting for assertion to pass.
> {noformat}
> The full log is attached.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (KUDU-2984) memory_gc-itest is flaky

2020-06-03 Thread Yingchun Lai (Jira)


 [ 
https://issues.apache.org/jira/browse/KUDU-2984?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yingchun Lai updated KUDU-2984:
---
Affects Version/s: (was: 1.12.0)

> memory_gc-itest is flaky
> 
>
> Key: KUDU-2984
> URL: https://issues.apache.org/jira/browse/KUDU-2984
> Project: Kudu
>  Issue Type: Bug
>  Components: test
>Affects Versions: 1.11.0
>Reporter: Alexey Serbin
>Assignee: Yingchun Lai
>Priority: Minor
> Fix For: 1.12.0
>
> Attachments: memory_gc-itest.txt.xz
>
>
> The {{memory_gc-itest}} fails from time to time with the following error 
> message (DEBUG build):
> {noformat}
> src/kudu/integration-tests/memory_gc-itest.cc:117: Failure
> Expected: (ratio) >= (0.1), actual: 0.0600604 vs 0.1
> tserver-2
> src/kudu/util/test_util.cc:339: Failure
> Failed
> Timed out waiting for assertion to pass.
> {noformat}
> The full log is attached.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (KUDU-3304) Alter table support set replica number

2021-07-11 Thread Yingchun Lai (Jira)
Yingchun Lai created KUDU-3304:
--

 Summary: Alter table support set replica number
 Key: KUDU-3304
 URL: https://issues.apache.org/jira/browse/KUDU-3304
 Project: Kudu
  Issue Type: New Feature
  Components: client
Affects Versions: 1.15.0
Reporter: Yingchun Lai


For some historical reason, there may be some tables with only one replica, and 
when we want to increase their replication factor to 3, there seems to be no 
way to do it.

I want to add an alter method to do this work; typically, it would be used in 
CLI tools.
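
As a sketch of what the proposed alteration would carry, here is a hypothetical request shape; all names are invented for illustration, and Kudu's client exposes no such API today (see KUDU-2357 below):

{code:cpp}
// Hypothetical sketch only, not the Kudu client API. The master would
// validate the new factor and then re-replicate tablets up to it.
#include <cstdint>
#include <string>

struct AlterReplicationFactorRequest {
  std::string table_name;
  int32_t new_replication_factor;
};

bool Validate(const AlterReplicationFactorRequest& req, int live_tservers) {
  return req.new_replication_factor >= 1 &&
         req.new_replication_factor % 2 == 1 &&  // Kudu requires an odd RF
         req.new_replication_factor <= live_tservers;
}
{code}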



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (KUDU-3304) Alter table support set replica number

2021-07-11 Thread Yingchun Lai (Jira)


[ 
https://issues.apache.org/jira/browse/KUDU-3304?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17378847#comment-17378847
 ] 

Yingchun Lai commented on KUDU-3304:


Duplicate of https://issues.apache.org/jira/browse/KUDU-2357.

> Alter table support set replica number
> --
>
> Key: KUDU-3304
> URL: https://issues.apache.org/jira/browse/KUDU-3304
> Project: Kudu
>  Issue Type: New Feature
>  Components: client
>Affects Versions: 1.15.0
>Reporter: Yingchun Lai
>Priority: Minor
>
> For some historical reason, there may be some tables with only one replica, 
> and when we want to increase their replication factor to 3, there seems to be 
> no way to do it.
> I want to add an alter method to do this work; typically, it would be used in 
> CLI tools.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Issue Comment Deleted] (KUDU-3304) Alter table support set replica number

2021-07-11 Thread Yingchun Lai (Jira)


 [ 
https://issues.apache.org/jira/browse/KUDU-3304?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yingchun Lai updated KUDU-3304:
---
Comment: was deleted

(was: duplicate with https://issues.apache.org/jira/browse/KUDU-2357)

> Alter table support set replica number
> --
>
> Key: KUDU-3304
> URL: https://issues.apache.org/jira/browse/KUDU-3304
> Project: Kudu
>  Issue Type: New Feature
>  Components: client
>Affects Versions: 1.15.0
>Reporter: Yingchun Lai
>Priority: Minor
>
> For some historical reason, there may be some tables with only one replica, 
> and when we want to increase their replication factor to 3, there seems to be 
> no way to do it.
> I want to add an alter method to do this work; typically, it would be used in 
> CLI tools.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (KUDU-1954) Improve maintenance manager behavior in heavy write workload

2021-07-27 Thread Yingchun Lai (Jira)


[ 
https://issues.apache.org/jira/browse/KUDU-1954?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17388492#comment-17388492
 ] 

Yingchun Lai commented on KUDU-1954:


Although we have tried to reduce the duration of a single compaction operation, 
it is still possible in some special environments for compaction OPs to run 
slower than data ingestion. In some environments the machines may have only 
spinning disks, or even a single spinning disk, and 
--maintenance_manager_num_threads is set to 1; once that thread is launching 
some heavy compaction OPs, flush OPs will wait a long time to be launched.

I think we can introduce separate flush threads to run flush OPs specifically, 
which is similar to how RocksDB works[1].

1. 
https://github.com/facebook/rocksdb/blob/4361d6d16380f619833d58225183cbfbb2c7a1dd/include/rocksdb/options.h#L599-L658
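
A minimal sketch of the idea under assumed names (this is not Kudu's maintenance manager; the tiny pool is only illustrative): flush ops get a dedicated pool so a long compaction can no longer delay them, mirroring RocksDB's separate background-flush budget.

{code:cpp}
#include <condition_variable>
#include <functional>
#include <mutex>
#include <queue>
#include <thread>
#include <vector>

// A tiny fixed-size worker pool, just enough for the sketch.
class Pool {
 public:
  explicit Pool(int n) {
    for (int i = 0; i < n; i++) workers_.emplace_back([this] { Loop(); });
  }
  ~Pool() {
    {
      std::lock_guard<std::mutex> l(mu_);
      done_ = true;
    }
    cv_.notify_all();
    for (auto& t : workers_) t.join();
  }
  void Submit(std::function<void()> task) {
    {
      std::lock_guard<std::mutex> l(mu_);
      tasks_.push(std::move(task));
    }
    cv_.notify_one();
  }

 private:
  void Loop() {
    for (;;) {
      std::function<void()> task;
      {
        std::unique_lock<std::mutex> l(mu_);
        cv_.wait(l, [this] { return done_ || !tasks_.empty(); });
        if (done_ && tasks_.empty()) return;
        task = std::move(tasks_.front());
        tasks_.pop();
      }
      task();
    }
  }
  std::mutex mu_;
  std::condition_variable cv_;
  std::queue<std::function<void()>> tasks_;
  std::vector<std::thread> workers_;
  bool done_ = false;
};

// Separate pools: a long-running compaction cannot starve a flush.
struct MaintenanceDispatcher {
  Pool flush_pool{1};    // dedicated flush thread(s)
  Pool compact_pool{1};  // analogue of --maintenance_manager_num_threads
  void Submit(bool is_flush, std::function<void()> op) {
    (is_flush ? flush_pool : compact_pool).Submit(std::move(op));
  }
};
{code}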

> Improve maintenance manager behavior in heavy write workload
> 
>
> Key: KUDU-1954
> URL: https://issues.apache.org/jira/browse/KUDU-1954
> Project: Kudu
>  Issue Type: Improvement
>  Components: compaction, perf, tserver
>Affects Versions: 1.3.0
>Reporter: Todd Lipcon
>Priority: Major
>  Labels: performance, roadmap-candidate, scalability
> Attachments: mm-trace.png
>
>
> During the investigation in [this 
> doc|https://docs.google.com/document/d/1U1IXS1XD2erZyq8_qG81A1gZaCeHcq2i0unea_eEf5c/edit]
>  I found a few maintenance-manager-related issues during heavy writes:
> - we don't schedule flushes until we are already in "backpressure" realm, so 
> we spent most of our time doing backpressure
> - even if we configure N maintenance threads, we typically are only using 
> ~50% of those threads due to the scheduling granularity
> - when we do hit the "memory-pressure flush" threshold, all threads quickly 
> switch to flushing, which then brings us far beneath the threshold
> - long running compactions can temporarily starve flushes
> - high volume of writes can starve compactions



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (KUDU-3318) Log Block Container metadata consumed too much disk space

2021-09-10 Thread Yingchun Lai (Jira)
Yingchun Lai created KUDU-3318:
--

 Summary: Log Block Container metadata consumed too much disk space
 Key: KUDU-3318
 URL: https://issues.apache.org/jira/browse/KUDU-3318
 Project: Kudu
  Issue Type: Improvement
  Components: fs
Reporter: Yingchun Lai


In a log block container, blocks in the .data file are append-only, and a 
related append-only .metadata file traces the blocks in .data. Metadata entries 
of CREATE type record new blocks, and entries of DELETE type mark the 
corresponding CREATE block as deleted.

If there is a pair of CREATE and DELETE entries for the same block id, LBM uses 
hole punching to reclaim the disk space in the .data file, but the entries in 
.metadata are never compacted except at bootstrap.

The only other bounds on metadata growth are the .data file offset reaching its 
size limit (10GB by default) and the block count in the metadata reaching its 
limit (unlimited by default).

I found a case in a production environment where the metadata consumed almost 
as much disk space as the .data files; this is wasteful, and it makes users 
confused and complain that the actual disk usage is far more than their data.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (KUDU-3318) Log Block Container metadata consumed too much disk space

2021-09-10 Thread Yingchun Lai (Jira)


[ 
https://issues.apache.org/jira/browse/KUDU-3318?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17413461#comment-17413461
 ] 

Yingchun Lai commented on KUDU-3318:


An easy way to resolve this problem is to add a size limit to the .metadata 
file too: when it reaches that limit (similar to the data file reaching its 
offset limit, or the block count reaching its number limit), the container 
refuses to append more blocks, and then after all of its blocks are deleted, 
the whole container will be removed.
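
A minimal sketch of that check, with assumed names and a hypothetical 32MB cap, rather than Kudu's actual LBM code:

{code:cpp}
#include <cstdint>

// Sketch only; names are assumptions, not Kudu's actual LBM interface.
struct Container {
  int64_t data_offset = 0;     // bytes used in .data
  int64_t metadata_bytes = 0;  // bytes used in .metadata
  int64_t live_blocks = 0;
  bool full = false;           // refuses new blocks once set
};

constexpr int64_t kDataSizeLimit = 10LL << 30;      // 10GB, existing limit
constexpr int64_t kMetadataSizeLimit = 32LL << 20;  // hypothetical 32MB cap

bool CanAppendBlock(const Container& c, int64_t block_len,
                    int64_t metadata_entry_len) {
  return !c.full &&
         c.data_offset + block_len <= kDataSizeLimit &&
         c.metadata_bytes + metadata_entry_len <= kMetadataSizeLimit;
}

// Once full and holding no live blocks, the whole container (both the .data
// and .metadata files) can be deleted, reclaiming the metadata space.
bool ContainerIsDead(const Container& c) {
  return c.full && c.live_blocks == 0;
}
{code}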

> Log Block Container metadata consumed too much disk space
> -
>
> Key: KUDU-3318
> URL: https://issues.apache.org/jira/browse/KUDU-3318
> Project: Kudu
>  Issue Type: Improvement
>  Components: fs
>Reporter: Yingchun Lai
>Priority: Major
>
> In a log block container, blocks in the .data file are append-only, and a 
> related append-only .metadata file traces the blocks in .data. Metadata 
> entries of CREATE type record new blocks, and entries of DELETE type mark the 
> corresponding CREATE block as deleted.
> If there is a pair of CREATE and DELETE entries for the same block id, LBM 
> uses hole punching to reclaim the disk space in the .data file, but the 
> entries in .metadata are never compacted except at bootstrap.
> The only other bounds on metadata growth are the .data file offset reaching 
> its size limit (10GB by default) and the block count in the metadata reaching 
> its limit (unlimited by default).
> I found a case in a production environment where the metadata consumed almost 
> as much disk space as the .data files; this is wasteful, and it makes users 
> confused and complain that the actual disk usage is far more than their data.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (KUDU-3318) Log Block Container metadata consumed too much disk space

2021-09-10 Thread Yingchun Lai (Jira)


[ 
https://issues.apache.org/jira/browse/KUDU-3318?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17413462#comment-17413462
 ] 

Yingchun Lai commented on KUDU-3318:


Another way to optimize the situation is to compact the metadata at runtime; 
currently it is only compacted at bootstrap.
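
A sketch of one possible trigger, again with assumed names: rewrite a container's metadata online once its live-entry ratio drops below a threshold.

{code:cpp}
#include <cstdint>

// Sketch with assumed names, not Kudu's actual code. Every deleted block
// contributes one CREATE and one DELETE entry; only live blocks' CREATE
// entries need to survive a rewrite, so a low live ratio means most of the
// .metadata file is reclaimable.
bool ShouldCompactMetadata(int64_t total_entries, int64_t live_blocks,
                           double min_live_ratio = 0.5) {
  if (total_entries == 0) return false;
  double live_ratio = static_cast<double>(live_blocks) / total_entries;
  return live_ratio < min_live_ratio;
}
{code}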

> Log Block Container metadata consumed too much disk space
> -
>
> Key: KUDU-3318
> URL: https://issues.apache.org/jira/browse/KUDU-3318
> Project: Kudu
>  Issue Type: Improvement
>  Components: fs
>Reporter: Yingchun Lai
>Priority: Major
>
> In a log block container, blocks in the .data file are append-only, and a 
> related append-only .metadata file traces the blocks in .data. Metadata 
> entries of CREATE type record new blocks, and entries of DELETE type mark the 
> corresponding CREATE block as deleted.
> If there is a pair of CREATE and DELETE entries for the same block id, LBM 
> uses hole punching to reclaim the disk space in the .data file, but the 
> entries in .metadata are never compacted except at bootstrap.
> The only other bounds on metadata growth are the .data file offset reaching 
> its size limit (10GB by default) and the block count in the metadata reaching 
> its limit (unlimited by default).
> I found a case in a production environment where the metadata consumed almost 
> as much disk space as the .data files; this is wasteful, and it makes users 
> confused and complain that the actual disk usage is far more than their data.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Comment Edited] (KUDU-3318) Log Block Container metadata consumed too much disk space

2021-09-10 Thread Yingchun Lai (Jira)


[ 
https://issues.apache.org/jira/browse/KUDU-3318?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17413462#comment-17413462
 ] 

Yingchun Lai edited comment on KUDU-3318 at 9/11/21, 3:03 AM:
--

Another way to optimize the situation is to compact the metadata at runtime; 
currently it is only compacted at bootstrap. We can implement it in the future.


was (Author: laiyingchun):
Another way to optimize the situation is to compact metadata at runtime, now it 
is only compact at bootstrap.

> Log Block Container metadata consumed too much disk space
> -
>
> Key: KUDU-3318
> URL: https://issues.apache.org/jira/browse/KUDU-3318
> Project: Kudu
>  Issue Type: Improvement
>  Components: fs
>Reporter: Yingchun Lai
>Priority: Major
>
> In a log block container, blocks in the .data file are append-only, and a 
> related append-only .metadata file traces the blocks in .data. Metadata 
> entries of CREATE type record new blocks, and entries of DELETE type mark the 
> corresponding CREATE block as deleted.
> If there is a pair of CREATE and DELETE entries for the same block id, LBM 
> uses hole punching to reclaim the disk space in the .data file, but the 
> entries in .metadata are never compacted except at bootstrap.
> The only other bounds on metadata growth are the .data file offset reaching 
> its size limit (10GB by default) and the block count in the metadata reaching 
> its limit (unlimited by default).
> I found a case in a production environment where the metadata consumed almost 
> as much disk space as the .data files; this is wasteful, and it makes users 
> confused and complain that the actual disk usage is far more than their data.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (KUDU-3318) Log Block Container metadata consumed too much disk space

2021-09-10 Thread Yingchun Lai (Jira)


 [ 
https://issues.apache.org/jira/browse/KUDU-3318?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yingchun Lai updated KUDU-3318:
---
Description: 
In a log block container, blocks in the .data file are append-only, and a 
related append-only .metadata file traces the blocks in .data. Metadata entries 
of CREATE type record new blocks, and entries of DELETE type mark the 
corresponding CREATE block as deleted.

If there is a pair of CREATE and DELETE entries for the same block id, LBM uses 
hole punching to reclaim the disk space in the .data file, but the entries in 
.metadata are never compacted except at bootstrap.

The only other bounds on metadata growth are the .data file offset reaching its 
size limit (10GB by default) and the block count in the metadata reaching its 
limit (unlimited by default).

I found a case in a production environment where the metadata consumed almost 
as much disk space as the .data files; this is wasteful, and it makes users 
confused and complain that the actual disk usage is far more than their data.

 
{code:java}
[root@hybrid01 data]# du -cs *.metadata | sort -n | tail
19072 fb58e00979914e95aae7184e3189c8c6.metadata
19092 5bbf54294d5948c4a695e240e81d5f80.metadata
19168 89da5f3c4dfa469a9935f091bced1856.metadata
19200 f27e6ff14bd44fd1838f63f1be35ee64.metadata
19256 7b87a5e3c7fa4d3d86dcd3945d6741e1.metadata
19256 cf054d1aa7cb4f5cbbbce3b99189bbe1.metadata
19496 a6cbb4a284b842deafe6939be051c77c.metadata
19568 ba749640df684cb8868d6e51ea3d1b17.metadata
19924 e5469080934746e58b0fd2ba29d69c9d.metadata
148954280 total

[root@hybrid01 data]# du -cs *.data | sort -n | tail
64568 46dfbc5ac94d429b8d79a536727495df.data
64568 b4abc59d4eb2473ca267e0b057c8fad7.data
65728 576e09ed7e164ddebe5b1702be296619.data
66368 88d295f38dec4197bfbc6927e0528bde.data
90904 7291e10aafe74f2792168f6146738c5d.data
96788 6e72381ae95840f99864baacbc9169af.data
98060 c413553491764d039e702577606bac02.data
103556 a5db7a9c2e93457aa06103e45f59d8b4.data
138200 3876af02694643d49b19b39789460759.data
176443948 total
{code}
 

 

  was:
In log block container, blocks in .data file are append only, there is a 
related append only .metadata file to trace blocks in .data, this type of 
entries in metadata are in CREATE type, the other type of entries in metadata 
are type of DELETE, it means mark the corresponding CREATE block as deleted.

If there is a pair of CREATE and DELETE entries of a same block id, LBM use 
hole punch to reclaim disk space in .data file, but the entries in .metadata 
will not be compacted except bootstrap.

Another way to limit metadata is the .data file offset reach its size 
limitation(default 10GB), or block number in metadata reach its limitation(no 
limit on default).

I found a case in product environment that metadata consumed too many disk 
space and near to .data's disk space, it's a waste, and make users confused and 
complain that the actual disk space is far more than user's data.


> Log Block Container metadata consumed too much disk space
> -
>
> Key: KUDU-3318
> URL: https://issues.apache.org/jira/browse/KUDU-3318
> Project: Kudu
>  Issue Type: Improvement
>  Components: fs
>Reporter: Yingchun Lai
>Priority: Major
>
> In a log block container, blocks in the .data file are append-only, and a 
> related append-only .metadata file traces the blocks in .data. Metadata 
> entries of CREATE type record new blocks, and entries of DELETE type mark the 
> corresponding CREATE block as deleted.
> If there is a pair of CREATE and DELETE entries for the same block id, LBM 
> uses hole punching to reclaim the disk space in the .data file, but the 
> entries in .metadata are never compacted except at bootstrap.
> The only other bounds on metadata growth are the .data file offset reaching 
> its size limit (10GB by default) and the block count in the metadata reaching 
> its limit (unlimited by default).
> I found a case in a production environment where the metadata consumed almost 
> as much disk space as the .data files; this is wasteful, and it makes users 
> confused and complain that the actual disk usage is far more than their data.
>  
> {code:java}
> [root@hybrid01 data]# du -cs *.metadata | sort -n | tail
> 19072 fb58e00979914e95aae7184e3189c8c6.metadata
> 19092 5bbf54294d5948c4a695e240e81d5f80.metadata
> 19168 89da5f3c4dfa469a9935f091bced1856.metadata
> 19200 f27e6ff14bd44fd1838f63f1be35ee64.metadata
> 19256 7b87a5e3c7fa4d3d86dcd3945d6741e1.metadata
> 19256 cf054d1aa7cb4f5cbbbce3b99189bbe1.metadata
> 19496 a6cbb4a284b842deafe6939be051c77c.metadata
> 19568 ba749640df684cb8868d6e51ea3d1b17.metadata
> 19924 e5469080934746e58b0fd2ba29d69c9d.metadata
> 148954280 total
> [root@hybrid01 data]# du -cs *.data | sort -n | tail
> 64568 46dfbc5ac94d429b8d79a536727495df.data
> 64568 b4abc59d4eb2473ca267e0b057c8fad7.da

[jira] [Updated] (KUDU-3318) Log Block Container metadata consumed too much disk space

2021-09-10 Thread Yingchun Lai (Jira)


 [ 
https://issues.apache.org/jira/browse/KUDU-3318?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yingchun Lai updated KUDU-3318:
---
Description: 
In a log block container, blocks in the .data file are append-only, and a 
related append-only .metadata file traces the blocks in .data. Metadata entries 
of CREATE type record new blocks, and entries of DELETE type mark the 
corresponding CREATE block as deleted.

If there is a pair of CREATE and DELETE entries for the same block id, LBM uses 
hole punching to reclaim the disk space in the .data file, but the entries in 
.metadata are never compacted except at bootstrap.

The only other bounds on metadata growth are the .data file offset reaching its 
size limit (10GB by default) and the block count in the metadata reaching its 
limit (unlimited by default).

I found a case in a production environment where the metadata consumed almost 
as much disk space as the .data files; this is wasteful, and it makes users 
confused and complain that the actual disk usage is far more than their data.

 
{code:java}
[root@hybrid01 data]# du -cs *.metadata | sort -n | tail
19072 fb58e00979914e95aae7184e3189c8c6.metadata
19092 5bbf54294d5948c4a695e240e81d5f80.metadata
19168 89da5f3c4dfa469a9935f091bced1856.metadata
19200 f27e6ff14bd44fd1838f63f1be35ee64.metadata
19256 7b87a5e3c7fa4d3d86dcd3945d6741e1.metadata
19256 cf054d1aa7cb4f5cbbbce3b99189bbe1.metadata
19496 a6cbb4a284b842deafe6939be051c77c.metadata
19568 ba749640df684cb8868d6e51ea3d1b17.metadata
19924 e5469080934746e58b0fd2ba29d69c9d.metadata
148954280 total// all metadata size ~149GB

[root@hybrid01 data]# du -cs *.data | sort -n | tail
64568 46dfbc5ac94d429b8d79a536727495df.data
64568 b4abc59d4eb2473ca267e0b057c8fad7.data
65728 576e09ed7e164ddebe5b1702be296619.data
66368 88d295f38dec4197bfbc6927e0528bde.data
90904 7291e10aafe74f2792168f6146738c5d.data
96788 6e72381ae95840f99864baacbc9169af.data
98060 c413553491764d039e702577606bac02.data
103556 a5db7a9c2e93457aa06103e45f59d8b4.data
138200 3876af02694643d49b19b39789460759.data
176443948 total // all data size ~176GB

[root@hybrid01 data]# kudu pbc dump e5469080934746e58b0fd2ba29d69c9d.metadata 
--oneline | awk '{print $5}' | sort | uniq -c | egrep -v " 2 "
 1 6165611810 // low live ratio, only 1 live block
{code}
 

 

  was:
In log block container, blocks in .data file are append only, there is a 
related append only .metadata file to trace blocks in .data, this type of 
entries in metadata are in CREATE type, the other type of entries in metadata 
are type of DELETE, it means mark the corresponding CREATE block as deleted.

If there is a pair of CREATE and DELETE entries of a same block id, LBM use 
hole punch to reclaim disk space in .data file, but the entries in .metadata 
will not be compacted except bootstrap.

Another way to limit metadata is the .data file offset reach its size 
limitation(default 10GB), or block number in metadata reach its limitation(no 
limit on default).

I found a case in product environment that metadata consumed too many disk 
space and near to .data's disk space, it's a waste, and make users confused and 
complain that the actual disk space is far more than user's data.

 
{code:java}
[root@hybrid01 data]# du -cs *.metadata | sort -n | tail
19072 fb58e00979914e95aae7184e3189c8c6.metadata
19092 5bbf54294d5948c4a695e240e81d5f80.metadata
19168 89da5f3c4dfa469a9935f091bced1856.metadata
19200 f27e6ff14bd44fd1838f63f1be35ee64.metadata
19256 7b87a5e3c7fa4d3d86dcd3945d6741e1.metadata
19256 cf054d1aa7cb4f5cbbbce3b99189bbe1.metadata
19496 a6cbb4a284b842deafe6939be051c77c.metadata
19568 ba749640df684cb8868d6e51ea3d1b17.metadata
19924 e5469080934746e58b0fd2ba29d69c9d.metadata
148954280 total

[root@hybrid01 data]# du -cs *.data | sort -n | tail
64568 46dfbc5ac94d429b8d79a536727495df.data
64568 b4abc59d4eb2473ca267e0b057c8fad7.data
65728 576e09ed7e164ddebe5b1702be296619.data
66368 88d295f38dec4197bfbc6927e0528bde.data
90904 7291e10aafe74f2792168f6146738c5d.data
96788 6e72381ae95840f99864baacbc9169af.data
98060 c413553491764d039e702577606bac02.data
103556 a5db7a9c2e93457aa06103e45f59d8b4.data
138200 3876af02694643d49b19b39789460759.data
176443948 total
{code}
 

 


> Log Block Container metadata consumed too much disk space
> -
>
> Key: KUDU-3318
> URL: https://issues.apache.org/jira/browse/KUDU-3318
> Project: Kudu
>  Issue Type: Improvement
>  Components: fs
>Reporter: Yingchun Lai
>Priority: Major
>
> In a log block container, blocks in the .data file are append-only, and a 
> related append-only .metadata file traces the blocks in .data. Metadata 
> entries of CREATE type record new blocks, and entries of DELETE type mark the 
> corresponding CREATE block as deleted.
> If there is a pair of CREATE and DE

[jira] [Comment Edited] (KUDU-3318) Log Block Container metadata consumed too much disk space

2021-09-10 Thread Yingchun Lai (Jira)


[ 
https://issues.apache.org/jira/browse/KUDU-3318?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17413461#comment-17413461
 ] 

Yingchun Lai edited comment on KUDU-3318 at 9/11/21, 3:29 AM:
--

An easy way to resolve this problem is to add a size limit to the .metadata 
file too: when it reaches that limit (similar to the data file reaching its 
offset limit, or the block count reaching its number limit), the container 
refuses to append more blocks, and then after all of its blocks are deleted, 
the whole container will be removed.


was (Author: laiyingchun):
A easy way to resolve this problem si to add a limitation to .metadata file 
too, when it reach that limit size(similar to data file reach its limit offset, 
or block number reach its number limit), the container is refused to append 
more blocks, and then after all blocks are deleted, the whole container will be 
removed.

> Log Block Container metadata consumed too much disk space
> -
>
> Key: KUDU-3318
> URL: https://issues.apache.org/jira/browse/KUDU-3318
> Project: Kudu
>  Issue Type: Improvement
>  Components: fs
>Reporter: Yingchun Lai
>Priority: Major
>
> In a log block container, blocks in the .data file are append-only, and a 
> related append-only .metadata file traces the blocks in .data. Metadata 
> entries of CREATE type record new blocks, and entries of DELETE type mark the 
> corresponding CREATE block as deleted.
> If there is a pair of CREATE and DELETE entries for the same block id, LBM 
> uses hole punching to reclaim the disk space in the .data file, but the 
> entries in .metadata are never compacted except at bootstrap.
> The only other bounds on metadata growth are the .data file offset reaching 
> its size limit (10GB by default) and the block count in the metadata reaching 
> its limit (unlimited by default).
> I found a case in a production environment where the metadata consumed almost 
> as much disk space as the .data files; this is wasteful, and it makes users 
> confused and complain that the actual disk usage is far more than their data.
>  
> {code:java}
> [root@hybrid01 data]# du -cs *.metadata | sort -n | tail
> 19072 fb58e00979914e95aae7184e3189c8c6.metadata
> 19092 5bbf54294d5948c4a695e240e81d5f80.metadata
> 19168 89da5f3c4dfa469a9935f091bced1856.metadata
> 19200 f27e6ff14bd44fd1838f63f1be35ee64.metadata
> 19256 7b87a5e3c7fa4d3d86dcd3945d6741e1.metadata
> 19256 cf054d1aa7cb4f5cbbbce3b99189bbe1.metadata
> 19496 a6cbb4a284b842deafe6939be051c77c.metadata
> 19568 ba749640df684cb8868d6e51ea3d1b17.metadata
> 19924 e5469080934746e58b0fd2ba29d69c9d.metadata
> 148954280 total// all metadata size ~149GB
> [root@hybrid01 data]# du -cs *.data | sort -n | tail
> 64568 46dfbc5ac94d429b8d79a536727495df.data
> 64568 b4abc59d4eb2473ca267e0b057c8fad7.data
> 65728 576e09ed7e164ddebe5b1702be296619.data
> 66368 88d295f38dec4197bfbc6927e0528bde.data
> 90904 7291e10aafe74f2792168f6146738c5d.data
> 96788 6e72381ae95840f99864baacbc9169af.data
> 98060 c413553491764d039e702577606bac02.data
> 103556 a5db7a9c2e93457aa06103e45f59d8b4.data
> 138200 3876af02694643d49b19b39789460759.data
> 176443948 total // all data size ~176GB
> [root@hybrid01 data]# kudu pbc dump e5469080934746e58b0fd2ba29d69c9d.metadata 
> --oneline | awk '{print $5}' | sort | uniq -c | egrep -v " 2 "
>  1 6165611810 // low live ratio, only 1 live block
> {code}
>  
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Resolved] (KUDU-3318) Log Block Container metadata consumed too much disk space

2021-09-14 Thread Yingchun Lai (Jira)


 [ 
https://issues.apache.org/jira/browse/KUDU-3318?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yingchun Lai resolved KUDU-3318.

Fix Version/s: 1.16.0
   Resolution: Fixed

> Log Block Container metadata consumed too much disk space
> -
>
> Key: KUDU-3318
> URL: https://issues.apache.org/jira/browse/KUDU-3318
> Project: Kudu
>  Issue Type: Improvement
>  Components: fs
>Reporter: Yingchun Lai
>Priority: Major
> Fix For: 1.16.0
>
>
> In a log block container, blocks in the .data file are append-only, and a 
> related append-only .metadata file traces the blocks in .data. Metadata 
> entries of CREATE type record new blocks, and entries of DELETE type mark the 
> corresponding CREATE block as deleted.
> If there is a pair of CREATE and DELETE entries for the same block id, LBM 
> uses hole punching to reclaim the disk space in the .data file, but the 
> entries in .metadata are never compacted except at bootstrap.
> The only other bounds on metadata growth are the .data file offset reaching 
> its size limit (10GB by default) and the block count in the metadata reaching 
> its limit (unlimited by default).
> I found a case in a production environment where the metadata consumed almost 
> as much disk space as the .data files; this is wasteful, and it makes users 
> confused and complain that the actual disk usage is far more than their data.
>  
> {code:java}
> [root@hybrid01 data]# du -cs *.metadata | sort -n | tail
> 19072 fb58e00979914e95aae7184e3189c8c6.metadata
> 19092 5bbf54294d5948c4a695e240e81d5f80.metadata
> 19168 89da5f3c4dfa469a9935f091bced1856.metadata
> 19200 f27e6ff14bd44fd1838f63f1be35ee64.metadata
> 19256 7b87a5e3c7fa4d3d86dcd3945d6741e1.metadata
> 19256 cf054d1aa7cb4f5cbbbce3b99189bbe1.metadata
> 19496 a6cbb4a284b842deafe6939be051c77c.metadata
> 19568 ba749640df684cb8868d6e51ea3d1b17.metadata
> 19924 e5469080934746e58b0fd2ba29d69c9d.metadata
> 148954280 total// all metadata size ~149GB
> [root@hybrid01 data]# du -cs *.data | sort -n | tail
> 64568 46dfbc5ac94d429b8d79a536727495df.data
> 64568 b4abc59d4eb2473ca267e0b057c8fad7.data
> 65728 576e09ed7e164ddebe5b1702be296619.data
> 66368 88d295f38dec4197bfbc6927e0528bde.data
> 90904 7291e10aafe74f2792168f6146738c5d.data
> 96788 6e72381ae95840f99864baacbc9169af.data
> 98060 c413553491764d039e702577606bac02.data
> 103556 a5db7a9c2e93457aa06103e45f59d8b4.data
> 138200 3876af02694643d49b19b39789460759.data
> 176443948 total // all data size ~176GB
> [root@hybrid01 data]# kudu pbc dump e5469080934746e58b0fd2ba29d69c9d.metadata 
> --oneline | awk '{print $5}' | sort | uniq -c | egrep -v " 2 "
>  1 6165611810 // low live ratio, only 1 live block
> {code}
>  
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (KUDU-3332) Master coredump when add columns after unsafe_rebuild master

2021-10-28 Thread Yingchun Lai (Jira)
Yingchun Lai created KUDU-3332:
--

 Summary: Master coredump when add columns after unsafe_rebuild 
master
 Key: KUDU-3332
 URL: https://issues.apache.org/jira/browse/KUDU-3332
 Project: Kudu
  Issue Type: Bug
  Components: CLI
Affects Versions: NA
Reporter: Yingchun Lai


When doing master unsafe_rebuild, tables' next_column_id is set to 
(2^31 - 1) / 2, i.e. 2^30 - 1.

After that, newly added columns get ids 2^30 - 1, 2^30, 2^30 + 1, ... We use an 
IdMapping to map a column id to its index, like 2^30 - 1 -> 200, 2^30 -> 201, 
2^30 + 1 -> 202.

However, the IdMapping implementation uses a vector to store all the k-v pairs, 
and the key is roughly the index into that vector. So we have to allocate a 
very large vector to hold a column id like 2^30, and furthermore, the vector 
doubles in size whenever its capacity is not enough.

When the column id is 2^30, the doubled size is 2^31, which overflows and 
causes the master to crash.
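
The arithmetic of the crash is easy to reproduce in isolation; a minimal sketch of the failure mode, using a simplified stand-in rather than the real IdMapping:

{code:cpp}
#include <cstdint>
#include <iostream>

// A direct-indexed table whose 32-bit capacity doubles until it covers the
// key. With keys near 2^30, the next doubling computes 2^31, which overflows
// int32_t (undefined behavior; in practice it wraps negative).
int main() {
  int32_t capacity = 1;
  const int32_t column_id = (INT32_MAX - 1) / 2 + 1;  // 2^30, post-rebuild id
  while (capacity <= column_id) {
    capacity *= 2;  // 2^30 * 2 == 2^31: overflow
    if (capacity <= 0) {
      std::cout << "overflowed, capacity is now " << capacity << "\n";
      return 1;
    }
  }
  return 0;
}
{code}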

 

 



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (KUDU-3332) Master coredump when add columns after unsafe_rebuild master

2021-10-28 Thread Yingchun Lai (Jira)


[ 
https://issues.apache.org/jira/browse/KUDU-3332?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17435537#comment-17435537
 ] 

Yingchun Lai commented on KUDU-3332:


!image-2021-10-28-20-24-25-742.png!

> Master coredump when add columns after unsafe_rebuild master
> 
>
> Key: KUDU-3332
> URL: https://issues.apache.org/jira/browse/KUDU-3332
> Project: Kudu
>  Issue Type: Bug
>  Components: CLI
>Affects Versions: NA
>Reporter: Yingchun Lai
>Priority: Major
>
> When doing master unsafe_rebuild, tables' next_column_id is set to 
> (2^31 - 1) / 2, i.e. 2^30 - 1.
> After that, newly added columns get ids 2^30 - 1, 2^30, 2^30 + 1, ... We use 
> an IdMapping to map a column id to its index, like 2^30 - 1 -> 200, 
> 2^30 -> 201, 2^30 + 1 -> 202.
> However, the IdMapping implementation uses a vector to store all the k-v 
> pairs, and the key is roughly the index into that vector. So we have to 
> allocate a very large vector to hold a column id like 2^30, and furthermore, 
> the vector doubles in size whenever its capacity is not enough.
> When the column id is 2^30, the doubled size is 2^31, which overflows and 
> causes the master to crash.
>  
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Resolved] (KUDU-3332) Master coredump when add columns after unsafe_rebuild master

2021-11-01 Thread Yingchun Lai (Jira)


 [ 
https://issues.apache.org/jira/browse/KUDU-3332?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yingchun Lai resolved KUDU-3332.

Fix Version/s: 1.16.0
   Resolution: Fixed

> Master coredump when add columns after unsafe_rebuild master
> 
>
> Key: KUDU-3332
> URL: https://issues.apache.org/jira/browse/KUDU-3332
> Project: Kudu
>  Issue Type: Bug
>  Components: CLI
>Affects Versions: NA
>Reporter: Yingchun Lai
>Priority: Major
> Fix For: 1.16.0
>
>
> When doing master unsafe_rebuild, tables' next_column_id is set to 
> (2^31 - 1) / 2, i.e. 2^30 - 1.
> After that, newly added columns get ids 2^30 - 1, 2^30, 2^30 + 1, ... We use 
> an IdMapping to map a column id to its index, like 2^30 - 1 -> 200, 
> 2^30 -> 201, 2^30 + 1 -> 202.
> However, the IdMapping implementation uses a vector to store all the k-v 
> pairs, and the key is roughly the index into that vector. So we have to 
> allocate a very large vector to hold a column id like 2^30, and furthermore, 
> the vector doubles in size whenever its capacity is not enough.
> When the column id is 2^30, the doubled size is 2^31, which overflows and 
> causes the master to crash.
>  
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Closed] (KUDU-3332) Master coredump when add columns after unsafe_rebuild master

2021-11-01 Thread Yingchun Lai (Jira)


 [ 
https://issues.apache.org/jira/browse/KUDU-3332?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yingchun Lai closed KUDU-3332.
--

> Master coredump when add columns after unsafe_rebuild master
> 
>
> Key: KUDU-3332
> URL: https://issues.apache.org/jira/browse/KUDU-3332
> Project: Kudu
>  Issue Type: Bug
>  Components: CLI
>Affects Versions: NA
>Reporter: Yingchun Lai
>Priority: Major
> Fix For: 1.16.0
>
>
> When doing master unsafe_rebuild, tables' next_column_id is set to 
> (2^31 - 1) / 2, i.e. 2^30 - 1.
> After that, newly added columns get ids 2^30 - 1, 2^30, 2^30 + 1, ... We use 
> an IdMapping to map a column id to its index, like 2^30 - 1 -> 200, 
> 2^30 -> 201, 2^30 + 1 -> 202.
> However, the IdMapping implementation uses a vector to store all the k-v 
> pairs, and the key is roughly the index into that vector. So we have to 
> allocate a very large vector to hold a column id like 2^30, and furthermore, 
> the vector doubles in size whenever its capacity is not enough.
> When the column id is 2^30, the doubled size is 2^31, which overflows and 
> causes the master to crash.
>  
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (KUDU-3290) Implement Replicate table's data to Kafka(or other Storage System)

2021-11-11 Thread Yingchun Lai (Jira)


[ 
https://issues.apache.org/jira/browse/KUDU-3290?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17442194#comment-17442194
 ] 

Yingchun Lai commented on KUDU-3290:


In some cases there may be multiple upstreams, and we want to 'hot backup' all 
of their data into a single downstream with low latency; this feature would be 
helpful there.

I've read the doc and left some comments, and [~shenxingwuying] has improved 
some points of the design.

[~awong], could you give more suggestions about it?

> Implement Replicate table's data to Kafka(or other Storage System)
> --
>
> Key: KUDU-3290
> URL: https://issues.apache.org/jira/browse/KUDU-3290
> Project: Kudu
>  Issue Type: New Feature
>  Components: tserver
>Reporter: shenxingwuying
>Priority: Critical
>
> h1. background & problem
> We use Kudu to store user profile data. Because of business requirements to 
> exchange and share data among multi-tenant users, which is reasonable in our 
> application scenario, we need to replicate data from one system to another. 
> We pick Kafka as the destination storage system because of our company's 
> current architecture.
> At this time, we have two ideas to solve it.
> h1. two replication schemes
> Generally, a Raft group has three replicas: one is the leader and the other 
> two are followers. We'll add a replica whose role is Learner. A Learner only 
> receives all the data and does not participate in leader election.
> The learner replica's state machine will be a plugin system, e.g.:
>  # We can support a KuduEngine, which is just a data backup, like MongoDB's 
> hidden replica.
>  # We can write to a third-party storage system, like Kafka or any other 
> system we need. Then we can replicate data to another system using its client.
> Paxos has a learner role which only receives data; we need such a role for 
> the new membership.
> But in Kudu, Learner has been used for copying (recovering) tablet replicas, 
> so maybe we need a new role name; for now, we still use Learner to represent 
> the new role. (We should think over the new role name.)
> In our application scenario, we will replicate data to Kafka, and I will 
> explain the method.
> h2. Learner replication
>  # Add a new replica role; maybe we call it learner, because Paxos has a 
> learner role which only receives data. We need such a role for the new 
> membership. But in Kudu, Learner has been used for copying (recovering) 
> tablet replicas, so maybe we need a new role name; for now, we still use 
> Learner to represent the new role. (We should think over the new role name.)
>  # The voters' safepoint for cleaning obsolete WAL is min(leader's max WAL 
> sequence number, followers' max WAL sequence numbers, learner's max WAL 
> sequence number).
>  # The learner is not a voter and does not participate in elections.
>  # Raft can replicate data to the learner.
>  # The learner's apply process is just like a Raft follower's: the log 
> entries before the committed index are replicated to Kafka, and once Kafka 
> responds OK, the apply index advances.
>  # We need a Kafka client; it will be added to Kudu as an option, maybe as a 
> compile option.
>  # When a kudu-tserver is decommissioned or corrupted, the learner must move 
> to a new kudu-tserver. So the leader should save the learner's applied OpId 
> and replicate it to the followers, for the learner's failover when the leader 
> goes down.
>  # The leader must save the learners' applied OpIds and replicate them to the 
> followers, so that the learner's recovery loses no data when the leader goes 
> down. If the leader does not save the apply index, the learner may lose data.
>  # Followers save the learners' apply index and term, because a follower may 
> become the leader.
>  # When the load balancer is running, we should support moving the learner to 
> another kudu-tserver.
>  # A table should have a switch option to determine whether its Raft group 
> has a learner, settable when creating the table.
>  # Altering a table to add learners may be an idea, but we need to solve the 
> base data migration problem.
>  # Base data migration. The simple but heavy-cost way: when the learner's 
> max_OpId < committed_OpId (maybe data was lost, or maybe we alter an existing 
> table to add learner replication), we can trigger a full scan at that 
> timestamp, replicate the data to the learner, and then resume the 
> AppendEntries flow.
>  # Kudu does not support split and merge, so we do not discuss them now. If 
> Kudu supported split or merge, we could implement them using 12; of course we 
> could use a better method.
>  # If we need this function, our cluster should have at least 4 tservers.
> If Kafka fails or the topic does not exist, the learner will stop replicating 
> the WAL, which will occupy more disk space. If the learner is lost or 
> corrupted, it can recover from the leader. We need to make sure of the 
> safepoint.
> h2. Leader replication
> We can replicate data to Kafka or any other storage system from the leader 
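
As a sketch of the WAL GC safepoint rule from the list above (assumed types and names, not Kudu's consensus code): log entries may only be garbage-collected once every voter and the learner no longer need them.

{code:cpp}
#include <algorithm>
#include <cstdint>
#include <vector>

// Sketch only: the WAL GC safepoint must also cover the learner, otherwise
// entries it still needs to replicate to Kafka could be deleted.
int64_t WalGcSafepoint(int64_t leader_index,
                       const std::vector<int64_t>& follower_indexes,
                       int64_t learner_applied_index) {
  int64_t safepoint = std::min(leader_index, learner_applied_index);
  for (int64_t idx : follower_indexes) {
    safepoint = std::min(safepoint, idx);
  }
  return safepoint;  // WAL at or below this index may be GC-ed
}
{code}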
>

[jira] [Created] (KUDU-3353) Support setnx semantic on column

2022-02-17 Thread Yingchun Lai (Jira)
Yingchun Lai created KUDU-3353:
--

 Summary: Support setnx semantic on column
 Key: KUDU-3353
 URL: https://issues.apache.org/jira/browse/KUDU-3353
 Project: Kudu
  Issue Type: New Feature
  Components: api, server
Reporter: Yingchun Lai






--
This message was sent by Atlassian Jira
(v8.20.1#820001)


[jira] [Updated] (KUDU-3353) Support setnx semantic on column

2022-02-17 Thread Yingchun Lai (Jira)


 [ 
https://issues.apache.org/jira/browse/KUDU-3353?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yingchun Lai updated KUDU-3353:
---
Description: 
h1. motivation

In some usage scenarios, a Kudu table has a column with "create time" 
semantics, meaning it represents the creation timestamp of the row. The other 
columns have the usual semantics, for example user properties like age, 
address, etc.

The upstream system and the Kudu user don't know whether a row already exists, 
and every cell value is the latest one ingested from, for example, an event 
stream.

Without the "create time" column, the Kudu user can use UPSERT operations to 
write data to the table, and every column with data will overwrite the old 
data. But with the "create time" column, the cell value will be overwritten by 
the following UPSERT ops, which is not what we expect.

To achieve the goal, we have to 

> Support setnx semantic on column
> 
>
> Key: KUDU-3353
> URL: https://issues.apache.org/jira/browse/KUDU-3353
> Project: Kudu
>  Issue Type: New Feature
>  Components: api, server
>Reporter: Yingchun Lai
>Priority: Major
>
> h1. motivation
> In some usage scenarios, a Kudu table has a column with "create time" 
> semantics, meaning it represents the creation timestamp of the row. The other 
> columns have the usual semantics, for example user properties like age, 
> address, etc.
> The upstream system and the Kudu user don't know whether a row already 
> exists, and every cell value is the latest one ingested from, for example, an 
> event stream.
> Without the "create time" column, the Kudu user can use UPSERT operations to 
> write data to the table, and every column with data will overwrite the old 
> data. But with the "create time" column, the cell value will be overwritten 
> by the following UPSERT ops, which is not what we expect.
> To achieve the goal, we have to 



--
This message was sent by Atlassian Jira
(v8.20.1#820001)


[jira] [Updated] (KUDU-3353) Support setnx semantic on column

2022-02-17 Thread Yingchun Lai (Jira)


 [ 
https://issues.apache.org/jira/browse/KUDU-3353?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yingchun Lai updated KUDU-3353:
---
Description: 
h1. motivation

In some usage scenarios, a Kudu table has a column with "create time" 
semantics, meaning it represents the creation timestamp of the row. The other 
columns have the usual semantics, for example user properties like age, 
address, etc.

The upstream system and the Kudu user don't know whether a row already exists, 
and every cell value is the latest one ingested from, for example, an event 
stream.

Without the "create time" column, the Kudu user can use UPSERT operations to 
write data to the table, and every column with data will overwrite the old 
data. But with the "create time" column, the cell value will be overwritten by 
the following UPSERT ops, which is not what we expect.

To achieve the goal, we have to read the column back to judge whether it is 
NULL: if it's NULL we can fill the cell, and if it's not NULL we drop it from 
the data before the UPSERT, to avoid overwriting "create time".

This is expensive; is there a way to avoid a read from Kudu?
h1. Resolution

We can implement a column schema with "update if null" semantics. That means 
cell data in the changelist will update the base data only if the latter is 
NULL, and updates will be ignored if it is not NULL.

So we can use Kudu as before, only defining the column as "update if null" 
when creating the table or adding the column.

 

  was:
h1. motivation

In some usage scenarios, Kudu table has a column with semantic of "create 
time", which means it represent the create timestamp of the row. The other 
columns have the similar semantic as before, for example, the user properties 
like age, address, and etc.

Upstream and Kudu user doesn't know whether a row is exist or not, and every 
cell data is the lastest ingested from, for example, event stream.

If without the "create time" column, Kudu user can use UPSERT operations to 
write data to the table, every columns with data will overwrite the old data. 
But if with the "create time" column, the cell data will be overwrote by the 
following UPSERT ops, which is not what we expect.

To achive the goal, we have to 


> Support setnx semantic on column
> 
>
> Key: KUDU-3353
> URL: https://issues.apache.org/jira/browse/KUDU-3353
> Project: Kudu
>  Issue Type: New Feature
>  Components: api, server
>Reporter: Yingchun Lai
>Priority: Major
>
> h1. motivation
> In some usage scenarios, a Kudu table has a column with "create time" 
> semantics, meaning it represents the creation timestamp of the row. The other 
> columns have the usual semantics, for example user properties like age, 
> address, etc.
> The upstream system and the Kudu user don't know whether a row already 
> exists, and every cell value is the latest one ingested from, for example, an 
> event stream.
> Without the "create time" column, the Kudu user can use UPSERT operations to 
> write data to the table, and every column with data will overwrite the old 
> data. But with the "create time" column, the cell value will be overwritten 
> by the following UPSERT ops, which is not what we expect.
> To achieve the goal, we have to read the column back to judge whether it is 
> NULL: if it's NULL we can fill the cell, and if it's not NULL we drop it from 
> the data before the UPSERT, to avoid overwriting "create time".
> This is expensive; is there a way to avoid a read from Kudu?
> h1. Resolution
> We can implement a column schema with "update if null" semantics. That means 
> cell data in the changelist will update the base data only if the latter is 
> NULL, and updates will be ignored if it is not NULL.
> So we can use Kudu as before, only defining the column as "update if null" 
> when creating the table or adding the column.
>  



--
This message was sent by Atlassian Jira
(v8.20.1#820001)


[jira] [Commented] (KUDU-3353) Support setnx semantic on column

2022-02-20 Thread Yingchun Lai (Jira)


[ 
https://issues.apache.org/jira/browse/KUDU-3353?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17495292#comment-17495292
 ] 

Yingchun Lai commented on KUDU-3353:


I should clarify that no matter what the schema is, whether it includes SETNX 
columns or not, the update ops will update the row anyway; the difference is 
whether the cells are updated or not.

Suppose a table with this schema:
{code:java}
TABLE test (
    key INT64 NOT NULL,
    value1 INT64 NULLABLE,
    value2 INT64 NULLABLE UPDATE_IF_NULL,   // this is a SETNX column
    PRIMARY KEY (key)
) ...{code}
case 1: the upsert ops on the table are:
{code:java}
upsert1: 1, 2, 3
upsert2: 1, 20, 30{code}
Then the result will be '1, 20, 3'. (30 will not overwrite 3 because 3 is not 
NULL.)

case 2: the upsert ops on the table are:
{code:java}
upsert1: 1, 2, null
upsert2: 1, 20, 30{code}
Then the result will be '1, 20, 30'. (30 will be applied because the cell is 
NULL.)

All the cells in upsert/update ops will be kept in the changelist as before; 
the difference is the behavior of the delta applier: overwrite the cell or 
ignore it.
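
To make the two cases concrete, here is a minimal sketch of the per-cell apply decision; the types and function are assumptions for illustration, not Kudu's actual delta-apply code.

{code:cpp}
#include <cstdint>
#include <iostream>
#include <optional>

// Sketch with assumed types: applying one upserted cell to the base data.
// For an UPDATE_IF_NULL (SETNX) column, a non-NULL base value wins.
using Cell = std::optional<int64_t>;

void ApplyCell(bool update_if_null, Cell* base, const Cell& incoming) {
  if (update_if_null && base->has_value()) {
    return;  // case 1: base is 3 (not NULL), incoming 30 is ignored
  }
  *base = incoming;  // case 2: base is NULL, incoming 30 is applied
}

int main() {
  Cell v2 = 3;
  ApplyCell(/*update_if_null=*/true, &v2, Cell(30));
  std::cout << *v2 << "\n";  // prints 3

  Cell v2b;  // NULL
  ApplyCell(/*update_if_null=*/true, &v2b, Cell(30));
  std::cout << *v2b << "\n";  // prints 30
  return 0;
}
{code}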

 

> Support setnx semantic on column
> 
>
> Key: KUDU-3353
> URL: https://issues.apache.org/jira/browse/KUDU-3353
> Project: Kudu
>  Issue Type: New Feature
>  Components: api, server
>Reporter: Yingchun Lai
>Priority: Major
>
> h1. motivation
> In some usage scenarios, a Kudu table has a column with "create time" 
> semantics, meaning it represents the creation timestamp of the row. The other 
> columns have the usual semantics, for example user properties like age, 
> address, etc.
> The upstream system and the Kudu user don't know whether a row already 
> exists, and every cell value is the latest one ingested from, for example, an 
> event stream.
> Without the "create time" column, the Kudu user can use UPSERT operations to 
> write data to the table, and every column with data will overwrite the old 
> data. But with the "create time" column, the cell value will be overwritten 
> by the following UPSERT ops, which is not what we expect.
> To achieve the goal, we have to read the column back to judge whether it is 
> NULL: if it's NULL we can fill the cell, and if it's not NULL we drop it from 
> the data before the UPSERT, to avoid overwriting "create time".
> This is expensive; is there a way to avoid a read from Kudu?
> h1. Resolution
> We can implement a column schema with "update if null" semantics. That means 
> cell data in the changelist will update the base data only if the latter is 
> NULL, and updates will be ignored if it is not NULL.
> So we can use Kudu as before, only defining the column as "update if null" 
> when creating the table or adding the column.
>  



--
This message was sent by Atlassian Jira
(v8.20.1#820001)


[jira] [Comment Edited] (KUDU-3353) Support setnx semantic on column

2022-02-20 Thread Yingchun Lai (Jira)


[ 
https://issues.apache.org/jira/browse/KUDU-3353?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17495292#comment-17495292
 ] 

Yingchun Lai edited comment on KUDU-3353 at 2/21/22, 3:18 AM:
--

I should clarify that no matter what the schema is, whether it includes SETNX 
columns or not, the update ops will update the row anyway; the difference is 
whether the cells are updated or not.

Suppose a table with this schema:
{code:java}
TABLE test (
    key INT64 NOT NULL,
    value1 INT64 NULLABLE,
    value2 INT64 NULLABLE UPDATE_IF_NULL,   // this is a SETNX column
    PRIMARY KEY (key)
) ...{code}
case 1: the upsert ops on the table are:
{code:java}
upsert1: 1, 2, 3
upsert2: 1, 20, 30{code}
Then the result will be '1, 20, 3'. (30 will not overwrite 3 because 3 is not 
NULL.)

case 2: the upsert ops on the table are:
{code:java}
upsert1: 1, 2, null
upsert2: 1, 20, 30{code}
Then the result will be '1, 20, 30'. (30 will be applied because the cell is 
NULL.)

All the cells in upsert/update ops will be kept in the changelist as before; 
the difference is the behavior of the delta applier: overwrite the cell or 
ignore it.


was (Author: laiyingchun):
I should clarify that no matter what the schema is, include SETNX columns or 
not, the update ops will update the row anyway, the difference is whether to 
update the cells or not.

Suppose a table with schema:

 
{code:java}
TABLE test (
    key INT64 NOT NULL,
    value1 INT64 NULLABLE,
    value2 INT64 NULLABLE UPDATE_IF_NULL,   // this is a SETNX column
    PRIMARY KEY (key)
) ...{code}
case 1: upsert ops on the table are:

 

 
{code:java}
upsert1: 1, 2, 3
upsert2: 1, 20, 30{code}
Then the result will be '1, 20, 3'. (30 will not overwite 3 because it's not 
NULL)

case 2: upsert ops on the table are:
{code:java}
upsert1: 1, 2, null
upsert2: 1, 20, 30{code}
Then the result will be '1, 20, 30'. (30 will be update because it's NULL)

All the cells in upsert/update ops will be kept as before in changelist, the 
difference is the behavior of delta applier, overwrite the cell or ignore.

 

> Support setnx semantic on column
> 
>
> Key: KUDU-3353
> URL: https://issues.apache.org/jira/browse/KUDU-3353
> Project: Kudu
>  Issue Type: New Feature
>  Components: api, server
>Reporter: Yingchun Lai
>Priority: Major
>
> h1. motivation
> In some usage scenarios, a Kudu table has a column with "create time" 
> semantics, meaning it represents the creation timestamp of the row. The other 
> columns have the usual semantics, for example user properties like age, 
> address, etc.
> The upstream system and the Kudu user don't know whether a row already 
> exists, and every cell value is the latest one ingested from, for example, an 
> event stream.
> Without the "create time" column, the Kudu user can use UPSERT operations to 
> write data to the table, and every column with data will overwrite the old 
> data. But with the "create time" column, the cell value will be overwritten 
> by the following UPSERT ops, which is not what we expect.
> To achieve the goal, we have to read the column back to judge whether it is 
> NULL: if it's NULL we can fill the cell, and if it's not NULL we drop it from 
> the data before the UPSERT, to avoid overwriting "create time".
> This is expensive; is there a way to avoid a read from Kudu?
> h1. Resolution
> We can implement a column schema with "update if null" semantics. That means 
> cell data in the changelist will update the base data only if the latter is 
> NULL, and updates will be ignored if it is not NULL.
> So we can use Kudu as before, only defining the column as "update if null" 
> when creating the table or adding the column.
>  



--
This message was sent by Atlassian Jira
(v8.20.1#820001)


[jira] [Comment Edited] (KUDU-3353) Support setnx semantic on column

2022-02-20 Thread Yingchun Lai (Jira)


[ 
https://issues.apache.org/jira/browse/KUDU-3353?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17495292#comment-17495292
 ] 

Yingchun Lai edited comment on KUDU-3353 at 2/21/22, 3:21 AM:
--

I should clarify that no matter what the schema is, whether it includes SETNX 
columns or not, the update ops will update the row anyway; the difference is 
whether the cells are updated or not.

Suppose a table with this schema:
{code:java}
TABLE test (
    key INT64 NOT NULL,
    value1 INT64 NULLABLE,
    value2 INT64 NULLABLE UPDATE_IF_NULL,   // this is a SETNX column
    PRIMARY KEY (key)
) ...{code}
case 1: the upsert ops on the table are:
{code:java}
upsert1: 1, 2, 3
upsert2: 1, 20, 30{code}
Then the result will be '1, 20, 3'. (30 will not overwrite 3 because 3 is not 
NULL.)

case 2: the upsert ops on the table are:
{code:java}
upsert1: 1, 2, null
upsert2: 1, 20, 30{code}
Then the result will be '1, 20, 30'. (30 will be applied because the cell is 
NULL.)

All the cells in upsert/update ops will be kept in the changelist as before; 
the difference is the behavior of the delta applier: overwrite the cell or 
ignore it. So it's not really expensive, and it would be lighter than an 
overwrite op since there are fewer cell copies.


was (Author: laiyingchun):
I should clarify that no matter what the schema is, include SETNX columns or 
not, the update ops will update the row anyway, the difference is whether to 
update the cells or not.

Suppose a table with schema:
{code:java}
TABLE test (
    key INT64 NOT NULL,
    value1 INT64 NULLABLE,
    value2 INT64 NULLABLE UPDATE_IF_NULL,   // this is a SETNX column
    PRIMARY KEY (key)
) ...{code}
case 1: upsert ops on the table are:
{code:java}
upsert1: 1, 2, 3
upsert2: 1, 20, 30{code}
Then the result will be '1, 20, 3'. (30 will not overwite 3 because it's not 
NULL)

case 2: upsert ops on the table are:
{code:java}
upsert1: 1, 2, null
upsert2: 1, 20, 30{code}
Then the result will be '1, 20, 30'. (30 will be update because it's NULL)

All the cells in upsert/update ops will be kept as before in changelist, the 
difference is the behavior of delta applier, overwrite the cell or ignore.

> Support setnx semantic on column
> 
>
> Key: KUDU-3353
> URL: https://issues.apache.org/jira/browse/KUDU-3353
> Project: Kudu
>  Issue Type: New Feature
>  Components: api, server
>Reporter: Yingchun Lai
>Priority: Major
>
> h1. motivation
> In some usage scenarios, a Kudu table has a column with "create time" 
> semantics, meaning it represents the creation timestamp of the row. The other 
> columns have the usual semantics, for example user properties like age, 
> address, etc.
> The upstream system and the Kudu user don't know whether a row already 
> exists, and every cell value is the latest one ingested from, for example, an 
> event stream.
> Without the "create time" column, the Kudu user can use UPSERT operations to 
> write data to the table, and every column with data will overwrite the old 
> data. But with the "create time" column, the cell value will be overwritten 
> by the following UPSERT ops, which is not what we expect.
> To achieve the goal, we have to read the column back to judge whether it is 
> NULL: if it's NULL we can fill the cell, and if it's not NULL we drop it from 
> the data before the UPSERT, to avoid overwriting "create time".
> This is expensive; is there a way to avoid a read from Kudu?
> h1. Resolution
> We can implement a column schema with "update if null" semantics. That means 
> cell data in the changelist will update the base data only if the latter is 
> NULL, and updates will be ignored if it is not NULL.
> So we can use Kudu as before, only defining the column as "update if null" 
> when creating the table or adding the column.
>  



--
This message was sent by Atlassian Jira
(v8.20.1#820001)


[jira] [Comment Edited] (KUDU-3353) Support setnx semantic on column

2022-02-21 Thread Yingchun Lai (Jira)


[ 
https://issues.apache.org/jira/browse/KUDU-3353?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17495292#comment-17495292
 ] 

Yingchun Lai edited comment on KUDU-3353 at 2/22/22, 2:13 AM:
--

[~anjuwong] Thanks for your reply!

I should clarify that no matter what the schema is, whether it includes SETNX 
columns or not, the update ops will update the row anyway; the difference is 
whether the cells are updated or not.

Suppose a table with schema:
{code:java}
TABLE test (
    key INT64 NOT NULL,
    value1 INT64 NULLABLE,
    value2 INT64 NULLABLE UPDATE_IF_NULL,   // this is a SETNX column
    PRIMARY KEY (key)
) ...{code}
case 1: upsert ops on the table are:
{code:java}
upsert1: 1, 2, 3
upsert2: 1, 20, 30{code}
Then the result will be '1, 20, 3'. (30 will not overwrite 3 because it's not 
NULL)

case 2: upsert ops on the table are:
{code:java}
upsert1: 1, 2, null
upsert2: 1, 20, 30{code}
Then the result will be '1, 20, 30'. (30 will be updated because it's NULL)

All the cells in upsert/update ops are kept in the changelist as before; the 
difference is the delta applier's behavior: overwrite the cell or ignore it. So 
it's not really expensive, and it would be lighter than an overwrite op since 
there are fewer (or no) cell copies.
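
To make the two cases concrete, here is a minimal sketch using the Kudu Java 
client. It assumes the table 'test' above was created with the proposed 
UPDATE_IF_NULL attribute on value2 (the attribute is server-side behavior in 
this proposal, so no new client API is needed); the master address is a 
placeholder:
{code:java}
import org.apache.kudu.client.KuduClient;
import org.apache.kudu.client.KuduException;
import org.apache.kudu.client.KuduSession;
import org.apache.kudu.client.KuduTable;
import org.apache.kudu.client.Upsert;

public class SetnxCase1 {
  public static void main(String[] args) throws KuduException {
    try (KuduClient client =
             new KuduClient.KuduClientBuilder("master-host:7051").build()) {
      KuduTable table = client.openTable("test");
      KuduSession session = client.newSession();

      // upsert1: (1, 2, 3) -- value2 is written since there is no base data yet.
      Upsert u1 = table.newUpsert();
      u1.getRow().addLong("key", 1);
      u1.getRow().addLong("value1", 2);
      u1.getRow().addLong("value2", 3);
      session.apply(u1);

      // upsert2: (1, 20, 30) -- value1 becomes 20, but under the proposal
      // value2 keeps 3, because an UPDATE_IF_NULL column only accepts an
      // update while its base data is NULL.
      Upsert u2 = table.newUpsert();
      u2.getRow().addLong("key", 1);
      u2.getRow().addLong("value1", 20);
      u2.getRow().addLong("value2", 30);
      session.apply(u2);
      session.flush(); // expected resulting row: (1, 20, 3)
    }
  }
}
{code}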


was (Author: laiyingchun):
I should clarify that no matter what the schema is, whether it includes SETNX 
columns or not, the update ops will update the row anyway; the difference is 
whether the cells are updated or not.

Suppose a table with schema:
{code:java}
TABLE test (
    key INT64 NOT NULL,
    value1 INT64 NULLABLE,
    value2 INT64 NULLABLE UPDATE_IF_NULL,   // this is a SETNX column
    PRIMARY KEY (key)
) ...{code}
case 1: upsert ops on the table are:
{code:java}
upsert1: 1, 2, 3
upsert2: 1, 20, 30{code}
Then the result will be '1, 20, 3'. (30 will not overwrite 3 because it's not 
NULL)

case 2: upsert ops on the table are:
{code:java}
upsert1: 1, 2, null
upsert2: 1, 20, 30{code}
Then the result will be '1, 20, 30'. (30 will be updated because it's NULL)

All the cells in upsert/update ops are kept in the changelist as before; the 
difference is the delta applier's behavior: overwrite the cell or ignore it. So 
it's not really expensive, and it would be lighter than an overwrite op since 
there are fewer (or no) cell copies.

> Support setnx semantic on column
> 
>
> Key: KUDU-3353
> URL: https://issues.apache.org/jira/browse/KUDU-3353
> Project: Kudu
>  Issue Type: New Feature
>  Components: api, server
>Reporter: Yingchun Lai
>Priority: Major
>
> h1. Motivation
> In some usage scenarios, a Kudu table has a column with the semantic of
> "create time", which means it represents the creation timestamp of the row.
> The other columns have the usual semantics, for example, user properties like
> age, address, etc.
> Upstream systems and Kudu users don't know whether a row exists or not, and
> every cell's data is the latest ingested from, for example, an event stream.
> Without the "create time" column, Kudu users can use UPSERT operations to
> write data to the table, and every column with data will overwrite the old
> data. But with the "create time" column, the cell data would be overwritten
> by the following UPSERT ops, which is not what we expect.
> To achieve the goal, we have to read the column out to judge whether the
> column is NULL or not; if it's NULL, we can fill the row with the cell, and
> if it's not NULL, we drop it from the data before the UPSERT, to avoid
> overwriting "create time".
> It's expensive; is there a way to avoid a read from Kudu?
> h1. Resolution
> We can implement a column schema with the semantic of "update if null". That
> means cell data in the changelist will update the base data if the latter is
> NULL, and updates will be ignored if it is not NULL.
> So we can use Kudu similarly as before, only defining the column as
> "update if null" when creating a table or adding a column.
>  



--
This message was sent by Atlassian Jira
(v8.20.1#820001)


[jira] [Commented] (KUDU-3353) Support setnx semantic on column

2022-03-22 Thread Yingchun Lai (Jira)


[ 
https://issues.apache.org/jira/browse/KUDU-3353?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17510374#comment-17510374
 ] 

Yingchun Lai commented on KUDU-3353:


After discussing with [~anjuwong], let me clarify the design:
 # Add a column property to define a column as IMMUTABLE, meaning the column's 
cell value cannot be updated after it has been written.
 # Use UPDATE_IGNORE and add UPSERT_IGNORE, for UPDATE and UPSERT ops that 
ignore update errors on IMMUTABLE columns (see the sketch after this list).
 # Since the column is immutable, we require it to be 'NOT NULL'. Otherwise, 
you couldn't update the NULL value after the initial insertion.
 # It's possible to add such a column with a default value. All the old column 
data in the table gets the default immutable value; a new insertion can specify 
a cell value for the column or not, and if not, the default value is used.
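
A hedged sketch of that flow with the Java client follows; newUpsertIgnore(), 
the UpsertIgnore operation, and the IMMUTABLE column attribute are the API 
proposed here rather than a guaranteed released one, and the table, column, and 
master names are placeholders:
{code:java}
import org.apache.kudu.client.KuduClient;
import org.apache.kudu.client.KuduException;
import org.apache.kudu.client.KuduSession;
import org.apache.kudu.client.KuduTable;
import org.apache.kudu.client.Upsert;
import org.apache.kudu.client.UpsertIgnore;

public class ImmutableColumnSketch {
  public static void main(String[] args) throws KuduException {
    try (KuduClient client =
             new KuduClient.KuduClientBuilder("master-host:7051").build()) {
      // Assumes 'events' has an IMMUTABLE NOT NULL column "created_at".
      KuduTable table = client.openTable("events");
      KuduSession session = client.newSession();

      // The initial write sets the immutable cell.
      Upsert first = table.newUpsert();
      first.getRow().addLong("key", 1);
      first.getRow().addLong("created_at", 1647907200L);
      session.apply(first);

      // Later writes use UPSERT_IGNORE: the attempt to change the
      // IMMUTABLE cell is ignored instead of failing the whole op.
      UpsertIgnore later = table.newUpsertIgnore();
      later.getRow().addLong("key", 1);
      later.getRow().addLong("created_at", 1650585600L); // ignored
      session.apply(later);
      session.flush();
    }
  }
}
{code}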

> Support setnx semantic on column
> 
>
> Key: KUDU-3353
> URL: https://issues.apache.org/jira/browse/KUDU-3353
> Project: Kudu
>  Issue Type: New Feature
>  Components: api, server
>Reporter: Yingchun Lai
>Priority: Major
>
> h1. Motivation
> In some usage scenarios, a Kudu table has a column with the semantic of
> "create time", which means it represents the creation timestamp of the row.
> The other columns have the usual semantics, for example, user properties like
> age, address, etc.
> Upstream systems and Kudu users don't know whether a row exists or not, and
> every cell's data is the latest ingested from, for example, an event stream.
> Without the "create time" column, Kudu users can use UPSERT operations to
> write data to the table, and every column with data will overwrite the old
> data. But with the "create time" column, the cell data would be overwritten
> by the following UPSERT ops, which is not what we expect.
> To achieve the goal, we have to read the column out to judge whether the
> column is NULL or not; if it's NULL, we can fill the row with the cell, and
> if it's not NULL, we drop it from the data before the UPSERT, to avoid
> overwriting "create time".
> It's expensive; is there a way to avoid a read from Kudu?
> h1. Resolution
> We can implement a column schema with the semantic of "update if null". That
> means cell data in the changelist will update the base data if the latter is
> NULL, and updates will be ignored if it is not NULL.
> So we can use Kudu similarly as before, only defining the column as
> "update if null" when creating a table or adding a column.
>  



--
This message was sent by Atlassian Jira
(v8.20.1#820001)


[jira] [Updated] (KUDU-3353) Support setnx semantic on column

2022-03-29 Thread Yingchun Lai (Jira)


 [ 
https://issues.apache.org/jira/browse/KUDU-3353?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yingchun Lai updated KUDU-3353:
---
Status: In Review  (was: Open)

> Support setnx semantic on column
> 
>
> Key: KUDU-3353
> URL: https://issues.apache.org/jira/browse/KUDU-3353
> Project: Kudu
>  Issue Type: New Feature
>  Components: api, server
>Reporter: Yingchun Lai
>Priority: Major
>
> h1. Motivation
> In some usage scenarios, a Kudu table has a column with the semantic of
> "create time", which means it represents the creation timestamp of the row.
> The other columns have the usual semantics, for example, user properties like
> age, address, etc.
> Upstream systems and Kudu users don't know whether a row exists or not, and
> every cell's data is the latest ingested from, for example, an event stream.
> Without the "create time" column, Kudu users can use UPSERT operations to
> write data to the table, and every column with data will overwrite the old
> data. But with the "create time" column, the cell data would be overwritten
> by the following UPSERT ops, which is not what we expect.
> To achieve the goal, we have to read the column out to judge whether the
> column is NULL or not; if it's NULL, we can fill the row with the cell, and
> if it's not NULL, we drop it from the data before the UPSERT, to avoid
> overwriting "create time".
> It's expensive; is there a way to avoid a read from Kudu?
> h1. Resolution
> We can implement a column schema with the semantic of "update if null". That
> means cell data in the changelist will update the base data if the latter is
> NULL, and updates will be ignored if it is not NULL.
> So we can use Kudu similarly as before, only defining the column as
> "update if null" when creating a table or adding a column.
>  



--
This message was sent by Atlassian Jira
(v8.20.1#820001)


[jira] [Created] (KUDU-3371) Use RocksDB to store LBM metadata

2022-05-25 Thread Yingchun Lai (Jira)
Yingchun Lai created KUDU-3371:
--

 Summary: Use RocksDB to store LBM metadata
 Key: KUDU-3371
 URL: https://issues.apache.org/jira/browse/KUDU-3371
 Project: Kudu
  Issue Type: Improvement
  Components: fs
Reporter: Yingchun Lai






--
This message was sent by Atlassian Jira
(v8.20.7#820007)


[jira] [Updated] (KUDU-3371) Use RocksDB to store LBM metadata

2022-05-25 Thread Yingchun Lai (Jira)


 [ 
https://issues.apache.org/jira/browse/KUDU-3371?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yingchun Lai updated KUDU-3371:
---
Description: 
h1. Motivation

The current LBM container uses separate .data and .metadata files. The .data 
file stores the real user data, where we can use hole punching to reduce disk 
space. The metadata is written as protobuf-serialized strings to a file, in 
append-only mode. Each protobuf object is a BlockRecordPB struct:

 
{code:java}
message BlockRecordPB {
  required BlockIdPB block_id = 1;  // int64
  required BlockRecordType op_type = 2;  // CREATE or DELETE
  required uint64 timestamp_us = 3;
  optional int64 offset = 4; // Required for CREATE.
  optional int64 length = 5; // Required for CREATE.
} {code}
That means each object is of either CREATE or DELETE type. To mark a 'block' as 
deleted, there will be 2 objects in the metadata: one of CREATE type and the 
other of DELETE type.

There are some weak points in the current LBM metadata storage mechanism:
h2. 1. Disk space amplification

 
h2. 2. Long time bootstrap

 

 

 

> Use RocksDB to store LBM metadata
> -
>
> Key: KUDU-3371
> URL: https://issues.apache.org/jira/browse/KUDU-3371
> Project: Kudu
>  Issue Type: Improvement
>  Components: fs
>Reporter: Yingchun Lai
>Priority: Major
>
> h1. Motivation
> The current LBM container uses separate .data and .metadata files. The .data
> file stores the real user data, where we can use hole punching to reduce disk
> space. The metadata is written as protobuf-serialized strings to a file, in
> append-only mode. Each protobuf object is a BlockRecordPB struct:
>  
> {code:java}
> message BlockRecordPB {
>   required BlockIdPB block_id = 1;  // int64
>   required BlockRecordType op_type = 2;  // CREATE or DELETE
>   required uint64 timestamp_us = 3;
>   optional int64 offset = 4; // Required for CREATE.
>   optional int64 length = 5; // Required for CREATE.
> } {code}
> That means each object is of either CREATE or DELETE type. To mark a 'block'
> as deleted, there will be 2 objects in the metadata: one of CREATE type and
> the other of DELETE type.
> There are some weak points in the current LBM metadata storage mechanism:
> h2. 1. Disk space amplification
>  
> h2. 2. Long time bootstrap
>  
>  
>  



--
This message was sent by Atlassian Jira
(v8.20.7#820007)


[jira] [Updated] (KUDU-3371) Use RocksDB to store LBM metadata

2022-05-25 Thread Yingchun Lai (Jira)


 [ 
https://issues.apache.org/jira/browse/KUDU-3371?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yingchun Lai updated KUDU-3371:
---
Description: 
h1. Motivation

The current LBM container uses separate .data and .metadata files. The .data 
file stores the real user data, where we can use hole punching to reduce disk 
space. The metadata is written as protobuf-serialized strings to a file, in 
append-only mode. Each protobuf object is a BlockRecordPB struct:

 
{code:java}
message BlockRecordPB {
  required BlockIdPB block_id = 1;  // int64
  required BlockRecordType op_type = 2;  // CREATE or DELETE
  required uint64 timestamp_us = 3;
  optional int64 offset = 4; // Required for CREATE.
  optional int64 length = 5; // Required for CREATE.
} {code}
That means each object is of either CREATE or DELETE type. To mark a 'block' as 
deleted, there will be 2 objects in the metadata: one of CREATE type and the 
other of DELETE type.

There are some weak points in the current LBM metadata storage mechanism:
h2. 1. Disk space amplification

The metadata's live-block rate may be very low; in the worst case there is only 
1 alive block (supposing it hasn't reached the runtime compaction threshold) 
and all the other thousands of blocks are dead (i.e. in CREATE-DELETE pairs).

So the disk space amplification is very serious.
h2. 2. Long time bootstrap

In the Kudu server bootstrap stage, it has to replay all the metadata files to 
find out the alive blocks. In the worst case, we may replay thousands of blocks 
in metadata but find that only a very few of them are alive.

It may waste much time in almost all cases, since a Kudu cluster in a 
production environment typically runs for several months without bootstrapping, 
so the LBM may be very loose.
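
A minimal sketch of why replay cost is proportional to every record ever 
appended, not to the number of live blocks; the BlockRecord class is a 
hypothetical in-memory stand-in for a deserialized BlockRecordPB:
{code:java}
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class MetadataReplaySketch {
  // Hypothetical deserialized form of a BlockRecordPB.
  static class BlockRecord {
    final long blockId;
    final boolean isCreate; // CREATE when true, DELETE when false
    BlockRecord(long blockId, boolean isCreate) {
      this.blockId = blockId;
      this.isCreate = isCreate;
    }
  }

  // Every appended record must be visited to learn which blocks are still
  // alive, even when nearly all of them form dead CREATE-DELETE pairs.
  static Map<Long, BlockRecord> replay(List<BlockRecord> records) {
    Map<Long, BlockRecord> live = new HashMap<>();
    for (BlockRecord r : records) {
      if (r.isCreate) {
        live.put(r.blockId, r);
      } else {
        live.remove(r.blockId); // a DELETE cancels an earlier CREATE
      }
    }
    return live; // may be tiny compared to records.size()
  }
}
{code}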
h2. 3. Metadata compaction

To resolve the issues above, there is a metadata compaction mechanism in LBM, 

 

  was:
h1. Motivation

The current LBM container uses separate .data and .metadata files. The .data 
file stores the real user data, where we can use hole punching to reduce disk 
space. The metadata is written as protobuf-serialized strings to a file, in 
append-only mode. Each protobuf object is a BlockRecordPB struct:

 
{code:java}
message BlockRecordPB {
  required BlockIdPB block_id = 1;  // int64
  required BlockRecordType op_type = 2;  // CREATE or DELETE
  required uint64 timestamp_us = 3;
  optional int64 offset = 4; // Required for CREATE.
  optional int64 length = 5; // Required for CREATE.
} {code}
That means each object is of either CREATE or DELETE type. To mark a 'block' as 
deleted, there will be 2 objects in the metadata: one of CREATE type and the 
other of DELETE type.

There are some weak points in the current LBM metadata storage mechanism:
h2. 1. Disk space amplification

 
h2. 2. Long time bootstrap

 

 

 


> Use RocksDB to store LBM metadata
> -
>
> Key: KUDU-3371
> URL: https://issues.apache.org/jira/browse/KUDU-3371
> Project: Kudu
>  Issue Type: Improvement
>  Components: fs
>Reporter: Yingchun Lai
>Priority: Major
>
> h1. Motivation
> The current LBM container uses separate .data and .metadata files. The .data
> file stores the real user data, where we can use hole punching to reduce disk
> space. The metadata is written as protobuf-serialized strings to a file, in
> append-only mode. Each protobuf object is a BlockRecordPB struct:
>  
> {code:java}
> message BlockRecordPB {
>   required BlockIdPB block_id = 1;  // int64
>   required BlockRecordType op_type = 2;  // CREATE or DELETE
>   required uint64 timestamp_us = 3;
>   optional int64 offset = 4; // Required for CREATE.
>   optional int64 length = 5; // Required for CREATE.
> } {code}
> That means each object is of either CREATE or DELETE type. To mark a 'block'
> as deleted, there will be 2 objects in the metadata: one of CREATE type and
> the other of DELETE type.
> There are some weak points in the current LBM metadata storage mechanism:
> h2. 1. Disk space amplification
> The metadata's live-block rate may be very low; in the worst case there is
> only 1 alive block (supposing it hasn't reached the runtime compaction
> threshold) and all the other thousands of blocks are dead (i.e. in
> CREATE-DELETE pairs).
> So the disk space amplification is very serious.
> h2. 2. Long time bootstrap
> In the Kudu server bootstrap stage, it has to replay all the metadata files
> to find out the alive blocks. In the worst case, we may replay thousands of
> blocks in metadata but find that only a very few of them are alive.
> It may waste much time in almost all cases, since a Kudu cluster in a
> production environment typically runs for several months without
> bootstrapping, so the LBM may be very loose.
> h2. 3. Metadata compaction
> To resolve the issues above, there is a metadata compaction mechanism in LBM, 
>  



--
This message was sent by Atlassian Jira
(v8.20.7#820007)

[jira] [Updated] (KUDU-3371) Use RocksDB to store LBM metadata

2022-05-25 Thread Yingchun Lai (Jira)


 [ 
https://issues.apache.org/jira/browse/KUDU-3371?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yingchun Lai updated KUDU-3371:
---
Description: 
h1. Motivation

The current LBM container uses separate .data and .metadata files. The .data 
file stores the real user data, where we can use hole punching to reduce disk 
space. The metadata is written as protobuf-serialized strings to a file, in 
append-only mode. Each protobuf object is a BlockRecordPB struct:

 
{code:java}
message BlockRecordPB {
  required BlockIdPB block_id = 1;  // int64
  required BlockRecordType op_type = 2;  // CREATE or DELETE
  required uint64 timestamp_us = 3;
  optional int64 offset = 4; // Required for CREATE.
  optional int64 length = 5; // Required for CREATE.
} {code}
That means each object is of either CREATE or DELETE type. To mark a 'block' as 
deleted, there will be 2 objects in the metadata: one of CREATE type and the 
other of DELETE type.

There are some weak points in the current LBM metadata storage mechanism:
h2. 1. Disk space amplification

The metadata's live-block rate may be very low; in the worst case there is only 
1 alive block (supposing it hasn't reached the runtime compaction threshold) 
and all the other thousands of blocks are dead (i.e. in CREATE-DELETE pairs).

So the disk space amplification is very serious.
h2. 2. Long time bootstrap

In the Kudu server bootstrap stage, it has to replay all the metadata files to 
find out the alive blocks. In the worst case, we may replay thousands of blocks 
in metadata but find that only a very few of them are alive.

It may waste much time in almost all cases, since a Kudu cluster in a 
production environment typically runs for several months without bootstrapping, 
so the LBM may be very loose.
h2. 3. Metadata compaction

To resolve the issues above, there is a metadata compaction mechanism in LBM, 
both at runtime and in the bootstrap stage.

The one at runtime will lock the container, and it's synchronous.

The one in the bootstrap stage is synchronous too, and may make the bootstrap 
time longer.
h1. Optimization

I've been trying to use RocksDB to store LBM container metadata recently; I've 
finished most of the work now and did some benchmarking

  was:
h1. Motivation

The current LBM container uses separate .data and .metadata files. The .data 
file stores the real user data, where we can use hole punching to reduce disk 
space. The metadata is written as protobuf-serialized strings to a file, in 
append-only mode. Each protobuf object is a BlockRecordPB struct:

 
{code:java}
message BlockRecordPB {
  required BlockIdPB block_id = 1;  // int64
  required BlockRecordType op_type = 2;  // CREATE or DELETE
  required uint64 timestamp_us = 3;
  optional int64 offset = 4; // Required for CREATE.
  optional int64 length = 5; // Required for CREATE.
} {code}
That means each object is of either CREATE or DELETE type. To mark a 'block' as 
deleted, there will be 2 objects in the metadata: one of CREATE type and the 
other of DELETE type.

There are some weak points in the current LBM metadata storage mechanism:
h2. 1. Disk space amplification

The metadata's live-block rate may be very low; in the worst case there is only 
1 alive block (supposing it hasn't reached the runtime compaction threshold) 
and all the other thousands of blocks are dead (i.e. in CREATE-DELETE pairs).

So the disk space amplification is very serious.
h2. 2. Long time bootstrap

In the Kudu server bootstrap stage, it has to replay all the metadata files to 
find out the alive blocks. In the worst case, we may replay thousands of blocks 
in metadata but find that only a very few of them are alive.

It may waste much time in almost all cases, since a Kudu cluster in a 
production environment typically runs for several months without bootstrapping, 
so the LBM may be very loose.
h2. 3. Metadata compaction

To resolve the issues above, there is a metadata compaction mechanism in LBM, 

 


> Use RocksDB to store LBM metadata
> -
>
> Key: KUDU-3371
> URL: https://issues.apache.org/jira/browse/KUDU-3371
> Project: Kudu
>  Issue Type: Improvement
>  Components: fs
>Reporter: Yingchun Lai
>Priority: Major
>
> h1. Motivation
> The current LBM container uses separate .data and .metadata files. The .data
> file stores the real user data, where we can use hole punching to reduce disk
> space. The metadata is written as protobuf-serialized strings to a file, in
> append-only mode. Each protobuf object is a BlockRecordPB struct:
>  
> {code:java}
> message BlockRecordPB {
>   required BlockIdPB block_id = 1;  // int64
>   required BlockRecordType op_type = 2;  // CREATE or DELETE
>   required uint64 timestamp_us = 3;
>   optional int64 offset = 4; // Required for CREATE.
>   optional int64 length = 5; // Required for CREATE.
> } {code}
> That means each object is of either CREATE or DELETE type. To mark a 'block'
> as d

[jira] [Updated] (KUDU-3371) Use RocksDB to store LBM metadata

2022-05-25 Thread Yingchun Lai (Jira)


 [ 
https://issues.apache.org/jira/browse/KUDU-3371?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yingchun Lai updated KUDU-3371:
---
Description: 
h1. Motivation

The current LBM container uses separate .data and .metadata files. The .data 
file stores the real user data, where we can use hole punching to reduce disk 
space. The metadata is written as protobuf-serialized strings to a file, in 
append-only mode. Each protobuf object is a BlockRecordPB struct:

 
{code:java}
message BlockRecordPB {
  required BlockIdPB block_id = 1;  // int64
  required BlockRecordType op_type = 2;  // CREATE or DELETE
  required uint64 timestamp_us = 3;
  optional int64 offset = 4; // Required for CREATE.
  optional int64 length = 5; // Required for CREATE.
} {code}
That means each object is of either CREATE or DELETE type. To mark a 'block' as 
deleted, there will be 2 objects in the metadata: one of CREATE type and the 
other of DELETE type.

There are some weak points in the current LBM metadata storage mechanism:
h2. 1. Disk space amplification

The metadata's live-block rate may be very low; in the worst case there is only 
1 alive block (supposing it hasn't reached the runtime compaction threshold) 
and all the other thousands of blocks are dead (i.e. in CREATE-DELETE pairs).

So the disk space amplification is very serious.
h2. 2. Long time bootstrap

In the Kudu server bootstrap stage, it has to replay all the metadata files to 
find out the alive blocks. In the worst case, we may replay thousands of blocks 
in metadata but find that only a very few of them are alive.

It may waste much time in almost all cases, since a Kudu cluster in a 
production environment typically runs for several months without bootstrapping, 
so the LBM may be very loose.
h2. 3. Metadata compaction

To resolve the issues above, there is a metadata compaction mechanism in LBM, 
both at runtime and in the bootstrap stage.

The one at runtime will lock the container, and it's synchronous.

The one in the bootstrap stage is synchronous too, and may make the bootstrap 
time longer.
h1. Optimization by using RocksDB
h2. Storage design
 * RocksDB instance: one RocksDB instance per data directory.
 * Key: .
 * Value: the same as before, i.e. the serialized protobuf string, stored only 
for CREATE entries.
 * Put/Delete: put the value to RocksDB when creating a block, delete it from 
RocksDB when deleting a block
 * Scan: happens only in the bootstrap stage, to retrieve all blocks
 * DeleteRange: happens only when invalidating a container

h2. Advantages
 # Disk space amplification: there is still a disk space amplification problem, 
but we can tune RocksDB to reach a balanced point; I trust that in most cases 
RocksDB is better than an append-only file.
 # Bootstrap time: since only valid blocks are left in RocksDB, it may be much 
faster than before.
 # Metadata compaction: we can leave this work to RocksDB, though tuning is of 
course needed.

h2. Test & benchmark

I've been trying to use RocksDB to store LBM container metadata recently; I've 
finished most of the work now and did some benchmarking. It shows that the fs 
module's block read/write/delete performance is similar to, or a little worse 
than, the old implementation's, while the bootstrap time may be reduced several 
times over.

I'm not sure whether it is worth continuing the work, or whether anybody knows 
of any previous discussion on this topic.
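
As a concrete illustration of the storage design above, here is a minimal 
sketch using the RocksJava binding (the real work would be in the C++ server; 
the key layout, path, and helper names are assumptions for illustration only):
{code:java}
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import org.rocksdb.Options;
import org.rocksdb.RocksDB;
import org.rocksdb.RocksDBException;
import org.rocksdb.RocksIterator;

public class LbmMetadataSketch {
  static { RocksDB.loadLibrary(); }

  // Hypothetical key layout: container id, a 0x00 separator, then the block
  // id encoded big-endian so all blocks of a container sort together.
  static byte[] key(String containerId, long blockId) {
    byte[] c = containerId.getBytes(StandardCharsets.UTF_8);
    return ByteBuffer.allocate(c.length + 1 + 8)
        .put(c).put((byte) 0).putLong(blockId).array();
  }

  public static void main(String[] args) throws RocksDBException {
    try (Options opts = new Options().setCreateIfMissing(true);
         // One instance per data directory (placeholder path).
         RocksDB db = RocksDB.open(opts, "/data1/kudu/block_meta")) {
      byte[] k = key("container-0001", 42L);

      // Block creation: one Put whose value would be the serialized
      // BlockRecordPB of the CREATE entry (stand-in bytes here).
      db.put(k, new byte[0]);

      // Block deletion: the record is physically removed, instead of
      // appending a DELETE tombstone as the append-only file does.
      db.delete(k);

      // Bootstrap: a full scan visits only live blocks.
      try (RocksIterator it = db.newIterator()) {
        for (it.seekToFirst(); it.isValid(); it.next()) {
          // parse a BlockRecordPB from it.value() ...
        }
      }

      // Invalidating a container maps to a single DeleteRange over its key
      // prefix (end key exclusive; Long.MAX_VALUE is a sketch upper bound).
      db.deleteRange(key("container-0001", 0L),
                     key("container-0001", Long.MAX_VALUE));
    }
  }
}
{code}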

  was:
h1. Motivation

The current LBM container uses separate .data and .metadata files. The .data 
file stores the real user data, where we can use hole punching to reduce disk 
space. The metadata is written as protobuf-serialized strings to a file, in 
append-only mode. Each protobuf object is a BlockRecordPB struct:

 
{code:java}
message BlockRecordPB {
  required BlockIdPB block_id = 1;  // int64
  required BlockRecordType op_type = 2;  // CREATE or DELETE
  required uint64 timestamp_us = 3;
  optional int64 offset = 4; // Required for CREATE.
  optional int64 length = 5; // Required for CREATE.
} {code}
That means each object is of either CREATE or DELETE type. To mark a 'block' as 
deleted, there will be 2 objects in the metadata: one of CREATE type and the 
other of DELETE type.

There are some weak points in the current LBM metadata storage mechanism:
h2. 1. Disk space amplification

The metadata's live-block rate may be very low; in the worst case there is only 
1 alive block (supposing it hasn't reached the runtime compaction threshold) 
and all the other thousands of blocks are dead (i.e. in CREATE-DELETE pairs).

So the disk space amplification is very serious.
h2. 2. Long time bootstrap

In the Kudu server bootstrap stage, it has to replay all the metadata files to 
find out the alive blocks. In the worst case, we may replay thousands of blocks 
in metadata but find that only a very few of them are alive.

It may waste much time in almost all cases, since a Kudu cluster in a 
production environment typically runs for several months without bootstrapping, 
so the LBM may be very loose.
h2. 3. Metadata com

[jira] [Updated] (KUDU-3371) Use RocksDB to store LBM metadata

2022-05-25 Thread Yingchun Lai (Jira)


 [ 
https://issues.apache.org/jira/browse/KUDU-3371?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yingchun Lai updated KUDU-3371:
---
Description: 
h1. Motivation

The current LBM container uses separate .data and .metadata files. The .data 
file stores the real user data, where we can use hole punching to reduce disk 
space. The metadata is written as protobuf-serialized strings to a file, in 
append-only mode. Each protobuf object is a BlockRecordPB struct:

 
{code:java}
message BlockRecordPB {
  required BlockIdPB block_id = 1;  // int64
  required BlockRecordType op_type = 2;  // CREATE or DELETE
  required uint64 timestamp_us = 3;
  optional int64 offset = 4; // Required for CREATE.
  optional int64 length = 5; // Required for CREATE.
} {code}
That means each object is of either CREATE or DELETE type. To mark a 'block' as 
deleted, there will be 2 objects in the metadata: one of CREATE type and the 
other of DELETE type.

There are some weak points in the current LBM metadata storage mechanism:
h2. 1. Disk space amplification

The metadata's live-block rate may be very low; in the worst case there is only 
1 alive block (supposing it hasn't reached the runtime compaction threshold) 
and all the other thousands of blocks are dead (i.e. in CREATE-DELETE pairs).

So the disk space amplification is very serious.
h2. 2. Long time bootstrap

In the Kudu server bootstrap stage, it has to replay all the metadata files to 
find out the alive blocks. In the worst case, we may replay thousands of blocks 
in metadata but find that only a very few of them are alive.

It may waste much time in almost all cases, since a Kudu cluster in a 
production environment typically runs for several months without bootstrapping, 
so the LBM may be very loose.
h2. 3. Metadata compaction

To resolve the issues above, there is a metadata compaction mechanism in LBM, 
both at runtime and in the bootstrap stage.

The one at runtime will lock the container, and it's synchronous.

The one in the bootstrap stage is synchronous too, and may make the bootstrap 
time longer.
h1. Optimization

I've been trying to use RocksDB to store LBM container metadata recently; I've 
finished most of the work now and did some benchmarking. It shows that the fs 
module's block read/write/delete performance is similar to, or a little worse 
than, the old implementation's, while the bootstrap time may be reduced several 
times over.

I'm not sure whether it is worth continuing the work, or whether anybody knows 
of any previous discussion on this topic.

  was:
h1. Motivation

The current LBM container uses separate .data and .metadata files. The .data 
file stores the real user data, where we can use hole punching to reduce disk 
space. The metadata is written as protobuf-serialized strings to a file, in 
append-only mode. Each protobuf object is a BlockRecordPB struct:

 
{code:java}
message BlockRecordPB {
  required BlockIdPB block_id = 1;  // int64
  required BlockRecordType op_type = 2;  // CREATE or DELETE
  required uint64 timestamp_us = 3;
  optional int64 offset = 4; // Required for CREATE.
  optional int64 length = 5; // Required for CREATE.
} {code}
That means each object is of either CREATE or DELETE type. To mark a 'block' as 
deleted, there will be 2 objects in the metadata: one of CREATE type and the 
other of DELETE type.

There are some weak points in the current LBM metadata storage mechanism:
h2. 1. Disk space amplification

The metadata's live-block rate may be very low; in the worst case there is only 
1 alive block (supposing it hasn't reached the runtime compaction threshold) 
and all the other thousands of blocks are dead (i.e. in CREATE-DELETE pairs).

So the disk space amplification is very serious.
h2. 2. Long time bootstrap

In the Kudu server bootstrap stage, it has to replay all the metadata files to 
find out the alive blocks. In the worst case, we may replay thousands of blocks 
in metadata but find that only a very few of them are alive.

It may waste much time in almost all cases, since a Kudu cluster in a 
production environment typically runs for several months without bootstrapping, 
so the LBM may be very loose.
h2. 3. Metadata compaction

To resolve the issues above, there is a metadata compaction mechanism in LBM, 
both at runtime and in the bootstrap stage.

The one at runtime will lock the container, and it's synchronous.

The one in the bootstrap stage is synchronous too, and may make the bootstrap 
time longer.
h1. Optimization

I've been trying to use RocksDB to store LBM container metadata recently; I've 
finished most of the work now and did some benchmarking


> Use RocksDB to store LBM metadata
> -
>
> Key: KUDU-3371
> URL: https://issues.apache.org/jira/browse/KUDU-3371
> Project: Kudu
>  Issue Type: Improvement
>  Components: fs
>Reporter: Yingchun Lai
>Priority: Major
>
> h1. Motivation
> The current LBM container uses separate .data and .metadata files. The .data
> file stores the

[jira] [Commented] (KUDU-3371) Use RocksDB to store LBM metadata

2022-05-26 Thread Yingchun Lai (Jira)


[ 
https://issues.apache.org/jira/browse/KUDU-3371?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17542543#comment-17542543
 ] 

Yingchun Lai commented on KUDU-3371:


Yes, after RocksDB is introduced to Kudu, we can store more 'metadata' in it, 
like consensus-meta and tablet-meta. RocksDB provides plenty of options we can 
tune; we can separate different data into different RocksDB instances or column 
families to get higher performance, or to meet different requirements.

The github link in this Jira seems broken :(

https://issues.apache.org/jira/browse/KUDU-2204

> Use RocksDB to store LBM metadata
> -
>
> Key: KUDU-3371
> URL: https://issues.apache.org/jira/browse/KUDU-3371
> Project: Kudu
>  Issue Type: Improvement
>  Components: fs
>Reporter: Yingchun Lai
>Priority: Major
>
> h1. Motivation
> The current LBM container uses separate .data and .metadata files. The .data
> file stores the real user data, where we can use hole punching to reduce disk
> space. The metadata is written as protobuf-serialized strings to a file, in
> append-only mode. Each protobuf object is a BlockRecordPB struct:
>  
> {code:java}
> message BlockRecordPB {
>   required BlockIdPB block_id = 1;  // int64
>   required BlockRecordType op_type = 2;  // CREATE or DELETE
>   required uint64 timestamp_us = 3;
>   optional int64 offset = 4; // Required for CREATE.
>   optional int64 length = 5; // Required for CREATE.
> } {code}
> That means each object is of either CREATE or DELETE type. To mark a 'block'
> as deleted, there will be 2 objects in the metadata: one of CREATE type and
> the other of DELETE type.
> There are some weak points in the current LBM metadata storage mechanism:
> h2. 1. Disk space amplification
> The metadata's live-block rate may be very low; in the worst case there is
> only 1 alive block (supposing it hasn't reached the runtime compaction
> threshold) and all the other thousands of blocks are dead (i.e. in
> CREATE-DELETE pairs).
> So the disk space amplification is very serious.
> h2. 2. Long time bootstrap
> In the Kudu server bootstrap stage, it has to replay all the metadata files
> to find out the alive blocks. In the worst case, we may replay thousands of
> blocks in metadata but find that only a very few of them are alive.
> It may waste much time in almost all cases, since a Kudu cluster in a
> production environment typically runs for several months without
> bootstrapping, so the LBM may be very loose.
> h2. 3. Metadata compaction
> To resolve the issues above, there is a metadata compaction mechanism in LBM,
> both at runtime and in the bootstrap stage.
> The one at runtime will lock the container, and it's synchronous.
> The one in the bootstrap stage is synchronous too, and may make the bootstrap
> time longer.
> h1. Optimization by using RocksDB
> h2. Storage design
>  * RocksDB instance: one RocksDB instance per data directory.
>  * Key: .
>  * Value: the same as before, i.e. the serialized protobuf string, stored
> only for CREATE entries.
>  * Put/Delete: put the value to RocksDB when creating a block, delete it from
> RocksDB when deleting a block
>  * Scan: happens only in the bootstrap stage, to retrieve all blocks
>  * DeleteRange: happens only when invalidating a container
> h2. Advantages
>  # Disk space amplification: there is still a disk space amplification
> problem, but we can tune RocksDB to reach a balanced point; I trust that in
> most cases RocksDB is better than an append-only file.
>  # Bootstrap time: since only valid blocks are left in RocksDB, it may be
> much faster than before.
>  # Metadata compaction: we can leave this work to RocksDB, though tuning is
> of course needed.
> h2. Test & benchmark
> I've been trying to use RocksDB to store LBM container metadata recently;
> I've finished most of the work now and did some benchmarking. It shows that
> the fs module's block read/write/delete performance is similar to, or a
> little worse than, the old implementation's, while the bootstrap time may be
> reduced several times over.
> I'm not sure whether it is worth continuing the work, or whether anybody
> knows of any previous discussion on this topic.



--
This message was sent by Atlassian Jira
(v8.20.7#820007)


[jira] [Commented] (KUDU-3371) Use RocksDB to store LBM metadata

2022-05-26 Thread Yingchun Lai (Jira)


[ 
https://issues.apache.org/jira/browse/KUDU-3371?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17542546#comment-17542546
 ] 

Yingchun Lai commented on KUDU-3371:


I submitted my WIP patch here:
https://gerrit.cloudera.org/c/18569/

> Use RocksDB to store LBM metadata
> -
>
> Key: KUDU-3371
> URL: https://issues.apache.org/jira/browse/KUDU-3371
> Project: Kudu
>  Issue Type: Improvement
>  Components: fs
>Reporter: Yingchun Lai
>Priority: Major
>
> h1. Motivation
> The current LBM container uses separate .data and .metadata files. The .data
> file stores the real user data, where we can use hole punching to reduce disk
> space. The metadata is written as protobuf-serialized strings to a file, in
> append-only mode. Each protobuf object is a BlockRecordPB struct:
>  
> {code:java}
> message BlockRecordPB {
>   required BlockIdPB block_id = 1;  // int64
>   required BlockRecordType op_type = 2;  // CREATE or DELETE
>   required uint64 timestamp_us = 3;
>   optional int64 offset = 4; // Required for CREATE.
>   optional int64 length = 5; // Required for CREATE.
> } {code}
> That means each object is of either CREATE or DELETE type. To mark a 'block'
> as deleted, there will be 2 objects in the metadata: one of CREATE type and
> the other of DELETE type.
> There are some weak points in the current LBM metadata storage mechanism:
> h2. 1. Disk space amplification
> The metadata's live-block rate may be very low; in the worst case there is
> only 1 alive block (supposing it hasn't reached the runtime compaction
> threshold) and all the other thousands of blocks are dead (i.e. in
> CREATE-DELETE pairs).
> So the disk space amplification is very serious.
> h2. 2. Long time bootstrap
> In the Kudu server bootstrap stage, it has to replay all the metadata files
> to find out the alive blocks. In the worst case, we may replay thousands of
> blocks in metadata but find that only a very few of them are alive.
> It may waste much time in almost all cases, since a Kudu cluster in a
> production environment typically runs for several months without
> bootstrapping, so the LBM may be very loose.
> h2. 3. Metadata compaction
> To resolve the issues above, there is a metadata compaction mechanism in LBM,
> both at runtime and in the bootstrap stage.
> The one at runtime will lock the container, and it's synchronous.
> The one in the bootstrap stage is synchronous too, and may make the bootstrap
> time longer.
> h1. Optimization by using RocksDB
> h2. Storage design
>  * RocksDB instance: one RocksDB instance per data directory.
>  * Key: .
>  * Value: the same as before, i.e. the serialized protobuf string, stored
> only for CREATE entries.
>  * Put/Delete: put the value to RocksDB when creating a block, delete it from
> RocksDB when deleting a block
>  * Scan: happens only in the bootstrap stage, to retrieve all blocks
>  * DeleteRange: happens only when invalidating a container
> h2. Advantages
>  # Disk space amplification: there is still a disk space amplification
> problem, but we can tune RocksDB to reach a balanced point; I trust that in
> most cases RocksDB is better than an append-only file.
>  # Bootstrap time: since only valid blocks are left in RocksDB, it may be
> much faster than before.
>  # Metadata compaction: we can leave this work to RocksDB, though tuning is
> of course needed.
> h2. Test & benchmark
> I've been trying to use RocksDB to store LBM container metadata recently;
> I've finished most of the work now and did some benchmarking. It shows that
> the fs module's block read/write/delete performance is similar to, or a
> little worse than, the old implementation's, while the bootstrap time may be
> reduced several times over.
> I'm not sure whether it is worth continuing the work, or whether anybody
> knows of any previous discussion on this topic.



--
This message was sent by Atlassian Jira
(v8.20.7#820007)


[jira] [Commented] (KUDU-3353) Support setnx semantic on column

2022-06-01 Thread Yingchun Lai (Jira)


[ 
https://issues.apache.org/jira/browse/KUDU-3353?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17544775#comment-17544775
 ] 

Yingchun Lai commented on KUDU-3353:


This feature has been implemented in [KUDU-3353 [schema] Add an immutable 
attribute on column schema (If80ebca7) · Gerrit Code Review 
(cloudera.org)|https://gerrit.cloudera.org/c/18241/].

Please help to review, thanks!

[~anjuwong] [~aserbin]  [~tlipcon] 

> Support setnx semantic on column
> 
>
> Key: KUDU-3353
> URL: https://issues.apache.org/jira/browse/KUDU-3353
> Project: Kudu
>  Issue Type: New Feature
>  Components: api, server
>Reporter: Yingchun Lai
>Assignee: Yingchun Lai
>Priority: Major
>
> h1. Motivation
> In some usage scenarios, a Kudu table has a column with the semantic of
> "create time", which means it represents the creation timestamp of the row.
> The other columns have the usual semantics, for example, user properties like
> age, address, etc.
> Upstream systems and Kudu users don't know whether a row exists or not, and
> every cell's data is the latest ingested from, for example, an event stream.
> Without the "create time" column, Kudu users can use UPSERT operations to
> write data to the table, and every column with data will overwrite the old
> data. But with the "create time" column, the cell data would be overwritten
> by the following UPSERT ops, which is not what we expect.
> To achieve the goal, we have to read the column out to judge whether the
> column is NULL or not; if it's NULL, we can fill the row with the cell, and
> if it's not NULL, we drop it from the data before the UPSERT, to avoid
> overwriting "create time".
> It's expensive; is there a way to avoid a read from Kudu?
> h1. Resolution
> We can implement a column schema with the semantic of "update if null". That
> means cell data in the changelist will update the base data if the latter is
> NULL, and updates will be ignored if it is not NULL.
> So we can use Kudu similarly as before, only defining the column as
> "update if null" when creating a table or adding a column.
>  



--
This message was sent by Atlassian Jira
(v8.20.7#820007)


[jira] [Comment Edited] (KUDU-3353) Support setnx semantic on column

2022-06-01 Thread Yingchun Lai (Jira)


[ 
https://issues.apache.org/jira/browse/KUDU-3353?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17544775#comment-17544775
 ] 

Yingchun Lai edited comment on KUDU-3353 at 6/1/22 9:11 AM:


This feature has been implemented in [this 
patch|https://gerrit.cloudera.org/c/18241/].

Please help to review, thanks!

[~anjuwong] [~aserbin]  [~tlipcon] 


was (Author: laiyingchun):
This feature has been implemented in [KUDU-3353 [schema] Add an immutable 
attribute on column schema (If80ebca7) · Gerrit Code Review 
(cloudera.org)|https://gerrit.cloudera.org/c/18241/].

Please help to review, thanks!

[~anjuwong] [~aserbin]  [~tlipcon] 

> Support setnx semantic on column
> 
>
> Key: KUDU-3353
> URL: https://issues.apache.org/jira/browse/KUDU-3353
> Project: Kudu
>  Issue Type: New Feature
>  Components: api, server
>Reporter: Yingchun Lai
>Assignee: Yingchun Lai
>Priority: Major
>
> h1. Motivation
> In some usage scenarios, a Kudu table has a column with the semantic of
> "create time", which means it represents the creation timestamp of the row.
> The other columns have the usual semantics, for example, user properties like
> age, address, etc.
> Upstream systems and Kudu users don't know whether a row exists or not, and
> every cell's data is the latest ingested from, for example, an event stream.
> Without the "create time" column, Kudu users can use UPSERT operations to
> write data to the table, and every column with data will overwrite the old
> data. But with the "create time" column, the cell data would be overwritten
> by the following UPSERT ops, which is not what we expect.
> To achieve the goal, we have to read the column out to judge whether the
> column is NULL or not; if it's NULL, we can fill the row with the cell, and
> if it's not NULL, we drop it from the data before the UPSERT, to avoid
> overwriting "create time".
> It's expensive; is there a way to avoid a read from Kudu?
> h1. Resolution
> We can implement a column schema with the semantic of "update if null". That
> means cell data in the changelist will update the base data if the latter is
> NULL, and updates will be ignored if it is not NULL.
> So we can use Kudu similarly as before, only defining the column as
> "update if null" when creating a table or adding a column.
>  



--
This message was sent by Atlassian Jira
(v8.20.7#820007)


[jira] [Commented] (KUDU-3371) Use RocksDB to store LBM metadata

2022-06-26 Thread Yingchun Lai (Jira)


[ 
https://issues.apache.org/jira/browse/KUDU-3371?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17558980#comment-17558980
 ] 

Yingchun Lai commented on KUDU-3371:


Thanks [~weichiu] 

Kudu stores protobuf-serialized metadata in an append-only file today too, so 
the cost of the serialization/deserialization stage will not increase after 
switching to RocksDB.

The difference is the cost of storing the strings in an append-only file versus 
RocksDB (including its mutable memtable and WAL).

> Use RocksDB to store LBM metadata
> -
>
> Key: KUDU-3371
> URL: https://issues.apache.org/jira/browse/KUDU-3371
> Project: Kudu
>  Issue Type: Improvement
>  Components: fs
>Reporter: Yingchun Lai
>Priority: Major
>
> h1. Motivation
> The current LBM container uses separate .data and .metadata files. The .data
> file stores the real user data, where we can use hole punching to reduce disk
> space. The metadata is written as protobuf-serialized strings to a file, in
> append-only mode. Each protobuf object is a BlockRecordPB struct:
>  
> {code:java}
> message BlockRecordPB {
>   required BlockIdPB block_id = 1;  // int64
>   required BlockRecordType op_type = 2;  // CREATE or DELETE
>   required uint64 timestamp_us = 3;
>   optional int64 offset = 4; // Required for CREATE.
>   optional int64 length = 5; // Required for CREATE.
> } {code}
> That means each object is of either CREATE or DELETE type. To mark a 'block'
> as deleted, there will be 2 objects in the metadata: one of CREATE type and
> the other of DELETE type.
> There are some weak points in the current LBM metadata storage mechanism:
> h2. 1. Disk space amplification
> The metadata's live-block rate may be very low; in the worst case there is
> only 1 alive block (supposing it hasn't reached the runtime compaction
> threshold) and all the other thousands of blocks are dead (i.e. in
> CREATE-DELETE pairs).
> So the disk space amplification is very serious.
> h2. 2. Long time bootstrap
> In the Kudu server bootstrap stage, it has to replay all the metadata files
> to find out the alive blocks. In the worst case, we may replay thousands of
> blocks in metadata but find that only a very few of them are alive.
> It may waste much time in almost all cases, since a Kudu cluster in a
> production environment typically runs for several months without
> bootstrapping, so the LBM may be very loose.
> h2. 3. Metadata compaction
> To resolve the issues above, there is a metadata compaction mechanism in LBM,
> both at runtime and in the bootstrap stage.
> The one at runtime will lock the container, and it's synchronous.
> The one in the bootstrap stage is synchronous too, and may make the bootstrap
> time longer.
> h1. Optimization by using RocksDB
> h2. Storage design
>  * RocksDB instance: one RocksDB instance per data directory.
>  * Key: .
>  * Value: the same as before, i.e. the serialized protobuf string, stored
> only for CREATE entries.
>  * Put/Delete: put the value to RocksDB when creating a block, delete it from
> RocksDB when deleting a block
>  * Scan: happens only in the bootstrap stage, to retrieve all blocks
>  * DeleteRange: happens only when invalidating a container
> h2. Advantages
>  # Disk space amplification: there is still a disk space amplification
> problem, but we can tune RocksDB to reach a balanced point; I trust that in
> most cases RocksDB is better than an append-only file.
>  # Bootstrap time: since only valid blocks are left in RocksDB, it may be
> much faster than before.
>  # Metadata compaction: we can leave this work to RocksDB, though tuning is
> of course needed.
> h2. Test & benchmark
> I've been trying to use RocksDB to store LBM container metadata recently;
> I've finished most of the work now and did some benchmarking. It shows that
> the fs module's block read/write/delete performance is similar to, or a
> little worse than, the old implementation's, while the bootstrap time may be
> reduced several times over.
> I'm not sure whether it is worth continuing the work, or whether anybody
> knows of any previous discussion on this topic.



--
This message was sent by Atlassian Jira
(v8.20.7#820007)


[jira] [Commented] (KUDU-3371) Use RocksDB to store LBM metadata

2022-06-26 Thread Yingchun Lai (Jira)


[ 
https://issues.apache.org/jira/browse/KUDU-3371?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17559016#comment-17559016
 ] 

Yingchun Lai commented on KUDU-3371:


Now I've completed the main work of introducing RocksDB to store the log block 
manager's metadata, introduced another block manager type named "logr", and 
added some related unit tests and benchmark tests. The benchmark tests include 
startup; they show that the reopen stage takes up to 90% less time than with 
the 'log' type block manager (but the delete-blocks stage takes about twice as 
long, the create-blocks stage costs a similar time, and block manager shutdown 
takes about 20% less time).

test: log_block_manager-test 
--gtest_filter=EncryptionEnabled/LogBlockManagerTest.StartupBenchmark/0 ...
Flags: --startup_benchmark_block_count_per_batch_for_testing=1000 
--startup_benchmark_batch_count_for_testing=5000. Columns are per-stage costs 
(create blocks, delete blocks, shutdown block manager, reopening block manager) 
for the 'log' and 'logr' block managers:
|--startup_benchmark_deleted_block_percentage|create-log|create-logr|delete-log|delete-logr|shutdown-log|shutdown-logr|reopen-log|reopen-logr|live_blocks|
|10|19.861|18.412|1.307|1.678|10.083|18.736|8.832|5.693|450|
|20|19.369|19.018|2.223|4.292|17.901|21.559|8.503|7.061|400|
|30|20.121|19.737|3.626|6.045|29.604|53.677|8.561|6.189|350|
|40|19.183|18.233|4.409|8.116|37.216|55.642|8.745|4.241|300|
|50|19.997|18.257|4.889|10.178|94.15|70.607|9.342|3.365|250|
|60|19.451|18.08|7.123|11.995|65.856|46.161|9.436|3.166|200|
|70|18.841|18.448|7.249|14.529|84.43|64.063|9.072|3.018|150|
|80|20.418|18.004|9.922|16.708|111.138|77.051|10.026|2.788|100|
|90|20.255|18.144|9.728|18.337|121.562|107.961|9.85|1.317|50|
|95|19.449|18.524|11.598|19.059|140.193|116.238|9.972|1.18|25|
|99|20.583|18.38|11.918|19.505|138.448|114.04|10.085|1.107|5|
|99.9|18.852|18.253|12.137|20.497|143.368|107.981|10.033|1.068|5000|
|99.99|20.024|18.199|11.799|20.181|138.805|111.367|10.631|1.111|500|

 

test: block_manager-stress-test (run the test for 30 seconds, with threads that 
write/read/delete blocks)
| |file|log|logr|
|Wrote blocks|28,320|71,680|77,920|
|Read blocks|3,557,279|3,588,357|3,554,305|
|Deleted blocks|26,681|70,041|76,281|

 

> Use RocksDB to store LBM metadata
> -
>
> Key: KUDU-3371
> URL: https://issues.apache.org/jira/browse/KUDU-3371
> Project: Kudu
>  Issue Type: Improvement
>  Components: fs
>Reporter: Yingchun Lai
>Priority: Major
>
> h1. Motivation
> The current LBM container uses separate .data and .metadata files. The .data
> file stores the real user data, where we can use hole punching to reduce disk
> space. The metadata is written as protobuf-serialized strings to a file, in
> append-only mode. Each protobuf object is a BlockRecordPB struct:
>  
> {code:java}
> message BlockRecordPB {
>   required BlockIdPB block_id = 1;  // int64
>   required BlockRecordType op_type = 2;  // CREATE or DELETE
>   required uint64 timestamp_us = 3;
>   optional int64 offset = 4; // Required for CREATE.
>   optional int64 length = 5; // Required for CREATE.
> } {code}
> That means each object is of either CREATE or DELETE type. To mark a 'block'
> as deleted, there will be 2 objects in the metadata: one of CREATE type and
> the other of DELETE type.
> There are some weak points in the current LBM metadata storage mechanism:
> h2. 1. Disk space amplification
> The metadata's live-block rate may be very low; in the worst case there is
> only 1 alive block (supposing it hasn't reached the runtime compaction
> threshold) and all the other thousands of blocks are dead (i.e. in
> CREATE-DELETE pairs).
> So the disk space amplification is very serious.
> h2. 2. Long time bootstrap
> In the Kudu server bootstrap stage, it has to replay all the metadata files
> to find out the alive blocks. In the worst case, we may replay thousands of
> blocks in metadata but find that only a very few of them are alive.
> It may waste much time in almost all cases, since a Kudu cluster in a
> production environment typically runs for several months without
> bootstrapping, so the LBM may be very loose.
> h2. 3. Metadata compaction
> To resolve the issues above, there is a metadata compaction mechanism in LBM,
> both at runtime and in the bootstrap stage.
> The one at runtime will lock the container, and it's synchronous.
> The one in the bootstrap stage is synchronous too, and may make the bootstrap
> time longer.
> h1. Optimization by using RocksDB
> h2. Storage design
>  * RocksDB instance: one RocksDB instance per data directory.
>  * Key: .
>  * Value: the same as before, i.e. the serialized protobuf string, stored
> only for CREATE entries.
>  * Put/Delete: put the value to RocksDB when creating a block, delete it from
> RocksDB when deleting a block
>  * Scan: happens only in the bootstrap stage, to retrieve all blocks
>  * DeleteRange: happens only when i

[jira] [Comment Edited] (KUDU-3371) Use RocksDB to store LBM metadata

2022-06-26 Thread Yingchun Lai (Jira)


[ 
https://issues.apache.org/jira/browse/KUDU-3371?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17559016#comment-17559016
 ] 

Yingchun Lai edited comment on KUDU-3371 at 6/27/22 6:48 AM:
-

Now I've completed the main work of introducing RocksDB to store the log block 
manager's metadata, introduced another block manager type named "logr", and 
added some related unit tests and benchmark tests. The benchmark tests include 
startup; they show that the reopen stage takes up to 90% less time than with 
the 'log' type block manager (but the delete-blocks stage takes about twice as 
long, the create-blocks stage costs a similar time, and block manager shutdown 
takes about 20% less time).

test: log_block_manager-test 
--gtest_filter=EncryptionEnabled/LogBlockManagerTest.StartupBenchmark/0 ...
Flags: --startup_benchmark_block_count_per_batch_for_testing=1000 
--startup_benchmark_batch_count_for_testing=5000.
|--startup_benchmark_deleted_block_percentage|create-log|create-logr|delete-log|delete-logr|shutdown-log|shutdown-logr|reopen-log|reopen-logr|live_blocks|
|10|19.861|18.412|1.307|1.678|10.083|18.736|8.832|5.693|450|
|20|19.369|19.018|2.223|4.292|17.901|21.559|8.503|7.061|400|
|30|20.121|19.737|3.626|6.045|29.604|53.677|8.561|6.189|350|
|40|19.183|18.233|4.409|8.116|37.216|55.642|8.745|4.241|300|
|50|19.997|18.257|4.889|10.178|94.15|70.607|9.342|3.365|250|
|60|19.451|18.08|7.123|11.995|65.856|46.161|9.436|3.166|200|
|70|18.841|18.448|7.249|14.529|84.43|64.063|9.072|3.018|150|
|80|20.418|18.004|9.922|16.708|111.138|77.051|10.026|2.788|100|
|90|20.255|18.144|9.728|18.337|121.562|107.961|9.85|1.317|50|
|95|19.449|18.524|11.598|19.059|140.193|116.238|9.972|1.18|25|
|99|20.583|18.38|11.918|19.505|138.448|114.04|10.085|1.107|5|
|99.9|18.852|18.253|12.137|20.497|143.368|107.981|10.033|1.068|5000|
|99.99|20.024|18.199|11.799|20.181|138.805|111.367|10.631|1.111|500|

 

test: block_manager-stress-test (run the test for 30 seconds, with threads that 
write/read/delete blocks)
| |file|log|logr|
|Wrote blocks|28,320|71,680|77,920|
|Read blocks|3,557,279|3,588,357|3,554,305|
|Deleted blocks|26,681|70,041|76,281|

 


was (Author: laiyingchun):
Now I've completed the main work of introducing RocksDB to store the log block 
manager's metadata, introduced another block manager type named "logr", and 
added some related unit tests and benchmark tests. The benchmark tests include 
startup; they show that the reopen stage takes up to 90% less time than with 
the 'log' type block manager (but the delete-blocks stage takes about twice as 
long, the create-blocks stage costs a similar time, and block manager shutdown 
takes about 20% less time).

test: log_block_manager-test 
--gtest_filter=EncryptionEnabled/LogBlockManagerTest.StartupBenchmark/0 ...
Flags: --startup_benchmark_block_count_per_batch_for_testing=1000 
--startup_benchmark_batch_count_for_testing=5000. Columns are per-stage costs 
(create blocks, delete blocks, shutdown block manager, reopening block manager) 
for the 'log' and 'logr' block managers:
|--startup_benchmark_deleted_block_percentage|create-log|create-logr|delete-log|delete-logr|shutdown-log|shutdown-logr|reopen-log|reopen-logr|live_blocks|
|10|19.861|18.412|1.307|1.678|10.083|18.736|8.832|5.693|450|
|20|19.369|19.018|2.223|4.292|17.901|21.559|8.503|7.061|400|
|30|20.121|19.737|3.626|6.045|29.604|53.677|8.561|6.189|350|
|40|19.183|18.233|4.409|8.116|37.216|55.642|8.745|4.241|300|
|50|19.997|18.257|4.889|10.178|94.15|70.607|9.342|3.365|250|
|60|19.451|18.08|7.123|11.995|65.856|46.161|9.436|3.166|200|
|70|18.841|18.448|7.249|14.529|84.43|64.063|9.072|3.018|150|
|80|20.418|18.004|9.922|16.708|111.138|77.051|10.026|2.788|100|
|90|20.255|18.144|9.728|18.337|121.562|107.961|9.85|1.317|50|
|95|19.449|18.524|11.598|19.059|140.193|116.238|9.972|1.18|25|
|99|20.583|18.38|11.918|19.505|138.448|114.04|10.085|1.107|5|
|99.9|18.852|18.253|12.137|20.497|143.368|107.981|10.033|1.068|5000|
|99.99|20.024|18.199|11.799|20.181|138.805|111.367|10.631|1.111|500|

 

test: block_manager-stress-test (run the test for 30 seconds, with threads 
writing/reading/deleting blocks)
| |file|log|logr|
|Wrote blocks|28,320|71,680|77,920|
|Read blocks|3,557,279|3,588,357|3,554,305|
|Deleted blocks|26,681|70,041|76,281|

 

> Use RocksDB to store LBM metadata
> -
>
> Key: KUDU-3371
> URL: https://issues.apache.org/jira/browse/KUDU-3371
> Project: Kudu
>  Issue Type: Improvement
>  Components: fs
>Reporter: Yingchun Lai
>Priority: Major
>
> h1. Motivation
> The current LBM container uses separate .data and .metadata files. The .data 
> file stores the real user data, and we can use hole punching to reduce disk 
> space, while the metadata is written as protobuf-serialized strings to a 
> file in append-only mode. Each protobuf object is a struct of BlockRecordPB:
>  
> {code:java}
> message BlockRecordPB

[jira] [Commented] (KUDU-3371) Use RocksDB to store LBM metadata

2022-06-26 Thread Yingchun Lai (Jira)


[ 
https://issues.apache.org/jira/browse/KUDU-3371?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17559021#comment-17559021
 ] 

Yingchun Lai commented on KUDU-3371:


The side effect of RocksDB is that deleting blocks costs more time, but block 
deletion always happens in the background, so it doesn't affect user-facing 
write, scan, or alter-table operations.

And we have the opportunity to research this further by tuning the RocksDB 
options; everybody can contribute to that later.
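
To make the intended write path concrete, below is a minimal sketch of how a 
"logr" container could put/delete block records in RocksDB; the key layout, 
the helper names, and the "." separator are my assumptions for illustration, 
not the actual patch:
{code:java}
#include <string>

#include <rocksdb/db.h>
#include <rocksdb/options.h>

// Illustrative only: one RocksDB instance per data directory, keyed by a
// (container id, block id) pair so that a whole container can be dropped
// with a single DeleteRange(). The "container.block" key encoding is an
// assumption made for this sketch.
rocksdb::Status PutBlockRecord(rocksdb::DB* db,
                               const std::string& container_id,
                               const std::string& block_id,
                               const std::string& serialized_record_pb) {
  // CREATE: store the serialized BlockRecordPB as the value.
  return db->Put(rocksdb::WriteOptions(),
                 container_id + "." + block_id, serialized_record_pb);
}

rocksdb::Status DeleteBlockRecord(rocksdb::DB* db,
                                  const std::string& container_id,
                                  const std::string& block_id) {
  // DELETE: remove the key instead of appending a DELETE entry, so only
  // live blocks remain and bootstrap simply scans whatever is left.
  return db->Delete(rocksdb::WriteOptions(), container_id + "." + block_id);
}

rocksdb::Status InvalidateContainer(rocksdb::DB* db,
                                    const std::string& container_id) {
  // Drop every record of a dead container in one range deletion;
  // '/' is the successor of '.' in ASCII, so this bounds all of its keys.
  return db->DeleteRange(rocksdb::WriteOptions(), db->DefaultColumnFamily(),
                         container_id + ".", container_id + "/");
}
{code}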

> Use RocksDB to store LBM metadata
> -
>
> Key: KUDU-3371
> URL: https://issues.apache.org/jira/browse/KUDU-3371
> Project: Kudu
>  Issue Type: Improvement
>  Components: fs
>Reporter: Yingchun Lai
>Priority: Major
>
> h1. Motivation
> The current LBM container uses separate .data and .metadata files. The .data 
> file stores the real user data, and we can use hole punching to reduce disk 
> space, while the metadata is written as protobuf-serialized strings to a 
> file in append-only mode. Each protobuf object is a struct of BlockRecordPB:
>  
> {code:java}
> message BlockRecordPB {
>   required BlockIdPB block_id = 1;  // int64
>   required BlockRecordType op_type = 2;  // CREATE or DELETE
>   required uint64 timestamp_us = 3;
>   optional int64 offset = 4; // Required for CREATE.
>   optional int64 length = 5; // Required for CREATE.
> } {code}
> That means each object is of either CREATE or DELETE type. To mark a block 
> as deleted, there will be 2 objects in the metadata: one of CREATE type and 
> the other of DELETE type.
> There are some weak points in the current LBM metadata storage mechanism:
> h2. 1. Disk space amplification
> The metadata's live-block rate may be very low; the worst case is that only 
> 1 block is alive (supposing it hasn't reached the runtime compaction 
> threshold) while all the other thousands of blocks are dead (i.e. in 
> CREATE-DELETE pairs).
> So the disk space amplification is very serious.
> h2. 2. Long bootstrap time
> In the Kudu server bootstrap stage, the server has to replay all the 
> metadata files to find the alive blocks. In the worst case, we may replay 
> thousands of blocks from the metadata but find only a very few blocks alive.
> It may waste much time in almost all cases, since Kudu clusters in 
> production environments often run for several months without bootstrapping, 
> so the LBM may be very loose.
> h2. 3. Metadata compaction
> To resolve the issues above, there is a metadata compaction mechanism in 
> LBM, both at runtime and in the bootstrap stage.
> The one at runtime locks the container, and it's synchronous.
> The one in the bootstrap stage is synchronous too, and may make the 
> bootstrap time longer.
> h1. Optimization by using RocksDB
> h2. Storage design
>  * RocksDB instance: one RocksDB instance per data directory.
>  * Key: .
>  * Value: the same as before, i.e. the serialized protobuf string, stored 
> only for CREATE entries.
>  * Put/Delete: put the value to RocksDB when creating a block, and delete it 
> from RocksDB when deleting the block.
>  * Scan: happens only in the bootstrap stage, to retrieve all blocks.
>  * DeleteRange: happens only when invalidating a container.
> h2. Advantages
>  # Disk space amplification: there is still a disk space amplification 
> problem, but we can tune RocksDB to reach a balanced point; I trust that in 
> most cases RocksDB is better than an append-only file.
>  # Bootstrap time: since only valid blocks are left in RocksDB, it may be 
> much faster than before.
>  # Metadata compaction: we can leave this work to RocksDB, with some tuning 
> needed of course.
> h2. test & benchmark
> I've been trying to use RocksDB to store LBM container metadata recently, 
> have finished most of the work now, and did some benchmarking. It shows that 
> the fs module's block read/write/delete performance is similar to or a 
> little worse than the old implementation, while the bootstrap time may be 
> reduced several times.
> I'm not sure whether it is worth continuing the work, or whether anybody 
> knows of any prior discussion on this topic.



--
This message was sent by Atlassian Jira
(v8.20.7#820007)


[jira] [Commented] (KUDU-3371) Use RocksDB to store LBM metadata

2022-06-27 Thread Yingchun Lai (Jira)


[ 
https://issues.apache.org/jira/browse/KUDU-3371?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17559031#comment-17559031
 ] 

Yingchun Lai commented on KUDU-3371:


I have submitted a merge request on gerrit [1], but it seems too large and not 
friendly for reviewers, so I will split it into several small merge requests 
(see the rough sketch at the end of this comment):
 # Refactor LogBlockManager into a base class, and add LogfBlockManager 
extending from it. LogfBlockManager is the log block manager that manages the 
append-only files used to store containers' metadata, i.e. what we do today.
 # Refactor LogBlockContainer into a base class, and add LogfBlockContainer 
extending from it. LogfBlockContainer is the log block container that uses an 
append-only file to store the container's metadata, i.e. what we do today.
 # Introduce RocksDB as a thirdparty lib.
 # Add LogrBlockContainer, which uses RocksDB to store container metadata, and 
add LogrBlockManager to manage LogrBlockContainer. Add related unit tests.
 # Do some refactoring to support batch operations on blocks.
 # Use existing benchmarks to show the effect.
 # Add some metrics. (TODO, not included in [1])
 # Add more kudu tools to operate on the RocksDB metadata. (TODO, not included 
in [1])
 # Further tuning of RocksDB options. (TODO, not included in [1])

 

1. [https://gerrit.cloudera.org/c/18569/]
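
A rough skeleton of the class split described in items 1-4 above (a sketch 
only; the real signatures and members will differ):
{code:java}
#include <iostream>
#include <memory>

// Stand-in for kudu::Status in this sketch.
struct Status {
  static Status OK() { return Status(); }
};

// Base class: logic common to both metadata backends (container and block
// tracking, hole punching of .data files, and so on).
class LogBlockManager {
 public:
  virtual ~LogBlockManager() = default;
  // Load all containers' metadata at startup.
  virtual Status Open() = 0;
};

// "log"/"logf" flavor: replays the append-only .metadata files, as today.
class LogfBlockManager : public LogBlockManager {
 public:
  Status Open() override {
    std::cout << "replaying append-only .metadata files" << std::endl;
    return Status::OK();
  }
};

// "logr" flavor: scans only the live records left in RocksDB.
class LogrBlockManager : public LogBlockManager {
 public:
  Status Open() override {
    std::cout << "scanning live block records from RocksDB" << std::endl;
    return Status::OK();
  }
};

int main() {
  // In practice the backend would be selected by the --block_manager flag.
  std::unique_ptr<LogBlockManager> bm(new LogrBlockManager());
  bm->Open();
  return 0;
}
{code}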

> Use RocksDB to store LBM metadata
> -
>
> Key: KUDU-3371
> URL: https://issues.apache.org/jira/browse/KUDU-3371
> Project: Kudu
>  Issue Type: Improvement
>  Components: fs
>Reporter: Yingchun Lai
>Priority: Major
>
> h1. Motivation
> The current LBM container uses separate .data and .metadata files. The .data 
> file stores the real user data, and we can use hole punching to reduce disk 
> space, while the metadata is written as protobuf-serialized strings to a 
> file in append-only mode. Each protobuf object is a struct of BlockRecordPB:
>  
> {code:java}
> message BlockRecordPB {
>   required BlockIdPB block_id = 1;  // int64
>   required BlockRecordType op_type = 2;  // CREATE or DELETE
>   required uint64 timestamp_us = 3;
>   optional int64 offset = 4; // Required for CREATE.
>   optional int64 length = 5; // Required for CREATE.
> } {code}
> That means each object is of either CREATE or DELETE type. To mark a block 
> as deleted, there will be 2 objects in the metadata: one of CREATE type and 
> the other of DELETE type.
> There are some weak points in the current LBM metadata storage mechanism:
> h2. 1. Disk space amplification
> The metadata's live-block rate may be very low; the worst case is that only 
> 1 block is alive (supposing it hasn't reached the runtime compaction 
> threshold) while all the other thousands of blocks are dead (i.e. in 
> CREATE-DELETE pairs).
> So the disk space amplification is very serious.
> h2. 2. Long bootstrap time
> In the Kudu server bootstrap stage, the server has to replay all the 
> metadata files to find the alive blocks. In the worst case, we may replay 
> thousands of blocks from the metadata but find only a very few blocks alive.
> It may waste much time in almost all cases, since Kudu clusters in 
> production environments often run for several months without bootstrapping, 
> so the LBM may be very loose.
> h2. 3. Metadata compaction
> To resolve the issues above, there is a metadata compaction mechanism in 
> LBM, both at runtime and in the bootstrap stage.
> The one at runtime locks the container, and it's synchronous.
> The one in the bootstrap stage is synchronous too, and may make the 
> bootstrap time longer.
> h1. Optimization by using RocksDB
> h2. Storage design
>  * RocksDB instance: one RocksDB instance per data directory.
>  * Key: .
>  * Value: the same as before, i.e. the serialized protobuf string, stored 
> only for CREATE entries.
>  * Put/Delete: put the value to RocksDB when creating a block, and delete it 
> from RocksDB when deleting the block.
>  * Scan: happens only in the bootstrap stage, to retrieve all blocks.
>  * DeleteRange: happens only when invalidating a container.
> h2. Advantages
>  # Disk space amplification: there is still a disk space amplification 
> problem, but we can tune RocksDB to reach a balanced point; I trust that in 
> most cases RocksDB is better than an append-only file.
>  # Bootstrap time: since only valid blocks are left in RocksDB, it may be 
> much faster than before.
>  # Metadata compaction: we can leave this work to RocksDB, with some tuning 
> needed of course.
> h2. test & benchmark
> I've been trying to use RocksDB to store LBM container metadata recently, 
> have finished most of the work now, and did some benchmarking. It shows that 
> the fs module's block read/write/delete performance is similar to or a 
> little worse than the old implementation, while the bootstrap time may be 
> reduced several times.
> I'm not sure whether it is worth continuing the work, or whether anybody 
> knows of any prior discussion on this topic.

[jira] [Comment Edited] (KUDU-3371) Use RocksDB to store LBM metadata

2022-07-06 Thread Yingchun Lai (Jira)


[ 
https://issues.apache.org/jira/browse/KUDU-3371?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17559031#comment-17559031
 ] 

Yingchun Lai edited comment on KUDU-3371 at 7/6/22 3:59 PM:


I have submitted a merge request on gerrit [1], but it seems too large and not 
friendly for reviewers, so I will split it into several small merge requests.
 # Refactor LogBlockManager into a base class, and add LogfBlockManager 
extending from it. LogfBlockManager is the log block manager that manages the 
append-only files used to store containers' metadata, i.e. what we do today.
 # Refactor LogBlockContainer into a base class, and add LogfBlockContainer 
extending from it. LogfBlockContainer is the log block container that uses an 
append-only file to store the container's metadata, i.e. what we do today.
 # Introduce RocksDB as a thirdparty lib.
 # Add LogrBlockContainer, which uses RocksDB to store container metadata, and 
add LogrBlockManager to manage LogrBlockContainer. Add related unit tests.
 # Do some refactoring to support batch operations on blocks.
 # Use existing benchmarks to show the effect.
 # Add some metrics. (TODO, not included in [1])
 # Add more kudu tools to operate on the RocksDB metadata. (TODO, not included 
in [1])
 # Further tuning of RocksDB options. (TODO, not included in [1])

 

1. [https://gerrit.cloudera.org/c/18569/]


was (Author: laiyingchun):
I have submitted a merge request on gerrit [1], but it seems too large and not 
friendly for reviewers, so I will split it into several small merge requests.
 # Refactor LogBlockManager into a base class, and add LogfBlockManager 
extending from it. LogfBlockManager is the log block manager that manages the 
append-only files used to store containers' metadata, i.e. what we do today.
 # Refactor LogBlockContainer into a base class, and add LogfBlockContainer 
extending from it. LogfBlockContainer is the log block container that uses an 
append-only file to store the container's metadata, i.e. what we do today.
 # Introduce RocksDB as a thirdparty lib.
 # Add LogrBlockContainer, which uses RocksDB to store container metadata, and 
add LogrBlockManager to manage LogrBlockContainer. Add related unit tests.
 # Do some refactoring to support batch operations on blocks.
 # Use existing benchmarks to show the effect.
 # Add some metrics. (TODO, not included in [1])
 # Add more kudu tools to operate on the RocksDB metadata. (TODO, not included 
in [1])
 # Further tuning of RocksDB options. (TODO, not included in [1])

 

1. [https://gerrit.cloudera.org/c/18569/]

> Use RocksDB to store LBM metadata
> -
>
> Key: KUDU-3371
> URL: https://issues.apache.org/jira/browse/KUDU-3371
> Project: Kudu
>  Issue Type: Improvement
>  Components: fs
>Reporter: Yingchun Lai
>Priority: Major
>
> h1. Motivation
> The current LBM container uses separate .data and .metadata files. The .data 
> file stores the real user data, and we can use hole punching to reduce disk 
> space, while the metadata is written as protobuf-serialized strings to a 
> file in append-only mode. Each protobuf object is a struct of BlockRecordPB:
>  
> {code:java}
> message BlockRecordPB {
>   required BlockIdPB block_id = 1;  // int64
>   required BlockRecordType op_type = 2;  // CREATE or DELETE
>   required uint64 timestamp_us = 3;
>   optional int64 offset = 4; // Required for CREATE.
>   optional int64 length = 5; // Required for CREATE.
> } {code}
> That means each object is of either CREATE or DELETE type. To mark a block 
> as deleted, there will be 2 objects in the metadata: one of CREATE type and 
> the other of DELETE type.
> There are some weak points in the current LBM metadata storage mechanism:
> h2. 1. Disk space amplification
> The metadata's live-block rate may be very low; the worst case is that only 
> 1 block is alive (supposing it hasn't reached the runtime compaction 
> threshold) while all the other thousands of blocks are dead (i.e. in 
> CREATE-DELETE pairs).
> So the disk space amplification is very serious.
> h2. 2. Long bootstrap time
> In the Kudu server bootstrap stage, the server has to replay all the 
> metadata files to find the alive blocks. In the worst case, we may replay 
> thousands of blocks from the metadata but find only a very few blocks alive.
> It may waste much time in almost all cases, since Kudu clusters in 
> production environments often run for several months without bootstrapping, 
> so the LBM may be very loose.
> h2. 3. Metadata compaction
> To resolve the issues above, there is a metadata compaction mechanism in 
> LBM, both at runtime and in the bootstrap stage.
> The one at runtime locks the container, and it's synchronous.
> The one in the bootstrap stage is synchronous too, and may make the 
> bootstrap time longer.
> 

[jira] [Commented] (KUDU-3353) Support setnx semantic on column

2022-07-20 Thread Yingchun Lai (Jira)


[ 
https://issues.apache.org/jira/browse/KUDU-3353?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17568976#comment-17568976
 ] 

Yingchun Lai commented on KUDU-3353:


Let me clarify some use cases:

A user profile table in Kudu has a column "first_login_ts", which represents 
the user's first login time on the website. The data in the table is upserted 
from user event logs; each log contains the user's id, some attributes, and 
"first_login_ts". The first_login_ts is filled with the log's production time, 
which means that for a given user, each event log carries a different (ever 
higher) "first_login_ts", but only the first one should be set, and the 
following logs should not update it.

 

The updated design:

1. Add a column attribute to define a column as IMMUTABLE, meaning the column 
cell value cannot be updated after it has been written when inserting the row.

2. Use UPDATE_IGNORE and add UPSERT_IGNORE: UPDATE and UPSERT ops that ignore 
update errors on IMMUTABLE columns (a minimal client-side sketch follows).
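
For illustration, a minimal client-side sketch of the design above; the 
Immutable() column-spec setter and the NewUpsertIgnore() op are the API names 
proposed here, not necessarily the final ones:
{code:java}
#include <memory>

#include "kudu/client/client.h"
#include "kudu/client/write_op.h"

using kudu::client::KuduClient;
using kudu::client::KuduColumnSchema;
using kudu::client::KuduSchema;
using kudu::client::KuduSchemaBuilder;
using kudu::client::KuduSession;
using kudu::client::KuduTable;
using kudu::client::KuduWriteOperation;

// Schema with an immutable first_login_ts: written once at insert time,
// never changed by later writes.
kudu::Status BuildProfileSchema(KuduSchema* schema) {
  KuduSchemaBuilder b;
  b.AddColumn("user_id")->Type(KuduColumnSchema::INT64)->NotNull()->PrimaryKey();
  b.AddColumn("first_login_ts")->Type(KuduColumnSchema::INT64)->Immutable();
  return b.Build(schema);
}

// UPSERT_IGNORE: inserts the row if absent; if the row exists, it updates
// the mutable columns and silently skips the immutable one instead of
// failing the whole operation.
kudu::Status WriteProfileEvent(const kudu::client::sp::shared_ptr<KuduClient>& client,
                               int64_t user_id, int64_t login_ts) {
  kudu::client::sp::shared_ptr<KuduTable> table;
  KUDU_RETURN_NOT_OK(client->OpenTable("user_profile", &table));
  kudu::client::sp::shared_ptr<KuduSession> session = client->NewSession();

  std::unique_ptr<KuduWriteOperation> op(table->NewUpsertIgnore());
  KUDU_RETURN_NOT_OK(op->mutable_row()->SetInt64("user_id", user_id));
  KUDU_RETURN_NOT_OK(op->mutable_row()->SetInt64("first_login_ts", login_ts));
  KUDU_RETURN_NOT_OK(session->Apply(op.release()));
  return session->Flush();
}
{code}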

> Support setnx semantic on column
> 
>
> Key: KUDU-3353
> URL: https://issues.apache.org/jira/browse/KUDU-3353
> Project: Kudu
>  Issue Type: New Feature
>  Components: api, server
>Reporter: Yingchun Lai
>Assignee: Yingchun Lai
>Priority: Major
>
> h1. motivation
> In some usage scenarios, a Kudu table has a column with the semantics of 
> "create time", meaning it represents the creation timestamp of the row. The 
> other columns have the usual semantics, for example user properties like 
> age, address, etc.
> The upstream Kudu user doesn't know whether a row exists or not, and every 
> cell's data is the latest ingested from, for example, an event stream.
> Without the "create time" column, the Kudu user can use UPSERT operations to 
> write data to the table, and every column with data will overwrite the old 
> data. But with the "create time" column, the cell data will be overwritten 
> by the following UPSERT ops, which is not what we expect.
> To achieve the goal, we have to read the column out to judge whether it is 
> NULL or not: if it's NULL, we can fill the row with the cell; if not NULL, 
> we drop it from the data before the UPSERT, to avoid overwriting "create 
> time".
> It's expensive; is there a way to avoid a read from Kudu?
> h1. Resolution
> We can implement a column schema with "update if null" semantics. That means 
> cell data in the changelist will update the base data if the latter is NULL, 
> and updates will be ignored if it is not NULL.
> So we can use Kudu similarly as before, having only defined the column as 
> "update if null" when creating the table or adding the column.
>  



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Comment Edited] (KUDU-3353) Support setnx semantic on column

2022-07-20 Thread Yingchun Lai (Jira)


[ 
https://issues.apache.org/jira/browse/KUDU-3353?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17568976#comment-17568976
 ] 

Yingchun Lai edited comment on KUDU-3353 at 7/20/22 11:27 AM:
--

Let me clarify some use cases:

A user profile table in Kudu has a column "first_login_ts", which represents 
the user's first login time on the website. The data in the table is upserted 
from user event logs; each log contains the user's id, some attributes, and 
"first_login_ts". The first_login_ts is filled with the log's production time, 
which means that for a given user, each event log carries a different (ever 
higher) "first_login_ts", but only the first one should be set, and the 
following logs should not update it.

The same applies to columns such as sex, birthday, birthplace, etc.

 

If the table column supports an "immutable" attribute, the new value in 
update/upsert ops will not be applied to the change list, so we gain the 
benefit of faster reads.

And without the immutable attribute, in some cases we would have to read the 
old value, compare it with the new value, and then judge which value wins, 
which would cost much more.

 

The updated design:

1. Add a column attribute to define a column as IMMUTABLE, meaning the column 
cell value cannot be updated after it has been written when inserting the row.

2. Use UPDATE_IGNORE and add UPSERT_IGNORE: UPDATE and UPSERT ops that ignore 
update errors on IMMUTABLE columns.


was (Author: laiyingchun):
Let me clarify some use cases:

A user profile table in Kudu has a column "first_login_ts", which represents 
the user's first login time on the website. The data in the table is upserted 
from user event logs; each log contains the user's id, some attributes, and 
"first_login_ts". The first_login_ts is filled with the log's production time, 
which means that for a given user, each event log carries a different (ever 
higher) "first_login_ts", but only the first one should be set, and the 
following logs should not update it.

 

The updated design:

1. Add a column attribute to define a column as IMMUTABLE, meaning the column 
cell value cannot be updated after it has been written when inserting the row.

2. Use UPDATE_IGNORE and add UPSERT_IGNORE: UPDATE and UPSERT ops that ignore 
update errors on IMMUTABLE columns.

> Support setnx semantic on column
> 
>
> Key: KUDU-3353
> URL: https://issues.apache.org/jira/browse/KUDU-3353
> Project: Kudu
>  Issue Type: New Feature
>  Components: api, server
>Reporter: Yingchun Lai
>Assignee: Yingchun Lai
>Priority: Major
>
> h1. motivation
> In some usage scenarios, a Kudu table has a column with the semantics of 
> "create time", meaning it represents the creation timestamp of the row. The 
> other columns have the usual semantics, for example user properties like 
> age, address, etc.
> The upstream Kudu user doesn't know whether a row exists or not, and every 
> cell's data is the latest ingested from, for example, an event stream.
> Without the "create time" column, the Kudu user can use UPSERT operations to 
> write data to the table, and every column with data will overwrite the old 
> data. But with the "create time" column, the cell data will be overwritten 
> by the following UPSERT ops, which is not what we expect.
> To achieve the goal, we have to read the column out to judge whether it is 
> NULL or not: if it's NULL, we can fill the row with the cell; if not NULL, 
> we drop it from the data before the UPSERT, to avoid overwriting "create 
> time".
> It's expensive; is there a way to avoid a read from Kudu?
> h1. Resolution
> We can implement a column schema with "update if null" semantics. That means 
> cell data in the changelist will update the base data if the latter is NULL, 
> and updates will be ignored if it is not NULL.
> So we can use Kudu similarly as before, having only defined the column as 
> "update if null" when creating the table or adding the column.
>  



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (KUDU-3400) CompilationManager::RequestRowProjector consumed too much memory

2022-09-14 Thread Yingchun Lai (Jira)
Yingchun Lai created KUDU-3400:
--

 Summary: CompilationManager::RequestRowProjector consumed too much 
memory
 Key: KUDU-3400
 URL: https://issues.apache.org/jira/browse/KUDU-3400
 Project: Kudu
  Issue Type: Bug
  Components: codegen
Affects Versions: 1.12.0
Reporter: Yingchun Lai


In one of our clusters, we found that the 
CompilationManager::RequestRowProjector function accidentally consumed too 
much memory. Some context on this cluster:
 # some tables have more than 1000 columns, so the table schema may be very 
costly to copy
 # sometimes the tservers are under memory pressure, and then do flush 
operations more frequently (to try to reduce the memory consumed by MRS/DMS)

I captured a heap profile on a tserver and found out that 
CompilationManager::RequestRowProjector costs the most memory when Schemas 
are copied; the source code:

 
{code:java}
CompilationTask(const Schema& base, const Schema& proj, CodeCache* cache,
                CodeGenerator* generator)
  : base_(base),
    proj_(proj),
    cache_(cache),
    generator_(generator) {} {code}
That is to say, the Schemas (i.e. base and proj) are copied when constructing 
CompilationTask objects.

The heap profile says that Schema consumed about 50GB of memory, which really 
shocked me; even though the Schema is large, how could it consume 50GB of 
memory? I forgot to `pstack` the process when it happened; maybe there were 
hundreds of thousands of CompilationManager::RequestRowProjector calls at 
that time, but according to the code logic, it should not hang there for a 
long time?
{code:java}
if (!cached) {
  shared_ptr<CompilationTask> task(make_shared<CompilationTask>(
      *base_schema, *projection, &cache_, &generator_));
  WARN_NOT_OK_EVERY_N_SECS(pool_->Submit([task]() { task->Run(); }),
      "RowProjector compilation request submit failed", 10);
  return false;
} {code}
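
One way to avoid these per-request copies, sketched here as an assumption 
rather than an actual patch, is to hold the schemas by shared pointer so that 
queuing a task only bumps reference counts:
{code:java}
#include <memory>
#include <utility>

// Forward declarations standing in for the real Kudu types.
class Schema;
class CodeCache;
class CodeGenerator;

// Sketch only: the task shares ownership of the schemas instead of
// deep-copying them, so constructing it costs two refcount increments
// rather than two full Schema copies (expensive for 1000+ column tables).
// Assumes the caller can hand over shared ownership of the schemas.
class CompilationTaskLite {
 public:
  CompilationTaskLite(std::shared_ptr<const Schema> base,
                      std::shared_ptr<const Schema> proj,
                      CodeCache* cache, CodeGenerator* generator)
      : base_(std::move(base)),
        proj_(std::move(proj)),
        cache_(cache),
        generator_(generator) {}

 private:
  std::shared_ptr<const Schema> base_;  // no deep copy, just a ref count
  std::shared_ptr<const Schema> proj_;
  CodeCache* cache_;
  CodeGenerator* generator_;
};
{code}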
 



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (KUDU-3400) CompilationManager::RequestRowProjector consumed too much memory

2022-09-14 Thread Yingchun Lai (Jira)


 [ 
https://issues.apache.org/jira/browse/KUDU-3400?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yingchun Lai updated KUDU-3400:
---
Attachment: data02heap.svg

> CompilationManager::RequestRowProjector consumed too much memory
> 
>
> Key: KUDU-3400
> URL: https://issues.apache.org/jira/browse/KUDU-3400
> Project: Kudu
>  Issue Type: Bug
>  Components: codegen
>Affects Versions: 1.12.0
>Reporter: Yingchun Lai
>Priority: Major
> Attachments: data02heap.svg
>
>
> In one of our clusters, we found that the 
> CompilationManager::RequestRowProjector function accidentally consumed too 
> much memory. Some context on this cluster:
>  # some tables have more than 1000 columns, so the table schema may be very 
> costly to copy
>  # sometimes the tservers are under memory pressure, and then do flush 
> operations more frequently (to try to reduce the memory consumed by MRS/DMS)
> I captured a heap profile on a tserver and found out that 
> CompilationManager::RequestRowProjector costs the most memory when Schemas 
> are copied; the source code:
>  
> {code:java}
> CompilationTask(const Schema& base, const Schema& proj, CodeCache* cache,
>                 CodeGenerator* generator)
>   : base_(base),
>     proj_(proj),
>     cache_(cache),
>     generator_(generator) {} {code}
> That is to say, the Schemas (i.e. base and proj) are copied when 
> constructing CompilationTask objects.
> The heap profile says that Schema consumed about 50GB of memory, which 
> really shocked me; even though the Schema is large, how could it consume 
> 50GB of memory? I forgot to `pstack` the process when it happened; maybe 
> there were hundreds of thousands of CompilationManager::RequestRowProjector 
> calls at that time, but according to the code logic, it should not hang 
> there for a long time?
> {code:java}
> if (!cached) {
>   shared_ptr<CompilationTask> task(make_shared<CompilationTask>(
>       *base_schema, *projection, &cache_, &generator_));
>   WARN_NOT_OK_EVERY_N_SECS(pool_->Submit([task]() { task->Run(); }),
>       "RowProjector compilation request submit failed", 10);
>   return false;
> } {code}
>  



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (KUDU-3353) Add an immutable attribute on column schema

2022-10-19 Thread Yingchun Lai (Jira)


 [ 
https://issues.apache.org/jira/browse/KUDU-3353?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yingchun Lai updated KUDU-3353:
---
Summary: Add an immutable attribute on column schema  (was: Support setnx 
semantic on column)

> Add an immutable attribute on column schema
> ---
>
> Key: KUDU-3353
> URL: https://issues.apache.org/jira/browse/KUDU-3353
> Project: Kudu
>  Issue Type: New Feature
>  Components: api, server
>Reporter: Yingchun Lai
>Assignee: Yingchun Lai
>Priority: Major
>
> h1. motivation
> In some usage scenarios, a Kudu table has a column with the semantics of 
> "create time", meaning it represents the creation timestamp of the row. The 
> other columns have the usual semantics, for example user properties like 
> age, address, etc.
> The upstream Kudu user doesn't know whether a row exists or not, and every 
> cell's data is the latest ingested from, for example, an event stream.
> Without the "create time" column, the Kudu user can use UPSERT operations to 
> write data to the table, and every column with data will overwrite the old 
> data. But with the "create time" column, the cell data will be overwritten 
> by the following UPSERT ops, which is not what we expect.
> To achieve the goal, we have to read the column out to judge whether it is 
> NULL or not: if it's NULL, we can fill the row with the cell; if not NULL, 
> we drop it from the data before the UPSERT, to avoid overwriting "create 
> time".
> It's expensive; is there a way to avoid a read from Kudu?
> h1. Resolution
> We can implement a column schema with "update if null" semantics. That means 
> cell data in the changelist will update the base data if the latter is NULL, 
> and updates will be ignored if it is not NULL.
> So we can use Kudu similarly as before, having only defined the column as 
> "update if null" when creating the table or adding the column.
>  



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (KUDU-3353) Add an immutable attribute on column schema

2022-10-19 Thread Yingchun Lai (Jira)


[ 
https://issues.apache.org/jira/browse/KUDU-3353?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17620695#comment-17620695
 ] 

Yingchun Lai commented on KUDU-3353:


Now almost all parts of this feature have been implemented, but the kudu-spark 
part is left; currently there is no Spark use case, so we can implement it if 
someone needs it.

> Add an immutable attribute on column schema
> ---
>
> Key: KUDU-3353
> URL: https://issues.apache.org/jira/browse/KUDU-3353
> Project: Kudu
>  Issue Type: New Feature
>  Components: api, server
>Reporter: Yingchun Lai
>Assignee: Yingchun Lai
>Priority: Major
>
> h1. motivation
> In some usage scenarios, a Kudu table has a column with the semantics of 
> "create time", meaning it represents the creation timestamp of the row. The 
> other columns have the usual semantics, for example user properties like 
> age, address, etc.
> The upstream Kudu user doesn't know whether a row exists or not, and every 
> cell's data is the latest ingested from, for example, an event stream.
> Without the "create time" column, the Kudu user can use UPSERT operations to 
> write data to the table, and every column with data will overwrite the old 
> data. But with the "create time" column, the cell data will be overwritten 
> by the following UPSERT ops, which is not what we expect.
> To achieve the goal, we have to read the column out to judge whether it is 
> NULL or not: if it's NULL, we can fill the row with the cell; if not NULL, 
> we drop it from the data before the UPSERT, to avoid overwriting "create 
> time".
> It's expensive; is there a way to avoid a read from Kudu?
> h1. Resolution
> We can implement a column schema with "update if null" semantics. That means 
> cell data in the changelist will update the base data if the latter is NULL, 
> and updates will be ignored if it is not NULL.
> So we can use Kudu similarly as before, having only defined the column as 
> "update if null" when creating the table or adding the column.
>  



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (KUDU-3419) Tablet server maybe get stuck when loading tablet metadata failed

2022-11-06 Thread Yingchun Lai (Jira)


[ 
https://issues.apache.org/jira/browse/KUDU-3419?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17629429#comment-17629429
 ] 

Yingchun Lai commented on KUDU-3419:


When the tserver shuts down, all internal objects are shut down too, so why 
does tablet_manager_ need a manual shutdown?
{code:java}
TabletServer::~TabletServer() {
  ShutdownImpl();
}

void TabletServer::ShutdownImpl() {
  if (kInitialized == state_ || kRunning == state_) {
const string name = rpc_server_->ToString();
LOG(INFO) << "TabletServer@" << name << " shutting down...";

// 1. Stop accepting new RPCs.
UnregisterAllServices();

// 2. Shut down the tserver's subsystems.
maintenance_manager_->Shutdown();
WARN_NOT_OK(heartbeater_->Stop(), "Failed to stop TS Heartbeat thread");
fs_manager_->UnsetErrorNotificationCb(ErrorHandlerType::DISK_ERROR);
fs_manager_->UnsetErrorNotificationCb(ErrorHandlerType::CFILE_CORRUPTION);
tablet_manager_->Shutdown();   // <== tablet_manager_ will be shut down
client_initializer_->Shutdown();

// 3. Shut down generic subsystems.
KuduServer::Shutdown();
LOG(INFO) << "TabletServer@" << name << " shutdown complete.";
  }
  state_ = kStopped;
} {code}

> Tablet server maybe get stuck when loading tablet metadata failed
> -
>
> Key: KUDU-3419
> URL: https://issues.apache.org/jira/browse/KUDU-3419
> Project: Kudu
>  Issue Type: Bug
>Reporter: Xixu Wang
>Priority: Major
> Attachments: image-2022-11-04-14-57-49-684.png, 
> image-2022-11-04-14-59-54-665.png, image-2022-11-04-15-25-05-437.png, 
> image-2022-11-04-15-29-27-092.png, image-2022-11-04-15-30-08-892.png, 
> image-2022-11-04-15-32-34-366.png
>
>
> The tablet server may get stuck when loading tablet metadata fails.
> The following steps reproduce the bug:
> 1. Change the owner of one tablet meta file to root. We use the account 
> *kudu* to run Kudu.
> !image-2022-11-04-14-57-49-684.png!
> 2. Start an instance of the tablet server. A permission error will be seen:
> !image-2022-11-04-15-29-27-092.png!
> 3. The tablet server gets stuck and will not exit automatically.
> !image-2022-11-04-15-30-08-892.png!
> 4. The pstack is as follows:
> As we can see, the tablet server cannot exit because the ThreadPool cannot 
> be shut down: TxnStatlessTrasckerTask is still running, which prevents the 
> threadpool from shutting down.
> !image-2022-11-04-15-32-34-366.png!



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (KUDU-3367) Delta file with full of delete op can not be schedule to compact

2022-11-13 Thread Yingchun Lai (Jira)


[ 
https://issues.apache.org/jira/browse/KUDU-3367?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17633458#comment-17633458
 ] 

Yingchun Lai commented on KUDU-3367:


[~zhangyifan27]  KUDU-1625 depends on the tablet supporting 'live row count' 
(which was introduced in Kudu 1.12?); even after upgrading Kudu to a higher 
version, old existing tablets still don't have such metadata, so the 
DeletedRowsetGCOp will not work on these tablets.

I guess [~Koppa] is trying to make these old tablets able to GC rowsets whose 
rows are fully deleted, right?

> Delta file with full of delete op can not be schedule to compact
> 
>
> Key: KUDU-3367
> URL: https://issues.apache.org/jira/browse/KUDU-3367
> Project: Kudu
>  Issue Type: New Feature
>  Components: compaction
>Reporter: dengke
>Assignee: dengke
>Priority: Major
> Attachments: image-2022-05-09-14-13-16-525.png, 
> image-2022-05-09-14-16-31-828.png, image-2022-05-09-14-18-05-647.png, 
> image-2022-05-09-14-19-56-933.png, image-2022-05-09-14-21-47-374.png, 
> image-2022-05-09-14-23-43-973.png, image-2022-05-09-14-26-45-313.png, 
> image-2022-05-09-14-32-51-573.png
>
>
> If we get a REDO delta full of delete ops, which means there are no update 
> ops in the file, the current compaction algorithm will not schedule the file 
> for compaction. If such files exist and accumulate for a period of time, 
> they will greatly affect our scan speed. However, processing such files on 
> every compaction reduces compaction performance.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] (KUDU-3400) CompilationManager::RequestRowProjector consumed too much memory

2022-11-16 Thread Yingchun Lai (Jira)


[ https://issues.apache.org/jira/browse/KUDU-3400 ]


Yingchun Lai deleted comment on KUDU-3400:


was (Author: laiyingchun):
add the pstack

[^pstack.txt]

 

> CompilationManager::RequestRowProjector consumed too much memory
> 
>
> Key: KUDU-3400
> URL: https://issues.apache.org/jira/browse/KUDU-3400
> Project: Kudu
>  Issue Type: Bug
>  Components: codegen
>Affects Versions: 1.12.0
>Reporter: Yingchun Lai
>Priority: Major
> Attachments: data02heap.svg
>
>
> In one of our clusters, we found that the 
> CompilationManager::RequestRowProjector function accidentally consumed too 
> much memory. Some context on this cluster:
>  # some tables have more than 1000 columns, so the table schema may be very 
> costly to copy
>  # sometimes the tservers are under memory pressure, and then do flush 
> operations more frequently (to try to reduce the memory consumed by MRS/DMS)
> I captured a heap profile on a tserver and found out that 
> CompilationManager::RequestRowProjector costs the most memory when Schemas 
> are copied; the source code:
>  
> {code:java}
> CompilationTask(const Schema& base, const Schema& proj, CodeCache* cache,
>                 CodeGenerator* generator)
>   : base_(base),
>     proj_(proj),
>     cache_(cache),
>     generator_(generator) {} {code}
> That is to say, the Schemas (i.e. base and proj) are copied when 
> constructing CompilationTask objects.
> The heap profile says that Schema consumed about 50GB of memory, which 
> really shocked me; even though the Schema is large, how could it consume 
> 50GB of memory? I forgot to `pstack` the process when it happened; maybe 
> there were hundreds of thousands of CompilationManager::RequestRowProjector 
> calls at that time, but according to the code logic, it should not hang 
> there for a long time?
> {code:java}
> if (!cached) {
>   shared_ptr<CompilationTask> task(make_shared<CompilationTask>(
>       *base_schema, *projection, &cache_, &generator_));
>   WARN_NOT_OK_EVERY_N_SECS(pool_->Submit([task]() { task->Run(); }),
>       "RowProjector compilation request submit failed", 10);
>   return false;
> } {code}
>  



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (KUDU-3400) CompilationManager::RequestRowProjector consumed too much memory

2022-11-16 Thread Yingchun Lai (Jira)


[ 
https://issues.apache.org/jira/browse/KUDU-3400?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17635127#comment-17635127
 ] 

Yingchun Lai commented on KUDU-3400:


add the pstack

[^pstack.txt]

 

> CompilationManager::RequestRowProjector consumed too much memory
> 
>
> Key: KUDU-3400
> URL: https://issues.apache.org/jira/browse/KUDU-3400
> Project: Kudu
>  Issue Type: Bug
>  Components: codegen
>Affects Versions: 1.12.0
>Reporter: Yingchun Lai
>Priority: Major
> Attachments: data02heap.svg
>
>
> In one of our clusters, we found that the 
> CompilationManager::RequestRowProjector function accidentally consumed too 
> much memory. Some context on this cluster:
>  # some tables have more than 1000 columns, so the table schema may be very 
> costly to copy
>  # sometimes the tservers are under memory pressure, and then do flush 
> operations more frequently (to try to reduce the memory consumed by MRS/DMS)
> I captured a heap profile on a tserver and found out that 
> CompilationManager::RequestRowProjector costs the most memory when Schemas 
> are copied; the source code:
>  
> {code:java}
> CompilationTask(const Schema& base, const Schema& proj, CodeCache* cache,
>                 CodeGenerator* generator)
>   : base_(base),
>     proj_(proj),
>     cache_(cache),
>     generator_(generator) {} {code}
> That is to say, the Schemas (i.e. base and proj) are copied when 
> constructing CompilationTask objects.
> The heap profile says that Schema consumed about 50GB of memory, which 
> really shocked me; even though the Schema is large, how could it consume 
> 50GB of memory? I forgot to `pstack` the process when it happened; maybe 
> there were hundreds of thousands of CompilationManager::RequestRowProjector 
> calls at that time, but according to the code logic, it should not hang 
> there for a long time?
> {code:java}
> if (!cached) {
>   shared_ptr<CompilationTask> task(make_shared<CompilationTask>(
>       *base_schema, *projection, &cache_, &generator_));
>   WARN_NOT_OK_EVERY_N_SECS(pool_->Submit([task]() { task->Run(); }),
>       "RowProjector compilation request submit failed", 10);
>   return false;
> } {code}
>  



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (KUDU-3400) CompilationManager::RequestRowProjector consumed too much memory

2022-11-16 Thread Yingchun Lai (Jira)


 [ 
https://issues.apache.org/jira/browse/KUDU-3400?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yingchun Lai updated KUDU-3400:
---
Attachment: pstack.txt

> CompilationManager::RequestRowProjector consumed too much memory
> 
>
> Key: KUDU-3400
> URL: https://issues.apache.org/jira/browse/KUDU-3400
> Project: Kudu
>  Issue Type: Bug
>  Components: codegen
>Affects Versions: 1.12.0
>Reporter: Yingchun Lai
>Priority: Major
> Attachments: data02heap.svg, pstack.txt
>
>
> In one of our clusters, we found that the 
> CompilationManager::RequestRowProjector function accidentally consumed too 
> much memory. Some context on this cluster:
>  # some tables have more than 1000 columns, so the table schema may be very 
> costly to copy
>  # sometimes the tservers are under memory pressure, and then do flush 
> operations more frequently (to try to reduce the memory consumed by MRS/DMS)
> I captured a heap profile on a tserver and found out that 
> CompilationManager::RequestRowProjector costs the most memory when Schemas 
> are copied; the source code:
>  
> {code:java}
> CompilationTask(const Schema& base, const Schema& proj, CodeCache* cache,
>                 CodeGenerator* generator)
>   : base_(base),
>     proj_(proj),
>     cache_(cache),
>     generator_(generator) {} {code}
> That is to say, the Schemas (i.e. base and proj) are copied when 
> constructing CompilationTask objects.
> The heap profile says that Schema consumed about 50GB of memory, which 
> really shocked me; even though the Schema is large, how could it consume 
> 50GB of memory? I forgot to `pstack` the process when it happened; maybe 
> there were hundreds of thousands of CompilationManager::RequestRowProjector 
> calls at that time, but according to the code logic, it should not hang 
> there for a long time?
> {code:java}
> if (!cached) {
>   shared_ptr<CompilationTask> task(make_shared<CompilationTask>(
>       *base_schema, *projection, &cache_, &generator_));
>   WARN_NOT_OK_EVERY_N_SECS(pool_->Submit([task]() { task->Run(); }),
>       "RowProjector compilation request submit failed", 10);
>   return false;
> } {code}
>  



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (KUDU-3400) CompilationManager::RequestRowProjector consumed too much memory

2022-11-16 Thread Yingchun Lai (Jira)


 [ 
https://issues.apache.org/jira/browse/KUDU-3400?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yingchun Lai updated KUDU-3400:
---
Attachment: heapprofile.svg

> CompilationManager::RequestRowProjector consumed too much memory
> 
>
> Key: KUDU-3400
> URL: https://issues.apache.org/jira/browse/KUDU-3400
> Project: Kudu
>  Issue Type: Bug
>  Components: codegen
>Affects Versions: 1.12.0
>Reporter: Yingchun Lai
>Priority: Major
> Attachments: data02heap.svg, heapprofile.svg, pstack.txt
>
>
> In one of our clusters, we found that the 
> CompilationManager::RequestRowProjector function accidentally consumed too 
> much memory. Some context on this cluster:
>  # some tables have more than 1000 columns, so the table schema may be very 
> costly to copy
>  # sometimes the tservers are under memory pressure, and then do flush 
> operations more frequently (to try to reduce the memory consumed by MRS/DMS)
> I captured a heap profile on a tserver and found out that 
> CompilationManager::RequestRowProjector costs the most memory when Schemas 
> are copied; the source code:
>  
> {code:java}
> CompilationTask(const Schema& base, const Schema& proj, CodeCache* cache,
>                 CodeGenerator* generator)
>   : base_(base),
>     proj_(proj),
>     cache_(cache),
>     generator_(generator) {} {code}
> That is to say, the Schemas (i.e. base and proj) are copied when 
> constructing CompilationTask objects.
> The heap profile says that Schema consumed about 50GB of memory, which 
> really shocked me; even though the Schema is large, how could it consume 
> 50GB of memory? I forgot to `pstack` the process when it happened; maybe 
> there were hundreds of thousands of CompilationManager::RequestRowProjector 
> calls at that time, but according to the code logic, it should not hang 
> there for a long time?
> {code:java}
> if (!cached) {
>   shared_ptr<CompilationTask> task(make_shared<CompilationTask>(
>       *base_schema, *projection, &cache_, &generator_));
>   WARN_NOT_OK_EVERY_N_SECS(pool_->Submit([task]() { task->Run(); }),
>       "RowProjector compilation request submit failed", 10);
>   return false;
> } {code}
>  



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (KUDU-3400) CompilationManager::RequestRowProjector consumed too much memory

2022-11-16 Thread Yingchun Lai (Jira)


[ 
https://issues.apache.org/jira/browse/KUDU-3400?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17635134#comment-17635134
 ] 

Yingchun Lai commented on KUDU-3400:


add pstack and heapprofile

[^pstack.txt]

[^heapprofile.svg]

Any thoughts about it? [~alexey] [~awong] [~zhangyifan27] 

> CompilationManager::RequestRowProjector consumed too much memory
> 
>
> Key: KUDU-3400
> URL: https://issues.apache.org/jira/browse/KUDU-3400
> Project: Kudu
>  Issue Type: Bug
>  Components: codegen
>Affects Versions: 1.12.0
>Reporter: Yingchun Lai
>Priority: Major
> Attachments: data02heap.svg, heapprofile.svg, pstack.txt
>
>
> In one of our clusters, we found that the 
> CompilationManager::RequestRowProjector function accidentally consumed too 
> much memory. Some context on this cluster:
>  # some tables have more than 1000 columns, so the table schema may be very 
> costly to copy
>  # sometimes the tservers are under memory pressure, and then do flush 
> operations more frequently (to try to reduce the memory consumed by MRS/DMS)
> I captured a heap profile on a tserver and found out that 
> CompilationManager::RequestRowProjector costs the most memory when Schemas 
> are copied; the source code:
>  
> {code:java}
> CompilationTask(const Schema& base, const Schema& proj, CodeCache* cache,
>                 CodeGenerator* generator)
>   : base_(base),
>     proj_(proj),
>     cache_(cache),
>     generator_(generator) {} {code}
> That is to say, the Schemas (i.e. base and proj) are copied when 
> constructing CompilationTask objects.
> The heap profile says that Schema consumed about 50GB of memory, which 
> really shocked me; even though the Schema is large, how could it consume 
> 50GB of memory? I forgot to `pstack` the process when it happened; maybe 
> there were hundreds of thousands of CompilationManager::RequestRowProjector 
> calls at that time, but according to the code logic, it should not hang 
> there for a long time?
> {code:java}
> if (!cached) {
>   shared_ptr<CompilationTask> task(make_shared<CompilationTask>(
>       *base_schema, *projection, &cache_, &generator_));
>   WARN_NOT_OK_EVERY_N_SECS(pool_->Submit([task]() { task->Run(); }),
>       "RowProjector compilation request submit failed", 10);
>   return false;
> } {code}
>  



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Comment Edited] (KUDU-3400) CompilationManager::RequestRowProjector consumed too much memory

2022-12-13 Thread Yingchun Lai (Jira)


[ 
https://issues.apache.org/jira/browse/KUDU-3400?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17646952#comment-17646952
 ] 

Yingchun Lai edited comment on KUDU-3400 at 12/14/22 6:16 AM:
--

[~aserbin] 
{quote}When generating heap profile, it's important to use proper location for 
the toolchain and binary. If the binaries were built with devtoolset, it's 
necessary to set proper environment when running {{pprof}} from gperftools (see 
the {{$KUDU_HOME/build-support/enable_devtoolset.sh}} script).
{quote}
Do you mean it's necessary to run the script before running {{pprof}}?


was (Author: laiyingchun):
[~aserbin] 
{quote}When generating heap profile, it's important to use proper location for 
the toolchain and binary. If the binaries were built with devtoolset, it's 
necessary to set proper environment when running {{pprof}} from gperftools (see 
the {{$KUDU_HOME/build-support/enable_devtoolset.sh}} script).
{quote}
Do you mean it's necessary to run the script before running {{pprof}}?

> CompilationManager::RequestRowProjector consumed too much memory
> 
>
> Key: KUDU-3400
> URL: https://issues.apache.org/jira/browse/KUDU-3400
> Project: Kudu
>  Issue Type: Bug
>  Components: codegen
>Affects Versions: 1.12.0
>Reporter: Yingchun Lai
>Priority: Major
> Attachments: data02heap.svg, heapprofile.svg, pstack.txt
>
>
> In one of our clusters, we found that the 
> CompilationManager::RequestRowProjector function accidentally consumed too 
> much memory. Some context on this cluster:
>  # some tables have more than 1000 columns, so the table schema may be very 
> costly to copy
>  # sometimes the tservers are under memory pressure, and then do flush 
> operations more frequently (to try to reduce the memory consumed by MRS/DMS)
> I captured a heap profile on a tserver and found out that 
> CompilationManager::RequestRowProjector costs the most memory when Schemas 
> are copied; the source code:
>  
> {code:java}
> CompilationTask(const Schema& base, const Schema& proj, CodeCache* cache,
>                 CodeGenerator* generator)
>   : base_(base),
>     proj_(proj),
>     cache_(cache),
>     generator_(generator) {} {code}
> That is to say, the Schemas (i.e. base and proj) are copied when 
> constructing CompilationTask objects.
> The heap profile says that Schema consumed about 50GB of memory, which 
> really shocked me; even though the Schema is large, how could it consume 
> 50GB of memory? I forgot to `pstack` the process when it happened; maybe 
> there were hundreds of thousands of CompilationManager::RequestRowProjector 
> calls at that time, but according to the code logic, it should not hang 
> there for a long time?
> {code:java}
> if (!cached) {
>   shared_ptr<CompilationTask> task(make_shared<CompilationTask>(
>       *base_schema, *projection, &cache_, &generator_));
>   WARN_NOT_OK_EVERY_N_SECS(pool_->Submit([task]() { task->Run(); }),
>       "RowProjector compilation request submit failed", 10);
>   return false;
> } {code}
>  



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (KUDU-3400) CompilationManager::RequestRowProjector consumed too much memory

2022-12-13 Thread Yingchun Lai (Jira)


[ 
https://issues.apache.org/jira/browse/KUDU-3400?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17646952#comment-17646952
 ] 

Yingchun Lai commented on KUDU-3400:


[~aserbin] 
{quote}When generating heap profile, it's important to use proper location for 
the toolchain and binary. If the binaries were built with devtoolset, it's 
necessary to set proper environment when running {{pprof}} from gperftools (see 
the {{$KUDU_HOME/build-support/enable_devtoolset.sh}} script).
{quote}
Do you mean it's necessary to run the script before running {{pprof}}?

> CompilationManager::RequestRowProjector consumed too much memory
> 
>
> Key: KUDU-3400
> URL: https://issues.apache.org/jira/browse/KUDU-3400
> Project: Kudu
>  Issue Type: Bug
>  Components: codegen
>Affects Versions: 1.12.0
>Reporter: Yingchun Lai
>Priority: Major
> Attachments: data02heap.svg, heapprofile.svg, pstack.txt
>
>
> In one of our clusters, we found that the 
> CompilationManager::RequestRowProjector function accidentally consumed too 
> much memory. Some context on this cluster:
>  # some tables have more than 1000 columns, so the table schema may be very 
> costly to copy
>  # sometimes the tservers are under memory pressure, and then do flush 
> operations more frequently (to try to reduce the memory consumed by MRS/DMS)
> I captured a heap profile on a tserver and found out that 
> CompilationManager::RequestRowProjector costs the most memory when Schemas 
> are copied; the source code:
>  
> {code:java}
> CompilationTask(const Schema& base, const Schema& proj, CodeCache* cache,
>                 CodeGenerator* generator)
>   : base_(base),
>     proj_(proj),
>     cache_(cache),
>     generator_(generator) {} {code}
> That is to say, the Schemas (i.e. base and proj) are copied when 
> constructing CompilationTask objects.
> The heap profile says that Schema consumed about 50GB of memory, which 
> really shocked me; even though the Schema is large, how could it consume 
> 50GB of memory? I forgot to `pstack` the process when it happened; maybe 
> there were hundreds of thousands of CompilationManager::RequestRowProjector 
> calls at that time, but according to the code logic, it should not hang 
> there for a long time?
> {code:java}
> if (!cached) {
>   shared_ptr<CompilationTask> task(make_shared<CompilationTask>(
>       *base_schema, *projection, &cache_, &generator_));
>   WARN_NOT_OK_EVERY_N_SECS(pool_->Submit([task]() { task->Run(); }),
>       "RowProjector compilation request submit failed", 10);
>   return false;
> } {code}
>  



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Comment Edited] (KUDU-3400) CompilationManager::RequestRowProjector consumed too much memory

2022-12-13 Thread Yingchun Lai (Jira)


[ 
https://issues.apache.org/jira/browse/KUDU-3400?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17646952#comment-17646952
 ] 

Yingchun Lai edited comment on KUDU-3400 at 12/14/22 6:18 AM:
--

[~aserbin] 
{quote}When generating heap profile, it's important to use proper location for 
the toolchain and binary. If the binaries were built with devtoolset, it's 
necessary to set proper environment when running {{pprof}} from gperftools (see 
the {{$KUDU_HOME/build-support/enable_devtoolset.sh}} script).
{quote}
Do you mean it's necessary to run the script before running {{pprof}}?


was (Author: laiyingchun):
[~aserbin] 
{quote}When generating heap profile, it's important to use proper location for 
the toolchain and binary. If the binaries were built with devtoolset, it's 
necessary to set proper environment when running {{pprof}} from gperftools (see 
the {{$KUDU_HOME/build-support/enable_devtoolset.sh}} script).
{quote}
Do you mean it's necessary to run the script before running {{pprof}}?

> CompilationManager::RequestRowProjector consumed too much memory
> 
>
> Key: KUDU-3400
> URL: https://issues.apache.org/jira/browse/KUDU-3400
> Project: Kudu
>  Issue Type: Bug
>  Components: codegen
>Affects Versions: 1.12.0
>Reporter: Yingchun Lai
>Priority: Major
> Attachments: data02heap.svg, heapprofile.svg, pstack.txt
>
>
> In one of our clusters, we found that the 
> CompilationManager::RequestRowProjector function accidentally consumed too 
> much memory. Some context on this cluster:
>  # some tables have more than 1000 columns, so the table schema may be very 
> costly to copy
>  # sometimes the tservers are under memory pressure, and then do flush 
> operations more frequently (to try to reduce the memory consumed by MRS/DMS)
> I captured a heap profile on a tserver and found out that 
> CompilationManager::RequestRowProjector costs the most memory when Schemas 
> are copied; the source code:
>  
> {code:java}
> CompilationTask(const Schema& base, const Schema& proj, CodeCache* cache,
>                 CodeGenerator* generator)
>   : base_(base),
>     proj_(proj),
>     cache_(cache),
>     generator_(generator) {} {code}
> That is to say, the Schemas (i.e. base and proj) are copied when 
> constructing CompilationTask objects.
> The heap profile says that Schema consumed about 50GB of memory, which 
> really shocked me; even though the Schema is large, how could it consume 
> 50GB of memory? I forgot to `pstack` the process when it happened; maybe 
> there were hundreds of thousands of CompilationManager::RequestRowProjector 
> calls at that time, but according to the code logic, it should not hang 
> there for a long time?
> {code:java}
> if (!cached) {
>   shared_ptr<CompilationTask> task(make_shared<CompilationTask>(
>       *base_schema, *projection, &cache_, &generator_));
>   WARN_NOT_OK_EVERY_N_SECS(pool_->Submit([task]() { task->Run(); }),
>       "RowProjector compilation request submit failed", 10);
>   return false;
> } {code}
>  



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Comment Edited] (KUDU-3400) CompilationManager::RequestRowProjector consumed too much memory

2022-12-13 Thread Yingchun Lai (Jira)


[ 
https://issues.apache.org/jira/browse/KUDU-3400?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17646952#comment-17646952
 ] 

Yingchun Lai edited comment on KUDU-3400 at 12/14/22 7:19 AM:
--

[~aserbin] 
{quote}When generating heap profile, it's important to use proper location for 
the toolchain and binary. If the binaries were built with devtoolset, it's 
necessary to set proper environment when running {{pprof}} from gperftools (see 
the {{$KUDU_HOME/build-support/enable_devtoolset.sh}} script).
{quote}
Thanks for the reminder. Do you mean it's necessary to run the script before 
running {{pprof}}?

Similar to KUDU-3406, the tserver is under memory pressure, and flush ops take 
priority over delta compaction over and over again.

I suspect https://issues.apache.org/jira/browse/KUDU-3197 is related to this 
issue too if {{pprof}} is not properly used; both of them say that "Schema" 
costs too much memory. After upgrading the cluster to a version including this 
patch ([https://gerrit.cloudera.org/c/18255/]), this situation hasn't 
reproduced for about 1 month.


was (Author: laiyingchun):
[~aserbin] 
{quote}When generating heap profile, it's important to use proper location for 
the toolchain and binary. If the binaries were built with devtoolset, it's 
necessary to set proper environment when running {{pprof}} from gperftools (see 
the {{$KUDU_HOME/build-support/enable_devtoolset.sh}} script).
{quote}
Do you mean it's necessary to run the script before running {{pprof}}?

> CompilationManager::RequestRowProjector consumed too much memory
> 
>
> Key: KUDU-3400
> URL: https://issues.apache.org/jira/browse/KUDU-3400
> Project: Kudu
>  Issue Type: Bug
>  Components: codegen
>Affects Versions: 1.12.0
>Reporter: Yingchun Lai
>Priority: Major
> Attachments: data02heap.svg, heapprofile.svg, pstack.txt
>
>
> In one of our clusters, we found that the 
> CompilationManager::RequestRowProjector function occasionally consumed too 
> much memory. Some details of this cluster:
>  # some tables have more than 1000 columns, so the table schema may be very 
> costly to copy
>  # sometimes the tservers are under memory pressure, and then do flush 
> operations more frequently (to try to reduce the memory consumed by MRS/DMS)
> I captured a heap profile on a tserver and found that 
> CompilationManager::RequestRowProjector consumed the most memory when 
> Schemas were copied. The source code:
>  
> {code:java}
> CompilationTask(const Schema& base, const Schema& proj, CodeCache* cache,
> CodeGenerator* generator)
>   : base_(base),
> proj_(proj),
> cache_(cache),
> generator_(generator) {} {code}
> That is to say, the Schemas (i.e. base and proj) are copied when 
> constructing CompilationTask objects.
> The heap profile says that Schema objects consumed about 50GB of memory, 
> which really shocked me: even though the Schema is large, how could it 
> consume 50GB? I forgot to `pstack` the process when it happened; maybe there 
> were hundreds of thousands of CompilationManager::RequestRowProjector calls 
> at that time, but according to the code logic, it should not hang there for 
> long.
> {code:java}
> if (!cached) {
>   shared_ptr task(make_shared(
>   *base_schema, *projection, &cache_, &generator_));
>   WARN_NOT_OK_EVERY_N_SECS(pool_->Submit([task]() { task->Run(); }),
>   "RowProjector compilation request submit failed", 10);
>   return false;
> } {code}
>  



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (KUDU-3292) Show non-default flags on varz Web UI

2023-01-09 Thread Yingchun Lai (Jira)


 [ 
https://issues.apache.org/jira/browse/KUDU-3292?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yingchun Lai updated KUDU-3292:
---
Attachment: image-2023-01-10-11-57-13-209.png

> Show non-default flags on varz Web UI
> -
>
> Key: KUDU-3292
> URL: https://issues.apache.org/jira/browse/KUDU-3292
> Project: Kudu
>  Issue Type: Improvement
>  Components: ui
>Reporter: Grant Henke
>Assignee: Bakai Ádám
>Priority: Minor
>  Labels: beginner, newbie, newbie++, trivial
> Attachments: image-2023-01-10-11-57-13-209.png
>
>
> Currently each Kudu server has a /varz webpage (the Flags tab) showing all of 
> the flags set on the server. It would be a nice usability change to include a 
> separate section showing only the non-default flags. This should be super 
> straightforward given we have the ability to get all the non-default flags via 
> GetNonDefaultFlags or GetNonDefaultFlagsMap in flags.cc.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (KUDU-3292) Show non-default flags on varz Web UI

2023-01-09 Thread Yingchun Lai (Jira)


[ 
https://issues.apache.org/jira/browse/KUDU-3292?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17656408#comment-17656408
 ] 

Yingchun Lai commented on KUDU-3292:


There may be some duplicate flags: the flags in the 'non-default' section and 
the ones in the 'all' section are duplicated.

Would it be better to refactor this page and show more information about the 
flags: the description, default value, current value, etc.?

This is how Impala does it:

!image-2023-01-10-11-57-13-209.png!
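
For reference, gflags already exposes everything such a page needs. Below is 
a minimal sketch using the gflags GetAllFlags() API (the plain-text rendering 
is for illustration only; it is not the actual Kudu web UI code):
{code:java}
#include <cstdio>
#include <vector>

#include <gflags/gflags.h>

// List every flag with its current value, default value, and description,
// marking non-default flags so they could be rendered in a separate section.
int main(int argc, char** argv) {
  google::ParseCommandLineFlags(&argc, &argv, /*remove_flags=*/true);
  std::vector<google::CommandLineFlagInfo> flags;
  google::GetAllFlags(&flags);
  for (const auto& f : flags) {
    std::printf("%-40s current=%-20s default=%-20s %s%s\n",
                f.name.c_str(),
                f.current_value.c_str(),
                f.default_value.c_str(),
                f.is_default ? "" : "[non-default] ",
                f.description.c_str());
  }
  return 0;
}
{code}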

> Show non-default flags on varz Web UI
> -
>
> Key: KUDU-3292
> URL: https://issues.apache.org/jira/browse/KUDU-3292
> Project: Kudu
>  Issue Type: Improvement
>  Components: ui
>Reporter: Grant Henke
>Assignee: Bakai Ádám
>Priority: Minor
>  Labels: beginner, newbie, newbie++, trivial
> Attachments: image-2023-01-10-11-57-13-209.png
>
>
> Currently each Kudu server has a /varz webpage (the Flags tab) showing all of 
> the flags set on the server. It would be a nice usability change to include a 
> separate section showing only the non-default flags. This should be super 
> straightforward given we have the ability to get all the non-default flags via 
> GetNonDefaultFlags or GetNonDefaultFlagsMap in flags.cc.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (KUDU-2670) Splitting more tasks for spark job, and add more concurrent for scan operation

2023-01-31 Thread Yingchun Lai (Jira)


[ 
https://issues.apache.org/jira/browse/KUDU-2670?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17682837#comment-17682837
 ] 

Yingchun Lai commented on KUDU-2670:


The C++ client has implemented this feature too, via 
https://issues.apache.org/jira/browse/KUDU-3393
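
A minimal sketch of the C++ usage, assuming the 
KuduScanTokenBuilder::SetSplitSizeBytes() setter added by KUDU-3393 (names 
and signatures as of Kudu 1.17; error handling trimmed):
{code:java}
#include <memory>
#include <vector>

#include "kudu/client/client.h"

using kudu::Status;
using kudu::client::KuduScanToken;
using kudu::client::KuduScanTokenBuilder;
using kudu::client::KuduTable;

// Build scan tokens that split each tablet into roughly 1 GiB key ranges,
// so a scheduler such as Spark can run one task per range instead of one
// task per tablet.
Status BuildSplitScanTokens(const std::shared_ptr<KuduTable>& table,
                            std::vector<KuduScanToken*>* tokens) {
  KuduScanTokenBuilder builder(table.get());
  Status s = builder.SetSplitSizeBytes(1ULL << 30);  // ~1 GiB per token
  if (!s.ok()) {
    return s;
  }
  return builder.Build(tokens);  // the caller owns and must delete the tokens
}
{code}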

> Splitting more tasks for spark job, and add more concurrent for scan operation
> --
>
> Key: KUDU-2670
> URL: https://issues.apache.org/jira/browse/KUDU-2670
> Project: Kudu
>  Issue Type: Improvement
>  Components: java, spark
>Affects Versions: 1.8.0
>Reporter: yangz
>Assignee: Xu Yao
>Priority: Major
>  Labels: performance
>
> Refer to KUDU-2437, "Split a tablet into primary key ranges by size".
> We need a Java client implementation to support splitting the tablet scan 
> operation.
> We suggest two new implementations for the Java client:
>  # A ConcurrentKuduScanner so that more scanners can read data at the same 
> time. This is useful for one case: we scan only one row, but the predicate 
> doesn't contain the primary key, so we send many scanner requests and only 
> one row returns. Sending so many scanner requests one by one is slow, so we 
> need a concurrent way. In our test of this case, for a 10G tablet it saves 
> a lot of time on one machine.
>  # A way to split into more Spark tasks. To do so, we get scan tokens in 
> two steps: first we ask the tserver for key ranges, then with these ranges 
> we get more scan tokens. For our usage a tablet is 10G, but we split tasks 
> so that each processes only 1G of data, which gives better performance.
> All of this has run well for us for half a year. We hope this feature will 
> be useful to the community.
>  
>  



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (KUDU-3393) c++ client support getTableKeyRanges

2023-01-31 Thread Yingchun Lai (Jira)


[ 
https://issues.apache.org/jira/browse/KUDU-3393?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17682838#comment-17682838
 ] 

Yingchun Lai commented on KUDU-3393:


Server and Java client parts have been implemented by 
https://issues.apache.org/jira/browse/KUDU-2670

> c++ client support getTableKeyRanges 
> 
>
> Key: KUDU-3393
> URL: https://issues.apache.org/jira/browse/KUDU-3393
> Project: Kudu
>  Issue Type: New Feature
>  Components: client
>Reporter: dengke
>Priority: Major
>
> The Java client can split a tablet into multiple ranges and scan data 
> concurrently. This is a good feature, but the C++ client does not support it.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (KUDU-3436) build_mini_cluster_binaries.sh doesn't work on Mac 13.0.1

2023-07-04 Thread Yingchun Lai (Jira)


[ 
https://issues.apache.org/jira/browse/KUDU-3436?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17739953#comment-17739953
 ] 

Yingchun Lai commented on KUDU-3436:


{quote} * New in macOS Big Sur 11.0.1, the system ships with a built-in dynamic 
linker cache of all system-provided libraries. As part of this change, copies 
of dynamic libraries are no longer present on the filesystem. Code that 
attempts to check for dynamic library presence by looking for a file at a path 
or enumerating a directory will fail. Instead, check for library presence by 
attempting to {{dlopen()}} the path, which will correctly check for the library 
in the cache. (62986286)
{quote}
[https://developer.apple.com/documentation/macos-release-notes/macos-big-sur-11_0_1-release-notes#Kernel]

It seems we can't copy /usr/lib/libc++abi.dylib into the kudu-binary JAR 
artifact, unless we build the binaries on macOS 10.13 (the oldest macOS 
version Kudu supports)?
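
A minimal sketch of the dlopen()-based presence check Apple suggests (a 
standalone helper for illustration; the relocation script itself is Python):
{code:java}
#include <dlfcn.h>

#include <cstdio>

// On macOS 11+, system dylibs live only in the dyld shared cache, so
// checking for the file on disk fails even though the library is loadable.
// dlopen() consults the cache, so it is the reliable presence check.
static bool LibraryPresent(const char* path) {
  void* handle = dlopen(path, RTLD_LAZY);
  if (handle == nullptr) {
    return false;
  }
  dlclose(handle);
  return true;
}

int main() {
  const char* lib = "/usr/lib/libc++abi.dylib";
  std::printf("%s: %s\n", lib, LibraryPresent(lib) ? "present" : "missing");
  return 0;
}
{code}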

> build_mini_cluster_binaries.sh doesn't work on Mac 13.0.1
> -
>
> Key: KUDU-3436
> URL: https://issues.apache.org/jira/browse/KUDU-3436
> Project: Kudu
>  Issue Type: Bug
>Reporter: Bakai Ádám
>Priority: Major
>
>  
> {code:java}
> build_mini_cluster_binaries.sh {code}
> returns the following error:
> {code:java}
> Traceback (most recent call last):
>   File 
> "/Users/adambakai/CLionProjects/kudu/build-support/mini-cluster/relocate_binaries_for_mini_cluster.py",
>  line 503, in 
> main()
>   File 
> "/Users/adambakai/CLionProjects/kudu/build-support/mini-cluster/relocate_binaries_for_mini_cluster.py",
>  line 500, in main
> relocate_deps(target_src, target_dst, config)
>   File 
> "/Users/adambakai/CLionProjects/kudu/build-support/mini-cluster/relocate_binaries_for_mini_cluster.py",
>  line 408, in relocate_deps
> return relocate_deps_macos(target_src, target_dst, config)
>   File 
> "/Users/adambakai/CLionProjects/kudu/build-support/mini-cluster/relocate_binaries_for_mini_cluster.py",
>  line 398, in relocate_deps_macos
> relocate_deps_macos(dep_src, dep_dst, config)
>   File 
> "/Users/adambakai/CLionProjects/kudu/build-support/mini-cluster/relocate_binaries_for_mini_cluster.py",
>  line 398, in relocate_deps_macos
> relocate_deps_macos(dep_src, dep_dst, config)
>   File 
> "/Users/adambakai/CLionProjects/kudu/build-support/mini-cluster/relocate_binaries_for_mini_cluster.py",
>  line 398, in relocate_deps_macos
> relocate_deps_macos(dep_src, dep_dst, config)
>   File 
> "/Users/adambakai/CLionProjects/kudu/build-support/mini-cluster/relocate_binaries_for_mini_cluster.py",
>  line 398, in relocate_deps_macos
> relocate_deps_macos(dep_src, dep_dst, config)
>   File 
> "/Users/adambakai/CLionProjects/kudu/build-support/mini-cluster/relocate_binaries_for_mini_cluster.py",
>  line 397, in relocate_deps_macos
> copy_file(dep_src, dep_dst)
>   File 
> "/Users/adambakai/CLionProjects/kudu/build-support/mini-cluster/relocate_binaries_for_mini_cluster.py",
>  line 287, in copy_file
> shutil.copyfile(src, dest)
>   File 
> "/opt/homebrew/Cellar/python@2/2.7.18/Frameworks/Python.framework/Versions/2.7/lib/python2.7/shutil.py",
>  line 96, in copyfile
> with open(src, 'rb') as fsrc:
> IOError: [Errno 2] No such file or directory: u'/usr/lib/libc++abi.dylib' 
> {code}
> After further investigation, it looks like libc++abi.dylib is in the 
> uninstrumented lib, but otool -L always gives back the path 
> /usr/lib/libc++abi.dylib. Simply adding the dylib to the 
> PAT_MACOS_LIB_EXCLUDE list doesn't work: it creates a jar file, but the 
> binaries cannot be started.
> It is probably due to changes in how dynamic linking works in newer 
> macOS: 
> [https://stackoverflow.com/questions/70581876/macos-dynamic-linker-reports-it-loaded-library-which-doesnt-exist]
> It happens both on ARM64 and X86



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Comment Edited] (KUDU-3436) build_mini_cluster_binaries.sh doesn't work on Mac 13.0.1

2023-07-04 Thread Yingchun Lai (Jira)


[ 
https://issues.apache.org/jira/browse/KUDU-3436?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17739953#comment-17739953
 ] 

Yingchun Lai edited comment on KUDU-3436 at 7/4/23 3:35 PM:


{quote} * New in macOS Big Sur 11.0.1, the system ships with a built-in dynamic 
linker cache of all system-provided libraries. As part of this change, copies 
of dynamic libraries are no longer present on the filesystem. Code that 
attempts to check for dynamic library presence by looking for a file at a path 
or enumerating a directory will fail. Instead, check for library presence by 
attempting to {{dlopen()}} the path, which will correctly check for the library 
in the cache. (62986286){quote}
[https://developer.apple.com/documentation/macos-release-notes/macos-big-sur-11_0_1-release-notes#Kernel]

It seems we can't copy /usr/lib/libc++abi.dylib into the kudu-binary JAR 
artifact, unless we build the binaries on macOS 10.13 (the oldest macOS 
version Kudu supports) ~ 10.15?


was (Author: laiyingchun):
{quote} * New in macOS Big Sur 11.0.1, the system ships with a built-in dynamic 
linker cache of all system-provided libraries. As part of this change, copies 
of dynamic libraries are no longer present on the filesystem. Code that 
attempts to check for dynamic library presence by looking for a file at a path 
or enumerating a directory will fail. Instead, check for library presence by 
attempting to {{dlopen()}} the path, which will correctly check for the library 
in the cache. (62986286)
{quote}
[https://developer.apple.com/documentation/macos-release-notes/macos-big-sur-11_0_1-release-notes#Kernel]

It seems we can't copy /usr/lib/libc++abi.dylib into the kudu-binary JAR 
artifact, unless we build the binaries on macOS 10.13 (the oldest macOS 
version Kudu supports)?

> build_mini_cluster_binaries.sh doesn't work on Mac 13.0.1
> -
>
> Key: KUDU-3436
> URL: https://issues.apache.org/jira/browse/KUDU-3436
> Project: Kudu
>  Issue Type: Bug
>Reporter: Bakai Ádám
>Priority: Major
>
>  
> {code:java}
> build_mini_cluster_binaries.sh {code}
> returns the following error:
> {code:java}
> Traceback (most recent call last):
>   File 
> "/Users/adambakai/CLionProjects/kudu/build-support/mini-cluster/relocate_binaries_for_mini_cluster.py",
>  line 503, in 
> main()
>   File 
> "/Users/adambakai/CLionProjects/kudu/build-support/mini-cluster/relocate_binaries_for_mini_cluster.py",
>  line 500, in main
> relocate_deps(target_src, target_dst, config)
>   File 
> "/Users/adambakai/CLionProjects/kudu/build-support/mini-cluster/relocate_binaries_for_mini_cluster.py",
>  line 408, in relocate_deps
> return relocate_deps_macos(target_src, target_dst, config)
>   File 
> "/Users/adambakai/CLionProjects/kudu/build-support/mini-cluster/relocate_binaries_for_mini_cluster.py",
>  line 398, in relocate_deps_macos
> relocate_deps_macos(dep_src, dep_dst, config)
>   File 
> "/Users/adambakai/CLionProjects/kudu/build-support/mini-cluster/relocate_binaries_for_mini_cluster.py",
>  line 398, in relocate_deps_macos
> relocate_deps_macos(dep_src, dep_dst, config)
>   File 
> "/Users/adambakai/CLionProjects/kudu/build-support/mini-cluster/relocate_binaries_for_mini_cluster.py",
>  line 398, in relocate_deps_macos
> relocate_deps_macos(dep_src, dep_dst, config)
>   File 
> "/Users/adambakai/CLionProjects/kudu/build-support/mini-cluster/relocate_binaries_for_mini_cluster.py",
>  line 398, in relocate_deps_macos
> relocate_deps_macos(dep_src, dep_dst, config)
>   File 
> "/Users/adambakai/CLionProjects/kudu/build-support/mini-cluster/relocate_binaries_for_mini_cluster.py",
>  line 397, in relocate_deps_macos
> copy_file(dep_src, dep_dst)
>   File 
> "/Users/adambakai/CLionProjects/kudu/build-support/mini-cluster/relocate_binaries_for_mini_cluster.py",
>  line 287, in copy_file
> shutil.copyfile(src, dest)
>   File 
> "/opt/homebrew/Cellar/python@2/2.7.18/Frameworks/Python.framework/Versions/2.7/lib/python2.7/shutil.py",
>  line 96, in copyfile
> with open(src, 'rb') as fsrc:
> IOError: [Errno 2] No such file or directory: u'/usr/lib/libc++abi.dylib' 
> {code}
> After further investigation, it looks like libc++abi.dylib is in the 
> uninstrumented lib, but otool -L always gives back the path 
> /usr/lib/libc++abi.dylib. Simply adding the dylib to the 
> PAT_MACOS_LIB_EXCLUDE list doesn't work: it creates a jar file, but the 
> binaries cannot be started.
> It is probably due to changes in how dynamic linking works in newer 
> macOS: 
> [https://stackoverflow.com/questions/70581876/macos-dynamic-linker-reports-it-loaded-library-which-doesnt-exist]
> It happens both on ARM64 and X86



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (KUDU-3436) build_mini_cluster_binaries.sh doesn't work on Mac 13.0.1

2023-07-05 Thread Yingchun Lai (Jira)


[ 
https://issues.apache.org/jira/browse/KUDU-3436?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17740206#comment-17740206
 ] 

Yingchun Lai commented on KUDU-3436:


What kind of runtime issues would there be if we skipped packing 
libc++abi.dylib into the artifact?
I can see 3 ways to solve the problem:
 # skip packing libc++abi.dylib into the artifact
 # copy libc++abi.dylib from thirdparty, i.e. 
thirdparty/installed/uninstrumented/lib, into the artifact
 # do not publish the macOS artifact

> build_mini_cluster_binaries.sh doesn't work on Mac 13.0.1
> -
>
> Key: KUDU-3436
> URL: https://issues.apache.org/jira/browse/KUDU-3436
> Project: Kudu
>  Issue Type: Bug
>Reporter: Bakai Ádám
>Priority: Major
>
>  
> {code:java}
> build_mini_cluster_binaries.sh {code}
> returns the following error:
> {code:java}
> Traceback (most recent call last):
>   File 
> "/Users/adambakai/CLionProjects/kudu/build-support/mini-cluster/relocate_binaries_for_mini_cluster.py",
>  line 503, in 
> main()
>   File 
> "/Users/adambakai/CLionProjects/kudu/build-support/mini-cluster/relocate_binaries_for_mini_cluster.py",
>  line 500, in main
> relocate_deps(target_src, target_dst, config)
>   File 
> "/Users/adambakai/CLionProjects/kudu/build-support/mini-cluster/relocate_binaries_for_mini_cluster.py",
>  line 408, in relocate_deps
> return relocate_deps_macos(target_src, target_dst, config)
>   File 
> "/Users/adambakai/CLionProjects/kudu/build-support/mini-cluster/relocate_binaries_for_mini_cluster.py",
>  line 398, in relocate_deps_macos
> relocate_deps_macos(dep_src, dep_dst, config)
>   File 
> "/Users/adambakai/CLionProjects/kudu/build-support/mini-cluster/relocate_binaries_for_mini_cluster.py",
>  line 398, in relocate_deps_macos
> relocate_deps_macos(dep_src, dep_dst, config)
>   File 
> "/Users/adambakai/CLionProjects/kudu/build-support/mini-cluster/relocate_binaries_for_mini_cluster.py",
>  line 398, in relocate_deps_macos
> relocate_deps_macos(dep_src, dep_dst, config)
>   File 
> "/Users/adambakai/CLionProjects/kudu/build-support/mini-cluster/relocate_binaries_for_mini_cluster.py",
>  line 398, in relocate_deps_macos
> relocate_deps_macos(dep_src, dep_dst, config)
>   File 
> "/Users/adambakai/CLionProjects/kudu/build-support/mini-cluster/relocate_binaries_for_mini_cluster.py",
>  line 397, in relocate_deps_macos
> copy_file(dep_src, dep_dst)
>   File 
> "/Users/adambakai/CLionProjects/kudu/build-support/mini-cluster/relocate_binaries_for_mini_cluster.py",
>  line 287, in copy_file
> shutil.copyfile(src, dest)
>   File 
> "/opt/homebrew/Cellar/python@2/2.7.18/Frameworks/Python.framework/Versions/2.7/lib/python2.7/shutil.py",
>  line 96, in copyfile
> with open(src, 'rb') as fsrc:
> IOError: [Errno 2] No such file or directory: u'/usr/lib/libc++abi.dylib' 
> {code}
> After further investigation, it looks like libc++abi.dylib is in the 
> uninstrumented lib, but otool -L always gives back the path 
> /usr/lib/libc++abi.dylib. Simply adding the dylib to the 
> PAT_MACOS_LIB_EXCLUDE list doesn't work: it creates a jar file, but the 
> binaries cannot be started.
> It is probably due to changes in how dynamic linking works in newer 
> macOS: 
> [https://stackoverflow.com/questions/70581876/macos-dynamic-linker-reports-it-loaded-library-which-doesnt-exist]
> It happens both on ARM64 and X86



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (KUDU-3436) build_mini_cluster_binaries.sh doesn't work on Mac 13.0.1

2023-07-05 Thread Yingchun Lai (Jira)


[ 
https://issues.apache.org/jira/browse/KUDU-3436?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17740210#comment-17740210
 ] 

Yingchun Lai commented on KUDU-3436:


Besides, in recent years Macs with Apple chips have become more popular; 
maybe we can switch to publishing ARM artifacts, or publish both.

> build_mini_cluster_binaries.sh doesn't work on Mac 13.0.1
> -
>
> Key: KUDU-3436
> URL: https://issues.apache.org/jira/browse/KUDU-3436
> Project: Kudu
>  Issue Type: Bug
>Reporter: Bakai Ádám
>Priority: Major
>
>  
> {code:java}
> build_mini_cluster_binaries.sh {code}
> returns the following error:
> {code:java}
> Traceback (most recent call last):
>   File 
> "/Users/adambakai/CLionProjects/kudu/build-support/mini-cluster/relocate_binaries_for_mini_cluster.py",
>  line 503, in 
> main()
>   File 
> "/Users/adambakai/CLionProjects/kudu/build-support/mini-cluster/relocate_binaries_for_mini_cluster.py",
>  line 500, in main
> relocate_deps(target_src, target_dst, config)
>   File 
> "/Users/adambakai/CLionProjects/kudu/build-support/mini-cluster/relocate_binaries_for_mini_cluster.py",
>  line 408, in relocate_deps
> return relocate_deps_macos(target_src, target_dst, config)
>   File 
> "/Users/adambakai/CLionProjects/kudu/build-support/mini-cluster/relocate_binaries_for_mini_cluster.py",
>  line 398, in relocate_deps_macos
> relocate_deps_macos(dep_src, dep_dst, config)
>   File 
> "/Users/adambakai/CLionProjects/kudu/build-support/mini-cluster/relocate_binaries_for_mini_cluster.py",
>  line 398, in relocate_deps_macos
> relocate_deps_macos(dep_src, dep_dst, config)
>   File 
> "/Users/adambakai/CLionProjects/kudu/build-support/mini-cluster/relocate_binaries_for_mini_cluster.py",
>  line 398, in relocate_deps_macos
> relocate_deps_macos(dep_src, dep_dst, config)
>   File 
> "/Users/adambakai/CLionProjects/kudu/build-support/mini-cluster/relocate_binaries_for_mini_cluster.py",
>  line 398, in relocate_deps_macos
> relocate_deps_macos(dep_src, dep_dst, config)
>   File 
> "/Users/adambakai/CLionProjects/kudu/build-support/mini-cluster/relocate_binaries_for_mini_cluster.py",
>  line 397, in relocate_deps_macos
> copy_file(dep_src, dep_dst)
>   File 
> "/Users/adambakai/CLionProjects/kudu/build-support/mini-cluster/relocate_binaries_for_mini_cluster.py",
>  line 287, in copy_file
> shutil.copyfile(src, dest)
>   File 
> "/opt/homebrew/Cellar/python@2/2.7.18/Frameworks/Python.framework/Versions/2.7/lib/python2.7/shutil.py",
>  line 96, in copyfile
> with open(src, 'rb') as fsrc:
> IOError: [Errno 2] No such file or directory: u'/usr/lib/libc++abi.dylib' 
> {code}
> After further investigation, it looks like libc++abi.dylib is in the 
> uninstrumented lib, but otool -L always gives back the path 
> /usr/lib/libc++abi.dylib. Simply adding the dylib to the 
> PAT_MACOS_LIB_EXCLUDE list doesn't work: it creates a jar file, but the 
> binaries cannot be started.
> It is probably due to changes in how dynamic linking works in newer 
> macOS: 
> [https://stackoverflow.com/questions/70581876/macos-dynamic-linker-reports-it-loaded-library-which-doesnt-exist]
> It happens both on ARM64 and X86



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (KUDU-3436) build_mini_cluster_binaries.sh doesn't work on Mac 13.0.1

2023-07-19 Thread Yingchun Lai (Jira)


[ 
https://issues.apache.org/jira/browse/KUDU-3436?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17744690#comment-17744690
 ] 

Yingchun Lai commented on KUDU-3436:


This has been solved by https://gerrit.cloudera.org/c/20185/

> build_mini_cluster_binaries.sh doesn't work on Mac 13.0.1
> -
>
> Key: KUDU-3436
> URL: https://issues.apache.org/jira/browse/KUDU-3436
> Project: Kudu
>  Issue Type: Bug
>Reporter: Bakai Ádám
>Priority: Major
>
>  
> {code:java}
> build_mini_cluster_binaries.sh {code}
> returns the following error:
> {code:java}
> Traceback (most recent call last):
>   File 
> "/Users/adambakai/CLionProjects/kudu/build-support/mini-cluster/relocate_binaries_for_mini_cluster.py",
>  line 503, in 
> main()
>   File 
> "/Users/adambakai/CLionProjects/kudu/build-support/mini-cluster/relocate_binaries_for_mini_cluster.py",
>  line 500, in main
> relocate_deps(target_src, target_dst, config)
>   File 
> "/Users/adambakai/CLionProjects/kudu/build-support/mini-cluster/relocate_binaries_for_mini_cluster.py",
>  line 408, in relocate_deps
> return relocate_deps_macos(target_src, target_dst, config)
>   File 
> "/Users/adambakai/CLionProjects/kudu/build-support/mini-cluster/relocate_binaries_for_mini_cluster.py",
>  line 398, in relocate_deps_macos
> relocate_deps_macos(dep_src, dep_dst, config)
>   File 
> "/Users/adambakai/CLionProjects/kudu/build-support/mini-cluster/relocate_binaries_for_mini_cluster.py",
>  line 398, in relocate_deps_macos
> relocate_deps_macos(dep_src, dep_dst, config)
>   File 
> "/Users/adambakai/CLionProjects/kudu/build-support/mini-cluster/relocate_binaries_for_mini_cluster.py",
>  line 398, in relocate_deps_macos
> relocate_deps_macos(dep_src, dep_dst, config)
>   File 
> "/Users/adambakai/CLionProjects/kudu/build-support/mini-cluster/relocate_binaries_for_mini_cluster.py",
>  line 398, in relocate_deps_macos
> relocate_deps_macos(dep_src, dep_dst, config)
>   File 
> "/Users/adambakai/CLionProjects/kudu/build-support/mini-cluster/relocate_binaries_for_mini_cluster.py",
>  line 397, in relocate_deps_macos
> copy_file(dep_src, dep_dst)
>   File 
> "/Users/adambakai/CLionProjects/kudu/build-support/mini-cluster/relocate_binaries_for_mini_cluster.py",
>  line 287, in copy_file
> shutil.copyfile(src, dest)
>   File 
> "/opt/homebrew/Cellar/python@2/2.7.18/Frameworks/Python.framework/Versions/2.7/lib/python2.7/shutil.py",
>  line 96, in copyfile
> with open(src, 'rb') as fsrc:
> IOError: [Errno 2] No such file or directory: u'/usr/lib/libc++abi.dylib' 
> {code}
> After further investigation, it looks like libc++abi.dylib is in the 
> uninstrumented lib, but otool -L always gives back the path 
> /usr/lib/libc++abi.dylib. Simply adding the dylib to the 
> PAT_MACOS_LIB_EXCLUDE list doesn't work: it creates a jar file, but the 
> binaries cannot be started.
> It is probably due to changes in how dynamic linking works in newer 
> macOS: 
> [https://stackoverflow.com/questions/70581876/macos-dynamic-linker-reports-it-loaded-library-which-doesnt-exist]
> It happens both on ARM64 and X86



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (KUDU-3510) Docker images build failed

2023-09-11 Thread Yingchun Lai (Jira)
Yingchun Lai created KUDU-3510:
--

 Summary: Docker images build failed
 Key: KUDU-3510
 URL: https://issues.apache.org/jira/browse/KUDU-3510
 Project: Kudu
  Issue Type: Bug
  Components: build, docker
Affects Versions: 1.17.0
Reporter: Yingchun Lai


I encountered some issues when trying to build Docker images:

1.

Environment:

CentOS 7.9, docker 24.0.1.

Error:
{code:java}
$ python ./docker/docker-build.py --action push --platforms linux/amd64 
linux/arm64
Starting docker build: 2023-09-12T13:43:53.888588
Version: 1.17.0 (a3cd1ef13)
...
 => CANCELED [linux/amd64 dev 7/7] RUN ./bootstrap-dev-env.sh   && 
./bootstrap-java-env.sh   && ./bootstrap-python-env.sh   && rm 
bootstrap-dev-env.sh   && rm bootstrap-java-env.sh   && rm 
bootstrap-python-env.sh                                     2.7s
 => ERROR [linux/arm64 dev 7/7] RUN ./bootstrap-dev-env.sh   && 
./bootstrap-java-env.sh   && ./bootstrap-python-env.sh   && rm 
bootstrap-dev-env.sh   && rm bootstrap-java-env.sh   && rm 
bootstrap-python-env.sh                                        2.0s
 => CANCELED [linux/arm64 runtime 5/5] RUN ./bootstrap-runtime-env.sh && rm 
bootstrap-runtime-env.sh                                                        
                                                                                
             2.4s
--
 > [linux/arm64 dev 7/7] RUN ./bootstrap-dev-env.sh   && 
./bootstrap-java-env.sh   && ./bootstrap-python-env.sh   && rm 
bootstrap-dev-env.sh   && rm bootstrap-java-env.sh   && rm 
bootstrap-python-env.sh:
#0 1.451 Error while loading ȇs//./bootstrap-dev-env.sh: No such file or 
directory
--
ERROR: failed to solve: process "/dev/.buildkit_qemu_emulator /bin/sh -c 
./bootstrap-dev-env.sh   && ./bootstrap-java-env.sh   && 
./bootstrap-python-env.sh   && rm bootstrap-dev-env.sh   && rm 
bootstrap-java-env.sh   && rm bootstrap-python-env.sh" did not complete 
successfully: exit code: 1
Traceback (most recent call last):
  File "./docker/docker-build.py", line 384, in 
    main()
  File "./docker/docker-build.py", line 377, in main
    run_command(docker_build_cmd, opts)
  File "./docker/docker-build.py", line 145, in run_command
    subprocess.check_output(cmd, shell=True)
  File "/usr/lib64/python2.7/subprocess.py", line 575, in check_output
    raise CalledProcessError(retcode, cmd, output=output)
subprocess.CalledProcessError: Command 'docker buildx build --push --platform 
linux/arm64,linux/amd64 --build-arg RUNTIME_BASE_OS="ubuntu:bionic" --build-arg 
DEV_BASE_OS="ubuntu:bionic" --build-arg BASE_OS="ubuntu:bionic" --build-arg 
DOCKERFILE="docker/Dockerfile" --build-arg MAINTAINER="Apache Kudu 
" --build-arg URL="https://kudu.apache.org"; --build-arg 
VERSION="1.17.0" --build-arg VCS_REF="a3cd1ef13" --build-arg VCS_TYPE="git" 
--build-arg VCS_URL="https://gitbox.apache.org/repos/asf/kudu.git"; --file 
/data1/laiyingchun/dev/ap_kudu_117/docker/Dockerfile --target kudu --tag 
apache/kudu:1.17.0-ubuntu --tag apache/kudu:1.17.0 --tag 
apache/kudu:1.17-ubuntu --tag apache/kudu:1.17 --tag apache/kudu:latest-ubuntu 
--tag apache/kudu:latest /data1/laiyingchun/dev/ap_kudu_117' returned non-zero 
exit status 1 {code}
It seems this issue can be resolved by [https://gerrit.cloudera.org/c/20299/], 
but I didn't troubleshoot the root cause.

2.

Environment: Rocky 8.6, docker 20.10.17

Error:
{code:java}
$ python3 ./docker/docker-build.py --action push --platforms linux/amd64 
linux/arm64
Starting docker build: 2023-09-12T13:43:42.725191
Version: 1.17.0 (a3cd1ef13)
...
 => CACHED [linux/amd64 kudu 6/6] COPY --chown=kudu:kudu 
./docker/kudu-entrypoint.sh /                                                   
                                                                                
                                0.0s
 => ERROR [linux/arm64 build 10/17] RUN 
--mount=type=cache,id=ccache,uid=1000,gid=1000,target=/home/kudu/.ccache   
--mount=type=cache,id=gradle-cache,uid=1000,gid=1000,target=/home/kudu/.gradle  
 ../../build-support/enable_devtoolset.sh   ../../  727.5s
--
 > [linux/arm64 build 10/17] RUN 
--mount=type=cache,id=ccache,uid=1000,gid=1000,target=/home/kudu/.ccache   
--mount=type=cache,id=gradle-cache,uid=1000,gid=1000,target=/home/kudu/.gradle  
 ../../build-support/enable_devtoolset.sh   
../../thirdparty/installed/common/bin/cmake   -DCMAKE_BUILD_TYPE=release   
-DKUDU_LINK=static   -DKUDU_GIT_HASH=a3cd1ef13   -DNO_TESTS=1   ../..   && make 
-j4   && sudo make install   && if [ "1" == "1" ]; then find "bin" -name 
"kudu*" -type f -exec strip {} ;; fi   && if [[ "1" == "1" ]]; then find 
"/usr/local" -name "libkudu*" -type f -exec strip {} ;; fi:
#0 2.029 -- The C compiler identification is GNU 7.5.0
#0 2.930 -- The CXX compiler identification is GNU 7.5.0
...
#0 704.1 [100%] Linking CXX executable ../../../bin/kudu
#0 723.3 [100%] Built target kudu
#0 723.4 sudo: effective uid is not 0

[jira] [Commented] (KUDU-3510) Docker images build failed

2023-09-12 Thread Yingchun Lai (Jira)


[ 
https://issues.apache.org/jira/browse/KUDU-3510?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17764106#comment-17764106
 ] 

Yingchun Lai commented on KUDU-3510:


It seems issue #2 can be resolved by the command:
{code:java}
docker run --privileged multiarch/qemu-user-static:latest --reset -p yes 
--credential yes {code}
ref: https://github.com/docker/buildx/issues/1335

> Docker images build failed
> --
>
> Key: KUDU-3510
> URL: https://issues.apache.org/jira/browse/KUDU-3510
> Project: Kudu
>  Issue Type: Bug
>  Components: build, docker
>Affects Versions: 1.17.0
>Reporter: Yingchun Lai
>Priority: Major
>
> I encountered some issues when trying to build Docker images:
> 1.
> Environment:
> CentOS 7.9, docker 24.0.1.
> Error:
> {code:java}
> $ python ./docker/docker-build.py --action push --platforms linux/amd64 
> linux/arm64
> Starting docker build: 2023-09-12T13:43:53.888588
> Version: 1.17.0 (a3cd1ef13)
> ...
>  => CANCELED [linux/amd64 dev 7/7] RUN ./bootstrap-dev-env.sh   && 
> ./bootstrap-java-env.sh   && ./bootstrap-python-env.sh   && rm 
> bootstrap-dev-env.sh   && rm bootstrap-java-env.sh   && rm 
> bootstrap-python-env.sh                                     2.7s
>  => ERROR [linux/arm64 dev 7/7] RUN ./bootstrap-dev-env.sh   && 
> ./bootstrap-java-env.sh   && ./bootstrap-python-env.sh   && rm 
> bootstrap-dev-env.sh   && rm bootstrap-java-env.sh   && rm 
> bootstrap-python-env.sh                                        2.0s
>  => CANCELED [linux/arm64 runtime 5/5] RUN ./bootstrap-runtime-env.sh && rm 
> bootstrap-runtime-env.sh                                                      
>                                                                               
>                  2.4s
> --
>  > [linux/arm64 dev 7/7] RUN ./bootstrap-dev-env.sh   && 
> ./bootstrap-java-env.sh   && ./bootstrap-python-env.sh   && rm 
> bootstrap-dev-env.sh   && rm bootstrap-java-env.sh   && rm 
> bootstrap-python-env.sh:
> #0 1.451 Error while loading ȇs//./bootstrap-dev-env.sh: No such file or 
> directory
> --
> ERROR: failed to solve: process "/dev/.buildkit_qemu_emulator /bin/sh -c 
> ./bootstrap-dev-env.sh   && ./bootstrap-java-env.sh   && 
> ./bootstrap-python-env.sh   && rm bootstrap-dev-env.sh   && rm 
> bootstrap-java-env.sh   && rm bootstrap-python-env.sh" did not complete 
> successfully: exit code: 1
> Traceback (most recent call last):
>   File "./docker/docker-build.py", line 384, in 
>     main()
>   File "./docker/docker-build.py", line 377, in main
>     run_command(docker_build_cmd, opts)
>   File "./docker/docker-build.py", line 145, in run_command
>     subprocess.check_output(cmd, shell=True)
>   File "/usr/lib64/python2.7/subprocess.py", line 575, in check_output
>     raise CalledProcessError(retcode, cmd, output=output)
> subprocess.CalledProcessError: Command 'docker buildx build --push --platform 
> linux/arm64,linux/amd64 --build-arg RUNTIME_BASE_OS="ubuntu:bionic" 
> --build-arg DEV_BASE_OS="ubuntu:bionic" --build-arg BASE_OS="ubuntu:bionic" 
> --build-arg DOCKERFILE="docker/Dockerfile" --build-arg MAINTAINER="Apache 
> Kudu " --build-arg URL="https://kudu.apache.org"; 
> --build-arg VERSION="1.17.0" --build-arg VCS_REF="a3cd1ef13" --build-arg 
> VCS_TYPE="git" --build-arg 
> VCS_URL="https://gitbox.apache.org/repos/asf/kudu.git"; --file 
> /data1/laiyingchun/dev/ap_kudu_117/docker/Dockerfile --target kudu --tag 
> apache/kudu:1.17.0-ubuntu --tag apache/kudu:1.17.0 --tag 
> apache/kudu:1.17-ubuntu --tag apache/kudu:1.17 --tag 
> apache/kudu:latest-ubuntu --tag apache/kudu:latest 
> /data1/laiyingchun/dev/ap_kudu_117' returned non-zero exit status 1 {code}
> It seems this issue can be resolved by [https://gerrit.cloudera.org/c/20299/], 
> but I didn't troubleshoot the root cause.
> 2.
> Environment: Rocky 8.6, docker 20.10.17
> Error:
> {code:java}
> $ python3 ./docker/docker-build.py --action push --platforms linux/amd64 
> linux/arm64
> Starting docker build: 2023-09-12T13:43:42.725191
> Version: 1.17.0 (a3cd1ef13)
> ...
>  => CACHED [linux/amd64 kudu 6/6] COPY --chown=kudu:kudu 
> ./docker/kudu-entrypoint.sh /                                                 
>                                                                               
>                                     0.0s
>  => ERROR [linux/arm64 build 10/17] RUN 
> --mount=type=cache,id=ccache,uid=1000,gid=1000,target=/home/kudu/.ccache   
> --mount=type=cache,id=gradle-cache,uid=1000,gid=1000,target=/home/kudu/.gradle
>    ../../build-support/enable_devtoolset.sh   ../../  727.5s
> --
>  > [linux/arm64 build 10/17] RUN 
> --mount=type=cache,id=ccache,uid=1000,gid=1000,target=/home/kudu/.ccache   
> --mount=type=cache,id=gradle-cache,uid=1000,gid=1000,target=/home/kudu/.gradle
>    ../../build-support/enable_devtoolset.sh   
> ../../thirdp

[jira] [Comment Edited] (KUDU-3510) Docker images build failed

2023-09-12 Thread Yingchun Lai (Jira)


[ 
https://issues.apache.org/jira/browse/KUDU-3510?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17764106#comment-17764106
 ] 

Yingchun Lai edited comment on KUDU-3510 at 9/12/23 9:50 AM:
-

It seems issue #2 can be resolved by the command:
{code:java}
docker run --privileged multiarch/qemu-user-static:latest --reset -p yes 
--credential yes {code}
ref: [https://github.com/docker/buildx/issues/1335]

 

It works well after running the command.
{code:java}
$ python3 ./docker/docker-build.py --action load --platforms linux/amd64
Starting docker build: 2023-09-12T17:35:26.209468
Version: 1.17.0 (a3cd1ef13)
Bases: ['ubuntu:bionic']
Targets: ['kudu', 'kudu-python']
Building targets for ubuntu:bionic...
Building kudu target...
Running: docker buildx build --load --platform linux/amd64 --build-arg 
RUNTIME_BASE_OS="ubuntu:bionic" --build-arg DEV_BASE_OS="ubuntu:bionic" 
--build-arg BASE_OS="ubuntu:bionic" --build-arg DOCKERFILE="docker/Dockerfile" 
--build-arg MAINTAINER="Apache Kudu " --build-arg 
URL="https://kudu.apache.org"; --build-arg VERSION="1.17.0" --build-arg 
VCS_REF="a3cd1ef13" --build-arg VCS_TYPE="git" --build-arg 
VCS_URL="https://gitbox.apache.org/repos/asf/kudu.git"; --file 
/data/qdev/laiyingchun/kudu/docker/Dockerfile --target kudu --tag 
apache/kudu:1.17.0-ubuntu --tag apache/kudu:1.17.0 --tag 
apache/kudu:1.17-ubuntu --tag apache/kudu:1.17 --tag apache/kudu:latest-ubuntu 
--tag apache/kudu:latest /data/qdev/laiyingchun/kudu
[+] Building 11.8s (52/52) FINISHED
...
Finished Docker build: 2023-09-12T17:37:37.680546 (0:02:11.471078) {code}
{code:java}
$ python3 ./docker/docker-build.py --action load --platforms linux/arm64 
Starting docker build: 2023-09-12T17:41:29.525245 Version: 1.17.0 (a3cd1ef13) 
Bases: ['ubuntu:bionic'] Targets: ['kudu', 'kudu-python'] Building targets for 
ubuntu:bionic... Building kudu target... Running: docker buildx build --load 
--platform linux/arm64 --build-arg RUNTIME_BASE_OS="ubuntu:bionic" --build-arg 
DEV_BASE_OS="ubuntu:bionic" --build-arg BASE_OS="ubuntu:bionic" --build-arg 
DOCKERFILE="docker/Dockerfile" --build-arg MAINTAINER="Apache Kudu 
" --build-arg URL="https://kudu.apache.org"; --build-arg 
VERSION="1.17.0" --build-arg VCS_REF="a3cd1ef13" --build-arg VCS_TYPE="git" 
--build-arg VCS_URL="https://gitbox.apache.org/repos/asf/kudu.git"; --file 
/data/qdev/laiyingchun/kudu/docker/Dockerfile --target kudu --tag 
apache/kudu:1.17.0-ubuntu --tag apache/kudu:1.17.0 --tag 
apache/kudu:1.17-ubuntu --tag apache/kudu:1.17 --tag apache/kudu:latest-ubuntu 
--tag apache/kudu:latest /data/qdev/laiyingchun/kudu [+] Building 21.5s (52/52) 
FINISHED  => [internal] load .dockerignore                                      
                                                                                
                                                                                
                  0.0s ...
Finished Docker build: 2023-09-12T17:48:39.189521 (0:07:09.664276)
{code}


was (Author: laiyingchun):
It seems issue #2 can be resolved by the command:
{code:java}
docker run --privileged multiarch/qemu-user-static:latest --reset -p yes 
--credential yes {code}
ref: https://github.com/docker/buildx/issues/1335

> Docker images build failed
> --
>
> Key: KUDU-3510
> URL: https://issues.apache.org/jira/browse/KUDU-3510
> Project: Kudu
>  Issue Type: Bug
>  Components: build, docker
>Affects Versions: 1.17.0
>Reporter: Yingchun Lai
>Priority: Major
>
> I encountered some issues when trying to build Docker images:
> 1.
> Environment:
> CentOS 7.9, docker 24.0.1.
> Error:
> {code:java}
> $ python ./docker/docker-build.py --action push --platforms linux/amd64 
> linux/arm64
> Starting docker build: 2023-09-12T13:43:53.888588
> Version: 1.17.0 (a3cd1ef13)
> ...
>  => CANCELED [linux/amd64 dev 7/7] RUN ./bootstrap-dev-env.sh   && 
> ./bootstrap-java-env.sh   && ./bootstrap-python-env.sh   && rm 
> bootstrap-dev-env.sh   && rm bootstrap-java-env.sh   && rm 
> bootstrap-python-env.sh                                     2.7s
>  => ERROR [linux/arm64 dev 7/7] RUN ./bootstrap-dev-env.sh   && 
> ./bootstrap-java-env.sh   && ./bootstrap-python-env.sh   && rm 
> bootstrap-dev-env.sh   && rm bootstrap-java-env.sh   && rm 
> bootstrap-python-env.sh                                        2.0s
>  => CANCELED [linux/arm64 runtime 5/5] RUN ./bootstrap-runtime-env.sh && rm 
> bootstrap-runtime-env.sh                                                      
>                                                                               
>                  2.4s
> --
>  > [linux/arm64 dev 7/7] RUN ./bootstrap-dev-env.sh   && 
> ./bootstrap-java-env.sh   && ./bootstrap-python-env.sh   && rm 
> bootstrap-dev-env.sh   && rm bootstrap-java-env.sh   && rm 
> bootstrap-python-env.sh:
> #0 1.451 Error w

[jira] [Comment Edited] (KUDU-3510) Docker images build failed

2023-09-12 Thread Yingchun Lai (Jira)


[ 
https://issues.apache.org/jira/browse/KUDU-3510?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17764106#comment-17764106
 ] 

Yingchun Lai edited comment on KUDU-3510 at 9/12/23 9:51 AM:
-

It seems issue #2 can be resolved by the command:
{code:java}
docker run --privileged multiarch/qemu-user-static:latest --reset -p yes 
--credential yes {code}
ref: [https://github.com/docker/buildx/issues/1335]

 

It works well after running the command.
{code:java}
$ python3 ./docker/docker-build.py --action load --platforms linux/amd64
Starting docker build: 2023-09-12T17:35:26.209468
Version: 1.17.0 (a3cd1ef13)
Bases: ['ubuntu:bionic']
Targets: ['kudu', 'kudu-python']
Building targets for ubuntu:bionic...
Building kudu target...
Running: docker buildx build --load --platform linux/amd64 --build-arg 
RUNTIME_BASE_OS="ubuntu:bionic" --build-arg DEV_BASE_OS="ubuntu:bionic" 
--build-arg BASE_OS="ubuntu:bionic" --build-arg DOCKERFILE="docker/Dockerfile" 
--build-arg MAINTAINER="Apache Kudu " --build-arg 
URL="https://kudu.apache.org"; --build-arg VERSION="1.17.0" --build-arg 
VCS_REF="a3cd1ef13" --build-arg VCS_TYPE="git" --build-arg 
VCS_URL="https://gitbox.apache.org/repos/asf/kudu.git"; --file 
/data/qdev/laiyingchun/kudu/docker/Dockerfile --target kudu --tag 
apache/kudu:1.17.0-ubuntu --tag apache/kudu:1.17.0 --tag 
apache/kudu:1.17-ubuntu --tag apache/kudu:1.17 --tag apache/kudu:latest-ubuntu 
--tag apache/kudu:latest /data/qdev/laiyingchun/kudu
[+] Building 11.8s (52/52) FINISHED
...
Finished Docker build: 2023-09-12T17:37:37.680546 (0:02:11.471078) {code}
{code:java}
$ python3 ./docker/docker-build.py --action load --platforms linux/arm64
Starting docker build: 2023-09-12T17:41:29.525245 Version: 1.17.0 (a3cd1ef13) 
Bases: ['ubuntu:bionic'] Targets: ['kudu', 'kudu-python'] Building targets for 
ubuntu:bionic... Building kudu target... Running: docker buildx build --load 
--platform linux/arm64 --build-arg RUNTIME_BASE_OS="ubuntu:bionic" --build-arg 
DEV_BASE_OS="ubuntu:bionic" --build-arg BASE_OS="ubuntu:bionic" --build-arg 
DOCKERFILE="docker/Dockerfile" --build-arg MAINTAINER="Apache Kudu 
" --build-arg URL="https://kudu.apache.org"; --build-arg 
VERSION="1.17.0" --build-arg VCS_REF="a3cd1ef13" --build-arg VCS_TYPE="git" 
--build-arg VCS_URL="https://gitbox.apache.org/repos/asf/kudu.git"; --file 
/data/qdev/laiyingchun/kudu/docker/Dockerfile --target kudu --tag 
apache/kudu:1.17.0-ubuntu --tag apache/kudu:1.17.0 --tag 
apache/kudu:1.17-ubuntu --tag apache/kudu:1.17 --tag apache/kudu:latest-ubuntu 
--tag apache/kudu:latest /data/qdev/laiyingchun/kudu [+] Building 21.5s (52/52) 
FINISHED  => [internal] load .dockerignore                                      
                                                                                
                                                                                
                  0.0s ...
Finished Docker build: 2023-09-12T17:48:39.189521 (0:07:09.664276)
{code}


was (Author: laiyingchun):
It seems issue #2 can be resolved by the command:
{code:java}
docker run --privileged multiarch/qemu-user-static:latest --reset -p yes 
--credential yes {code}
ref: [https://github.com/docker/buildx/issues/1335]

 

It works well after running the command.
{code:java}
$ python3 ./docker/docker-build.py --action load --platforms linux/amd64
Starting docker build: 2023-09-12T17:35:26.209468
Version: 1.17.0 (a3cd1ef13)
Bases: ['ubuntu:bionic']
Targets: ['kudu', 'kudu-python']
Building targets for ubuntu:bionic...
Building kudu target...
Running: docker buildx build --load --platform linux/amd64 --build-arg 
RUNTIME_BASE_OS="ubuntu:bionic" --build-arg DEV_BASE_OS="ubuntu:bionic" 
--build-arg BASE_OS="ubuntu:bionic" --build-arg DOCKERFILE="docker/Dockerfile" 
--build-arg MAINTAINER="Apache Kudu " --build-arg 
URL="https://kudu.apache.org"; --build-arg VERSION="1.17.0" --build-arg 
VCS_REF="a3cd1ef13" --build-arg VCS_TYPE="git" --build-arg 
VCS_URL="https://gitbox.apache.org/repos/asf/kudu.git"; --file 
/data/qdev/laiyingchun/kudu/docker/Dockerfile --target kudu --tag 
apache/kudu:1.17.0-ubuntu --tag apache/kudu:1.17.0 --tag 
apache/kudu:1.17-ubuntu --tag apache/kudu:1.17 --tag apache/kudu:latest-ubuntu 
--tag apache/kudu:latest /data/qdev/laiyingchun/kudu
[+] Building 11.8s (52/52) FINISHED
...
Finished Docker build: 2023-09-12T17:37:37.680546 (0:02:11.471078) {code}
{code:java}
$ python3 ./docker/docker-build.py --action load --platforms linux/arm64 
Starting docker build: 2023-09-12T17:41:29.525245 Version: 1.17.0 (a3cd1ef13) 
Bases: ['ubuntu:bionic'] Targets: ['kudu', 'kudu-python'] Building targets for 
ubuntu:bionic... Building kudu target... Running: docker buildx build --load 
--platform linux/arm64 --build-arg RUNTIME_BASE_OS="ubuntu:bionic" --build-arg 
DEV_BASE_OS="ubuntu:bionic" --build-arg BASE_OS="ubuntu:bionic" --build-arg 
DOCKERFILE="docker/Dockerfile" --build-arg MAINTAINER="

[jira] [Comment Edited] (KUDU-3510) Docker images build failed

2023-09-12 Thread Yingchun Lai (Jira)


[ 
https://issues.apache.org/jira/browse/KUDU-3510?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17764106#comment-17764106
 ] 

Yingchun Lai edited comment on KUDU-3510 at 9/12/23 9:51 AM:
-

It seems issue #2 can be resolved by the command:
{code:java}
docker run --privileged multiarch/qemu-user-static:latest --reset -p yes 
--credential yes {code}
ref: [https://github.com/docker/buildx/issues/1335]

 

It works well after running the command.
{code:java}
$ python3 ./docker/docker-build.py --action load --platforms linux/amd64
Starting docker build: 2023-09-12T17:35:26.209468
Version: 1.17.0 (a3cd1ef13)
Bases: ['ubuntu:bionic']
Targets: ['kudu', 'kudu-python']
Building targets for ubuntu:bionic...
Building kudu target...
Running: docker buildx build --load --platform linux/amd64 --build-arg 
RUNTIME_BASE_OS="ubuntu:bionic" --build-arg DEV_BASE_OS="ubuntu:bionic" 
--build-arg BASE_OS="ubuntu:bionic" --build-arg DOCKERFILE="docker/Dockerfile" 
--build-arg MAINTAINER="Apache Kudu " --build-arg 
URL="https://kudu.apache.org"; --build-arg VERSION="1.17.0" --build-arg 
VCS_REF="a3cd1ef13" --build-arg VCS_TYPE="git" --build-arg 
VCS_URL="https://gitbox.apache.org/repos/asf/kudu.git"; --file 
/data/qdev/laiyingchun/kudu/docker/Dockerfile --target kudu --tag 
apache/kudu:1.17.0-ubuntu --tag apache/kudu:1.17.0 --tag 
apache/kudu:1.17-ubuntu --tag apache/kudu:1.17 --tag apache/kudu:latest-ubuntu 
--tag apache/kudu:latest /data/qdev/laiyingchun/kudu
[+] Building 11.8s (52/52) FINISHED
...
Finished Docker build: 2023-09-12T17:37:37.680546 (0:02:11.471078) {code}
{code:java}
$ python3 ./docker/docker-build.py --action load --platforms linux/arm64
Starting docker build: 2023-09-12T17:41:29.525245 Version: 1.17.0 (a3cd1ef13) 
Bases: ['ubuntu:bionic'] Targets: ['kudu', 'kudu-python'] Building targets for 
ubuntu:bionic... Building kudu target... Running: docker buildx build --load 
--platform linux/arm64 --build-arg RUNTIME_BASE_OS="ubuntu:bionic" --build-arg 
DEV_BASE_OS="ubuntu:bionic" --build-arg BASE_OS="ubuntu:bionic" --build-arg 
DOCKERFILE="docker/Dockerfile" --build-arg MAINTAINER="Apache Kudu 
" --build-arg URL="https://kudu.apache.org"; --build-arg 
VERSION="1.17.0" --build-arg VCS_REF="a3cd1ef13" --build-arg VCS_TYPE="git" 
--build-arg VCS_URL="https://gitbox.apache.org/repos/asf/kudu.git"; --file 
/data/qdev/laiyingchun/kudu/docker/Dockerfile --target kudu --tag 
apache/kudu:1.17.0-ubuntu --tag apache/kudu:1.17.0 --tag 
apache/kudu:1.17-ubuntu --tag apache/kudu:1.17 --tag apache/kudu:latest-ubuntu 
--tag apache/kudu:latest /data/qdev/laiyingchun/kudu [+] Building 21.5s (52/52) 
FINISHED  => [internal] load .dockerignore                                      
                                                                                
                                                                                
                  0.0s ...
Building kudu-python target...
Running: docker buildx build --load --platform linux/arm64 --build-arg 
RUNTIME_BASE_OS="ubuntu:bionic" --build-arg DEV_BASE_OS="ubuntu:bionic" 
--build-arg BASE_OS="ubuntu:bionic" --build-arg DOCKERFILE="docker/Dockerfile" 
--build-arg MAINTAINER="Apache Kudu " --build-arg 
URL="https://kudu.apache.org"; --build-arg VERSION="1.17.0" --build-arg 
VCS_REF="a3cd1ef13" --build-arg VCS_TYPE="git" --build-arg 
VCS_URL="https://gitbox.apache.org/repos/asf/kudu.git"; --file 
/data/qdev/laiyingchun/kudu/docker/Dockerfile --target kudu-python --tag 
apache/kudu:kudu-python-1.17.0-ubuntu --tag apache/kudu:kudu-python-1.17.0 
--tag apache/kudu:kudu-python-1.17-ubuntu --tag apache/kudu:kudu-python-1.17 
--tag apache/kudu:kudu-python-latest-ubuntu --tag 
apache/kudu:kudu-python-latest /data/qdev/laiyingchun/kudu
[+] Building 407.2s (53/53) FINISHED
Finished Docker build: 2023-09-12T17:48:39.189521 (0:07:09.664276)
{code}


was (Author: laiyingchun):
It seems issue #2 can be resolved by the command:
{code:java}
docker run --privileged multiarch/qemu-user-static:latest --reset -p yes 
--credential yes {code}
ref: [https://github.com/docker/buildx/issues/1335]

 

It works well after running the command.
{code:java}
$ python3 ./docker/docker-build.py --action load --platforms linux/amd64
Starting docker build: 2023-09-12T17:35:26.209468
Version: 1.17.0 (a3cd1ef13)
Bases: ['ubuntu:bionic']
Targets: ['kudu', 'kudu-python']
Building targets for ubuntu:bionic...
Building kudu target...
Running: docker buildx build --load --platform linux/amd64 --build-arg 
RUNTIME_BASE_OS="ubuntu:bionic" --build-arg DEV_BASE_OS="ubuntu:bionic" 
--build-arg BASE_OS="ubuntu:bionic" --build-arg DOCKERFILE="docker/Dockerfile" 
--build-arg MAINTAINER="Apache Kudu " --build-arg 
URL="https://kudu.apache.org"; --build-arg VERSION="1.17.0" --build-arg 
VCS_REF="a3cd1ef13" --build-arg VCS_TYPE="git" --build-arg 
VCS_URL="https://gitbox.apache.org/repos/asf/kudu.git"; --file 
/data/qdev/laiyingchun/kudu/

[jira] [Closed] (KUDU-3510) Docker images build failed

2023-09-12 Thread Yingchun Lai (Jira)


 [ 
https://issues.apache.org/jira/browse/KUDU-3510?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yingchun Lai closed KUDU-3510.
--
Resolution: Fixed

> Docker images build failed
> --
>
> Key: KUDU-3510
> URL: https://issues.apache.org/jira/browse/KUDU-3510
> Project: Kudu
>  Issue Type: Bug
>  Components: build, docker
>Affects Versions: 1.17.0
>Reporter: Yingchun Lai
>Priority: Major
>
> I encountered some issues when trying to build Docker images:
> 1.
> Environment:
> CentOS 7.9, docker 24.0.1.
> Error:
> {code:java}
> $ python ./docker/docker-build.py --action push --platforms linux/amd64 
> linux/arm64
> Starting docker build: 2023-09-12T13:43:53.888588
> Version: 1.17.0 (a3cd1ef13)
> ...
>  => CANCELED [linux/amd64 dev 7/7] RUN ./bootstrap-dev-env.sh   && 
> ./bootstrap-java-env.sh   && ./bootstrap-python-env.sh   && rm 
> bootstrap-dev-env.sh   && rm bootstrap-java-env.sh   && rm 
> bootstrap-python-env.sh                                     2.7s
>  => ERROR [linux/arm64 dev 7/7] RUN ./bootstrap-dev-env.sh   && 
> ./bootstrap-java-env.sh   && ./bootstrap-python-env.sh   && rm 
> bootstrap-dev-env.sh   && rm bootstrap-java-env.sh   && rm 
> bootstrap-python-env.sh                                        2.0s
>  => CANCELED [linux/arm64 runtime 5/5] RUN ./bootstrap-runtime-env.sh && rm 
> bootstrap-runtime-env.sh                                                      
>                                                                               
>                  2.4s
> --
>  > [linux/arm64 dev 7/7] RUN ./bootstrap-dev-env.sh   && 
> ./bootstrap-java-env.sh   && ./bootstrap-python-env.sh   && rm 
> bootstrap-dev-env.sh   && rm bootstrap-java-env.sh   && rm 
> bootstrap-python-env.sh:
> #0 1.451 Error while loading ȇs//./bootstrap-dev-env.sh: No such file or 
> directory
> --
> ERROR: failed to solve: process "/dev/.buildkit_qemu_emulator /bin/sh -c 
> ./bootstrap-dev-env.sh   && ./bootstrap-java-env.sh   && 
> ./bootstrap-python-env.sh   && rm bootstrap-dev-env.sh   && rm 
> bootstrap-java-env.sh   && rm bootstrap-python-env.sh" did not complete 
> successfully: exit code: 1
> Traceback (most recent call last):
>   File "./docker/docker-build.py", line 384, in 
>     main()
>   File "./docker/docker-build.py", line 377, in main
>     run_command(docker_build_cmd, opts)
>   File "./docker/docker-build.py", line 145, in run_command
>     subprocess.check_output(cmd, shell=True)
>   File "/usr/lib64/python2.7/subprocess.py", line 575, in check_output
>     raise CalledProcessError(retcode, cmd, output=output)
> subprocess.CalledProcessError: Command 'docker buildx build --push --platform 
> linux/arm64,linux/amd64 --build-arg RUNTIME_BASE_OS="ubuntu:bionic" 
> --build-arg DEV_BASE_OS="ubuntu:bionic" --build-arg BASE_OS="ubuntu:bionic" 
> --build-arg DOCKERFILE="docker/Dockerfile" --build-arg MAINTAINER="Apache 
> Kudu " --build-arg URL="https://kudu.apache.org"; 
> --build-arg VERSION="1.17.0" --build-arg VCS_REF="a3cd1ef13" --build-arg 
> VCS_TYPE="git" --build-arg 
> VCS_URL="https://gitbox.apache.org/repos/asf/kudu.git"; --file 
> /data1/laiyingchun/dev/ap_kudu_117/docker/Dockerfile --target kudu --tag 
> apache/kudu:1.17.0-ubuntu --tag apache/kudu:1.17.0 --tag 
> apache/kudu:1.17-ubuntu --tag apache/kudu:1.17 --tag 
> apache/kudu:latest-ubuntu --tag apache/kudu:latest 
> /data1/laiyingchun/dev/ap_kudu_117' returned non-zero exit status 1 {code}
> It seems this issue can be resolved by [https://gerrit.cloudera.org/c/20299/], 
> but I didn't troubleshoot the root cause.
> 2.
> Environment: Rocky 8.6, docker 20.10.17
> Error:
> {code:java}
> $ python3 ./docker/docker-build.py --action push --platforms linux/amd64 
> linux/arm64
> Starting docker build: 2023-09-12T13:43:42.725191
> Version: 1.17.0 (a3cd1ef13)
> ...
>  => CACHED [linux/amd64 kudu 6/6] COPY --chown=kudu:kudu 
> ./docker/kudu-entrypoint.sh /                                                 
>                                                                               
>                                     0.0s
>  => ERROR [linux/arm64 build 10/17] RUN 
> --mount=type=cache,id=ccache,uid=1000,gid=1000,target=/home/kudu/.ccache   
> --mount=type=cache,id=gradle-cache,uid=1000,gid=1000,target=/home/kudu/.gradle
>    ../../build-support/enable_devtoolset.sh   ../../  727.5s
> --
>  > [linux/arm64 build 10/17] RUN 
> --mount=type=cache,id=ccache,uid=1000,gid=1000,target=/home/kudu/.ccache   
> --mount=type=cache,id=gradle-cache,uid=1000,gid=1000,target=/home/kudu/.gradle
>    ../../build-support/enable_devtoolset.sh   
> ../../thirdparty/installed/common/bin/cmake   -DCMAKE_BUILD_TYPE=release   
> -DKUDU_LINK=static   -DKUDU_GIT_HASH=a3cd1ef13   -DNO_TESTS=1   ../..   && 
> make -j4   && sudo make install   && if [ "1" == "1" ]; then find "bin" -name 
> "kudu*" -type f -exe

[jira] [Comment Edited] (KUDU-3510) Docker images build failed

2023-09-12 Thread Yingchun Lai (Jira)


[ 
https://issues.apache.org/jira/browse/KUDU-3510?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17764106#comment-17764106
 ] 

Yingchun Lai edited comment on KUDU-3510 at 9/12/23 9:52 AM:
-

It seems issue #2 can be resolved by the command:
{code:java}
docker run --privileged multiarch/qemu-user-static:latest --reset -p yes 
--credential yes {code}
ref: [https://github.com/docker/buildx/issues/1335]

 

It works well after running the command.
{code:java}
$ python3 ./docker/docker-build.py --action load --platforms linux/amd64
Starting docker build: 2023-09-12T17:35:26.209468
Version: 1.17.0 (a3cd1ef13)
Bases: ['ubuntu:bionic']
Targets: ['kudu', 'kudu-python']
Building targets for ubuntu:bionic...
Building kudu target...
Running: docker buildx build --load --platform linux/amd64 --build-arg 
RUNTIME_BASE_OS="ubuntu:bionic" --build-arg DEV_BASE_OS="ubuntu:bionic" 
--build-arg BASE_OS="ubuntu:bionic" --build-arg DOCKERFILE="docker/Dockerfile" 
--build-arg MAINTAINER="Apache Kudu " --build-arg 
URL="https://kudu.apache.org"; --build-arg VERSION="1.17.0" --build-arg 
VCS_REF="a3cd1ef13" --build-arg VCS_TYPE="git" --build-arg 
VCS_URL="https://gitbox.apache.org/repos/asf/kudu.git"; --file 
/data/qdev/laiyingchun/kudu/docker/Dockerfile --target kudu --tag 
apache/kudu:1.17.0-ubuntu --tag apache/kudu:1.17.0 --tag 
apache/kudu:1.17-ubuntu --tag apache/kudu:1.17 --tag apache/kudu:latest-ubuntu 
--tag apache/kudu:latest /data/qdev/laiyingchun/kudu
[+] Building 11.8s (52/52) FINISHED
...
Finished Docker build: 2023-09-12T17:37:37.680546 (0:02:11.471078) {code}
{code:java}
$ python3 ./docker/docker-build.py --action load --platforms linux/arm64
Starting docker build: 2023-09-12T17:41:29.525245
Version: 1.17.0 (a3cd1ef13)
Bases: ['ubuntu:bionic']
Targets: ['kudu', 'kudu-python']
Building targets for ubuntu:bionic...
Building kudu target...
Running: docker buildx build --load --platform linux/arm64 --build-arg 
RUNTIME_BASE_OS="ubuntu:bionic" --build-arg DEV_BASE_OS="ubuntu:bionic" 
--build-arg BASE_OS="ubuntu:bionic" --build-arg DOCKERFILE="docker/Dockerfile" 
--build-arg MAINTAINER="Apache Kudu <dev@kudu.apache.org>" --build-arg 
URL="https://kudu.apache.org" --build-arg VERSION="1.17.0" --build-arg 
VCS_REF="a3cd1ef13" --build-arg VCS_TYPE="git" --build-arg 
VCS_URL="https://gitbox.apache.org/repos/asf/kudu.git" --file 
/data/qdev/laiyingchun/kudu/docker/Dockerfile --target kudu --tag 
apache/kudu:1.17.0-ubuntu --tag apache/kudu:1.17.0 --tag 
apache/kudu:1.17-ubuntu --tag apache/kudu:1.17 --tag apache/kudu:latest-ubuntu 
--tag apache/kudu:latest /data/qdev/laiyingchun/kudu
[+] Building 21.5s (52/52) FINISHED
 => [internal] load .dockerignore                                          0.0s
...
Building kudu-python target...
Running: docker buildx build --load --platform linux/arm64 --build-arg 
RUNTIME_BASE_OS="ubuntu:bionic" --build-arg DEV_BASE_OS="ubuntu:bionic" 
--build-arg BASE_OS="ubuntu:bionic" --build-arg DOCKERFILE="docker/Dockerfile" 
--build-arg MAINTAINER="Apache Kudu " --build-arg 
URL="https://kudu.apache.org"; --build-arg VERSION="1.17.0" --build-arg 
VCS_REF="a3cd1ef13" --build-arg VCS_TYPE="git" --build-arg 
VCS_URL="https://gitbox.apache.org/repos/asf/kudu.git"; --file 
/data/qdev/laiyingchun/kudu/docker/Dockerfile --target kudu-python --tag 
apache/kudu:kudu-python-1.17.0-ubuntu --tag apache/kudu:kudu-python-1.17.0 
--tag apache/kudu:kudu-python-1.17-ubuntu --tag apache/kudu:kudu-python-1.17 
--tag apache/kudu:kudu-python-latest-ubuntu --tag 
apache/kudu:kudu-python-latest /data/qdev/laiyingchun/kudu
[+] Building 407.2s (53/53) FINISHED
...
Finished Docker build: 2023-09-12T17:48:39.189521 (0:07:09.664276)
{code}
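
To double-check what a loaded image was built for, inspecting its recorded 
platform should work (a sketch using the tags produced above):
{code:java}
# Print the OS/architecture stored in the image metadata; after the arm64
# build above, the same tag points at the arm64 image.
docker image inspect --format '{{.Os}}/{{.Architecture}}' apache/kudu:1.17.0
{code}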


was (Author: laiyingchun):
It seems issue #2 can be resolved by running this command:
{code:java}
docker run --privileged multiarch/qemu-user-static:latest --reset -p yes 
--credential yes {code}
ref: [https://github.com/docker/buildx/issues/1335]

 

It works well after running the command.
{code:java}
$ python3 ./docker/docker-build.py --action load --platforms linux/amd64
Starting docker build: 2023-09-12T17:35:26.209468
Version: 1.17.0 (a3cd1ef13)
Bases: ['ubuntu:bionic']
Targets: ['kudu', 'kudu-python']
Building targets for ubuntu:bionic...
Building kudu target...
Running: docker buildx build --load --platform linux/amd64 --build-arg 
RUNTIME_BASE_OS="ubuntu:bionic" --build-arg DEV_BASE_OS="ubuntu:bionic" 
--build-arg BASE_OS="ubuntu:bionic" --build-arg DOCKERFILE="docker/Dockerfile" 
--build-arg MAINTAINER="Apache Kudu " --build-arg 
URL="https://kudu.apache.org"; --build-arg VERSION="1.17.0" --build-arg 
VCS_REF="a3cd1ef13" --build-arg VCS_TYPE="git" --build-arg 
VCS_URL="https://gitbox.apache.org/repos/asf/kudu.git"; --file 
/data/qdev/laiyingchun/k

[jira] [Created] (KUDU-3580) Kudu servers and tests crash after linking RocksDB library

2024-05-26 Thread Yingchun Lai (Jira)
Yingchun Lai created KUDU-3580:
--

 Summary: Kudu servers and tests crash after linking RocksDB library
 Key: KUDU-3580
 URL: https://issues.apache.org/jira/browse/KUDU-3580
 Project: Kudu
  Issue Type: Bug
  Components: master, test, tserver
Reporter: Yingchun Lai






--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (KUDU-3580) Kudu servers and tests crash after linking RocksDB library

2024-05-26 Thread Yingchun Lai (Jira)


 [ 
https://issues.apache.org/jira/browse/KUDU-3580?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yingchun Lai updated KUDU-3580:
---
Description: 
After this commit [1] was merged, it was reported that the binaries (the test 
binaries as well as {{kudu}}, {{kudu-tserver}}, and {{kudu-master}}) result in 
SIGILL with coredumps. 
 
GDB shows the following stack:
(gdb) run
Starting program: /home/aserbin/tmp/kudu 
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".

Program received signal SIGILL, Illegal instruction.
std::function::swap(std::function&) (__x=..., 
this=0x7fffe0e0)
at /opt/rh/devtoolset-8/root/usr/include/c++/8/bits/std_function.h:548
548 /opt/rh/devtoolset-8/root/usr/include/c++/8/bits/std_function.h: No 
such file or directory.
(gdb) bt
#0  std::function::swap(std::function&) (
__x=..., this=0x7fffe0e0)
at /opt/rh/devtoolset-8/root/usr/include/c++/8/bits/std_function.h:548
#1  std::function::operator=(std::function 
const&) (__x=..., this=0x7fffe108)
at /opt/rh/devtoolset-8/root/usr/include/c++/8/bits/std_function.h:463
#2  rocksdb::OptionTypeInfo::SetParseFunc(std::function 
const&)
(f=..., this=0x7fffe100)
at 
/root/Projects/kudu/thirdparty/src/rocksdb-7.7.3/include/rocksdb/utilities/options_type.h:591
#3  rocksdb::OptionTypeInfo::AsCustomSharedPtr (
offset=offset@entry=0, 
ovt=ovt@entry=rocksdb::OptionVerificationType::kByName, 
flags=flags@entry=rocksdb::OptionTypeFlags::kDontSerialize)
at 
/root/Projects/kudu/thirdparty/src/rocksdb-7.7.3/include/rocksdb/utilities/options_type.h:497
#4  0x00ee8c5e in __static_initialization_and_destruction_0(int, int) 
[clone .constprop.449] ()
at /root/Projects/kudu/thirdparty/src/rocksdb-7.7.3/env/env.cc:1267
#5  0x03ca23cd in __libc_csu_init ()
#6  0x75a69c18 in __libc_start_main (main=0xed8de0 , argc=1, 
argv=0x7fffe4f8, init=0x3ca2380 <__libc_csu_init>, 
fini=, rtld_fini=, stack_end=0x7fffe4e8)
at ../csu/libc-start.c:266
#7  0x00f8f4c4 in _start ()
at /root/Projects/kudu/src/kudu/tools/tool_main.cc:306
(gdb) 
 

1. 
https://github.com/apache/kudu/commit/4da8b20070a7c0070a1829dfd50fdc78cad88b6a
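
One way to capture the faulting instruction non-interactively (a sketch; it 
assumes the same binary path as in the GDB session above):
{code:java}
# Run the crashing binary under GDB in batch mode, then dump the instruction
# at the fault address, the register state, and the backtrace.
gdb -batch \
    -ex run \
    -ex 'x/i $pc' \
    -ex 'info registers' \
    -ex bt \
    /home/aserbin/tmp/kudu
{code}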

> Kudu servers and tests crash after linking RocksDB library
> --
>
> Key: KUDU-3580
> URL: https://issues.apache.org/jira/browse/KUDU-3580
> Project: Kudu
>  Issue Type: Bug
>  Components: master, test, tserver
>Reporter: Yingchun Lai
>Priority: Critical
>
> After this commit [1] was merged, it was reported that the binaries (the 
> test binaries as well as {{kudu}}, {{kudu-tserver}}, and {{kudu-master}}) 
> result in SIGILL with coredumps. 
>  
> GDB shows the following stack:
> (gdb) run
> Starting program: /home/aserbin/tmp/kudu 
> [Thread debugging using libthread_db enabled]
> Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
> Program received signal SIGILL, Illegal instruction.
> std::function const&, std::string const&, void*)>::swap(std::function (rocksdb::ConfigOptions const&, std::string const&, std::string const&, 
> void*)>&) (__x=..., 
> this=0x7fffe0e0)
> at /opt/rh/devtoolset-8/root/usr/include/c++/8/bits/std_function.h:548
> 548   /opt/rh/devtoolset-8/root/usr/include/c++/8/bits/std_function.h: No 
> such file or directory.
> (gdb) bt
> #0  std::function const&, std::string const&, void*)>::swap(std::function (rocksdb::ConfigOptions const&, std::string const&, std::string const&, 
> void*)>&) (
> __x=..., this=0x7fffe0e0)
> at /opt/rh/devtoolset-8/root/usr/include/c++/8/bits/std_function.h:548
> #1  std::function const&, std::string const&, void*)>::operator=(std::function (rocksdb::ConfigOptions const&, std::string const&, std::string const&, 
> void*)> const&) (__x=..., this=0x7fffe108)
> at /opt/rh/devtoolset-8/root/usr/include/c++/8/bits/std_function.h:463
> #2  rocksdb::OptionTypeInfo::SetParseFunc(std::function (rocksdb::ConfigOptions const&, std::string const&, std::string const&, 
> void*)> const&)
> (f=..., this=0x7fffe100)
> at 
> /root/Projects/kudu/thirdparty/src/rocksdb-7.7.3/include/rocksdb/utilities/options_type.h:591
> #3  rocksdb::OptionTypeInfo::AsCustomSharedPtr (
> offset=offset@entry=0, 
> ovt=ovt@entry=rocksdb::OptionVerificationType::kByName, 
> flags=flags@entry=rocksdb::OptionTypeFlags::kDontSerialize)
> at 
> /root/Projects/kudu/thirdparty/src/rocksdb-7.7.3/include/rocksdb/utilities/options_type.h:497
> #4  0x00ee8c5e in __static_initialization_and_destruction_0(int, int) 
> [clone .constprop.449] ()
> at /root/Projects/kudu/thirdparty/src/rocksdb-7.7.3/env/env.cc:1267
> #5  0x03ca23cd in __libc_csu_init ()
> #6  0x75a69c18 in __l

[jira] [Updated] (KUDU-3580) Kudu servers and tests crash after linking RocksDB library

2024-05-26 Thread Yingchun Lai (Jira)


 [ 
https://issues.apache.org/jira/browse/KUDU-3580?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yingchun Lai updated KUDU-3580:
---
Description: 
After this commit [1] was merged, it was reported that the binaries (the test 
binaries as well as {{kudu}}, {{kudu-tserver}}, and {{kudu-master}}) result in 
SIGILL with coredumps. 
 
GDB shows the following stack:
{code:java}
(gdb) run
Starting program: /home/aserbin/tmp/kudu 
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
Program received signal SIGILL, Illegal instruction.
std::function::swap(std::function&) (__x=..., 
this=0x7fffe0e0)
at /opt/rh/devtoolset-8/root/usr/include/c++/8/bits/std_function.h:548
548 /opt/rh/devtoolset-8/root/usr/include/c++/8/bits/std_function.h: No such 
file or directory.
(gdb) bt
#0 std::function::swap(std::function&) (
__x=..., this=0x7fffe0e0)
at /opt/rh/devtoolset-8/root/usr/include/c++/8/bits/std_function.h:548
#1 std::function::operator=(std::function 
const&) (__x=..., this=0x7fffe108)
at /opt/rh/devtoolset-8/root/usr/include/c++/8/bits/std_function.h:463
#2 rocksdb::OptionTypeInfo::SetParseFunc(std::function 
const&)
(f=..., this=0x7fffe100)
at 
/root/Projects/kudu/thirdparty/src/rocksdb-7.7.3/include/rocksdb/utilities/options_type.h:591
#3 rocksdb::OptionTypeInfo::AsCustomSharedPtr (
offset=offset@entry=0, 
ovt=ovt@entry=rocksdb::OptionVerificationType::kByName, 
flags=flags@entry=rocksdb::OptionTypeFlags::kDontSerialize)
at 
/root/Projects/kudu/thirdparty/src/rocksdb-7.7.3/include/rocksdb/utilities/options_type.h:497
#4 0x00ee8c5e in __static_initialization_and_destruction_0(int, int) 
[clone .constprop.449] ()
at /root/Projects/kudu/thirdparty/src/rocksdb-7.7.3/env/env.cc:1267
#5 0x03ca23cd in __libc_csu_init ()
#6 0x75a69c18 in __libc_start_main (main=0xed8de0 , argc=1, 
argv=0x7fffe4f8, init=0x3ca2380 <__libc_csu_init>, 
fini=, rtld_fini=, stack_end=0x7fffe4e8)
at ../csu/libc-start.c:266
#7 0x00f8f4c4 in _start ()
at /root/Projects/kudu/src/kudu/tools/tool_main.cc:306
(gdb) {code}
 

1. 
[https://github.com/apache/kudu/commit/4da8b20070a7c0070a1829dfd50fdc78cad88b6a]

  was:
After this commit [1] was merged, it was reported that the binaries (the test 
binaries as well as {{kudu}}, {{kudu-tserver}}, and {{kudu-master}}) result in 
SIGILL with coredumps. 
 
GDB shows the following stack:
(gdb) run
Starting program: /home/aserbin/tmp/kudu 
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".

Program received signal SIGILL, Illegal instruction.
std::function::swap(std::function&) (__x=..., 
this=0x7fffe0e0)
at /opt/rh/devtoolset-8/root/usr/include/c++/8/bits/std_function.h:548
548 /opt/rh/devtoolset-8/root/usr/include/c++/8/bits/std_function.h: No 
such file or directory.
(gdb) bt
#0  std::function::swap(std::function&) (
__x=..., this=0x7fffe0e0)
at /opt/rh/devtoolset-8/root/usr/include/c++/8/bits/std_function.h:548
#1  std::function::operator=(std::function 
const&) (__x=..., this=0x7fffe108)
at /opt/rh/devtoolset-8/root/usr/include/c++/8/bits/std_function.h:463
#2  rocksdb::OptionTypeInfo::SetParseFunc(std::function 
const&)
(f=..., this=0x7fffe100)
at 
/root/Projects/kudu/thirdparty/src/rocksdb-7.7.3/include/rocksdb/utilities/options_type.h:591
#3  rocksdb::OptionTypeInfo::AsCustomSharedPtr (
offset=offset@entry=0, 
ovt=ovt@entry=rocksdb::OptionVerificationType::kByName, 
flags=flags@entry=rocksdb::OptionTypeFlags::kDontSerialize)
at 
/root/Projects/kudu/thirdparty/src/rocksdb-7.7.3/include/rocksdb/utilities/options_type.h:497
#4  0x00ee8c5e in __static_initialization_and_destruction_0(int, int) 
[clone .constprop.449] ()
at /root/Projects/kudu/thirdparty/src/rocksdb-7.7.3/env/env.cc:1267
#5  0x03ca23cd in __libc_csu_init ()
#6  0x75a69c18 in __libc_start_main (main=0xed8de0 , argc=1, 
argv=0x7fffe4f8, init=0x3ca2380 <__libc_csu_init>, 
fini=, rtld_fini=, stack_end=0x7fffe4e8)
at ../csu/libc-start.c:266
#7  0x00f8f4c4 in _start ()
at /root/Projects/kudu/src/kudu/tools/tool_main.cc:306
(gdb) 
 

1. 
https://github.com/apache/kudu/commit/4da8b20070a7c0070a1829dfd50fdc78cad88b6a


> Kudu servers and tests crash after linking RocksDB library
> --
>
> Key: KUDU-3580
> URL: https://issues.apache.org/jira/browse/KUDU-3580
> Project: Kudu
>  Issue Type: Bug
>  Components: master, test, tserver
>Reporter: Yingchun Lai
>Priority: Critical
>
> After this commit [1] was merged, it was reported that the binaries (the 
> test binaries as well as {{kudu}}, {{kudu-tserver}}, and {{kudu-master}}) 
> result in SIGILL with coredumps. 
>  

[jira] [Updated] (KUDU-3580) Kudu servers and tests crash after linking RocksDB library

2024-05-26 Thread Yingchun Lai (Jira)


 [ 
https://issues.apache.org/jira/browse/KUDU-3580?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yingchun Lai updated KUDU-3580:
---
Description: 
After this commit [1] was merged, it was reported that the binaries (the test 
binaries as well as {{kudu}}, {{kudu-tserver}}, and {{kudu-master}}) result in 
SIGILL with coredumps. 
 
GDB shows the following stack:
{code:java}
(gdb) run
Starting program: /home/aserbin/tmp/kudu 
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
Program received signal SIGILL, Illegal instruction.
std::function::swap(std::function&) (__x=..., 
this=0x7fffe0e0)
at /opt/rh/devtoolset-8/root/usr/include/c++/8/bits/std_function.h:548
548 /opt/rh/devtoolset-8/root/usr/include/c++/8/bits/std_function.h: No such 
file or directory.
(gdb) bt
#0 std::function::swap(std::function&) (
__x=..., this=0x7fffe0e0)
at /opt/rh/devtoolset-8/root/usr/include/c++/8/bits/std_function.h:548
#1 std::function::operator=(std::function 
const&) (__x=..., this=0x7fffe108)
at /opt/rh/devtoolset-8/root/usr/include/c++/8/bits/std_function.h:463
#2 rocksdb::OptionTypeInfo::SetParseFunc(std::function 
const&)
(f=..., this=0x7fffe100)
at 
/root/Projects/kudu/thirdparty/src/rocksdb-7.7.3/include/rocksdb/utilities/options_type.h:591
#3 rocksdb::OptionTypeInfo::AsCustomSharedPtr (
offset=offset@entry=0, 
ovt=ovt@entry=rocksdb::OptionVerificationType::kByName, 
flags=flags@entry=rocksdb::OptionTypeFlags::kDontSerialize)
at 
/root/Projects/kudu/thirdparty/src/rocksdb-7.7.3/include/rocksdb/utilities/options_type.h:497
#4 0x00ee8c5e in __static_initialization_and_destruction_0(int, int) 
[clone .constprop.449] ()
at /root/Projects/kudu/thirdparty/src/rocksdb-7.7.3/env/env.cc:1267
#5 0x03ca23cd in __libc_csu_init ()
#6 0x75a69c18 in __libc_start_main (main=0xed8de0 , argc=1, 
argv=0x7fffe4f8, init=0x3ca2380 <__libc_csu_init>, 
fini=, rtld_fini=, stack_end=0x7fffe4e8)
at ../csu/libc-start.c:266
#7 0x00f8f4c4 in _start ()
at /root/Projects/kudu/src/kudu/tools/tool_main.cc:306
(gdb) {code}
An example of a run where SIGILL is observed (binaries built from the top of 
the master branch at 634d967a0c620db2b3932c09b1fe13be1dc70f44): 
[http://dist-test.cloudera.org/job?job_id=root.1712768932.261750]
 
1. 
[https://github.com/apache/kudu/commit/4da8b20070a7c0070a1829dfd50fdc78cad88b6a]

  was:
After this commit [1] was merged, it was reported that the binaries (the test 
binaries as well as {{kudu}}, {{kudu-tserver}}, and {{kudu-master}}) result in 
SIGILL with coredumps. 
 
GDB shows the following stack:
{code:java}
(gdb) run
Starting program: /home/aserbin/tmp/kudu 
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
Program received signal SIGILL, Illegal instruction.
std::function::swap(std::function&) (__x=..., 
this=0x7fffe0e0)
at /opt/rh/devtoolset-8/root/usr/include/c++/8/bits/std_function.h:548
548 /opt/rh/devtoolset-8/root/usr/include/c++/8/bits/std_function.h: No such 
file or directory.
(gdb) bt
#0 std::function::swap(std::function&) (
__x=..., this=0x7fffe0e0)
at /opt/rh/devtoolset-8/root/usr/include/c++/8/bits/std_function.h:548
#1 std::function::operator=(std::function 
const&) (__x=..., this=0x7fffe108)
at /opt/rh/devtoolset-8/root/usr/include/c++/8/bits/std_function.h:463
#2 rocksdb::OptionTypeInfo::SetParseFunc(std::function 
const&)
(f=..., this=0x7fffe100)
at 
/root/Projects/kudu/thirdparty/src/rocksdb-7.7.3/include/rocksdb/utilities/options_type.h:591
#3 rocksdb::OptionTypeInfo::AsCustomSharedPtr (
offset=offset@entry=0, 
ovt=ovt@entry=rocksdb::OptionVerificationType::kByName, 
flags=flags@entry=rocksdb::OptionTypeFlags::kDontSerialize)
at 
/root/Projects/kudu/thirdparty/src/rocksdb-7.7.3/include/rocksdb/utilities/options_type.h:497
#4 0x00ee8c5e in __static_initialization_and_destruction_0(int, int) 
[clone .constprop.449] ()
at /root/Projects/kudu/thirdparty/src/rocksdb-7.7.3/env/env.cc:1267
#5 0x03ca23cd in __libc_csu_init ()
#6 0x75a69c18 in __libc_start_main (main=0xed8de0 , argc=1, 
argv=0x7fffe4f8, init=0x3ca2380 <__libc_csu_init>, 
fini=, rtld_fini=, stack_end=0x7fffe4e8)
at ../csu/libc-start.c:266
#7 0x00f8f4c4 in _start ()
at /root/Projects/kudu/src/kudu/tools/tool_main.cc:306
(gdb) {code}
 

1. 
[https://github.com/apache/kudu/commit/4da8b20070a7c0070a1829dfd50fdc78cad88b6a]


> Kudu servers and tests crash after linking RocksDB library
> --
>
> Key: KUDU-3580
> URL: https://issues.apache.org/jira/browse/KUDU-3580
> Project: Kudu
>  Issue Type: Bug
>  Components: master, test, tserver
>Reporter: Yingchun Lai
>Priority: Critical
>
> After this commit 
