[ https://issues.apache.org/jira/browse/KUDU-3523?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Xixu Wang updated KUDU-3523:
----------------------------
    Attachment: image-2023-12-12-11-47-17-460.png

> st_blksize is not alway equal to the filesystem block size
> ----------------------------------------------------------
>
>                 Key: KUDU-3523
>                 URL: https://issues.apache.org/jira/browse/KUDU-3523
>             Project: Kudu
>          Issue Type: Bug
>            Reporter: Xixu Wang
>            Priority: Major
>         Attachments: image-2023-11-06-15-42-46-082.png,
> image-2023-11-06-15-45-11-819.png, image-2023-11-06-15-45-39-233.png,
> image-2023-11-06-15-52-41-834.png, image-2023-11-08-14-35-08-189.png,
> image-2023-12-05-14-46-08-770.png, image-2023-12-05-14-50-58-889.png,
> image-2023-12-05-14-56-11-794.png, image-2023-12-05-15-03-29-724.png,
> image-2023-12-05-15-04-05-323.png, image-2023-12-05-15-04-51-010.png,
> image-2023-12-07-10-50-52-642.png, image-2023-12-07-10-51-44-361.png,
> image-2023-12-12-11-46-38-510.png, image-2023-12-12-11-47-17-460.png
>
>
> On my aarch64 system, st_blksize is not equal to the real filesystem
> block size. st_blksize on my system is 65536 bytes, but the block size of
> the filesystem is 4096 bytes. When writing data smaller than 4096 bytes,
> the on-disk size of the file is 4096 bytes, not 65536 bytes. However, Kudu
> uses st_blksize to determine the filesystem block size, which is not
> always correct.
>
> The following unit test exposes this issue:
> EncryptionEnabled/LogBlockManagerTest.ContainerPreallocationTest/1
> {code:java}
> /root/kudu/src/kudu/fs/log_block_manager-test.cc:541: Failure
> Expected equality of these values:
> FLAGS_log_container_preallocate_bytes
> Which is: 33554432
> size
> Which is: 33492992
> {code}
> The code is as follows:
> !image-2023-11-08-14-35-08-189.png!
>
> FLAGS_log_container_preallocate_bytes=33554432 bytes
> The file is encrypted, so the encryption header occupies one block in the
> filesystem. After creating the first block, there should be 2 blocks on
> the disk.
> On my system (aarch64 kylin-10), st_blksize=65536, but the filesystem
> block size is 4096; see step 4 below.
> When the encryption header is written to the file, its on-disk size is
> 4096 bytes. When a new block is written, its offset is 65536 (Kudu uses
> st_blksize to decide the next block offset; see the function
> src/kudu/util/env_posix.cc#GetBlockSize()). Therefore, the first
> filesystem block holds only 4096 bytes on disk, but Kudu thinks it
> occupies 65536 bytes and preallocates
> (FLAGS_log_container_preallocate_bytes - 1) bytes for this file. In
> practice, this leaves a hole of (65536 - 4096) bytes in the filesystem
> block. As a result, the file size on disk is
> (FLAGS_log_container_preallocate_bytes - (65536 - 4096)) = 33492992.
>
> {color:#de350b}In my opinion, Kudu should use the filesystem block size
> (f_bsize) as the Kudu block size, not st_blksize.{color}
>
> *1. The test environment*
> Linux hybrid01 4.19.90-23.30.v2101.ky10.aarch64 #1 SMP Thu Dec 15 09:57:55
> CST 2022 aarch64 aarch64 aarch64 GNU/Linux. A docker container runs on it.
> *2. Create a file with an encryption header*
>
> {code:java}
> // Mark the file as sensitive so that it is encrypted when encryption is enabled.
> const string kFile = JoinPathSegments(test_dir_, "encrypted_file");
> unique_ptr<RWFile> rw;
> RWFileOptions opts;
> opts.is_sensitive = true;
> ASSERT_OK(env_->NewRWFile(opts, kFile, &rw));
> // Query the on-disk size of the newly created file.
> uint64_t file_size = 0;
> ASSERT_OK(env_->GetFileSizeOnDisk(kFile, &file_size));
> {code}
> *3. stat the file*
>
> The IO Block size is 65536, which means st_blksize is 65536; the logical
> file size is 64 bytes.
> !image-2023-11-06-15-42-46-082.png!
> *4. The filesystem block size is 4096 bytes*
> !image-2023-11-06-15-45-39-233.png!
> *5. The on-disk size of the file is 4096 bytes*
> !image-2023-11-06-15-52-41-834.png!
>
> *More information about my environment:*
> {color:#de350b}*Kudu runs in a docker container; I created the docker
> image myself.*{color}
> *1. CPU architecture (host machine)*
> !image-2023-12-05-14-46-08-770.png!
> *2. Operating system (host machine)*
> !image-2023-12-05-14-50-58-889.png!
>
> *3. Filesystem type (host machine): {color:#de350b}Kudu runs on the
> directory /sensorsmounts/hybriddata{color}*
> !image-2023-12-05-14-56-11-794.png!
> *4. CPU architecture (Kudu docker instance)*
> !image-2023-12-05-15-03-29-724.png!
> *5. Operating system (Kudu docker instance)*
> !image-2023-12-05-15-04-05-323.png!
>
> *6. Filesystem type (Kudu docker instance): {color:#de350b}Kudu runs on
> the directory /root/kudu, but the data directory is /tmp/kudutest-0/ when
> running unit tests.{color}*
> !image-2023-12-05-15-04-51-010.png!
>
> *7. In my docker instance, an XFS filesystem also hits this case.*
> !image-2023-12-07-10-51-44-361.png!

--
This message was sent by Atlassian Jira
(v8.20.10#820010)
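
For anyone who wants to compare the two values discussed in this issue on their own machine, here is a minimal sketch. It is not Kudu code. It reads st_blksize via stat(2) and f_bsize via statvfs(3), and reproduces the on-disk size arithmetic from the report. The helper names PreferredIoSize, FilesystemBlockSize, and ExpectedOnDiskSize are hypothetical and exist only for illustration.

```cpp
#include <sys/stat.h>
#include <sys/statvfs.h>

#include <cassert>
#include <cstdint>
#include <string>

// Hypothetical helper (not part of Kudu): the preferred I/O size that
// stat(2) reports in st_blksize. Returns 0 if stat() fails.
uint64_t PreferredIoSize(const std::string& path) {
  struct stat st;
  if (stat(path.c_str(), &st) != 0) return 0;
  return static_cast<uint64_t>(st.st_blksize);
}

// Hypothetical helper (not part of Kudu): the filesystem block size that
// statvfs(3) reports in f_bsize. Returns 0 if statvfs() fails.
uint64_t FilesystemBlockSize(const std::string& path) {
  struct statvfs vfs;
  if (statvfs(path.c_str(), &vfs) != 0) return 0;
  return static_cast<uint64_t>(vfs.f_bsize);
}

// Arithmetic from the report: if the first block is assumed to occupy
// assumed_block bytes but only fs_block bytes are actually allocated,
// the on-disk size falls short by (assumed_block - fs_block).
uint64_t ExpectedOnDiskSize(uint64_t preallocate_bytes,
                            uint64_t assumed_block,
                            uint64_t fs_block) {
  assert(fs_block <= assumed_block);
  assert(assumed_block - fs_block <= preallocate_bytes);
  return preallocate_bytes - (assumed_block - fs_block);
}
```

With the values from the report, ExpectedOnDiskSize(33554432, 65536, 4096) gives 33492992, which matches the size in the failed assertion. When PreferredIoSize and FilesystemBlockSize return different values for the same data directory, the environment is likely to hit the mismatch described here.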