Hello all, for my master's thesis I'm analyzing different storage policies in OpenStack Swift. I'm mainly interested in the reconstruction speed of the different EC implementations.
I've noticed in my tests that there is no reconstruction of fragments/parity to other nodes/disks when a disk fails. My test setup consists of 8 nodes with 4 disks each, the OS is Ubuntu 16.04 LTS, and the Swift version is 2.15.1 (Pike). Here are my two example policies:

---
[storage-policy:2]
name = liberasurecode-rs-vand-4-2
policy_type = erasure_coding
ec_type = liberasurecode_rs_vand
ec_num_data_fragments = 4
ec_num_parity_fragments = 2
ec_object_segment_size = 1048576

[storage-policy:3]
name = liberasurecode-rs-vand-3-1
policy_type = erasure_coding
ec_type = liberasurecode_rs_vand
ec_num_data_fragments = 3
ec_num_parity_fragments = 1
ec_object_segment_size = 1048576
---

So far I have only tested the ec_type liberasurecode_rs_vand; with the other implementations Swift fails to start, but I think that is a separate topic. To simulate a disk failure I'm using fault injection [1] (exact commands are in the P.S. below).

Test run example:
1. fill with objects (32,768 objects of 1 MB each, 32 GB in total)
2. make a disk "fail"
3. disk failure is detected, /but no reconstruction happens/
4. replace the "failed" disk, mount the "new" empty disk
5. the missing fragments/parity are reconstructed on the new, empty disk

Expected:
1. fill with objects (32,768 objects of 1 MB each, 32 GB in total)
2. make a disk "fail"
3. disk failure is detected, reconstruction to the remaining disks/nodes
4. replace the "failed" disk, mount the "new" empty disk
5. data in the ring is rearranged back to the pre-failure state

Shouldn't the missing fragments/parity be reconstructed on the remaining disks/nodes (see point 3 in the test run example)?

[1] https://www.kernel.org/doc/Documentation/fault-injection/fault-injection.txt

Cheers,
Hannes Fuchs
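P.S. A few details in case someone wants to reproduce the test run. For step 1 I fill the cluster with a simple loop around the python-swiftclient CLI, roughly like this (container and object names are arbitrary choices of mine):

---
# create a 1 MB test object
dd if=/dev/urandom of=/tmp/obj.bin bs=1M count=1

# upload it 32,768 times under different names
# (auth environment variables for the swift client are set beforehand)
for i in $(seq 1 32768); do
    swift upload testcontainer /tmp/obj.bin --object-name "obj-$i"
done
---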
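For step 2 ("failing" a disk) I use the fail_make_request knobs described in [1]; sdX stands for the device backing the disk I want to fail:

---
# mark the block device so that fault injection applies to it
echo 1 > /sys/block/sdX/make-it-fail

# fail 100% of I/O requests, with no limit on the number of failures
echo 100 > /sys/kernel/debug/fail_make_request/probability
echo -1 > /sys/kernel/debug/fail_make_request/times
---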
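And to check steps 3-5 I don't wait for the reconstructor's next cycle, but trigger a single pass by hand on the storage nodes and watch the logs (the log location is the Ubuntu default; adjust if you log elsewhere):

---
# run one reconstruction pass
swift-init object-reconstructor once

# look for reconstructor activity
grep object-reconstructor /var/log/syslog | tail
---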