Research on read/write load balancing for large-scale erasure-coded key-value storage in higher education

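1) Informatization Office, Fudan University, Shanghai 200433; 2) School of Computer Science, Fudan University, Shanghai 200433; 3) Shanghai Key Laboratory of Intelligent Information Processing, Shanghai 200433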

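erasure-coded storage systems; key-value storage systems; cloud storage systems; storage architecture; load balancing; read/write performance optimization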

Efficient load balancing of I/O operations for large-scale erasure-coded key-value storage systems in higher education
SHEN Jiajie(1), ZHU Liangjie(2,3), XIANG Wang(1,2,3), REN Chen(1), and WANG Xin(1,2,3)

1) Informatization Office, Fudan University, Shanghai 200433, P.R. China; 2) School of Computer Science, Fudan University, Shanghai 200433, P.R. China; 3) Shanghai Key Laboratory of Intelligent Information Processing, Shanghai 200433, P.R. China

erasure-coded storage systems; key-value storage systems; cloud storage systems; storage architecture; load balance; I/O performance optimization

DOI: 10.3724/SP.J.1249.2020.99175


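Erasure codes are widely deployed in distributed key-value storage systems to ensure data reliability. By encoding user data and storing it across multiple storage nodes, an erasure-coded storage system can recover the original data even when some nodes fail. As the number of storage nodes grows, however, the load across nodes often becomes unbalanced, which limits the use of such systems in university cloud computing and informatization scenarios. To address this problem, we propose a load-balancing scheme for large-scale erasure-coded key-value storage systems. By separating logical control from storage functions, the storage system can efficiently determine the load state of each storage node. To make full use of the network bandwidth between nodes, we propose a multi-shard coded data transmission scheme. Based on the volume of data written by users, we design a hybrid write mechanism to improve write performance. Building on these designs, we implement a prototype erasure-coded key-value storage system, and tests on the prototype verify the effectiveness of the proposed load-balancing algorithm.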

Erasure codes are widely used in distributed key-value (KV) storage systems to improve data reliability. However, balancing load across storage nodes is a well-known challenge when deploying such erasure-coded storage systems in cloud computing and information service scenarios. To address this problem, we propose a load-balancing scheme for large-scale erasure-coded KV storage systems. By adding control nodes to the storage system, our scheme efficiently obtains the current state of the storage nodes. To improve network bandwidth utilization, we design a multi-shard coded transmission scheme. Based on the data volume of write requests, we further provide an efficient hybrid write scheme. Finally, we implement a prototype erasure-coded storage system and conduct extensive experiments to verify the effectiveness of our scheme.
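
To make the two ideas in the abstract concrete (encoding a value into shards that survive a node failure, and placing shards according to node load), the following is a minimal sketch under stated assumptions, not the scheme proposed in the paper. It assumes a single XOR parity shard (so it tolerates the loss of only one shard), a load table kept as a plain dictionary such as a control node might maintain, and the hypothetical helper names `encode`, `recover`, and `place`.

```python
import heapq
from functools import reduce


def xor_bytes(a: bytes, b: bytes) -> bytes:
    """Byte-wise XOR of two equal-length byte strings."""
    return bytes(x ^ y for x, y in zip(a, b))


def encode(value: bytes, k: int) -> list[bytes]:
    """Split value into k zero-padded data shards plus one XOR parity shard.

    Assumption: single-parity code, chosen only for illustration.
    """
    shard_len = -(-len(value) // k)  # ceiling division
    padded = value.ljust(shard_len * k, b"\0")
    shards = [padded[i * shard_len:(i + 1) * shard_len] for i in range(k)]
    parity = reduce(xor_bytes, shards)
    return shards + [parity]


def recover(shards: list[bytes], lost: int) -> bytes:
    """Rebuild the shard at index `lost` by XOR-ing all surviving shards."""
    survivors = [s for i, s in enumerate(shards) if i != lost]
    return reduce(xor_bytes, survivors)


def place(shards: list[bytes], load: dict[str, int]) -> dict[str, bytes]:
    """Assign each shard to a distinct node, preferring the least-loaded nodes.

    `load` maps a (hypothetical) node name to its stored bytes and is
    updated in place, standing in for a control node's load table.
    """
    if len(load) < len(shards):
        raise ValueError("need at least one node per shard")
    targets = heapq.nsmallest(len(shards), load, key=load.get)
    placement = {}
    for node, shard in zip(targets, shards):
        placement[node] = shard
        load[node] += len(shard)
    return placement
```

For example, `encode(b"hello world!", 3)` yields three 4-byte data shards and one parity shard, and `recover` can rebuild any single missing shard from the other three. A production system would typically use a code that tolerates multiple failures, such as Reed-Solomon.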
