[1] Cai Ye, Liu Gang, Mao Rui, et al. Design and performance optimization of a popular high performance computing system KD-90[J]. Journal of Shenzhen University Science and Engineering, 2013, 30(2): 138-143. [doi:10.3724/SP.J.1249.2013.02138]
Design and Performance Optimization of the KD-90 Popular Personal High-Performance Computer System
Journal of Shenzhen University Science and Engineering [ISSN: 1000-2618 / CN: 44-1401/N]

Volume:
30
Issue:
2013, No. 2 (111-220)
Pages:
138-143
Section:
Electronics and Information Science
Publication date:
2013-03-18

Article Info

Title:
Design and performance optimization of a popular high performance computing system KD-90
Author(s):
Cai Ye, Liu Gang, Mao Rui, Luo Qiuming, and Chen Guoliang
National High Performance Computing Center at Shenzhen University, College of Computer Science and Software Engineering, Shenzhen University, Shenzhen 518060, P.R. China
Keywords:
computer engineering; personal high performance computer system; Godson; parallel architecture; high performance computing
CLC number:
TP 301
DOI:
10.3724/SP.J.1249.2013.02138
Document code:
A
Abstract:
KD-90 is the first teraflops high-performance computer system in China built on the domestically designed 8-core Godson 3B processor. It offers high computing density, low power consumption, low cost, and a small footprint. KD-90 uses a three-level parallel architecture, SMP→CC-NUMA→Cluster, and its interconnect hardware combines a general-purpose protocol with a proprietary protocol, which achieves a breakthrough in the key technologies of the CC-NUMA cluster architecture. Vector-unit acceleration is applied to implement a programming model that combines a general-purpose processor with a vector coprocessor. System performance is optimized with respect to the architectural features and the operating system kernel, and Linpack performance tests and analysis are presented.

Memo:
Received: 2013-01-10; Accepted: 2013-02-28
Foundation: National Science and Technology Major Projects (2009ZX01028-002-003); Shenzhen Science and Technology Foundation (JCYJ20120613102224576)
Corresponding author: Academician Chen Guoliang. E-mail: glchen@szu.edu.cn
Citation: Cai Ye, Liu Gang, Mao Rui, et al. Design and performance optimization of a popular high performance computing system KD-90[J]. Journal of Shenzhen University Science and Engineering, 2013, 30(2): 138-143. (in Chinese)

About the author: Cai Ye (1974-), male (Han), associate professor at Shenzhen University, Ph.D. E-mail: caiye@szu.edu.cn
Last Update: 2013-03-19