The Chip Challenge of Eflops and Zflops

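Challenge 1: Eflops and Zflops chips

A 1000-fold performance increase every 10 years:
1998 (Tflops), 2008 (Pflops), 2018 (Eflops), 2028 (Zflops)
What challenges lie ahead in building large-scale high-performance computers?

Introduction to Parallel Computing, 2022-5-27

Application demand

Few technology products in human history have sustained exponential growth for as long as the peak speed of high-performance computers. Judging from the past 20 years, peak performance improves by about three orders of magnitude every 10 years. Extrapolating from this trend:
- Pflops (10^15 flop/s) computers had already appeared by 2008;
- systems with a peak speed of Eflops (10^18 flop/s) may appear by 2018;
- systems with a peak speed of Zflops (10^21 flop/s) may appear by 2028.

Do we need computers this fast?

An obvious question is whether humanity needs computers this fast, and which applications actually require Eflops or Zflops performance. In practice, application demand for performance is almost unbounded:
- A weather model on a 1 km grid can forecast more accurately, but it needs 20 Pflops of sustained performance. With real applications currently running at about 5% of peak, that implies a peak speed close to 0.5 Eflops.
- Other applications, such as global climate model simulation, need even more computing power.
- Many approximate algorithms in computational chemistry have complexity O(N^4) and can easily "consume" whatever computing power is available.
- Code breaking, weapons development, high-precision weather forecasting, earth system model research, and new materials research all create strong demand for higher-performance computers.

Developing Eflops- and Zflops-class computing systems is therefore an important task for safeguarding China's economic development, scientific and technological progress, and national defense security.

Computing is Pervasive and Powerful

Computing resources have become cheap and prolific.
- Increasingly low cost for fast CPUs and large memory.
- Clusters and the Internet connect computing nodes easily.

Three types of major computing resources:
- High-end systems, e.g. Blue Gene/L, Earth Simulator. Ultra-high performance but expensive (custom-designed nodes/networks).
- Cluster systems, e.g. ICT's Dawning (and many other Top-500 systems). Low cost, but low sustained performance (commodity nodes/networks). Google has been a successfully scalable example.
- Grid systems, e.g. TeraGrid, and Microsoft/IBM "cloud computing". They utilize global computing resources, but with high Internet cost and overhead.

Clients are pervasive everywhere around the globe.
- Desktops, laptops, PDAs, etc. connect to the Internet directly or via wireless.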
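The performance projection and the weather-model estimate in the application-demand slides come down to simple arithmetic, which the following Python sketch makes explicit. The function names and default parameters are hypothetical and are not part of the original slides; the only inputs taken from the text are the 1000x-per-decade trend, the 2008 Pflops baseline, the 20 Pflops sustained requirement, the roughly 5% efficiency, and the N^4 complexity.

```python
# Illustrative arithmetic only; all function names here are hypothetical.

def projected_peak_flops(year, base_year=2008, base_flops=1e15,
                         factor_per_decade=1000.0):
    """Extrapolate peak speed, assuming a constant 1000x gain every 10 years."""
    decades = (year - base_year) / 10
    return base_flops * factor_per_decade ** decades


def required_peak_flops(sustained_flops, efficiency):
    """Peak speed needed to deliver a sustained rate at a given efficiency."""
    if not 0 < efficiency <= 1:
        raise ValueError("efficiency must be in (0, 1]")
    return sustained_flops / efficiency


def cost_growth(size_ratio, exponent=4):
    """Factor by which work grows for an O(N^exponent) algorithm."""
    return size_ratio ** exponent


if __name__ == "__main__":
    for year in (1998, 2008, 2018, 2028):
        print(year, f"{projected_peak_flops(year):.1e} flop/s")

    # 1 km weather model: 20 Pflops sustained at about 5% efficiency.
    peak = required_peak_flops(20e15, 0.05)
    print(f"required peak: {peak / 1e18:.2f} Eflops")

    # Doubling the problem size of an O(N^4) method.
    print("work multiplier:", cost_growth(2))
```

Under these assumptions the weather model needs 20 Pflops / 0.05 = 0.4 Eflops of peak speed, in line with the "close to 0.5 Eflops" figure quoted in the slides, and doubling N in an O(N^4) method multiplies the work by 16.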
Major Resources in Computing and Network Systems

Good news in supply:
- CPU cycles: oversupplied for many applications.
- Memory bandwidth: improved dramatically.
- Memory capacity: increasingly large and low cost.
- I/O bandwidth: improved dramatically.
- Disk capacity: huge and cheap.
- Cluster and Internet bandwidths: very rich.

Bad news in demand:
- CPU cycles per Watt decrease (less energy efficient).
- Cache capacity: always limited.
- Improvement of data access latencies lags significantly behind.

Adam Smith: the balance is guided by an "invisible hand" in the market. We need to balance:
- Oversupplied cycles
- High demand for fast data accesses and low energy cost

1970s-80s: Killer applications demand a lot of CPU cycles
- a single processor was very slow (below 1 MHz)
- challenges: parallel algorithms, architecture, implementation

1980s: communication bottlenecks and the burden of parallel programming (PP)
- challenge I: fast interconnection networks
- challenge II: automatic PP, and shared virtual memory

1990s: "Memory Wall" and utilization of commodity processors
- challenge I: cache design and optimization
- challenge II: Networks of Workstations for HPC

2000s and now: "Disk Wall" and Multi-core processor