ABSTRACT
Compilers are an indispensable component of the software stack. Besides generating machine code, compilers perform multiple optimizations to improve code performance. Typically, scientific programmers treat compilers as a black box and expect them to optimize code thoroughly. However, optimizing compilers are not a performance panacea. They can miss optimization opportunities or even introduce inefficiencies that are not in the source code. Tool infrastructures and datasets that support a systematic study of these shortcomings, and so help programmers understand compiler optimizations, are lacking.
In this paper, we investigate an important compiler optimization: dead and redundant operation elimination. We first develop a tool, CIDetector, to analyze a large number of programs. In our analysis, we select 12 representative programs from different domains to form a dataset called CIBench. We utilize five compilers to optimize CIBench with the highest optimization options available and leverage CIDetector to study each generated binary. We provide insights into two aspects. First, we show that modern compilers miss several optimization opportunities and even introduce some inefficiencies, which require programmers to refactor the source code. Second, we show how compilers have advanced in a vertical evolution (different release versions of the same compiler) and a horizontal comparison (the most recent releases of different compilers). With empirical studies, we provide insights for software engineers, compiler writers, and tool developers.
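To make "dead and redundant operations" concrete, the following is a minimal illustrative sketch in C. It is not taken from the paper or from CIBench; the function names `scale_naive` and `scale_refactored` are hypothetical. It assumes the caller never passes overlapping buffers, which is something a compiler may be unable to prove on its own and is one reason such operations can survive optimization.

```c
#include <stddef.h>

/* Hypothetical kernel (illustration only, not from CIBench).
 * Assumption: out, in, and factor never overlap in practice. */
void scale_naive(double *out, const double *in, const double *factor, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        /* Dead store: the value is overwritten below before anyone reads it. */
        out[i] = 0.0;
        /* Redundant load: *factor is loop-invariant, but because out may
         * alias factor, a compiler may have to reload it every iteration. */
        out[i] = in[i] * (*factor);
    }
}

/* Source-level refactoring under the same no-overlap assumption:
 * the dead store is removed and the invariant load is hoisted. */
void scale_refactored(double *out, const double *in, const double *factor, size_t n)
{
    const double f = *factor;   /* loaded once, outside the loop */
    for (size_t i = 0; i < n; i++) {
        out[i] = in[i] * f;
    }
}
```

Both functions produce the same results when the buffers do not overlap; the second simply does by hand what a compiler might not be able to justify automatically.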
What every scientific programmer should know about compiler optimizations?
Published in
ICS '20: Proceedings of the 34th ACM International Conference on Supercomputing
June 2020
499 pages
ISBN: 9781450379830
DOI: 10.1145/3392717
- General Chairs:
  - Eduard Ayguadé, Universitat Politècnica de Catalunya and Barcelona Supercomputing Center
  - Wen-mei Hwu, University of Illinois at Urbana-Champaign
- Program Chairs:
  - Rosa M. Badia, Barcelona Supercomputing Center and Universitat Politècnica de Catalunya
  - H. Peter Hofstee, IBM Austin
Copyright © 2020 ACM
Publisher
Association for Computing Machinery
New York, NY, United States
Publication History
- Published: 29 June 2020
Qualifiers
- research-article
- Authors:
  - Jialiang Tan, College of William and Mary
  - Shuyin Jiao, College of William and Mary
  - Milind Chabbi, Scalable Machines Research
  - Xu Liu, College of William and Mary
Article link: https://dl.acm.org/doi/abs/10.1145/3392717.3392754