Propeller: Profile Guided Optimizing Large Scale LLVM-based Relinker
Background
We recently evaluated Facebook’s BOLT, a Post Link Optimizer framework, on large Google benchmarks and noticed that it improves key performance metrics of these benchmarks by 2% to 6%, which is impressive given that this is over and above a baseline binary already heavily optimized with ThinLTO + PGO. Furthermore, BOLT is also able to improve the performance of binaries optimized via Context-Sensitive PGO. While ThinLTO + PGO is also profile guided and performs very aggressive optimizations, there is more room for performance improvement because the profiles are approximated as the transformations are applied. BOLT uses exact profiles from the final binary and is able to fill the gaps left by ThinLTO + PGO. The performance improvements due to BOLT come from basic block layout, function reordering, and function splitting.
While BOLT does an excellent job of squeezing extra performance from highly optimized binaries with optimizations such as code layout, it has these major issues:
- It does not take advantage of distributed build systems.
- It has scalability issues. To rewrite a binary with a ~300M text segment size:
  - The memory footprint is 70G.
  - It takes more than 10 minutes to rewrite the binary.
- Similar to Full LTO, BOLT’s design is monolithic: it disassembles the original binary, optimizes, and rewrites the final binary in one process. This limits BOLT’s scalability, and the memory and time overhead shoot up quickly for large binaries.
Inspired by the performance gains and to address the scalability issues of BOLT, we went about designing a scalable infrastructure that can perform BOLT-like post-link optimizations. In this RFC, we discuss our system, “Propeller”, which can perform profile guided link time binary optimizations in a scalable way and is friendly to distributed build systems. Our system leverages the existing capabilities of the compiler toolchain and is not a standalone tool. Like BOLT, our system boosts the performance of optimized binaries via link-time optimizations using accurate profiles of the binary. We discuss the Propeller system and show how to do whole program basic block layout using Propeller.
Propeller does whole program basic block layout at link time via basic block sections. We have added support for placing each basic block in its own section, which allows the linker to do arbitrary reorderings of basic blocks to achieve any desired fine-grained code layout, including basic block layout, function splitting, and function reordering. Our experiments on large real-world applications and SPEC with code layout show that Propeller can optimize as effectively as BOLT, with just 20% of its memory footprint and time overhead.
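As a rough sketch of how this could be driven from the command line: the flag spellings below follow the basic block section and symbol ordering options in upstream Clang and LLD (-fbasic-block-sections and --symbol-ordering-file); the exact interface in the Propeller branch may differ, and foo.cc and layout.txt are placeholder names.

# Compile so that every basic block lands in its own section, which lets the
# linker reorder blocks freely.
clang++ -O2 -ffunction-sections -fbasic-block-sections=all -c foo.cc -o foo.o
# Link with lld and impose the desired fine-grained layout; here the ordering
# is expressed through a symbol ordering file that would be derived from the
# binary's profile.
clang++ -fuse-ld=lld -Wl,--symbol-ordering-file=layout.txt foo.o -o foo

Because each basic block becomes an independent section, the linker is free to interleave hot blocks from different functions, so function splitting and function reordering fall out of the same mechanism.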
An LLVM branch with the Propeller patches is available in the git repository here: https://github.com/google/llvm-propeller/. We will upload patches for review for the various elements.
This directory and its sub-directories contain source code for LLVM, a toolkit for the construction of highly optimized compilers, optimizers, and run-time environments.
The README briefly describes how to get started with building LLVM. For more information on how to contribute to the LLVM project, please take a look at the Contributing to LLVM guide.
Getting Started with the LLVM System
Taken from https://llvm.org/docs/GettingStarted.html .
Overview
Welcome to the LLVM project!
The LLVM project has multiple components. The core of the project is itself called "LLVM". This contains all of the tools, libraries, and header files needed to process intermediate representations and convert them into object files. Tools include an assembler, disassembler, bitcode analyzer, and bitcode optimizer. It also contains basic regression tests.
C-like languages use the Clang front end. This component compiles C, C++, Objective C, and Objective C++ code into LLVM bitcode -- and from there into object files, using LLVM.
Other components include: the libc++ C++ standard library, the LLD linker, and more.
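As a minimal illustration of this flow (hello.c here is just a placeholder source file; the tools shown are the standard LLVM and Clang binaries):

# Compile C source to LLVM bitcode with the Clang front end.
clang -O2 -emit-llvm -c hello.c -o hello.bc
# Inspect the intermediate representation as human-readable LLVM IR.
llvm-dis hello.bc -o hello.ll
# Lower the bitcode to a native object file with the LLVM back end.
llc -filetype=obj hello.bc -o hello.o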
Getting the Source Code and Building LLVM
The LLVM Getting Started documentation may be out of date. The Clang Getting Started page might have more accurate information.
This is an example work-flow and configuration to get and build the LLVM source:
- Checkout LLVM (including related sub-projects like Clang):
  - git clone https://github.com/llvm/llvm-project.git
  - Or, on windows,
    git clone --config core.autocrlf=false https://github.com/llvm/llvm-project.git
- Configure and build LLVM and Clang:
  - cd llvm-project
  - mkdir build
  - cd build
  - cmake -G <generator> [options] ../llvm
    Some common build system generators are:
    - Ninja --- for generating Ninja build files. Most llvm developers use Ninja.
    - Unix Makefiles --- for generating make-compatible parallel makefiles.
    - Visual Studio --- for generating Visual Studio projects and solutions.
    - Xcode --- for generating Xcode projects.
    Some Common options:
    - -DLLVM_ENABLE_PROJECTS='...' --- semicolon-separated list of the LLVM sub-projects you'd like to additionally build. Can include any of: clang, clang-tools-extra, libcxx, libcxxabi, libunwind, lldb, compiler-rt, lld, polly, or debuginfo-tests. For example, to build LLVM, Clang, libcxx, and libcxxabi, use -DLLVM_ENABLE_PROJECTS="clang;libcxx;libcxxabi".
    - -DCMAKE_INSTALL_PREFIX=directory --- Specify for directory the full path name of where you want the LLVM tools and libraries to be installed (default /usr/local).
    - -DCMAKE_BUILD_TYPE=type --- Valid options for type are Debug, Release, RelWithDebInfo, and MinSizeRel. Default is Debug.
    - -DLLVM_ENABLE_ASSERTIONS=On --- Compile with assertion checks enabled (default is Yes for Debug builds, No for all other build types).
  - cmake --build . [-- [options] <target>] or your build system specified above directly.
    - The default target (i.e. ninja or make) will build all of LLVM.
    - The check-all target (i.e. ninja check-all) will run the regression tests to ensure everything is in working order.
    - CMake will generate targets for each tool and library, and most LLVM sub-projects generate their own check-<project> target.
    - Running a serial build will be slow. To improve speed, try running a parallel build. That's done by default in Ninja; for make, use the option -j NNN, where NNN is the number of parallel jobs, e.g. the number of CPUs you have.
  - For more information see CMake. (A consolidated example combining these steps follows this list.)
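Putting the pieces above together, one reasonable end-to-end configuration is sketched below. The choice of generator, project list, and build type is only an example; adjust it to your needs:

cd llvm-project
mkdir build
cd build
# Configure with Ninja, building Clang and LLD in Release mode with assertions.
cmake -G Ninja -DLLVM_ENABLE_PROJECTS="clang;lld" -DCMAKE_BUILD_TYPE=Release -DLLVM_ENABLE_ASSERTIONS=On ../llvm
# Build the default target, then run the regression tests.
ninja
ninja check-all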
Consult the Getting Started with LLVM page for detailed information on configuring and compiling LLVM. You can visit Directory Layout to learn about the layout of the source code tree.