内容简介:Tags:programming,This is just a brief post on the steps I took to confirm, locate, and backport the fix for a compiler bug that was plaguing a (professional) project that I work on. It’s a straightforward application ofI work on LLVM-based program analysis
May 7, 2020
Tags:programming, devblog
This is just a brief post on the steps I took to confirm, locate, and backport the fix for a
compiler bug that was plaguing a (professional) project that I work on. It’s a straightforward
application of git bisect
, with a few observations:
- That even compilers have bugs (and that, depending on the context, the compiler might be the most buggy component)
-
That
git bisect
is incredibly flexible and can work on extremely complex programs - That backporting a fix isn’t always difficult or fraught
Background: the bug
I work on LLVM-based program analysis tools professionally, and currently work on a project that makes heavy use of a modified build of LLVM with some additional features.
Many of the components that I work on assume that LLVM faithfully emits debug intrinsics
(i.e. call void llvm.dbg.whatever
) and location expressions (i.e., DILocation
, DebugLoc
)
for every source-level variable.
This is normally a reasonable assumption, at least for unoptimized or less-optimized compilations, and allows us to make further assumptions about the information we find in DWARF after lowering.
Here’s a source file (minimized from the original) that, when compiled with -O1
, violated the
above assumption:
#include <stdio.h> #include <stdlib.h> struct foo { char buffer[32]; }; int bar(int index) { struct foo stuff = {}; if (index >= sizeof(stuff.buffer)) { return -1; } return (int)stuff.buffer[index]; } int main(int argc, char **argv) { return bar(atoi(argv[1])); }
…and here’s the corresponding LLVM IR for bar
:
; Function Attrs: nounwind readnone uwtable define dso_local i32 @bar(i32) local_unnamed_addr #0 !dbg !13 { %2 = alloca %struct.foo, align 1 call void @llvm.dbg.value(metadata i32 %0, metadata !17, metadata !DIExpression()), !dbg !25 %3 = getelementptr inbounds %struct.foo, %struct.foo* %2, i64 0, i32 0, i64 0, !dbg !26 call void @llvm.lifetime.start.p0i8(i64 32, i8* nonnull %3) #6, !dbg !26 call void @llvm.memset.p0i8.i64(i8* nonnull align 1 %3, i8 0, i64 32, i1 false), !dbg !27 %4 = icmp ugt i32 %0, 31, !dbg !28 br i1 %4, label %10, label %5, !dbg !30 ; <label>:5: ; preds = %1 %6 = sext i32 %0 to i64, !dbg !31 %7 = getelementptr inbounds %struct.foo, %struct.foo* %2, i64 0, i32 0, i64 %6, !dbg !32 %8 = load i8, i8* %7, align 1, !dbg !32, !tbaa !33 %9 = sext i8 %8 to i32, !dbg !36 br label %10, !dbg !37 ; <label>:10: ; preds = %1, %5 %11 = phi i32 [ %9, %5 ], [ -1, %1 ], !dbg !38 call void @llvm.lifetime.end.p0i8(i64 32, i8* nonnull %3) #6, !dbg !39 ret i32 %11, !dbg !39 }
Some things immediately stand out:
-
Despite there being two variables in scope (the
index
argument andstuff
), only onellvm.dbg.*
intrinsic is present. Inspecting the debug nodes reveals that it’s forindex
, notstuff
:!17 = !DILocalVariable(name: "index", arg: 1, scope: !13, file: !1, line: 8, type: !4)
(This makes sense, as
index
is likely to be stored in a register on our target platform and thus has anllvm.dbg.value
intrinsic rather than a memory-based one.) -
We have lifetime intrinsics (
llvm.lifetime.start
andllvm.lifetime.end
) for the structure as well as the expectedalloca
andgetelementptr
for creating and indexing into it. Some of these (thegetelementptr
andllvm.lifetime.start
) also have attached metadata that references our expected location forstuff
:!26 = !DILocation(line: 9, column: 3, scope: !13)
-
The
DILocalVariable
forstuff
is embedded in the IR, but is completely orphaned by the function other than a reference inretainedNodes
:!13 = distinct !DISubprogram(name: "bar", scope: !1, file: !1, line: 8, type: !14, isLocal: false, isDefinition: true, scopeLine: 8, flags: DIFlagPrototyped,isOptimized: true, unit: !0, retainedNodes: !16) !16 = !{!17, !18} !18 = !DILocalVariable(name: "stuff", scope: !13, file: !1, line: 9, type: !19)
The
DILocalVariable
forindex
is also inretainedNodes
, curiously, despite not being orphaned and therefore not requiring separate retention.
All told, this is extremely suspicious: we’re missing an intrinsic, a debug node is incorrectly
marked as “retained”, and there’s no connection between our (orphaned) DILocalVariable
and its DILocation
.
Unsurprisingly, this variable/location disconnect results in broken DWARF information for stuff
,
specifically in the form of a missing DW_AT_location
(which we would expect, based on the alloca
, to reflect a stack location):
0x00000043: DW_TAG_subprogram DW_AT_low_pc (0x00000000004004d0) DW_AT_high_pc (0x00000000004004f0) DW_AT_frame_base (DW_OP_reg7 RSP) DW_AT_name ("bar") DW_AT_decl_file ("/tmp/junk.c") DW_AT_decl_line (8) DW_AT_prototyped (true) DW_AT_type (0x0000002a "int") DW_AT_external (true) 0x0000005c: DW_TAG_formal_parameter DW_AT_location (DW_OP_reg5 RDI) DW_AT_name ("index") DW_AT_decl_file ("/tmp/junk.c") DW_AT_decl_line (8) DW_AT_type (0x0000002a "int") 0x00000069: DW_TAG_variable DW_AT_name ("stuff") DW_AT_decl_file ("/tmp/junk.c") DW_AT_decl_line (9) DW_AT_type (0x000000ad "foo")
First steps: reproducing
Since we’re dealing with a modified LLVM 7 here, the very first step is to make sure that we’re not the cause of our own problem.
A quick trip to Godbolt settles that, showing the same missing intrinsic on a reference build of LLVM/Clang 7:
define dso_local i32 @bar(i32) local_unnamed_addr #0 !dbg !13 { %2 = alloca %struct.foo, align 1 call void @llvm.dbg.value(metadata i32 %0, metadata !18, metadata !DIExpression()), !dbg !26 %3 = getelementptr inbounds %struct.foo, %struct.foo* %2, i64 0, i32 0, i64 0, !dbg !27 call void @llvm.lifetime.start.p0i8(i64 32, i8* nonnull %3) #6, !dbg !27 call void @llvm.memset.p0i8.i64(i8* nonnull align 1 %3, i8 0, i64 32, i1 false), !dbg !28 %4 = icmp ugt i32 %0, 31, !dbg !29 br i1 %4, label %10, label %5, !dbg !31 %6 = sext i32 %0 to i64, !dbg !32 %7 = getelementptr inbounds %struct.foo, %struct.foo* %2, i64 0, i32 0, i64 %6, !dbg !33 %8 = load i8, i8* %7, align 1, !dbg !33, !tbaa !34 %9 = sext i8 %8 to i32, !dbg !37 br label %10, !dbg !38 %11 = phi i32 [ %9, %5 ], [ -1, %1 ], !dbg !39 call void @llvm.lifetime.end.p0i8(i64 32, i8* nonnull %3) #6, !dbg !40 ret i32 %11, !dbg !40 }
( Permalink )
Finding the fix
At this point, the panic settles in: there’s a bug in the compiler that isn’t (directly) our fault, and we need to fix it in order to continue.
Compilers are complicated beasts, and we’d rather not find and fix bugs ourselves if we don’t have to. So, we return to Godbolt and see whether someone else has done the hard work for us sometime between LLVM 7 (released 09/2019) and the latest trunk build.
Indeed! Compiling with a trunk build ( permalink ) shows two debug intrinsics, as expected, with correctly attached location metadata on each call:
define dso_local i32 @bar(i32 %0) local_unnamed_addr #0 !dbg !13 { %2 = alloca %struct.foo, align 1 call void @llvm.dbg.value(metadata i32 %0, metadata !18, metadata !DIExpression()), !dbg !26 %3 = getelementptr inbounds %struct.foo, %struct.foo* %2, i64 0, i32 0, i64 0, !dbg !27 call void @llvm.lifetime.start.p0i8(i64 32, i8* nonnull %3) #7, !dbg !27 call void @llvm.dbg.declare(metadata %struct.foo* %2, metadata !19, metadata !DIExpression()), !dbg !28 call void @llvm.memset.p0i8.i64(i8* nonnull align 1 dereferenceable(32) %3, i8 0, i64 32, i1 false), !dbg !28 %4 = icmp ugt i32 %0, 31, !dbg !29 br i1 %4, label %10, label %5, !dbg !31 5: ; preds = %1 %6 = sext i32 %0 to i64, !dbg !32 %7 = getelementptr inbounds %struct.foo, %struct.foo* %2, i64 0, i32 0, i64 %6, !dbg !33 %8 = load i8, i8* %7, align 1, !dbg !33, !tbaa !34 %9 = sext i8 %8 to i32, !dbg !37 br label %10, !dbg !38 10: ; preds = %1, %5 %11 = phi i32 [ %9, %5 ], [ -1, %1 ], !dbg !26 call void @llvm.lifetime.end.p0i8(i64 32, i8* nonnull %3) #7, !dbg !39 ret i32 %11, !dbg !39 }
So let’s walk backwards.
LLVM 10 also emits two intrinsics, but LLVM 9 is identical to our buggy LLVM 7 IR:
define dso_local i32 @bar(i32) local_unnamed_addr #0 !dbg !13 { %2 = alloca %struct.foo, align 1 call void @llvm.dbg.value(metadata i32 %0, metadata !18, metadata !DIExpression()), !dbg !26 %3 = getelementptr inbounds %struct.foo, %struct.foo* %2, i64 0, i32 0, i64 0, !dbg !27 call void @llvm.lifetime.start.p0i8(i64 32, i8* nonnull %3) #6, !dbg !27 call void @llvm.memset.p0i8.i64(i8* nonnull align 1 %3, i8 0, i64 32, i1 false), !dbg !28 %4 = icmp ugt i32 %0, 31, !dbg !29 br i1 %4, label %10, label %5, !dbg !31 5: ; preds = %1 %6 = sext i32 %0 to i64, !dbg !32 %7 = getelementptr inbounds %struct.foo, %struct.foo* %2, i64 0, i32 0, i64 %6, !dbg !33 %8 = load i8, i8* %7, align 1, !dbg !33, !tbaa !34 %9 = sext i8 %8 to i32, !dbg !37 br label %10, !dbg !38 10: ; preds = %1, %5 %11 = phi i32 [ %9, %5 ], [ -1, %1 ], !dbg !26 call void @llvm.lifetime.end.p0i8(i64 32, i8* nonnull %3) #6, !dbg !39 ret i32 %11, !dbg !39 }
So we know that someone fixed this bug, somewhere between the LLVM 9 (09/2019) and LLVM 10 (03/2020) releases. That’s reassuring, but doesn’t get us any closer to actually patching our build. More work is needed.
Finding the fix, for real
Knowing that our desired bugfix is somewhere in the changeset between LLVM 9 and LLVM 10 leaves us with a few naive strategies:
-
We could
grep
through thegit log
between the releases, trying to key on something likeDWARF
,DebugInfo
,bug
, or similar. - We could try every commit between the two releases, stopping when the behavior changes.
Neither of these is particularly appealing: the unstructured nature of commit messages makes it difficult to ensure that we aren’t over- or under-constraining our search, and individually compiling all 10,000+ commits between the two releases would take us weeks.
git bisect
to the rescue
git bisect
, unsurprisingly, is the “right” answer to exactly this problem. From the manpage
:
This command uses a binary search algorithm to find which commit in your project’s history introduced a bug. You use it by first telling it a “bad” commit that is known to contain the bug, and a “good” commit that is known to be before the bug was introduced. Then git bisect picks a commit between those two endpoints and asks you whether the selected commit is “good” or “bad”. It continues narrowing down the range until it finds the exact commit that introduced the change.
We’re going forwards in history and looking for a bugfix rather than a regression, so the good
and bad
identifiers can be a little confusing. To avoid confusing ourselves too
much, we use git bisect
’s alternative old
and new
terms instead.
We also, of course, need to actually run something
on each bisected revision to determine
whether it’s old
or new
. The LLVM project build is, blessedly, straightforward:
$ mkdir build && cd build $ cmake -G Ninja -DLLVM_TARGETS_TO_BUILD="X86" -DLLVM_ENABLE_PROJECTS="clang" ../llvm $ cmake --build . $ ./bin/clang --version $ ./bin/clang -O1 -g /tmp/testcase.c -o /tmp/testcase # the -n flag filters DWARF info on DW_AT_name, which in our case is the "stuff" variable $ ./bin/llvm-dwarfdump -n stuff /tmp/testcase | grep 'DW_AT_location'
(We use Ninja and restrict our backends to x86 for faster compilations; there are other definitions we could add to prune the build but these were sufficient for my purposes.)
Finally, we actually perform the bisection. From the repo root, we begin with:
$ git checkout llvmorg-9.0.0 $ git bisect start # we know that 9.0 has the bug and 10.0 doesn't, so we can tag them immediately $ git bisect old $ git bisect new llvmorg-10.0.0
then, perform our actual test steps on the current revision:
$ cd build # leverage the CMake cache to speed things up, if it's still valid $ cmake --build . $ ./bin/clang -O1 -g /tmp/testcase.c -o /tmp/testcase $ ./bin/llvm-dwarfdump -n stuff /tmp/testcase | grep 'DW_AT_location' $ old_or_new=$([[ $? -eq 0 ]] && echo new || echo old) $ cd .. $ git bisect ${old_or_new}
…and repeat until no commits are left:
cbf1f3b771c8
: [Debuginfo][SROA] Need to handle dbg.value in SROA pass.
diff --git a/llvm/lib/Transforms/Utils/Local.cpp b/llvm/lib/Transforms/Utils/Local.cpp index a1d1b3ec84f0..7242eb1d3de2 100644 --- a/llvm/lib/Transforms/Utils/Local.cpp +++ b/llvm/lib/Transforms/Utils/Local.cpp @@ -1378,7 +1378,12 @@ void llvm::ConvertDebugDeclareToDebugValue(DbgVariableIntrinsic *DII, /// Determine whether this alloca is either a VLA or an array. static bool isArray(AllocaInst *AI) { return AI->isArrayAllocation() || - AI->getType()->getElementType()->isArrayTy(); + (AI->getAllocatedType() && AI->getAllocatedType()->isArrayTy()); +} + +/// Determine whether this alloca is a structure. +static bool isStructure(AllocaInst *AI) { + return AI->getAllocatedType() && AI->getAllocatedType()->isStructTy(); } /// LowerDbgDeclare - Lowers llvm.dbg.declare intrinsics into appropriate set @@ -1403,7 +1408,7 @@ bool llvm::LowerDbgDeclare(Function &F) { // stored on the stack, while the dbg.declare can only describe // the stack slot (and at a lexical-scope granularity). Later // passes will attempt to elide the stack slot. - if (!AI || isArray(AI)) + if (!AI || isArray(AI) || isStructure(AI)) continue; // A volatile load/store means that the alloca can't be elided anyway
To explain: the LLVM optimizer includes a pass named SRoA, short for “Scalar Replacement of Aggregates”.
SRoA is run by default on -O1
and higher, and attempts to replace aggregate values
(like arrays and structures) with scalar SSA values.
An effective SRoA run requires that the llvm.dbg.declare
for a value be lowered to one or more llvm.dbg.value
s, as the SSA value may or may not reside at a memory location (it might be in a
register, like index
).
The bug occurs because of this lowering behavior: the relevant helper ( LowerDbgDeclare
) fails
to check for one of the valid aggregate types (structures) and incorrectly elides the intrinsic
without inserting any replacements. The fix is fundamentally a single line; the changes to isArray
and addition of isStructure
only serve to make it more readable.
Five minutes later, we have a patched LLVM 7 with the correct behavior:
$ ./bin/clang --version clang version 7.0.0 Target: x86_64-unknown-linux-gnu Thread model: posix InstalledDir: /store/william/llvm-project/build/./bin $ ./bin/clang -O1 -g /tmp/testcase.c -o /tmp/testcase $ ./bin/llvm-dwarfdump -n stuff /tmp/testcase | grep 'DW_AT_location' /tmp/testcase: file format elf64-x86-64 0x00000069: DW_TAG_variable DW_AT_location (DW_OP_fbreg -120) DW_AT_name ("stuff") DW_AT_decl_file ("/tmp/testcase.c") DW_AT_decl_line (9) DW_AT_type (0x00000078 "foo")
以上就是本文的全部内容,希望本文的内容对大家的学习或者工作能带来一定的帮助,也希望大家多多支持 码农网
猜你喜欢:本站部分资源来源于网络,本站转载出于传递更多信息之目的,版权归原作者或者来源机构所有,如转载稿涉及版权问题,请联系我们。