PHI 指令的确切作用以及如何在 LLVM 中使用它

LLVM 有 Phi指令,解释很奇怪:

“ phi”指令用于实现 SSA 图中表示函数的 φ 节点。

通常用于实现分支。如果我理解正确的话,那么需要使依赖性分析成为可能,并且在某些情况下,它可以帮助避免不必要的加载。然而,仍然很难理解它到底是做什么的。

万花筒 例子解释它相当不错的 if情况下。然而,如何实现像 &&||这样的逻辑操作还不是很清楚。如果我在 在线 LLVM编译器中键入以下内容:

void main1(bool r, bool y) {
bool l = y || r;
}

最后几行完全把我弄糊涂了:

; <label>:10                                      ; preds = %7, %0
%11 = phi i1 [ true, %0 ], [ %9, %7 ]
%12 = zext i1 %11 to i8

看起来 phi 节点生成了一个可以使用的结果。我记得 phi node 只定义路径值的来源。

谁能解释一下什么是 Phi 节点,以及如何用它实现 ||

43553 次浏览

A phi node is an instruction used to select a value depending on the predecessor of the current block (Look here to see the full hierarchy - it's also used as a value, which is one of the classes which it inherits from).

Phi nodes are necessary due to the structure of the SSA (static single assignment) style of the LLVM code - for example, the following C++ function

void m(bool r, bool y){
bool l = y || r ;
}

gets translated into the following IR: (created through clang -c -emit-llvm file.c -o out.bc - and then viewed through llvm-dis)

define void @_Z1mbb(i1 zeroext %r, i1 zeroext %y) nounwind {
entry:
%r.addr = alloca i8, align 1
%y.addr = alloca i8, align 1
%l = alloca i8, align 1
%frombool = zext i1 %r to i8
store i8 %frombool, i8* %r.addr, align 1
%frombool1 = zext i1 %y to i8
store i8 %frombool1, i8* %y.addr, align 1
%0 = load i8* %y.addr, align 1
%tobool = trunc i8 %0 to i1
br i1 %tobool, label %lor.end, label %lor.rhs


lor.rhs:                                          ; preds = %entry
%1 = load i8* %r.addr, align 1
%tobool2 = trunc i8 %1 to i1
br label %lor.end


lor.end:                                          ; preds = %lor.rhs, %entry
%2 = phi i1 [ true, %entry ], [ %tobool2, %lor.rhs ]
%frombool3 = zext i1 %2 to i8
store i8 %frombool3, i8* %l, align 1
ret void
}

So what happens here? Unlike the C++ code, where the variable bool l could be either 0 or 1, in the LLVM IR it has to be defined once. So we check if %tobool is true, and then jump to lor.end or lor.rhs.

In lor.end we finally have the value of the || operator. If we arrived from the entry block - then it's just true. Otherwise, it is equal to the value of %tobool2 - and that's exactly what we get from the following IR line:

%2 = phi i1 [ true, %entry ], [ %tobool2, %lor.rhs ]

You don't need to use phi at all. Just create bunch of temporary variables. LLVM optimization passes will take care of optimizing temporary variables away and will use phi node for that automatically.

For example, if you want to do this:

x = 4;
if (something) x = x + 2;
print(x);

You can use phi node for that (in pseudocode):

  1. assign 4 to x1
  2. if (!something) branch to 4
  3. calculate x2 from x1 by adding 2
  4. assign x3 phi from x1 and x2
  5. call print with x3

But you can do without phi node (in pseudocode):

  1. allocate local variable on stack called x
  2. load into temp x1 value 4
  3. store x1 to x
  4. if (!something) branch to 8
  5. load x to temp x2
  6. add x2 with 4 to temp x3
  7. store x3 to x
  8. load x to temp x4
  9. call print with x4

By running optimization passes with llvm this second code will get optimized to first code.

The existing answers are good. But, I want to make it even simpler and shorter.

block3:
%result = phi i32 [%a, %block1], [%b, %block2]

This means that if the previous block was block1, choose value a. If the previous block was block2, choose value b.

Why do we write like this? This is to prevent assignign result in two different blocks such as if block and else block. Because, we do not want to violate SSA principle. SSA helps compilers to apply variety of optimizations and it is a de-facto standard for the intermediate codes. For more information, refer to this reference.