Searching the predecessor blocks for a value – Basics of IR Code Generation
By Reginald Bellamy / December 27, 2022 / No Comments / Emitting the function body, IT Certifications, LLVM IR REFERENCE, Technical requirements, Understanding the IR code
If the current basic block we are looking at has only one predecessor, then we search there for the value of the variable. If the basic block has several predecessors, then we need to search for the value in all these blocks and combine the results. To illustrate this situation, you can look at the basic block with the condition of a WHILE statement from the previous section.
This basic block has two predecessors – the one resulting from the statement before the WHILE statement and the one resulting from the branch for the end of the body of the WHILE loop. A variable that’s used in the condition should have some initial value and will most likely be changed in the body of the loop. So, we need to collect these definitions and create a phi instruction from it. The basic blocks that are created from the WHILE statement contain a cycle.
Because we recursively search the predecessor blocks, we must break this cycle. To do so, we can use a simple trick: we can insert an empty phi instruction and record this as the current value of the variable. If we see this basic block again in our search, then we’ll see that the variable has a value that we can use. The search stops at this point. Once we’ve collected all the values, we must update the phi instruction.
However, we will still face a problem. At the time of the lookup, not all predecessors of a basic block may be known. How can this happen? Look at the creation of the basic blocks for the WHILE statement. The IR for the condition of the loop is generated first. However, the branch from the end of the body that goes back to the basic block, which contains the condition, can only be added after the IR for the body is generated. This is because this basic block is not known earlier. If we need to read the value of a variable in the condition, then we are stuck, because not all predecessors are known.
To solve this situation, we must do a little more:
- First, we must attach a Sealed flag to the basic block.
- Then, we must define a basic block as sealed if we know all the predecessors of the basic block. If the basic block is not sealed and we need to look up the value of the variable not yet defined in this basic block, then we must insert an empty phi instruction and use it as the value.
- We also need to remember this instruction. If the block is later sealed, then we need to update the instruction with the real values. To implement this, we must add two more members to struct BasicBlockDef: the IncompletePhis map, which records the phi instructions we need later to update, and the Sealed flag, which indicates if the basic block is sealed:
llvm::DenseMap IncompletePhis;
unsigned Sealed : 1;
- Then, the method can be implemented, as discussed at the beginning of this section:
llvm::Value *CGProcedure::readLocalVariableRecursive(
llvm::BasicBlock *BB, Decl *Decl) {
llvm::Value *Val = nullptr;
if (!CurrentDef[BB].Sealed) {
llvm::PHINode *Phi = addEmptyPhi(BB, Decl);
CurrentDef[BB].IncompletePhis[Phi] = Decl;
Val = Phi;
} else if (auto *PredBB = BB->getSinglePredecessor()) {
Val = readLocalVariable(PredBB, Decl);
} else {
llvm::PHINode *Phi = addEmptyPhi(BB, Decl);
writeLocalVariable(BB, Decl, Phi);
Val = addPhiOperands(BB, Decl, Phi);
}
writeLocalVariable(BB, Decl, Val);
return Val;
}
- The addEmptyPhi() method inserts an empty phi instruction at the beginning of the basic block:
llvm::PHINode *
CGProcedure::addEmptyPhi(llvm::BasicBlock *BB,
Decl *Decl) {
return BB->empty()
? llvm::PHINode::Create(mapType(Decl), 0,
“”, BB)
: llvm::PHINode::Create(mapType(Decl), 0,
“”, &BB->front());
}
- To add the missing operands to the phi instruction, first, we must search all the predecessors of the basic block and add the operand pair value and basic block to the phi instruction. Then, we must try to optimize the instruction:
llvm::Value *
CGProcedure::addPhiOperands(llvm::BasicBlock *BB,
Decl *Decl,
llvm::PHINode *Phi) {
for (auto *PredBB : llvm::predecessors(BB))
Phi->addIncoming(readLocalVariable(PredBB, Decl),
PredBB);
return optimizePhi(Phi);
}
This algorithm can generate unneeded phi instructions. One approach to optimize these will be implemented in the next section.