Emitting assembler text and object code – Basics of IR Code Generation-1
By Reginald Bellamy / November 18, 2023
In LLVM, the IR code is run through a pipeline of passes. Each pass performs a single task, such as removing dead code. We’ll learn more about passes in Chapter 7, Optimizing IR. Outputting assembler code or an object file is implemented as a pass too. Let’s add basic support for it!
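To make the idea of a pass pipeline concrete, here is a toy sketch in plain C++. It does not use the LLVM API: ToyModule, ToyPassManager, and the two pass factories are hypothetical names invented for this illustration. It only shows that a transformation (dropping dead instructions) and an output step (printing) can both be written as passes and run in order by the same manager.

```cpp
#include <algorithm>
#include <functional>
#include <ostream>
#include <sstream>
#include <string>
#include <utility>
#include <vector>

// Toy stand-in for a module: just a list of instruction strings.
// This is NOT llvm::Module; it only illustrates the pipeline idea.
struct ToyModule {
  std::vector<std::string> Instructions;
};

// A pass is any callable that inspects or transforms the module.
using ToyPass = std::function<void(ToyModule &)>;

// Hypothetical pass manager: stores passes and runs them in order.
class ToyPassManager {
  std::vector<ToyPass> Passes;

public:
  void add(ToyPass P) { Passes.push_back(std::move(P)); }
  void run(ToyModule &M) {
    for (auto &P : Passes)
      P(M);
  }
};

// Transformation pass: drops instructions whose text starts with "dead ".
inline ToyPass createToyDeadCodePass() {
  return [](ToyModule &M) {
    auto &I = M.Instructions;
    I.erase(std::remove_if(I.begin(), I.end(),
                           [](const std::string &S) {
                             return S.rfind("dead ", 0) == 0;
                           }),
            I.end());
  };
}

// Output pass: leaves the module unchanged and prints it to a stream.
inline ToyPass createToyPrintPass(std::ostream &OS) {
  return [&OS](ToyModule &M) {
    for (const auto &S : M.Instructions)
      OS << S << '\n';
  };
}

// Runs both passes over a small module and returns the printed text.
inline std::string runToyPipeline() {
  ToyModule M{{"load x", "dead add", "store y"}};
  std::ostringstream OS;
  ToyPassManager PM;
  PM.add(createToyDeadCodePass());
  PM.add(createToyPrintPass(OS));
  PM.run(M);
  return OS.str();
}
```

The order of the add() calls matters: the print pass sees the module only after the dead-code pass has run, so the removed instruction never reaches the output.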
We need to include even more LLVM header files. First, we need the llvm::legacy::PassManager class to hold the passes that emit code to a file. We also want to be able to output LLVM IR code, so we need a pass that emits it. Finally, we’ll use the llvm::ToolOutputFile class for the file operations:
#include "llvm/IR/IRPrintingPasses.h"
#include "llvm/IR/LegacyPassManager.h"
#include "llvm/MC/TargetRegistry.h"
#include "llvm/Pass.h"
#include "llvm/Support/ToolOutputFile.h"
Another command-line option for outputting LLVM IR is also needed:
static llvm::cl::opt<bool> EmitLLVM(
    "emit-llvm",
    llvm::cl::desc("Emit IR code instead of assembler"),
    llvm::cl::init(false));
Finally, we want to be able to give the output file a name:
static llvm::cl::opt<std::string>
    OutputFilename("o",
                   llvm::cl::desc("Output filename"),
                   llvm::cl::value_desc("filename"));
The first task in the new emit() method is to deal with the name of the output file if it’s not given by the user on the command line. If the input is read from stdin, indicated by the use of the minus symbol, -, then we output the result to stdout. The ToolOutputFile class knows how to handle the special filename, -:
bool emit(StringRef Argv0, llvm::Module *M,
          llvm::TargetMachine *TM,
          StringRef InputFilename) {
  CodeGenFileType FileType = codegen::getFileType();
  if (OutputFilename.empty()) {
    if (InputFilename == "-") {
      OutputFilename = "-";
    }
Otherwise, we drop the .mod extension from the input filename, if present, and append .ll, .s, or .o as an extension, depending on the command-line options given by the user. The FileType option is defined in the llvm/CodeGen/CommandFlags.h header file, which we included earlier. This option doesn’t support emitting IR code, so we’ve added the new --emit-llvm option, which only takes effect if it’s used together with the assembly file type:
    else {
      if (InputFilename.endswith(".mod"))
        OutputFilename =
            InputFilename.drop_back(4).str();
      else
        OutputFilename = InputFilename.str();
      switch (FileType) {
      case CGFT_AssemblyFile:
        OutputFilename.append(EmitLLVM ? ".ll" : ".s");
        break;
      case CGFT_ObjectFile:
        OutputFilename.append(".o");
        break;
      case CGFT_Null:
        OutputFilename.append(".null");
        break;
      }
    }
  }
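The naming rules are easier to check when written as a pure function. The following is a minimal sketch in standard C++ only, assuming the same rules as the code above. ToyFileType and deriveOutputName are hypothetical names for this illustration and are not part of LLVM or of the tool.

```cpp
#include <string>

// Hypothetical stand-in for the three file types handled above.
enum class ToyFileType { Assembly, Object, Null };

// Returns the output filename that the rules above would pick.
// Explicit is the name given with -o, or "" if the option is absent.
std::string deriveOutputName(const std::string &Input,
                             const std::string &Explicit,
                             ToyFileType Type, bool EmitLLVM) {
  if (!Explicit.empty())
    return Explicit; // A name given by the user always wins.
  if (Input == "-")
    return "-"; // Input from stdin means output to stdout.

  // Drop a trailing ".mod"; any other extension is kept.
  const std::string Ext = ".mod";
  std::string Out = Input;
  if (Out.size() >= Ext.size() &&
      Out.compare(Out.size() - Ext.size(), Ext.size(), Ext) == 0)
    Out.erase(Out.size() - Ext.size());

  // The IR flag only matters for the assembly file type.
  switch (Type) {
  case ToyFileType::Assembly:
    Out += EmitLLVM ? ".ll" : ".s";
    break;
  case ToyFileType::Object:
    Out += ".o";
    break;
  case ToyFileType::Null:
    Out += ".null";
    break;
  }
  return Out;
}
```

For example, an input of Gcd.mod yields Gcd.s for assembly output, Gcd.ll when the IR flag is also set, and Gcd.o for an object file regardless of the IR flag.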
Some platforms distinguish between text and binary files, so we have to provide the right open flags when opening the output file:
  std::error_code EC;
  sys::fs::OpenFlags OpenFlags = sys::fs::OF_None;
  if (FileType == CGFT_AssemblyFile)
    OpenFlags |= sys::fs::OF_TextWithCRLF;
  auto Out = std::make_unique<llvm::ToolOutputFile>(
      OutputFilename, EC, OpenFlags);
  if (EC) {
    WithColor::error(llvm::errs(), Argv0)
        << EC.message() << '\n';
    return false;
  }
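The same text-versus-binary distinction exists in the C++ standard library, with the default inverted: a std::ofstream opens in text mode unless std::ios::binary is requested, whereas the code above starts from binary and opts in to text mode. The sketch below shows that analogy using standard streams only. The helpers openModeFor and writeOutput are hypothetical names and are not part of the tool.

```cpp
#include <fstream>
#include <ios>
#include <string>

// Picks the open mode for an output file. Text output (assembler or IR)
// uses the default text mode; object code must be opened as binary so
// that no newline translation happens on platforms that perform it.
std::ios::openmode openModeFor(bool IsText) {
  std::ios::openmode Mode = std::ios::out | std::ios::trunc;
  if (!IsText)
    Mode |= std::ios::binary;
  return Mode;
}

// Opens Path with the matching mode and writes Data to it.
// Returns false if the file cannot be opened or the write fails.
bool writeOutput(const std::string &Path, const std::string &Data,
                 bool IsText) {
  std::ofstream Out(Path, openModeFor(IsText));
  if (!Out)
    return false;
  Out << Data;
  return static_cast<bool>(Out);
}
```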
Now, we can add the required passes to PassManager. The TargetMachine class has a utility method that adds the passes needed for the requested file type. Therefore, we only need to check whether the user has requested LLVM IR output:
  legacy::PassManager PM;
  if (FileType == CGFT_AssemblyFile && EmitLLVM) {
    PM.add(createPrintModulePass(Out->os()));
  } else {
    if (TM->addPassesToEmitFile(PM, Out->os(), nullptr,
                                FileType)) {
      WithColor::error(llvm::errs(), Argv0)
          << "No support for file type\n";
      return false;
    }
  }