Implementing single inheritance – IR Generation for High-Level Language Constructs-1
By Reginald Bellamy / June 15, 2024
A class is a collection of data and methods. A class can inherit from another class, potentially adding more data fields and methods, or overriding existing virtual methods. Let’s illustrate this with classes in Oberon-2, which is also a good model for tinylang. A Shape class defines an abstract shape with a color and an area:
TYPE Shape = RECORD
color: INTEGER;
PROCEDURE (VAR s: Shape) GetColor(): INTEGER;
PROCEDURE (VAR s: Shape) Area(): REAL;
END;
The GetColor method only returns the color number:
PROCEDURE (VAR s: Shape) GetColor(): INTEGER;
BEGIN RETURN s.color; END GetColor;
The area of an abstract shape cannot be calculated, so this is an abstract method:
PROCEDURE (VAR s: Shape) Area(): REAL;
BEGIN HALT; END Area;
The Shape type can be extended to represent a Circle class:
TYPE Circle = RECORD (Shape)
radius: REAL;
PROCEDURE (VAR s: Circle) Area(): REAL;
END;
For a circle, the area can be calculated:
PROCEDURE (VAR s: Circle) Area(): REAL;
BEGIN RETURN 3.14159 * s.radius * s.radius; END Area;
The type can also be queried at runtime. If shape is a variable of type Shape, then we can formulate a type test in this way:
IF shape IS Circle THEN (* … *) END;
The different syntax aside, this works much like it does in C++. One notable difference from C++ is that the Oberon-2 syntax makes the implicit this pointer explicit, calling it the receiver of a method.
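For comparison, here is a rough C++ counterpart of the two Oberon-2 classes. It is only a sketch: the helper isCircle is a hypothetical name introduced for illustration, std::abort() stands in for HALT, and the constant 3.14159 approximates pi.

```cpp
#include <cstdlib>

// Rough C++ counterpart of the Oberon-2 Shape record.
class Shape {
public:
  explicit Shape(long color) : color(color) {}
  virtual ~Shape() = default;

  // The explicit receiver `s` of the Oberon-2 method becomes the implicit `this`.
  long GetColor() const { return color; }

  // Stand-in for the "abstract" method: calling it halts the program.
  virtual float Area() const { std::abort(); }

private:
  long color;
};

// Counterpart of TYPE Circle = RECORD (Shape) ... END.
class Circle : public Shape {
public:
  Circle(long color, float radius) : Shape(color), radius(radius) {}

  // Overrides the virtual method of the base class.
  float Area() const override { return 3.14159f * radius * radius; }

private:
  float radius;
};

// Hypothetical helper mirroring the runtime type test `shape IS Circle`.
bool isCircle(const Shape &shape) {
  return dynamic_cast<const Circle *>(&shape) != nullptr;
}
```

Calling Area() through a Shape reference that actually refers to a Circle runs the Circle version, while isCircle reports the dynamic type. These are the two behaviors the generated IR has to reproduce.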
The basic problems to solve are how to lay out a class in memory, how to implement the dynamic call of methods, and how to implement runtime type checking. The memory layout is quite easy. The Shape class has only one data member, and we can map it to a corresponding LLVM structure type:
%Shape = type { i64 }
The Circle class adds another data member. The solution is to append the new data member at the end:
%Circle = type { i64, float }
The reason for appending is that a class can have many subclasses. With this strategy, a data member of the common base class has the same memory offset in every subclass, so the same index can be used to access the field via the getelementptr instruction.
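The effect of appending new members can be checked with a small C++ sketch. The struct names ShapeData and CircleData and the helper loadColor are hypothetical; the structs merely mirror the two LLVM structure types, assuming INTEGER maps to a 64-bit integer and REAL to a 32-bit float.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>

// Plain structs mirroring the LLVM structure types for Shape and Circle.
struct ShapeData {
  std::int64_t color;
};

struct CircleData {
  std::int64_t color;  // inherited member keeps index 0
  float radius;        // new member is appended at the end
};

// The base-class member sits at the same offset in both layouts.
static_assert(offsetof(ShapeData, color) == 0, "color is the first field");
static_assert(offsetof(CircleData, color) == offsetof(ShapeData, color),
              "subclass keeps the base-class offset");

// Reads the color of any object whose layout starts with the Shape fields,
// without knowing the concrete type.
std::int64_t loadColor(const void *object) {
  std::int64_t color;
  std::memcpy(&color,
              static_cast<const char *>(object) + offsetof(ShapeData, color),
              sizeof(color));
  return color;
}
```

Because the offset is fixed, code compiled against Shape alone can read the color of a Circle, or of any later subclass, without being recompiled.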
To implement the dynamic call of a method, we must further extend the LLVM structure. If the Area() function is called on a Shape object, then the abstract method is called, causing the application to halt. If it is called on a Circle object, then the corresponding method to calculate the area of a circle is called. On the other hand, the GetColor() function can be called for objects of both classes.
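One common way to get this behavior is to store a pointer to a per-class table of function pointers in every object. The sketch below models that idea by hand in C++. It is an illustration under assumptions, not necessarily the layout tinylang ends up with, and all names (VTable, ShapeObj, CircleObj, callArea, and so on) are hypothetical.

```cpp
#include <cstdint>
#include <cstdlib>

struct ShapeObj;

// Hypothetical method table: one function pointer per virtual method.
struct VTable {
  float (*Area)(const ShapeObj *self);
};

// Every object starts with a pointer to the table of its class.
struct ShapeObj {
  const VTable *vtable;
  std::int64_t color;
};

// The base part comes first, so a CircleObj can be viewed as a ShapeObj.
struct CircleObj {
  ShapeObj base;
  float radius;
};

// "Abstract" method: halts the program when called.
float shapeArea(const ShapeObj *) { std::abort(); }

// Circle's override: recovers the full object from the base pointer.
float circleArea(const ShapeObj *self) {
  const CircleObj *circle = reinterpret_cast<const CircleObj *>(self);
  return 3.14159f * circle->radius * circle->radius;
}

const VTable ShapeVTable = {shapeArea};
const VTable CircleVTable = {circleArea};

// GetColor is not overridden, so it can be called directly for both classes.
std::int64_t getColor(const ShapeObj *self) { return self->color; }

// Dynamic call: look up the function in the object's table, then call it.
float callArea(const ShapeObj *self) { return self->vtable->Area(self); }
```

With this arrangement, callArea on an object whose table is CircleVTable computes the circle's area, while the same call on an object using ShapeVTable reaches the halting stub.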