A Complete Guide to LLVM for Programming Language Creators

Last modified on December 26, 2020

This post made the front page of Hacker News and Reddit. Thank you all!

Who’s this tutorial for?

This series of compiler tutorials is for those who don’t just want to build a toy language.
You want to have objects. You want to have polymorphism. You want to have concurrency. You want to have garbage collection. Wait, you don’t want GC? Okay, no worries, we won’t do that 😛

If you’ve just joined the series at this stage, here’s a quick recap. We’re designing a Java-esque concurrent object-oriented programming language called Bolt. We’ve gone through the compiler frontend, where we covered parsing, type-checking and dataflow analysis. We’ve desugared our language to get it ready for LLVM - the main takeaway is that objects were desugared to structs, and their methods desugared to functions.

Learn about LLVM and you’ll be the envy of your friends. Rust uses LLVM for its backend, so it must be cool. You’ll beat them on all those performance benchmarks, without having to hand-optimise your code or write machine assembly code. Shhhh, I won’t tell them.

Just give me the code!

The full code can be found in the Bolt compiler repository.

The C++ class definitions for our desugared representation (we call this Bolt IR) can be found in the deserialise_ir folder. The code for this post (the LLVM IR generation) can be found in the llvm_ir_codegen folder. The repo uses the Visitor design pattern and makes heavy use of std::unique_ptr to make memory management easier.

To cut through the boilerplate and see how to generate LLVM IR for a particular language expression, search for the IRCodegenVisitor::codegen method that takes in the corresponding ExprIR object. e.g. for if-else statements:

Value *IRCodegenVisitor::codegen(const ExprIfElseIR &expr) {

...

}

Understanding LLVM IR

LLVM sits in the middle-end of our compiler, after we’ve desugared our language features, but before the backends that target specific machine architectures (x86, ARM etc.)

LLVM’s IR is quite low-level: it can’t contain language features present in some languages but not others (e.g. classes are present in C++ but not C). If you’ve come across instruction sets before, LLVM IR is a RISC instruction set.

The upshot of this is that LLVM IR looks like a more readable form of assembly. As LLVM IR is machine independent, we don’t need to worry about the number of registers, the sizes of datatypes, calling conventions or other machine-specific details.

So instead of a fixed number of physical registers, in LLVM IR we have an unlimited set of virtual registers (labelled %0, %1, %2, %3…) that we can write to and read from. It’s the backend’s job to map from virtual to physical registers.

And rather than allocating specific sizes of datatypes, we retain types in LLVM IR. Again, the backend will take this type information and map it to the size of the datatype. LLVM has types for different sizes of ints and floats, e.g. int32, int8, int1 etc. It also has derived types: like pointer types, array types, struct types and function types. To find out more, check out the Type documentation.

Now, built into LLVM is a set of optimisations we can run over the LLVM IR, e.g. dead-code elimination, function inlining, common subexpression elimination etc. The details of these algorithms are irrelevant: LLVM implements them for us.

Our side of the bargain is that we write LLVM IR in Static Single Assignment (SSA) form, as SSA form makes life easier for optimisation writers. SSA form sounds fancy, but it just means we define variables before use and assign to variables only once. In SSA form, we cannot reassign to a variable, e.g. x = x + 1; instead we assign to a fresh variable each time (x2 = x1 + 1).
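As a quick sketch (assuming %x0 already holds x’s current value), incrementing x twice in SSA-style LLVM IR assigns to a fresh virtual register each time:

```llvm
%x1 = add i32 %x0, 1   ; x1 = x0 + 1 - a fresh register, not a reassignment
%x2 = add i32 %x1, 1   ; incrementing again writes yet another fresh register
```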

So in short: LLVM IR looks like assembly with types, minus the messy machine-specific details. LLVM IR must be in SSA form, which makes it easier to optimise. Let’s look at an example!

An example: Factorial

Let’s look at a simple factorial function in our language Bolt:

factorial.bolt

function int factorial(int n){

if (n==0) {

1

}

else{

n * factorial(n - 1)

}

}

The corresponding LLVM IR is as follows:

factorial.ll

define i32 @factorial(i32) {

entry:

%eq = icmp eq i32 %0, 0

br i1 %eq, label %then, label %else

then: ; preds = %entry

br label %ifcont

else: ; preds = %entry

%sub = sub i32 %0, 1

%2 = call i32 @factorial(i32 %sub)

%mult = mul i32 %0, %2

br label %ifcont

ifcont: ; preds = %else, %then

%iftmp = phi i32 [ 1, %then ], [ %mult, %else ]

ret i32 %iftmp

}

Note the .ll extension is for human-readable LLVM IR output. There’s also .bc for bitcode, a more compact machine representation of LLVM IR.

We’ll walk through this IR at four levels of detail:

At the Instruction Level:

Notice how LLVM IR contains assembly instructions like br and icmp, but abstracts away the messy machine-specific details of function calling conventions with a single call instruction.




[Figure: Factorial instructions]

At the Control Flow Graph Level:

If we take a step back, you can see that the IR defines the control flow graph of the program. IR instructions are grouped into labelled basic blocks, and the preds labels for each block list its incoming edges. e.g. the ifcont basic block has predecessors then and else:

At this point, I’m going to assume you’ve come across Control Flow Graphs and basic blocks. We introduced Control Flow Graphs in a previous post in the series, where we used them to perform different dataflow analyses on the program. I’d recommend you go and check out the CFG section of that dataflow analysis post now. I’ll wait right here 🙂




[Figure: Factorial control flow graph]

The phi instruction represents conditional assignment: assigning different values depending on which preceding basic block we’ve just come from. It is of the form phi type [val1, predecessor1], [val2, predecessor2], ... In the example above, we set %iftmp to 1 if we’ve come from the then block, and %mult if we’ve come from the else block. Phi nodes must be at the start of a block, and include one entry for each predecessor.

At the Function Level:

Taking another step back, the overall structure of a function in LLVM IR is as follows:




[Figure: Factorial function]

At the Module Level:

An LLVM module contains all the information associated with a program file. (For multi-file programs, we’d link together their corresponding modules.)




[Figure: LLVM module]

Our factorial function is just one function definition in our module. If we want to execute the program, e.g. to compute factorial(10), we need to define a
main function, which will be the entrypoint for our program’s execution. The main function’s signature is a hangover from C (we return 0 to indicate successful execution):

example_program.c

int main(){

factorial(10);

return 0;

}

We specify that we want to compile for an Intel MacBook Pro in the module target information:

example_module.ll

source_filename = "Module"

target triple = "x86_64-apple-darwin18.7.0"

...

define i32 @factorial(i32) {

...

}

define i32 @main() {

entry:

%0 = call i32 @factorial(i32 10)

ret i32 0

}

The LLVM API: Key Concepts

Now that we’ve got the basics of LLVM IR down, let’s introduce the LLVM API. We’ll go over the key concepts, then introduce more of the API as we explore LLVM IR further.

LLVM defines a whole host of classes that map to the concepts we’ve talked about:

  • Value
  • Module
  • Type
  • Function
  • BasicBlock
  • BranchInst

These are all in the namespace llvm. In the Bolt repo, I chose to make this namespacing explicit by referring to them as llvm::Value, llvm::Module etc.

A lot of the LLVM API is very mechanical. Now that you’ve seen the diagrams that define modules, functions and basic blocks, the relationship between their corresponding classes in the API falls out nicely. You can query a Module object to get a list of its Function objects, and query a Function to get the list of its BasicBlocks, and the other way around: you can query a BasicBlock to get its parent Function object.

Value is the base class for any value computed by the program. This could be a function (Function subclasses Value), a basic block (BasicBlock also subclasses Value), an instruction, or the result of an intermediate computation.

Each of the expression codegen methods returns a Value *: the result of executing that expression. You can think of these codegen methods as producing the IR for that expression, with the Value * representing the virtual register containing the expression’s result.

ir_codegen_visitor.h

virtual Value *codegen(const ExprIntegerIR &expr) override;

virtual Value *codegen(const ExprBooleanIR &expr) override;

virtual Value *codegen(const ExprIdentifierIR &expr) override;

virtual Value *codegen(const ExprConstructorIR &expr) override;

virtual Value *codegen(const ExprLetIR &expr) override;

virtual Value *codegen(const ExprAssignIR &expr) override;

How do we generate the IR for these expressions? We create a single Context object to tie our whole code generation together. We use this Context to access core LLVM data structures, e.g. LLVM modules and IRBuilder objects.

We’ll use the context to create just one module, which we’ll imaginatively name "Module".

ir_codegen_visitor.cc

context = std::make_unique<LLVMContext>();

builder = std::unique_ptr<IRBuilder<>>(new IRBuilder<>(*context));

module = std::make_unique<Module>("Module", *context);

IRBuilder

We use the IRBuilder object to incrementally build up our IR. It is intuitively the equivalent of a file pointer when reading/writing a file - it carries around implicit state, e.g. the last instruction added, the basic block of that instruction etc. Like moving a file pointer around, you can set the builder object to insert instructions at the end of a particular basic block with the SetInsertPoint(BasicBlock *TheBB) method. Likewise, you can get the current basic block with GetInsertBlock().

The builder object has Create___() methods for each of the IR instructions, e.g. CreateLoad for a load instruction, CreateSub and CreateFSub for integer and floating point sub instructions respectively, etc. Some Create___() methods take an optional Twine argument: this is used to give the result’s register a custom name. e.g. iftmp is the twine for the following instruction:

%iftmp = phi i32 [ 1, %then ], [ %mult, %else ]

Check the IRBuilder docs to find the method corresponding to your instruction.

Types and Constants

We don’t construct these directly; instead we get__() them from their corresponding classes. (LLVM keeps track of one instance of each distinct type / constant and hands back a reference to it.)

For example, we use getSigned to get a constant signed integer of a given type and value, and getInt32Ty to get the int32 type.

expr_codegen.cc

Value *IRCodegenVisitor::codegen(const ExprIntegerIR &expr) {

  return ConstantInt::getSigned((Type::getInt32Ty(*context)),

                                expr.val);

};

Function types are similar: we can use FunctionType::get. Function types consist of the return type, an array of the types of the params, and whether or not the function is variadic:
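To see what this corresponds to at the IR level (a sketch; @printf is shown purely as a familiar variadic example, not Bolt output), a function type pairs a return type with its parameter types, and variadic functions end the parameter list with ...:

```llvm
declare i32 @factorial(i32)    ; function type: i32 (i32)
declare i32 @printf(i8*, ...)  ; variadic function type: i32 (i8*, ...)
```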

Type declarations

We can define our own custom struct types.

e.g. a Tree with an int value, and pointers to the left and right subtrees:

%Tree = type { i32, %Tree*, %Tree* }

Defining a custom struct type is a two-stage process.

First we create the type with that name. This adds it to the module’s symbol table. The type is opaque at this point: we can reference it in other type declarations, e.g. function types or other struct types, but we can’t create structs of this type (as we don’t know what’s in it).

StructType *treeType = StructType::create(*context, StringRef("Tree"));

LLVM boxes up strings and arrays using StringRef and ArrayRef. You can pass a string directly where the docs require a StringRef, but I prefer to make the StringRef explicit, as above.

The second step is to specify an array of types that make up the struct body. Note that since we’ve defined the opaque Tree type, we can get a Tree * type using the Tree type’s getPointerTo() method.

treeType->setBody(ArrayRef<Type *>({Type::getInt32Ty(*context), treeType->getPointerTo(), treeType->getPointerTo()}));

So if you have custom struct types that refer to other custom struct types in their bodies, the correct approach is to first define all the opaque custom struct types, and then fill in each of the structs’ bodies.

class_codegen.cc

void IRCodegenVisitor::codegenClasses(

    const std::vector<std::unique_ptr<ClassIR>> &classes) {

  for (auto &currClass : classes) {

    StructType::create(*context, StringRef(currClass->className));

  }

  for (auto &currClass : classes) {

    std::vector<Type *> bodyTypes;

    for (auto &field : currClass->fields) {

      bodyTypes.push_back(field->codegen(*this));

    }

    StructType *classType =

        module->getTypeByName(StringRef(currClass->className));

    classType->setBody(ArrayRef<Type *>(bodyTypes));

  }

}

Functions

Functions are defined in a similar two-step process:

  1. Define the function prototypes
  2. Fill in their function bodies (skip this if you’re linking in an external function!)

The function prototype consists of the function name, the function type, the “linkage” information and the module whose symbol table we want to add the function to. We choose External linkage - this means the function prototype is viewable externally. This means that we can link in an external function definition (e.g. if using a library function), or expose our function definition in another module. You can see the full enum of linkage options here.

function_codegen.cc

Function::Create(functionType, Function::ExternalLinkage,

                 function->functionName, module.get());

To generate the function definition, we just need to use the API to build the control flow graph we talked about in our factorial example:

function_codegen.cc

void IRCodegenVisitor::codegenFunctionDefn(const FunctionIR &function) {

  Function *llvmFun =

      module->getFunction(function.functionName);

  BasicBlock *entryBasicBlock =

      BasicBlock::Create(*context, "entry", llvmFun);

  builder->SetInsertPoint(entryBasicBlock);

  ...

The official Kaleidoscope tutorial has a very good explanation of how to build a control flow graph for an if-else statement.

More LLVM IR Concepts

Now that we’ve covered the basics of LLVM IR and the API, we’ll look at some more LLVM IR concepts and introduce the corresponding API function calls alongside them.

Stack allocation

There are two ways we can store values in local variables in LLVM IR. We’ve seen the first: assignment to virtual registers. The second is dynamic memory allocation on the stack using the alloca instruction. While we can store ints, floats and pointers in either the stack or virtual registers, aggregate datatypes, like structs and arrays, don’t fit in registers, so they have to be stored on the stack.

Yes, you read that right. Unlike most programming language memory models, where we use the heap for dynamic memory allocation, in LLVM we just have a stack.

Heaps are not provided by LLVM - they are a library feature. For single-threaded programs, stack allocation is sufficient. We’ll tackle the need for a global heap in multi-threaded programs in the next post (where we extend Bolt to support concurrency).

We’ve seen struct types, e.g. {i32, i1, i32}. Array types are of the form [num_elems x elem_type]. Note that num_elems is a constant - you need to provide it when generating the IR, not at runtime. So [3 x i32] is valid but [n x i32] is not.
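For instance (a quick sketch, not Bolt compiler output), allocating array types on the stack looks like this; note both element counts are compile-time constants:

```llvm
%arr = alloca [3 x i32]        ; room for three i32s
%mat = alloca [2 x [3 x i32]]  ; array types nest: a 2 x 3 matrix of i32s
```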

We give alloca a type, and it allocates a block of memory on the stack and returns a pointer to it, which we can store in a register. We can then use this pointer to load and store values from/to the stack.

For example, storing a 32-bit int on the stack:

%p = alloca i32

store i32 1, i32* %p

%1 = load i32, i32* %p

The corresponding builder instructions are… you guessed it: CreateAlloca, CreateLoad and CreateStore.
CreateAlloca returns a special subclass of Value *: an AllocaInst *:

AllocaInst *ptr = builder->CreateAlloca(Type::getInt32Ty(*context),

                                        "p");

ptr->getAllocatedType(); // returns the i32 type

builder->CreateLoad(ptr);

builder->CreateStore(someVal, ptr);

Global variables

Just as we alloca local variables on the stack, we can create global variables and load from and store to them.

Global variables are declared at the start of a module, and are part of the module symbol table.




[Figure: Global variables]

We can use the module object to create named global variables, and to query them:

module->getOrInsertGlobal(globalVarName, globalVarType);

...

GlobalVariable *globalVar = module->getNamedGlobal(globalVarName);

Global variables must be initialised with a constant value (not a variable):

globalVar->setInitializer(initValue);

Alternatively, we can do this in one go using the GlobalVariable constructor:

GlobalVariable *globalVar = new GlobalVariable(module, globalVarType, false,

    GlobalValue::ExternalLinkage, initValue, globalVarName);

As before, we can load and store:

builder->CreateLoad(globalVar);

builder->CreateStore(someVal, globalVar);
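Putting the pieces together, here is a sketch (the global’s name @counter is chosen purely for illustration) of a global declared at the top of a module with a constant initialiser, then read and updated inside a function:

```llvm
@counter = global i32 0          ; initialised with a constant

define void @increment() {
entry:
  %val = load i32, i32* @counter
  %inc = add i32 %val, 1
  store i32 %inc, i32* @counter
  ret void
}
```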

GEPs

We have a base pointer to an aggregate type (array / struct) on the stack or in global memory, but what if we want a pointer to a specific element? We’d need to work out the pointer offset of that element within the aggregate, and then add this to the base pointer to get the element’s address. Calculating the pointer offset is machine-specific, e.g. it depends on the sizes of the datatypes, the struct padding etc.

The Get Element Pointer (GEP) instruction applies the pointer offset to the base pointer and returns the resulting pointer.

Consider two arrays starting at p, one of chars and one of ints. Following C convention, we can represent a pointer to each array as char* and int* respectively.




[Figure: Pointer offsets]

Below we show the GEP instructions that calculate the pointer p+1 for each of the arrays:

%idx1 = getelementptr i8, i8* %p, i64 1

%idx2 = getelementptr i32, i32* %p, i64 1

This GEP instruction is a bit of a mouthful, so here’s a breakdown:




[Figure: GEP breakdown]

The i64 1 index adds multiples of the base type to the base pointer. p+1 for i8 adds 1 byte, whereas p+1 for i32 adds 4 bytes to p. If the index were i64 0, we’d get back p itself.

The LLVM API method for creating a GEP is… CreateGEP.

Value *ptr = builder->CreateGEP(baseType, basePtr, arrayOfIndices);

Wait? Array of indices? Yes, the GEP instruction can take multiple indices. We’ve just looked at a simple example where we only needed one index.

Before we look at the case where we pass multiple indices, I want to reiterate the purpose of this first index:

A pointer of type Foo * can represent, as in C, the base pointer of an array of Foos. The first index adds multiples of this base type Foo to traverse this array.
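As a sketch: given a pointer %p of type %Foo*, a lone first index simply strides over an array of Foos:

```llvm
; pointer to the third Foo in the array starting at %p:
; p + 2 * (size of Foo)
%third = getelementptr %Foo, %Foo* %p, i64 2
```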

GEPs with Structs

Okay, now let’s look at structs. Consider a struct of type Foo:

%Foo = type { i32, [4 x i32], i32 }

We want to index specific fields in the struct. The natural approach is to label them fields 0, 1 and 2. We can access field 2 by passing it to the GEP instruction as an additional index.

%ThirdFieldPtr = getelementptr %Foo, %Foo* %ptr, i64 0, i32 2

The returned pointer is then calculated as: ptr + 0 (multiples of Foo) + the offset of field 2 (of Foo).

For structs, you’ll likely always pass the first index as 0. The biggest confusion with GEPs is that this 0 can seem redundant: we want field 2, so why are we passing a 0 index first? Hopefully you can see from the first example why we need that 0. Think of it as passing GEP the base pointer of an implicit Foo array of size 1.

To avoid the confusion, LLVM has a special CreateStructGEP method that asks only for the field index (this is just CreateGEP with the 0 added for you):

Value *thirdFieldPtr = builder->CreateStructGEP(baseStructType, basePtr, fieldIndex);

The more nested our aggregate structure, the more indices we can provide. E.g. for element 2 of Foo’s second field (the 4-element int array):

getelementptr %Foo, %Foo* %ptr, i64 0, i32 1, i32 2

The pointer returned is: ptr + 0 (multiples of Foo) + the offset of field 1 (of Foo) + the offset of element 2 (of the array).

(In terms of the corresponding API, we’d use CreateGEP and pass the array of indices {0, 1, 2}.)

LLVM’s “The Often Misunderstood GEP Instruction” doc explains GEPs nicely.

mem2reg

If you recall, LLVM IR must be written in SSA form. But what happens if the Bolt source program we’re trying to map to LLVM IR is not in SSA form?

For example, suppose we’re reassigning x, as in x = x + 1.

One option would be to rewrite the program into SSA form in an earlier compiler stage. Every time we reassign a variable, we’d have to create a fresh variable. We’d also have to introduce phi nodes for conditional statements. For our example this is simple, but in general this extra rewrite is a hassle we’d rather not deal with.

We can use pointers to avoid assigning fresh variables. Note that we aren’t reassigning the pointer x, just updating the value it points to. So this is valid SSA.

This switch to pointers is a much simpler transformation than variable renaming. It also has a very natural LLVM IR equivalent: allocating stack memory (and manipulating pointers to the stack) rather than reading from registers.

So every time we declare a local variable, we use alloca to get a pointer to freshly allocated stack space. We then use the load and store instructions to read and update the value the pointer points to:

reassign_var.ll

%x = alloca i32

store i32 1, i32* %x

%1 = load i32, i32* %x

%2 = add i32 %1, 1

store i32 %2, i32* %x

Let’s revisit the LLVM IR we’d get if we instead rewrote the Bolt program to use fresh variables. It’s only 2 instructions, compared with the 5 instructions needed when using the stack. Moreover, we avoid the expensive load and store memory access instructions.

assign_fresh_vars.ll

%x1 = add i32 0, 1

%x2 = add i32 %x1, 1

So while we’ve made our lives easier as compiler writers by avoiding a rewrite-to-SSA pass, this has come at the expense of performance.

Fortunately, LLVM lets us have our cake and eat it.

LLVM provides a mem2reg optimisation pass that promotes stack memory accesses to register accesses. We just need to ensure that we declare all the allocas for our local variables in the entry basic block of the function.

How do we do this if the local variable declaration occurs halfway through the function, in another block? Let’s look at an example:

let x : int = someVal;

%x = alloca i32

store i32 someVal, i32* %x

We can actually move the alloca. It doesn’t matter where we allocate the stack space, so long as it is allocated before use. So let’s write the alloca at the very start of the parent function in which this local variable declaration occurs.

How do we do this in the API? Well, remember the analogy of the builder being like a file pointer? We can have multiple file pointers into different locations in the file. Likewise, we can instantiate a new IRBuilder pointing at the start of the entry basic block of the parent function, and insert the alloca instructions using that builder.

expr_codegen.cc

Function *parentFunction = builder->GetInsertBlock()

    ->getParent();

IRBuilder<> TmpBuilder(&(parentFunction->getEntryBlock()),

    parentFunction->getEntryBlock().begin());

AllocaInst *var = TmpBuilder.CreateAlloca(boundVal->getType());

builder->CreateStore(someVal, var);


LLVM Optimisations

The API makes it really easy to add passes. We create a FunctionPassManager, add the optimisation passes we’d like, and then initialise the manager.

ir_codegen_visitor.cc

std::unique_ptr<legacy::FunctionPassManager> functionPassManager =

    std::make_unique<legacy::FunctionPassManager>(module.get());

// Promote allocas to registers (mem2reg).
functionPassManager->add(createPromoteMemoryToRegisterPass());

// Simple "peephole" optimisations.
functionPassManager->add(createInstructionCombiningPass());

// Reassociate expressions.
functionPassManager->add(createReassociatePass());

// Eliminate common subexpressions.
functionPassManager->add(createGVNPass());

// Simplify the control flow graph (e.g. delete unreachable blocks).
functionPassManager->add(createCFGSimplificationPass());

functionPassManager->doInitialization();

We run this on each of the program’s functions:

ir_codegen_visitor.cc

for (auto &function : functions) {

  Function *llvmFun =

      module->getFunction(StringRef(function->functionName));

  functionPassManager->run(*llvmFun);

}

Function *llvmMainFun = module->getFunction(StringRef("main"));

functionPassManager->run(*llvmMainFun);

To see this in action, let’s look at the factorial LLVM IR output by our Bolt compiler before and after optimisation. You can find both in the repo:

factorial-unoptimised.ll

define i32 @factorial(i32) {

entry:

%n = alloca i32

store i32 %0, i32* %n

%1 = load i32, i32* %n

%eq = icmp eq i32 %1, 0

br i1 %eq, label %then, label %else

then: ; preds = %entry

br label %ifcont

else: ; preds = %entry

%2 = load i32, i32* %n

%3 = load i32, i32* %n

%sub = sub i32 %3, 1

%4 = name i32 @factorial(i32 %sub)

%mult = mul i32 %2, %4

br label %ifcont

ifcont: ; preds = %else, %then

%iftmp = phi i32 [ 1, %then ], [ %mult, %else ]

ret i32 %iftmp

}

And the optimised version:

factorial-optimised.ll

define i32 @factorial(i32) {

entry:

%eq = icmp eq i32 %0, 0

br i1 %eq, label %ifcont, label %else

else: ; preds = %entry

%sub = add i32 %0, -1

%1 = name i32 @factorial(i32 %sub)

%mult = mul i32 %1, %0

br label %ifcont

ifcont: ; preds = %entry, %else

%iftmp = phi i32 [ %mult, %else ], [ 1, %entry ]

ret i32 %iftmp

}

Notice how we’ve got rid of the alloca and the associated load and store instructions, and also eliminated the then basic block entirely!

Wrap up

This simple example shows you the power of LLVM and its optimisations. You can find the top-level code that runs the LLVM code generation and optimisation in the main.cc file in the Bolt repository.

In the next few posts we’ll be looking at some more advanced language features: generics, inheritance and method overriding.
