Saturday, March 27, 2021

Calling Convention in LLVM


Introduction

A calling convention is a specification that describes how parameters are passed to a function and how its return value is handed back to the caller. Different processor architectures may have different calling conventions. For example, the common 32-bit x86 C convention passes function parameters on the stack, whereas ARM passes the first few parameters in registers and the rest on the stack.

Because calling conventions differ between processors, the LLVM framework gives backend developers a flexible way to implement their own. Most of the relevant source code lives in <Arch>ISelLowering.cpp and <Arch>CallingConv.td. In this article, we take the Sparc backend as an example to see how LLVM (LLVM 3.2) handles the calling convention for integer types.

Source Code Analysis

First, let's take a look at SparcCallingConv.td. This file defines two calling conventions: RetCC_Sparc32 and CC_Sparc32. RetCC_Sparc32 describes how return values are returned from a function, and CC_Sparc32 describes how parameters are passed to a function. Taking CC_Sparc32 as an example, it says that i32 (and f32) parameters are assigned to the registers I0 to I5 while any of them are free, so the first 6 such parameters go in registers and the rest go on the stack.
// Sparc 32-bit C return-value convention.
def RetCC_Sparc32 : CallingConv<[
  CCIfType<[i32], CCAssignToReg<[I0, I1, I2, I3, I4, I5]>>,
  ...
]>;

// Sparc 32-bit C Calling convention.
def CC_Sparc32 : CallingConv<[
  ...
  // i32 f32 arguments get passed in integer registers if there is space.
  CCIfType<[i32, f32], CCAssignToReg<[I0, I1, I2, I3, I4, I5]>>,
  ...
  // Alternatively, they are assigned to the stack in 4-byte aligned units.
  CCAssignToStack<4, 4>
]>;
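The allocation rule that CC_Sparc32 describes can be sketched in plain C++. This is only an illustrative model, not LLVM code: ArgLoc and assignI32Args are hypothetical names, only i32 arguments are considered, and the stack offsets are relative to the start of the argument area rather than to any real frame register.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical stand-in for the location LLVM records per argument.
struct ArgLoc {
  bool inReg;            // true: passed in a register, false: on the stack
  std::string reg;       // register name, meaningful only when inReg is true
  unsigned stackOffset;  // byte offset in the argument area when !inReg
};

// Models CCAssignToReg<regs> followed by CCAssignToStack<4, 4> for i32
// values: use the next free register, otherwise the next 4-byte stack slot.
std::vector<ArgLoc> assignI32Args(std::size_t numArgs,
                                  const std::vector<std::string> &regs) {
  std::vector<ArgLoc> locs;
  std::size_t nextReg = 0;
  unsigned nextOffset = 0;
  for (std::size_t i = 0; i < numArgs; ++i) {
    if (nextReg < regs.size()) {
      locs.push_back({true, regs[nextReg++], 0});
    } else {
      locs.push_back({false, "", nextOffset});
      nextOffset += 4;  // slot size 4, alignment 4
    }
  }
  assert(locs.size() == numArgs);  // every argument received a location
  return locs;
}
```

With the six registers I0 to I5 and eight arguments, this model puts the first six arguments in registers and the last two at stack offsets 0 and 4. Shrinking the register list to just I0 pushes every argument after the first onto the stack.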
Second, let's look at SparcISelLowering.cpp, which contains three important functions:
  • LowerFormalArguments(): handles the function's parameters in the callee
  • LowerReturn(): handles the return values in the callee
  • LowerCall(): handles the function's parameters and return values in the caller
In this article, we focus on LowerFormalArguments(), which can be roughly split into two steps:
  • Get the information about where the parameters are placed
CCInfo.AnalyzeFormalArguments(Ins, CC_Sparc32);
In this piece of code, Ins represents the set of function parameters and CC_Sparc32 is a function auto-generated from SparcCallingConv.td. AnalyzeFormalArguments() visits each parameter in Ins and uses CC_Sparc32 to decide the location of each parameter. The analysis result is stored in the container ArgLocs.

  • Generate the DAG nodes that read the parameters from registers or the stack
In this step, the function iterates over each entry in ArgLocs and generates a DAG node according to the location of the parameter. There are two cases:

    • VA.isRegLoc() returns true.

This means the parameter is placed in a register. Therefore, a CopyFromReg DAG node is generated to copy the parameter into a virtual register.

    • VA.isMemLoc() returns true.
This means the parameter is placed on the stack. First, the FrameIndex that represents the position of the parameter on the stack is computed. Then, a Load DAG node is generated to load the parameter from the stack into a virtual register.
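The two cases can be sketched as a small C++ model. It is a simplification under stated assumptions, not the code in SparcISelLowering.cpp: ArgLoc, Node and lowerFormalArgs are hypothetical names, DAG nodes are reduced to an opcode plus one operand string, and frame indices are simply numbered from 0 in creation order.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical stand-in for one entry of ArgLocs.
struct ArgLoc {
  bool isReg;          // plays the role of VA.isRegLoc()
  std::string reg;     // register name when isReg is true
  unsigned memOffset;  // offset in the argument area when isReg is false
};

// Hypothetical, heavily simplified DAG node.
struct Node {
  std::string opcode;   // "CopyFromReg" or "Load"
  std::string operand;  // register name or "FrameIndex<N>"
};

// Walks the argument locations and emits one node per argument.
// frameObjects collects the offsets of the stack slots that were created;
// the position in that vector serves as the frame index.
std::vector<Node> lowerFormalArgs(const std::vector<ArgLoc> &argLocs,
                                  std::vector<unsigned> &frameObjects) {
  std::vector<Node> nodes;
  for (const ArgLoc &va : argLocs) {
    if (va.isReg) {
      // Register case: copy the incoming register into a virtual register.
      nodes.push_back({"CopyFromReg", va.reg});
    } else {
      // Stack case: create a frame object first, then load through it.
      frameObjects.push_back(va.memOffset);
      std::size_t fi = frameObjects.size() - 1;
      nodes.push_back({"Load", "FrameIndex<" + std::to_string(fi) + ">"});
    }
  }
  assert(nodes.size() == argLocs.size());  // one node per argument
  return nodes;
}
```

The point of the model is that the Load node refers to an abstract frame index and never to a concrete address.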
Some of you may have a question:
Why do we load the parameters from a "FrameIndex"? According to the calling convention of Sparc, shouldn't we load the parameters from %fp + 92, %fp + 96, ...?
Yes, you are right! However, at this stage LLVM does not yet know the actual layout of the stack frame, because some stack-related information is still unknown (e.g., spilled registers or callee-saved registers). Therefore, LLVM provides a hook function, eliminateFrameIndex(), that lets backend developers compute the real position from the FrameIndex; this hook is called in a later pass. We will take a look at this function in a future article.
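The idea of resolving a frame index late can be illustrated with a minimal C++ sketch. It shows only the arithmetic, not what eliminateFrameIndex() really does to machine instructions. FrameObject, kArgAreaBase and resolveFrameIndex are hypothetical names, and the base of 92 is an assumption taken from the "%fp + 92" in the question above.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Assumption: stack-passed arguments start at %fp + 92, as in the text.
constexpr int kArgAreaBase = 92;

// Hypothetical record of one stack slot, created while lowering arguments.
struct FrameObject {
  int offsetInArgArea;  // byte offset relative to the start of the area
};

// Turns an abstract frame index into a concrete "%fp + N" address string.
// This is the step that has to wait until the frame layout is known.
std::string resolveFrameIndex(const std::vector<FrameObject> &objects,
                              std::size_t frameIndex) {
  assert(frameIndex < objects.size());  // the index must name a known slot
  int offset = kArgAreaBase + objects[frameIndex].offsetInArgArea;
  return "%fp + " + std::to_string(offset);
}
```

With two 4-byte slots at offsets 0 and 4, frame index 0 resolves to %fp + 92 and frame index 1 to %fp + 96, which matches the addresses expected in the question.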
In the following, let's use an example to see the DAG nodes generated by LowerFormalArguments().

Experiment

First, for convenience, we change the number of registers from 6 to 1 in CC_Sparc32.

def CC_Sparc32 : CallingConv<[
  ...
  // i32 f32 arguments get passed in integer registers if there is space.
  CCIfType<[i32, f32], CCAssignToReg<[I0]>>,
  ...
]>;

Second, prepare a piece of LLVM IR that contains a function with two parameters.
define i32 @foo(i32 %m, i32 %n) nounwind uwtable {
entry:
  %m.addr = alloca i32, align 4
  %n.addr = alloca i32, align 4
  store i32 %m, i32* %m.addr, align 4
  store i32 %n, i32* %n.addr, align 4
  ret i32 0
}

Third, use llc to compile this piece of LLVM IR and dump the related DAG during compilation. The following is the DAG:

[Figure: the DAG dumped by llc for @foo]

In this graph, the DAG nodes generated by LowerFormalArguments() are marked with red frames. As you can see, the first parameter %m is passed through the node on the left, a CopyFromReg. The second parameter %n is passed through the node on the right, a Load. This result matches our expectations.

Conclusion

In this article, we looked at how LLVM handles the calling convention of the Sparc processor. However, two functions, LowerReturn() and LowerCall(), have not been covered yet. We will discuss them in the next article.
