A word on function calls and returns

If you've read my previous post about control hazards, you may have noticed that I talked about branches and jumps, but did not write a single word about function call or return instructions. This omission may be a bit surprising if you are not familiar with RISC-V or architectures with a similar instruction set.

The reason is that JAL (jump and link) and JALR (jump and link register) already cover these use cases, because both of them can store the address of the instruction following the jump (PC+4) into the destination register. The register x0 can be used as the destination when that return address is not needed. Remember that x0 is the zero register: it always reads as zero, and writes to it are always discarded.

start:
  jal   x0, somewhere   ; direct jump, discard return address
  ...

somewhere:
  jal   x1, f           ; call function f, save return address into x1
  ...

f:
  ...
  jalr  x0, 0(x1)       ; return to the address held in x1 (offset 0)

But seriously, don't you find that a bit... unreadable?

The convention set by the proposed ABI is to use register x1 for the return address; this register is referred to as ra. There are also pseudo-instructions that the programmer can use to take advantage of this convention.

The same code using pseudo-instructions and the ra register alias looks like this. Note that it is 100% equivalent to the snippet above and expands to the exact same instructions.

start:
  j     somewhere       ; unconditional direct jump
  ...

somewhere:
  jal   ra, f           ; call function f
  ...

f:
  ...
  ret                   ; Return from subroutine
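
For reference, here is how the jump-related pseudo-instructions map onto JAL and JALR, as I understand the standard RISC-V assembler conventions. The names target and rs are placeholders (a label and any register), not part of the snippets above, and the comments keep the `;` style used in this post (the GNU assembler, for one, expects `#` instead).

```asm
  j     target          ; jal   x0, target    (jump, discard return address)
  jal   target          ; jal   x1, target    (call, link in ra)
  jr    rs              ; jalr  x0, 0(rs)     (indirect jump, discard return address)
  jalr  rs              ; jalr  x1, 0(rs)     (indirect call, link in ra)
  ret                   ; jalr  x0, 0(x1)     (return to the address in ra)
```

In every case the pseudo-instruction only fills in a default destination register, either x0 or x1, and a zero offset where one is needed.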

Storing the return address in a register rather than on the stack (which is what the x86 family does, for instance) has several advantages.

First, it simplifies the design of the processor a lot and keeps it true to the load/store architecture: calls and returns never touch memory implicitly, only explicit load and store instructions do.

Second, it can make the code more efficient: it is now up to the subroutine to decide whether the return address needs to be pushed onto the stack, or whether it can simply stay in a register because the subroutine makes no nested calls. This matters especially in our case, where the Fetch and Memory units cannot both access the RAM during the same cycle.
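
To make the difference concrete, here is a minimal sketch of a leaf function next to a function that makes a nested call. It assumes RV32 (4-byte sw/lw), the usual aliases sp (x2) for the stack pointer and a0/a1 (x10/x11) for arguments, and the 16-byte stack alignment of the standard calling convention. The labels leaf and non_leaf are made up for the illustration.

```asm
leaf:                   ; makes no calls, so ra is never overwritten
  add   a0, a0, a1      ; some work: a0 = a0 + a1
  ret                   ; return without a single memory access

non_leaf:               ; calls leaf, which overwrites ra
  addi  sp, sp, -16     ; reserve a stack frame
  sw    ra, 12(sp)      ; save our own return address
  jal   ra, leaf        ; nested call: ra now points to the next instruction
  lw    ra, 12(sp)      ; restore our own return address
  addi  sp, sp, 16      ; release the stack frame
  ret
```

Only non_leaf pays for a store and a load. On an architecture where the call instruction itself pushes the return address, leaf would pay for them too.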

Finally, although the convention is to use x1, nothing stops you from using any other register to save the return address. The official specification even suggests x5 as an alternate link register for calls to millicode routines, that is, shared prologue and epilogue code that several functions can reuse. Of course, the subroutine that is called must also know which register has been used, in order to return to the correct address!
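
As a rough sketch of that idea, a function could call a shared prologue through x5 (alias t0) so that ra still holds the address of its own caller while the prologue saves it. The routine name save_regs and the frame layout are hypothetical, and the code again assumes RV32 with the sp and s0 (x8) aliases.

```asm
f:
  jal   t0, save_regs   ; link in x5 (t0): ra is left untouched
  add   a0, a0, a1      ; body of f starts here

save_regs:              ; hypothetical shared prologue
  addi  sp, sp, -16     ; reserve a stack frame on behalf of f
  sw    ra, 12(sp)      ; ra is still the return address of f
  sw    s0, 8(sp)       ; save a callee-saved register
  jalr  x0, 0(t0)       ; return through t0 (same as: jr t0)
```

Had the call used ra as the link register, the return address of f would have been lost before the prologue could save it.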

Now you know why I didn't mention call and ret in the previous article: they are actually just plain jumps.