Saturday, May 6, 2017

ARM Assembly: Branches

Branches, also called jumps in some assembly languages, are how we move around in our code.  Any time we don't want to run the next instruction and instead want to execute an instruction somewhere else, we use a branch.  Branches change where in the code the program is currently executing.

Branches are a big part of what makes modern computing possible.  They allow us to call functions, create loops, and conditionally run code (if statements).  Without them, our programs would be linear, only taking initial input and always performing the same operations in the same order.  We would not be able to make programs that react to user input provided while the program is running.  Computers would still be good for math and certain kinds of simulation, but things like word processors, internet browsers, and more importantly video games could not exist, and neither could anything else interactive.  Almost every program uses branches.

Let's start with a simple branch instruction.
.text
.balign 4
.global _start
_start:
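    mov r0, #27         @ exit status we expect to see
    b skip_to_this      @ jump past the next instruction

    mov r0, #13         @ never runs
skip_to_this:

    mov r7, #1          @ syscall number 1 is exit
    svc #0              @ ask the kernel to exit with the status in r0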
This program will start by setting the return value to 27.  Then it will branch to the skip_to_this label.  Notice between the branch and the label, there is an instruction that will change the return value to 13.  This instruction will never run, thus the return value will be 27.  (In real life, we don't put code in our program that never runs like this, but this is just a demonstration.  If we really needed the code to stay in the file and not run, we would comment it out.  Otherwise, we would just delete that line, instead of skipping it like this.)

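The b instruction always takes a fixed target, written here as a label.  The assembler turns that label into an offset from the current instruction, so the destination is built into the instruction itself and cannot change at run time.  The instruction does not do anything extra.  It just changes the pc register so that the target instruction runs next.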
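
To see that b really does nothing but move the pc, here is a small sketch that hops between labels in both directions.  The labels first, second, and done are made up for this example, and it assumes the add instruction plus the same Linux exit call used in the program above.

```asm
.text
.balign 4
.global _start
_start:
    mov r0, #1
    b second            @ jump forward, over the code at "first"

first:
    add r0, r0, #10     @ runs second: r0 = 3 + 10 = 13
    b done              @ jump forward to the exit code

second:
    add r0, r0, #2      @ runs first: r0 = 1 + 2 = 3
    b first             @ jump backward

done:
    mov r7, #1          @ syscall number 1 is exit
    svc #0              @ exit status is whatever is in r0
```

The code runs in the order second, first, done, so the exit status should be 13.  A branch can go backward just as easily as forward, and none of these branches touch lr.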

As we saw in the Function Pointers tutorial, the bx instruction can be used to branch to an address stored in a register (a pointer).  We won't go over that again here.  Please review the last few paragraphs of the Function Pointers tutorial if you need to brush up.

The next branch instruction is bl.  This instruction is used to make function calls.  Typically when we call a function, we want to eventually return to the location the function was called from.  The bl instruction helps us with this by storing the return address in lr as it branches.  If we hold onto this address, when the function is done, we can return using bx lr.

.text
.balign 4
.global _start
_start:
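    bl fun              @ call fun; lr = address of the next instruction

    mov r7, #1          @ fun returns here
    svc #0              @ exit with the status in r0

fun:
    mov r0, #101        @ value that becomes the exit status
    bx lr               @ return to the caller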
When the program starts running, r0 will have the value 0 in it.  In fact, only pc and sp will have non-zero values in them.  The rest of the registers, including lr, will have 0s (_start is not a function, so there is nothing for it to return to).  When we call fun() using bl, the memory address of the instruction after the bl (the mov r7, #1) will be stored in lr, and the pc will be set to the memory address of the first instruction in the function.  Once that mov instruction is executed, the bx instruction will set the pc to the address in lr, returning control to _start at the instruction right after the function call.  The program then exits with a status of 101.

Note that the bl instruction will overwrite whatever is in lr, regardless of what it is.  So when nesting functions, each function that calls another must first make sure its own return address is stored somewhere it will not be overwritten.
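
As a rough sketch of what that looks like with nested calls, the program below has one function call another.  The function names outer and inner are made up for this example, and saving lr on the stack with push and pop is only one option (a spare register would also work).  It assumes the stack is usable, which it should be since sp is set up when the program starts.

```asm
.text
.balign 4
.global _start
_start:
    bl outer            @ lr = address of the mov below

    mov r7, #1          @ syscall number 1 is exit
    svc #0              @ exit status is whatever is in r0

outer:
    push {lr}           @ save our return address before it gets overwritten
    bl inner            @ this overwrites lr with an address inside outer
    add r0, r0, #1      @ inner returns here: r0 = 41 + 1 = 42
    pop {lr}            @ restore the return address into _start
    bx lr               @ return to _start

inner:
    mov r0, #41
    bx lr               @ return to outer
```

This should exit with a status of 42.  If we left out the push and pop, the bx lr at the end of outer would jump back to the add instruction inside outer instead of returning to _start, and the program would loop forever.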

In the Function Pointers tutorial we already used the blx instruction.  It combines what bl and bx do.  Like bl, it stores the return address in lr.  Like bx, it takes its target from a register (a function pointer) instead of from a label.
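
For a quick reminder, here is a minimal sketch of the earlier bl program rewritten to use blx.  The choice of r4 to hold the pointer is arbitrary, and loading the address with ldr r4, =fun is an assumption about how we get the pointer; any way of getting the function's address into a register would do.

```asm
.text
.balign 4
.global _start
_start:
    ldr r4, =fun        @ r4 = address of fun (a function pointer)
    blx r4              @ lr = address of the next instruction, pc = r4

    mov r7, #1          @ fun returns here
    svc #0              @ exit status is whatever is in r0

fun:
    mov r0, #101
    bx lr               @ return to the caller
```

Like the bl version, this should exit with a status of 101.  The only difference is that the target of the call comes from a register at run time instead of being fixed when the program is assembled.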
