Saturday, May 6, 2017

ARM Assembly: Indexing Modes

We briefly discussed indexing modes when we learned about the stack and global memory.  It was probably somewhat confusing back then, because indexing modes are really designed to deal with arrays.  We won't get to arrays quite yet though, because we need a solid understanding of indexing modes before we can work with them.

Indexing modes give us a great deal of flexibility in dealing with data in memory.  This is important, since ARM only has a few instructions that can access memory.  (On Intel and some other processors, math instructions can operate on memory directly, instead of having to load into a register first.  This makes these instructions slower than their register-only forms, but it is necessary on a CPU that has very few registers.)

Indexing modes are relevant to LDR, STR, LDM, and STM instructions.  The last two are far simpler but also far less powerful.  The first two provide a great deal of flexibility.

The LDR and STR instructions take a register and something designating the address of the memory to read or write.  The simplest form uses a label, and it has no special capabilities.  The rest use a register containing a memory address (a pointer) plus some optional extras.  There are four of these indexing modes, and each has its own optional extras.

The first indexing mode is written in the documentation as [Rn {, #<offset>}]{!}.  The parts in braces are optional.  This means the simplest way this indexing mode could be used is ldr r0, [r1].  This loads the value at the address in r1.  If we had several integers in memory, and we wanted the second one, we could add the offset, ldr r0, [r1, #4].  This will get the data at r1 + 4, and r1 itself is left unchanged.  The immediate value can also be negative.  Lastly, if we wanted to load several ints in sequence, we might use ldr r0, [r1, #4]!.  This will literally add 4 to r1, then it will use the new value in r1 as the pointer.  The exclamation point makes the instruction save the address we are accessing into r1.  If we wanted to get a series of values, we could use several of these instructions in succession, and each one would set up r1 to load the next value.  Of course, this is not exactly good for arrays, as it will always skip the first element when r1 starts out pointing at the beginning of the array.
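If you know some C, it may help to see this mode as pointer arithmetic.  The sketch below is only a model of the behavior described above, not real ARM code.  The function names ldr_offset and ldr_pre_indexed are made up for illustration, and offsets are in bytes, assumed to be multiples of 4 so the loads stay aligned.

```c
#include <stdint.h>

/* Models: ldr r0, [r1, #offset]
 * Reads the word at r1 + offset (in bytes).  r1 is not modified. */
int32_t ldr_offset(const int32_t *r1, int offset_bytes)
{
    return *(const int32_t *)((const char *)r1 + offset_bytes);
}

/* Models: ldr r0, [r1, #offset]!
 * Adds the offset to r1 first, then reads through the updated r1.
 * r1 is passed by pointer so the caller sees the new address. */
int32_t ldr_pre_indexed(const int32_t **r1, int offset_bytes)
{
    *r1 = (const int32_t *)((const char *)*r1 + offset_bytes);
    return **r1;
}
```

With r1 pointing at the start of an int array, ldr_offset(r1, 4) returns the second element and leaves r1 alone, while ldr_pre_indexed(&r1, 4) returns the second element and leaves r1 pointing at it.  Calling the pre-indexed version repeatedly walks forward through the array but never returns the first element.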

The second indexing mode is [Rn], #<offset>.  This is a post-indexing mode.  This means that the instruction will access the memory at Rn first, then it will add the offset to Rn.  If we used str r0, [r1], #4, the memory pointed to by r1 would be written to first.  Then 4 would be added to r1.  This won't skip the first element of an array.  If we used several of these instructions in series, we would first write the memory pointed to by r1, then r1 + 4, then r1 + 8, and so on, in 4 byte increments.  We will see an example of this when we learn how to work with arrays.
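Here is the same kind of C model for the post-indexed store.  Again, this is a sketch of the behavior described above rather than real ARM code, and str_post_indexed is a hypothetical name chosen for illustration.  C programmers may recognize it as the familiar *p++ = value pattern when the offset is 4.

```c
#include <stdint.h>

/* Models: str r0, [r1], #offset
 * Writes r0 to the address currently in r1, then adds the offset
 * (in bytes) to r1.  r1 is passed by pointer so the update is visible. */
void str_post_indexed(int32_t r0, int32_t **r1, int offset_bytes)
{
    **r1 = r0;
    *r1 = (int32_t *)((char *)*r1 + offset_bytes);
}
```

Calling this three times with an offset of 4 fills the first three elements of an int array in order, and r1 ends up pointing at the fourth.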

The third indexing mode uses an offset stored in a register, [Rn, +/-Rm {, <opsh>}]{!}.  If this looks familiar, it is because it is very similar to the first one.  There are two differences.  The first is that the offset comes from a register, and the second is that we can put in an optional shift.  If we use the shift, it will be applied to the index in Rm before that index is added to the address in Rn.  This allows us to do things like indexing arrays by element, instead of by memory offset.  For example, we could get the second value in an integer array with ldr r0, [r1, r2, LSL #2], if we had the value 1 in r2.  Shifting left by 2 multiplies the index by 4, the size of an int.  The exclamation point has the same behavior as in the first indexing mode: it stores the address accessed back into Rn (r1 in this example).  Note that with this one we can start on the first element of an array, by starting with a 0 in r2.
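In C terms, the shifted register offset is what array indexing does behind the scenes.  The model below is a rough sketch that only covers a positive Rm with an LSL shift.  The names ldr_reg_offset and ldr_reg_offset_wb are invented for this example, and the shift is assumed to be less than 32.

```c
#include <stdint.h>

/* Models: ldr r0, [r1, r2, LSL #shift]
 * The index in r2 is shifted left, then added to r1 as a byte offset.
 * Neither r1 nor r2 is modified. */
int32_t ldr_reg_offset(const int32_t *r1, uint32_t r2, unsigned shift)
{
    uint32_t byte_offset = r2 << shift;
    return *(const int32_t *)((const char *)r1 + byte_offset);
}

/* Models: ldr r0, [r1, r2, LSL #shift]!
 * Same address calculation, but the address accessed is saved in r1. */
int32_t ldr_reg_offset_wb(const int32_t **r1, uint32_t r2, unsigned shift)
{
    *r1 = (const int32_t *)((const char *)*r1 + (r2 << shift));
    return **r1;
}
```

With a shift of 2, ldr_reg_offset(array, i, 2) behaves like array[i] for an int array, so an index of 0 gives the first element and an index of 1 gives the second.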

The fourth indexing mode also uses an offset stored in a register, but it is similar to the second one.  The documentation shows it as [Rn], +/-Rm {, <opsh>}.  This indexing mode will add the shifted value from Rm to Rn after the memory is accessed.
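One place this could be handy is stepping through an array with a stride, such as reading every second element.  The C model below is a sketch under the same assumptions as before: a positive Rm, an LSL shift of less than 32, and a made-up function name, ldr_post_reg.

```c
#include <stdint.h>

/* Models: ldr r0, [r1], r2, LSL #shift
 * Reads the word at the address currently in r1, then adds the
 * shifted value of r2 (as a byte offset) to r1. */
int32_t ldr_post_reg(const int32_t **r1, uint32_t r2, unsigned shift)
{
    int32_t value = **r1;
    *r1 = (const int32_t *)((const char *)*r1 + (r2 << shift));
    return value;
}
```

With 2 in r2 and a shift of 2, each call returns the current element and then moves r1 forward by 8 bytes, which is two ints.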

These indexing modes can allow you to easily traverse arrays and take specific elements from structs.  They are really handy for fast and efficient memory access that does not require a lot of extra instructions to manage indices.  Learning to use these will help you to be a much better ARM assembly programmer.

I mentioned that LDM and STM also have a sort of indexing mode (it's not really one, as there is no actual index).  This is much simpler than the others.  The LDM instruction is shown in the documentation with the argument list Rn{!}, <reglist-PC>.  There are a few more variations, but the only part we care about right now is the optional exclamation point.  The register used for Rn contains the memory address being read from (or written to, in the case of STM).  As with the previous indexing modes, the optional exclamation point causes Rn to be updated after the transfer.  For the default form, which counts upward through memory, Rn ends up pointing just past the last word the instruction accessed.  This works the same way with the STM instruction.
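As a rough C model, a plain LDM (the increment-after form) copies consecutive words from memory into a list of registers.  In this sketch the register list is represented as a small array, and the function name ldmia and its writeback flag are assumptions made for illustration, with the flag standing in for the exclamation point.

```c
#include <stddef.h>
#include <stdint.h>

/* Models: ldm rn{!}, {reglist}   (increment-after form)
 * Copies `count` consecutive words starting at rn into regs[].
 * If writeback is nonzero (the "!" was given), rn is advanced to
 * the address just past the last word that was read. */
void ldmia(const int32_t **rn, int32_t *regs, size_t count, int writeback)
{
    const int32_t *addr = *rn;
    for (size_t i = 0; i < count; i++)
        regs[i] = addr[i];
    if (writeback)
        *rn = addr + count;
}
```

Without writeback, rn still points at the first word afterward.  With writeback, a second call picks up where the first one left off.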
