Friday, May 5, 2017

ARM Assembly: Pointers and Global Memory Access

I deliberately decided to write this tutorial after the one about the stack.  This is because the stack is a harder subject, but if you don't learn it first, you may fall into the bad practice of trying to put everything into global memory.  This is especially bad if you start off by storing lr in global memory.  The stack should be your primary go-to place for memory, unless you need to create something in a function that will still be needed after the function returns, or you need to allocate quantities of memory too large to put on the stack.  Even then, using global variables is not always the right solution.

When we learned about the stack, we learned that the stack pointer (sp) points to some location in memory, representing the top of the stack.  The stack pointer is just a pointer in the sense of C pointers.  It has a dedicated register, because it is essential to structured programming.  In reality, we can try to access any address in the memory space.  Trying to access address 0 will cause a segfault (the assembly equivalent of a null pointer error), and on a typical 32-bit Linux system, trying to access anything at or past 0xc0000000 (3GB) will also fail, because that memory is reserved for the kernel.  Most of the rest of the space is initially unallocated, and it will cause a segfault if you try to access it.

There are a few ways to allocate memory.  The stack is allocated by default, and the operating system will typically make sure there is enough, up to a certain limit (8MB, for example).  There is also a data area, a bss area, and a read only area (possibly more, depending on the OS and architecture) that you can define, and these will be allocated for the program when it starts.

These sections require special directives to create in your program.  Each section has a different purpose.  The data section is typically used for preinitialized variables (they start with data in them).  The bss section is used for uninitialized variables.  The reason you might choose the bss section, rather than filling the uninitialized places in the data section with 0s, is that the data section is written into your executable, taking up more space.  If you need the variables to be preloaded with values, this is usually a secondary concern.  The bss section is allocated when the program starts, which means that it does not need space in the executable, aside from a record of the amount of memory desired.  The last section, the read only or rodata section, is typically used for constant strings or other run time constants.  If you pass an address in this section to the read system call as its buffer, the call appears to fail silently, not writing anything and not raising an exception (most likely because the kernel reports the problem through the return value in r0 instead of a signal).  Trying to write to the rodata area directly causes a segfault.  Put stuff in this section only if you are certain you don't want to overwrite it.  GCC typically puts string constants there.
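Since C compilers make the same choices for you, a C sketch can make the three sections easier to picture.  This is only an illustration: the variable and function names are made up for this example, and the placement in the comments is what a typical GCC toolchain does, not something the C language promises.

```c
#include <stdint.h>

/* Initialized and writable: a typical toolchain places this in .data,
   so the value 5 is stored in the executable itself. */
uint32_t retry_limit = 5;

/* No initializer and writable: typically placed in .bss, so only the
   size (16 bytes) is recorded in the executable. */
uint32_t scratch[4];

/* Constant: typically placed in .rodata. */
const uint32_t magic_number = 27;

/* Hypothetical helper that touches one value from each kind of storage. */
uint32_t sum_of_sections(void)
{
    return retry_limit + scratch[0] + magic_number;
}
```

Calling sum_of_sections() before anything has been modified adds 5, 0 and 27.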

Perhaps the most valuable area is the data area.  If you find you need to initialize a lot of mutable (changeable) variables at start up, this is probably a good place to put them.  Just make sure the stack would not be a more appropriate location.  For example, perhaps you need some memory to store an IP address.  You want the default to be 127.0.0.1.  Your program is going to load the IP address from a config file, but perhaps this file is empty the first time a user runs the program.  You could allocate space for the IP address in the bss section, but now you have to check whether the IP address was in the config file.  If it was, you store it to the memory, but if it was not, you have to put the default value in yourself.  Further, if you want to change this default, that could require finding where the default is put into memory in your config reading code.  If you put the default in the data section and there is no address in the config file, you don't have to do anything, because it is already there.  If you want to change it, you can generally find one entry in the data section more easily than a bunch of moves and an STM in the middle of a configuration section.
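The same idea, sketched in C.  The helper apply_config and its argument are hypothetical, and I am assuming the address is stored as four words, one per section of the address.

```c
#include <stddef.h>
#include <stdint.h>

/* Lives in the data section, so the default is in memory before any of
   our code runs. */
uint32_t ip_address[4] = {127, 0, 0, 1};

/* Hypothetical helper: from_config is NULL when the config file did not
   contain an address. */
void apply_config(const uint32_t *from_config)
{
    if (from_config == NULL)
        return;                     /* nothing to do, the default is already there */

    for (size_t i = 0; i < 4; i++)  /* otherwise overwrite the default */
        ip_address[i] = from_config[i];
}
```

The missing-config case needs no code at all, and changing the default means editing one line at the top of the file.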

So, let's try it.
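.data
.balign 4
ip_address:             @ four words: 127, 0, 0, 1
    .word 127
    .word   0
    .word   0
    .word   1


.text
.global _start
_start:
    ldr r0, =ip_address @ r0 = address of ip_address (a pointer)
    ldr r0, [r0]        @ r0 = first word at that address (127)

    mov r7, #1          @ syscall number 1 = exit
    svc #0              @ exit, with r0 as the exit code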
This program is not going to read a config file, but it will set the default IP address.   Technically this is backwards, as the OS stores and accepts IP addresses starting with the lowest section and going up to the highest.

Give this a file name, then assemble and link it with as filename.s -o filename.o && ld filename.o -o filename.  From there, you can run it and print the exit code (with echo $? in most shells), which will be 127.

The first thing that happens when you run this program is that the OS loads it into memory and reads some header information to see if it is valid.  By the time your code starts running, your data area is already in memory.  Then the code in the text section gets the memory address of the global variable and reads the first value.  The first LDR instruction gets a pointer to the desired data.  This is exactly like C pointers.  It is merely a memory address.  The second LDR line dereferences the pointer, getting the data that it points to.  You could change the second LDR line to ldr r0, [r0, #4] to get the second value in the IP address.  Technically, we just created a simple struct that holds an IP address.  We can increase that 4 in multiples of 4 to get the next two elements of our struct.  Alternatively, we could use LDM to load the entire IP address into registers r0 through r3.  (We could even flip the direction of the read or of the write back by using the right LDM addressing mode.)
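For comparison, here is roughly what the two LDR instructions do, written in C.  The function load_word is a name I made up for this sketch; byte_offset plays the role of the #4 in ldr r0, [r0, #4].

```c
#include <stdint.h>
#include <string.h>

uint32_t ip_address[4] = {127, 0, 0, 1};

/* Mirrors "ldr r0, =ip_address" followed by "ldr r0, [r0, #byte_offset]":
   take a base address, add a byte offset, and read one 32-bit word. */
uint32_t load_word(const void *base, unsigned byte_offset)
{
    uint32_t value;
    /* memcpy is the portable way to read a word from a raw address in C. */
    memcpy(&value, (const unsigned char *)base + byte_offset, sizeof value);
    return value;
}
```

With this, load_word(ip_address, 0) reads the 127, and offsets 4, 8 and 12 read the remaining three words, just like stepping through the fields of a struct.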

The bss section is initially empty.  You use directives to tell the assembler how much preallocated memory you want, and when the program starts, the operating system will allocate it and initialize it all to 0s for you.  (Linux guarantees the zeroing.  I am not certain you can rely on it on all platforms; on bare metal without an OS, for example, it is up to the startup code, and there may be garbage left over from something else.)
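.bss
.balign 4
some_memory:
    .space 16             @ reserve 16 bytes (no data stored in the file)

.text
.global _start
_start:
    ldr r0, =some_memory  @ r0 = address of some_memory
    ldr r0, [r0]          @ r0 = first word there (0)

    mov r7, #1            @ syscall number 1 = exit
    svc #0                @ exit, with r0 as the exit code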
This will allocate 16 bytes of memory (the same as the last program) when the program starts.  If you change the amount of space in the bss section to 1,600 bytes, you will find it increases the size of the executable by only 2 bytes.

The program starts by asking for 16 bytes of space in the bss section, and then it reads the first word (4 bytes) and returns it as the exit code.  Again, the first LDR instruction is getting a pointer and the second is dereferencing it, getting the data.  You can add immediate multiples of 4 to the second LDR instruction to see the 0s in the other word-sized slots of the 16 byte memory space.
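The C counterpart of this experiment is a static array with no initializer.  The names are hypothetical, and the placement in .bss is what compilers typically do, but the zero starting value is guaranteed by the C standard for static storage.

```c
#include <stddef.h>
#include <stdint.h>

/* No initializer, like ".space 16": four words that start out as zero. */
static uint32_t some_memory[4];

/* Returns 1 if every word is still zero, 0 otherwise. */
int all_words_zero(void)
{
    for (size_t i = 0; i < 4; i++) {
        if (some_memory[i] != 0)
            return 0;
    }
    return 1;
}
```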

The read only data section requires slightly different syntax on this particular device.  The assembler won't recognize .rodata as a directive by itself; the section name has to be preceded by a .section directive.  Essentially this means that the assembler does not provide a shorthand directive for this section the way it does for .text, .data and .bss, so you have to name the section explicitly.  The linker script used by the linker does recognize it though, and it makes sure to put it in the right place in the program binary, as well as designating it read only.
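.section .rodata
.balign 4
write_this:
    .word 27             @ constant stored in read only memory

.text
.balign 4
.global _start
_start:
    mov r0, #13
    ldr r1, =write_this  @ r1 = address of write_this
    str r0, [r1]         @ try to overwrite the 27 (segfaults here)

    mov r0, #0           @ clear r0 so an old value cannot leak through
    ldr r0, [r1]         @ read the value back (never reached)

    mov r7, #1           @ syscall number 1 = exit
    svc #0               @ exit, with r0 as the exit code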
This will put the value 27 into the read only data section.  From there we put the number 13 in r0, and then we try to write r0 to the variable we created in the read only section.  The last part before returning first overwrites r0 with 0, to make sure we don't have an old value in there, and then we read the value from the rodata area.  If the write succeeded, we would get 13 as the exit code, but since it does not, we get a segfault.  Using the debugger, I identified the location of the segfault this program throws as the STR instruction that attempts to overwrite the read only data.

As before, the LDR at the beginning is getting a pointer.  The STR instruction is dereferencing the pointer and attempting to write to it this time (changing the stored value to a new one).  The last LDR does not run (the program crashes before this happens), but it would dereference the pointer stored in r1 and read the value into a register.
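Writing to a const object in C is undefined behavior, so a direct translation of this program is not reliable.  As a stand-in, this sketch makes a page of memory read only with mprotect and then lets a child process attempt the write, so the crash can be observed without killing the main program.  It assumes Linux, and the function name is made up for this example.

```c
#define _DEFAULT_SOURCE
#include <signal.h>
#include <stddef.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: returns the signal that kills a child process when
   it writes to a read only page, 0 if the write went through, and -1 if
   the setup failed. */
int signal_from_readonly_write(void)
{
    long page = sysconf(_SC_PAGESIZE);
    if (page <= 0)
        return -1;

    /* One private page, writable while we store the initial value. */
    int *value = mmap(NULL, (size_t)page, PROT_READ | PROT_WRITE,
                      MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (value == MAP_FAILED)
        return -1;
    *value = 27;

    /* Drop write permission, similar to what the loader does for .rodata. */
    if (mprotect(value, (size_t)page, PROT_READ) != 0) {
        munmap(value, (size_t)page);
        return -1;
    }

    pid_t pid = fork();
    if (pid < 0) {
        munmap(value, (size_t)page);
        return -1;
    }
    if (pid == 0) {
        *(volatile int *)value = 13;  /* the equivalent of the STR */
        _exit(0);                     /* only reached if the write worked */
    }

    int status = 0;
    int result = -1;
    if (waitpid(pid, &status, 0) == pid)
        result = WIFSIGNALED(status) ? WTERMSIG(status) : 0;

    munmap(value, (size_t)page);
    return result;
}
```

On Linux I would expect this to return SIGSEGV, the same signal the assembly program dies with at its STR instruction.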

These different sections can give you a lot of control over your program.  If you want to make sure that a value is never changed, the read only section is very useful.  If you need your values initialized, the data section is probably the best.  If you cannot preinitialize your data, then maybe the bss section is what you need.

The things to keep in mind are that putting stuff in the data section will save the program time it would otherwise use to initialize, but putting it in the bss section will reduce the size of the executable.  The read only section is more of an error detection and security mechanism that can prevent unskilled programmers and hackers from breaking things or sneaking malicious code or data in where it should not be.
