2007-09-06

char pointer versus char array

There are two basic ways to assign a string literal to a local variable.


char *p = "string";
char a[] = "string";


I was curious to see how gcc handles the two.

In both cases, "string" is put into .rodata This means that with the first method, you must not modify the contents of "string".


p[3]='o'; // this causes a segfault.


So technically, the first method should be:


const char *p = "string";


With the pointer method, gcc will initialize the variable using the following:

mov dword ptr [ebp-8], pointer_to_string


Calling a function and passing p will result in:

mov eax,[ebp-8]
mov [esp], eax
call function_name


With the array method, however, gcc reserves space for the string on the stack and will initialize the variable like so:

mov eax, [pointer_to_string]
mov [ebp-15h], eax
mov eax, [pointer_to_string+4]
mov [ebp-11h],eax
mov eax,[pointer_to_string+8]
mov [ebp-0dh],eax
movzx eax,byte ptr [pointer_to_string+12]
mov [ebp-9],al


As you can see, it moves the string into the stack 4 bytes at a time. If the string is more than 64 bytes long, then gcc will actually create a call to memcpy:

lea ecx,[ebp-49h]
mov edx, pointer_to_string
mov eax,41h
mov [esp+8],eax
mov [esp+4],edx
mov [esp],ecx
call wrapper_to_memcpy


Regardless of how long the string is, passing the array to a function results in the following:

lea eax,[ebp-15h]
mov [esp],eax
call function_name


Passing the variable around is virtually identical for both array and pointer. However, initializing the variable takes a huge performance and space penalty for the array. Only use the array method if you intend on changing the string, it takes up space in the stack and adds some serious overhead every time you enter the variable's scope. Use the pointer method otherwise.

No comments: