下面的代码在第2行接收到seg错误:

char *str = "string";
str[0] = 'z';  // could be also written as *str = 'z'
printf("%s\n", str);

虽然这个方法非常有效:

char str[] = "string";
str[0] = 'z';
printf("%s\n", str);

用MSVC和GCC测试。


当前回答

通常,当程序运行时,字符串字面值存储在只读内存中。这是为了防止您意外地更改字符串常量。在第一个例子中,"string"存储在只读内存中,*str指向第一个字符。当您试图将第一个字符更改为'z'时,会发生段错误。

在第二个例子中,字符串"string"被编译器从其只读母数组复制到str[]数组中。然后允许更改第一个字符。你可以通过打印每个地址来检查:

printf("%p", str);

同样,在第二个例子中打印str的大小会显示编译器已经为它分配了7个字节:

printf("%d", sizeof(str));

其他回答

为什么我得到一个分割错误时写入字符串?

c99n1256草案

字符串字面量有两种不同的用法:

Initialize char[]: char c[] = "abc"; This is "more magic", and described at 6.7.8/14 "Initialization": An array of character type may be initialized by a character string literal, optionally enclosed in braces. Successive characters of the character string literal (including the terminating null character if there is room or if the array is of unknown size) initialize the elements of the array. So this is just a shortcut for: char c[] = {'a', 'b', 'c', '\0'}; Like any other regular array, c can be modified. Everywhere else: it generates an: unnamed array of char What is the type of string literals in C and C++? with static storage that gives UB if modified So when you write: char *c = "abc"; This is similar to: /* __unnamed is magic because modifying it gives UB. */ static char __unnamed[] = "abc"; char *c = __unnamed; Note the implicit cast from char[] to char *, which is always legal. Then if you modify c[0], you also modify __unnamed, which is UB. This is documented at 6.4.5 "String literals": 5 In translation phase 7, a byte or code of value zero is appended to each multibyte character sequence that results from a string literal or literals. The multibyte character sequence is then used to initialize an array of static storage duration and length just sufficient to contain the sequence. For character string literals, the array elements have type char, and are initialized with the individual bytes of the multibyte character sequence [...] 6 It is unspecified whether these arrays are distinct provided their elements have the appropriate values. If the program attempts to modify such an array, the behavior is undefined.

6.7.8/32“初始化”给出了一个直接的例子:

EXAMPLE 8: The declaration char s[] = "abc", t[3] = "abc"; defines "plain" char array objects s and t whose elements are initialized with character string literals. This declaration is identical to char s[] = { 'a', 'b', 'c', '\0' }, t[] = { 'a', 'b', 'c' }; The contents of the arrays are modifiable. On the other hand, the declaration char *p = "abc"; defines p with type "pointer to char" and initializes it to point to an object with type "array of char" with length 4 whose elements are initialized with a character string literal. If an attempt is made to use p to modify the contents of the array, the behavior is undefined.

GCC 4.8 x86-64 ELF实现

计划:

#include <stdio.h>

int main(void) {
    char *s = "abc";
    printf("%s\n", s);
    return 0;
}

编译和反编译:

gcc -ggdb -std=c99 -c main.c
objdump -Sr main.o

输出包含:

 char *s = "abc";
8:  48 c7 45 f8 00 00 00    movq   $0x0,-0x8(%rbp)
f:  00 
        c: R_X86_64_32S .rodata

结论:GCC将char* it存储在.rodata部分,而不是在.text中。

如果我们对char[]做同样的操作:

 char s[] = "abc";

我们获得:

17:   c7 45 f0 61 62 63 00    movl   $0x636261,-0x10(%rbp)

因此它被存储在堆栈中(相对于%rbp)。

但是请注意,默认的链接器脚本将.rodata和.text放在同一个段中,该段有执行权限,但没有写权限。这可以观察到:

readelf -l a.out

它包含:

 Section to Segment mapping:
  Segment Sections...
   02     .text .rodata

参见C常见问题,问题1.32

Q: What is the difference between these initializations? char a[] = "string literal"; char *p = "string literal"; My program crashes if I try to assign a new value to p[i]. A: A string literal (the formal term for a double-quoted string in C source) can be used in two slightly different ways: As the initializer for an array of char, as in the declaration of char a[] , it specifies the initial values of the characters in that array (and, if necessary, its size). Anywhere else, it turns into an unnamed, static array of characters, and this unnamed array may be stored in read-only memory, and which therefore cannot necessarily be modified. In an expression context, the array is converted at once to a pointer, as usual (see section 6), so the second declaration initializes p to point to the unnamed array's first element. Some compilers have a switch controlling whether string literals are writable or not (for compiling old code), and some may have options to cause string literals to be formally treated as arrays of const char (for better error catching).

当您试图访问不可访问的内存时,会导致分割错误。

Char *str是一个指向不可修改的字符串的指针(这是导致segfault的原因)。

而char str[]是一个数组,可以修改。

The

 char *str = "string";

Line定义了一个指针,并将其指向一个字面值字符串。字面值字符串是不可写的,所以当你这样做:

  str[0] = 'z';

你会得到一个隔离失误。在某些平台上,字面值可能位于可写内存中,因此您不会看到段错误,但无论如何它都是无效代码(导致未定义的行为)。

线:

char str[] = "string";

分配一个字符数组并将字面值字符串复制到该数组中,该数组是完全可写的,因此后续更新没有问题。

// create a string constant like this - will be read only
char *str_p;
str_p = "String constant";

// create an array of characters like this 
char *arr_p;
char arr[] = "String in an array";
arr_p = &arr[0];

// now we try to change a character in the array first, this will work
*arr_p = 'E';

// lets try to change the first character of the string contant
*str_p = 'G'; // this will result in a segmentation fault. Comment it out to work.


/*-----------------------------------------------------------------------------
 *  String constants can't be modified. A segmentation fault is the result,
 *  because most operating systems will not allow a write
 *  operation on read only memory.
 *-----------------------------------------------------------------------------*/

//print both strings to see if they have changed
printf("%s\n", str_p); //print the string without a variable
printf("%s\n", arr_p); //print the string, which is in an array.