我一直有一个印象,如果可能的话,永远不要使用goto。
然而,在前几天阅读libavcodec(它是用C编写的)时,我惊讶地注意到它的多种用法。
在支持循环和函数的语言中使用goto是否有优势?如果有,为什么?请提供一个具体的例子,清楚地说明使用goto的理由。
我一直有一个印象,如果可能的话,永远不要使用goto。
然而,在前几天阅读libavcodec(它是用C编写的)时,我惊讶地注意到它的多种用法。
在支持循环和函数的语言中使用goto是否有优势?如果有,为什么?请提供一个具体的例子,清楚地说明使用goto的理由。
当前回答
我在以下情况下使用goto: 当需要从不同位置的函数返回时,并且在返回之前需要进行一些初始化:
non-goto版本:
int doSomething (struct my_complicated_stuff *ctx)
{
db_conn *conn;
RSA *key;
char *temp_data;
conn = db_connect();
if (ctx->smth->needs_alloc) {
temp_data=malloc(ctx->some_size);
if (!temp_data) {
db_disconnect(conn);
return -1;
}
}
...
if (!ctx->smth->needs_to_be_processed) {
free(temp_data);
db_disconnect(conn);
return -2;
}
pthread_mutex_lock(ctx->mutex);
if (ctx->some_other_thing->error) {
pthread_mutex_unlock(ctx->mutex);
free(temp_data);
db_disconnect(conn);
return -3;
}
...
key=rsa_load_key(....);
...
if (ctx->something_else->error) {
rsa_free(key);
pthread_mutex_unlock(ctx->mutex);
free(temp_data);
db_disconnect(conn);
return -4;
}
if (ctx->something_else->additional_check) {
rsa_free(key);
pthread_mutex_unlock(ctx->mutex);
free(temp_data);
db_disconnect(conn);
return -5;
}
pthread_mutex_unlock(ctx->mutex);
free(temp_data);
db_disconnect(conn);
return 0;
}
goto版本:
int doSomething_goto (struct my_complicated_stuff *ctx)
{
int ret=0;
db_conn *conn;
RSA *key;
char *temp_data;
conn = db_connect();
if (ctx->smth->needs_alloc) {
temp_data=malloc(ctx->some_size);
if (!temp_data) {
ret=-1;
goto exit_db;
}
}
...
if (!ctx->smth->needs_to_be_processed) {
ret=-2;
goto exit_freetmp;
}
pthread_mutex_lock(ctx->mutex);
if (ctx->some_other_thing->error) {
ret=-3;
goto exit;
}
...
key=rsa_load_key(....);
...
if (ctx->something_else->error) {
ret=-4;
goto exit_freekey;
}
if (ctx->something_else->additional_check) {
ret=-5;
goto exit_freekey;
}
exit_freekey:
rsa_free(key);
exit:
pthread_mutex_unlock(ctx->mutex);
exit_freetmp:
free(temp_data);
exit_db:
db_disconnect(conn);
return ret;
}
当您需要更改释放语句中的某些内容时(每个语句在代码中使用一次),第二个版本使它变得更容易,并减少了在添加新分支时跳过其中任何一个语句的机会。在函数中移动它们在这里不会有帮助,因为可以在不同的“级别”进行释放。
其他回答
看看在C语言中编程时何时使用Goto:
Although the use of goto is almost always bad programming practice (surely you can find a better way of doing XYZ), there are times when it really isn't a bad choice. Some might even argue that, when it is useful, it's the best choice. Most of what I have to say about goto really only applies to C. If you're using C++, there's no sound reason to use goto in place of exceptions. In C, however, you don't have the power of an exception handling mechanism, so if you want to separate out error handling from the rest of your program logic, and you want to avoid rewriting clean up code multiple times throughout your code, then goto can be a good choice.
我是什么意思?你可能会有这样的代码:
int big_function()
{
/* do some work */
if([error])
{
/* clean up*/
return [error];
}
/* do some more work */
if([error])
{
/* clean up*/
return [error];
}
/* do some more work */
if([error])
{
/* clean up*/
return [error];
}
/* do some more work */
if([error])
{
/* clean up*/
return [error];
}
/* clean up*/
return [success];
}
This is fine until you realize that you need to change your cleanup code. Then you have to go through and make 4 changes. Now, you might decide that you can just encapsulate all of the cleanup into a single function; that's not a bad idea. But it does mean that you'll need to be careful with pointers -- if you plan to free a pointer in your cleanup function, there's no way to set it to then point to NULL unless you pass in a pointer to a pointer. In a lot of cases, you won't be using that pointer again anyway, so that may not be a major concern. On the other hand, if you add in a new pointer, file handle, or other thing that needs cleanup, then you'll need to change your cleanup function again; and then you'll need to change the arguments to that function.
通过使用goto,它将是
int big_function()
{
int ret_val = [success];
/* do some work */
if([error])
{
ret_val = [error];
goto end;
}
/* do some more work */
if([error])
{
ret_val = [error];
goto end;
}
/* do some more work */
if([error])
{
ret_val = [error];
goto end;
}
/* do some more work */
if([error])
{
ret_val = [error];
goto end;
}
end:
/* clean up*/
return ret_val;
}
这样做的好处是,您的代码可以访问执行清理所需的所有内容,并且您已经成功地减少了更改点的数量。另一个好处是你的函数从多个出口点变成了只有一个;不可能不小心从函数返回而不进行清理。
此外,由于goto仅用于跳转到单个点,因此并不是为了模拟函数调用而创建大量来回跳转的意大利面代码。相反,goto实际上有助于编写更结构化的代码。
总而言之,goto应该谨慎使用,并作为最后的手段——但它是有时间和地点的。问题不应该是“你是否必须使用它”,而应该是“它是使用它的最佳选择”。
如果有,为什么?
C语言没有多级/标记的中断,并不是所有的控制流都可以用C语言的迭代和决策原语轻松建模。Gotos对纠正这些缺陷大有帮助。
有时使用某种类型的标志变量来实现一种伪多级中断更清晰,但它并不总是优于goto(至少goto可以轻松地确定控制的位置,不像标志变量),有时您只是不想为了避免goto而付出旗帜/其他扭曲的性能代价。
Libavcodec是一段性能敏感的代码。控制流的直接表达可能是优先考虑的,因为它往往会运行得更好。
1) The most common use of goto that I know of is emulating exception handling in languages that don't offer it, namely in C. (The code given by Nuclear above is just that.) Look at the Linux source code and you'll see a bazillion gotos used that way; there were about 100,000 gotos in Linux code according to a quick survey conducted in 2013: http://blog.regehr.org/archives/894. Goto usage is even mentioned in the Linux coding style guide: https://www.kernel.org/doc/Documentation/CodingStyle. Just like object-oriented programming is emulated using structs populated with function pointers, goto has its place in C programming. So who is right: Dijkstra or Linus (and all Linux kernel coders)? It's theory vs. practice basically.
There is however the usual gotcha for not having compiler-level support and checks for common constructs/patterns: it's easier to use them wrong and introduce bugs without compile-time checks. Windows and Visual C++ but in C mode offer exception handling via SEH/VEH for this very reason: exceptions are useful even outside OOP languages, i.e. in a procedural language. But the compiler can't always save your bacon, even if it offers syntactic support for exceptions in the language. Consider as example of the latter case the famous Apple SSL "goto fail" bug, which just duplicated one goto with disastrous consequences (https://www.imperialviolet.org/2014/02/22/applebug.html):
if (something())
goto fail;
goto fail; // copypasta bug
printf("Never reached\n");
fail:
// control jumps here
使用编译器支持的异常也会出现同样的错误,例如在c++中:
struct Fail {};
try {
if (something())
throw Fail();
throw Fail(); // copypasta bug
printf("Never reached\n");
}
catch (Fail&) {
// control jumps here
}
But both variants of the bug can be avoided if the compiler analyzes and warns you about unreachable code. For example compiling with Visual C++ at the /W4 warning level finds the bug in both cases. Java for instance forbids unreachable code (where it can find it!) for a pretty good reason: it's likely to be a bug in the average Joe's code. As long as the goto construct doesn't allow targets that the compiler can't easily figure out, like gotos to computed addresses(**), it's not any harder for the compiler to find unreachable code inside a function with gotos than using Dijkstra-approved code.
(**) Footnote: Gotos to computed line numbers are possible in some versions of Basic, e.g. GOTO 10*x where x is a variable. Rather confusingly, in Fortran "computed goto" refers to a construct that is equivalent to a switch statement in C. Standard C doesn't allow computed gotos in the language, but only gotos to statically/syntactically declared labels. GNU C however has an extension to get the address of a label (the unary, prefix && operator) and also allows a goto to a variable of type void*. See https://gcc.gnu.org/onlinedocs/gcc/Labels-as-Values.html for more on this obscure sub-topic. The rest of this post ins't concerned with that obscure GNU C feature.
标准C(即未计算的)goto通常不是无法在编译时找到不可达代码的原因。通常的原因是如下所示的逻辑代码。鉴于
int computation1() {
return 1;
}
int computation2() {
return computation1();
}
对于编译器来说,在以下3种结构中找到不可访问的代码同样困难:
void tough1() {
if (computation1() != computation2())
printf("Unreachable\n");
}
void tough2() {
if (computation1() == computation2())
goto out;
printf("Unreachable\n");
out:;
}
struct Out{};
void tough3() {
try {
if (computation1() == computation2())
throw Out();
printf("Unreachable\n");
}
catch (Out&) {
}
}
(请原谅我使用了与大括号相关的编码风格,但我试图使示例尽可能紧凑。)
Visual c++ /W4(即使使用/Ox)也无法在这些类型中找到无法到达的代码,而且正如您可能知道的那样,寻找无法到达的代码的问题通常是无法确定的。(如果你不相信我的话:https://www.cl.cam.ac.uk/teaching/2006/OptComp/slides/lecture02.pdf)
As a related issue, the C goto can be used to emulate exceptions only inside the body of a function. The standard C library offers a setjmp() and longjmp() pair of functions for emulating non-local exits/exceptions, but those have some serious drawbacks compared to what other languages offer. The Wikipedia article http://en.wikipedia.org/wiki/Setjmp.h explains fairly well this latter issue. This function pair also works on Windows (http://msdn.microsoft.com/en-us/library/yz2ez4as.aspx), but hardly anyone uses them there because SEH/VEH is superior. Even on Unix, I think setjmp and longjmp are very seldom used.
2) I think the second most common use of goto in C is implementing multi-level break or multi-level continue, which is also a fairly uncontroversial use case. Recall that Java doesn't allow goto label, but allows break label or continue label. According to http://www.oracle.com/technetwork/java/simple-142616.html, this is actually the most common use case of gotos in C (90% they say), but in my subjective experience, system code tends to use gotos for error handling more often. Perhaps in scientific code or where the OS offers exception handling (Windows) then multi-level exits are the dominant use case. They don't really give any details as to the context of their survey.
编辑补充:这两种使用模式出现在Kernighan和Ritchie的C语言书的第60页左右(取决于版本)。另一件值得注意的事情是,这两个用例都只涉及forward goto。MISRA C 2012版(不像2004版)现在允许goto,只要它们是向前的。
我遇到过goto是一个很好的解决方案的情况,我在这里或其他地方都没有看到过这个例子。
我有一个开关的情况下,所有的情况都需要调用相同的函数在最后。我有其他的情况,最后都需要调用不同的函数。
这看起来有点像这样:
switch( x ) {
case 1: case1() ; doStuffFor123() ; break ;
case 2: case2() ; doStuffFor123() ; break ;
case 3: case3() ; doStuffFor123() ; break ;
case 4: case4() ; doStuffFor456() ; break ;
case 5: case5() ; doStuffFor456() ; break ;
case 6: case6() ; doStuffFor456() ; break ;
case 7: case7() ; doStuffFor789() ; break ;
case 8: case8() ; doStuffFor789() ; break ;
case 9: case9() ; doStuffFor789() ; break ;
}
我没有给每个case一个函数调用,而是用goto代替了break。goto跳转到一个标签,这也是在开关的情况下。
switch( x ) {
case 1: case1() ; goto stuff123 ;
case 2: case2() ; goto stuff123 ;
case 3: case3() ; goto stuff123 ;
case 4: case4() ; goto stuff456 ;
case 5: case5() ; goto stuff456 ;
case 6: case6() ; goto stuff456 ;
case 7: case7() ; goto stuff789 ;
case 8: case8() ; goto stuff789 ;
case 9: case9() ; goto stuff789 ;
stuff123: doStuffFor123() ; break ;
stuff456: doStuffFor456() ; break ;
stuff789: doStuffFor789() ; break ;
}
案例1到3都必须调用doStuffFor123(),类似的案例4到6必须调用doStuffFor456()等。
在我看来,如果你正确使用它们,gotos是非常好的。最后,任何代码都像人们写的那样清晰。有了gotos,你可以写出意大利面代码,但这并不意味着gotos是意大利面代码的原因。这个事业就是我们;程序员。如果我愿意,我也可以用函数创建意大利面条代码。宏也是如此。
goto不好的一个原因是,除了编码风格之外,你可以用它来创建重叠但非嵌套的循环:
loop1:
a
loop2:
b
if(cond1) goto loop1
c
if(cond2) goto loop2
这将创建一个奇怪的,但可能是合法的流控制结构,其中可能有(a, b, c, b, a, b, a, b, b,…)这样的序列,这让编译器黑客不高兴。显然,有许多聪明的优化技巧依赖于这种类型的结构不发生。(我应该检查一下我的龙书……)这样做的结果(使用一些编译器)可能是对包含gotos的代码没有进行其他优化。
如果你知道它只是“哦,顺便说一下”,恰好说服编译器发出更快的代码,那么它可能会很有用。就我个人而言,我更喜欢在使用像goto这样的技巧之前尝试向编译器解释什么是可能的,什么是不可能的,但可以说,我也可能在破解汇编程序之前尝试goto。