- cross-posted to:
- cpp@programming.dev
- cross-posted to:
- cpp@programming.dev
From Russ Cox
Lumping both non-portable and buggy code into the same category was a mistake. As time has gone on, the way compilers treat undefined behavior has led to more and more unexpectedly broken programs, to the point where it is becoming difficult to tell whether any program will compile to the meaning in the original source. This post looks at a few examples and then tries to make some general observations. In particular, today’s C and C++ prioritize performance to the clear detriment of correctness.
I am not claiming that anything should change about C and C++. I just want people to recognize that the current versions of these sacrifice correctness for performance. To some extent, all languages do this: there is almost always a tradeoff between performance and slower, safer implementations. Go has data races in part for performance reasons: we could have done everything by message copying or with a single global lock instead, but the performance wins of shared memory were too large to pass up. For C and C++, though, it seems no performance win is too small to trade against correctness.
Well… I partially agree with you. The final step in the failure-chain was the optimizer assuming that dereferencing NULL would have blown up the program, but (1) that honestly seems like a pretty defensible choice, since it’s accurate 99.999% of the time (2) that’s nothing to do with the language design. It’s just an optimizer bug. It’s in that same category as C code that’s mucks around with its own stack, or single-threaded code that has to have stuff marked
volatile
because of crazy pointer interactions; you just find complex problems sometimes when your language starts getting too close to machine code.I guess where I disagree is that I don’t think a NULL pointer dereference is undefined. In the spec, it is. In a running program, I think it’s fair to say it should dereference 0. Like e.g. I think it’s safe for an implementation of assert() to do that to abort the program, and I would be unhappy if a compiler maker said “well the behavior’s undefined, so it’s okay if the program just keeps going even though you dereferenced NULL to abort it.”
The broader assertion that C is a badly-designed language because it has these important things undefined, I would disagree with; I think there needs to be a category of “not nailed down in the spec because it’s machine-dependent,” and any effort to make those things defined machine-independently would mean C wouldn’t fulfill the role it’s supposed to fulfill as a language.