-
Notifications
You must be signed in to change notification settings - Fork 337
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix GCC 15 Wunterminated-string-initialization issues #1543
base: master
Are you sure you want to change the base?
Conversation
ac000
commented
Jan 21, 2025
•
edited
Loading
edited
fe37226
to
a365f98
Compare
a365f98
to
72ea5ce
Compare
72ea5ce
to
98f11e6
Compare
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry that I’m unable to check if the change is reasonable.
As I understand, we introduce nxt_str_t to handle string with a non traditional way, that it explicitly specify its length. In this case, the stuff like hex is an array of u_char/uint8 type, I think they are not entirely strings whose ending char ‘\0’ is used.
I’d like to ask team’s review.
Thanks.
Taking this as an example static const u_char hex[16] = "0123456789ABCDEF"; That declares a non-NUL terminated string (by loosing the terminating You know that these are usually used as translation table type things and is really being treated as just an array of characters. So assuming we want to keep this new warning enabled. The two main options are Make it NUL terminated, which this patch does. (As you say ) It doesn't matter that it's NUL terminated or not (other than using an extra byte) as we only ever access the array indices 0-15 and never the NUL byte. This is what nginx does. The other option would be to define it as a character array like static const u_char hex[16] = { '0', '1', '2', '3', '4', '5', '6', '7',
'8', '9', 'A', 'B', 'C', 'D', 'E', 'F' }; I chose the first option for most of the cases as it's simpler and they don't matter being NUL terminated. You can see I took the second option for a couple of them where it does matter that they are non-NUL terminated. I suppose you could maybe declare it as a nxt_str_t... OK, lets stop there... (besides that's just going to give you a pointer to a NUL terminated string literal, which is no better than char [] and char [] better describes its purpose...) |
When building with the upcoming GCC 15, we are hit by the new Wunterminated-string-initialization compiler warning (however, due to -Werror, this is a build failure). E.g. src/nxt_string.c: In function ‘nxt_encode_uri’: src/nxt_string.c:601:36: error: initializer-string for array of ‘unsigned char’ is too long [-Werror=unterminated-string-initialization] 601 | static const u_char hex[16] = "0123456789ABCDEF"; | ^~~~~~~~~~~~~~~~~~ We have created a character array containing 16 elements, but is not NUL terminated. While this type of (lookup/translation table) use case is perfectly valid and we could simply disable the warning, it is probably still a useful warning to have enabled. There is a 'nonstring' attribute, but that doesn't influence this warning. Well not yet at least... but that looks to change! Lets just do what nginx does and create them NUL terminated. Except in src/nxt_http_parse.c where they *really* want to be non-NUL terminated and we use *arrays* of chars. Link: <https://gcc.gnu.org/onlinedocs/gcc-14.2.0/gcc/Common-Variable-Attributes.html> Link: <https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117178#c21> Signed-off-by: Andrew Clayton <[email protected]>
98f11e6
to
36d6901
Compare
|
To minimize it, it’s d discussion about if the below code is good or not. static const u_char hex[16] = "0123456789ABCDEF"; Personally I prefer to keep the explicit [16] because:
Regarding string initialization: The original syntax fits this lookup table use case. GCC 15's warning is a bit overzealous here. We don't actually need this string to be NUL-terminated since it's only accessed through 0-15 indices. Removing [16] just to satisfy the compiler warning actually obscures the code's intent. If we really need to handle this warning, I'd prefer to suppress it with a compiler directive rather than change the code structure. Welcome to others suggestions on it. It’s also a discussion if we need to support GCC 15 and the new compile option. |
This fails to build (due to -Werror) with GCC 15 (I.e. the Fedora Rawhide GCC CI check, the Clang check is failing due to Ruby 3.4 API deprecations)... which is the whole point of this PR...
No, I know we don't need it to be NUL terminated, if we did then we had bigger problems. Making it NUL terminated is harmless however... and is what the nginx folks did. nginx actually builds fine with GCC 15 (with -Wextra).
Yes, I agree with that...
Do we need to support GCC 15?, err yes... why wouldn't we? We support all versions of Clang/GCC that support -std=gnu11 Seeing as there is work in progress to make the |
I thought the PR is only because of the option -Wu-s-i* support in GCC 15. |
Correct. |
While I'm comfortable with the current approach, I'll stay neutral as others might prefer the new GCC |
Well, we'll need to do something... GCC 15 will land in Fedora 42 in April, but then even before that it is currently stopping the Fedora Rawhide GCC CI from running... |
static const u_char normal[256] nxt_aligned(64) = | ||
/* The last '\0' is not needed because the string is NUL terminated */ | ||
static const u_char normal[] nxt_aligned(64) = |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hi @ac000 ,
That comment is incorrect:
/* The last '\0' is not needed because the string is NUL terminated */
Even though this is a "string literal", it is not a string. The standard prohibits having NUL bytes in the middle of a string, and this is thus not a string. See footnote 75 in C23 (n3220): https://www.open-std.org/jtc1/sc22/wg14/www/docs/n3220.pdf#subsection.6.4.5.
75)A string literal may not be a string (see 7.1.1), because a null character can be embedded in it by a \0 escape sequence.
So I would disable the diagnostic. This change (I'm referring specifically to this hunk, and not the entire patch; most of the patch, I like it) is bad for readability, IMO. Instead, I'd take the opportunity to fix the few places where it wouldn't hurt to adapt to the warning, but then disable it until [[gnu::nonstring()]]
allows you to re-enable it.
Cheers,
Alex
Cc: @thesamesam
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would add the [[gnu::nonstring]]
attribute already, so that later you only need to re-enable the diagnostic. That would already enhance readability.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hi @ac000 ,
That comment is incorrect:
/* The last '\0' is not needed because the string is NUL terminated */
Admittedly I just yanked that comment out of nginx, but In this case I would take string to just mean "array of characters".
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hi @ac000 ,
That comment is incorrect:/* The last '\0' is not needed because the string is NUL terminated */
Admittedly I just yanked that comment out of nginx,
Ughhh. Maybe tell them that their comment is actively harmful?
but In this case I would take string to just mean "array of characters".
Yeah, but I this one is really too obscure. The new code feels like a hybrid between a string and a nonstring. And I'd say the [256]
was actually informative, since it's not obvious to see how many zeros there are.
"the string is NUL terminated" reminds just so much to a proper string, that it will most likely confuse someone at some point.
I'd keep the old version here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
BTW, can you point to the comment in nginx? I can't find it with grep.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Prior related commits:
(sounds like this confused Igor himself.)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In this context, I would assume "string" just means "string of characters" where string is synonymous with 'series', like you might say "string of robberies".
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In this context, I would assume "string" just means "string of characters" where string is synonymous with 'series', like you might say "string of robberies".
I'd use "array" instead of string. Or "string literal", if you want. But using string seems inducing confusion.
Maybe I'm too much of a pedantic regarding wording issues; that's my job. :-)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Heh, if we're being pedantic then I think the word would be pedant.
Anyway, I'm probably just going to close this PR and simply disable the warning for now, and then look at sprinkling __attribute__((nonstring))
's (nicer named) around when Kees' patch makes it in, maybe GCC 16, maybe even GCC 15, which would be prefect, or we may be sitting here in two years time still waiting for it to land. Never can tell... (No, think positive thoughts...)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Heh, if we're being pedantic then I think the word would be pedant.
Lol. True.
Anyway, I'm probably just going to close this PR and simply disable the warning for now,
I do like that one, however: #1543 (comment)
I'd do that change, and then turn the diagnostic off.
and then look at sprinkling
__attribute__((nonstring))
's (nicer named) around when Kees' patch makes it in, maybe GCC 16, maybe even GCC 15, which would be prefect, or we may be sitting here in two years time still waiting for it to land. Never can tell... (No, think positive thoughts...)
:-)
static const nxt_http_ver_t http11 = { "HTTP/1.1" }; | ||
static const nxt_http_ver_t http10 = { "HTTP/1.0" }; | ||
static const nxt_http_ver_t http11 = | ||
{{ 'H', 'T', 'T', 'P', '/', '1', '.', '1' }}; | ||
static const nxt_http_ver_t http10 = | ||
{{ 'H', 'T', 'T', 'P', '/', '1', '.', '0' }}; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd keep this as it was. It was much more readable. I'd make it a [[gnu::nonstring]], and wait for a future compiler to re-enable the diagnostic.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh, it looks hideous, but it meant we could keep the warning enabled... but then that whole nxt_http_ver_t thing is only used in that function and it seems to be purely to allow for doing a string comparison using unit64_t's...
static const u_char hexadecimal[16] = "0123456789abcdef"; | ||
static const u_char HEXADECIMAL[16] = "0123456789ABCDEF"; | ||
static const u_char hexadecimal[] = "0123456789abcdef"; | ||
static const u_char HEXADECIMAL[] = "0123456789ABCDEF"; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This one I like it. Everything that need not be a nonstring, should be a string. Everybody knows hexadecimal enough that the lost information is negligible.