TIL: C++ character constants aren’t created equal

arrPtrTerm

When reviewing a patchset involving files, strings, timers, and multiple threads (essentially all the “tough to review” checkboxes right there), a comment from :froydnj caught my eye:

> +const char* kPersistenceFileName = "gv_measurements.json";
> +const char* kPersistenceFileNameNoExt = "gv_measurements";

Please make these `const char kPersistence...[] = "..."`, which is slightly smaller on all of our usual systems.

Why? Aren’t C arrays just pointers to the first element? Don’t we treat char[] and char* identically when logging or appending them to higher-order string objects?

Influenced by :Dexter and :jan-erik, I tried this out myself. I made two small C programs arr.c and ptr.c where the only difference was that one had a const char* and the other a const char[].

I compiled them both with gcc and then checked their sizes in bytes.arrPtrSizes

Sure enough, though ptr.c was smaller by one byte (* is one character and [] is two), after being compiled ptr was larger by a whole 8 bytes!

This is because C arrays aren’t identical to pointers to the first element. Instead they are identifiers that are (most of the time) implicitly converted to be pointers to the first element. For instance, if you’re calling a function void foo(char c[]), inside the function `c` is implicitly a char* and its size is that of a pointer, but outside of the function c could be an array with its size being the length of the array. As an example:

void foo(char c[]) {
  printf("sizeof(c): %d\n", sizeof c);
}
int main(void) {
  char arr[] = "a";
  printf("sizeof(arr): %d\n", sizeof arr);
  return 0;
}

Prints:

sizeof(arr): 2
sizeof(c): 8

Another way to think about this is that the char* form allocates two things: the actual string of four (“abc” plus “\0”) characters, and a pointer called ptr that contains the address of the place the ‘a’ character is stored.

This is in contrast to the char[] form which allocates just the four-character string and allows you to use the name arr to refer to the place the ‘a’ character is stored.

So, in conclusion, if you want to save yourself a pointer’s width on your constant strings, you may wish to declare them as char[] instead of char*.

:chutten

(( :glandium wrote a much deeper look at this problem if you’d like to see how much worse this can actually get thanks to relocations and symbol exporting ))

(( :bsmedberg wrote in to mention that you almost never want const char* for a constant anyway as the pointer isn’t const, only the data is. ))

Advertisements