2023-06-15 @rrobin
GCC static analysis has come a long way in the last few releases, courtesy of the -fanalyzer feature. However, it still does not stack up against compilers for other languages. Let's go over the current pitfalls and how you can address them using GCC trickery.
To get us started, we can compile a piece of code using
gcc -Wall -Werror -fanalyzer file.c
Let's say we have this function f ([static 1] is the standard way to say ptr is not NULL):
int f(int ptr[static 1]) { return *ptr; }
Trying to compile something obviously wrong such as
void t() { f(NULL); }
will result in an error from fanalyzer with a detailed explanation that you are dereferencing NULL inside f.
This will fail too, because malloc may return NULL and because the memory behind ptr is never initialized:
void t2() { int *ptr = malloc(sizeof(int)); f(ptr); free(ptr); }
Wrap the code in an if statement and initialize the value:
void t3() { int *ptr = malloc(sizeof(int)); if (ptr) { *ptr = 42; f(ptr); free(ptr); } }
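For contrast, here is a small sketch of callers that satisfy f's contract without any NULL check, because the storage lives on the stack. call_f_on_stack is a made-up name for illustration, not something from the tests above.

```c
/* Same f as above: [static 1] promises at least one valid int. */
int f(int ptr[static 1]) { return *ptr; }

/* Hypothetical caller: stack storage is never NULL and is initialized here. */
int call_f_on_stack(void) {
    int value = 42;
    int values[3] = {7, 8, 9};
    /* f reads only the first element of whatever it is given. */
    return f(&value) + f(values);
}
```

Both calls hand f a pointer to initialized memory, so there is nothing for a NULL or uninitialized-value check to complain about.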
Let's say you have a struct S which holds a pointer, and a function that dereferences that pointer. In this example memset sets the pointer to NULL, but fanalyzer fails to catch it: it considers the memory initialized, and get_c will dereference the NULL pointer.
struct S { char *c; };
char get_c(struct S s[static 1]) { return *(s->c); }
void test_get_c() {
    struct S *ptr = malloc(sizeof(struct S));
    if (ptr) {
        memset(ptr, 0, sizeof(struct S));
        get_c(ptr);
        free(ptr);
    }
}
If the struct is allocated on the stack you also have C inline struct initialization.
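A minimal sketch of that, assuming the same struct S and get_c as above; first_char_on_stack is a hypothetical helper. The designated initializer writes the c field explicitly, so no memset is involved.

```c
struct S { char *c; };

char get_c(struct S s[static 1]) { return *(s->c); }

/* Hypothetical helper: builds S on the stack with a designated initializer. */
char first_char_on_stack(void) {
    char text[] = "hi";
    struct S s = { .c = text };  /* c points at valid stack memory */
    return get_c(&s);
}
```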
You can try to block some functions from being used by redeclaring them with GCC attributes.
Here is an example for memset; any program that calls memset will then fail to compile.
#ifndef TEST_H
#define TEST_H
#include <stddef.h> /* for size_t */
__attribute__((error("Do NOT use memset, it prevents fanalyzer from checking pointers")))
void *memset(void *, int, size_t);
#endif
We can also force the inclusion of this header using the -include option of gcc:
gcc -include test.h -Wall -Werror -fanalyzer file.c
However, fanalyzer will fail to spot errors if we replace malloc with something else, for example getenv:
char g(char ptr[static 1]) { return *ptr; } void t4() { char *p = getenv("DOESNOTEXIST"); g(p); }
I don't have a good solution for this one. We could again do a redefine trick by adding this to our header
/// A memory deallocator that does nothing
void leak(void *ptr) {}
__attribute__ ((malloc, malloc (leak, 1)))
char* getenv(const char*);
This would mark getenv as a memory allocator, and fanalyzer would now report the proper error. But we also need to call a dummy deallocator that does nothing.
void t4() { char *p = getenv("DOESNOTEXIST"); g(p); leak(p); }
and now we can add an if statement so compilation works:
if (p) { g(p); leak(p); }
The dummy leak function is there to satisfy the compiler, because now it considers getenv to be a memory allocator.
I think getenv returns some static memory and should never be deallocated, so the ideal annotation would mark a function as returns_null, but I'm abusing the malloc attribute for this.
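One alternative that avoids pretending getenv is an allocator is a small wrapper that never returns NULL. This is only a sketch: getenv_or is a hypothetical name, and while returns_nonnull and nonnull are real GCC attributes, how much fanalyzer makes of them here is untested.

```c
#include <stdlib.h>

/* Hypothetical wrapper: returns the variable's value, or fallback when the
 * variable is not set. The result is never NULL because fallback must not be. */
__attribute__((returns_nonnull, nonnull(1, 2)))
const char *getenv_or(const char *name, const char *fallback) {
    const char *value = getenv(name);
    return value ? value : fallback;
}
```

A caller such as getenv_or("DOESNOTEXIST", "") always gets a valid string back, and there is no dummy deallocator to call because nothing is marked as an allocation.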
Let's expand on the previous example by adding an allocator function for S:
struct S* allocate_S() { struct S *ptr = malloc(sizeof(struct S)); if (ptr) { ptr->c = NULL; } return ptr; }
This will not compile
void test_allocate_S_0() { struct S* s = allocate_S(); }
because the variable s is not freed.
error: leak of ‘s’ [CWE-401] [-Werror=analyzer-malloc-leak]
Adding a call to free() addresses that. But once we also call get_c(),
void test_allocate_S_1() { struct S* s = allocate_S(); get_c(s); free(s); }
compilation fails with multiple errors (s can be NULL), so let's wrap that code:
if (s) { get_c(s); free(s); }
That fails exactly like I would expect, because s->c can be NULL
error: dereference of NULL ‘0’ [CWE-476] [-Werror=analyzer-null-dereference]
and the only way to avoid it is to wrap get_c in a check,
if (s->c) { get_c(s); }
But this may not be what you want. Maybe the allocator could set c too?
Here is a different allocation function. Notice the check for s->c prior to the call to get_c is gone:
struct S* allocate_S_with_c(char *c) {
    struct S *ptr = malloc(sizeof(struct S));
    if (ptr) {
        ptr->c = c;
    }
    return ptr;
}
void test_allocate_S_with_c_0() {
    struct S *s = allocate_S_with_c("static str");
    if (s) {
        get_c(s);
    }
    free(s);
}
But this one will fail, again because s->c is NULL
void test_allocate_S_with_c_1() { struct S *s = allocate_S_with_c(NULL); if (s) { get_c(s); } free(s); }
Here is an interesting case where c is memory from malloc:
void test_allocate_S_with_c_2() { char *c = malloc(1); struct S *s = allocate_S_with_c(c); if (s) { get_c(s); } free(s); }
This fails with two errors: s->c can be NULL because malloc may fail, and the memory allocated for c is leaked.
error: dereference of possibly-NULL ‘c’ [CWE-690] [-Werror=analyzer-possible-null-dereference]
error: leak of ‘c’ [CWE-401] [-Werror=analyzer-malloc-leak]
Exactly as intended.
Finally, let's create an allocator that allocates c from the heap.
struct S* allocate_S_with_c_from_heap() {
    char *c = malloc(1);
    if (!c) {
        return NULL;
    }
    struct S *ptr = malloc(sizeof(struct S));
    if (ptr) {
        ptr->c = c;
        return ptr;
    } else {
        free(c);
        return NULL;
    }
}
But now this will fail, because the s->c memory will leak:
void test_allocate_S_with_c_from_heap() { struct S *s = allocate_S_with_c_from_heap(); free(s); }
The error is a bit funny in that it mentions unknown:
error: leak of ‘<unknown>’ [CWE-401] [-Werror=analyzer-malloc-leak]
We could release this manually, or maybe write a dedicated function to free the struct.
if (s) { free(s->c); } free(s);
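A dedicated free function could look roughly like this. It is a sketch, not something from the tests above: free_S is a hypothetical name, and the allocator is repeated only so the block stands on its own.

```c
#include <stdlib.h>

struct S { char *c; };

/* Allocator from the text: both S and its c field come from the heap. */
struct S *allocate_S_with_c_from_heap(void) {
    char *c = malloc(1);
    if (!c) {
        return NULL;
    }
    struct S *ptr = malloc(sizeof(struct S));
    if (ptr) {
        ptr->c = c;
        return ptr;
    }
    free(c);
    return NULL;
}

/* Hypothetical destructor: releases the owned c first, then the struct.
 * Like free(), it accepts NULL and does nothing in that case. */
void free_S(struct S *s) {
    if (s) {
        free(s->c);
    }
    free(s);
}
```

Callers then pair allocate_S_with_c_from_heap() with free_S(s) instead of remembering to free s->c by hand.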
It is still rough around the edges with regard to libc functions, but definitely getting better. This was tested with gcc 12, so I have not seen the new goodies from gcc 13.