The root cause of many vulnerabilities is memory corruption from software written in memory-unsafe languages like C and C++.
In C and C++, the responsibility for preventing memory corruption errors is left to the developer, and these errors are often hard to catch. You fix one memory vulnerability and another one appears.
We can eliminate these vulnerabilities by moving software away from C and C++.
This is why the National Security Agency (NSA), in guidance published in November 2022, and OpenSSF's 10-point mobilization plan advise and encourage organizations to move away from memory-unsafe languages.
OpenSSL, a crucial C software library found in Linux that implements the TLS protocol, is a good case study in memory vulnerabilities. It has suffered from a number of vulnerabilities due to memory corruption, including:
- The Heartbleed bug, found in 2014, allowed anyone on the Internet to read the memory of systems protected by vulnerable versions of OpenSSL.
- In November 2022, two high-severity vulnerabilities caused by buffer overflows were disclosed.
The NSA recommends C#, Rust, Go, Java, Ruby and Swift as possible alternatives to C and C++.
An important feature of all the memory-safe languages that the NSA recommends is that they have a package manager. Cloudsmith's cloud-native artifact repository supports over 28 package formats, including those for Go, Rust, C#, Java and Ruby, to help you with the transition to a memory-safe language.
Let's talk about the following:
- Memory vulnerabilities in C and C++
- Memory-safe languages: who is the obvious successor to C and C++?
- How do we transition legacy systems from C and C++?
- Conclusion
Memory vulnerabilities in C and C++
In C and C++, memory for global and static variables is laid out at compile time, and the stack provides automatic storage for local variables. Dynamic memory is allocated on the heap at runtime when the amount needed is not known in advance, e.g. when reading in data. C and C++ developers need to manage dynamic memory themselves by allocating it, deallocating it and checking for potential memory issues.
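To make the stack/heap distinction concrete, here is a minimal C++ sketch. The function name sum_of_squares is hypothetical and only serves the illustration; the point is that the heap buffer has to be checked after allocation and released by hand.

```cpp
#include <cstddef>
#include <new>

// Hypothetical helper: sums 1*1 + 2*2 + ... + n*n using a heap buffer
// whose size is only known at runtime. Returns -1 if allocation fails.
long sum_of_squares(std::size_t n) {
    long total = 0;  // local variable: automatic storage on the stack

    // Dynamic allocation on the heap. The nothrow form returns nullptr on
    // failure instead of throwing, so the result must be checked.
    long* squares = new (std::nothrow) long[n];
    if (squares == nullptr) {
        return -1;
    }

    for (std::size_t i = 0; i < n; ++i) {
        squares[i] = static_cast<long>((i + 1) * (i + 1));
        total += squares[i];
    }

    delete[] squares;  // the developer must release the memory, exactly once
    return total;
}
```

Forgetting the delete[] leaks memory, and calling it twice or using squares afterwards corrupts memory; none of these mistakes is rejected by the language itself.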
Managing your memory and assigning data is where memory vulnerabilities can occur.
Buffer overflows
Buffer overflows are the most common type of memory corruption in C and C++ and are one of the top vulnerabilities used by attackers.
A buffer overflow occurs when you allocate a buffer of a certain size but then write more data into it than it has room for. This causes the program to write past the end of the buffer or array and corrupt adjacent memory. This vulnerability can be triggered under certain circumstances, e.g. unexpected user input.
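As a rough sketch of what guarding against this looks like, the hypothetical helper below (copy_if_fits is an illustrative name, not a standard function) compares the input length against the buffer size before copying anything.

```cpp
#include <cstddef>
#include <cstring>

// Hypothetical helper: copies a C string into a fixed-size buffer only if it
// fits, including the terminating '\0'. Returns false otherwise.
bool copy_if_fits(char* dst, std::size_t dst_size, const char* src) {
    if (dst == nullptr || src == nullptr || dst_size == 0) {
        return false;
    }
    const std::size_t needed = std::strlen(src) + 1;  // +1 for the terminator
    if (needed > dst_size) {
        dst[0] = '\0';  // leave the buffer as an empty string
        return false;   // an unchecked strcpy(dst, src) would overflow here
    }
    std::memcpy(dst, src, needed);
    return true;
}
```

With an 8-byte buffer, "hello" is copied, while a 12-character input is rejected instead of overwriting whatever sits after the buffer.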
A buffer overflow attack exploits the buffer overflow vulnerability. This attack can lead to the following:
- unauthorized memory being read in an information leak attack, e.g. Heartbleed in OpenSSL,
- memory being changed,
- program code or control flow being modified in a code corruption or control-flow hijack attack, or
- the system crashing.
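The information leak case can be sketched in a few lines. This is a simplified illustration of the Heartbleed-style pattern, not OpenSSL's actual code, and echo_payload is a hypothetical name: the peer sends a payload together with the length it claims the payload has, and the fix is to refuse any claim that exceeds what was actually received.

```cpp
#include <cstddef>
#include <string>

// Hypothetical echo handler: returns the first claimed_len bytes of the
// payload, or an empty string when the claim does not match reality.
std::string echo_payload(const std::string& payload, std::size_t claimed_len) {
    // The vulnerable pattern trusts claimed_len and copies that many bytes,
    // reading whatever happens to sit in memory after the payload.
    if (claimed_len > payload.size()) {
        return std::string();  // reject instead of over-reading
    }
    return payload.substr(0, claimed_len);
}
```

A request that sends "bird" but claims 500 bytes gets nothing back, rather than 496 bytes of neighbouring memory.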
Protections in C/C++ without moving to another language
There are a number of things that can reduce the risk of memory corruption in C and C++, including:
- Checking that the allocation of memory succeeded.
- Boundary checking.
- Allocating memory dynamically rather than statically when you don't know the size needed.
- Defensive programming, where you always assume the worst by checking any input, including return values from functions.
- Using managed buffers and strings rather than raw arrays and pointers in C++.
- Using bounded functions that help prevent buffer overflows, e.g. strncat instead of strcat.
- Following the C++ Core Guidelines, which advise against using new directly to create dynamic objects in favor of smart pointers: make_unique for single ownership and make_shared for reference-counted shared ownership.
- Using heap scanning technologies to improve the memory safety of C++.
- Enabling operating system and compiler defenses like ASLR and stack canaries.
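A few of these mitigations can be sketched together in C++14 or later. The Session type and the function names are hypothetical and exist only for this illustration; std::vector::at, std::make_unique and std::make_shared are standard library facilities.

```cpp
#include <cstddef>
#include <memory>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical type used only for this illustration.
struct Session {
    std::string user;
};

// Boundary checking with a managed buffer: vector::at throws
// std::out_of_range instead of reading past the end.
int read_or_default(const std::vector<int>& values, std::size_t index, int fallback) {
    try {
        return values.at(index);
    } catch (const std::out_of_range&) {
        return fallback;
    }
}

// Single ownership: the Session is freed automatically when the
// unique_ptr goes out of scope, with no explicit delete.
std::unique_ptr<Session> open_session(const std::string& user) {
    return std::make_unique<Session>(Session{user});
}

// Shared ownership: the object lives until the last shared_ptr is gone.
long owners_after_copy() {
    auto first = std::make_shared<Session>(Session{"alice"});
    auto second = first;       // reference count goes from 1 to 2
    return first.use_count();  // both pointers are still alive here
}
```

These habits reduce risk, but they are opt-in: nothing stops another part of the codebase from indexing with operator[] or calling new and delete directly.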
Why mitigations against memory vulnerabilities aren't enough
Most of the responsibility for preventing memory corruption errors is left to the developer in C and C++, and defenses in the compiler and kernel, e.g. ASLR and stack canaries, can be circumvented by attackers. If you are lucky, the program will crash during testing and expose the memory issue, but these bugs can go under the radar, only to be found by attackers once the software is running in production.
On top of this, there are difficulties in C++ that prevent it from improving its memory safety including tech debt and its commitment to backward compatibility.
Memory corruption in C or C++ has gone on for more than 30 years and continues to be one of the top vulnerabilities in real-world exploits. This is what has driven the NSA to encourage organizations to shift to using memory-safe languages.
Memory-safe languages
Languages such as C#, Go, Java, Ruby, Rust and Swift use built-in safety mechanisms that minimize the likelihood of memory corruption vulnerabilities like buffer overflows.
This doesn't mean that programs developed in these languages have no security vulnerabilities, but it does eliminate a whole class of them.
Interpreted vs Compiled
Languages like Java, C#, Python and Ruby are commonly described as interpreted languages. The source code is not translated directly into machine code for the target machine. Instead, it is first compiled into an intermediate bytecode, which is then interpreted (or just-in-time compiled) by a virtual machine (VM).
Well-written C and C++ code will generally run faster than well-written code in interpreted languages mostly because they are statically-typed languages compiled to machine code.
Go and Rust are both performant, open-source, compiled languages that are memory-safe alternatives to C/C++.
The garbage collector
The garbage collector manages the allocation and release of memory for an application. In garbage-collected languages (e.g. C#, Go, Java and Ruby) developers don't have to write code to perform memory management tasks. Swift achieves a similar result with automatic reference counting rather than a tracing garbage collector.
C and C++ don't use garbage collection for memory management, which improves performance at the expense of security and usability.
The Rust language does not perform garbage collection and instead uses a complex ownership and type system. This allows it to be more efficient and performant than other memory-safe languages but at the expense of usability.
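The core idea of single ownership can be loosely approximated in C++ with std::unique_ptr. This is an analogy rather than Rust's actual mechanism, and the function names are hypothetical: Rust rejects any use of a value after it has been moved at compile time, whereas C++ compiles the code and leaves an empty pointer that must be checked at runtime.

```cpp
#include <cstddef>
#include <memory>
#include <string>
#include <utility>

// Hypothetical function that takes ownership of the string it is given.
std::size_t consume(std::unique_ptr<std::string> text) {
    return text->size();
}  // text is destroyed here, freeing the string

// Returns true if the original pointer is empty after ownership was moved.
bool moved_from_is_empty() {
    auto message = std::make_unique<std::string>("hello");
    consume(std::move(message));  // ownership transfers to consume()
    return message == nullptr;    // guaranteed for unique_ptr after a move
}
```

Dereferencing message after the move would still compile in C++ and crash or misbehave at runtime, which is the kind of mistake Rust's ownership rules are designed to rule out before the program runs.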
Who is C/C++'s successor? Is it all about speed?
Speed isn't the be-all and end-all in programming.
Optimizations in interpreted languages, powerful chips, cheap infrastructure and the development of performant languages have reduced the need for the speed that C and C++ can achieve. Security, usability, team experience and development time are often more important considerations.
Let's hear about some of the contenders for the C/C++ throne:
- Java and C# are pretty fast and have made significant speed gains in the last few years. On top of that, Java and C# have comprehensive tooling, which leads to a really impressive developer workflow: around 15 to 16 hours to merge a PR, compared to the average of 40 hours. The study behind those figures also showed that C++ programs took on average 140 hours to merge a PR.
- Go is a good all-rounder with speed and ease of use. Go compiles to machine code, has more modern features, simpler syntax, and is easier to write than C/C++.
- But if we are talking about applications where speed is the priority, Rust is a memory-safe language that is nearly always faster than Go and performs comparably to C++, but it comes with a steep learning curve.
How do we transition legacy code from C and C++?
There are lots of alternatives for developing new software in memory-safe languages but migrating billions of lines of legacy C/C++ code is a huge task.
Carbon
Google launched a new programming language called Carbon. It’s an experimental successor to C++. Carbon is not fully memory safe but it has a path toward achieving robust memory safety.
The primary purpose of the Carbon language is to have performance similar to C++, bi-directional interoperability with C++ and migration from existing C++ code.
Carbon folk actually recommend using Rust if it is possible to do so. Many large C++ codebases use features incompatible with Rust, making the migration process difficult. This is where Carbon makes sense.
Incrementally move off C++/C
A likely path off memory-unsafe programs is to find the most problematic and popular C/C++ programs and replace them one at a time.
Rust is now in the Linux kernel, which opens up a lot of possibilities for moving off C and C++ code. One possibility is to replace OpenSSL, a C library that implements the TLS and SSL protocols and has suffered from a number of memory vulnerabilities, with Rustls, which is written in Rust.
Conclusion
C and C++ lack guardrails around memory, which has led to too many exploitable memory vulnerabilities over the last 30 years. These bugs can be eliminated by moving to memory-safe languages like those the NSA recommends: C#, Rust, Go, Java, Ruby and Swift.
Migrating billions of lines of legacy C/C++ code will be a huge task but pathways exist to make this happen. Who knows, maybe the new Carbon language will develop some nice tools for migrating some of this code without much intervention.
The march has already begun: worries about security, fewer skilled C/C++ engineers, performant alternatives and trends in software development mean there will be less and less C and C++ code in the world over the next decade.
Cloudsmith supports memory-safe languages
An important feature of all the memory-safe languages that the NSA recommends is that they have a package manager. Cloudsmith's cloud-native artifact repository supports over 28 package formats, including those for Go, Rust, C#, Java and Ruby, to help you with the transition.