Fuzzing is a black-box software testing technique in which units of a software system are continuously tested against a stream of random data. Units can be methods, API endpoints, database interfaces, and so on. The unit is monitored for failures such as crashes or potential memory leaks.
Why the fuzz?
The main objective of fuzzing is to get the application to crash. It is not meant to test business logic.
Fuzzers work best for discovering vulnerabilities that can be exploited through buffer overflows, denial of service (DoS), cross-site scripting, and SQL injection. These schemes are often used by malicious hackers intent on wreaking the greatest possible amount of havoc in the least possible time. Fuzz testing is less effective at finding errors that do not crash the software.
Fuzzing often reveals serious defects that engineers usually overlook. When you fuzz test a unit of your software, you expose it to "unbiased" input that is not shaped by the assumptions developers tend to make. Teams maintaining large-scale projects, where these crashes are really costly, run fuzz tests continuously, either as a qualifier for release or against HEAD of the code under test independently of the release process, with proper reporting when a crash happens.
There are a few instrumentation techniques (sanitizers) that are commonly paired with fuzzing, and it is usually recommended to test your software against a few of them. The most popular are:
- Address sanitization (detects addressability issues)
- Leak sanitization (detects memory leaks)
- Thread sanitization (detects data races and deadlocks)
- Memory sanitization (detects use of uninitialized memory)
How do fuzzers work?
Before going over how fuzzers work, let's define seed data: a corpus of sample inputs that represents the skeletal structure of the data the software expects. Fuzzers use the seed corpus to derive new inputs to test the software against. While fuzzers usually don't need a seed corpus and can run without one, good seeds provide a range of possible skeletal structures, which lets the fuzzer work more efficiently and achieve coverage faster, especially when the structure of the data is complicated, for example a complex JSON object or a deeply nested protocol buffer.
Fuzzing engines offer different interfaces depending on the language. gofuzz, for example, offers an extensive interface for generating random structures and lets developers decide how to perform the fuzzing.
For example, this snippet uses gofuzz to generate an object whose fields are filled with random values.
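package main

import fuzz "github.com/google/gofuzz"

type MyType struct {
    A string
    B string
    C int
    D struct {
        E float64
    }
}

func main() {
    f := fuzz.New() // fuzzer with default settings
    object := MyType{}
    f.Fuzz(&object) // fills every field, including the nested D.E, with random values
}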
Because the library only generates the data, the fuzzing logic is up to you. For example, you can extend the previous snippet to run a function against 1000 random inputs.
for i := 0; i < 1000; i++ {
    f.Fuzz(&object)     // overwrite object with fresh random values
    FuncToTest(object)  // test FuncToTest against 1000 random inputs
}
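The loop above stops at the first panic without saying which input caused it. The sketch below shows one way to capture the crashing input. To stay free of external dependencies it swaps gofuzz for the standard library's testing/quick package, which can also generate random struct values. The FuncToTest body, its hidden assumption, and the helper names survives and fuzzFuncToTest are all hypothetical.

```go
package main

import (
	"fmt"
	"math/rand"
	"testing/quick"
)

type MyType struct {
	A string
	B string
	C int
	D struct {
		E float64
	}
}

// FuncToTest is a hypothetical unit under test. It assumes A is never
// empty, so it panics on the empty string.
func FuncToTest(o MyType) byte {
	return o.A[0]
}

// survives reports whether FuncToTest handled the input without panicking.
func survives(o MyType) (ok bool) {
	defer func() {
		if recover() != nil {
			ok = false
		}
	}()
	FuncToTest(o)
	return true
}

// fuzzFuncToTest runs FuncToTest against up to n random MyType values
// and returns the first crashing input, if any.
func fuzzFuncToTest(n int, seed int64) (MyType, bool) {
	cfg := &quick.Config{MaxCount: n, Rand: rand.New(rand.NewSource(seed))}
	err := quick.Check(survives, cfg)
	if ce, isCheckErr := err.(*quick.CheckError); isCheckErr {
		return ce.In[0].(MyType), true // In holds the arguments of the failing call
	}
	return MyType{}, false
}

func main() {
	if input, found := fuzzFuncToTest(1000, 1); found {
		fmt.Printf("crashing input: %+v\n", input)
	}
}
```

Recording the input that triggered the crash is what makes a fuzzing failure reproducible, since you can replay it as a regular regression test.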
Other fuzzers, such as LLVM's libFuzzer for C/C++, expose an entry-point function that the engine repeatedly calls with random data. You call your software units inside this function, passing the data along. The following example is from the official libFuzzer documentation. It crashes if the fuzzing engine passes HI! in data.
#include <stdint.h>
#include <stddef.h>

extern "C" int LLVMFuzzerTestOneInput(const uint8_t *data, size_t size) {
    if (size > 0 && data[0] == 'H')
        if (size > 1 && data[1] == 'I')
            if (size > 2 && data[2] == '!')
                __builtin_trap();  // deliberate crash once the input starts with "HI!"
    return 0;
}
As per the docs, if you build and run this using clang -fsanitize=address,fuzzer file_name.cc && ./a.out, the fuzzer will catch this and crash very quickly. This showcases how powerful fuzzers are at catching even such a specific corner case. You can use LLVMFuzzerTestOneInput to test your own code by calling it inside this function.
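#include <stdint.h>
#include <stddef.h>

// Code under test, defined elsewhere; this signature is assumed for illustration.
void FuncToTest(const uint8_t *Data, size_t Size);

extern "C" int LLVMFuzzerTestOneInput(const uint8_t *Data, size_t Size) {
    FuncToTest(Data, Size);
    return 0;
}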
BONUS: check out this tutorial on how to reproduce the Heartbleed vulnerability in OpenSSL 1.0.1 using libFuzzer.