NIST has recently published eleven recommendations for software verification in response to Executive Order (EO) 14028, Improving the Nation’s Cybersecurity, 12 May 2021. The document is titled “Guidelines on Minimum Standards for Developer Verification of Software.”

This long-awaited update provides direction to software vendors on how they can improve the security and reliability of their software, or of any of their products that use software.

Among these industry-recognised recommendations is “run a fuzzer.” GUARDARA appreciates the recognition of fuzz testing as an efficient testing method to uncover bugs and vulnerabilities in the software development lifecycle. The inclusion of fuzzing as a technique recommended by NIST is an exciting indication of the maturation of fuzzing technology as a defensive rather than purely offensive testing solution.

GUARDARA is a negative testing platform designed for custom and proprietary protocols that empowers your security team to find zero-days more easily and quickly. In addition to simplifying fuzzing, GUARDARA can also be used to improve other forms of testing, such as Black Box Tests and Historical Tests; we explain all of this below.

Fuzz Testing

What is fuzz testing? 

Fuzz testing is the automated or semi-automated process of generating specially crafted test cases to trigger some unintended behavior in software. Fuzz testing is also called negative testing. 


Test Type     | Input                                    | Goal
Unit testing  | Static, valid sample(s)                  | See if it works as expected
Fuzz testing  | Dynamically generated, specially crafted | See when it breaks

Table 1: Basic Comparison of Unit Testing and Fuzz Testing

Compared to other testing approaches, such as unit testing, where we provide valid test data to the target under test, fuzz testing is about providing unexpected data. The goal is to detect under what circumstances the test target breaks, or behaves in an unintended way.
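To make the contrast concrete, here is a minimal sketch of a mutation-based fuzzer in Python. It is an illustration only, not how GUARDARA works internally: the target `parse_message`, its message layout, and the helper names `mutate` and `fuzz` are all hypothetical, and the missing bounds check is planted deliberately so the fuzzer has something to find.

```python
import random


def parse_message(data: bytes) -> bytes:
    """Hypothetical target. Layout: [length][payload ...][checksum]."""
    if not data:
        raise ValueError("empty message")
    length = data[0]
    payload = data[1:1 + length]
    checksum = data[1 + length]  # deliberate bug: no bounds check
    if sum(payload) % 256 != checksum:
        raise ValueError("bad checksum")
    return payload


# A static, valid sample, as a unit test would use it.
VALID_SAMPLE = bytes([3]) + b"abc" + bytes([sum(b"abc") % 256])


def mutate(sample: bytes, rng: random.Random) -> bytes:
    """Overwrite, insert, or delete one byte of a non-empty sample."""
    data = bytearray(sample)
    pos = rng.randrange(len(data))
    operation = rng.randrange(3)
    if operation == 0:
        data[pos] = rng.randrange(256)
    elif operation == 1:
        data.insert(pos, rng.randrange(256))
    else:
        del data[pos]
    return bytes(data)


def fuzz(target, sample: bytes, iterations: int = 1000, seed: int = 0):
    """Feed mutated inputs to the target and collect unexpected failures."""
    rng = random.Random(seed)
    findings = []
    for _ in range(iterations):
        case = mutate(sample, rng)
        try:
            target(case)
        except ValueError:
            pass  # documented rejection of bad input: expected behavior
        except Exception as exc:  # anything else is unintended behavior
            findings.append((case, repr(exc)))
    return findings
```

The unit-test view is a single check that `parse_message(VALID_SAMPLE)` returns `b"abc"`. The fuzz-test view is `fuzz(parse_message, VALID_SAMPLE)`, which returns every mutated input that made the parser fail in a way it was never meant to.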

Fuzz testing is excellent at identifying software flaws by simulating deliberate abuse of software solutions. Of course, software can break or produce unexpected behavior under normal circumstances as well. However, thinking about all the possible ways things can go wrong and writing all those test cases by hand is a time-consuming and challenging task. In our fast-paced environment, it is almost impossible. Fuzz testing can help engineering and quality assurance teams find bugs and configuration issues that will negatively impact operation even under perfectly normal circumstances. Even better, a fuzzer generates both the negative test cases and the regression tests for you. 

Fuzzing, then and now

As NIST notes, fuzzing most often focuses on detecting issues that result in crashes. This is one of the reasons fuzz testing is still heavily underutilized and seen as a testing method of limited use. It is a mistake, however, to assume that fuzz testing only provides value when testing software written in lower-level languages such as C or C++.

Classic bugs such as memory corruption of various kinds, format string issues, integer underflows and overflows, and null pointer dereferences significantly impacted software written in C/C++. The typical symptom of these bugs was a software crash. Security researchers therefore used fuzzing to find these crashes and then developed code to exploit the underlying vulnerabilities.

Software development has changed significantly over the past two decades. For example, we now have programming languages that eliminate some of the typical issues mentioned earlier. As a result, we see software crashes much less frequently. If we only looked for crashes while running fuzz tests, it would be no surprise that fuzzing did not come across as extremely valuable. However, we immediately see fuzz testing in a different light if we also monitor software for anomalous behavior other than crashes. For example, with GUARDARA, it is possible not only to monitor the state of a process (or processes) for crashes but also to detect unusual behavior based on:

  • responses received from the target and unexpected state changes, using response processing rules;
  • the contents of log files, using the log monitoring extension; and
  • many more signals, as the target monitoring capabilities are easily extendable and customizable.
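As a rough illustration of monitoring for more than crashes, the Python sketch below checks a target's responses against a set of rules and scans log lines for suspicious entries. It does not reproduce GUARDARA's response processing rules or its log monitoring extension; the `Rule` type, the two example rules, and the log pattern are assumptions made up for this example.

```python
import re
from dataclasses import dataclass
from typing import Callable, Iterable, List


@dataclass
class Rule:
    """A named check; `violated` returns True when a response is anomalous."""
    name: str
    violated: Callable[[int, str], bool]  # (status_code, body) -> bool


# Hypothetical rules for an HTTP-like target.
RESPONSE_RULES = [
    Rule("server error", lambda status, body: status >= 500),
    Rule("stack trace leaked", lambda status, body: "Traceback" in body),
]

# Hypothetical pattern for log entries worth flagging.
LOG_PATTERN = re.compile(r"\b(ERROR|FATAL|segfault)\b")


def check_response(status: int, body: str) -> List[str]:
    """Return the names of all rules the response violates."""
    return [rule.name for rule in RESPONSE_RULES if rule.violated(status, body)]


def check_log_lines(lines: Iterable[str]) -> List[str]:
    """Return the log lines that match the suspicious pattern."""
    return [line for line in lines if LOG_PATTERN.search(line)]
```

With checks like these running after every test case, a target that stays up but answers with a server error, leaks a stack trace, or logs a fatal error still counts as a finding.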

Another fundamental change that happened over the past years is that today’s software is not monolithic. Instead, we have multiple distributed services, devices, and visual interfaces working together to form a single product offering. As a result, classical fuzz testing that focuses on testing software or software components in isolation is not sufficient anymore. To truly benefit from the power of fuzz testing, we need a solution capable of performing in today’s environment.

NIST has done a great service to fuzz testing by separating it from web application scanning and, in general, from vulnerability scanning. Vulnerability scanning is the process of detecting the presence of known vulnerabilities, while fuzz testing is historically known to be most efficient at finding as-yet-unknown, so-called zero-day vulnerabilities. In comparison to web application vulnerability scanners, which look for very specific vulnerabilities (such as those discussed in the OWASP Top 10) and target only web-based applications, a fuzzer can not only uncover issues missed by web application vulnerability scanners, but can also do the same for applications and services that are not web-based.

Moreover, fuzz testing can help in other areas mentioned separately in the NIST guidelines, such as black-box test cases and historical test cases, whereas web application/vulnerability scanning cannot. The value of fuzz testing in black-box and historical test cases is outlined below.

Black Box Test Cases

What is Black Box Testing?

Quality assurance teams often perform acceptance testing to determine if the product requirements are met, the product is working according to the expectations, and is fit for release. Acceptance testing involves the execution of Black Box Test Cases.
A clear and accurate explanation of black box testing is also included in the NIST guideline, in section 2.6.

The analysis is not done by inspecting code but by analyzing functionality and behavior. It is often a tedious, repetitive, and manual (or, in the best case, semi-automated) process that can take a very long time if done correctly. For many organizations, acceptance testing therefore requires quality assurance teams to do heavy manual work.

In addition to this, security teams also perform black-box testing to see whether they can uncover security vulnerabilities. Thus, there is overlap between the activities of the quality assurance and security teams as, ideally, both utilize negative testing. 

Where does fuzzing come in?

It is important to highlight that these tests involve negative testing, including testing for denial of service and overload, input boundary analysis, input combinations, unexpected and misformatted values, and more. Given that fuzz testing is about automating the generation of negative test cases, applying fuzz testing in this context is something all quality assurance teams should consider.
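The kinds of negative test cases listed above can be generated mechanically. The following Python sketch shows one simple way to do it for boundary values, misformatted values, and input combinations; the function names and the chosen values are illustrative assumptions, and a real fuzzer would generate far more variety.

```python
import itertools
from typing import Dict, Iterator, List


def boundary_values(minimum: int, maximum: int) -> List[int]:
    """Values at, just inside, and just outside a valid integer range."""
    return [minimum - 1, minimum, minimum + 1,
            maximum - 1, maximum, maximum + 1]


def malformed_strings() -> List[str]:
    """A few unexpected or misformatted string values."""
    return [
        "",            # empty input
        "A" * 10_000,  # oversized input
        "%s%n",        # format string specifiers
        "\x00",        # embedded NUL byte
        "12abc",       # digits mixed with letters
    ]


def input_combinations(fields: Dict[str, list]) -> Iterator[dict]:
    """Yield every combination of the candidate values for each field."""
    names = list(fields)
    for values in itertools.product(*(fields[name] for name in names)):
        yield dict(zip(names, values))
```

For a hypothetical form with an `age` field valid from 0 to 120 and a free-text `name` field, `input_combinations({"age": boundary_values(0, 120), "name": malformed_strings()})` yields 30 test cases that nobody had to write by hand.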

How can GUARDARA help?

At GUARDARA, we recognized a long time ago that fuzz testing is about a lot more than just finding exploitable 0-day vulnerabilities. Therefore, we designed GUARDARA to enable the entire software development or product team to benefit from fuzz testing, including security and quality assurance teams. To be more specific, when using GUARDARA, you can set up rules to define how software should or should not behave. Then, GUARDARA generates negative test cases and reports whether the observed behavior is compliant with these rules or not.

How we approach black-box testing is also critical. GUARDARA allows setting up tests for individual features. We call this feature coverage. At the same time, it is also possible to build out complex test flows to test multiple features and perform operations in the order a user or malicious actor would. The first approach allows us to focus our testing effort on features that changed, reducing the test time and providing feedback much earlier. The second approach is more suitable for detecting logic flaws and interoperability issues between features, operations, or software components.

Historical Test Cases

What is Historical Testing?

Historical testing is running a regression test to see if any previously fixed bug has resurfaced. Unfortunately, these tests are most often added manually to a regression test suite, which creates a lot of maintenance work, especially in the context of negative testing.

Where does fuzzing come in?

Historically, fuzzing was purely a source of negative test cases that triggered some sort of unintended behavior. The inclusion of these test cases in a regression test suite was primarily a manual process that often required writing code.
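To give a sense of what that manual process involves, the Python sketch below stores each input that triggered a failure in a corpus directory and replays the whole corpus against the target later. The file naming scheme, the helper names, and the two toy targets are assumptions for illustration; this is the hand-written approach, not GUARDARA's.

```python
import hashlib
from pathlib import Path
from typing import Callable, List


def save_finding(corpus_dir: Path, case: bytes) -> Path:
    """Store a failing input under a name derived from its content hash."""
    corpus_dir.mkdir(parents=True, exist_ok=True)
    path = corpus_dir / (hashlib.sha256(case).hexdigest()[:16] + ".bin")
    path.write_bytes(case)
    return path


def replay_corpus(corpus_dir: Path, target: Callable[[bytes], object],
                  expected=(ValueError,)) -> List[str]:
    """Return the names of stored inputs that still fail unexpectedly."""
    regressions = []
    for path in sorted(corpus_dir.glob("*.bin")):
        try:
            target(path.read_bytes())
        except expected:
            pass  # input is rejected cleanly: the bug stays fixed
        except Exception:
            regressions.append(path.name)  # the old bug has resurfaced
    return regressions


def buggy_first_byte(data: bytes) -> int:
    """Toy target before the fix: raises IndexError on empty input."""
    return data[0]


def fixed_first_byte(data: bytes) -> int:
    """Toy target after the fix: rejects empty input explicitly."""
    if not data:
        raise ValueError("empty input")
    return data[0]
```

After saving the empty input with `save_finding`, replaying the corpus against `buggy_first_byte` reports that file, while replaying it against `fixed_first_byte` reports nothing. Every new finding means another stored file and usually more glue code, which is where the maintenance burden comes from.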

How can GUARDARA help?

GUARDARA enables engineers to leverage the benefits of fuzzing in historical testing outlined above with the click of a button rather than writing code. Regression tests for any issues found during fuzz testing can be run by clicking a button on the user interface or calling the appropriate API endpoint. As an alternative, GUARDARA also allows copying the test cases in different formats to be included in your regression test suite.


By including fuzz testing in its newest guidelines, NIST, the leading authority on technology standards in the USA, has acknowledged fuzzing as a very effective and versatile testing method that has a place at multiple stages of the development lifecycle.

Fuzz testing or negative testing is not only a security researcher’s tool, nor is it only about finding 0-day vulnerabilities. The many and varied benefits of fuzz testing lie not only in the technology, but also in the approach; fuzz testing is a great tool to improve security and reliability while also significantly reducing tedious, manual work so your engineers can focus on areas that require human intelligence and intuition.

GUARDARA is an easy-to-deploy, versatile, and scalable testing solution that enables teams to exploit negative testing to the fullest extent, whether it’s hunting for zero-day vulnerabilities, acceptance/black-box testing, or historical testing.