
Just what the heck is a “Buffer Overflow” anyway?!

[Let me start this with a disclaimer; a warning; or maybe a promise: This is designed to be an accessible series that describes common software vulnerabilities, their effects, and potential mitigations. I’m writing this for myself as much as for others in the hope that it will simplify some of the concepts, code, and terms of art that come up in software and systems development. Frankly, I’m probably going to mess this up in some way (I just know it) so when I do, please feel free to correct me (nicely) and I will publish both a correction and probably a mea culpa.]

I promised (maybe threatened is more accurate) on Twitter recently [yes, it was over 6 months (fine, two years) ago. “Recently” is a relative term and can be excused by a global pandemic] to start publishing a series of articles on common software vulnerabilities, what they look like, and how to (potentially) mitigate them. I figure I’ll start with one of the easiest to explain and (hopefully) understand: Buffer Overflows.

But first…

Having a shared language helps us understand what we’re all talking about, so let’s get a couple of basic definitions out of the way. I promise this will be the only time we have to look at this; all future posts will just refer back to here.
A “vulnerability,” as defined by the Common Vulnerabilities and Exposures (CVE) list (sponsored by the U.S. Department of Homeland Security (DHS) and the Cybersecurity and Infrastructure Security Agency (CISA), and run by MITRE, which is already an acronym), is

“…a weakness in the computational logic (e.g., code) found in software and hardware components that, when exploited, results in a negative impact to confidentiality, integrity, or availability.”

http://cve.mitre.org/about/terminology.html
[emphasis added]

whereas an “exposure” is

“…a system configuration issue or a mistake in software that allows access to information or capabilities that can be used by a hacker as a stepping-stone into a system or network.”

http://cve.mitre.org/about/terminology.html

The National Institute of Standards and Technology (NIST) maintains a database of CVEs called the National Vulnerability Database (NVD).

And finally, the Common Weakness Enumeration (CWE) is a community-developed list of common software security weaknesses that is, again, sponsored by DHS and CISA and maintained by MITRE. CWE entries describe particular types of weaknesses, how they commonly appear in code, and potential mitigations.

Gee, thanks for the acronym list. I’ll remember that forever, but back to the task at hand: what is a buffer overflow?

At its most basic, a Buffer Overflow is simply what its name says it is: a program has allocated some space in memory (a buffer) to store data, but it is presented with more data than that buffer can hold, and when it tries to put that data into the buffer, the data can overflow to somewhere else. Here’s a way to visualize this: think of a one gallon bucket (sigh. Ok, a 3.78 liter bucket if you must). If you try to fill it with two gallons, the bucket can’t hold that much and the water overflows and gets your floor all wet. Simple, right? This is a very common error in coding. A search of the NVD for vulnerabilities related to CWE-119 shows that they account for about 11% of the total disclosed vulnerabilities. CWE-119 is the “Improper Restriction of Operations within the Bounds of a Memory Buffer” weakness, which is the parent of a bunch of different types of buffer overruns. Note the key phrase above: “total disclosed vulnerabilities.” Odds are there are more instances out there that no one has found or reported yet.

CWE-119 “Improper Restriction of Operations within the Bounds of a Memory Buffer” entries vs total entries in the NVD
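
To put the bucket analogy into code terms before we get to the real example below, here is a tiny, deliberately broken sketch of the same idea: a fixed-size buffer handed more data than it can hold. The names and sizes here are mine, invented purely for illustration.

#include <cstring>   // std::memcpy, std::strlen

int main()
{
	char bucket[8];                              // our one-gallon bucket: room for 8 bytes
	const char* water = "way more than 8 bytes"; // two gallons of data (21 characters plus a null)

	// This copies strlen(water) + 1 = 22 bytes into an 8-byte buffer.
	// The extra 14 bytes "spill" over whatever happens to sit next to
	// bucket in memory -- that spill is the buffer overflow.
	std::memcpy(bucket, water, std::strlen(water) + 1);
	return 0;
}

The copy is undefined behavior on purpose; where those extra 14 bytes land depends entirely on what the compiler happened to place next to the buffer, which is exactly what makes overflows both unpredictable and useful to attackers.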

But…so what?

You may be thinking to yourself, “So? Some data didn’t go into the right place. I mean, that sucks and all, but it’s just an error…” And most of the time you might be correct. The data could just overflow, get put in the wrong place, return an invalid result, or crash the program. However, any computational error like this is a security concern because it could affect the confidentiality, integrity, or availability of data. At its most basic exploitation, the data returned by an application could just be flat out wrong (integrity), or could cause the application to crash (availability), or could reveal data that it shouldn’t (confidentiality). Let’s explore this a bit more to see what it could mean in practice by looking at a stack-based overflow. It’s really only one type of buffer overflow, but it’s one that I think is fairly easy to understand.

Shall we look at code?

Let’s take a look at some very simple code to see what happens when a buffer overrun occurs and then what could happen if we exploited it.

/***********************************************************************
* Original Author:	Falken, Stephen W.
* File Creation Date:	06/22/1973
* Development Group:	W.O.P.R. Ops.
* Description:		War Operation Planned Response (W.O.P.R.) system login control.
**********************************************************************/

#include <iostream>
#include <cstring>	// strcmp
#include <cstdlib>	// system
using namespace std;

int main(void)
{
	int authorized = 0;	// Is the user authorized? Default is 0 or "no"
	char cUsername[7];	// Holds the username entered by the user.


	{

		system("CLS");	// Clear the screen (Windows)
	start:
		// Prompt for a logon
		cout << "LOGON: ";
		cin >> cUsername;	// No length check here -- this is where the overflow happens
		// Let's make sure I can get back in...
		if (strcmp(cUsername, "Joshua") == 0)
		{
			// It me
			authorized = 1;
		}

		if (authorized != 0) // Check if the user is authorized
		{
			// greet
			cout << "GREETINGS PROFESSOR FALKEN\nHOW ARE YOU FEELING TODAY?\n";
			int waitFlag = 0;
			do {
				cin >> waitFlag;
			} while (waitFlag != 1);
		}
		else
		{
			// The user is not authorized
			cout << "IDENTIFICATION NOT RECOGNIZED BY SYSTEM\n--CONNECTION TERMINATED--\n\n";

		}
		goto start;
	}
}

The W.O.P.R. login code above is pretty simple. We allocate a couple of values for the system to use, and then ask the user to give us some information. In this case we’re allocating a 7-character buffer for the username, and an integer value to store the result of our authentication check. If the user enters the correct login value (“Joshua” in this case), the value of “authorized” will change from 0 to 1. The system will then check the value of “authorized” and, if it’s not equal to 0 [zero], the application will continue. Otherwise, the code will reject the login and loop back to start again. But notice that nothing ever checks how much the user actually typed. What happens if the user enters more than the username buffer can hold? Well, that’s where it gets interesting.

Let’s take a look at what happens when we run the above code and purposefully overload the buffer:

A command shell showing various attempts to log in to an application

The first entry is a single “a” character. Since it’s well within our buffer size of 7, the rest of the code runs as expected and the user’s login is denied (because a single letter “a” != “Joshua”). The next line uses seven “a” characters, and since that’s still not enough data to cause any visible trouble, the code runs as expected. The next line uses 8 “a” characters and, while that’s larger than our buffer size, it hasn’t yet bled into the authorized value. Finally, we enter 9 “a” characters, and we see that we’re now “authenticated.”
Effectively, what happened here is that we overfilled the username bucket and “spilled” into the value of authorized. If we look at the value that was assigned to authorized, we’d see that instead of still being “0” [zero] it now has a value of “97” (which is the ASCII value of “a”). Since the code that checks for authorization is simply looking to see whether authorized is equal to zero, we’re now magically “authorized” (because anything that is not zero is considered “good”). If we look at the memory layout, it becomes a little simpler to understand:
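
One way to see this for yourself is a little helper program like the one below. This is a rough sketch, not the W.O.P.R. code itself, and the exact distances (and even the ordering of the variables) depend on your compiler, optimization level, and stack-protection settings, so treat the output as illustrative rather than guaranteed:

#include <iostream>
#include <cstdint>
using namespace std;

int main(void)
{
	int authorized = 0;	// same two variables as the login example above
	char cUsername[7];

	// Print where each variable actually lives on the stack. The gap between
	// them is how far input has to "spill" before it reaches authorized. It
	// may be larger than 7 (alignment padding) or even negative (different
	// ordering), depending on how the compiler laid things out.
	cout << "cUsername starts at: " << static_cast<void*>(cUsername) << "\n";
	cout << "authorized lives at: " << static_cast<void*>(&authorized) << "\n";
	cout << "bytes between them:  "
		<< (reinterpret_cast<intptr_t>(&authorized)
			- reinterpret_cast<intptr_t>(cUsername)) << "\n";
	return 0;
}

Keep in mind you would likely also need to build the original example with the compiler’s stack protections turned off (for instance -fno-stack-protector with g++/clang++, or /GS- with MSVC); otherwise the compiler may rearrange or guard the variables so the overflow never reaches authorized at all.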

As you can see, this is a fairly trivial error, but it’s one that developers make quite often, and it has been known about for quite a while (in fact, it happens so often that the Visual Studio compiler has a BUNCH of different options that I had to disable just to make the buffer overflow work. Heck, I don’t even think I got them all).

What if we were to overwrite more than just the value of “authorized,” though? Could we maybe even get this application to execute code that it normally wouldn’t? For example, what if we were to go beyond the value of “authorized” in memory and pass in some byte code or “shellcode”? Could we get the system to execute it? The short answer is “yes.” Yes, we certainly could.

I hope you found this little blog post interesting. If you’d like more like this, or have particular areas you’d like to know more about, let me know. I will (really) try to get these out more often in the future.


Ruminations on risk management, fault tolerance, and Amazon’s HQ2

Yes.  I know.  I committed to posting at least one post per week, and I’ve failed.  However, I am going to try to make it up by posting two articles this week.  First off, some simple thoughts on Amazon’s “HQ2”, fault tolerance, and risk management.  Last year Amazon started the process of finding a suitable location for a second headquarters that they are calling “HQ2”.  This is a pretty unique scenario, and it got me thinking.  While some news articles have called it a “marketing ploy” to attract tax incentives and offers from regions, I’m taking a different (albeit probably wrong) angle: while their choice to be so public about the selection process is most likely a marketing move, the underlying thought could be to introduce some redundancy and fault tolerance into their headquarters operations as a method of risk management.  As Sancho Panza said to Don Quixote, “…to retire is not to flee, and there is no wisdom in waiting when danger outweighs hope, and it is the part of wise men to preserve themselves to-day for to-morrow, and not risk all in one day,” and I thought this could be a good time to talk about risk management in general.

According to NIST, risk “…is the net negative impact of the exercise of a vulnerability, considering both the probability and the impact of occurrence.  Risk management is the process of identifying risk, assessing risk, and taking steps to reduce risk to an acceptable level.”  In human terms, that means risks are bad things that can happen to your stuff, and to avoid or mitigate risk to an acceptable level you want to identify your stuff (assets or operations), the bad things that could happen to that stuff (hazards or vulnerabilities), a scenario for that bad thing occurring, the probability of that scenario occurring, and finally the severity of that scenario playing out (what could happen, how likely is it to happen, and what is the final impact).  Based on that information you can determine whether the cost of a mitigation is worth the investment, as the cost to mitigate a risk shouldn’t exceed the value of the asset itself.

For example, let’s say a business is based in a wooden building (our asset).  One potential hazard for that building is a fire, and one potential scenario (there could be multiple) is a total loss of the building.  If in our scenario we have an employee who likes to play with matches and lighter fluid, then we have a higher likelihood of that scenario occurring, and this could have a catastrophic impact on our business (all our stuff got burnt up).  However, if we believe that the investment in a fire suppression system is worth the expenditure, we could mitigate this risk by installing sprinklers (or just replace Jane Firebug, but that’s for HR to decide).  Another potential mitigation could be to duplicate operations so that if we did have a fire (Jane didn’t get replaced because she wrote all the code for the company’s primary product), we could resume operations in another location.  This would deal with not just the risk of fire, but also other potential risks to the building.  Granted, this could be a much more expensive mitigation than sprinklers, but it can also be more effective in terms of operational recovery, uptime, etc.

Keep in mind that hazards aren’t just physical in nature.  A hazard (or threat) could be a vulnerability in software that could be exploited by a malicious actor.  While it’s a different type of hazard, the process for assessing the risk is the same: identify the threat, a scenario for that threat, its likelihood of occurring, and finally its impact.  From there you can determine the acceptable risk or mitigation.
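
To make the fire example a little more concrete, here is a back-of-the-napkin sketch of that decision in code.  Every number in it is made up by me for illustration; the only point is the shape of the reasoning (likelihood times impact, compared against the cost of the mitigation):

#include <iostream>

int main()
{
	// All of these numbers are invented for illustration -- plug in your own.
	double assetValue     = 500000.0; // value of the wooden building and its contents
	double fireLikelihood = 0.05;     // estimated chance of a total-loss fire this year
	double sprinklerCost  = 20000.0;  // cost to install the fire suppression system
	double residualChance = 0.005;    // estimated chance of total loss with sprinklers installed

	double expectedLossBefore = fireLikelihood * assetValue;  // likelihood x impact
	double expectedLossAfter  = residualChance * assetValue;
	double riskReduction      = expectedLossBefore - expectedLossAfter;

	std::cout << "Expected loss without sprinklers: $" << expectedLossBefore << "\n";
	std::cout << "Expected loss with sprinklers:    $" << expectedLossAfter << "\n";
	std::cout << "Sprinklers " << (riskReduction > sprinklerCost ? "look" : "do not look")
	          << " worth the $" << sprinklerCost << " this year.\n";
	return 0;
}

It’s crude, but it’s the same comparison described above: the cost to mitigate shouldn’t exceed the value (or the expected loss) it protects.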

If we consider that Amazon has said that it will be “…a complete headquarters for Amazon – not a satellite office,” and that they are expecting to allow their staff and executives to choose which headquarters they want to work at, move groups around, etc., then that sounds to me like a basic fault tolerance design.  In 1976, Dr. Algirdas Avizienis wrote that “The most widely accepted definition of a fault-tolerant computing system is that it is a system which has the built-in capability (without external assistance) to preserve the continued correct execution of its programs and input/output (I/O) functions in the presence of a certain set of operational faults.” (Sorry, the original paper is behind a paywall.)  That certainly sounds like the concept Amazon is taking here, doesn’t it? (Ok, roughly taking here.  This isn’t a doctoral thesis, folks, it’s just me musing on a Saturday morning.)  And while duplication of a system as a method of mitigating risk is an expensive one, I think Amazon has the cash.
