Rob Kraft's Software Development Blog

Software Development Insights

Archive for the ‘Security’ Category

C# .Net LDAP Injection Prevention

Posted by robkraft on April 30, 2021

OWASP is a great resource for writing secure code, but some of their examples are outdated. For .Net, OWASP, as of this writing, recommends using LinqToAD (which appears to be outdated and no longer supported) or the AntiXSS tool, which also appears to be outdated and a bit unreliable.

OWASP LDAP Injection Prevention Cheat Sheet: https://cheatsheetseries.owasp.org/cheatsheets/LDAP_Injection_Prevention_Cheat_Sheet.html

A good example of how attacks on LDAP can occur via injection: https://www.synopsys.com/glossary/what-is-ldap-injection.html

In most cases, you probably just need to validate your input with a white list or a black list. White lists are always more secure than black lists, but if you are adding this check to an existing application you may prefer to start with a black list until you can identify and handle all the special characters you need to support.

Below is a bit of code I wrote to validate data against LDAP Injection risks. I found the Blacklist example here (https://stackoverflow.com/questions/53862391/how-does-ldapdistinguishednameencode-work-with-the-c-sharp-directoryservices-lib) and I added '|', '(', and ')' to it. Please let me know if you find an error in it!

using System; // needed for Exception in RunTests
using System.Text.RegularExpressions;
public class LDAPValidation
{
	static readonly string whitelist = @"^[a-zA-Z\-\.']*$";
	static Regex whiteListRegex = new Regex(whitelist);
	public static bool IsNameValidForLdapQueryWhiteList(string strUserName)
	{
		strUserName = strUserName.Trim();
		if (whiteListRegex.IsMatch(strUserName))
		{
			return true;
		}
		return false;
	}
	
	public static bool IsNameValidForLdapQueryBlackList(string strUserName, bool allowWildCard = false)
	{
		char[] illegalChars = { ',', '\\', '#', '+', '<', '>', ';', '"', '=', '|', '(', ')' };
		if (strUserName.IndexOfAny(illegalChars) == -1)
		{
			if (allowWildCard == false && strUserName.Contains("*"))
				return false;
			return true;
		}
		return false;
	}
	public static void RunTests()
	{
		bool result = false;
		result = IsNameValidForLdapQueryWhiteList("rkraft");
		if (result == false) throw new Exception();
		result = IsNameValidForLdapQueryWhiteList("*");
		if (result == true) throw new Exception();
		result = IsNameValidForLdapQueryWhiteList("#");
		if (result == true) throw new Exception();

		result = IsNameValidForLdapQueryBlackList("rkraft");
		if (result == false) throw new Exception();
		result = IsNameValidForLdapQueryBlackList("*");
		if (result == true) throw new Exception();
		result = IsNameValidForLdapQueryBlackList("#");
		if (result == true) throw new Exception();

	}
}
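When rejecting a character is not an option (a legitimate name may contain parentheses, for example), escaping the value is a common alternative to validation. The following is a minimal sketch, not part of the code above. It assumes the escaping rules for search filter values in RFC 4515, where a backslash, asterisk, parenthesis, or NUL is replaced by a backslash and two hex digits. The class and method names are hypothetical, and the rules for distinguished names are different and not covered here.

```csharp
using System.Text;

public static class LdapFilterEscaper // hypothetical helper, not from the code above
{
	// Escapes the characters that RFC 4515 treats as special inside a search filter value.
	public static string EscapeFilterValue(string value)
	{
		var sb = new StringBuilder(value.Length);
		foreach (char c in value)
		{
			switch (c)
			{
				case '\\': sb.Append(@"\5c"); break;
				case '*': sb.Append(@"\2a"); break;
				case '(': sb.Append(@"\28"); break;
				case ')': sb.Append(@"\29"); break;
				case '\0': sb.Append(@"\00"); break;
				default: sb.Append(c); break;
			}
		}
		return sb.ToString();
	}
}
```

A caller might then build a filter such as "(cn=" + LdapFilterEscaper.EscapeFilterValue(name) + ")", where cn is only an example attribute.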

Posted in Coding, Security

SQL to Find Who Is Trying To Hack The SQL Server You Exposed To The Internet

Posted by robkraft on May 5, 2020

If you are running a SQL Server exposed to the Internet on the default port 1433 then hackers are probably trying to hack into it.  If the SQL Server is running on a different port it is less likely, but it could still be happening.  If you expose a SQL Server to the world, the hackers on the Internet will find it and will launch repeated login attempts against the ‘sa’ account.

That is a good reason for disabling the ‘sa’ account.

That is also a good reason for renaming the ‘sa’ account.

Most SQL Server admins record the Failed Login Attempts in the SQL Server Error Log.  If you have this option turned on, which is recommended, you may see entries like this:

Login failed for user 'sa'. Reason: Could not find a login matching the name provided. [CLIENT: 113.64.210.68]

You can write a SQL query to read the SQL Server ErrorLog and extract all the IP addresses that are attempting to log in to your server using the ‘sa’ account.  You can then provide that list to your firewall to blacklist them, although a whitelist of known good addresses is a much better approach.  After you blacklist those IP Addresses, come back the next day and check your logs again because you will probably find a whole bunch of new IP addresses doing the same thing.  This game of ‘whack-a-hacker’ can be played for a long time before the hackers run out of IP addresses.  By the way, if you are going to try to blacklist the bad IP addresses, I recommend blocking entire blocks like 113.64.*.*, not a single IP Address at a time.

Here is a SQL script to get you started with reading and filtering your SQL Server Error Logs.  Notice this script only looks for failed login attempts using the ‘sa’ account.  If you look at your logs, you will find failed login attempts for other accounts too.  Happy Hunting.

CREATE TABLE #TmpLog
(LogDate datetime, ProcessInfo VARCHAR(150), Text VARCHAR(max))
INSERT INTO #TmpLog
EXEC sp_readerrorlog
SELECT replace(Text,'Login failed for user ''sa''. Reason: Could not find a login matching the name provided. ',''), 
count(*)
FROM #TmpLog where Text like '%Client%' and Text like '%''sa''%' group by
replace(Text,'Login failed for user ''sa''. Reason: Could not find a login matching the name provided. ','')
having count(*)>10
order by 2 desc
DROP TABLE #TmpLog
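If you export the log text and want to group the offending addresses into blocks like 113.64.*.* before handing them to a firewall, something along these lines may help. This is only a sketch: the class and method names are hypothetical, it assumes IPv4 addresses in the [CLIENT: a.b.c.d] form shown in the log entry earlier, and it ignores entries without an address.

```csharp
using System.Collections.Generic;
using System.Linq;
using System.Text.RegularExpressions;

public static class FailedLoginSummary // hypothetical helper
{
	// Matches the "[CLIENT: a.b.c.d]" suffix of a failed-login entry (IPv4 only).
	static readonly Regex ClientIp =
		new Regex(@"\[CLIENT:\s*(\d{1,3})\.(\d{1,3})\.\d{1,3}\.\d{1,3}\]");

	// Returns each a.b.*.* block with the number of matching log lines, busiest first.
	public static List<KeyValuePair<string, int>> CountByBlock(IEnumerable<string> logLines)
	{
		return logLines
			.Select(line => ClientIp.Match(line))
			.Where(m => m.Success)
			.GroupBy(m => m.Groups[1].Value + "." + m.Groups[2].Value + ".*.*")
			.Select(g => new KeyValuePair<string, int>(g.Key, g.Count()))
			.OrderByDescending(p => p.Value)
			.ToList();
	}
}
```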

 

 

Posted in Security, SQL Server

C# .Net SQL Injection Detection – Especially for Legacy .Net Code

Posted by robkraft on March 4, 2020

When you have an existing .Net code base full of SQL statements, and you want to reduce the chance that there are SQL injection risks in the code, you may decide to perform a review of every SQL statement in order to confirm that they are all coded correctly; or you may hire another company to do this for you. But one problem with this approach is that the code is only “SQL Injection Free” from the moment the review is completed until people start modifying and adding to the code again.

What you should strive for is a way to make sure every past and future SQL statement gets tested for SQL Injection risk before it runs.  That is what this sample code provides you.  If you follow the patterns described here, I believe you can significantly reduce the risk that your code has bugs leading to SQL Injection and it will stay that way going forward.

Using the Decorator Pattern to Provide a Place To Add SQL Injection Detection

The primary technique I recommend in this article for adding SQL Injection detection into your application is to stop using the .ExecuteReader and .ExecuteNonQuery methods.  Instead, use the Decorator pattern to create your own method to be called in place of those two, and that method will include code to do some SQL Injection detection.

Replace:

SqlDataReader reader = command.ExecuteReader();

With 

SqlDataReader reader = command.ExecuteSafeReader(); //provided in sample code

The sample code provided behaves like the Proxy pattern in that it will make the actual call to the database after finding no SQL Injection risk.  The benefit of this approach is that you can then regularly scan your entire code base for the use of .ExecuteReader and .ExecuteNonQuery knowing that there should be no cases of those methods, other than the exception cases you expect.  Thus you can be sure that the majority of your code is running through your SQL Injection detector.
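The regular scan for direct calls can be done with any search tool. As a rough sketch in C#, assuming your code lives in .cs files, it might look like the following. The class and method names are hypothetical and are not part of the sample code; expect the file that defines ExecuteSafeReader to show up as one of the known exceptions.

```csharp
using System.Collections.Generic;
using System.IO;
using System.Text.RegularExpressions;

public static class UnsafeCallScanner // hypothetical helper
{
	// Matches direct calls such as ".ExecuteReader(" but not ".ExecuteSafeReader(".
	static readonly Regex DirectCall = new Regex(@"\.Execute(Reader|NonQuery)\s*\(");

	// Returns "file:line: text" for every direct call found in the given source lines.
	public static List<string> FindInLines(string fileName, IEnumerable<string> lines)
	{
		var hits = new List<string>();
		int lineNumber = 0;
		foreach (string line in lines)
		{
			lineNumber++;
			if (DirectCall.IsMatch(line))
				hits.Add(fileName + ":" + lineNumber + ": " + line.Trim());
		}
		return hits;
	}

	// Scans every .cs file under rootFolder.
	public static List<string> FindInFolder(string rootFolder)
	{
		var hits = new List<string>();
		foreach (string file in Directory.EnumerateFiles(rootFolder, "*.cs", SearchOption.AllDirectories))
			hits.AddRange(FindInLines(file, File.ReadLines(file)));
		return hits;
	}
}
```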

Another benefit of using the Decorator pattern to implement SQL Injection Detection is that you can also easily add other features such as:

  • Logging every SQL that is executed
  • Logging and blocking every SQL that is a SQL Injection risk
  • Altering every SQL on the fly.  One scenario where this could be helpful is if you renamed a table in the database but had a lot of SQL that needed to change.  You could possibly add a find/replace to every SQL on the fly to change the table name, allowing you more time to find and correct all stored SQL fragments with the old table name.

Here is the ExecuteSafeReader extension method from the sample code:

	public static SqlDataReader ExecuteSafeReader(this SqlCommand sqlcommand)
	{
		if (!sqlcommand.CommandType.Equals(CommandType.StoredProcedure))
		{
			var sql = sqlcommand.CommandText;
			//Options: You could Add logging of the SQL here to track every query ran
			//Options: You could edit SQL - for example if you had renamed a table in the database
			if (!ValidateSQL(sql, SelectRegex))
				return null;
		}

		return sqlcommand.ExecuteReader();
	}

The SQL Injection Detection Code

Warning!  This does not detect all forms of SQL Injection, but it will detect most of them.  Here is what causes the class to throw an exception:

  • Finding a single apostrophe (single quote) that does not have a matching single apostrophe (single quote)
  • Finding double quotes that do not have a matching double quote.  This is only needed if the SQL Server has SET QUOTED_IDENTIFIER OFF.  However, you may also want to use this if your database is MySQL or some other DBMS.
  • Finding a comment within the SQL
  • Finding an ASCII value greater than 127
  • Finding a semicolon
  • After extracting the strings and comments, finding any of a specific configurable list of keywords in a SELECT statement such as DELETE, SYSOBJECTS, TRUNCATE, DROP, XP_CMDSHELL

The code is written to be easy to change if you don’t want to enforce any of the rules above, or if you need to add similar rules because you have a special scenario or a DBMS besides SQL Server.

The code uses the regex [^\u0000-\u007F] to reject the SQL if it contains any non-ASCII characters.  This works for the applications I have written, but may need alteration to support languages other than American English.

The code also uses regexes to check SQL statements for undesirable keywords.  One regex is for SELECT statements and therefore blocks them if they contain INSERT, UPDATE, or DELETE.  Other keywords that may indicate a SQL Injection attempt are also rejected and that list includes waitfor, xp_cmdshell, and information_schema.  Note that I also include UNION in the list; so if you use the UNION keyword you will need to remove that from the list.  UNION is frequently used by hackers attempting to perform SQL Injection.

private static void LoadFromConfig()
{

	_asciiPattern = "[^\u0000-\u007F]";
	_selectpattern = @"\b(union|information_schema|insert|update|delete|truncate|drop|reconfigure|sysobjects|waitfor|xp_cmdshell)\b|(;)";
	_modifypattern = @"\b(union|information_schema|truncate|drop|reconfigure|sysobjects|waitfor|xp_cmdshell)\b|(;)";
	_rejectIfCommentFound = true;
	_commentTagSets = new string[2, 2] { { "--", "" }, { "/*", "*/" } };
}
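To see how the \b word boundaries in these patterns behave, here is a small sketch that applies the same pattern as _selectpattern. The class and method names are hypothetical, and RegexOptions.IgnoreCase is an assumption made here so that DROP and drop are treated alike.

```csharp
using System.Text.RegularExpressions;

public static class KeywordCheckDemo // hypothetical name
{
	// Same pattern as _selectpattern above.
	const string SelectPattern =
		@"\b(union|information_schema|insert|update|delete|truncate|drop|reconfigure|sysobjects|waitfor|xp_cmdshell)\b|(;)";

	// IgnoreCase is an assumption of this sketch.
	static readonly Regex SelectRegex = new Regex(SelectPattern, RegexOptions.IgnoreCase);

	// True when the SQL contains a blocked keyword as a whole word, or a semicolon.
	public static bool ContainsBlockedKeyword(string sql)
	{
		return SelectRegex.IsMatch(sql);
	}
}
```

Because of the word boundaries, a column named updated_on or last_update does not trigger the update keyword, while a standalone UPDATE does.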

SQL Server supports two techniques to comment out SQL code in a SQL Statement: two dashes, and enclosing the comment in /* */.  Since it is unlikely that developers write SQL to include comments, my default choice is to reject any SQL containing those values.

Exactly How Is The SQL Injection Detected?

There are basically three steps in the SQL Injection detection process.

First, the code checks for any ASCII values above 127 and rejects the SQL if one is found.

Second, the code removes all the code within strings and comments.  So an SQL that starts out looking like this:

select * from table where x = 'ss''d' and r = 'asdf' /* test */ DROP TABLE NAME1 order by 5

becomes this:

select * from table where x = and r = t DROP TABLE NAME1 order by 5

Third, the code looks for keywords, like “DROP” and “XP_CMDSHELL”, in the revised SQL that are on the naughty list.  If any of those keywords are found, the SQL is rejected.
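The three steps can be sketched roughly as follows. This is a simplified illustration, not the code from SQLExtensions.cs: the names are hypothetical, the keyword list is shortened, double-quoted strings are not handled, and comments are stripped here rather than rejected outright.

```csharp
using System;
using System.Text;
using System.Text.RegularExpressions;

public static class SqlScanSketch // hypothetical; see SQLExtensions.cs for the real logic
{
	static readonly Regex NonAscii = new Regex(@"[^\u0000-\u007F]");
	static readonly Regex Keywords =
		new Regex(@"\b(union|drop|truncate|xp_cmdshell)\b|(;)", RegexOptions.IgnoreCase);

	// Step 2: copy the SQL, skipping single-quoted strings and comments.
	public static string StripStringsAndComments(string sql)
	{
		var sb = new StringBuilder();
		int i = 0;
		while (i < sql.Length)
		{
			char c = sql[i];
			if (c == '\'')
			{
				i++; // skip the opening quote, then everything up to the closing quote
				while (true)
				{
					if (i >= sql.Length) throw new FormatException("Unmatched single quote");
					if (sql[i] == '\'')
					{
						// A doubled quote ('') is an escaped quote inside the string.
						if (i + 1 < sql.Length && sql[i + 1] == '\'') { i += 2; continue; }
						i++;
						break;
					}
					i++;
				}
			}
			else if (c == '/' && i + 1 < sql.Length && sql[i + 1] == '*')
			{
				int end = sql.IndexOf("*/", i + 2, StringComparison.Ordinal);
				if (end < 0) throw new FormatException("Unterminated comment");
				i = end + 2;
			}
			else if (c == '-' && i + 1 < sql.Length && sql[i + 1] == '-')
			{
				int end = sql.IndexOf('\n', i); // a -- comment runs to the end of the line
				i = end < 0 ? sql.Length : end;
			}
			else
			{
				sb.Append(c);
				i++;
			}
		}
		return sb.ToString();
	}

	public static bool LooksSafe(string sql)
	{
		if (NonAscii.IsMatch(sql)) return false;        // step 1: non-ASCII characters
		string stripped = StripStringsAndComments(sql); // step 2: remove strings and comments
		return !Keywords.IsMatch(stripped);             // step 3: look for blocked keywords
	}
}
```

Stripping the strings first matters: a search value such as 'drop' inside quotes is harmless data, while a DROP left over after the strings are removed is part of the statement itself.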

Formatting Methods included in the SQLExtensions Class

The SQLExtensions class provides additional methods to help your coders reduce the risk of SQL Injection.  These methods help coders format variables in SQL when doing so with a parameter is not an option.  The most useful of these methods is FormatStringForSQL and it could be used as shown here to enclose a string in SQL quotes as well as replace any single quotes contained within the value with two single quotes.


string sql = "select * from customers where firstname like " + nameValue.FormatStringForSQL();

Another advantage of using a method like this is that it makes it easy for you to change how you handle the formatting of strings everywhere within your code if you discover that you need to make a change.  For example, perhaps you decide to move your application from SQL Server to MySQL and therefore that you also need to replace double quotes in addition to single quotes.  You could make the change within this method instead of reviewing your entire code base to make the change one by one for each SQL.
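To make the idea concrete, here is a minimal sketch of what a method like FormatStringForSQL might do, based only on the description above (wrap the value in single quotes and double any embedded single quotes). The name FormatStringForSQLSketch and the null handling are assumptions; the version in SQLExtensions.cs may differ.

```csharp
using System;

public static class SqlFormatSketch // hypothetical; not the class from the sample code
{
	// Doubles embedded single quotes, then wraps the value in single quotes.
	public static string FormatStringForSQLSketch(this string value)
	{
		// Assumption: null is treated as a programming error rather than mapped to SQL NULL.
		if (value == null) throw new ArgumentNullException("value");
		return "'" + value.Replace("'", "''") + "'";
	}
}
```

With this sketch, "O'Brien".FormatStringForSQLSketch() produces 'O''Brien', which SQL Server reads as the single string O'Brien.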

Custom .Net Exception Class

I also provided a custom Exception primarily to show how easy it is to implement custom exceptions and because I think it is useful for this extension class.  This provides you more flexibility for handling exceptions.  You can catch and handle the exceptions raised specifically due to SQL Injection risk differently from the exceptions thrown by the underlying ADO.NET code or returned from the database.


[Serializable]
public class SQLFormattingException : Exception
{
	public SQLFormattingException() {}

	public SQLFormattingException(string message): base(message) {}
}

The Rules For Detecting SQL Injection are Configurable

I made the SQL Injection detection rules easy to enable, disable, and configure so that you could import those rules at runtime if desired, allowing different applications to have different rules.  Perhaps one of your applications needs to allow semicolons in SQL but the others don’t.  It is a good practice to implement the most stringent rules you can everywhere you can.  Don’t implement weak SQL Injection detection rules everywhere because a single place in your code needs weaker rules.  The rules are “Lazy Loaded” when needed, then cached; calling the InvalidateCache method provided lets you change them while an application is running.

Below is an example of one of the rules.  You can configure your code to reject the SQL if it contains SQL Server comments.


#region RejectComments Flag
private static bool? _rejectIfCommentFound = null;
public static bool RejectIfCommentFound
{
	get
	{
		if (_rejectIfCommentFound == null)
		{
			LoadFromConfig();
		}
		return (bool)_rejectIfCommentFound;
	}
}
#endregion
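The lazy-load, cache, and invalidate cycle can be sketched as below. This is an illustration only: the class name and the LoadCount counter are hypothetical (the counter exists just to make the caching visible), the loader simply hard-codes a value, and the sketch is not thread-safe.

```csharp
public static class RuleCacheSketch // hypothetical names throughout
{
	static bool? _rejectIfCommentFound = null;

	// Counts how often the rules were loaded; only here to demonstrate the caching.
	public static int LoadCount = 0;

	public static bool RejectIfCommentFound
	{
		get
		{
			if (_rejectIfCommentFound == null) LoadFromConfig(); // lazy load on first use
			return _rejectIfCommentFound.Value;
		}
	}

	static void LoadFromConfig()
	{
		LoadCount++;
		_rejectIfCommentFound = true; // a real loader might read a config file or database here
	}

	// Clearing the cached value forces the next read to load the rules again.
	public static void InvalidateCache()
	{
		_rejectIfCommentFound = null;
	}
}
```

Reading the property twice loads the rules once; after InvalidateCache, the next read loads them again.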

Steps To Implement and Use This Code

I suggest you take the following steps to implement this class:

  1. Get the SQLExtensions.cs class file into a project in your code base. You will also need the CustomExceptions.cs class file.  The program.cs just contains a sample usage and there is also a UnitTest1.cs class.
  2. Comment out all the lines in ReallyValidateSQL except for the “return true”
  3. Do a find and replace across your entire code base to replace ExecuteReader with ExecuteSafeReader
  4. Compile and test.  Your app should still work exactly the same at this point.
  5. Review the Customizable Validation Properties and decide which ones you want to implement, then uncomment the lines you commented out in ReallyValidateSQL
  6. Decide if you need to and want to replace dynamically constructed SQL in your application with any of the four FormatSQL… extension methods provided.
  7. Provide me feedback

MIT FREE TO USE LICENSE

This code has an MIT license which means you can use this code in commercial products for free!

A link to the source code example is here: https://github.com/RobKraft/SQLInjectionDetection

Posted in Code Design, CodeProject, Coding, Security

SQL Server’s sp_executesql Does Not Protect You from SQL Injection

Posted by robkraft on August 18, 2019

Many coders of SQL have learned we can dynamically construct SQL statements inside of stored procedures and then execute the constructed SQL.  In Microsoft’s SQL Server product there are two commands we can choose for running the constructed SQL:

  • EXEC (EXEC is an alias for EXECUTE, both do the same thing).
  • sp_executesql.

We SQL Server “experts” often advise coders to use sp_executesql instead of EXEC when running dynamically constructed SQL statements to reduce the risk of SQL Injection, and this is good advice.  But it is not the use of sp_executesql that prevents SQL injection, it is the use of parameters with sp_executesql that helps protect against SQL Injection.  You can still construct SQL dynamically and run that SQL using sp_executesql and be affected by a SQL Injection attack.

If you use parameters to substitute all the values in the SQL and then use sp_executesql you have probably eliminated the SQL Injection risk; but as a developer this means you may be unable to dynamically construct the SQL you want to run.

When you use sp_executesql parameters correctly, you can only replace data values in your SQL statement with values from parameters, not parts of the SQL itself.  Thus we can do this to pass in a value for the UserName column:

declare @sql nvarchar(500)
declare @dynvalue nvarchar(50)
select @dynvalue='testuser'
SET @sql = N'SELECT * FROM appusers WHERE UserName = @p1';
EXEC sp_executesql @sql, N'@p1 nvarchar(50)', @dynvalue

But the following code will return an error when trying to pass in the name of the table:

declare @sql nvarchar(500)
declare @dynvalue nvarchar(50)
select @dynvalue='appusers'
SET @sql = N'SELECT * FROM @p1';
EXEC sp_executesql @sql, N'@p1 nvarchar(50)', @dynvalue

Msg 1087, Level 16, State 1, Line 1
Must declare the table variable "@p1".

If you are dynamically constructing SQL, and you are changing parts of the SQL syntax other than the value of variables, you need to manually write the code yourself to test for the risk of SQL injection in those pieces of the SQL.  This is difficult to do and probably best handled by the application calling the stored procedure.  I recommend that the calling program do the following at a minimum before calling a stored procedure that dynamically constructs SQL:

  1. Validate the length of the parameter. Don’t allow input longer than the maximum length expected.  If the stored procedure allows a column to be passed in that is used for sorting in an ORDER BY clause, and all of your column names are less than or equal to 10 characters in length, then make sure that the length of the parameter passed in does not exceed 10 characters.
  2. Don’t allow a single single quote; make sure to replace a single single quote with two single quotes.
  3. Don’t allow other special characters or commands, such as a semicolon, the UNION keyword, or the two hyphens that start a comment in SQL.
  4. Don’t allow character values greater than 255.

That short list is not sufficient to prevent all SQL Injection attacks, but it will block a lot of them and gives you an idea of the challenge involved in preventing SQL Injection attacks from being effective.
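For the ORDER BY case described in the list, a calling program might apply those checks along these lines. This is a sketch under stated assumptions: the class name is hypothetical, the column names are examples, and the allow-list of known column names goes a step beyond the four rules above.

```csharp
using System;
using System.Collections.Generic;

public static class SortColumnValidator // hypothetical helper
{
	// Assumption: the only columns callers may sort by; the names are examples.
	static readonly HashSet<string> AllowedColumns =
		new HashSet<string>(StringComparer.OrdinalIgnoreCase) { "MyData", "username" };

	const int MaxLength = 10; // longest column name expected, per rule 1

	public static bool IsValidSortColumn(string column)
	{
		if (string.IsNullOrEmpty(column) || column.Length > MaxLength) return false;
		foreach (char c in column)
		{
			// Rejects quotes, semicolons, hyphens, spaces, and anything outside basic ASCII.
			bool isAsciiLetterOrDigit =
				(c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z') || (c >= '0' && c <= '9');
			if (!isAsciiLetterOrDigit && c != '_') return false;
		}
		return AllowedColumns.Contains(column);
	}
}
```

When the set of valid values is small and known, as with column names, comparing against an allow-list is usually simpler and safer than trying to enumerate every dangerous character.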

If you would like to see for yourself how the EXEC and sp_executesql statements behave I have provided a script you can use to get started.  Related to this article, the most important query to understand is the last one because it shows a case of SQL injection even though the dynamically generated SQL is run using sp_executesql.

--1. Create tables and add rows
DROP TABLE InjectionExample
GO
DROP TABLE Users
GO
CREATE TABLE InjectionExample ( MyData varchar (500) NULL)
GO
INSERT INTO InjectionExample VALUES('the expecteddata exists'), ('data only returned via sql injection')
GO
CREATE TABLE Users( username varchar(50) NULL,[password] varchar(50) NULL)
go
INSERT INTO Users VALUES ('user1','password1'), ('user2','password2'), ('user3','password3')
GO
--2. Run a test using EXEC with data the programmer expects
declare @sql nvarchar(500)
declare @p1 nvarchar(50)
select @p1 = 'expecteddata'
select @sql = 'SELECT * FROM InjectionExample WHERE MyData LIKE ''%' + @p1 + '%'''
exec (@sql)--returns 1 row as expected
GO

--3. Run a test using EXEC with data the hacker used for sql injection
declare @sql nvarchar(500)
declare @p1 nvarchar(50)
select @p1 = ''' or 1 = 1--'
select @sql = 'SELECT * FROM InjectionExample WHERE MyData LIKE ''%' + @p1 + '%'''
exec (@sql)--returns all rows - vulnerable to sql injection
GO

--4. Run a test using sp_executeSQL to prevent this SQL Injection
declare @sql nvarchar(500)
declare @p1 nvarchar(50)
select @p1 = 'expecteddata'
select @sql = N'select * from InjectionExample WHERE MyData LIKE ''%'' + @param1 + ''%'''
exec sp_executesql @sql, N'@param1 varchar(50)', @p1
GO

--5. Run a test using sp_executeSQL to prevent this SQL Injection - hacker data returns no results
declare @sql nvarchar(500)
declare @p1 nvarchar(50)
declare @pOrd nvarchar(50)
select @p1 = ''' or 1 = 1--'
set @pOrd = 'MyData'
select @sql = N'select * from InjectionExample WHERE MyData LIKE ''%'' + @param1 + ''%'' order by ' + @pOrd
exec sp_executesql @sql, N'@param1 varchar(50)', @p1
GO

--6. But sp_executesql does not protect against all sql injection!
--In this case, sql is injected into the @pOrd variable to pull data from another table
declare @sql nvarchar(500)
declare @p1 nvarchar(50)
declare @pOrd nvarchar(50)
set @p1 = 'expecteddata'
set @pOrd = 'MyData; SELECT * FROM Users'
select @sql = 'select * from InjectionExample WHERE MyData LIKE ''%'' + @param1 + ''%'' order by ' +@pOrd
exec sp_executesql @sql, N'@param1 nvarchar(50)', @p1

 

 

Posted in CodeProject, Security, SQL Server

How to Upgrade to a Stronger Password Hash. Such as Upgrading from MD5 to BCrypt.

Posted by robkraft on July 23, 2018

Years ago I upgraded the hash algorithm in our database application from a custom algorithm to BCrypt.  Although the implementation went well I lamented the need to retain the old algorithm until such time as all users had logged in and replaced their old hashed password with a new hashed password.  Since that time I have learned there is a better way to upgrade to a stronger password hash so I am sharing it here in case I need to do this again in the future and possibly to help someone else implement a better approach.

In a nutshell, what I should have done is hash all the existing hashed passwords using BCrypt immediately.

If that does not give you the hint you need to see the solution, allow me to explain in more detail.

Assume you have a database full of user names and passwords, the passwords were all hashed using MD5, and you now want to use BCrypt to hash them.  The approach a lot of us take, including myself, is to add the BCrypt algorithm to the application code.  When users log in to the updated application, the code does the following:

  • Person enters user name and password
  • Application hashes the password using MD5
  • If the MD5 hash matches the hashed MD5 password in the database:
    • The application hashes the password using BCrypt and stores that in the database
  • In the future, the application will use the BCrypt hash for the user instead of the MD5 password.

The problem with this approach is that the application must continue to support the old hash algorithm (MD5 in this example) until all users have logged in and converted their password hashes to BCrypt.

There is a better way.

  • Immediately hash all the MD5 hashes using BCrypt.  (Note, this is not hashing the passwords, because we do not know the passwords.  We are hashing the MD5 hash of the passwords.)
    • This allows us to immediately have all the passwords hashed with the stronger BCrypt algorithm.
  • As users log in to the application, hash the password they enter using BCrypt and see if it matches the hash stored in the database.
    • If not, then hash the password they enter using MD5 (your old algorithm), and then hash the result of that operation using BCrypt.  If that hash matches what is stored in the database, allow the user access to the application and also create a new hash of their password directly using BCrypt and update the hash stored in the database.

As mentioned, the benefit of this approach is that you immediately convert all the stored passwords to the stronger algorithm, instead of implementing a gradual process that does not convert all the passwords until all user accounts have eventually logged in, which could be never for some applications.
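The better approach can be sketched in C# as follows. To keep the example self-contained, PBKDF2 (Rfc2898DeriveBytes, which ships with .NET) stands in for BCrypt; in a real application you would call your BCrypt library in StrongHash and StrongVerify instead. All names are hypothetical, the iteration count is only illustrative, and the MD5 hash is hex-encoded before being wrapped so that the strong hash always receives plain text.

```csharp
using System;
using System.Security.Cryptography;
using System.Text;

public static class HashUpgradeSketch // hypothetical names; PBKDF2 stands in for BCrypt
{
	const int Iterations = 100000; // illustrative work factor

	// The legacy algorithm: MD5 of the password, as lowercase hex.
	public static string Md5Hex(string password)
	{
		using (var md5 = MD5.Create())
		{
			byte[] hash = md5.ComputeHash(Encoding.UTF8.GetBytes(password));
			return BitConverter.ToString(hash).Replace("-", "").ToLowerInvariant();
		}
	}

	// Returns "salt:hash", both base64. Stand-in for a BCrypt hash call.
	public static string StrongHash(string input)
	{
		byte[] salt = new byte[16];
		using (var rng = RandomNumberGenerator.Create()) rng.GetBytes(salt);
		return Convert.ToBase64String(salt) + ":" + Convert.ToBase64String(Derive(input, salt));
	}

	// Stand-in for a BCrypt verify call. A production check would compare in constant time.
	public static bool StrongVerify(string input, string stored)
	{
		string[] parts = stored.Split(':');
		byte[] salt = Convert.FromBase64String(parts[0]);
		return Convert.ToBase64String(Derive(input, salt)) == parts[1];
	}

	static byte[] Derive(string input, byte[] salt)
	{
		using (var kdf = new Rfc2898DeriveBytes(input, salt, Iterations, HashAlgorithmName.SHA256))
			return kdf.GetBytes(32);
	}

	// One-time migration: wrap each stored MD5 hash in the strong hash.
	public static string WrapLegacyHash(string storedMd5Hex)
	{
		return StrongHash(storedMd5Hex);
	}

	// Login check. newStoredHash is set when the stored value should be replaced.
	public static bool Login(string password, string stored, out string newStoredHash)
	{
		newStoredHash = null;
		if (StrongVerify(password, stored)) return true; // already hashed directly
		if (StrongVerify(Md5Hex(password), stored))      // still a wrapped MD5 hash
		{
			newStoredHash = StrongHash(password);        // upgrade to a direct hash
			return true;
		}
		return false;
	}
}
```

Once every row has been migrated with WrapLegacyHash, no bare MD5 hash remains in the database, even for accounts that never log in again.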

Posted in Code Design, Security, Uncategorized

How To Protect Against “Man Over The Shoulder” Attacks

Posted by robkraft on June 24, 2018

Replacing the characters we type in a password field with asterisks or dots is so common that we don’t question the value or purpose of it.  Most people don’t realize that the technique serves just one purpose, and that is to protect people from “Man over the shoulder” attacks.

So when do we need this protection?  Only when we are in coffee shops, or airports, or other public places where someone is looking over our shoulders to find out what password we are typing.  And possibly also in our work environments when we are sitting with co-workers and one person is logging in to a system or application.  But when you are in your home alone, or when you are at your desk alone at work, or when you are out in public but in a place where no one is looking over your shoulder to see what you type, hiding the password you type is unnecessary.  In fact, hiding the password we are typing is sometimes worse than unnecessary, it is counter-productive.  How so?  Being unable to see the password we type causes us to choose passwords that are easier to remember and get correct.  A complicated long password is more difficult to enter correctly when you cannot confirm what you have typed than is a shorter password that meets the minimum criteria required by the application.  Thus people are more likely to use a simple password when they don’t have the ability to review it later as they type it again.

The best resolution to this conundrum is probably to have an option to allow the user to see the password they are typing.  Perhaps a checkbox next to the password field to show the password to the user as they type it.  Or, as we see in some browsers, a little eyeball icon that lets users reveal the password they have typed so far.

If you are concerned that revealing passwords assists hackers and spyware and malicious JavaScript that may be running on your computer to discover your password, your fears are unfounded.  That malicious code that is already running on your computer can reveal the password for itself without human intervention in just the same way that the application using the password does.  The one exception to this rule would be software that is recording a video of what you are doing on your computer.  Most tracking and monitoring software does not use this approach, but some does.

I recommend we keep masking passwords, but all password fields should also provide an option to allow the user to see the password they have typed.

[Image: GoodLoginScreen]

 

 

 

Posted in Security

Flawed Logic in W3C Spec 3.2 of HTML-Design-Principles “Priority of Constituencies” – AutoComplete Bug

Posted by robkraft on April 28, 2017

I believe there is a conceptual flaw in the W3C spec regarding “priority of constituencies” (https://www.w3.org/TR/html-design-principles/#priority-of-constituencies).

I agree with this explanation http://www.schemehostport.com/2011/10/priority-of-constituencies.html for sites like Facebook where users own their data, but not for company sites where companies own the data and users are just performing a role regarding company data.  Owners of data, a category not considered as separate from users in the 2004 W3C spec, should be given priority over the users of the data.  Company owners of data want to keep users from making poor security decisions such as choosing to store their passwords in their browsers, so company owners should be allowed to ask their authors to remove the ability for users to store passwords to company roles in their browsers.  By not allowing this ability, Chrome and other browsers cause developers needing this ability to implement techniques that may introduce new security flaws.

Here is a workaround for Chrome: http://stackoverflow.com/questions/35049555/chrome-autofill-autocomplete-no-value-for-password
Here are some other approaches: http://stackoverflow.com/questions/11708092/detecting-browser-autofill

Posted in Security

Malware for Neural Networks: Let’s Get Hacking!

Posted by robkraft on March 24, 2017

I don’t intend to infect any artificial intelligence systems with malware. But I do intend to provide an overview of the techniques that can be used to damage the most popular AI in use today, neural networks.

With traditional hacking attempts, bad actors attempt to plant their own instructions, their own computer code, into an existing software environment to cause existing software to behave badly. But these techniques will not work on neural networks. Neural networks are nothing more than a big collection of numbers and mathematical algorithms that no human can understand well enough to alter in order to obtain a malicious desired outcome. Neural networks are trained, not programmed.

But I am not implying that damage cannot be done to neural networks, or that they can’t be corrupted for evil purposes. I am implying that the techniques for malware must be different.

I have identified five types of malware, or perhaps I should say five techniques, for damaging a neural network.

1. Transplant

The simplest technique for changing the behavior of an existing neural network is probably to replace the existing neural network with a new one. The new, malicious, neural network presumably would be one that you have trained using the same inputs the old one expected, but the new one would produce different outcomes based on the same inputs. To successfully implement this malware, the hacker would first need to train the replacement neural network, and to do so the hacker needs to know the number of input nodes and the number of output nodes, and also the range of values for each input and the range of results of each output node. The replacement neural net would need to be trained to take the inputs and produce the outputs the hacker desires. The second major task would be to substitute the original neural network with the new neural network. Neural networks accessible to the Internet could be replaced once the hacker had infiltrated the servers and software of the existing neural network. It could be as simple as replacing a file, or it could require hacking a database and replacing values in different tables. This all depends on how the data for the neural network is stored, and that would be a fact the hacker would want to learn prior to even attempting to train a replacement neural network. Some neural networks are embedded in electronic components. A subset of these could be updated in a manner similar to updating firmware on a device, but other embedded neural networks may have no option for upgrades or alterations and the only recourse for the hacker may be to replace the hardware component with a similar hardware component that has the malicious neural network embedded in it. Obviously there are cases where physical access to the device may be required in order to transplant a neural network.

2. Lobotomy

If a hacker desires to damage a neural network, but is unable or unwilling to train a replacement neural network, the hacker could choose the brute force technique called the lobotomy. As you might guess, when the hacker performs a lobotomy the hacker is randomly altering the weights and algorithms of the network in order to get it to misbehave. The hacker is unlikely to be able to choose a desired outcome or make the neural network respond to specific inputs with specific outputs, but the random alterations introduced by the hacker may lead the neural network to malfunction and produce undesirable outputs. If a hacker’s goal is to sow distrust in the user community of a specific neural network or of neural networks in general, this may be the best technique for doing so. If one lobotomy can lead a machine to choose a course of action that takes a human life, public sentiment against neural networks will grow. As with a transplant, the hacker also needs to gain access to the data of the existing neural network in order to alter that data.

3. Paraphasia

Of the five hacking techniques presented here, I think that paraphasia is the most interesting because I believe it is the one a hacker is most likely to have success with. The term is borrowed from psychology to describe a human disorder that causes a person to say one word when they mean another. In an artificial neural network, paraphasia results when a saboteur maps the response from the neural network to incorrect meanings. Imagine that Tony Stark, aka Iron Man, creates a neural network that uses face recognition to identify each of the Avengers. When an image of Captain America is fed through the neural network layers, the neural network recognizes him and assigns the label “Captain America” to the image. But a neural network that has been infected with paraphasia would see that image and assign the label “Loki” to it. Technically speaking, paraphasia is probably not accomplished by manipulating the algorithms and weights of the neural network. Rather, it is achieved by manipulating the labels assigned to the outputs. This makes it the most likely candidate for a successful neural network hacking attempt. If I can alter the software consuming the output of a neural network so that when it sees my face it doesn’t assign my name to it, but instead assigns “President of the United States” to it, I may be able to get into secret facilities that I would otherwise be restricted from.
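
The following toy Python sketch illustrates the distinction. The network itself (a made-up single layer with invented weights and feature values) is left untouched; only the table that turns an output node index into a name is altered. All names and numbers here are assumptions for illustration.

```python
def predict(weights, inputs):
    """Toy single-layer network: one weighted sum per output node, pick the largest."""
    scores = [sum(w * x for w, x in zip(row, inputs)) for row in weights]
    return scores.index(max(scores))

# Hypothetical trained network; it is never modified in this sketch.
weights = [[1.0, 0.0], [0.0, 1.0]]

# The software consuming the output maps each output node to a label.
labels = ["Captain America", "Loki"]
tampered_labels = ["Loki", "Captain America"]  # the saboteur swapped the mapping

captain_america_features = [0.9, 0.1]  # made-up features for an image
node = predict(weights, captain_america_features)

print(labels[node])           # label from the untouched mapping
print(tampered_labels[node])  # label after the mapping was altered
```

The network picks the same output node in both cases; only the meaning attached to that node differs.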

Open and Closed Networks

The first three hacking techniques could be applied to neural networks that are either open or closed. A closed neural network is a network that no longer adjusts its weights and algorithms based on new inputs. Neural networks embedded in hardware will often be closed, but the designers of any neural network may choose to close it if they feel it has been trained to an optimal state. An open neural network is a network that continues to adjust its weights and algorithms based on new inputs. This means that an open neural network is exposed to two additional forms of attack.
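
As a minimal sketch of the difference, the hypothetical `TinyNetwork` class below has an `is_open` flag. When the flag is off, feedback is ignored and the weights never change; when it is on, a simple perceptron-style correction is applied. The class, the update rule, and the numbers are assumptions for illustration, not a description of how any particular product works.

```python
class TinyNetwork:
    """Toy single-layer network that optionally keeps learning from feedback."""

    def __init__(self, weights, is_open):
        self.weights = [row[:] for row in weights]  # private copy
        self.is_open = is_open

    def predict(self, inputs):
        scores = [sum(w * x for w, x in zip(row, inputs)) for row in self.weights]
        return scores.index(max(scores))

    def observe(self, inputs, correct_index, rate=0.1):
        """Feedback after a prediction. A closed network ignores it."""
        if not self.is_open:
            return
        predicted = self.predict(inputs)
        if predicted != correct_index:
            # Perceptron-style correction: nudge toward the claimed answer.
            for i, x in enumerate(inputs):
                self.weights[correct_index][i] += rate * x
                self.weights[predicted][i] -= rate * x

start = [[1.0, 0.0], [0.0, 1.0]]
closed_net = TinyNetwork(start, is_open=False)
open_net = TinyNetwork(start, is_open=True)

# Both networks receive the same (possibly dishonest) feedback.
for net in (closed_net, open_net):
    net.observe([0.9, 0.1], correct_index=1)

print(closed_net.weights == start)  # closed network is unchanged
print(open_net.weights == start)    # open network has moved
```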

4. Brainwashing

Many neural networks we use today continue to evolve their learning algorithms in order to improve their responses. Many voice recognition systems attempt to understand the vocalizations of their primary users and adapt their responses to produce the desired outcomes. Some neural networks that buy and sell stocks alter their algorithms and weights with feedback from the results of those purchases and sales. Neural network designers often strive to create networks that can learn and improve without human intervention. Others attempt to crowdsource the training of their neural networks, and one example of this you may be familiar with is CAPTCHA responses that ask you to identify items in pictures. The CAPTCHA producer is probably using your response not only to confirm that you are a human, but also to train their neural network on image recognition.

Now, imagine that you had a way to consistently lie to the CAPTCHA collection neural network. For dramatic effect, let’s pretend that the CAPTCHA engine showed you nine images of people and asked you to click on the image of the President of the United States. Then imagine that, as a hacker, you are able to pick the image of your own face millions of times instead of the face of the President. Eventually you may be able to deceive the neural network into believing that you are the President of the United States. Once you had completed this brainwashing of the neural network, you could go to the top secret area and the facial recognition software would let you in because it believed you to be the President. I am not saying that brainwashing would be easy. I think it would be really difficult. And I think it would only work in the case where you could programmatically feed a lot of inputs to the neural network and have some control over the identification of the correct response.

For best results, a hacker might attempt to use this technique on a neural network that was not receiving updates through a network like the Internet, but was only receiving updates from a local source. A neural network running inside an automated car or manufacturing facility may operate with this design. Brainwashing is similar to paraphasia. The difference is that in brainwashing, you train the neural network to misidentify the output, but in paraphasia you take a trained neural network and map its output to an incorrect value.
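
A toy Python sketch of brainwashing follows. It assumes an open network that applies a simple perceptron-style correction every time a user's feedback disagrees with its prediction; that update rule, the two-label setup, and the face features are all invented for illustration. A real system would need vastly more feedback and would likely have safeguards, which is part of why I think this attack is difficult.

```python
def predict(weights, inputs):
    """Toy single-layer network: one weighted sum per output node, pick the largest."""
    scores = [sum(w * x for w, x in zip(row, inputs)) for row in weights]
    return scores.index(max(scores))

def feedback(weights, inputs, claimed_index, rate=0.1):
    """Apply one round of user feedback. Returns True if the weights were changed."""
    predicted = predict(weights, inputs)
    if predicted == claimed_index:
        return False
    for i, x in enumerate(inputs):
        weights[claimed_index][i] += rate * x
        weights[predicted][i] -= rate * x
    return True

labels = ["Hacker", "President"]
weights = [[1.0, 0.0], [0.0, 1.0]]  # hypothetical trained weights
hacker_face = [0.9, 0.1]            # made-up features for the hacker's face

before = labels[predict(weights, hacker_face)]

# The hacker repeatedly claims that this face belongs to the President.
updates = 0
for _ in range(1000):
    if feedback(weights, hacker_face, claimed_index=labels.index("President")):
        updates += 1

after = labels[predict(weights, hacker_face)]
print(before, "->", after, "after", updates, "weight updates")
```

Unlike paraphasia, the label table is untouched here; it is the weights themselves that have been trained into giving the wrong answer.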

5. Overstimulation

Like a lobotomy, the overstimulation technique only allows the hacker to cause mischief by leading the neural network to make incorrect choices. The hacker is very unlikely to achieve a specific desired outcome from the neural network. Overstimulation can only occur on poorly designed neural networks, essentially those that are subject to the overfitting flaw of neural network design. A neural network that is not closed, and that was designed with an inappropriate number of nodes or layers, could be damaged by high volumes of inputs that were not examples from the original training set.

Layers of difficulty

To all you aspiring hackers, I also warn you that our neural networks are getting more complex and sophisticated every day, and I think this makes it even more difficult to hack them using the techniques described here. The deep learning revolution has been successful in many cases because multiple neural networks work in sequence to produce a response. The first neural network in the sequence may just try to extract features from the incoming sources. The identified features are the output of the first network, and these are passed into a second neural network for more grouping, classification, or identification. After that, the results could be passed on to another neural network that makes responses based upon the outputs of the previous neural network. Therefore, any attempted hack upon the process needs to decide which of the neural networks within the sequence to damage.
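
To picture the sequence, here is a toy Python sketch with three stages: a feature extractor, a classifier, and a final step that turns the classification into a response. Each stage is a made-up weighted-sum layer with invented weights, and the action names are hypothetical. Real deep learning pipelines are far larger, but the shape of the hand-off is the same: each stage only sees the output of the stage before it.

```python
def layer(weights, inputs):
    """One toy layer: a weighted sum per output node."""
    return [sum(w * x for w, x in zip(row, inputs)) for row in weights]

# Stage 1: feature extraction (4 raw inputs -> 2 features). Weights are made up.
feature_weights = [[0.5, 0.5, 0.0, 0.0],
                   [0.0, 0.0, 0.5, 0.5]]

# Stage 2: classification (2 features -> 2 class scores).
class_weights = [[1.0, 0.0],
                 [0.0, 1.0]]

# Stage 3: response, chosen from the winning class (hypothetical actions).
actions = ["open door", "keep door locked"]

def respond(raw_inputs):
    features = layer(feature_weights, raw_inputs)
    scores = layer(class_weights, features)
    return actions[scores.index(max(scores))]

raw = [0.8, 0.6, 0.1, 0.3]
features = layer(feature_weights, raw)
result = respond(raw)
print(features, result)
```

An attacker who tampers with one stage still has to account for what the other stages do with its output.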

I am not encouraging you to try to introduce malware into neural networks. I am strongly opposed to anyone attempting to do such things. But I believe it is important for system engineers to be aware of potential ways in which a neural network may be compromised, and raising that awareness is the only purpose of this article.

Posted in CodeProject, Security