避免没有参数的 SQL 注入

在这里,我们还将讨论在代码中使用参数化 sql 查询的问题。我们在讨论中有两个方面: 我和其他一些人说,我们应该始终使用参数来防止 sql 注入,而其他人则认为没有必要这样做。相反,他们希望在所有字符串中用两个撇号替换单个撇号,以避免 sql 注入。我们的数据库都在运行 SqlServer2005或2008,而我们的代码库正在运行。NET 框架2.0。

让我给你一个 C # 中的简单例子:

我想我们用这个:

string sql = "SELECT * FROM Users WHERE Name=@name";
SqlCommand getUser = new SqlCommand(sql, connection);
getUser.Parameters.AddWithValue("@name", userName);
//... blabla - do something here, this is safe

而其他人想这么做:

string sql = "SELECT * FROM Users WHERE Name=" + SafeDBString(name);
SqlCommand getUser = new SqlCommand(sql, connection);
//... blabla - are we safe now?

其中 SafeDBString 函数定义如下:

string SafeDBString(string inputValue)
{
return "'" + inputValue.Replace("'", "''") + "'";
}

现在,只要对查询中的所有字符串值使用 SafeDBString,就应该是安全的,对吗?

使用 SafeDBString 函数有两个原因。首先,这是自石器时代以来一直采用的方法,其次,调试 sql 语句更容易,因为您可以看到在数据库上运行的精确查询。

那么。我的问题是,使用 SafeDBString 函数来避免 sql 注入攻击是否真的足够。我一直在尝试找出破坏这种安全措施的代码的例子,但是我找不到任何这方面的例子。

有人能打破这个吗? 你会怎么做?

编辑: 总结迄今为止的答复:

  • 目前还没有人找到在 SqlServer2005或2008上绕过 SafeDBString 的方法。我觉得这很好?
  • 一些答复指出,使用参数化查询可以提高性能。原因是可以重用查询计划。
  • 我们还同意,使用参数化查询可以提供更易于维护的可读代码
  • 此外,总是使用参数比使用各种版本的 SafeDBString、字符串到数字的转换和字符串到日期的转换更容易。
  • 使用参数可以自动进行类型转换,这在处理日期或十进制数时特别有用。
  • 最后,朱利安写的 不要试图自己做安全。数据库供应商在安全性方面花费了大量的时间和金钱。我们没有办法做得更好,也没有理由去做他们的工作。

因此,虽然没有人能够破坏 SafeDBString 函数的简单安全性,但我得到了许多其他好的参数。谢谢!

30396 次浏览

I'd use stored procedures or functions for everything, so the question wouldn't arise.

Where I have to put SQL into code, I use parameters, which is the only thing that makes sense. Remind the dissenters that there are hackers smarter than they are, and with better incentive to break the code that's trying to outsmart them. Using parameters, it's simply not possible, and it's not like it's difficult.

I have used both approaches to avoid SQL injection attacks and definitely prefer parametrized queries. When I have used concatenated queries I have used a library function to escape the variables (like mysql_real_escape_string) and wouldn't be confident I have covered everything in a proprietary implementation (as it seems you are too).

And then somebody goes and uses " instead of '. Parameters are, IMO, the only safe way to go.

It also avoids a lot of i18n issues with dates/numbers; what date is 01/02/03? How much is 123,456? Do your servers (app-server and db-server) agree with each-other?

If the risk factor isn't convincing to them, how about performance? The RDBMS can re-use the query plan if you use parameters, helping performance. It can't do this with just the string.

First of all, your sample for the "Replace" version is wrong. You need to put apostrophes around the text:

string sql = "SELECT * FROM Users WHERE Name='" + SafeDBString(name) & "'";
SqlCommand getUser = new SqlCommand(sql, connection);

So that's one other thing parameters do for you: you don't need to worry about whether or not a value needs to be enclosed in quotes. Of course, you could build that into the function, but then you need to add a lot of complexity to the function: how to know the difference between 'NULL' as null and 'NULL' as just a string, or between a number and a string that just happens to contain a lot of digits. It's just another source for bugs.

Another thing is performance: parameterized query plans are often cached better than concatenated plans, thus perhaps saving the server a step when running the query.

Additionally, escaping single quotes isn't good enough. Many DB products allow alternate methods for escaping characters that an attacker could take advantage of. In MySQL, for example, you can also escape a single quote with a backslash. And so the following "name" value would blow up MySQL with just the SafeDBString() function, because when you double the single quote the first one is still escaped by the backslash, leaving the 2nd one "active":

x\' OR 1=1;--


Also, JulianR brings up a good point below: NEVER try to do security work yourself. It's so easy to get security programming wrong in subtle ways that appear to work, even with thorough testing. Then time passes and a year later your find out your system was cracked six months ago and you never even knew it until just then.

Always rely as much as possible on the security libraries provided for your platform. They will be written by people who do security code for a living, much better tested than what you can manage, and serviced by the vendor if a vulnerability is found.

With parameterised queries you get more than protection against sql injection. You also get better execution plan caching potential. If you use the sql server query profiler you can still see the 'exact sql that is run on the database' so you're not really losing anything in terms of debugging your sql statements either.

You aren't able to easily do any type checking of the user input without using parameters.

If you use the SQLCommand and SQLParameter classes to make you're DB calls, you can still see the SQL query that's being executed. Look at the SQLCommand's CommandText property.

I'm always a litle suspect of the roll-your-own approach to preventing SQL injection when parameterized queries are so easy to use. Second, just because "it's always been done that way" doesn't mean it's the right way to do it.

From the very short time I've had to investigate SQL injection problems, I can see that making a value 'safe' also means that you're shutting the door to situations where you might actually want apostrophes in your data - what about someone's name, eg O'Reilly.

That leaves parameters and stored procedures.

And yes, you should always try to implement code in the best way you know now - not just how its always been done.

This is only safe if you're guaranteed that you're going to pass in a string.

What if you're not passing in a string at some point? What if you pass just a number?

http://www.mywebsite.com/profile/?id=7;DROP DATABASE DB

Would ultimately become:

SELECT * FROM DB WHERE Id = 7;DROP DATABASE DB

Here are a couple of articles that you might find helpful in convincing your co-workers.

http://www.sommarskog.se/dynamic_sql.html

http://unixwiz.net/techtips/sql-injection.html

Personally I prefer to never allow any dynamic code to touch my database, requiring all contact to be through sps (and not one which use dynamic SQl). This means nothing excpt what I have given users permission to do can be done and that internal users (except the very few with production access for admin purposes) cannot directly access my tables and create havoc, steal data or commit fraud. If you run a financial application, this is the safest way to go.

Agree hugely on the security issues.
Another reason to use parameters is for efficiency.

Databases will always compile your query and cache it, then re-use the cached query (which is obviously faster for subsequent requests). If you use parameters then even if you use different parameters the database will re-use your cached query as it matches based on the SQL string before binding the parameters.

If however you don't bind parameters then the SQL string changes on every request (that has different parameters) and it will never match what's in your cache.

I think the correct answer is:

Don't try to do security yourself. Use whatever trusted, industry standard library there is available for what you're trying to do, rather than trying to do it yourself. Whatever assumptions you make about security, might be incorrect. As secure as your own approach may look (and it looks shaky at best), there's a risk you're overlooking something and do you really want to take that chance when it comes to security?

Use parameters.

Here are a few reasons to use parameterized queries:

  1. Security - The database access layer knows how to remove or escape items that are not allowed in data.
  2. Separation of concerns - My code is not responsible for transforming the data into a format that the database likes.
  3. No redundancy - I don't need to include an assembly or class in every project that does this database formatting/escaping; it's built in to the class library.

The argument is a no-win. If you do manage to find a vulnerability, your co-workers will just change the SafeDBString function to account for it and then ask you to prove that it's unsafe all over again.

Given that parametrized queries are an undisputed programming best practice, the burden of proof should be on them to state why they aren't using a method that is both safer and better performing.

If the issue is rewriting all the legacy code, the easy compromise would be to use parametrized queries in all new code, and refactor old code to use them when working on that code.

My guess is the actual issue is pride and stubbornness, and there's not much more you can do about that.

There were few vulnerability(I can't remember which database it was) that is related to buffer overflow of the SQL statement.

What I want to say is, SQL-Injection is more then just "escape the quote", and you have no idea what will come next.

I did not see any other answsers address this side of the 'why doing it yourself is bad', but consider a SQL Truncation attack.

There is also the QUOTENAME T-SQL function that can be helpful if you can't convince them to use params. It catches a lot (all?) of the escaped qoute concerns.

It can be broken, however the means depends on exact versions/patches etc.

One that has already been brought up is the overflow/truncation bug that can be exploited.

Another future means would be finding bugs similar to other databases - for example the MySQL/PHP stack suffered an escaping problem because certain UTF8 sequences could be used to manipulate the replace function - the replace function would be tricked into introducing the injection characters.

At the end of the day, the replacement security mechanism relies on expected but not intended functionality. Since the functionality was not the intended purpose of the code, there is a high probablity that some discovered quirk will break your expected functionality.

If you have a lot of legacy code, the replace method could be used as a stopgap to avoid lengthy rewriting and testing. If you are writing new code, there is no excuse.

So I'd say:

1) Why are you trying to re-implement something that's built in? it's there, readily available, easy to use and already debugged on a global scale. If future bugs are found in it, they'll be fixed and available to everyone very quickly without you having to do anything.

2) What processes are in place to guarantee that you never miss a call to SafeDBString? Missing it in just 1 place could open up a whole host of issues. How much are you going to eyeball these things, and consider how much wasted that effort is when the accepted correct answer is so easy to reach.

3) How certain are you that you've covered off every attack vector that Microsoft(the author of the DB and the access library) knows about in your SafeDBString implementation ...

4) How easy is it to read the structure of the sql? The example uses + concatenation, parameters are very like string.Format, which is more readable.

Also, there are 2 ways of working out what was actually run - roll your own LogCommand function, a simple function with no security concerns, or even look at an sql trace to work out what the database thinks is really going on.

Our LogCommand function is simply:

    string LogCommand(SqlCommand cmd)
{
StringBuilder sb = new StringBuilder();
sb.AppendLine(cmd.CommandText);
foreach (SqlParameter param in cmd.Parameters)
{
sb.Append(param.ToString());
sb.Append(" = \"");
sb.Append(param.Value.ToString());
sb.AppendLine("\"");
}
return sb.ToString();
}

Right or wrong, it gives us the information we need without security issues.

Another important consideration is keeping track of escaped and unescaped data. There are tons and tons of applications, Web and otherwise, that don't seem to properly keep track of when data is raw-Unicode, &-encoded, formatted HTML, et cetera. It's obvious that it will become difficult to keep track of which strings are ''–encoded and which aren't.

It's also a problem when you end up changing the type of some variable — perhaps it used to be an integer, but now it's a string. Now you have a problem.

For the reasons already given, parameters are a very good idea. But we hate using them because creating the param and assigning its name to a variable for later use in a query is a triple indirection head wreck.

The following class wraps the stringbuilder that you will commonly use for building SQL requests. It lets you write paramaterized queries without ever having to create a parameter, so you can concentrate on the SQL. Your code will look like this...

var bldr = new SqlBuilder( myCommand );
bldr.Append("SELECT * FROM CUSTOMERS WHERE ID = ").Value(myId, SqlDbType.Int);
//or
bldr.Append("SELECT * FROM CUSTOMERS WHERE NAME LIKE ").FuzzyValue(myName, SqlDbType.NVarChar);
myCommand.CommandText = bldr.ToString();

Code readability, I hope you agree, is greatly improved, and the output is a proper parameterized query.

The class looks like this...

using System;
using System.Collections.Generic;
using System.Text;
using System.Data;
using System.Data.SqlClient;


namespace myNamespace
{
/// <summary>
/// Pour le confort et le bonheur, cette classe remplace StringBuilder pour la construction
/// des requêtes SQL, avec l'avantage qu'elle gère la création des paramètres via la méthode
/// Value().
/// </summary>
public class SqlBuilder
{
private StringBuilder _rq;
private SqlCommand _cmd;
private int _seq;
public SqlBuilder(SqlCommand cmd)
{
_rq = new StringBuilder();
_cmd = cmd;
_seq = 0;
}
//Les autres surcharges de StringBuilder peuvent être implémenté ici de la même façon, au besoin.
public SqlBuilder Append(String str)
{
_rq.Append(str);
return this;
}
/// <summary>
/// Ajoute une valeur runtime à la requête, via un paramètre.
/// </summary>
/// <param name="value">La valeur à renseigner dans la requête</param>
/// <param name="type">Le DBType à utiliser pour la création du paramètre. Se référer au type de la colonne cible.</param>
public SqlBuilder Value(Object value, SqlDbType type)
{
//get param name
string paramName = "@SqlBuilderParam" + _seq++;
//append condition to query
_rq.Append(paramName);
_cmd.Parameters.Add(paramName, type).Value = value;
return this;
}
public SqlBuilder FuzzyValue(Object value, SqlDbType type)
{
//get param name
string paramName = "@SqlBuilderParam" + _seq++;
//append condition to query
_rq.Append("'%' + " + paramName + " + '%'");
_cmd.Parameters.Add(paramName, type).Value = value;
return this;
}


public override string ToString()
{
return _rq.ToString();
}
}
}

Always use parameterized queries where possible. Sometimes even a simple input without the use of any weird characters can already create an SQL-injection if its not identified as a input for a field in the database.

So just let the database do its work of identifying the input itself, not to mention it also saves allot of trouble when you need to actually insert weird characters that otherwise would be escaped or changed. It can even save some valuable runtime in the end for not having to calculate the input.

2 years later, I recidivated... Anyone who finds parameters a pain is welcome to try my VS Extension, QueryFirst. You edit your request in a real .sql file (Validation, Intellisense). To add a parameter, you just type it directly into your SQL, starting with the '@'. When you save the file, QueryFirst will generate wrapper classes to let you run the query and access the results. It will look up the DB type of your parameter and map it to a .net type, which you will find as an input to the generated Execute() methods. Could not be simpler. Doing it the right way is radically quicker and easier than doing it any other way, and creating a sql injection vulnerability becomes impossible, or at least perversely difficult. There are other killer advantages, like being able to delete columns in your DB and immediately see compile errors in your application.

legal disclaimer : I wrote QueryFirst