Replace multiple string elements in C#

Is there a better way of doing this?

MyString.Trim().Replace("&", "and").Replace(",", "").Replace("  ", " ")
.Replace(" ", "-").Replace("'", "").Replace("/", "").ToLower();

I've extended the string class to keep it down to one job, but is there a quicker way?

public static class StringExtension
{
public static string clean(this string s)
{
return s.Replace("&", "and").Replace(",", "").Replace("  ", " ")
.Replace(" ", "-").Replace("'", "").Replace(".", "")
.Replace("eacute;", "é").ToLower();
}
}

Just for fun (and to stop the arguments in the comments), I've put up a gist benchmarking the various examples below.

https://gist.github.com/chrismckee/5937656

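The regex option scores poorly; the dictionary option comes out fastest; the long-winded version of the StringBuilder replace is slightly faster than the shorthand version.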


Quicker: no. More efficient: yes, if you use the StringBuilder class. Strings are immutable, so each Replace call in your implementation returns a new, modified copy of the string, which can hurt performance when the input is long or the method is called often.

If you expect this method to be called frequently on many strings of significant length, it may be better to move its implementation onto the StringBuilder class. With it, every modification is performed directly on that instance, so you avoid the unnecessary copies.

// Requires: using System.Text;
public static class StringExtension
{
public static string clean(this string s)
{
// A single buffer is modified in place by every Replace call.
StringBuilder sb = new StringBuilder(s);

sb.Replace("&", "and");
sb.Replace(",", "");
sb.Replace("  ", " ");
sb.Replace(" ", "-");
sb.Replace("'", "");
sb.Replace(".", "");
sb.Replace("eacute;", "é");

return sb.ToString().ToLower();
}
}

This will be more efficient than chaining string.Replace calls:

public static class StringExtension
{
public static string clean(this string s)
{
return new StringBuilder(s)
.Replace("&", "and")
.Replace(",", "")
.Replace("  ", " ")
.Replace(" ", "-")
.Replace("'", "")
.Replace(".", "")
.Replace("eacute;", "é")
.ToString()
.ToLower();
}
}

Maybe a little more readable?

public static class StringExtension {

private static Dictionary<string, string> _replacements = new Dictionary<string, string>();

static StringExtension() {
_replacements["&"] = "and";
_replacements[","] = "";
_replacements["  "] = " ";
// etc...
}

public static string clean(this string s) {
// Note: Dictionary does not guarantee enumeration order, so this
// only suits replacements that do not depend on each other.
foreach (string to_replace in _replacements.Keys) {
s = s.Replace(to_replace, _replacements[to_replace]);
}
return s;
}
}

Also add New In Town's suggestion about StringBuilder...
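Combining the two ideas might look like the sketch below. It assumes the replacements must run in a fixed order (the double-space rule before the space-to-hyphen rule), so it keeps them in an array instead of a Dictionary, whose enumeration order is not guaranteed. The names OrderedStringExtension and CleanOrdered are hypothetical, and the table only holds a few of the question's rules.

```csharp
using System.Collections.Generic;
using System.Text;

public static class OrderedStringExtension
{
    // Hypothetical rule table: applied top to bottom, so the order is explicit.
    private static readonly KeyValuePair<string, string>[] Replacements =
    {
        new KeyValuePair<string, string>("&", "and"),
        new KeyValuePair<string, string>(",", ""),
        new KeyValuePair<string, string>("  ", " "),
        new KeyValuePair<string, string>(" ", "-"),
    };

    public static string CleanOrdered(this string s)
    {
        // One StringBuilder is mutated in place for every rule.
        var sb = new StringBuilder(s);
        foreach (var pair in Replacements)
        {
            sb.Replace(pair.Key, pair.Value);
        }
        return sb.ToString().ToLower();
    }
}
```

With these rules, "Tom & Jerry,  Inc".CleanOrdered() should come out as "tom-and-jerry-inc".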

I'm doing something similar, but in my case it is for serialization/deserialization, so I need to be able to go in both directions. I find that a string[][] works nearly identically to the dictionary, including initialization, but it also lets you go the other way and return the substitutes to their original values, which a dictionary isn't really set up to do.

Edit: You can use Dictionary<Key, List<Values>> to obtain the same result as string[][].
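A minimal sketch of the two-way idea, assuming each row of the string[][] holds an { original, substitute } pair. The class name TwoWayReplacer, the method names, and the sample pairs are made up for illustration. A clean round trip also assumes the substitutes are unambiguous, as they are here because "&" is escaped first.

```csharp
public static class TwoWayReplacer
{
    // Hypothetical pairs: { original, substitute }.
    private static readonly string[][] Pairs =
    {
        new[] { "&", "&amp;" },
        new[] { "<", "&lt;" },
        new[] { ">", "&gt;" },
    };

    public static string Encode(string s)
    {
        // Forward direction: original -> substitute, in table order.
        foreach (var pair in Pairs)
        {
            s = s.Replace(pair[0], pair[1]);
        }
        return s;
    }

    public static string Decode(string s)
    {
        // Reverse direction: walk the table backwards so the
        // rule applied last is undone first.
        for (int i = Pairs.Length - 1; i >= 0; i--)
        {
            s = s.Replace(Pairs[i][1], Pairs[i][0]);
        }
        return s;
    }
}
```

Here TwoWayReplacer.Decode(TwoWayReplacer.Encode(x)) should return x for inputs such as "a<b & c>d".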

If you are simply after a pretty solution and don't need to save a few nanoseconds, how about some LINQ sugar?

var input = "test1test2test3";
var replacements = new Dictionary<string, string> { { "1", "*" }, { "2", "_" }, { "3", "&" } };


var output = replacements.Aggregate(input, (current, replacement) => current.Replace(replacement.Key, replacement.Value));

There is one thing that can be optimized in the suggested solutions. Each call to Replace() makes another pass over the same string. With very long strings this can be slow, because the data no longer fits in the CPU cache and every pass incurs cache misses. It may be worth replacing multiple strings in a single pass instead.

The essential content from that link:

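// Requires: using System.Collections.Generic; using System.Linq;
// using System.Text.RegularExpressions;
static string MultipleReplace(string text, Dictionary<string, string> replacements) {
// Escape the keys so characters such as "." or "&" are matched literally.
return Regex.Replace(text,
"(" + String.Join("|", replacements.Keys.Select(k => Regex.Escape(k))) + ")",
delegate(Match m) { return replacements[m.Value]; }
);
}
// somewhere else in code
var adict = new Dictionary<string, string>();
string temp = "Jonathan Smith is a developer";
adict.Add("Jonathan", "David");
adict.Add("Smith", "Seruyange");
string rep = MultipleReplace(temp, adict);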
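A single pass can also be written without Regex. The sketch below is one possible approach and does not come from the linked article: it scans the text once and, at each position, applies the first rule whose key matches there. The names SinglePassReplacer and ReplaceAll are hypothetical. Replaced text is never rescanned, so the result can differ from a chain of Replace calls in which one rule feeds the next.

```csharp
using System.Collections.Generic;
using System.Text;

public static class SinglePassReplacer
{
    public static string ReplaceAll(string text, IEnumerable<KeyValuePair<string, string>> rules)
    {
        var sb = new StringBuilder(text.Length);
        int i = 0;
        while (i < text.Length)
        {
            bool matched = false;
            foreach (var rule in rules)
            {
                // Ordinal comparison of the key against the text at position i.
                if (rule.Key.Length > 0 &&
                    string.CompareOrdinal(text, i, rule.Key, 0, rule.Key.Length) == 0)
                {
                    sb.Append(rule.Value);
                    i += rule.Key.Length; // skip past the matched key
                    matched = true;
                    break;
                }
            }
            if (!matched)
            {
                sb.Append(text[i]); // no rule applies: copy the character
                i++;
            }
        }
        return sb.ToString();
    }
}
```

Because the output is not rescanned, swapping rules such as a -> b and b -> a turn "ab" into "ba", whereas chained Replace calls would produce "aa".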




string input = "it's worth a lot of money, if you can find a buyer.";
// Each row is { search, replacement }.
string[,] repl = { { "'", "''" }, { "money", "$" }, { "find", "locate" } };
for (int i = 0; i < repl.GetLength(0); i++) {
input = input.Replace(repl[i, 0], repl[i, 1]);
}

Another option using LINQ:

[TestMethod]
public void Test()
{
var input = "it's worth a lot of money, if you can find a buyer.";
var expected = "its worth a lot of money if you can find a buyer";
var removeList = new string[] { ".", ",", "'" };
var result = input;


removeList.ToList().ForEach(o => result = result.Replace(o, string.Empty));


Assert.AreEqual(expected, result);
}

A regular expression with a MatchEvaluator could also be used:

var pattern = new Regex(@"These|words|are|placed|in|parentheses");
var input = "The matching words in this text are being placed inside parentheses.";
var result = pattern.Replace(input, match => $"({match.Value})");

Note:

  • A different expression (such as \b(\w*test\w*)\b) could be used to match whole words.
  • I was hoping the regex engine would find the alternatives and do the replacements more efficiently than repeated Replace calls.
  • The advantage is that each match can be processed while the replacement is being made.
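As a sketch of the first and last notes, the pattern can be built from a word list with word boundaries added, so that "in" no longer matches inside "matching" or "inside". The helper name WordWrapper.Parenthesize is hypothetical.

```csharp
using System.Linq;
using System.Text.RegularExpressions;

public static class WordWrapper
{
    public static string Parenthesize(string input, params string[] words)
    {
        // Escape each word, then require a word boundary on both sides.
        string pattern = @"\b(" + string.Join("|", words.Select(w => Regex.Escape(w))) + @")\b";

        // The evaluator runs once per match and decides the replacement text.
        return Regex.Replace(input, pattern, m => "(" + m.Value + ")");
    }
}
```

Called with the words "words", "are", "placed", "in" and "parentheses" on the sentence above, this should yield "The matching (words) (in) this text (are) being (placed) inside (parentheses)."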

This is essentially Paolo Tedesco's answer, but I wanted to make it re-usable.

public class StringMultipleReplaceHelper
{
private readonly Dictionary<string, string> _replacements;


public StringMultipleReplaceHelper(Dictionary<string, string> replacements)
{
_replacements = replacements;
}


public string clean(string s)
{
foreach (string to_replace in _replacements.Keys)
{
s = s.Replace(to_replace, _replacements[to_replace]);
}
return s;
}
}

One thing to note: I had to stop it being an extension method, remove the static modifiers, and remove the this keyword from clean(this string s). I'm open to suggestions on how to implement this better.