Regex to remove everything that is not a digit or a period

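I need text like "joe ($3,004.50)" to be filtered down to 3004.50, but I'm not good with regular expressions and can't find a suitable solution. Only digits and periods should remain; everything else should be filtered out. I'm using C# and VS.NET 2008 with .NET Framework 3.5.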

The regex is:

[^0-9.]

You can cache the regex:

Regex not_num_period = new Regex("[^0-9.]");

then use:

string result = not_num_period.Replace("joe ($3,004.50)", "");

However, you should keep in mind that some cultures have different conventions for writing monetary amounts, such as: 3.004,50.
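
As a rough illustration of that caveat, the sketch below keeps commas as well as digits and periods, then lets decimal.Parse decide which separator means what. The class name CultureAwareAmount, the Parse helper, and the EuropeanStyle format object are hypothetical names made up for this example; in real code you would more likely pass a CultureInfo for the culture you expect.

```csharp
using System;
using System.Globalization;
using System.Text.RegularExpressions;

public static class CultureAwareAmount
{
    // Assumption for illustration: a hand-built format in which the period
    // groups thousands and the comma marks the decimals (as in "3.004,50").
    public static readonly NumberFormatInfo EuropeanStyle = new NumberFormatInfo
    {
        NumberDecimalSeparator = ",",
        NumberGroupSeparator = "."
    };

    // Strips everything except digits, periods, and commas, then parses the
    // remainder using the separators defined by the supplied format provider.
    public static decimal Parse(string text, IFormatProvider format)
    {
        string cleaned = Regex.Replace(text, "[^0-9.,]", "");
        return decimal.Parse(cleaned, NumberStyles.Number, format);
    }

    public static void Main()
    {
        // Comma groups thousands, period marks decimals.
        decimal us = Parse("joe ($3,004.50)", CultureInfo.InvariantCulture);
        // Period groups thousands, comma marks decimals.
        decimal eu = Parse("joe (3.004,50)", EuropeanStyle);
        Console.WriteLine(us == eu);
    }
}
```

Both calls should yield the same decimal value. Note that this still inherits the weakness of any strip-and-parse approach: a stray period or comma elsewhere in the text makes decimal.Parse throw a FormatException.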

This should do it:

string s = "joe ($3,004.50)";
s = Regex.Replace(s, "[^0-9.]", "");

The approach of removing offending characters is potentially problematic. What if there's another . in the string somewhere? It won't be removed, though it should!

If you remove everything except digits and periods, the string joe.smith ($3,004.50) turns into the unparseable .3004.50.

Imho, it is better to match a specific pattern, and extract it using a group. Something simple would be to find all contiguous commas, digits, and periods with regexp:

[\d,\.]+

Sample test run:

Pattern understood as:
[\d,\.]+
Enter string to check if matches pattern
>  a2.3 fjdfadfj34  34j3424  2,300 adsfa
Group 0 match: "2.3"
Group 0 match: "34"
Group 0 match: "34"
Group 0 match: "3424"
Group 0 match: "2,300"

Then, for each match, remove all commas and send the result to the parser. To handle something like 12.323.344, you could add another check that a matching substring contains at most one period.
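
The steps above could be sketched in C# roughly as follows. This is only one possible reading of the approach: the class name AmountExtractor and the method ExtractAmounts are hypothetical, and the choice to silently skip candidates that fail the period check or do not parse is an assumption, not something the answer specifies.

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;
using System.Text.RegularExpressions;

public static class AmountExtractor
{
    // Hypothetical helper: returns every contiguous run of digits, commas,
    // and periods that parses as a number once the commas are removed.
    public static List<decimal> ExtractAmounts(string text)
    {
        var amounts = new List<decimal>();
        foreach (Match m in Regex.Matches(text, @"[\d,.]+"))
        {
            string candidate = m.Value.Replace(",", "");

            // Skip candidates such as "12.323.344" that contain more than one period.
            if (candidate.IndexOf('.') != candidate.LastIndexOf('.'))
                continue;

            // TryParse rejects leftovers with no digits, such as a lone "."
            // matched inside "joe.smith".
            decimal value;
            if (decimal.TryParse(candidate, NumberStyles.AllowDecimalPoint,
                                 CultureInfo.InvariantCulture, out value))
            {
                amounts.Add(value);
            }
        }
        return amounts;
    }

    public static void Main()
    {
        foreach (decimal amount in ExtractAmounts("joe.smith ($3,004.50)"))
        {
            Console.WriteLine(amount.ToString(CultureInfo.InvariantCulture));
        }
    }
}
```

Parsing with CultureInfo.InvariantCulture keeps the result independent of the machine's regional settings, at the cost of assuming the period is always the decimal separator.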

Regarding the accepted answer, MatthewGunn raises a valid point: every digit and period anywhere in the string gets condensed together. This avoids that:

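string s = "joe.smith ($3,004.50)";
// Capture a run that starts with a digit and is not preceded by a word character, period, or comma.
Regex r = new Regex(@"(?:^|[^\w.,])(\d[\d,.]+)(?=\W|$)");
Match m = r.Match(s);
string v = null;
if (m.Success) {
    v = m.Groups[1].Value;
    // Drop the thousands separators: "3,004.50" becomes "3004.50".
    v = Regex.Replace(v, ",", "");
}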

You are dealing with a string, and a string is an IEnumerable<char>, so you can use LINQ:

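var input = "joe ($3,004.50)";
// This String.Join overload takes an IEnumerable<T> and requires .NET 4 or later;
// on .NET 3.5, use new string(input.Where(...).ToArray()) instead.
var result = String.Join("", input.Where(c => Char.IsDigit(c) || c == '.'));

Console.WriteLine(result);   // 3004.50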