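How do I replace special characters in a string?

I have a string that contains a lot of special characters. I want to remove all of them but keep the alphabetic characters.

How can I do this?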

That depends on what you mean. If you just want to get rid of them, do this:
(Update: apparently you want to keep digits as well; in that case, use the second line of each example.)

String alphaOnly = input.replaceAll("[^a-zA-Z]+","");
String alphaAndDigits = input.replaceAll("[^a-zA-Z0-9]+","");

or the equivalent:

String alphaOnly = input.replaceAll("[^\\p{Alpha}]+","");
String alphaAndDigits = input.replaceAll("[^\\p{Alpha}\\p{Digit}]+","");

(All of these run noticeably faster on repeated calls if you precompile the regex pattern and store it in a constant.)
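The precompiling advice can be sketched as follows. The class and method names (`SpecialCharFilter`, `alphaOnly`, `alphaAndDigits`) are made up for this illustration; only `java.util.regex.Pattern` and its `matcher(...).replaceAll(...)` call come from the JDK.

```java
import java.util.regex.Pattern;

// Hypothetical helper class; the class and method names are illustrative only.
final class SpecialCharFilter {
    // Compiled once and reused; Pattern instances are immutable and thread-safe.
    private static final Pattern NON_ALPHA = Pattern.compile("[^a-zA-Z]+");
    private static final Pattern NON_ALNUM = Pattern.compile("[^a-zA-Z0-9]+");

    private SpecialCharFilter() {}

    // Keeps only ASCII letters.
    static String alphaOnly(String input) {
        return NON_ALPHA.matcher(input).replaceAll("");
    }

    // Keeps only ASCII letters and digits.
    static String alphaAndDigits(String input) {
        return NON_ALNUM.matcher(input).replaceAll("");
    }

    public static void main(String[] args) {
        System.out.println(alphaOnly("a1-b2_c3!"));      // abc
        System.out.println(alphaAndDigits("a1-b2_c3!")); // a1b2c3
    }
}
```

String.replaceAll compiles its pattern on every call, so hoisting the Pattern into a constant mostly pays off when the method runs many times.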

Or, with Guava:

private static final CharMatcher ALNUM =
    CharMatcher.inRange('a', 'z').or(CharMatcher.inRange('A', 'Z'))
        .or(CharMatcher.inRange('0', '9')).precomputed();
// ...
String alphaAndDigits = ALNUM.retainFrom(input);

But if you want to turn accented characters into something sensible that's still ASCII, look at these questions:

You can use basic regular expressions on strings to find all special characters, or use the Pattern and Matcher classes to search, modify, or delete user-defined strings. This link has some simple, easy-to-understand examples of regular expressions: http://www.vogella.de/articles/JavaRegularExpressions/article.html

I am using this.

s = s.replaceAll("\\W", "");

It removes all special characters from the string (underscores are kept, because _ counts as a word character).

Here:

\w : A word character, short for [a-zA-Z_0-9]

\W : A non-word character
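A small sketch of what that means in practice: because `_` belongs to `\w`, it survives `\W` removal. The class and method names below are hypothetical, and the second method is just one possible way to drop underscores too.

```java
// Hypothetical demo class; only String.replaceAll from the JDK is used.
final class WordCharDemo {
    private WordCharDemo() {}

    static String stripNonWord(String s) {
        // \W matches anything outside [a-zA-Z_0-9], so underscores survive.
        return s.replaceAll("\\W", "");
    }

    static String stripNonWordAndUnderscore(String s) {
        // Adding _ to the character class removes underscores as well.
        return s.replaceAll("[\\W_]", "");
    }

    public static void main(String[] args) {
        System.out.println(stripNonWord("foo_bar-baz 42!"));              // foo_barbaz42
        System.out.println(stripNonWordAndUnderscore("foo_bar-baz 42!")); // foobarbaz42
    }
}
```

By default, Java treats \w as ASCII-only, so accented letters are removed by \W as well.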

You can look up the Unicode code point of the junk character with the Character Map tool on a Windows PC and write it with a \u escape, e.g. \u00a9 for the copyright symbol. You can then refer to that particular junk character in your code: instead of removing it, replace it with the proper Unicode character.
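As a rough illustration of the \u escape approach, the sketch below targets the copyright sign by its escape and swaps in a replacement. The class name, method name, and the "(c)" replacement are assumptions made for this example.

```java
// Hypothetical demo class; the "(c)" replacement is an arbitrary choice for this sketch.
final class JunkCharDemo {
    private JunkCharDemo() {}

    static String replaceCopyrightSign(String s) {
        // The escape below denotes the copyright sign (U+00A9).
        // String.replace treats both arguments literally, so no regex escaping is needed.
        return s.replace("\u00a9", "(c)");
    }

    public static void main(String[] args) {
        System.out.println(replaceCopyrightSign("\u00a9 2011 Example Corp")); // (c) 2011 Example Corp
    }
}
```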

You can use the following method to keep alphanumeric characters.

replaceAll("[^a-zA-Z0-9]", "");

And if you want to keep only alphabetical characters use this

replaceAll("[^a-zA-Z]", "");

In C#:

string Output = Regex.Replace(Input, @"[^ a-zA-Z0-9&,_]", "");

Here all the special characters except space, comma, ampersand, and underscore are removed. To remove spaces, commas, and ampersands as well, use the following regular expression:

string Output = Regex.Replace(Input, @"[^a-zA-Z0-9_]", "");

where Input is the string to clean up.

To keep spaces as well, use the pattern "[^a-z A-Z 0-9]".

Replace any special character with:

replaceAll("\\your special character","new character");

For example, to remove every occurrence of *:

replaceAll("\\*","");

Note: this statement handles only one kind of special character at a time.
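If escaping each character by hand gets tedious, one option is to let the JDK do the quoting, or to list several characters in a single character class. This is only a sketch: the class and method names are hypothetical, while `Pattern.quote` and `Matcher.quoteReplacement` are standard `java.util.regex` calls.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class; the names are illustrative only.
final class LiteralReplaceDemo {
    private LiteralReplaceDemo() {}

    // Replaces every literal occurrence of target, whatever regex metacharacters it contains.
    static String replaceLiteral(String input, String target, String replacement) {
        return input.replaceAll(Pattern.quote(target), Matcher.quoteReplacement(replacement));
    }

    // Removes several special characters in one pass; inside [...] the
    // characters *, + and $ are literals and need no backslash.
    static String removeStarsPlusesAndDollars(String input) {
        return input.replaceAll("[*+$]", "");
    }

    public static void main(String[] args) {
        System.out.println(replaceLiteral("a*b*c", "*", " "));      // a b c
        System.out.println(removeStarsPlusesAndDollars("a*b+c$d")); // abcd
    }
}
```

For a plain literal swap, String.replace(CharSequence, CharSequence) also replaces every occurrence without involving regex syntax at all.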

Following the example in Andrzej Doyle's answer, I think a better solution is to use org.apache.commons.lang3.StringUtils.stripAccents():

package bla.bla.utility;

import org.apache.commons.lang3.StringUtils;

public class UriUtility {
    public static String normalizeUri(String s) {
        String r = StringUtils.stripAccents(s);
        r = r.replace(" ", "_");
        r = r.replaceAll("[^\\.A-Za-z0-9_]", "");
        return r;
    }
}
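If pulling in Commons Lang is not an option, a similar effect can be approximated with the JDK's java.text.Normalizer. This is a sketch under that assumption, not a drop-in replacement for StringUtils.stripAccents(); the class and method names are hypothetical.

```java
import java.text.Normalizer;

// Hypothetical JDK-only helper; it approximates accent stripping via Unicode decomposition.
final class AccentStripper {
    private AccentStripper() {}

    static String stripAccents(String s) {
        // NFD splits an accented letter such as e-acute (U+00E9)
        // into a plain e followed by the combining acute accent (U+0301).
        String decomposed = Normalizer.normalize(s, Normalizer.Form.NFD);
        // \p{M} matches combining marks; removing them leaves the base letters.
        return decomposed.replaceAll("\\p{M}+", "");
    }

    public static void main(String[] args) {
        // Unicode escapes keep this source file ASCII-only.
        System.out.println(stripAccents("Cr\u00e8me Br\u00fbl\u00e9e")); // Creme Brulee
    }
}
```

Letters that have no canonical decomposition (for example the Polish ł) pass through unchanged, so a final replaceAll over a whitelist such as [^A-Za-z0-9_] is still needed if the result must be pure ASCII.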

Here is the JavaScript expression I used to remove every special character I could think of from a string:

const cleanedName = name.replace(/[&\/\\#,+()$~%!.„'":*‚^_¤?<>|@ª{«»§}©®™ ]/g, '').toLowerCase();