C# Convert string from UTF-8 to ISO-8859-1 (Latin1)

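I have googled this topic and looked at every answer, but I still have not found a solution.

Basically, I need to convert a UTF-8 string to ISO-8859-1, and I do it with the following code: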

Encoding iso = Encoding.GetEncoding("ISO-8859-1");
Encoding utf8 = Encoding.UTF8;
string msg = iso.GetString(utf8.GetBytes(Message));

My source string is

Message = "ÄäÖöÕõÜü"

Unfortunately, my result string becomes

msg = "�ä�ö�õ�ü"

What am I doing wrong?


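You need to fix the source of the string first.

A string in .NET is really just an array of 16-bit Unicode code units (chars), so a string itself is not in any particular encoding.

Encoding only comes into play when you convert that string into a sequence of bytes.

In any case, as you have seen, encoding a string to a byte array with one character set and then decoding those bytes with another character set does not work.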
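To make the point concrete, here is a minimal sketch of what the code in the question effectively does. The class and method names are made up for illustration; only the `Encoding` calls come from the framework.

```csharp
using System;
using System.Text;

// Hypothetical helper, named for illustration only.
public static class MojibakeDemo
{
    // Encodes the text as UTF-8 bytes, then (incorrectly) decodes
    // those same bytes as ISO-8859-1, as the question's code does.
    public static string EncodeUtf8DecodeLatin1(string text)
    {
        Encoding latin1 = Encoding.GetEncoding("ISO-8859-1");
        byte[] utf8Bytes = Encoding.UTF8.GetBytes(text);
        return latin1.GetString(utf8Bytes);
    }
}
```

Calling `MojibakeDemo.EncodeUtf8DecodeLatin1("ä")` returns the two-character string "Ã¤": the UTF-8 encoding of "ä" is the two bytes C3 A4, and ISO-8859-1 maps each byte to its own character. For the uppercase letters the second byte falls in the 0x80-0x9F range, which ISO-8859-1 maps to invisible control characters, so only the "Ã" is visible.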

Can you tell us more about where the original string comes from, and why you think it has been encoded incorrectly?

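I think your problem is that you assume the bytes that represent the UTF-8 string will yield the same string when interpreted as something else (ISO-8859-1). That is simply not the case. I recommend that you read this excellent article by Joel Spolsky.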
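A quick way to see that the two encodings produce different bytes for the same text is to dump them as hex. This is only a sketch; the `ByteDump` class and `Hex` method are illustrative names, not part of any library.

```csharp
using System;
using System.Text;

// Hypothetical helper, named for illustration only.
public static class ByteDump
{
    // Returns the bytes of the text in the given encoding as hex, e.g. "C3-A4".
    public static string Hex(string text, Encoding encoding)
    {
        return BitConverter.ToString(encoding.GetBytes(text));
    }
}
```

For example, `ByteDump.Hex("ä", Encoding.UTF8)` gives "C3-A4", while `ByteDump.Hex("ä", Encoding.GetEncoding("ISO-8859-1"))` gives "E4". For plain ASCII text both encodings produce identical bytes, which is why the problem only shows up with accented characters.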

Try this:

Encoding iso = Encoding.GetEncoding("ISO-8859-1");
Encoding utf8 = Encoding.UTF8;
byte[] utfBytes = utf8.GetBytes(Message);
byte[] isoBytes = Encoding.Convert(utf8,iso,utfBytes);
string msg = iso.GetString(isoBytes);

Use Encoding.Convert to adjust the byte array before attempting to decode it into your destination encoding.

Encoding iso = Encoding.GetEncoding("ISO-8859-1");
Encoding utf8 = Encoding.UTF8;
byte[] utfBytes = utf8.GetBytes(Message);
byte[] isoBytes = Encoding.Convert(utf8, iso, utfBytes);
string msg = iso.GetString(isoBytes);
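It may help to see what this round trip does to the string itself. The sketch below wraps the same calls in a method (the class and method names are assumptions made for illustration). For text that ISO-8859-1 can represent, the result equals the input, so the useful product of the conversion is the `isoBytes` array rather than the final string.

```csharp
using System;
using System.Text;

// Hypothetical helper, named for illustration only.
public static class ConvertDemo
{
    // Converts UTF-8 bytes to ISO-8859-1 bytes with Encoding.Convert,
    // then decodes the result with the matching ISO-8859-1 decoder.
    public static string ViaLatin1(string text)
    {
        Encoding latin1 = Encoding.GetEncoding("ISO-8859-1");
        byte[] utf8Bytes = Encoding.UTF8.GetBytes(text);
        byte[] latin1Bytes = Encoding.Convert(Encoding.UTF8, latin1, utf8Bytes);
        return latin1.GetString(latin1Bytes);
    }
}
```

`ConvertDemo.ViaLatin1("ÄäÖöÕõÜü")` returns the input unchanged, because every character exists in ISO-8859-1. A character outside that set, such as "中", cannot survive the conversion and comes back as a substitute (typically "?").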

That code looks a bit odd. To get a string from a UTF-8 byte stream, all you need to do is:

string str = Encoding.UTF8.GetString(utf8ByteArray);

If you need to save an ISO-8859-1 byte stream somewhere, just add one more line after the previous one:

byte[] iso88591data = Encoding.GetEncoding("ISO-8859-1").GetBytes(str);
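If the bytes are headed for a stream rather than an array, a `StreamWriter` constructed with the target encoding does the same conversion while writing. The following is a sketch under that assumption; `Latin1Writer` and `ToLatin1Bytes` are illustrative names, and a `MemoryStream` stands in for whatever destination you actually use.

```csharp
using System;
using System.IO;
using System.Text;

// Hypothetical helper, named for illustration only.
public static class Latin1Writer
{
    // Writes the text through a StreamWriter that encodes as ISO-8859-1
    // and returns the bytes that ended up in the stream.
    public static byte[] ToLatin1Bytes(string text)
    {
        Encoding latin1 = Encoding.GetEncoding("ISO-8859-1");
        using (var stream = new MemoryStream())
        {
            using (var writer = new StreamWriter(stream, latin1))
            {
                writer.Write(text);
            }
            // ToArray still works after the writer has closed the stream.
            return stream.ToArray();
        }
    }
}
```

For a file on disk, `File.WriteAllText(path, str, Encoding.GetEncoding("ISO-8859-1"))` is a shorter route to the same result.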

I just used Nathan's solution and it works fine. I needed to convert ISO-8859-1 to Unicode:

string isocontent = Encoding.GetEncoding("ISO-8859-1").GetString(fileContent, 0, fileContent.Length);
byte[] isobytes = Encoding.GetEncoding("ISO-8859-1").GetBytes(isocontent);
byte[] ubytes = Encoding.Convert(Encoding.GetEncoding("ISO-8859-1"), Encoding.Unicode, isobytes);
return Encoding.Unicode.GetString(ubytes, 0, ubytes.Length);

// Code page 1252 is Windows-1252, which differs from ISO-8859-1 in the 0x80-0x9F range.
Encoding targetEncoding = Encoding.GetEncoding(1252);
// Encode a string into an array of bytes.
byte[] encodedBytes = targetEncoding.GetBytes(utfString);
// Show the encoded byte values.
Console.WriteLine("Encoded bytes: " + BitConverter.ToString(encodedBytes));
// Decode the byte array back to a string with the same encoding that produced it.
string decodedString = targetEncoding.GetString(encodedBytes);
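By default, characters the target encoding cannot represent are silently replaced during `GetBytes`. If you would rather detect that case, one option is to request the encoding with exception fallbacks. This is a sketch, not part of the answer above; `StrictLatin1` and `TryEncode` are illustrative names, and it uses ISO-8859-1 rather than code page 1252 to stay on the topic of the question.

```csharp
using System;
using System.Text;

// Hypothetical helper, named for illustration only.
public static class StrictLatin1
{
    // Tries to encode the text as ISO-8859-1. Returns false instead of
    // substituting characters when the text cannot be represented.
    public static bool TryEncode(string text, out byte[] bytes)
    {
        Encoding strict = Encoding.GetEncoding(
            "ISO-8859-1",
            EncoderFallback.ExceptionFallback,
            DecoderFallback.ExceptionFallback);
        try
        {
            bytes = strict.GetBytes(text);
            return true;
        }
        catch (EncoderFallbackException)
        {
            bytes = new byte[0];
            return false;
        }
    }
}
```

With this helper, "ÄäÖö" encodes successfully to four bytes, while a string containing "中" makes `TryEncode` return false.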

Here is a sample for ISO-8859-9:

protected void btnKaydet_Click(object sender, EventArgs e)
{
    Response.Clear();
    Response.Buffer = true;
    Response.ContentType = "application/vnd.openxmlformats-officedocument.wordprocessingml.document";
    Response.AddHeader("Content-Disposition", "attachment; filename=XXXX.doc");
    Response.ContentEncoding = Encoding.GetEncoding("ISO-8859-9");
    Response.Charset = "ISO-8859-9";
    EnableViewState = false;

    // Render the form to an HTML string.
    StringWriter writer = new StringWriter();
    HtmlTextWriter html = new HtmlTextWriter(writer);
    form1.RenderControl(html);

    // Encode the rendered HTML as ISO-8859-9 bytes for the attachment.
    byte[] bytesInStream = Encoding.GetEncoding("iso-8859-9").GetBytes(writer.ToString());
    MemoryStream memoryStream = new MemoryStream(bytesInStream);

    // Send the document as an e-mail attachment.
    string msgBody = "";
    string Email = "mail@xxxxxx.org";
    SmtpClient client = new SmtpClient("mail.xxxxx.org");
    MailMessage message = new MailMessage(Email, "mail@someone.com", "ONLINE APP FORM WITH WORD DOC", msgBody);
    Attachment att = new Attachment(memoryStream, "XXXX.doc", "application/vnd.openxmlformats-officedocument.wordprocessingml.document");
    message.Attachments.Add(att);
    message.BodyEncoding = System.Text.Encoding.UTF8;
    message.IsBodyHtml = true;
    client.Send(message);
}

Maybe this helps. Convert a string from one code page to another:

public static string fnStringConverterCodepage(string sText, string sCodepageIn = "ISO-8859-8", string sCodepageOut = "ISO-8859-8")
{
    string sResultado = string.Empty;
    try
    {
        // Encode the text to bytes with the input code page...
        byte[] tempBytes = System.Text.Encoding.GetEncoding(sCodepageIn).GetBytes(sText);
        // ...then decode those bytes with the output code page.
        sResultado = System.Text.Encoding.GetEncoding(sCodepageOut).GetString(tempBytes);
    }
    catch (Exception)
    {
        // Return an empty string on any failure, such as an unknown code page name.
        sResultado = "";
    }
    return sResultado;
}

Usage:

string sMsg = "ERRO: NÃ£o foi possivel acessar o servico de AutenticaÃ§Ã£o";
var sOut = fnStringConverterCodepage(sMsg, "ISO-8859-1", "UTF-8");

Output:

"ERRO: Não foi possivel acessar o servico de Autenticação"
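Re-encoding as ISO-8859-1 and decoding as UTF-8 only makes sense when the string really is UTF-8 text that was previously decoded as ISO-8859-1. Applied to a string that is already correct, it damages the accented characters. The sketch below adds a guard for that case. It is a heuristic under stated assumptions, not a general fix: `MojibakeRepair` and `RepairIfPossible` are illustrative names, and a correct string that happens to look like valid UTF-8 byte pairs would still be altered.

```csharp
using System;
using System.Text;

// Hypothetical helper, named for illustration only.
public static class MojibakeRepair
{
    // Re-encodes the text as ISO-8859-1 and decodes the bytes as strict UTF-8.
    // If either step fails, the text was not UTF-8 misread as ISO-8859-1,
    // so the original string is returned unchanged.
    public static string RepairIfPossible(string text)
    {
        Encoding strictLatin1 = Encoding.GetEncoding(
            "ISO-8859-1",
            EncoderFallback.ExceptionFallback,
            DecoderFallback.ExceptionFallback);
        // Second argument: throw on invalid byte sequences instead of substituting.
        Encoding strictUtf8 = new UTF8Encoding(false, true);
        try
        {
            byte[] rawBytes = strictLatin1.GetBytes(text);
            return strictUtf8.GetString(rawBytes);
        }
        catch (EncoderFallbackException)
        {
            return text; // contains characters outside ISO-8859-1
        }
        catch (DecoderFallbackException)
        {
            return text; // bytes are not valid UTF-8, so leave the text alone
        }
    }
}
```

`MojibakeRepair.RepairIfPossible("NÃ£o")` returns "Não", while an already correct "Não" is returned as is, because the single byte E3 followed by "o" is not a valid UTF-8 sequence.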