Format an XML string to print a friendly XML string

I have an XML string like this:

<?xml version='1.0'?><response><error code='1'> Success</error></response>

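There are no line breaks between one element and the next, so it is hard to read. I want a function that formats the string above like this: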

<?xml version='1.0'?>
<response>
<error code='1'> Success</error>
</response>

Without having to write the format function by hand, is there any .NET library or code snippet I can use right away?

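If you load it into an XMLDoc, I'm pretty sure the .ToString() function has an overload for this.

But is this for debugging? The reason it is sent like that is to take up less space (i.e. unnecessary whitespace is stripped from the XML).

Check the following link: How to pretty-print XML (unfortunately, the link now returns 404 :( )

The method in the link takes an XML string as an argument and returns a well-formed (indented) XML string.

I just copied the sample code from the link to make this answer more comprehensive and convenient.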

public static String PrettyPrint(String XML)
{
    String Result = "";

    MemoryStream MS = new MemoryStream();
    XmlTextWriter W = new XmlTextWriter(MS, Encoding.Unicode);
    XmlDocument D   = new XmlDocument();

    try
    {
        // Load the XmlDocument with the XML.
        D.LoadXml(XML);

        W.Formatting = Formatting.Indented;

        // Write the XML into a formatting XmlTextWriter
        D.WriteContentTo(W);
        W.Flush();
        MS.Flush();

        // Have to rewind the MemoryStream in order to read
        // its contents.
        MS.Position = 0;

        // Read MemoryStream contents into a StreamReader.
        StreamReader SR = new StreamReader(MS);

        // Extract the text from the StreamReader.
        String FormattedXML = SR.ReadToEnd();

        Result = FormattedXML;
    }
    catch (XmlException)
    {
    }

    MS.Close();
    W.Close();

    return Result;
}

Use XmlTextWriter...

public static string PrintXML(string xml)
{
    string result = "";

    MemoryStream mStream = new MemoryStream();
    XmlTextWriter writer = new XmlTextWriter(mStream, Encoding.Unicode);
    XmlDocument document = new XmlDocument();

    try
    {
        // Load the XmlDocument with the XML.
        document.LoadXml(xml);

        writer.Formatting = Formatting.Indented;

        // Write the XML into a formatting XmlTextWriter
        document.WriteContentTo(writer);
        writer.Flush();
        mStream.Flush();

        // Have to rewind the MemoryStream in order to read
        // its contents.
        mStream.Position = 0;

        // Read MemoryStream contents into a StreamReader.
        StreamReader sReader = new StreamReader(mStream);

        // Extract the text from the StreamReader.
        string formattedXml = sReader.ReadToEnd();

        result = formattedXml;
    }
    catch (XmlException)
    {
        // Handle the exception
    }

    mStream.Close();
    writer.Close();

    return result;
}

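You will have to parse the content somehow. I find using LINQ the easiest way to do it. Again, it all depends on your exact scenario. Here is a working example that uses LINQ to format an input XML string.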

string FormatXml(string xml)
{
    try
    {
        XDocument doc = XDocument.Parse(xml);
        return doc.ToString();
    }
    catch (Exception)
    {
        // Handle and throw if fatal exception here; don't just ignore them
        return xml;
    }
}

[using directives are omitted for brevity]
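One thing to watch for with the LINQ approach: XDocument.ToString() leaves out the XML declaration. Below is a minimal sketch of one way to keep it, assuming you want the declaration on its own first line. The class name XmlFormatSketch and the method name FormatXmlKeepDeclaration are made up for illustration and are not part of any library.

```csharp
using System;
using System.Xml.Linq;

public static class XmlFormatSketch
{
    // Hypothetical helper: indents the XML and keeps the declaration
    // when the input had one.
    public static string FormatXmlKeepDeclaration(string xml)
    {
        // Throws XmlException if the input is not well-formed.
        XDocument doc = XDocument.Parse(xml);

        // XDocument.ToString() indents the elements but omits the
        // declaration, so prepend it manually when it is present.
        return doc.Declaration == null
            ? doc.ToString()
            : doc.Declaration + Environment.NewLine + doc.ToString();
    }
}
```

Calling XmlFormatSketch.FormatXmlKeepDeclaration on the string from the question should return the declaration followed by the indented response element. Note that the quotes in the declaration and on attributes come back as double quotes.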

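This one, from kristopherjohnson, is a good deal better:

  1. It doesn't require an XML document header either.
  2. It has clearer exceptions.
  3. It adds extra behavior options: OmitXmlDeclaration = true, NewLineOnAttributes = true
  4. It takes fewer lines of code.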

    static string PrettyXml(string xml)
    {
        var stringBuilder = new StringBuilder();

        var element = XElement.Parse(xml);

        var settings = new XmlWriterSettings();
        settings.OmitXmlDeclaration = true;
        settings.Indent = true;
        settings.NewLineOnAttributes = true;

        using (var xmlWriter = XmlWriter.Create(stringBuilder, settings))
        {
            element.Save(xmlWriter);
        }

        return stringBuilder.ToString();
    }
    

A .NET 2.0 version that ignores name resolving, with proper resource disposal, indentation, whitespace preservation and a custom encoding:

public static string Beautify(System.Xml.XmlDocument doc)
{
    string strRetValue = null;
    System.Text.Encoding enc = System.Text.Encoding.UTF8;
    // enc = new System.Text.UTF8Encoding(false);

    System.Xml.XmlWriterSettings xmlWriterSettings = new System.Xml.XmlWriterSettings();
    xmlWriterSettings.Encoding = enc;
    xmlWriterSettings.Indent = true;
    xmlWriterSettings.IndentChars = "    ";
    xmlWriterSettings.NewLineChars = "\r\n";
    xmlWriterSettings.NewLineHandling = System.Xml.NewLineHandling.Replace;
    //xmlWriterSettings.OmitXmlDeclaration = true;
    xmlWriterSettings.ConformanceLevel = System.Xml.ConformanceLevel.Document;

    using (System.IO.MemoryStream ms = new System.IO.MemoryStream())
    {
        using (System.Xml.XmlWriter writer = System.Xml.XmlWriter.Create(ms, xmlWriterSettings))
        {
            doc.Save(writer);
            writer.Flush();
            ms.Flush();

            writer.Close();
        } // End Using writer

        ms.Position = 0;
        using (System.IO.StreamReader sr = new System.IO.StreamReader(ms, enc))
        {
            // Extract the text from the StreamReader.
            strRetValue = sr.ReadToEnd();

            sr.Close();
        } // End Using sr

        ms.Close();
    } // End Using ms

    /*
    System.Text.StringBuilder sb = new System.Text.StringBuilder(); // Always yields UTF-16, no matter the set encoding
    using (System.Xml.XmlWriter writer = System.Xml.XmlWriter.Create(sb, xmlWriterSettings))
    {
        doc.Save(writer);
        writer.Close();
    } // End Using writer
    strRetValue = sb.ToString();
    sb.Length = 0;
    sb = null;
    */

    xmlWriterSettings = null;
    return strRetValue;
} // End Function Beautify

Usage:

System.Xml.XmlDocument xmlDoc = new System.Xml.XmlDocument();
xmlDoc.XmlResolver = null;
xmlDoc.PreserveWhitespace = true;
xmlDoc.Load(@"C:\Test.svg"); // verbatim string: "\T" is not a valid escape sequence
string SVG = Beautify(xmlDoc);

I tried:

internal static void IndentedNewWSDLString(string filePath)
{
    var xml = File.ReadAllText(filePath);
    XDocument doc = XDocument.Parse(xml);
    File.WriteAllText(filePath, doc.ToString());
}

It works as expected.

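Customizable pretty XML output with a UTF-8 XML declaration

The following class definition gives a simple method to convert an input XML string into formatted output XML with a UTF-8 declaration. It supports all the configuration options that the XmlWriterSettings class offers.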

using System;
using System.Text;
using System.Xml;
using System.IO;

namespace CJBS.Demo
{
    /// <summary>
    /// Supports formatting for XML in a format that is easily human-readable.
    /// </summary>
    public static class PrettyXmlFormatter
    {
        /// <summary>
        /// Generates formatted UTF-8 XML for the content in the <paramref name="doc"/>
        /// </summary>
        /// <param name="doc">XmlDocument for which content will be returned as a formatted string</param>
        /// <returns>Formatted (indented) XML string</returns>
        public static string GetPrettyXml(XmlDocument doc)
        {
            // Configure how XML is to be formatted
            XmlWriterSettings settings = new XmlWriterSettings
            {
                Indent = true
                , IndentChars = "  "
                , NewLineChars = System.Environment.NewLine
                , NewLineHandling = NewLineHandling.Replace
                //,NewLineOnAttributes = true
                //,OmitXmlDeclaration = false
            };

            // Use wrapper class that supports UTF-8 encoding
            StringWriterWithEncoding sw = new StringWriterWithEncoding(Encoding.UTF8);

            // Output formatted XML to StringWriter
            using (XmlWriter writer = XmlWriter.Create(sw, settings))
            {
                doc.Save(writer);
            }

            // Get formatted text from writer
            return sw.ToString();
        }

        /// <summary>
        /// Wrapper class around <see cref="StringWriter"/> that supports encoding.
        /// Attribution: http://stackoverflow.com/a/427737/3063884
        /// </summary>
        private sealed class StringWriterWithEncoding : StringWriter
        {
            private readonly Encoding encoding;

            /// <summary>
            /// Creates a new <see cref="StringWriterWithEncoding"/> with the specified encoding
            /// </summary>
            /// <param name="encoding"></param>
            public StringWriterWithEncoding(Encoding encoding)
            {
                this.encoding = encoding;
            }

            /// <summary>
            /// Encoding to use when dealing with text
            /// </summary>
            public override Encoding Encoding
            {
                get { return encoding; }
            }
        }
    }
}

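Possibilities for further improvement:

  • An additional method GetPrettyXml(XmlDocument doc, XmlWriterSettings settings) could be created that allows the caller to customize the output.
  • An additional method GetPrettyXml(String rawXml) could be added that supports parsing raw text, rather than having the client use the XmlDocument. In my case, I needed to manipulate the XML using the XmlDocument, hence I didn't add this.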
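The two suggested overloads could look roughly like the sketch below. It is only an illustration: the class name PrettyXmlSketch and the nested Utf8StringWriter are hypothetical stand-ins for the answer's PrettyXmlFormatter and StringWriterWithEncoding, and the default settings in the string overload are an assumption.

```csharp
using System.IO;
using System.Text;
using System.Xml;

public static class PrettyXmlSketch
{
    // Overload that lets the caller decide how the output is formatted.
    public static string GetPrettyXml(XmlDocument doc, XmlWriterSettings settings)
    {
        var sw = new Utf8StringWriter();
        using (XmlWriter writer = XmlWriter.Create(sw, settings))
        {
            doc.Save(writer);
        }
        return sw.ToString();
    }

    // Overload that accepts raw text, so the caller needs no XmlDocument.
    // Throws XmlException if rawXml is not well-formed.
    public static string GetPrettyXml(string rawXml)
    {
        var doc = new XmlDocument();
        doc.LoadXml(rawXml);
        // Assumed defaults: two-space indentation.
        var settings = new XmlWriterSettings { Indent = true, IndentChars = "  " };
        return GetPrettyXml(doc, settings);
    }

    // A StringWriter that reports UTF-8, so the declaration written by
    // the XmlWriter says encoding="utf-8" instead of "utf-16".
    private sealed class Utf8StringWriter : StringWriter
    {
        public override Encoding Encoding => Encoding.UTF8;
    }
}
```

With this in place, PrettyXmlSketch.GetPrettyXml(myRawXmlString) would cover the simple case, while callers who need NewLineOnAttributes or OmitXmlDeclaration can pass their own XmlWriterSettings.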

Usage:

String myFormattedXml = null;
XmlDocument doc = new XmlDocument();
try
{
    doc.LoadXml(myRawXmlString);
    myFormattedXml = PrettyXmlFormatter.GetPrettyXml(doc);
}
catch(XmlException ex)
{
    // Failed to parse XML -- use original XML as formatted XML
    myFormattedXml = myRawXmlString;
}

The simplest solution that worked for me:

XmlDocument xmlDoc = new XmlDocument();
StringWriter sw = new StringWriter();
xmlDoc.LoadXml(rawStringXML);
xmlDoc.Save(sw);
String formattedXml = sw.ToString();

Check the following link: Format an XML file so it looks nice in C#

// Format the XML text.
StringWriter string_writer = new StringWriter();
XmlTextWriter xml_text_writer = new XmlTextWriter(string_writer);
xml_text_writer.Formatting = Formatting.Indented;
xml_document.WriteTo(xml_text_writer);

// Display the result.
txtResult.Text = string_writer.ToString();

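It is possible to pretty-print an XML string via a streaming transformation with XmlWriter.WriteNode(XmlReader, true). This method

copies everything from the reader to the writer and moves the reader to the start of the next sibling.

Define the following extension methods: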

public static class XmlExtensions
{
    public static string FormatXml(this string xml, bool indent = true, bool newLineOnAttributes = false, string indentChars = "  ", ConformanceLevel conformanceLevel = ConformanceLevel.Document) =>
        xml.FormatXml( new XmlWriterSettings { Indent = indent, NewLineOnAttributes = newLineOnAttributes, IndentChars = indentChars, ConformanceLevel = conformanceLevel });

    public static string FormatXml(this string xml, XmlWriterSettings settings)
    {
        using (var textReader = new StringReader(xml))
        using (var xmlReader = XmlReader.Create(textReader, new XmlReaderSettings { ConformanceLevel = settings.ConformanceLevel } ))
        using (var textWriter = new StringWriter())
        {
            using (var xmlWriter = XmlWriter.Create(textWriter, settings))
                xmlWriter.WriteNode(xmlReader, true);
            return textWriter.ToString();
        }
    }
}

And now you will be able to do:

var inXml = @"<?xml version='1.0'?><response><error code='1'> Success</error></response>";
var newXml = inXml.FormatXml(indentChars : "", newLineOnAttributes : false); // Or true, if you prefer
Console.WriteLine(newXml);

Which prints

<?xml version='1.0'?>
<response>
<error code="1"> Success</error>
</response>

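Notes:

  • Other answers load the XML into some Document Object Model such as XmlDocument or XDocument/XElement, and then re-serialize the DOM with indentation enabled.

    This streaming solution completely avoids the added memory overhead of a DOM.

  • In your question you do not add any indentation for the nested <error code='1'> Success</error> node, so I set indentChars : "". Generally, an indentation of two spaces per level of nesting is customary.

  • Attribute delimiters are unconditionally converted to double quotes if they are currently single quotes. (I believe this is true of the other answers as well.)

  • Passing conformanceLevel : ConformanceLevel.Fragment allows strings containing sequences of XML fragments to be formatted.

  • Other than with ConformanceLevel.Fragment, the input XML string must be well-formed. If it is not, XmlReader will throw an exception.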
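To make the ConformanceLevel.Fragment note concrete, here is a small self-contained sketch of the same streaming technique applied to a string with more than one top-level element. The class name FragmentFormatSketch and the method name FormatFragments are invented for this example, and the two-space default indentation is an assumption.

```csharp
using System.IO;
using System.Xml;

public static class FragmentFormatSketch
{
    // Hypothetical helper: indents a sequence of XML fragments
    // without building a DOM.
    public static string FormatFragments(string xml)
    {
        var readerSettings = new XmlReaderSettings
        {
            ConformanceLevel = ConformanceLevel.Fragment
        };
        var writerSettings = new XmlWriterSettings
        {
            Indent = true,
            ConformanceLevel = ConformanceLevel.Fragment
        };

        using (var textReader = new StringReader(xml))
        using (var xmlReader = XmlReader.Create(textReader, readerSettings))
        using (var textWriter = new StringWriter())
        {
            using (var xmlWriter = XmlWriter.Create(textWriter, writerSettings))
            {
                // The reader has not been advanced yet, so WriteNode
                // copies every top-level node until the input ends.
                xmlWriter.WriteNode(xmlReader, true);
            }
            return textWriter.ToString();
        }
    }
}
```

For example, FragmentFormatSketch.FormatFragments("<a><b/></a><c>1</c>") should format both top-level elements, whereas the same input read with ConformanceLevel.Document would make XmlReader throw because of the second root element.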

Demo fiddle here.

Hi, why don't you just try this:

XmlDocument xmlDoc = new XmlDocument();
xmlDoc.PreserveWhitespace = false;
....
....
xmlDoc.Save(fileName);

PreserveWhitespace = false; this option can also be used for XML beautification.
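To see what the flag changes, here is a minimal sketch that saves to a string instead of a file so the two settings can be compared side by side. The class name PreserveWhitespaceSketch and the method name SaveToString are made up for illustration. As far as I know, XmlDocument.Save to a TextWriter indents the output only when PreserveWhitespace is false.

```csharp
using System.IO;
using System.Xml;

public static class PreserveWhitespaceSketch
{
    // Hypothetical helper: loads rawXml and saves it back to a string,
    // with PreserveWhitespace set as requested.
    public static string SaveToString(string rawXml, bool preserveWhitespace)
    {
        var xmlDoc = new XmlDocument { PreserveWhitespace = preserveWhitespace };
        xmlDoc.LoadXml(rawXml);

        using (var sw = new StringWriter())
        {
            // With PreserveWhitespace = false the document is written
            // indented; with true it is written as it was loaded.
            xmlDoc.Save(sw);
            return sw.ToString();
        }
    }
}
```

Passing false gives the beautified form, while passing true keeps the elements on one line as in the input. In both cases the saved string carries an XML declaration whose encoding reflects the StringWriter (UTF-16).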