What is the simplest way to get indented XML with line breaks from an XmlDocument?

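When I build XML from scratch with XmlDocument, the OuterXml property already has everything nicely indented with line breaks. However, if I call LoadXml on some very "compressed" XML (no line breaks or indentation), the output of OuterXml stays that way. So...

What is the simplest way to get beautified XML output from an instance of XmlDocument?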


Adapted from Erika Ehrli's blog, this should do it:

XmlDocument doc = new XmlDocument();
doc.LoadXml("<item><name>wrench</name></item>");
// Save the document to a file and auto-indent the output.
using (XmlTextWriter writer = new XmlTextWriter("data.xml", null))
{
    writer.Formatting = Formatting.Indented;
    doc.Save(writer);
}

// The same setting also works when wrapping an existing TextWriter
// (here, writer stands for a TextWriter you already have).
XmlTextWriter xw = new XmlTextWriter(writer);
xw.Formatting = Formatting.Indented;

Based on the other answers, I looked into XmlTextWriter and came up with the following helper method:

public static string Beautify(this XmlDocument doc)
{
    StringBuilder sb = new StringBuilder();
    XmlWriterSettings settings = new XmlWriterSettings
    {
        Indent = true,
        IndentChars = "  ",
        NewLineChars = "\r\n",
        NewLineHandling = NewLineHandling.Replace
    };
    using (XmlWriter writer = XmlWriter.Create(sb, settings))
    {
        doc.Save(writer);
    }
    return sb.ToString();
}

It's a bit more code than I expected, but it works just fine.

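If the Beautify method above is called for an XmlDocument that already contains an XmlProcessingInstruction child node, the following exception is thrown:

Cannot write XML declaration. WriteStartDocument method has already written it.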

This is my modified version of the original one to get rid of the exception:

private static string beautify(XmlDocument doc)
{
    var sb = new StringBuilder();
    var settings = new XmlWriterSettings
    {
        Indent = true,
        IndentChars = @"    ",
        NewLineChars = Environment.NewLine,
        NewLineHandling = NewLineHandling.Replace,
    };

    using (var writer = XmlWriter.Create(sb, settings))
    {
        // Note: this removes the node from the caller's document.
        if (doc.ChildNodes[0] is XmlProcessingInstruction)
        {
            doc.RemoveChild(doc.ChildNodes[0]);
        }

        doc.Save(writer);
    }

    // Read the result only after the writer has been disposed (and flushed).
    return sb.ToString();
}

It works for me now; you may need to scan all child nodes for XmlProcessingInstruction nodes, not just the first one.
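
The "scan all child nodes" idea could look roughly like the sketch below. It is not from the original answer: the class and method names are made up for illustration, and it assumes the offending nodes are top-level processing instructions whose target is "xml". It also works on a clone so that the caller's document is not modified.

```csharp
using System;
using System.Collections.Generic;
using System.Text;
using System.Xml;

public static class XmlBeautifier
{
    // Hypothetical helper: indents a copy of the document after dropping every
    // top-level processing instruction whose target is "xml".
    public static string BeautifyWithoutXmlPi(XmlDocument doc)
    {
        // Deep-clone so the caller's document is left untouched.
        var copy = (XmlDocument)doc.CloneNode(true);

        // Collect first, then remove: do not modify ChildNodes while enumerating it.
        var toRemove = new List<XmlNode>();
        foreach (XmlNode node in copy.ChildNodes)
        {
            if (node is XmlProcessingInstruction pi &&
                string.Equals(pi.Target, "xml", StringComparison.OrdinalIgnoreCase))
            {
                toRemove.Add(node);
            }
        }
        foreach (XmlNode node in toRemove)
        {
            copy.RemoveChild(node);
        }

        var sb = new StringBuilder();
        var settings = new XmlWriterSettings { Indent = true };
        using (var writer = XmlWriter.Create(sb, settings))
        {
            copy.Save(writer);
        }
        return sb.ToString();
    }
}
```

Other processing instructions (for example an xml-stylesheet instruction) are kept, since only the "xml" target collides with the declaration the writer emits itself.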


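Update April 2015:

Since I ran into another case where the encoding was wrong, I searched for how to enforce UTF-8 without a BOM. I found this blog post and created a function based on it: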

private static string beautify(string xml)
{
    var doc = new XmlDocument();
    doc.LoadXml(xml);

    var settings = new XmlWriterSettings
    {
        Indent = true,
        IndentChars = "\t",
        NewLineChars = Environment.NewLine,
        NewLineHandling = NewLineHandling.Replace,
        Encoding = new UTF8Encoding(false)
    };

    using (var ms = new MemoryStream())
    using (var writer = XmlWriter.Create(ms, settings))
    {
        doc.Save(writer);
        var xmlString = Encoding.UTF8.GetString(ms.ToArray());
        return xmlString;
    }
}

A shorter extension method version:

public static string ToIndentedString(this XmlDocument doc)
{
    var stringWriter = new StringWriter(new StringBuilder());
    var xmlTextWriter = new XmlTextWriter(stringWriter) { Formatting = Formatting.Indented };
    doc.Save(xmlTextWriter);
    return stringWriter.ToString();
}

A simple way is to use:

writer.WriteRaw(space_char);

As in the sample code below, which is what I used to create a tree-view-like structure with XmlWriter:

private void generateXML(string filename)
{
    using (XmlWriter writer = XmlWriter.Create(filename))
    {
        writer.WriteStartDocument();
        // new line
        writer.WriteRaw("\n");
        writer.WriteStartElement("treeitems");
        // new line
        writer.WriteRaw("\n");
        foreach (RootItem root in roots)
        {
            // indent
            writer.WriteRaw("\t");
            writer.WriteStartElement("treeitem");
            writer.WriteAttributeString("name", root.name);
            writer.WriteAttributeString("uri", root.uri);
            writer.WriteAttributeString("fontsize", root.fontsize);
            writer.WriteAttributeString("icon", root.icon);
            if (root.children.Count != 0)
            {
                // iterate over this root's own children
                foreach (ChildItem child in root.children)
                {
                    // indent
                    writer.WriteRaw("\t");
                    writer.WriteStartElement("treeitem");
                    writer.WriteAttributeString("name", child.name);
                    writer.WriteAttributeString("uri", child.uri);
                    writer.WriteAttributeString("fontsize", child.fontsize);
                    writer.WriteAttributeString("icon", child.icon);
                    writer.WriteEndElement();
                    // new line
                    writer.WriteRaw("\n");
                }
            }
            writer.WriteEndElement();
            // new line
            writer.WriteRaw("\n");
        }

        writer.WriteEndElement();
        writer.WriteEndDocument();
    }
}

This way you can add tabs or line breaks the way you normally would, i.e. with \t or \n.

Or even easier, if you have access to LINQ:

try
{
    RequestPane.Text = System.Xml.Linq.XElement.Parse(RequestPane.Text).ToString();
}
catch (System.Xml.XmlException xex)
{
    displayException("Problem with formatting text in Request Pane: ", xex);
}

When implementing the suggestions posted here, I had trouble with the text encoding. It seems the encoding of the XmlWriterSettings is ignored, and always overridden by the encoding of the stream. When using a StringBuilder, this is always the text encoding used internally in C#, namely UTF-16.

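So here is a version that supports other encodings as well.

Important note: the formatting is completely ignored if the XmlDocument object has its PreserveWhitespace property enabled when the document is loaded. This had me stumped for a while, so make sure not to enable it.

My final code: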

public static void SaveFormattedXml(XmlDocument doc, String outputPath, Encoding encoding)
{
    XmlWriterSettings settings = new XmlWriterSettings();
    settings.Indent = true;
    settings.IndentChars = "\t";
    settings.NewLineChars = "\r\n";
    settings.NewLineHandling = NewLineHandling.Replace;

    using (MemoryStream memstream = new MemoryStream())
    using (StreamWriter sr = new StreamWriter(memstream, encoding))
    using (XmlWriter writer = XmlWriter.Create(sr, settings))
    using (FileStream fileWriter = new FileStream(outputPath, FileMode.Create))
    {
        if (doc.ChildNodes.Count > 0 && doc.ChildNodes[0] is XmlProcessingInstruction)
            doc.RemoveChild(doc.ChildNodes[0]);
        // save xml to XmlWriter made on encoding-specified text writer
        doc.Save(writer);
        // Flush the streams (not sure if this is really needed for pure mem operations)
        writer.Flush();
        // Write the underlying stream of the XmlWriter to file.
        fileWriter.Write(memstream.GetBuffer(), 0, (Int32)memstream.Length);
    }
}

This saves the formatted XML to disk using the given text encoding.
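
If the result is needed as a string rather than a file, one common workaround for the UTF-16 declaration is a StringWriter subclass that reports a different encoding. The sketch below is not part of the answer above; EncodingStringWriter and XmlStringFormatter are illustrative names. Note that it only changes the encoding named in the XML declaration: the returned .NET string is still UTF-16 in memory, so the declaration should match the encoding the string is eventually written out with.

```csharp
using System.IO;
using System.Text;
using System.Xml;

// StringWriter reports UTF-16 by default; this subclass reports a caller-chosen encoding.
public sealed class EncodingStringWriter : StringWriter
{
    private readonly Encoding _encoding;

    public EncodingStringWriter(Encoding encoding)
    {
        _encoding = encoding;
    }

    public override Encoding Encoding => _encoding;
}

public static class XmlStringFormatter
{
    // Returns indented XML whose declaration names declaredEncoding.
    public static string ToIndentedString(XmlDocument doc, Encoding declaredEncoding)
    {
        var settings = new XmlWriterSettings { Indent = true };
        using (var sw = new EncodingStringWriter(declaredEncoding))
        {
            using (var writer = XmlWriter.Create(sw, settings))
            {
                doc.Save(writer);
            }
            // The XmlWriter is disposed here, so everything has been flushed to sw.
            return sw.ToString();
        }
    }
}
```

A call such as XmlStringFormatter.ToIndentedString(doc, new UTF8Encoding(false)) would then be followed by writing the string with that same encoding.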

If you have a string of XML rather than a document ready for use, you can do it this way:

var xmlString = "<xml>...</xml>"; // Your original XML string that needs indenting.
xmlString = this.PrettifyXml(xmlString);

private string PrettifyXml(string xmlString)
{
    var prettyXmlString = new StringBuilder();

    var xmlDoc = new XmlDocument();
    xmlDoc.LoadXml(xmlString);

    var xmlSettings = new XmlWriterSettings()
    {
        Indent = true,
        IndentChars = " ",
        NewLineChars = "\r\n",
        NewLineHandling = NewLineHandling.Replace
    };

    using (XmlWriter writer = XmlWriter.Create(prettyXmlString, xmlSettings))
    {
        xmlDoc.Save(writer);
    }

    return prettyXmlString.ToString();
}

public static string FormatXml(string xml)
{
    try
    {
        var doc = XDocument.Parse(xml);
        return doc.ToString();
    }
    catch (Exception)
    {
        // Not parseable as XML: return the input unchanged.
        return xml;
    }
}
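
One thing to be aware of with the XDocument route: XDocument.ToString() does not emit the XML declaration. If the declaration matters, a small variation like the following sketch can put it back; the class and method names are placeholders, not part of the answers above.

```csharp
using System;
using System.Xml.Linq;

public static class LinqXmlFormatter
{
    // Indents the XML and re-attaches the declaration that XDocument.ToString() drops.
    // Throws XmlException if the input is not well-formed XML.
    public static string FormatXmlKeepingDeclaration(string xml)
    {
        XDocument doc = XDocument.Parse(xml);
        string body = doc.ToString();

        // Declaration is null when the input had no <?xml ...?> line.
        if (doc.Declaration == null)
        {
            return body;
        }
        return doc.Declaration.ToString() + Environment.NewLine + body;
    }
}
```

Unlike FormatXml, this version lets parse errors propagate, so wrap the call in a try/catch if the input is untrusted.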

A more simplified approach based on the accepted answer:

public static string Beautify(this XmlDocument doc)
{
    StringBuilder sb = new StringBuilder();
    XmlWriterSettings settings = new XmlWriterSettings
    {
        Indent = true
    };

    using (XmlWriter writer = XmlWriter.Create(sb, settings))
    {
        doc.Save(writer);
    }

    return sb.ToString();
}

There is no need to set the new line characters. The indent characters also default to two spaces, so I preferred not to set them either.

Set PreserveWhitespace to true before calling Load:

var document = new XmlDocument();
document.PreserveWhitespace = true;
document.Load(filename);