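How to deal with XML in C#

What is the best way to deal with XML documents, XSD, etc. in C# 2.0?

Which classes should be used, and what are the best practices for parsing and creating XML documents?

EDIT: .NET 3.5 suggestions are also welcome.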

It depends on the size; for small to mid-size XML, a DOM such as XmlDocument (any C#/.NET version) or XDocument (.NET 3.5/C# 3.0) is the obvious winner. To use XSD, you can load the XML with an XmlReader; XmlReader.Create accepts an XmlReaderSettings object. The XmlReaderSettings object has a Schemas property that can be used to perform XSD validation (DTD validation is configured through the same settings object).

For writing xml, the same things apply, noting that it is a little easier to lay out content with LINQ-to-XML (XDocument) than the older XmlDocument.

However, for huge xml, a DOM may chomp too much memory, in which case you might need to use XmlReader/XmlWriter directly.

Finally, for manipulating xml you may wish to use XslCompiledTransform (an xslt layer).
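As a rough sketch of the XSLT route, the snippet below runs a small stylesheet over the sample People document used elsewhere in this thread. The stylesheet, the class name TransformSketch and the method name are made up for illustration; only XslCompiledTransform, XmlReader and StringWriter come from the framework.

```csharp
using System;
using System.IO;
using System.Xml;
using System.Xml.Xsl;

public static class TransformSketch
{
    // Hypothetical stylesheet: emits each Person's Name followed by ';' as plain text.
    private const string Stylesheet =
        "<xsl:stylesheet version='1.0' xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>" +
        "<xsl:output method='text'/>" +
        "<xsl:template match='/'>" +
        "<xsl:for-each select='People/Person'><xsl:value-of select='@Name'/>;</xsl:for-each>" +
        "</xsl:template>" +
        "</xsl:stylesheet>";

    public static string ListNames(string xml)
    {
        XslCompiledTransform transform = new XslCompiledTransform();
        using (XmlReader styleReader = XmlReader.Create(new StringReader(Stylesheet)))
        {
            // Compile the stylesheet once; the transform object can be reused.
            transform.Load(styleReader);
        }

        using (XmlReader input = XmlReader.Create(new StringReader(xml)))
        using (StringWriter output = new StringWriter())
        {
            // No XSLT arguments are needed here, so null is passed.
            transform.Transform(input, null, output);
            return output.ToString();
        }
    }

    public static void Main()
    {
        Console.WriteLine(ListNames("<People><Person Name='Nick' /><Person Name='Joe' /></People>"));
    }
}
```

In real code the stylesheet would normally live in its own .xslt file and be loaded by path.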

The alternative to working with xml is to work with an object model; you can use xsd.exe to create classes that represent an xsd-compliant model, and simply load the xml as objects, manipulate it with OO, and then serialize those objects again; you do this with XmlSerializer.
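A minimal sketch of that object-model approach follows. The Person and People classes are hand-written stand-ins for what xsd.exe would generate from a schema (an assumption, since no schema is given here), and SerializerSketch is a hypothetical helper name.

```csharp
using System;
using System.IO;
using System.Xml.Serialization;

// Hypothetical classes standing in for xsd.exe output.
public class Person
{
    [XmlAttribute]
    public string Name;
}

[XmlRoot("People")]
public class People
{
    [XmlElement("Person")]
    public Person[] Items;
}

public static class SerializerSketch
{
    // Deserialize an XML string into the object model.
    public static People Load(string xml)
    {
        XmlSerializer serializer = new XmlSerializer(typeof(People));
        using (StringReader reader = new StringReader(xml))
        {
            return (People)serializer.Deserialize(reader);
        }
    }

    // Serialize the object model back to an XML string.
    public static string Save(People people)
    {
        XmlSerializer serializer = new XmlSerializer(typeof(People));
        using (StringWriter writer = new StringWriter())
        {
            serializer.Serialize(writer, people);
            return writer.ToString();
        }
    }

    public static void Main()
    {
        People people = Load("<People><Person Name='Nick' /><Person Name='Joe' /></People>");
        people.Items[0].Name = "Nicholas"; // manipulate with plain OO code
        Console.WriteLine(Save(people));
    }
}
```

The attributes control how members map to XML: here Name becomes an attribute and each array item becomes a Person element directly under People.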

First of all, get to know the new XDocument and XElement classes, because they are an improvement over the previous XmlDocument family.

  1. They work with LINQ
  2. They are faster and more lightweight

However, you may have to still use the old classes to work with legacy code - particularly previously generated proxies. In that case, you will need to become familiar with some patterns for interoperating between these XML-handling classes.

I think your question is quite broad, and would require too much in a single answer to give details, but this is the first general answer I thought of, and serves as a start.

The primary means of reading and writing in C# 2.0 is done through the XmlDocument class. You can load most of your settings directly into the XmlDocument through the XmlReader it accepts.

Loading XML Directly

XmlDocument document = new XmlDocument();
document.LoadXml("<People><Person Name='Nick' /><Person Name='Joe' /></People>");

Loading XML From a File

XmlDocument document = new XmlDocument();
document.Load(@"C:\Path\To\xmldoc.xml");
// Or using an XmlReader/XmlTextReader
XmlReader reader = XmlReader.Create(@"C:\Path\To\xmldoc.xml");
document.Load(reader);

I find the easiest/fastest way to read an XML document is by using XPath.

Reading an XML Document using XPath (Using XmlDocument which allows us to edit)

XmlDocument document = new XmlDocument();
document.LoadXml("<People><Person Name='Nick' /><Person Name='Joe' /></People>");


// Select a single node
XmlNode node = document.SelectSingleNode("/People/Person[@Name = 'Nick']");


// Select a list of nodes
XmlNodeList nodes = document.SelectNodes("/People/Person");

If you need to work with XSD documents to validate an XML document you can use this.

Validating XML Documents against XSD Schemas

XmlReaderSettings settings = new XmlReaderSettings();
settings.ValidationType = ValidationType.Schema;
settings.Schemas.Add("", pathToXsd); // targetNamespace, pathToXsd


XmlReader reader = XmlReader.Create(pathToXml, settings);
XmlDocument document = new XmlDocument();


try {
document.Load(reader);
} catch (XmlSchemaValidationException ex) { Trace.WriteLine(ex.Message); }

Validating XML against XSD at each Node (UPDATE 1)

XmlReaderSettings settings = new XmlReaderSettings();
settings.ValidationType = ValidationType.Schema;
settings.Schemas.Add("", pathToXsd); // targetNamespace, pathToXsd
settings.ValidationEventHandler += new ValidationEventHandler(settings_ValidationEventHandler);


XmlReader reader = XmlReader.Create(pathToXml, settings);
while (reader.Read()) { }


private void settings_ValidationEventHandler(object sender, ValidationEventArgs args)
{
// args.Message, args.Severity (warning, error), args.Exception
// the position of the problem is available on the exception:
// args.Exception.LineNumber, args.Exception.LinePosition, etc.
}

Writing an XML Document (manually)

XmlWriter writer = XmlWriter.Create(pathToOutput);
writer.WriteStartDocument();
writer.WriteStartElement("People");


writer.WriteStartElement("Person");
writer.WriteAttributeString("Name", "Nick");
writer.WriteEndElement();


writer.WriteStartElement("Person");
writer.WriteStartAttribute("Name");
writer.WriteValue("Nick");
writer.WriteEndAttribute();
writer.WriteEndElement();


writer.WriteEndElement();
writer.WriteEndDocument();


writer.Flush();
writer.Close(); // release the output file

(UPDATE 1)

In .NET 3.5, you use XDocument to perform similar tasks. The difference is that you can use LINQ queries to select exactly the data you need. With object initializers, a query can even return objects of your own types directly.

XDocument doc = XDocument.Load(pathToXml);
List<Person> people = (from xnode in doc.Element("People").Elements("Person")
select new Person
{
Name = xnode.Attribute("Name").Value
}).ToList();

(UPDATE 2)

A nice way to create XML in .NET 3.5 is to use XDocument, as shown below. This makes the code follow a pattern similar to the desired output.

XDocument doc =
new XDocument(
new XDeclaration("1.0", Encoding.UTF8.HeaderName, String.Empty),
new XComment("Xml Document"),
new XElement("catalog",
new XElement("book", new XAttribute("id", "bk001"),
new XElement("title", "Book Title")
)
)
);

creates

<!--Xml Document-->
<catalog>
<book id="bk001">
<title>Book Title</title>
</book>
</catalog>

If all else fails, you can check out this MSDN article, which has many examples covering what I've discussed here and more. http://msdn.microsoft.com/en-us/library/aa468556.aspx

If you're working in .NET 3.5 and you aren't afraid of experimental code, you can check out LINQ to XSD (http://blogs.msdn.com/xmlteam/archive/2008/02/21/linq-to-xsd-alpha-0-2.aspx), which will generate .NET classes from an XSD (including built-in rules from the XSD).

It then has the ability to write straight out to a file and read from a file ensuring that it conforms to the XSD rules.

I definitely suggest having an XSD for any XML document you work with:

  • Allows you to enforce rules in the XML
  • Allows others to see how the XML is or will be structured
  • Can be used for validation of XML

I find that Liquid XML Studio is a great tool for generating XSDs, and it's free!

101 Linq samples

http://msdn.microsoft.com/en-us/library/bb387098.aspx

and Linq to XML samples

http://msdn.microsoft.com/en-us/vbasic/bb688087.aspx

And I think Linq makes XML easy.

If you create a typed dataset in the designer then you automatically get an xsd, a strongly typed object, and can load and save the xml with one line of code.

nyxtom's answer is very good. I'd add a couple of things to it:

If you need read-only access to an XML document, XPathDocument is a much lighter-weight object than XmlDocument.

The downside of using XPathDocument is that you can't use the familiar SelectNodes and SelectSingleNode methods of XmlNode. Instead, you have to use the tools that the IXPathNavigable provides: use CreateNavigator to create an XPathNavigator, and use the XPathNavigator to create XPathNodeIterators to iterate over the lists of nodes you find via XPath. This generally requires a few more lines of code than the XmlDocument methods.

But: the XmlDocument and XmlNode classes implement IXPathNavigable, so any code you write to use those methods on an XPathDocument will also work on an XmlDocument. If you get used to writing against IXPathNavigable, your methods can work against either object. (This is why using XmlNode and XmlDocument in method signatures is flagged by FxCop.)
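To illustrate, here is a short sketch of a method written against IXPathNavigable, so that it accepts either an XPathDocument or an XmlDocument. The class and method names (XPathSketch, GetNames) are invented for this example, and the XML is the sample People document from nyxtom's answer.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Xml;
using System.Xml.XPath;

public static class XPathSketch
{
    // Works with any IXPathNavigable: XPathDocument, XmlDocument, XmlNode.
    public static List<string> GetNames(IXPathNavigable source)
    {
        List<string> names = new List<string>();
        XPathNavigator navigator = source.CreateNavigator();
        XPathNodeIterator iterator = navigator.Select("/People/Person");
        while (iterator.MoveNext())
        {
            // Current is a navigator positioned on the matched Person element.
            names.Add(iterator.Current.GetAttribute("Name", ""));
        }
        return names;
    }

    public static void Main()
    {
        string xml = "<People><Person Name='Nick' /><Person Name='Joe' /></People>";

        // Read-only, lighter-weight document.
        XPathDocument readOnly = new XPathDocument(new StringReader(xml));
        foreach (string name in GetNames(readOnly)) Console.WriteLine(name);

        // The same method also accepts an editable XmlDocument.
        XmlDocument editable = new XmlDocument();
        editable.LoadXml(xml);
        foreach (string name in GetNames(editable)) Console.WriteLine(name);
    }
}
```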

Lamentably, XDocument and XElement (and XNode and XObject) don't implement IXPathNavigable.

Another thing not present in nyxtom's answer is XmlReader. You generally use XmlReader to avoid the overhead of parsing the XML stream into an object model before you begin processing it. Instead, you use an XmlReader to process the input stream one XML node at a time. This is essentially .NET's answer to SAX. It lets you write very fast code for processing very large XML documents.

XmlReader also provides the simplest way of processing XML document fragments, e.g. the stream of XML elements with no enclosing element that SQL Server's FOR XML RAW option returns.
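A sketch of reading such a fragment one node at a time is shown below. The element name row and the attribute Name are assumptions about what the fragment looks like, and FragmentSketch and ReadNames are hypothetical names; ConformanceLevel.Fragment is the setting that lets the reader accept input without a single root element.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Xml;

public static class FragmentSketch
{
    public static List<string> ReadNames(string fragment)
    {
        XmlReaderSettings settings = new XmlReaderSettings();
        // Allow multiple top-level elements instead of one document root.
        settings.ConformanceLevel = ConformanceLevel.Fragment;

        List<string> names = new List<string>();
        using (XmlReader reader = XmlReader.Create(new StringReader(fragment), settings))
        {
            // Forward-only loop: each Read() advances to the next node.
            while (reader.Read())
            {
                if (reader.NodeType == XmlNodeType.Element && reader.Name == "row")
                {
                    names.Add(reader.GetAttribute("Name"));
                }
            }
        }
        return names;
    }

    public static void Main()
    {
        foreach (string name in ReadNames("<row Name='Nick' /><row Name='Joe' />"))
        {
            Console.WriteLine(name);
        }
    }
}
```

Note how the loop hard-codes the element and attribute names; that is the coupling to the XML format that comes with this style of code.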

The code you write using XmlReader is generally very tightly coupled to the format of the XML it's reading. Using XPath allows your code to be much, much more loosely coupled to the XML, which is why it's generally the right answer. But when you need to use XmlReader, you really need it.

Cookey's answer is good, but here are detailed instructions on how to create a strongly typed object from an XSD (or XML) and serialize/deserialize it in a few lines of code:

Instructions

My personal opinion, as a C# programmer, is that the best way to deal with XML in C# is to delegate that part of the code to a VB .NET project. In .NET 3.5, VB .NET has XML Literals, which make dealing with XML much more intuitive. See here, for example:

Overview of LINQ to XML in Visual Basic

(Be sure to set the page to display VB code, not C# code.)

I'd write the rest of the project in C#, but handle the XML in a referenced VB project.

nyxtom,

Shouldn't "doc" and "xdoc" match in Example 1?

XDocument **doc** = XDocument.Load(pathToXml);
List<Person> people = (from xnode in **xdoc**.Element("People").Elements("Person")
select new Person
{
Name = xnode.Attribute("Name").Value
}).ToList();

Writing XML with the XmlDocument class

// itemValues is a collection of items in key/value format.
// fileName is the name of the XML file to create or update with the content.
private void WriteInXMLFile(System.Collections.Generic.Dictionary<string, object> itemValues, string fileName)
{
    string filePath = "C:\\tempXML\\" + fileName + ".xml";
    XmlDocument doc = new XmlDocument();

    if (System.IO.File.Exists(filePath))
    {
        // Append to the existing document.
        doc.Load(filePath);
    }
    else
    {
        // Start a new document with an empty <Documents> root.
        doc.AppendChild(doc.CreateElement("Documents"));
    }

    // Add one <Document> element with a child element per key.
    XmlNode pageNode = doc.CreateElement("Document");
    foreach (string key in itemValues.Keys)
    {
        XmlNode attrNode = doc.CreateElement(key);
        attrNode.InnerText = Convert.ToString(itemValues[key]);
        pageNode.AppendChild(attrNode);
    }
    doc.DocumentElement.AppendChild(pageNode);

    // Errors (e.g. a missing directory) propagate to the caller
    // instead of being silently swallowed.
    doc.Save(filePath);
}


The output looks like this:
<Documents>
<Document>
<DocID>01</DocID>
<PageName>121</PageName>
<Author>Mr. ABC</Author>
</Document>
<Document>
<DocID>02</DocID>
<PageName>122</PageName>
<Author>Mr. PQR</Author>
</Document>
</Documents>

If you ever need to convert data between XmlNode <=> XNode <=> XElement
(e.g. in order to use LINQ), these extension methods may be helpful:

public static class MyExtensions
{
public static XNode GetXNode(this XmlNode node)
{
return GetXElement(node);
}


public static XElement GetXElement(this XmlNode node)
{
XDocument xDoc = new XDocument();
using (XmlWriter xmlWriter = xDoc.CreateWriter())
node.WriteTo(xmlWriter);
return xDoc.Root;
}


public static XmlNode GetXmlNode(this XElement element)
{
using (XmlReader xmlReader = element.CreateReader())
{
XmlDocument xmlDoc = new XmlDocument();
xmlDoc.Load(xmlReader);
return xmlDoc;
}
}


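public static XmlNode GetXmlNode(this XNode node)
{
// Read the node directly; calling GetXmlNode(node) here would recurse forever.
// Intended for element or document nodes.
using (XmlReader xmlReader = node.CreateReader())
{
XmlDocument xmlDoc = new XmlDocument();
xmlDoc.Load(xmlReader);
return xmlDoc;
}
}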
}

Usage:

XmlDocument MyXmlDocument = new XmlDocument();
MyXmlDocument.Load("MyXml.xml");
XElement MyXElement = MyXmlDocument.GetXElement(); // Convert XmlNode to XElement
List<XElement> List = MyXElement.Document
.Descendants()
.ToList(); // Now you can use LINQ
...