XML Serialization and Inherited Types

Following on from my previous question I have been working on getting my object model to serialize to XML. But I have now run into a problem (quelle surprise!).

The problem I have is that I have a collection, which is of an abstract base class type, which is populated by the concrete derived types.

I thought it would be fine to just add the XML attributes to all of the classes involved and everything would be peachy. Sadly, that's not the case!

So I have done some digging on Google and I now understand why it's not working. The XmlSerializer does some clever reflection in order to serialize objects to/from XML, and since it's based on the abstract type, it cannot figure out what the hell it's talking to. Fine.

I did come across this page on CodeProject, which looks like it may well help a lot (yet to read/consume fully), but I thought I would like to bring this problem to the StackOverflow table too, to see if you have any neat hacks/tricks in order to get this up and running in the quickest/lightest way possible.

One thing I should also add is that I DO NOT want to go down the XmlInclude route. There is simply too much coupling with it, and this area of the system is under heavy development, so it would be a real maintenance headache!


I've done things similar to this. What I normally do is make sure all the XML serialization attributes are on the concrete class, and just have the properties on that class call through to the base classes (where required) to retrieve information that will be de/serialized when the serializer calls on those properties. It's a bit more coding work, but it does work much better than attempting to force the serializer to just do the right thing.

One thing to look at is the fact that in the XmlSerializer constructor you can pass an array of types that the serializer might be having difficulty resolving. I've had to use that quite a few times where a collection or complex set of data structures needed to be serialized and those types lived in different assemblies etc.

XmlSerializer constructor with extraTypes param

EDIT: I would add that this approach has the benefit over XmlInclude attributes etc that you can work out a way of discovering and compiling a list of your possible concrete types at runtime and stuff them in.
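As a rough sketch of that idea, the snippet below collects the concrete subclasses of a base type by reflection and passes them to the `XmlSerializer(Type, Type[])` constructor. The `Shape`, `Circle`, `Square` and `Drawing` classes and the `ExtraTypesDemo` helper are made-up names for illustration only, and scanning just the base type's own assembly is an assumption; a plugin-style system would need to scan wherever its derived types actually live.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;
using System.Xml.Serialization;

// Hypothetical model: an abstract base with two concrete types.
public abstract class Shape { public string Name { get; set; } }
public class Circle : Shape { public double Radius { get; set; } }
public class Square : Shape { public double Side { get; set; } }

public class Drawing
{
    public List<Shape> Shapes { get; set; } = new List<Shape>();
}

public static class ExtraTypesDemo
{
    // Discover the public, concrete subclasses of Shape at runtime.
    // Assumption: they all live in the same assembly as Shape.
    public static Type[] FindConcreteShapes()
    {
        return typeof(Shape).Assembly.GetTypes()
            .Where(t => t.IsPublic && !t.IsAbstract && t.IsSubclassOf(typeof(Shape)))
            .ToArray();
    }

    public static string Serialize(Drawing drawing)
    {
        // The second argument is the extraTypes array.
        var serializer = new XmlSerializer(typeof(Drawing), FindConcreteShapes());
        using (var writer = new StringWriter())
        {
            serializer.Serialize(writer, drawing);
            return writer.ToString();
        }
    }
}
```

Each derived item is written with an `xsi:type` attribute naming its concrete type. One thing to watch: the documentation notes that only the `XmlSerializer(Type)` and `XmlSerializer(Type, String)` constructors reuse the dynamically generated serialization assemblies, so a serializer built this way is worth creating once and caching rather than constructing on every call.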

Seriously, an extensible framework of POCOs will never serialize to XML reliably. I say this because I can guarantee someone will come along, extend your class, and botch it up.

You should look into using XAML for serializing your object graphs. It is designed to do this, whereas XML serialization isn't.

The XAML serializer and deserializer handle generics without a problem, and collections of base classes and interfaces as well (as long as the collections themselves implement IList or IDictionary). There are some caveats, such as marking your read-only collection properties with the DesignerSerializationVisibilityAttribute, but reworking your code to handle these corner cases isn't that hard.

Just a quick update on this, I have not forgotten!

Just doing some more research, looks like I am on to a winner, just need to get the code sorted.

So far, I have the following:

  • The XmlSerializer is basically a class that does some nifty reflection on the classes it is serializing. It determines the properties that are serialized based on the Type.
  • The problem occurs because of a type mismatch: it is expecting the BaseType but in fact receives the DerivedType. While you may think that it would treat it polymorphically, it doesn't, since that would involve a whole extra load of reflection and type-checking, which it is not designed to do.

It appears this behaviour can be overridden (code pending) by creating a proxy class to act as the go-between for the serializer. This will basically determine the type of the derived class and then serialize that as normal. The proxy class will then feed that XML back up the line to the main serializer.

Watch this space! ^_^

It's certainly a solution to your problem, but there is another problem, which somewhat undermines your intention to use a "portable" XML format. Bad things happen when you decide to change classes in the next version of your program and you need to support both serialization formats, the new one and the old one (because your clients still use their old files/databases, or they connect to your server using an old version of your product). But you can't use this serializer anymore, because you used

type.AssemblyQualifiedName

which looks like

TopNamespace.SubNameSpace.ContainingClass+NestedClass, MyAssembly, Version=1.3.0.0, Culture=neutral, PublicKeyToken=b17a5c561934e089

that is, it contains your assembly attributes and version...

Now if you try to change your assembly version, or you decide to sign it, this deserialization is not going to work...
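One possible way to soften this, sketched below, is to write a shorter name made of the type's full name plus the simple assembly name, leaving out Version, Culture and PublicKeyToken. The `TypeNames` helper is a hypothetical name, not something from the answer above, and whether `Type.GetType` can resolve such a partial name depends on how and where the assembly is deployed, so treat this as an assumption to verify in your own setup.

```csharp
using System;

// Hypothetical helper, for illustration only.
public static class TypeNames
{
    // Builds "Namespace.Type, AssemblyName" without the
    // Version/Culture/PublicKeyToken parts of AssemblyQualifiedName.
    public static string ShortQualifiedName(Type type)
    {
        return type.FullName + ", " + type.Assembly.GetName().Name;
    }
}
```

This only removes the dependency on the assembly version and signing key. Renaming or moving a class still breaks old files, and for generic types `FullName` itself embeds the assembly-qualified names of the type arguments, so those would still carry version information.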

Problem Solved!

OK, so I finally got there (admittedly with a lot of help from here!).

So, to summarise:

Goals:

  • I didn't want to go down the XmlInclude route due to the maintenance headache.
  • Once a solution was found, I wanted it to be quick to implement in other applications.
  • Collections of Abstract types may be used, as well as individual abstract properties.
  • I didn't really want to bother with having to do "special" things in the concrete classes.

Identified Issues/Points to Note:

  • XmlSerializer does some pretty cool reflection, but it is very limited when it comes to abstract types (i.e. it will only work with instances of the abstract type itself, not subclasses).
  • The Xml attribute decorators define how the XmlSerializer treats the properties it finds. The physical type can also be specified, but this creates a tight coupling between the class and the serializer (not good).
  • We can implement our own XmlSerializer by creating a class that implements IXmlSerializable .

The Solution

I created a generic class, in which you specify the generic type as the abstract type you will be working with. This gives the class the ability to "translate" between the abstract type and the concrete type since we can hard-code the casting (i.e. we can get more info than the XmlSerializer can).

I then implemented the IXmlSerializable interface. This is pretty straightforward, but when serializing we need to ensure we write the type of the concrete class to the XML, so we can cast it back when de-serializing. It is also important to note that the type name must be fully qualified, as the assemblies that the two classes are in are likely to differ. There is of course a little type checking and stuff that needs to happen here.

Since the XmlSerializer cannot cast, we need to provide the code to do that, so the implicit operator is then overloaded (I never even knew you could do this!).

The code for the AbstractXmlSerializer is this:

using System;
using System.Collections.Generic;
using System.Text;
using System.Xml.Serialization;

namespace Utility.Xml
{
    public class AbstractXmlSerializer<AbstractType> : IXmlSerializable
    {
        // Override the Implicit Conversions Since the XmlSerializer
        // Casts to/from the required types implicitly.
        public static implicit operator AbstractType(AbstractXmlSerializer<AbstractType> o)
        {
            return o.Data;
        }

        public static implicit operator AbstractXmlSerializer<AbstractType>(AbstractType o)
        {
            return o == null ? null : new AbstractXmlSerializer<AbstractType>(o);
        }

        private AbstractType _data;
        /// <summary>
        /// [Concrete] Data to be stored/is stored as XML.
        /// </summary>
        public AbstractType Data
        {
            get { return _data; }
            set { _data = value; }
        }

        /// <summary>
        /// **DO NOT USE** This is only added to enable XML Serialization.
        /// </summary>
        /// <remarks>DO NOT USE THIS CONSTRUCTOR</remarks>
        public AbstractXmlSerializer()
        {
            // Default Ctor (Required for Xml Serialization - DO NOT USE)
        }

        /// <summary>
        /// Initialises the Serializer to work with the given data.
        /// </summary>
        /// <param name="data">Concrete Object of the AbstractType Specified.</param>
        public AbstractXmlSerializer(AbstractType data)
        {
            _data = data;
        }

        #region IXmlSerializable Members

        public System.Xml.Schema.XmlSchema GetSchema()
        {
            return null; // this is fine as schema is unknown.
        }

        public void ReadXml(System.Xml.XmlReader reader)
        {
            // Cast the Data back from the Abstract Type.
            string typeAttrib = reader.GetAttribute("type");

            // Ensure the Type was Specified.
            // Note: the (paramName, message) overload is used so the text is
            // reported as the message rather than as a parameter name.
            if (typeAttrib == null)
                throw new ArgumentNullException("type",
                    "Unable to Read Xml Data for Abstract Type '" + typeof(AbstractType).Name +
                    "' because no 'type' attribute was specified in the XML.");

            Type type = Type.GetType(typeAttrib);

            // Check the Type is Found.
            if (type == null)
                throw new InvalidCastException("Unable to Read Xml Data for Abstract Type '" + typeof(AbstractType).Name +
                    "' because the type specified in the XML was not found.");

            // Check the Type is a Subclass of the AbstractType.
            if (!type.IsSubclassOf(typeof(AbstractType)))
                throw new InvalidCastException("Unable to Read Xml Data for Abstract Type '" + typeof(AbstractType).Name +
                    "' because the Type specified in the XML differs ('" + type.Name + "').");

            // Read the Data, Deserializing based on the (now known) concrete type.
            reader.ReadStartElement();
            this.Data = (AbstractType)new XmlSerializer(type).Deserialize(reader);
            reader.ReadEndElement();
        }

        public void WriteXml(System.Xml.XmlWriter writer)
        {
            // Write the Type Name to the XML Element as an Attrib and Serialize
            Type type = _data.GetType();

            // BugFix: Assembly must be FQN since Types can/are external to current.
            writer.WriteAttributeString("type", type.AssemblyQualifiedName);
            new XmlSerializer(type).Serialize(writer, _data);
        }

        #endregion
    }
}

So, from there, how do we tell the XmlSerializer to work with our serializer rather than the default? We must pass our type via the Type property of the Xml attributes, for example:

[XmlRoot("ClassWithAbstractCollection")]
public class ClassWithAbstractCollection
{
    private List<AbstractType> _list;
    [XmlArray("ListItems")]
    [XmlArrayItem("ListItem", Type = typeof(AbstractXmlSerializer<AbstractType>))]
    public List<AbstractType> List
    {
        get { return _list; }
        set { _list = value; }
    }

    private AbstractType _prop;
    [XmlElement("MyProperty", Type = typeof(AbstractXmlSerializer<AbstractType>))]
    public AbstractType MyProperty
    {
        get { return _prop; }
        set { _prop = value; }
    }

    public ClassWithAbstractCollection()
    {
        _list = new List<AbstractType>();
    }
}

Here you can see we have a collection and a single property being exposed, and all we need to do is add the Type named parameter to the Xml declaration, easy! :D

NOTE: If you use this code, I would really appreciate a shout-out. It will also help drive more people to the community :)

Now, I'm unsure what to do with the answers here, since they all had their pros and cons. I'll upmod those that I feel were useful (no offence to those that weren't) and close this off once I have the rep :)

Interesting problem and good fun to solve! :)

Even better, using notation:

[XmlRoot]
public class MyClass {
    public abstract class MyAbstract {}
    public class MyInherited : MyAbstract {}
    [XmlArray(), XmlArrayItem(typeof(MyInherited))]
    public MyAbstract[] Items { get; set; }
}