What does the keyword "new" do to a struct in C#?

In C#, structs are value types, while classes are reference types. As I understand it, when you create an instance of a class, the keyword new causes C# to use the class definition to build the instance, as below:

class MyClass
{
...
}
MyClass mc = new MyClass();

For a struct, you're not creating an object but simply setting a variable to a value:

struct MyStruct
{
public string name;
}
MyStruct ms;
//MyStruct ms = new MyStruct();
ms.name = "donkey";

What I don't understand is this: if I declare the variable with MyStruct ms = new MyStruct(), what is the keyword new doing in that statement? If a struct cannot be an object, what is new instantiating here?


From struct (C# Reference) on MSDN:

When you create a struct object using the new operator, it gets created and the appropriate constructor is called. Unlike classes, structs can be instantiated without using the new operator. If you do not use new, the fields will remain unassigned and the object cannot be used until all of the fields are initialized.

To my understanding, you won't actually be able to use a struct properly without using new unless you make sure you initialise all the fields manually. If you use the new operator, then a properly-written constructor has the opportunity to do this for you.

Hope that clears it up. If you need clarification on this let me know.


Edit

There's quite a long comment thread, so I thought I'd add a bit more here. I think the best way to understand it is to give it a go. Make a console project in Visual Studio called "StructTest" and copy the following code into it.

using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;


namespace struct_test
{
class Program
{
public struct Point
{
public int x, y;


public Point(int x)
{
this.x = x;
this.y = 5;
}


public Point(int x, int y)
{
this.x = x;
this.y = y;
}


// This constructor breaks the build because it leaves x unassigned
// (compilers before C# 11 require a struct constructor to assign every field).
// If you uncomment it, comment out the other single-int constructor first;
// otherwise compilation fails on the duplicate signature instead,
// which is not what I'm trying to demonstrate.
/*public Point(int y)
{
this.y = y;
}*/
}


static void Main(string[] args)
{
// Declare an object:
Point myPoint;
//Point myPoint = new Point(10, 20);
//Point myPoint = new Point(15);
//Point myPoint = new Point();




// Initialize:
// With no constructor call above, try commenting out one of these
// assignments and see what happens (it should fail to compile).
myPoint.x = 10;
myPoint.y = 20;


// Display results:
Console.WriteLine("My Point:");
Console.WriteLine("x = {0}, y = {1}", myPoint.x, myPoint.y);


Console.ReadKey(true);
}
}
}

Play around with it. Remove the constructors and see what happens. Try using a constructor that only initialises one variable (I've commented one out; it won't compile). Try with and without the new keyword (I've commented out some examples; uncomment them and give them a try).

Using "new MyStruct()" ensures that all fields are set to some value. In the case above, nothing is different. If instead of setting ms.name you were trying to read it, you would get a "Use of possibly unassigned field 'name'" error in VS.
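A minimal sketch of that difference, reusing MyStruct from the question. The Demo class and its method names are made up for illustration:

```csharp
using System;

struct MyStruct
{
    public string name;
}

static class Demo
{
    // new MyStruct() zero-initializes every field, so name is null here.
    public static bool NameIsNullAfterNew()
    {
        MyStruct ms = new MyStruct();
        return ms.name == null;
    }

    // Without new, the field must be assigned before it can be read.
    public static string AssignThenRead()
    {
        MyStruct ms;
        // Console.WriteLine(ms.name); // would not compile: unassigned field
        ms.name = "donkey";
        return ms.name;
    }

    static void Main()
    {
        Console.WriteLine(NameIsNullAfterNew()); // True
        Console.WriteLine(AssignThenRead());     // donkey
    }
}
```

Uncommenting the read inside AssignThenRead should reproduce the compile error mentioned above.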

Any time an object or struct comes into existence, all of its fields come into existence as well; if any of those fields are struct types, all nested fields come into existence as well. When an array is created, all of its elements come into existence (and, as above, if any of those elements are structs, the fields of those structs also come into existence). All of this occurs before any constructor code has a chance to run.
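As a rough illustration of the array case, the sketch below creates an array of structs and reads the elements before any constructor has been called. Point mirrors the struct used earlier in this thread; Demo and SumOfFreshArray are hypothetical names:

```csharp
using System;

struct Point
{
    public int x, y;

    public Point(int x, int y)
    {
        this.x = x;
        this.y = y;
    }
}

static class Demo
{
    public static int SumOfFreshArray()
    {
        // Creating the array brings three Point values into existence,
        // all zeroed; no Point constructor runs here.
        Point[] points = new Point[3];
        int sum = 0;
        foreach (Point p in points)
        {
            sum += p.x + p.y;
        }
        return sum;
    }

    static void Main()
    {
        Console.WriteLine(SumOfFreshArray()); // 0

        Point[] points = new Point[3];
        points[1].x = 7;                // writes directly into the array element
        Console.WriteLine(points[1].x); // 7
    }
}
```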

In .NET, a struct constructor is effectively nothing more than a method which takes a struct as an 'out' parameter. In C#, an expression which calls a struct constructor will allocate a temporary struct instance, call the constructor on that, and then use that temporary instance as the value of the expression. Note that this is different from VB.NET, where the generated code for a constructor will start by zeroing out all fields, but where the code from the caller will attempt to have the constructor operate directly upon the destination. For example, myStruct = new myStructType(whatever) in VB.NET will clear myStruct before the first statement of the constructor executes; within the constructor, any writes to the object under construction will immediately operate upon myStruct.
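To make the 'out' parameter analogy concrete, here is a small sketch. Init is a hypothetical helper, not anything from the framework; it does the same work as the constructor, except that the target struct is an explicit out parameter rather than the implicit this:

```csharp
using System;

struct Point
{
    public int x, y;

    public Point(int x, int y)
    {
        this.x = x;
        this.y = y;
    }
}

static class Demo
{
    // Hypothetical helper that mimics the constructor with an out parameter.
    // As with other out parameters, p must be fully assigned before returning;
    // for a struct, assigning every field satisfies that rule.
    public static void Init(out Point p, int x, int y)
    {
        p.x = x;
        p.y = y;
    }

    public static bool SameResult()
    {
        Point a = new Point(1, 2);
        Point b;
        Init(out b, 1, 2);
        return a.x == b.x && a.y == b.y;
    }

    static void Main()
    {
        Console.WriteLine(SameResult()); // True
    }
}
```

Removing either assignment inside Init should produce a compile error, much like the one-field constructor that was commented out in the earlier example.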

Value types and structs are somewhat special in C#. Here I'm showing you what happens when you new something.

Here we have the following

  • Code

    partial class TestClass {
    public static void NewLong() {
    var i=new long();
    }
    
    
    public static void NewMyLong() {
    var i=new MyLong();
    }
    
    
    public static void NewMyLongWithValue() {
    var i=new MyLong(1234);
    }
    
    
    public static void NewThatLong() {
    var i=new ThatLong();
    }
    }
    
    
    [StructLayout(LayoutKind.Sequential)]
    public partial struct MyLong {
    const int bits=8*sizeof(int);
    
    
    public static implicit operator int(MyLong x) {
    return (int)x.m_Low;
    }
    
    
    public static implicit operator long(MyLong x) {
    long y=x.m_Hi;
    return (y<<bits)|x.m_Low;
    }
    
    
    public static implicit operator MyLong(long x) {
    var y=default(MyLong);
    y.m_Low=(uint)x;
    y.m_Hi=(int)(x>>bits);
    return y;
    }
    
    
    public MyLong(long x) {
    this=x;
    }
    
    
    uint m_Low;
    int m_Hi;
    }
    
    
    public partial class ThatLong {
    const int bits=8*sizeof(int);
    
    
    public static implicit operator int(ThatLong x) {
    return (int)x.m_Low;
    }
    
    
    public static implicit operator long(ThatLong x) {
    long y=x.m_Hi;
    return (y<<bits)|x.m_Low;
    }
    
    
    public static implicit operator ThatLong(long x) {
    return new ThatLong(x);
    }
    
    
    public ThatLong(long x) {
    this.m_Low=(uint)x;
    this.m_Hi=(int)(x>>bits);
    }
    
    
    public ThatLong() {
    int i=0;
    var b=i is ValueType;
    }
    
    
    uint m_Low;
    int m_Hi;
    }
    

And the generated IL of the methods of the test class would be

  • IL

    // NewLong
    .method public hidebysig static
    void NewLong () cil managed
    {
    .maxstack 1
    .locals init (
    [0] int64 i
    )
    
    
    IL_0000: nop
    IL_0001: ldc.i4.0 // push 0 as int
    IL_0002: conv.i8  // convert the pushed value to long
    IL_0003: stloc.0  // pop it to the first local variable, that is, i
    IL_0004: ret
    }
    
    
    // NewMyLong
    .method public hidebysig static
    void NewMyLong () cil managed
    {
    .maxstack 1
    .locals init (
    [0] valuetype MyLong i
    )
    
    
    IL_0000: nop
    IL_0001: ldloca.s i     // push address of i
    IL_0003: initobj MyLong // pop address of i and initialize as MyLong
    IL_0009: ret
    }
    
    
    // NewMyLongWithValue
    .method public hidebysig static
    void NewMyLongWithValue () cil managed
    {
    .maxstack 2
    .locals init (
    [0] valuetype MyLong i
    )
    
    
    IL_0000: nop
    IL_0001: ldloca.s i  // push address of i
    IL_0003: ldc.i4 1234 // push 1234 as int
    IL_0008: conv.i8     // convert the pushed value to long
    
    
    // call the constructor
    IL_0009: call instance void MyLong::.ctor(int64)
    
    
    IL_000e: nop
    IL_000f: ret
    }
    
    
    // NewThatLong
    .method public hidebysig static
    void NewThatLong () cil managed
    {
    // Method begins at RVA 0x33c8
    // Code size 8 (0x8)
    .maxstack 1
    .locals init (
    [0] class ThatLong i
    )
    
    
    IL_0000: nop
    
    
    // new by calling the constructor and push its reference
    IL_0001: newobj instance void ThatLong::.ctor()
    
    
    // pop it to the first local variable, that is, i
    IL_0006: stloc.0
    
    
    IL_0007: ret
    }
    

The behaviour of each method is described in the comments in the IL code. You might also want to take a look at OpCodes.Initobj and OpCodes.Newobj. A value type is usually initialized with OpCodes.Initobj, but as MSDN says, OpCodes.Newobj can also be used.

  • description in OpCodes.Newobj

    Value types are not usually created using newobj. They are usually allocated either as arguments or local variables, using newarr (for zero-based, one-dimensional arrays), or as fields of objects. Once allocated, they are initialized using Initobj. However, the newobj instruction can be used to create a new instance of a value type on the stack, that can then be passed as an argument, stored in a local, and so on.

Each numeric value type, from byte to double, has dedicated opcodes. Although these types are declared as structs, the generated IL differs, as shown above.

Here are two more things to mention:

  1. ValueType itself is declared as an abstract class

    That is, you cannot new it directly.

  2. structs cannot contain explicit parameterless constructors (this restriction applies before C# 10)

    That is, when you new a struct, you would fall into the case above of either NewMyLong or NewMyLongWithValue.

To summarize, new on value types and structs exists for the consistency of the language concept.
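A short sketch of that consistency: the same new syntax works on built-in value types and simply yields their default value. Demo and NewEqualsDefault are illustrative names only:

```csharp
using System;

static class Demo
{
    public static bool NewEqualsDefault()
    {
        long a = new long();         // same value as 0L
        int b = new int();           // same value as 0
        DateTime c = new DateTime(); // same value as default(DateTime)
        return a == 0L && b == 0 && c == default(DateTime);
    }

    static void Main()
    {
        Console.WriteLine(NewEqualsDefault()); // True
    }
}
```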

See Eric Lippert's excellent answer from this thread. To quote him:

When you "new" a value type, three things happen. First, the memory manager allocates space from short term storage. Second, the constructor is passed a reference to the short term storage location. After the constructor runs, the value that was in the short-term storage location is copied to the storage location for the value, wherever that happens to be. Remember, variables of value type store the actual value.

(Note that the compiler is allowed to optimize these three steps into one step if the compiler can determine that doing so never exposes a partially-constructed struct to user code. That is, the compiler can generate code that simply passes a reference to the final storage location to the constructor, thereby saving one allocation and one copy.)
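One way to observe the "never exposes a partially-constructed struct" rule is a constructor that throws part-way through. The sketch below uses a made-up Pair struct; under the semantics quoted above, the variable being assigned should keep its old value when the constructor fails:

```csharp
using System;

struct Pair
{
    public int A, B;

    public Pair(int a, int b)
    {
        A = a;
        if (b < 0)
        {
            throw new ArgumentException("b must not be negative");
        }
        B = b;
    }
}

static class Demo
{
    public static Pair AssignWithFailingConstructor()
    {
        Pair p = new Pair(1, 2);
        try
        {
            p = new Pair(10, -1); // constructor throws after writing A
        }
        catch (ArgumentException)
        {
            // The assignment to p never completed.
        }
        return p;
    }

    static void Main()
    {
        Pair p = AssignWithFailingConstructor();
        Console.WriteLine("{0}, {1}", p.A, p.B); // 1, 2
    }
}
```

If the constructor had written straight into p, the half-built value (A already set to 10) would be visible after the catch, which is exactly the situation the quoted optimization has to rule out.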

(Making this answer since it really is one)

For a struct, the new keyword is needlessly confusing. It doesn't allocate anything; it's just required if you want to call a constructor. It does not perform a new.

The usual meaning of new is to allocate permanent storage (on the heap). A language like C++ allows new myObject() or just myObject(). Both call the same constructor, but the former creates a new object and returns a pointer, while the latter merely creates a temporary. Any struct or class can use either. There, new is a choice, and it means something.

C# doesn't give you a choice. Class instances always live on the heap, and a struct is stored inline wherever its variable lives (on the stack for a local, or inside the containing object or array for a field or element). It isn't possible to perform a real new on a struct. Experienced C# programmers are used to this. When they see ms = new MyStruct(); they know to read the new as just syntax. They know it's acting like ms = MyStruct(), which merely assigns to an existing variable.

Oddly(?), classes require the new. c=myClass(); isn't allowed (using the constructor to set values of existing object c.) You'd have to make something like c.init();. So you really never have a choice -- constructors always allocate for classes, and never for structs. The new is always just decoration.

I assume the reason for requiring a fake new on structs is so you can easily change a struct into a class (assuming you always use myStruct=new myStruct(); when you first declare, which is recommended).