为什么 C # 默认将方法实现为非虚方法?

与 Java 不同,为什么 C # 默认将方法视为非虚函数?这是否更有可能是一个绩效问题,而不是其他可能的结果?

这让我想起了安德斯·海尔斯伯格中的一段话,讲述了现有架构带来的几个优势。那副作用呢?默认情况下使用非虚方法真的是一个很好的权衡吗?

14369 次浏览

C# is influenced by C++ (and more). C++ does not enable dynamic dispatch (virtual functions) by default. One (good?) argument for this is the question: "How often do you implement classes that are members of a class hiearchy?". Another reason to avoid enabling dynamic dispatch by default is the memory footprint. A class without a virtual pointer (vpointer) pointing to a virtual table, is ofcourse smaller than the corresponding class with late binding enabled.

The performance issue is not so easy to say "yes" or "no" to. The reason for this is the Just In Time (JIT) compilation which is a run time optimization in C#.

Another, similar question about "speed of virtual calls.."

Classes should be designed for inheritance to be able to take advantage of it. Having methods virtual by default means that every function in the class can be plugged out and replaced by another, which is not really a good thing. Many people even believe that classes should have been sealed by default.

virtual methods can also have a slight performance implication. This is not likely to be the primary reason, however.

Because it's too easy to forget that a method may be overridden and not design for that. C# makes you think before you make it virtual. I think this is a great design decision. Some people (such as Jon Skeet) have even said that classes should be sealed by default.

To summarize what others said, there are a few reasons:

1- In C#, there are many things in syntax and semantics that come straight from C++. The fact that methods where not-virtual by default in C++ influenced C#.

2- Having every method virtual by default is a performance concern because every method call must use the object's Virtual Table. Moreover, this strongly limits the Just-In-Time compiler's ability to inline methods and perform other kinds of optimization.

3- Most importantly, if methods are not virtual by default, you can guarantee the behavior of your classes. When they are virtual by default, such as in Java, you can't even guarantee that a simple getter method will do as intended because it could be overridden to do anything in a derived class (of course you can, and should, make the method and/or the class final).

One might wonder, as Zifre mentioned, why the C# language did not go a step further and make classes sealed by default. That's part of the whole debate about the problems of implementation inheritance, which is a very interesting topic.

The simple reason is design and maintenance cost in addition to performance costs. A virtual method has additional cost as compared with a non-virtual method because the designer of the class must plan for what happens when the method is overridden by another class. This has a big impact if you expect a particular method to update internal state or have a particular behavior. You now have to plan for what happens when a derived class changes that behavior. It's much harder to write reliable code in that situation.

With a non-virtual method you have total control. Anything that goes wrong is the fault of the original author. The code is much easier to reason about.

It is certainly not a performance issue. Sun's Java interpreter uses he same code to dispatch (invokevirtual bytecode) and HotSpot generates exactly the same code whether final or not. I believe all C# objects (but not structs) have virtual methods, so you are always going to need the vtbl/runtime class identification. C# is a dialect of "Java-like languages". To suggest it comes from C++ is not entirely honest.

There is an idea that you should "design for inheritance or else prohibit it". Which sounds like a great idea right up to the moment you have a severe business case to put in a quick fix. Perhaps inheriting from code that you don't control.

If all C# methods were virtual then the vtbl would be much bigger.

C# objects only have virtual methods if the class has virtual methods defined. It is true that all objects have type information that includes a vtbl equivalent, but if no virtual methods are defined then only the base Object methods will be present.

@Tom Hawtin: It is probably more accurate to say that C++, C# and Java are all from the C family of languages :)

I'm surprised that there seems to be such a consensus here that non-virtual-by-default is the right way to do things. I'm going to come down on the other - I think pragmatic - side of the fence.

Most of the justifications read to me like the old "If we give you the power you might hurt yourself" argument. From programmers?!

It seems to me like the coder who didn't know enough (or have enough time) to design their library for inheritance and/or extensibility is the coder who's produced exactly the library I'm likely to have to fix or tweak - exactly the library where the ability to override would come in most useful.

The number of times I've had to write ugly, desperate work-around code (or to abandon usage and roll my own alternative solution) because I can't override far, far outweighs the number of times I've ever been bitten (e.g. in Java) by overriding where the designer might not have considered I might.

Non-virtual-by-default makes my life harder.

UPDATE: It's been pointed out [quite correctly] that I didn't actually answer the question. So - and with apologies for being rather late....

I kinda wanted to be able to write something pithy like "C# implements methods as non-virtual by default because a bad decision was made which valued programs more highly than programmers". (I think that could be somewhat justified based on some of the other answers to this question - like performance (premature optimisation, anyone?), or guaranteeing the behaviour of classes.)

However, I realise I'd just be stating my opinion and not that definitive answer that Stack Overflow desires. Surely, I thought, at the highest level the definitive (but unhelpful) answer is:

They're non-virtual by default because the language-designers had a decision to make and that's what they chose.

Now I guess the exact reason that they made that decision we'll never.... oh, wait! The transcript of a conversation!

So it would seem that the answers and comments here about the dangers of overriding APIs and the need to explicitly design for inheritance are on the right track but are all missing an important temporal aspect: Anders' main concern was about maintaining a class's or API's implicit contract across versions. And I think he's actually more concerned about allowing the .Net / C# platform to change under code rather than concerned about user-code changing on top of the platform. (And his "pragmatic" viewpoint is the exact opposite of mine because he's looking from the other side.)

(But couldn't they just have picked virtual-by-default and then peppered "final" through the codebase? Perhaps that's not quite the same.. and Anders is clearly smarter than me so I'm going to let it lie.)

Coming from a perl background I think C# sealed the doom of every developer who might have wanted to extend and modify the behaviour of a base class' thru a non virtual method without forcing all users of the new class to be aware of potentially behind the scene details.

Consider the List class' Add method. What if a developer wanted to update one of several potential databases whenever a particular List is 'Added' to? If 'Add' had been virtual by default the developer could develop a 'BackedList' class that overrode the 'Add' method without forcing all client code to know it was a 'BackedList' instead of a regular 'List'. For all practical purposes the 'BackedList' can be viewed as just another 'List' from client code.

This makes sense from the perspective of a large main class which might provide access to one or more list components which themselves are backed by one or more schemas in a database. Given that C# methods are not virtual by default, the list provided by the main class cannot be a simple IEnumerable or ICollection or even a List instance but must instead be advertised to the client as a 'BackedList' in order to ensure that the new version of the 'Add' operation is called to update the correct schema.

Performance.

Imagine a set of classes that override a virtual base method:

class Base {
public virtual int func(int x) { return 0; }
}


class ClassA: Base {
public override int func(int x) { return x + 100; }
}


class ClassB: Base {
public override int func(int x) { return x + 200; }
}

Now imagine you want to call the func method:

   Base foo;
//...sometime later...
int x = foo.func(42);

Look at what the CPU has to actually do:

    mov   ecx, bfunc$ -- load the address of the "ClassB.func" method from the VMT
push  42          -- push the 42 argument
call  [eax]       -- call ClassB.func

No problem? No, problem!

The assembly isn't that hard to follow:

  1. mov ecx, foo$: This needs to reach into memory, and hit the part of the object's Virtual Method Table (VMT) to get the address of the overridden foo method. The CPU will begin the fetch of the data from memory, and then it will continue on:
  2. push 42: Push the argument 42 onto the stack for the call to the function. No problem, that can run right away, and then we continue to:
  3. call [ecx] Call the address of the ClassB.func function. ← 𝕊𝕋𝔸𝕃𝕃!!!

That's a problem. The address of ClassB.func function has not been fetched from the VMT yet. This means that the CPU doesn't know where the go to next. Ideally it would follow a jump and continue spectatively executing instructions as it waits for the address of ClassB.func to come back from memory. But it can't; so we wait.

If we are lucky: the data already is in the L2 cache. Getting a value out of the L2 cache into a place where it can be used is going to take 12-15 cycles. The CPU can't know where to go next without having to wait for memory for 12-15 cycles.

𝕋𝕙𝕖 ℂℙ𝕌 𝕚𝕤 𝕤𝕥𝕒𝕝𝕝𝕖𝕕 𝕗𝕠𝕣 𝟙𝟚-𝟙𝟝 𝕔𝕪𝕔𝕝𝕖𝕤

Our program is stuck doing nothing for 12-15 cycles.

The CPU core has 7 execution engines. The main job of the CPU is keeping those 7 pipelines full of stuff to do. That means:

  • JITing your machine code into a different order
  • Starting the fetch from memory as soon as possible, letting us move on to other things
  • executing 100, 200, 300 instructions ahead. It will be executing 17 iterations ahead in your loop, across multiple function call and returns
  • it has a branch predictor to try to guess which way a comparison will go, so that it can continue executing ahead while we wait. If it guesses wrong, then it does have to undo all that work. But the branch predictor is not stupid - it's right 94% of the time.

Your CPU has all this power, and capability, and it's just STALLED FOR 15 CYCLES!?

This is awful. This is terrible. And you suffer this penalty every time you call a virtual method - whether you actually overrode it or not.

Our program is 12-15 cycles slower every method call because the language designer made virtual methods opt-out rather than opt-in.

This is why Microsoft decided to not make all methods virtual by default: they learned from Java's mistakes.

Someone ported Android to C#, and it was faster

In 2012, the Xamarin people ported all of Android's Dalvik (i.e. Java) to C#. From them:

Performance

When C# came around, Microsoft modified the language in a couple of significant ways that made it easier to optimize. Value types were introduced to allow small objects to have low overheads and virtual methods were made opt-in, instead of opt-out which made for simpler VMs.

(emphasis mine)