Orphaned instances in Haskell

When compiling my Haskell application with the -Wall option, GHC complains about orphaned instances, for example:

Publisher.hs:45:9:
Warning: orphan instance: instance ToSElem Result

The type class ToSElem is not mine; it is defined by HStringTemplate.

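Now, I know how to fix this (move the instance declaration into the module where Result is declared), and I know why GHC would prefer to avoid orphaned instances, but I still believe that my way is better. I don't care if the compiler is inconvenienced; rather that than me.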

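The reason I want to declare my ToSElem instances in the Publisher module is that it is the Publisher module that depends on HStringTemplate, not the other modules. I am trying to maintain a separation of concerns and avoid having every module depend on HStringTemplate.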

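I thought that one of the advantages of Haskell's type classes, compared for example to Java's interfaces, is that they are open rather than closed, so instances do not have to be declared in the same place as the data type. GHC's advice seems to be to ignore this.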

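So, what I am looking for is either some validation that my thinking is sound and that I would be justified in ignoring/suppressing this warning, or a more convincing argument against doing things my way.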


Go ahead and suppress this warning!

You are in good company. Conal does it in "TypeCompose". "chp-mtl" and "chp-transformers" do it, "control-monad-exception-mtl" and "control-monad-exception-monadsfd" do it, etc.

By the way, you probably already know this, but for those who don't and stumble upon your question in a search:

{-# OPTIONS_GHC -fno-warn-orphans #-}
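For anyone unsure where that pragma goes, here is a minimal single-file sketch. It assumes a GHC recent enough that Semigroup and (<>) are in the Prelude (8.4 or later), and the Semigroup Bool instance is purely an illustration I made up, not something from the question. Neither the class nor the type is defined in this module, so the instance is an orphan; the pragma at the top of the file silences the warning for this module only.

```haskell
{-# OPTIONS_GHC -Wall -fno-warn-orphans #-}
module Main (main) where

-- Illustrative orphan: Semigroup and Bool both come from base, and
-- neither is defined in this module. Without -fno-warn-orphans,
-- -Wall would report this instance as an orphan.
instance Semigroup Bool where
  (<>) = (&&)

main :: IO ()
main = print (True <> False)
```

Newer GHC versions also accept the spelling -Wno-orphans for the same flag.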

Edit:

I acknowledge the problems that Yitz mentioned in his answer as real problems. However I see not using orphaned instances as a problem as well, and I try to pick the "least of all evils", which is imho to prudently use orphan instances.

I only used an exclamation-point in my short answer because your question shows that you are already well aware of the problems. Otherwise, I would have been less enthusiastic :)

A bit of a diversion, but what I believe is the perfect solution in a perfect world without compromise:

I believe that the problems Yitz mentions (not knowing which instance is picked) could be solved in a "holistic" programming system where:

  • You are not editing mere text files primitively, but are rather assisted by the environment (for example, code completion only suggests things of relevant types, etc.)
  • The "lower level" language has no special support for type-classes, and instead function tables are passed along explicitly
  • But, the "higher level" programming environment displays the code in a similar way to how Haskell is presented now (you usually won't see the function tables passed along), and picks the explicit type-classes for you when they are obvious (for example, all cases of Functor have only one choice), and when there are several candidates (zipping list Applicative or list-monad Applicative, First/Last/lifted Maybe Monoid) it lets you choose which instance to use.
  • In any case, even when the instance was picked for you automatically, the environment easily allows you to see which instance was used, with an easy interface (a hyperlink or hover interface or something)
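To make the "function tables passed along explicitly" idea concrete, here is a rough sketch in today's Haskell. All names (MonoidDict, firstDict, lastDict, concatWith) are hypothetical and only for illustration. The record plays the role of a Monoid instance, and the caller chooses which table to use, mirroring the First/Last choice for Maybe.

```haskell
module Main (main) where

-- An explicit "function table" standing in for a Monoid instance.
data MonoidDict a = MonoidDict
  { emptyOf :: a
  , combine :: a -> a -> a
  }

-- Keeps the leftmost Just, like the First monoid.
firstDict :: MonoidDict (Maybe a)
firstDict = MonoidDict Nothing pick
  where
    pick (Just x) _ = Just x
    pick Nothing  y = y

-- Keeps the rightmost Just, like the Last monoid.
lastDict :: MonoidDict (Maybe a)
lastDict = MonoidDict Nothing pick
  where
    pick _ (Just y) = Just y
    pick x Nothing  = x

-- The table is an ordinary argument, so the choice is visible at the call site.
concatWith :: MonoidDict a -> [a] -> a
concatWith d = foldr (combine d) (emptyOf d)

main :: IO ()
main = do
  let xs = [Nothing, Just 1, Just 2, Nothing] :: [Maybe Int]
  print (concatWith firstDict xs)  -- prints Just 1
  print (concatWith lastDict xs)   -- prints Just 2
```

Because the table is passed by hand, there is never any doubt about which "instance" is in effect, at the cost of the noise that type classes normally hide.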

Back from fantasy world (or hopefully the future), right now: I recommend trying to avoid orphan instances while still using them when you "really need" to.

I understand why you want to do this, but unfortunately, it may be only an illusion that Haskell classes seem to be "open" in the way that you say. Many people feel that the possibility of doing this is a bug in the Haskell specification, for reasons I'll explain below. Anyway, if it is really not appropriate for the instance you need to be declared either in the module where the class is declared or in the module where the type is declared, that is probably a sign that you should be using a newtype or some other wrapper around your type.

The reasons why orphan instances need to be avoided run far deeper than convenience of the compiler. This topic is rather controversial, as you can see from other answers. To balance the discussion, I am going to explain the point of view that one should never, ever, write orphan instances, which I think is the majority opinion among experienced Haskellers. My own opinion is somewhere in the middle, which I'll explain at the end.

The problem stems from the fact that when more than one instance declaration exists for the same class and type, there is no mechanism in standard Haskell to specify which to use. Rather, the program is rejected by the compiler.

The simplest effect of that is that you could have a perfectly working program that would suddenly stop compiling because of a change someone else makes in some far off dependency of your module.

Even worse, it's possible for a working program to start crashing at runtime because of a distant change. You could be using a method that you are assuming comes from a certain instance declaration, and it could silently be replaced by a different instance that is just different enough to cause your program to start inexplicably crashing.

People who want guarantees that these problems won't ever happen to them must follow the rule that if anyone, anywhere, has ever declared an instance of a certain class for a certain type, no other instance must ever be declared again in any program written by anyone. Of course, there is the workaround of using a newtype to declare a new instance, but that is always at least a minor inconvenience, and sometimes a major one. So in this sense, those who write orphan instances intentionally are being rather impolite.

So what should be done about this problem? The anti-orphan-instance camp says that the GHC warning is a bug: it should be an error that rejects any attempt to declare an orphan instance. In the meantime, we must exercise self-discipline and avoid them at all costs.

As you have seen, there are those who are not so worried about those potential problems. They actually encourage the use of orphan instances as a tool for separation of concerns, as you suggest, and say that one should just make sure on a case-by-case basis that there is no problem. I have been inconvenienced enough times by other people's orphan instances to be convinced that this attitude is too cavalier.

I think the right solution would be to add an extension to Haskell's import mechanism that would control the import of instances. That would not solve the problems completely, but it would give some help towards protecting our programs against damage from the orphan instances that already exist in the world. And then, with time, I might become convinced that in certain limited cases, perhaps an orphan instance might not be so bad. (And that very temptation is the reason that some in the anti-orphan-instance camp are opposed to my proposal.)

My conclusion from all this is that at least for the time being, I would strongly advise that you avoid declaring any orphan instances, to be considerate to others if for no other reason. Use a newtype.
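As a minimal sketch of the newtype approach: the class Render below is a hypothetical stand-in for a library-owned class such as ToSElem (I am not reproducing HStringTemplate's actual API), and Result with its fields is a made-up stand-in for a type owned by another module. In a real project those two would be imported, and only the wrapper and its instance would live in the module that needs them.

```haskell
module Main (main) where

-- Hypothetical stand-in for a class owned by a library.
class Render a where
  render :: a -> String

-- Hypothetical stand-in for a type owned by some other module.
data Result = Result { title :: String, score :: Int }

-- The wrapper is owned by the module that needs the instance, so an
-- instance declared here for it is not an orphan and cannot collide
-- with an instance that someone else declares for Result.
newtype PublishedResult = PublishedResult Result

instance Render PublishedResult where
  render (PublishedResult r) = title r ++ ": " ++ show (score r)

main :: IO ()
main = putStrLn (render (PublishedResult (Result "build" 42)))
```

The inconvenience is the wrapping and unwrapping at the boundary, which is the cost mentioned above.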

In this case, I think the use of orphan instances is fine. The general rule of thumb for me is -- you can define an instance if you "own" the typeclass or if you "own" the data type (or some component thereof -- i.e., an instance for Maybe MyData is fine as well, at least sometimes). Within those constraints, where you decide to put the instance is your own business.

There's one further exception -- if you own neither the typeclass nor the data type, but are producing a binary and not a library, then that's fine too.

Orphan instances are a nuisance, but in my opinion they are sometimes necessary. I often combine libraries where a type comes from one library and a class comes from another library. Of course the authors of these libraries cannot be expected to provide instances for every conceivable combination of types and classes. So I have to provide them, and so they are orphans.

The idea that you should wrap the type in a new type when you need to provide an instance is an idea with theoretical merit, but it's just too tedious in many circumstances; it's the kind of idea put forward by people who don't write Haskell code for a living. :)

So go ahead and provide orphan instances. They are harmless.
If you can crash ghc with orphan instances then that is a bug and should be reported as such. (The bug ghc had/has about not detecting multiple instances is not that hard to fix.)

But be aware that at some time in the future someone else might add the same instance that you already have, and you might get a (compile-time) error.

Along these lines, I understand the anti-orphan instance camp's position WRT libraries, but for executable targets shouldn't orphan instances be fine?

(I know I'm late to the party, but this may still be useful to others.)

You could keep the orphan instances in their own module. Then anyone who imports that module does so specifically because they need the instances, and they can avoid importing it if the instances cause problems.
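As far as I can tell, base itself does something like this: the Show instance for functions lives in its own module, Text.Show.Functions, and is only in scope if you ask for it. A small sketch of the consumer side, assuming only base:

```haskell
module Main (main) where

-- The empty import list brings in no names, only the instances
-- that the module defines.
import Text.Show.Functions ()

main :: IO ()
main = print (succ :: Int -> Int)
```

The same layout could be used for your own code: a module that contains nothing but the instances, imported with an empty import list wherever they are wanted.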