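What is the best algorithm for overriding GetHashCode?

In .NET, the GetHashCode method is used in a lot of places throughout the .NET base class libraries. Implementing it properly is especially important for finding items quickly in a collection or when determining equality.

Is there a standard algorithm or best practice for implementing GetHashCode for my custom classes so that I don't degrade performance?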

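I usually go with something like the implementation given in Josh Bloch's fabulous Effective Java. It's fast and creates a pretty good hash which is unlikely to cause collisions. Pick two different prime numbers, e.g. 17 and 23, and do:

    public override int GetHashCode()
    {
        unchecked // Overflow is fine, just wrap
        {
            int hash = 17;
            // Suitable nullity checks etc, of course :)
            hash = hash * 23 + field1.GetHashCode();
            hash = hash * 23 + field2.GetHashCode();
            hash = hash * 23 + field3.GetHashCode();
            return hash;
        }
    }

As noted in the comments, you may find it's better to pick a large prime to multiply by instead. Apparently 486187739 is good... and although most examples I've seen with small numbers tend to use primes, there are at least similar algorithms where non-prime numbers are often used. In the not-quite-FNV example later, for instance, I've used numbers which apparently work well, but the initial value isn't a prime. (The multiplication constant is prime, though. I don't know quite how important that is.)

This is better than the common practice of XORing hash codes for two main reasons. Suppose we have a type with two int fields: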

    XorHash(x, x) == XorHash(y, y) == 0 for all x, y
    XorHash(x, y) == XorHash(y, x) for all x, y
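The two XOR weaknesses above can be checked directly. The following is only a minimal sketch: the class name `XorVersusMultiplyAdd` and both helper methods are made up for illustration, and it relies on `int.GetHashCode()` returning the value itself.

```csharp
using System;

public static class XorVersusMultiplyAdd
{
    // Naive combiner (hypothetical helper): XOR the two field hashes together.
    public static int XorHash(int a, int b) => a.GetHashCode() ^ b.GetHashCode();

    // Multiply-and-add combiner using the constants 17 and 23 from the answer.
    public static int MultiplyAddHash(int a, int b)
    {
        unchecked // overflow simply wraps
        {
            int hash = 17;
            hash = hash * 23 + a.GetHashCode();
            hash = hash * 23 + b.GetHashCode();
            return hash;
        }
    }

    public static void Main()
    {
        Console.WriteLine(XorHash(5, 5));                                  // 0: equal fields cancel out
        Console.WriteLine(XorHash(9, 9));                                  // 0: so (5, 5) and (9, 9) collide
        Console.WriteLine(XorHash(1, 2) == XorHash(2, 1));                 // True: field order is lost
        Console.WriteLine(MultiplyAddHash(1, 2) == MultiplyAddHash(2, 1)); // False
        Console.WriteLine(MultiplyAddHash(5, 5) == MultiplyAddHash(9, 9)); // False
    }
}
```

The multiply-and-add version still collides for some inputs, as any 32-bit hash must, but not in these two systematic ways.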

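Incidentally, the earlier algorithm is the one currently used by the C# compiler for anonymous types.

This page gives quite a few options. I think for most cases the above is "good enough", and it's incredibly easy to remember and get right. The FNV alternative is similarly simple, but uses different constants and XOR instead of ADD as the combining operation. It looks something like the code below, but the normal FNV algorithm operates on individual bytes, so this would need modifying to perform one iteration per byte instead of one per 32-bit hash value. FNV is also designed for variable lengths of data, whereas the way we're using it here is always for the same number of field values. Comments on this answer suggest that the code here doesn't actually work as well (in the sample case tested) as the addition approach above.

    // Note: Not quite FNV!
    public override int GetHashCode()
    {
        unchecked // Overflow is fine, just wrap
        {
            int hash = (int) 2166136261;
            // Suitable nullity checks etc, of course :)
            hash = (hash * 16777619) ^ field1.GetHashCode();
            hash = (hash * 16777619) ^ field2.GetHashCode();
            hash = (hash * 16777619) ^ field3.GetHashCode();
            return hash;
        }
    }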

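One thing to be aware of is that, ideally, you should prevent your equality-sensitive (and thus hash-code-sensitive) state from changing after the object has been added to a collection that depends on the hash code.

As per the documentation:

You can override GetHashCode for immutable reference types. In general, for mutable reference types, you should override GetHashCode only if:

  • You can compute the hash code from fields that are not mutable; or
  • You can ensure that the hash code of a mutable object does not change while the object is contained in a collection that relies on its hash code.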
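To illustrate why the documentation is so cautious about mutable types, here is a small sketch in which a key is modified after being added to a HashSet. `MutableKey` and `MutableKeyDemo` are hypothetical names invented for this example.

```csharp
using System;
using System.Collections.Generic;

public sealed class MutableKey
{
    public int Id { get; set; } // mutable on purpose, to show the hazard

    public override bool Equals(object obj) => obj is MutableKey other && other.Id == Id;

    public override int GetHashCode() => Id.GetHashCode();
}

public static class MutableKeyDemo
{
    public static bool FoundAfterMutation()
    {
        var key = new MutableKey { Id = 1 };
        var set = new HashSet<MutableKey> { key };

        key.Id = 2;               // the hash code changes while the object sits in the set
        return set.Contains(key); // the lookup uses the new hash, the set remembers the old one
    }

    public static void Main()
    {
        Console.WriteLine(FoundAfterMutation()); // False: the object is in the set but cannot be found
    }
}
```

Making `Id` read-only, or hashing only fields that never change, removes the problem.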

The link to the FNV article is broken, but here is a copy in the Internet Archive: Eternally Confuzzled - The Art of Hashing

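Most of my work is done with database connectivity, which means that my classes all have a unique identifier from the database. I always use the ID from the database to generate the hash code.

    // Unique ID from database
    private int _id;
    ...
    {
        return _id.GetHashCode();
    }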

I have a Hashing class in a Helper library that I use for this purpose.

    /// <summary>
    /// This is a simple hashing function from Robert Sedgewick's Hashing in C book.
    /// Also, some simple optimizations to the algorithm in order to speed up
    /// its hashing process have been added. from: www.partow.net
    /// </summary>
    /// <param name="input">array of objects, parameters combination that you need
    /// to get a unique hash code for them</param>
    /// <returns>Hash code</returns>
    public static int RSHash(params object[] input)
    {
        const int b = 378551;
        int a = 63689;
        int hash = 0;

        // If it overflows then just wrap around
        unchecked
        {
            for (int i = 0; i < input.Length; i++)
            {
                if (input[i] != null)
                {
                    hash = hash * a + input[i].GetHashCode();
                    a = a * b;
                }
            }
        }

        return hash;
    }

Then you can simply use it as:

    public override int GetHashCode()
    {
        return Hashing.RSHash(_field1, _field2, _field3);
    }

I haven't assessed its performance, so any feedback is welcome.

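In most cases where Equals() compares multiple fields, it doesn't really matter whether your GetHash() hashes on one field or on many. You just have to make sure that calculating the hash is really cheap (no allocations, please) and fast (no heavy computations and certainly no database connections), and that it provides a good distribution.

The heavy lifting should be part of the Equals() method; the hash should be a very cheap operation, so that Equals() ends up being called on as few items as possible.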
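As a rough sketch of that division of labour, the hypothetical `Customer` class below (all names are invented for illustration) hashes only two cheap fields while Equals compares everything. The one rule that must hold is that objects which are equal always return the same hash code.

```csharp
using System;

public sealed class Customer : IEquatable<Customer>
{
    public Customer(int id, string name, string notes)
    {
        Id = id;
        Name = name;
        Notes = notes;
    }

    public int Id { get; }
    public string Name { get; }
    public string Notes { get; }

    // Equals does the thorough comparison over every field.
    public bool Equals(Customer other) =>
        other != null && Id == other.Id && Name == other.Name && Notes == other.Notes;

    public override bool Equals(object obj) => Equals(obj as Customer);

    // The hash looks only at cheap fields. Leaving Notes out is allowed:
    // it can cause extra collisions, but never an inconsistency with Equals.
    public override int GetHashCode()
    {
        unchecked
        {
            int hash = 17;
            hash = hash * 23 + Id;
            hash = hash * 23 + (Name == null ? 0 : Name.GetHashCode());
            return hash;
        }
    }

    public static void Main()
    {
        var a = new Customer(1, "Ann", "first note");
        var b = new Customer(1, "Ann", "first note");
        var c = new Customer(1, "Ann", "second note");

        Console.WriteLine(a.Equals(b) && a.GetHashCode() == b.GetHashCode()); // True
        Console.WriteLine(a.Equals(c));                                       // False
        Console.WriteLine(a.GetHashCode() == c.GetHashCode());                // True: a collision that Equals resolves
    }
}
```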

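And one final tip: don't rely on GetHashCode() being stable across multiple application runs. Many .NET types don't guarantee that their hash codes stay the same after a restart, so you should only use the value of GetHashCode() for in-memory data structures.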

Here is my hash code helper.
Its advantage is that it uses generic type arguments and therefore will not cause boxing:

    public static class HashHelper
    {
        public static int GetHashCode<T1, T2>(T1 arg1, T2 arg2)
        {
            unchecked
            {
                return 31 * arg1.GetHashCode() + arg2.GetHashCode();
            }
        }

        public static int GetHashCode<T1, T2, T3>(T1 arg1, T2 arg2, T3 arg3)
        {
            unchecked
            {
                int hash = arg1.GetHashCode();
                hash = 31 * hash + arg2.GetHashCode();
                return 31 * hash + arg3.GetHashCode();
            }
        }

        public static int GetHashCode<T1, T2, T3, T4>(T1 arg1, T2 arg2, T3 arg3,
            T4 arg4)
        {
            unchecked
            {
                int hash = arg1.GetHashCode();
                hash = 31 * hash + arg2.GetHashCode();
                hash = 31 * hash + arg3.GetHashCode();
                return 31 * hash + arg4.GetHashCode();
            }
        }

        public static int GetHashCode<T>(T[] list)
        {
            unchecked
            {
                int hash = 0;
                foreach (var item in list)
                {
                    hash = 31 * hash + item.GetHashCode();
                }
                return hash;
            }
        }

        public static int GetHashCode<T>(IEnumerable<T> list)
        {
            unchecked
            {
                int hash = 0;
                foreach (var item in list)
                {
                    hash = 31 * hash + item.GetHashCode();
                }
                return hash;
            }
        }

        /// <summary>
        /// Gets a hashcode for a collection for that the order of items
        /// does not matter.
        /// So {1, 2, 3} and {3, 2, 1} will get same hash code.
        /// </summary>
        public static int GetHashCodeForOrderNoMatterCollection<T>(
            IEnumerable<T> list)
        {
            unchecked
            {
                int hash = 0;
                int count = 0;
                foreach (var item in list)
                {
                    hash += item.GetHashCode();
                    count++;
                }
                return 31 * hash + count.GetHashCode();
            }
        }

        /// <summary>
        /// Alternative way to get a hashcode is to use a fluent
        /// interface like this:<br />
        /// return 0.CombineHashCode(field1).CombineHashCode(field2).
        ///     CombineHashCode(field3);
        /// </summary>
        public static int CombineHashCode<T>(this int hashCode, T arg)
        {
            unchecked
            {
                return 31 * hashCode + arg.GetHashCode();
            }
        }
    }

It also has an extension method to provide a fluent interface, so you can use it like this:

    public override int GetHashCode()
    {
        return HashHelper.GetHashCode(Manufacturer, PartN, Quantity);
    }

Or like this:

    public override int GetHashCode()
    {
        return 0.CombineHashCode(Manufacturer)
            .CombineHashCode(PartN)
            .CombineHashCode(Quantity);
    }

Here is a nice one:

    /// <summary>
    /// Helper class for generating hash codes suitable
    /// for use in hashing algorithms and data structures like a hash table.
    /// </summary>
    public static class HashCodeHelper
    {
        private static int GetHashCodeInternal(int key1, int key2)
        {
            unchecked
            {
                var num = 0x7e53a269;
                num = (-1521134295 * num) + key1;
                num += (num << 10);
                num ^= (num >> 6);

                num = ((-1521134295 * num) + key2);
                num += (num << 10);
                num ^= (num >> 6);

                return num;
            }
        }

        /// <summary>
        /// Returns a hash code for the specified objects
        /// </summary>
        /// <param name="arr">An array of objects used for generating the
        /// hash code.</param>
        /// <returns>
        /// A hash code, suitable for use in hashing algorithms and data
        /// structures like a hash table.
        /// </returns>
        public static int GetHashCode(params object[] arr)
        {
            int hash = 0;
            foreach (var item in arr)
                hash = GetHashCodeInternal(hash, item.GetHashCode());
            return hash;
        }

        /// <summary>
        /// Returns a hash code for the specified objects
        /// </summary>
        /// <param name="obj1">The first object.</param>
        /// <param name="obj2">The second object.</param>
        /// <param name="obj3">The third object.</param>
        /// <param name="obj4">The fourth object.</param>
        /// <returns>
        /// A hash code, suitable for use in hashing algorithms and
        /// data structures like a hash table.
        /// </returns>
        public static int GetHashCode<T1, T2, T3, T4>(T1 obj1, T2 obj2, T3 obj3,
            T4 obj4)
        {
            return GetHashCode(obj1, GetHashCode(obj2, obj3, obj4));
        }

        /// <summary>
        /// Returns a hash code for the specified objects
        /// </summary>
        /// <param name="obj1">The first object.</param>
        /// <param name="obj2">The second object.</param>
        /// <param name="obj3">The third object.</param>
        /// <returns>
        /// A hash code, suitable for use in hashing algorithms and data
        /// structures like a hash table.
        /// </returns>
        public static int GetHashCode<T1, T2, T3>(T1 obj1, T2 obj2, T3 obj3)
        {
            return GetHashCode(obj1, GetHashCode(obj2, obj3));
        }

        /// <summary>
        /// Returns a hash code for the specified objects
        /// </summary>
        /// <param name="obj1">The first object.</param>
        /// <param name="obj2">The second object.</param>
        /// <returns>
        /// A hash code, suitable for use in hashing algorithms and data
        /// structures like a hash table.
        /// </returns>
        public static int GetHashCode<T1, T2>(T1 obj1, T2 obj2)
        {
            return GetHashCodeInternal(obj1.GetHashCode(), obj2.GetHashCode());
        }
    }

Here is how to use it:

    private struct Key
    {
        private Type _type;
        private string _field;

        public Type Type { get { return _type; } }
        public string Field { get { return _field; } }

        public Key(Type type, string field)
        {
            _type = type;
            _field = field;
        }

        public override int GetHashCode()
        {
            return HashCodeHelper.GetHashCode(_field, _type);
        }

        public override bool Equals(object obj)
        {
            if (!(obj is Key))
                return false;
            var tf = (Key)obj;
            return tf._field.Equals(_field) && tf._type.Equals(_type);
        }
    }

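ValueTuple - Update for C# 7

As @c mentions in the comments, a value tuple can be used. This saves a few keystrokes and, more importantly, executes purely on the stack (no garbage):

    (PropA, PropB, PropC, PropD).GetHashCode();

(Note: the original technique using anonymous types seems to create an object on the heap, i.e. garbage, since anonymous types are implemented as classes, though this might be optimized away by the compiler. It would be interesting to benchmark these options, but the tuple option should be superior.)

Anonymous Type (Original Answer)

Microsoft already provides a good generic hash code generator: just copy your property/field values to an anonymous type and hash it:

    new { PropA, PropB, PropC, PropD }.GetHashCode();

This will work for any number of properties. It does not use boxing. It just uses the algorithm already implemented in the framework for anonymous types.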

Here is my simplistic approach. I am using the classic builder pattern for this. It is type-safe (no boxing/unboxing) and also compatible with .NET 2.0 (no extension methods etc.).

It is used like this:

    public override int GetHashCode()
    {
        HashBuilder b = new HashBuilder();
        b.AddItems(this.member1, this.member2, this.member3);
        return b.Result;
    }

And here is the actual builder class:

    internal class HashBuilder
    {
        private const int Prime1 = 17;
        private const int Prime2 = 23;
        private int result = Prime1;

        public HashBuilder()
        {
        }

        public HashBuilder(int startHash)
        {
            this.result = startHash;
        }

        public int Result
        {
            get
            {
                return this.result;
            }
        }

        public void AddItem<T>(T item)
        {
            unchecked
            {
                this.result = this.result * Prime2 + item.GetHashCode();
            }
        }

        public void AddItems<T1, T2>(T1 item1, T2 item2)
        {
            this.AddItem(item1);
            this.AddItem(item2);
        }

        public void AddItems<T1, T2, T3>(T1 item1, T2 item2, T3 item3)
        {
            this.AddItem(item1);
            this.AddItem(item2);
            this.AddItem(item3);
        }

        public void AddItems<T1, T2, T3, T4>(T1 item1, T2 item2, T3 item3,
            T4 item4)
        {
            this.AddItem(item1);
            this.AddItem(item2);
            this.AddItem(item3);
            this.AddItem(item4);
        }

        public void AddItems<T1, T2, T3, T4, T5>(T1 item1, T2 item2, T3 item3,
            T4 item4, T5 item5)
        {
            this.AddItem(item1);
            this.AddItem(item2);
            this.AddItem(item3);
            this.AddItem(item4);
            this.AddItem(item5);
        }

        public void AddItems<T>(params T[] items)
        {
            foreach (T item in items)
            {
                this.AddItem(item);
            }
        }
    }

Microsoft takes the lead with several ways of hashing...

    // for classes that contain a single int value
    return this.value;
    // for classes that contain multiple int values
    return x ^ y;
    // for classes that contain a single number bigger than int
    return ((int)value ^ (int)(value >> 32));
    // for classes that contain class instance fields which inherit from object
    return obj1.GetHashCode();
    // for classes that contain multiple class instance fields which inherit from object
    return obj1.GetHashCode() ^ obj2.GetHashCode() ^ obj3.GetHashCode();

I can guess that for multiple big ints you can use this:

    int a = ((int)value1 ^ (int)(value1 >> 32));
    int b = ((int)value2 ^ (int)(value2 >> 32));
    int c = ((int)value3 ^ (int)(value3 >> 32));
    return a ^ b ^ c;

The same goes for multiple types: first convert each one to an int using GetHashCode(), then XOR the int values together, and the result is your hash.

For those who use a hash as an ID (I mean a unique value): a hash is naturally limited to a fixed number of bytes (16 bytes in the case of MD5).

Many different values can be turned into the same hash value, so don't use it as an identifier. (Maybe one day I'm going to use your component.)
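To make the "not an identifier" point concrete, here is a small sketch that finds two different field pairs with the same hash code under the 17/23 multiply-and-add scheme used elsewhere in this thread. `CollisionDemo` and `Hash` are hypothetical names.

```csharp
using System;

public static class CollisionDemo
{
    // Multiply-and-add combiner with the constants 17 and 23.
    public static int Hash(int field1, int field2)
    {
        unchecked
        {
            int hash = 17;
            hash = hash * 23 + field1;
            hash = hash * 23 + field2;
            return hash;
        }
    }

    public static void Main()
    {
        // (17 * 23 + 0) * 23 + 23 = 9016 and (17 * 23 + 1) * 23 + 0 = 9016
        Console.WriteLine(Hash(0, 23)); // 9016
        Console.WriteLine(Hash(1, 0));  // 9016
    }
}
```

Two distinct objects share one hash code here, which is perfectly fine for a hash table (Equals breaks the tie) but fatal if the hash is treated as a unique key.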

This is my helper class using Jon Skeet's implementation.

    public static class HashCode
    {
        public const int Start = 17;

        public static int Hash<T>(this int hash, T obj)
        {
            var h = EqualityComparer<T>.Default.GetHashCode(obj);
            return unchecked((hash * 31) + h);
        }
    }

Usage:

    public override int GetHashCode()
    {
        return HashCode.Start
            .Hash(_field1)
            .Hash(_field2)
            .Hash(_field3);
    }

If you want to avoid writing an extension method for System.Int32:

    public readonly struct HashCode
    {
        private readonly int _value;

        public HashCode(int value) => _value = value;

        public static HashCode Start { get; } = new HashCode(17);

        public static implicit operator int(HashCode hash) => hash._value;

        public HashCode Hash<T>(T obj)
        {
            var h = EqualityComparer<T>.Default.GetHashCode(obj);
            return unchecked(new HashCode((_value * 31) + h));
        }

        public override int GetHashCode() => _value;
    }

It still avoids any heap allocation and is used in exactly the same way:

    public override int GetHashCode()
    {
        // This time `HashCode.Start` is not an `Int32`, it's a `HashCode` instance.
        // And the result is implicitly converted to `Int32`.
        return HashCode.Start
            .Hash(_field1)
            .Hash(_field2)
            .Hash(_field3);
    }

Edit (May 2018): the EqualityComparer<T>.Default getter is now a JIT intrinsic - the pull request is mentioned by Stephen Toub in this blog post.

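Up until recently my answer would have been very close to Jon Skeet's here. However, I recently started a project which used power-of-two hash tables, that is, hash tables where the size of the internal table is 8, 16, 32, etc.

And it pretty much sucked. So after a bit of experimentation and research I started re-hashing my hashes with the following: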

    public static int ReHash(int source)
    {
        unchecked
        {
            ulong c = 0xDEADBEEFDEADBEEF + (ulong)source;
            ulong d = 0xE2ADBEEFDEADBEEF ^ c;
            ulong a = d += c = c << 15 | c >> -15;
            ulong b = a += d = d << 52 | d >> -52;
            c ^= b += a = a << 26 | a >> -26;
            d ^= c += b = b << 51 | b >> -51;
            a ^= d += c = c << 28 | c >> -28;
            b ^= a += d = d << 9 | d >> -9;
            c ^= b += a = a << 47 | a >> -47;
            d ^= c += b << 54 | b >> -54;
            a ^= d += c << 32 | c >> 32;
            a += d << 25 | d >> -25;
            return (int)(a >> 1);
        }
    }

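And then my power-of-two hash table didn't suck any more.

This disturbed me though, because the above shouldn't work. Or more precisely, it shouldn't work unless the original GetHashCode() was poor in a very particular way.

Re-mixing a hash code can't improve a great hash code, because the only possible effect is that we introduce a few more collisions.

Re-mixing a hash code can't improve a terrible hash code, because the only possible effect is that we change, say, a large number of collisions on value 53 into a large number of collisions on value 18,3487,291.

Re-mixing a hash code can only improve a hash code that did at least fairly well at avoiding absolute collisions throughout its range (2^32 possible values), but badly at avoiding collisions once reduced by modulo for actual use in a hash table. While the simpler modulo of a power-of-two table made this more apparent, it was also having a negative effect with the more common prime-number tables; that just wasn't as obvious (the extra work of rehashing would outweigh the benefit, but the benefit would still be there).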
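A small sketch of that failure mode: sixteen hash codes that never collide across the full int range, yet all fall into one bucket of a 16-slot table. `BucketDemo` and its helpers are hypothetical, and the plain modulo only approximates how a real hash table picks a bucket.

```csharp
using System;
using System.Collections.Generic;

public static class BucketDemo
{
    // Hypothetical helper: counts how many distinct buckets the given hash codes
    // occupy when reduced with a plain modulo, as a simple hash table might do.
    public static int DistinctBuckets(IEnumerable<int> hashes, int tableSize)
    {
        var used = new HashSet<int>();
        foreach (int h in hashes)
        {
            used.Add((h & 0x7FFFFFFF) % tableSize); // clear the sign bit, then reduce
        }
        return used.Count;
    }

    // Sixteen distinct hash codes that differ only above the low four bits.
    public static int[] MultiplesOfSixteen()
    {
        var hashes = new int[16];
        for (int i = 0; i < hashes.Length; i++)
        {
            hashes[i] = i * 16;
        }
        return hashes;
    }

    public static void Main()
    {
        int[] hashes = MultiplesOfSixteen();
        Console.WriteLine(DistinctBuckets(hashes, 16)); // 1: every value lands in bucket 0
        Console.WriteLine(DistinctBuckets(hashes, 17)); // 16: a prime table size spreads them out
    }
}
```

A re-mixing step helps in exactly this situation, because it moves the information from the high bits into the low bits that the table actually looks at.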

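Edit: I was also using open addressing, which would also have increased the sensitivity to collisions, perhaps more so than the fact that the table was power-of-two.

And well, it was disturbing how much the string.GetHashCode() implementations in .NET (or study here) could be improved this way (on the order of tests running about 20-30 times faster due to fewer collisions), and more disturbing how much my own hash codes could be improved (much more than that).

All the GetHashCode() implementations I'd coded in the past, and indeed used as the basis of answers on this site, were much worse than I'd thought. Much of the time they were "good enough" for most uses, but I wanted something better.

So I put that project to one side (it was a pet project anyway) and started looking at how to quickly produce a good, well-distributed hash code in .NET.

In the end I settled on porting SpookyHash to .NET. Indeed, the code above is a fast-path version of using SpookyHash to produce a 32-bit output from a 32-bit input.

Now, SpookyHash is not a nice, quick-to-remember piece of code. My port of it is even less so, because I hand-inlined a lot of it for better speed*. But that's what code reuse is for.

Then I put that project to one side, because just as the original project had produced the question of how to produce a better hash code, so that project produced the question of how to produce a better .NET memcpy.

Then I came back and produced a lot of overloads to easily feed just about all of the native types (except decimal) into a hash code.

It's fast, for which Bob Jenkins deserves most of the credit, because his original code that I ported from is faster still, especially on 64-bit machines, which the algorithm is optimised for.

The full code can be seen at https://bitbucket.org/JonHanna/spookilysharp/src but consider that the code above is a simplified version of it.

However, since it's now already written, one can make use of it more easily:

    public override int GetHashCode()
    {
        var hash = new SpookyHash();
        hash.Update(field1);
        hash.Update(field2);
        hash.Update(field3);
        return hash.Final().GetHashCode();
    }

It also takes seed values, so if you need to deal with untrusted input and want to protect against Hash DoS attacks, you can set a seed based on uptime or similar and make the results unpredictable to attackers:

    private static long hashSeed0 = Environment.TickCount;
    private static long hashSeed1 = DateTime.Now.Ticks;

    public override int GetHashCode()
    {
        // produce different hashes every time this application is restarted
        // but remain consistent in each run, so attackers have a harder time
        // DoSing the hash tables.
        var hash = new SpookyHash(hashSeed0, hashSeed1);
        hash.Update(field1);
        hash.Update(field2);
        hash.Update(field3);
        return hash.Final().GetHashCode();
    }

*A big surprise in this is that hand-inlining a rotation method that returned (x << n) | (x >> -n) improved things. I would have been sure that the jitter would have inlined that for me, but profiling showed otherwise.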
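For readers puzzled by the negative shift count, here is a minimal sketch of the rotation idiom. `RotateDemo` is a hypothetical name. The trick works because C# uses only the low six bits of the shift count for a 64-bit operand. Newer runtimes also ship `System.Numerics.BitOperations.RotateLeft`, which is not used here so that the sketch runs on older frameworks too.

```csharp
using System;

public static class RotateDemo
{
    // For a ulong, x >> -n shifts right by (64 - n) when n is between 1 and 63,
    // because the shift count is masked to its low six bits.
    public static ulong RotateLeft(ulong x, int n) => (x << n) | (x >> -n);

    public static void Main()
    {
        ulong x = 0x8000000000000001UL;

        Console.WriteLine(RotateLeft(x, 1));                      // 3: the top bit wraps around to bit 0
        Console.WriteLine(RotateLeft(RotateLeft(x, 1), 63) == x); // True: 1 + 63 is a full turn
    }
}
```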

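On decimal: it isn't native from the .NET perspective, though it is from the C# one. The problem with it is that its own GetHashCode() treats precision as significant while its own Equals() does not. Both are valid choices, but not mixed like that. In implementing your own version, you need to choose to do one or the other, but I can't know which you'd want.

By way of comparison: if used on a string, SpookyHash on 64 bits is considerably faster than string.GetHashCode() on 32 bits, which is slightly faster than string.GetHashCode() on 64 bits, which is considerably faster than SpookyHash on 32 bits, though that is still fast enough to be a reasonable choice.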

Here is another fluent implementation of the algorithm posted above by Jon Skeet, but one which includes no allocations or boxing operations:

    public static class Hash
    {
        public const int Base = 17;

        public static int HashObject(this int hash, object obj)
        {
            unchecked { return hash * 23 + (obj == null ? 0 : obj.GetHashCode()); }
        }

        public static int HashValue<T>(this int hash, T value)
            where T : struct
        {
            unchecked { return hash * 23 + value.GetHashCode(); }
        }
    }

Usage:

    public class MyType<T>
    {
        public string Name { get; set; }

        public string Description { get; set; }

        public int Value { get; set; }

        public IEnumerable<T> Children { get; set; }

        public override int GetHashCode()
        {
            return Hash.Base
                .HashObject(this.Name)
                .HashObject(this.Description)
                .HashValue(this.Value)
                .HashObject(this.Children);
        }
    }

The compiler will ensure HashValue is not called with a class, thanks to the generic type constraint. But there is no compiler support for HashObject, since adding a generic argument also adds a boxing operation.

I ran into an issue with floats and decimals using the implementation selected as the answer above.

This test fails (floats; the hash is the same even though I switched 2 values to be negative):

    var obj1 = new { A = 100m, B = 100m, C = 100m, D = 100m};
    var obj2 = new { A = 100m, B = 100m, C = -100m, D = -100m};
    var hash1 = ComputeHash(obj1.A, obj1.B, obj1.C, obj1.D);
    var hash2 = ComputeHash(obj2.A, obj2.B, obj2.C, obj2.D);
    Assert.IsFalse(hash1 == hash2, string.Format("Hashcode values should be different   hash1:{0}  hash2:{1}",hash1,hash2));

But this test passes (with ints):

    var obj1 = new { A = 100m, B = 100m, C = 100, D = 100};
    var obj2 = new { A = 100m, B = 100m, C = -100, D = -100};
    var hash1 = ComputeHash(obj1.A, obj1.B, obj1.C, obj1.D);
    var hash2 = ComputeHash(obj2.A, obj2.B, obj2.C, obj2.D);
    Assert.IsFalse(hash1 == hash2, string.Format("Hashcode values should be different   hash1:{0}  hash2:{1}",hash1,hash2));

I changed my implementation to not use GetHashCode for the primitive types, and it seems to work better:

    private static int InternalComputeHash(params object[] obj)
    {
        unchecked
        {
            var result = (int)SEED_VALUE_PRIME;
            for (uint i = 0; i < obj.Length; i++)
            {
                var currval = result;
                var nextval = DetermineNextValue(obj[i]);
                result = (result * MULTIPLIER_VALUE_PRIME) + nextval;
            }
            return result;
        }
    }

    private static int DetermineNextValue(object value)
    {
        unchecked
        {
            int hashCode;
            if (value is short
                || value is int
                || value is byte
                || value is sbyte
                || value is uint
                || value is ushort
                || value is ulong
                || value is long
                || value is float
                || value is double
                || value is decimal)
            {
                // Note: Convert.ToInt32 throws OverflowException for values
                // outside the int range, even inside an unchecked block.
                return Convert.ToInt32(value);
            }
            else
            {
                return value != null ? value.GetHashCode() : 0;
            }
        }
    }

Pretty much similar to nightcoder's solution, except that it's easier to raise the primes if you want to.

PS: This is one of those times where you puke a little in your mouth, knowing that this could be refactored into one method with 9 defaults, but it would be slower, so you just close your eyes and try to forget about it.

    /// <summary>
    /// Try not to look at the source code. It works. Just rely on it.
    /// </summary>
    public static class HashHelper
    {
        private const int PrimeOne = 17;
        private const int PrimeTwo = 23;

        public static int GetHashCode<T1, T2, T3, T4, T5, T6, T7, T8, T9, T10>(T1 arg1, T2 arg2, T3 arg3, T4 arg4, T5 arg5, T6 arg6, T7 arg7, T8 arg8, T9 arg9, T10 arg10)
        {
            unchecked
            {
                int hash = PrimeOne;
                hash = hash * PrimeTwo + arg1.GetHashCode();
                hash = hash * PrimeTwo + arg2.GetHashCode();
                hash = hash * PrimeTwo + arg3.GetHashCode();
                hash = hash * PrimeTwo + arg4.GetHashCode();
                hash = hash * PrimeTwo + arg5.GetHashCode();
                hash = hash * PrimeTwo + arg6.GetHashCode();
                hash = hash * PrimeTwo + arg7.GetHashCode();
                hash = hash * PrimeTwo + arg8.GetHashCode();
                hash = hash * PrimeTwo + arg9.GetHashCode();
                hash = hash * PrimeTwo + arg10.GetHashCode();

                return hash;
            }
        }

        public static int GetHashCode<T1, T2, T3, T4, T5, T6, T7, T8, T9>(T1 arg1, T2 arg2, T3 arg3, T4 arg4, T5 arg5, T6 arg6, T7 arg7, T8 arg8, T9 arg9)
        {
            unchecked
            {
                int hash = PrimeOne;
                hash = hash * PrimeTwo + arg1.GetHashCode();
                hash = hash * PrimeTwo + arg2.GetHashCode();
                hash = hash * PrimeTwo + arg3.GetHashCode();
                hash = hash * PrimeTwo + arg4.GetHashCode();
                hash = hash * PrimeTwo + arg5.GetHashCode();
                hash = hash * PrimeTwo + arg6.GetHashCode();
                hash = hash * PrimeTwo + arg7.GetHashCode();
                hash = hash * PrimeTwo + arg8.GetHashCode();
                hash = hash * PrimeTwo + arg9.GetHashCode();

                return hash;
            }
        }

        public static int GetHashCode<T1, T2, T3, T4, T5, T6, T7, T8>(T1 arg1, T2 arg2, T3 arg3, T4 arg4, T5 arg5, T6 arg6, T7 arg7, T8 arg8)
        {
            unchecked
            {
                int hash = PrimeOne;
                hash = hash * PrimeTwo + arg1.GetHashCode();
                hash = hash * PrimeTwo + arg2.GetHashCode();
                hash = hash * PrimeTwo + arg3.GetHashCode();
                hash = hash * PrimeTwo + arg4.GetHashCode();
                hash = hash * PrimeTwo + arg5.GetHashCode();
                hash = hash * PrimeTwo + arg6.GetHashCode();
                hash = hash * PrimeTwo + arg7.GetHashCode();
                hash = hash * PrimeTwo + arg8.GetHashCode();

                return hash;
            }
        }

        public static int GetHashCode<T1, T2, T3, T4, T5, T6, T7>(T1 arg1, T2 arg2, T3 arg3, T4 arg4, T5 arg5, T6 arg6, T7 arg7)
        {
            unchecked
            {
                int hash = PrimeOne;
                hash = hash * PrimeTwo + arg1.GetHashCode();
                hash = hash * PrimeTwo + arg2.GetHashCode();
                hash = hash * PrimeTwo + arg3.GetHashCode();
                hash = hash * PrimeTwo + arg4.GetHashCode();
                hash = hash * PrimeTwo + arg5.GetHashCode();
                hash = hash * PrimeTwo + arg6.GetHashCode();
                hash = hash * PrimeTwo + arg7.GetHashCode();

                return hash;
            }
        }

        public static int GetHashCode<T1, T2, T3, T4, T5, T6>(T1 arg1, T2 arg2, T3 arg3, T4 arg4, T5 arg5, T6 arg6)
        {
            unchecked
            {
                int hash = PrimeOne;
                hash = hash * PrimeTwo + arg1.GetHashCode();
                hash = hash * PrimeTwo + arg2.GetHashCode();
                hash = hash * PrimeTwo + arg3.GetHashCode();
                hash = hash * PrimeTwo + arg4.GetHashCode();
                hash = hash * PrimeTwo + arg5.GetHashCode();
                hash = hash * PrimeTwo + arg6.GetHashCode();

                return hash;
            }
        }

        public static int GetHashCode<T1, T2, T3, T4, T5>(T1 arg1, T2 arg2, T3 arg3, T4 arg4, T5 arg5)
        {
            unchecked
            {
                int hash = PrimeOne;
                hash = hash * PrimeTwo + arg1.GetHashCode();
                hash = hash * PrimeTwo + arg2.GetHashCode();
                hash = hash * PrimeTwo + arg3.GetHashCode();
                hash = hash * PrimeTwo + arg4.GetHashCode();
                hash = hash * PrimeTwo + arg5.GetHashCode();

                return hash;
            }
        }

        public static int GetHashCode<T1, T2, T3, T4>(T1 arg1, T2 arg2, T3 arg3, T4 arg4)
        {
            unchecked
            {
                int hash = PrimeOne;
                hash = hash * PrimeTwo + arg1.GetHashCode();
                hash = hash * PrimeTwo + arg2.GetHashCode();
                hash = hash * PrimeTwo + arg3.GetHashCode();
                hash = hash * PrimeTwo + arg4.GetHashCode();

                return hash;
            }
        }

        public static int GetHashCode<T1, T2, T3>(T1 arg1, T2 arg2, T3 arg3)
        {
            unchecked
            {
                int hash = PrimeOne;
                hash = hash * PrimeTwo + arg1.GetHashCode();
                hash = hash * PrimeTwo + arg2.GetHashCode();
                hash = hash * PrimeTwo + arg3.GetHashCode();

                return hash;
            }
        }

        public static int GetHashCode<T1, T2>(T1 arg1, T2 arg2)
        {
            unchecked
            {
                int hash = PrimeOne;
                hash = hash * PrimeTwo + arg1.GetHashCode();
                hash = hash * PrimeTwo + arg2.GetHashCode();

                return hash;
            }
        }
    }

ReSharper users can generate GetHashCode, Equals, and others with ReSharper -> Edit -> Generate Code -> Equality Members.

    // ReSharper's GetHashCode looks like this
    public override int GetHashCode()
    {
        unchecked
        {
            int hashCode = Id;
            hashCode = (hashCode * 397) ^ IntMember;
            hashCode = (hashCode * 397) ^ OtherIntMember;
            hashCode = (hashCode * 397) ^ (RefMember != null ? RefMember.GetHashCode() : 0);
            // ...
            return hashCode;
        }
    }

As of https://github.com/dotnet/coreclr/pull/14863, there is a new way to generate hash codes that is super simple! Just write

    public override int GetHashCode()
        => HashCode.Combine(field1, field2, field3);

This will generate a quality hash code without you having to worry about the implementation details.

If we have no more than 8 properties (hopefully), here is another alternative.

ValueTuple is a struct and appears to have a solid GetHashCode implementation.

That means we could simply do this:

    // Yay, no allocations and no custom implementations!
    public override int GetHashCode() => (this.PropA, this.PropB).GetHashCode();

Let's take a look at .NET Core's current implementation of ValueTuple's GetHashCode.

This is from ValueTuple:

    internal static int CombineHashCodes(int h1, int h2)
    {
        return HashHelpers.Combine(HashHelpers.Combine(HashHelpers.RandomSeed, h1), h2);
    }

    internal static int CombineHashCodes(int h1, int h2, int h3)
    {
        return HashHelpers.Combine(CombineHashCodes(h1, h2), h3);
    }

And this is from HashHelpers:

    public static readonly int RandomSeed = Guid.NewGuid().GetHashCode();

    public static int Combine(int h1, int h2)
    {
        unchecked
        {
            // RyuJIT optimizes this to use the ROL instruction
            // Related GitHub pull request: dotnet/coreclr#1830
            uint rol5 = ((uint)h1 << 5) | ((uint)h1 >> 27);
            return ((int)rol5 + h1) ^ h2;
        }
    }

In English:

  • Left rotate (circular shift) h1 by 5 positions.
  • Add the result and h1 together.
  • XOR the result with h2.
  • Start by performing the above operation on { static random seed, h1 }.
  • For each further item, perform the operation on the previous result and the next item (e.g. h2).
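The steps above can be traced with small numbers. This sketch copies the Combine logic shown in the thread into a hypothetical `Rol5Demo` class and passes the seed as a parameter (an assumption made here so that the output is reproducible; the real code uses a random per-process seed).

```csharp
using System;

public static class Rol5Demo
{
    // Same arithmetic as the Combine method quoted in the thread.
    public static int Combine(int h1, int h2)
    {
        unchecked
        {
            uint rol5 = ((uint)h1 << 5) | ((uint)h1 >> 27); // rotate h1 left by 5 bits
            return ((int)rol5 + h1) ^ h2;                   // add h1, then XOR with h2
        }
    }

    // Hypothetical variant of CombineHashCodes with an explicit seed.
    public static int CombineHashCodes(int seed, int h1, int h2) =>
        Combine(Combine(seed, h1), h2);

    public static void Main()
    {
        // rol5(1) = 32, 32 + 1 = 33, 33 ^ 2 = 35
        Console.WriteLine(Combine(1, 2));             // 35

        // With seed 0: Combine(0, 1) = 1, then Combine(1, 2) = 35
        Console.WriteLine(CombineHashCodes(0, 1, 2)); // 35
    }
}
```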

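It would be nice to know more about the properties of this ROL-5 hash code algorithm.

Regrettably, deferring to ValueTuple for our own GetHashCode may not be as fast as we would like and expect. This comment in a related discussion illustrates that calling HashHelpers.Combine directly is more performant. On the flip side, that one is internal, so we'd have to copy the code, sacrificing much of what we had gained here. We would also be responsible for remembering to first Combine with the random seed. I don't know what the consequences are if we skip that step.

Here is a static helper class that implements Josh Bloch's implementation, provides explicit overloads to "prevent" boxing, and also implements the hash specifically for the long primitives.

You can pass a string comparison that matches your equals implementation.

Because the Hash output is always an int, you can just chain Hash calls.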

    using System;
    using System.Collections;
    using System.Collections.Generic;
    using System.Reflection;
    using System.Runtime.CompilerServices;


    namespace Sc.Util.System
    {
        /// <summary>
        /// Static methods that allow easy implementation of hashCode. Example usage:
        /// <code>
        /// public override int GetHashCode()
        ///     => HashCodeHelper.Seed
        ///         .Hash(primitiveField)
        ///         .Hash(objectField)
        ///         .Hash(iEnumerableField);
        /// </code>
        /// </summary>
        public static class HashCodeHelper
        {
            /// <summary>
            /// An initial value for a hashCode, to which is added contributions from fields.
            /// Using a non-zero value decreases collisions of hashCode values.
            /// </summary>
            public const int Seed = 23;

            private const int oddPrimeNumber = 37;


            /// <summary>
            /// Rotates the seed against a prime number.
            /// </summary>
            /// <param name="aSeed">The hash's first term.</param>
            /// <returns>The new hash code.</returns>
            [MethodImpl(MethodImplOptions.AggressiveInlining)]
            private static int rotateFirstTerm(int aSeed)
            {
                unchecked {
                    return HashCodeHelper.oddPrimeNumber * aSeed;
                }
            }


            /// <summary>
            /// Contributes a boolean to the developing HashCode seed.
            /// </summary>
            /// <param name="aSeed">The developing HashCode value or seed.</param>
            /// <param name="aBoolean">The value to contribute.</param>
            /// <returns>The new hash code.</returns>
            [MethodImpl(MethodImplOptions.AggressiveInlining)]
            public static int Hash(this int aSeed, bool aBoolean)
            {
                unchecked {
                    return HashCodeHelper.rotateFirstTerm(aSeed)
                            + (aBoolean
                                    ? 1
                                    : 0);
                }
            }

            /// <summary>
            /// Contributes a char to the developing HashCode seed.
            /// </summary>
            /// <param name="aSeed">The developing HashCode value or seed.</param>
            /// <param name="aChar">The value to contribute.</param>
            /// <returns>The new hash code.</returns>
            [MethodImpl(MethodImplOptions.AggressiveInlining)]
            public static int Hash(this int aSeed, char aChar)
            {
                unchecked {
                    return HashCodeHelper.rotateFirstTerm(aSeed)
                            + aChar;
                }
            }

            /// <summary>
            /// Contributes an int to the developing HashCode seed.
            /// Note that byte and short are handled by this method, through implicit conversion.
            /// </summary>
            /// <param name="aSeed">The developing HashCode value or seed.</param>
            /// <param name="aInt">The value to contribute.</param>
            /// <returns>The new hash code.</returns>
            [MethodImpl(MethodImplOptions.AggressiveInlining)]
            public static int Hash(this int aSeed, int aInt)
            {
                unchecked {
                    return HashCodeHelper.rotateFirstTerm(aSeed)
                            + aInt;
                }
            }

            /// <summary>
            /// Contributes a long to the developing HashCode seed.
            /// </summary>
            /// <param name="aSeed">The developing HashCode value or seed.</param>
            /// <param name="aLong">The value to contribute.</param>
            /// <returns>The new hash code.</returns>
            [MethodImpl(MethodImplOptions.AggressiveInlining)]
            public static int Hash(this int aSeed, long aLong)
            {
                unchecked {
                    return HashCodeHelper.rotateFirstTerm(aSeed)
                            + (int)(aLong ^ (aLong >> 32));
                }
            }

            /// <summary>
            /// Contributes a float to the developing HashCode seed.
            /// </summary>
            /// <param name="aSeed">The developing HashCode value or seed.</param>
            /// <param name="aFloat">The value to contribute.</param>
            /// <returns>The new hash code.</returns>
            [MethodImpl(MethodImplOptions.AggressiveInlining)]
            public static int Hash(this int aSeed, float aFloat)
            {
                unchecked {
                    // Note: Convert.ToInt32 rounds the value and throws
                    // OverflowException for values outside the int range.
                    return HashCodeHelper.rotateFirstTerm(aSeed)
                            + Convert.ToInt32(aFloat);
                }
            }

            /// <summary>
            /// Contributes a double to the developing HashCode seed.
            /// </summary>
            /// <param name="aSeed">The developing HashCode value or seed.</param>
            /// <param name="aDouble">The value to contribute.</param>
            /// <returns>The new hash code.</returns>
            [MethodImpl(MethodImplOptions.AggressiveInlining)]
            public static int Hash(this int aSeed, double aDouble)
                => aSeed.Hash(Convert.ToInt64(aDouble));

            /// <summary>
            /// Contributes a string to the developing HashCode seed.
            /// </summary>
            /// <param name="aSeed">The developing HashCode value or seed.</param>
            /// <param name="aString">The value to contribute.</param>
            /// <param name="stringComparison">Optional comparison that creates the hash.</param>
            /// <returns>The new hash code.</returns>
            [MethodImpl(MethodImplOptions.AggressiveInlining)]
            public static int Hash(
                    this int aSeed,
                    string aString,
                    StringComparison stringComparison = StringComparison.Ordinal)
            {
                if (aString == null)
                    return aSeed.Hash(0);
                // The comparer's hash is folded into the seed; returning it directly
                // would discard every contribution made before this call.
                switch (stringComparison) {
                    case StringComparison.CurrentCulture :
                        return aSeed.Hash(StringComparer.CurrentCulture.GetHashCode(aString));
                    case StringComparison.CurrentCultureIgnoreCase :
                        return aSeed.Hash(StringComparer.CurrentCultureIgnoreCase.GetHashCode(aString));
                    case StringComparison.InvariantCulture :
                        return aSeed.Hash(StringComparer.InvariantCulture.GetHashCode(aString));
                    case StringComparison.InvariantCultureIgnoreCase :
                        return aSeed.Hash(StringComparer.InvariantCultureIgnoreCase.GetHashCode(aString));
                    case StringComparison.OrdinalIgnoreCase :
                        return aSeed.Hash(StringComparer.OrdinalIgnoreCase.GetHashCode(aString));
                    default :
                        return aSeed.Hash(StringComparer.Ordinal.GetHashCode(aString));
                }
            }

            /// <summary>
            /// Contributes a possibly-null array to the developing HashCode seed.
            /// Each element may be a primitive, a reference, or a possibly-null array.
            /// </summary>
            /// <param name="aSeed">The developing HashCode value or seed.</param>
            /// <param name="aArray">CAN be null.</param>
            /// <returns>The new hash code.</returns>
            [MethodImpl(MethodImplOptions.AggressiveInlining)]
            public static int Hash(this int aSeed, IEnumerable aArray)
            {
                if (aArray == null)
                    return aSeed.Hash(0);
                int countPlusOne = 1; // So it differs from null
                foreach (object item in aArray) {
                    ++countPlusOne;
                    if (item is IEnumerable arrayItem) {
                        if (!object.ReferenceEquals(aArray, arrayItem))
                            aSeed = aSeed.Hash(arrayItem); // recursive call!
                    } else
                        aSeed = aSeed.Hash(item);
                }
                return aSeed.Hash(countPlusOne);
            }

            /// <summary>
            /// Contributes a possibly-null array to the developing HashCode seed.
            /// You must provide the hash function for each element.
            /// </summary>
            /// <param name="aSeed">The developing HashCode value or seed.</param>
            /// <param name="aArray">CAN be null.</param>
            /// <param name="hashElement">Required: yields the hash for each element
            /// in <paramref name="aArray"/>.</param>
            /// <returns>The new hash code.</returns>
            [MethodImpl(MethodImplOptions.AggressiveInlining)]
            public static int Hash<T>(this int aSeed, IEnumerable<T> aArray, Func<T, int> hashElement)
            {
                if (aArray == null)
                    return aSeed.Hash(0);
                int countPlusOne = 1; // So it differs from null
                foreach (T item in aArray) {
                    ++countPlusOne;
                    aSeed = aSeed.Hash(hashElement(item));
                }
                return aSeed.Hash(countPlusOne);
            }

            /// <summary>
            /// Contributes a possibly-null object to the developing HashCode seed.
            /// </summary>
            /// <param name="aSeed">The developing HashCode value or seed.</param>
            /// <param name="aObject">CAN be null.</param>
            /// <returns>The new hash code.</returns>
            [MethodImpl(MethodImplOptions.AggressiveInlining)]
            public static int Hash(this int aSeed, object aObject)
            {
                switch (aObject) {
                    case null :
                        return aSeed.Hash(0);
                    case bool b :
                        return aSeed.Hash(b);
                    case char c :
                        return aSeed.Hash(c);
                    case int i :
                        return aSeed.Hash(i);
                    case long l :
                        return aSeed.Hash(l);
                    case float f :
                        return aSeed.Hash(f);
                    case double d :
                        return aSeed.Hash(d);
                    case string s :
                        return aSeed.Hash(s);
                    case IEnumerable iEnumerable :
                        return aSeed.Hash(iEnumerable);
                }
                return aSeed.Hash(aObject.GetHashCode());
            }


            /// <summary>
            /// This utility method uses reflection to iterate all specified properties that are readable
            /// on the given object, excluding any property names given in the params arguments, and
            /// generates a hashcode.
            /// </summary>
            /// <param name="aSeed">The developing hash code, or the seed: if you have no seed, use
            /// the <see cref="Seed"/>.</param>
            /// <param name="aObject">CAN be null.</param>
            /// <param name="propertySelector"><see cref="BindingFlags"/> to select the properties to hash.</param>
            /// <param name="ignorePropertyNames">Optional.</param>
            /// <returns>A hash from the properties contributed to <c>aSeed</c>.</returns>
            [MethodImpl(MethodImplOptions.AggressiveInlining)]
            public static int HashAllProperties(
                    this int aSeed,
                    object aObject,
                    BindingFlags propertySelector
                            = BindingFlags.Instance
                            | BindingFlags.Public
                            | BindingFlags.GetProperty,
                    params string[] ignorePropertyNames)
            {
                if (aObject == null)
                    return aSeed.Hash(0);
                if ((ignorePropertyNames != null)
                        && (ignorePropertyNames.Length != 0)) {
                    foreach (PropertyInfo propertyInfo in aObject.GetType()
                            .GetProperties(propertySelector)) {
                        if (!propertyInfo.CanRead
                                || (Array.IndexOf(ignorePropertyNames, propertyInfo.Name) >= 0))
                            continue;
                        aSeed = aSeed.Hash(propertyInfo.GetValue(aObject));
                    }
                } else {
                    foreach (PropertyInfo propertyInfo in aObject.GetType()
                            .GetProperties(propertySelector)) {
                        if (propertyInfo.CanRead)
                            aSeed = aSeed.Hash(propertyInfo.GetValue(aObject));
                    }
                }
                return aSeed;
            }


            /// <summary>
            /// NOTICE: this method is provided to contribute a <see cref="KeyValuePair{TKey,TValue}"/> to
            /// the developing HashCode seed; by hashing the key and the value independently. HOWEVER,
            /// this method has a different name since it will not be automatically invoked by
            /// <see cref="Hash(int,object)"/>, <see cref="Hash(int,IEnumerable)"/>,
            /// or <see cref="HashAllProperties"/> --- you MUST NOT mix this method with those unless
            /// you are sure that no KeyValuePair instances will be passed to those methods; or otherwise
            /// the generated hash code will not be consistent. This method itself ALSO will not invoke
            /// this method on the Key or Value here if that itself is a KeyValuePair.
            /// </summary>
            /// <param name="aSeed">The developing HashCode value or seed.</param>
            /// <param name="keyValuePair">The value to contribute.</param>
            /// <returns>The new hash code.</returns>
            [MethodImpl(MethodImplOptions.AggressiveInlining)]
            public static int HashKeyAndValue<TKey, TValue>(this int aSeed, KeyValuePair<TKey, TValue> keyValuePair)
                => aSeed.Hash(keyValuePair.Key)
                        .Hash(keyValuePair.Value);

            /// <summary>
            /// NOTICE: this method is provided to contribute a collection of <see cref="KeyValuePair{TKey,TValue}"/>
            /// to the developing HashCode seed; by hashing the key and the value independently. HOWEVER,
            /// this method has a different name since it will not be automatically invoked by
            /// <see cref="Hash(int,object)"/>, <see cref="Hash(int,IEnumerable)"/>,
            /// or <see cref="HashAllProperties"/> --- you MUST NOT mix this method with those unless
            /// you are sure that no KeyValuePair instances will be passed to those methods; or otherwise
            /// the generated hash code will not be consistent. This method itself ALSO will not invoke
            /// this method on a Key or Value here if that itself is a KeyValuePair or an Enumerable of
            /// KeyValuePair.
            /// </summary>
            /// <param name="aSeed">The developing HashCode value or seed.</param>
            /// <param name="keyValuePairs">The values to contribute.</param>
            /// <returns>The new hash code.</returns>
            [MethodImpl(MethodImplOptions.AggressiveInlining)]
            public static int HashKeysAndValues<TKey, TValue>(
                    this int aSeed,
                    IEnumerable<KeyValuePair<TKey, TValue>> keyValuePairs)
            {
                if (keyValuePairs == null)
                    return aSeed.Hash(null);
                foreach (KeyValuePair<TKey, TValue> keyValuePair in keyValuePairs) {
                    aSeed = aSeed.HashKeyAndValue(keyValuePair);
                }
                return aSeed;
            }
        }
    }

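Using System.HashCode

If you are using .NET Standard 2.1 or later, you can use the System.HashCode struct. On earlier frameworks it is available from a separate NuGet package. There are two ways to use it:

Combining hash codes

The Combine method creates a hash code from up to eight objects.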

public override int GetHashCode() => HashCode.Combine(this.object1, this.object2);
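To make the Combine approach concrete, here is a minimal sketch of a type that keeps Equals and GetHashCode in step. The Point type and its members are made up for illustration; only HashCode.Combine and HashSet come from the framework.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical example type, not taken from the answer above.
public readonly struct Point : IEquatable<Point>
{
    public int X { get; }
    public int Y { get; }

    public Point(int x, int y) { X = x; Y = y; }

    // Equality and hashing look at exactly the same fields.
    public bool Equals(Point other) => X == other.X && Y == other.Y;
    public override bool Equals(object obj) => obj is Point other && Equals(other);
    public override int GetHashCode() => HashCode.Combine(X, Y);
}

public static class CombineDemo
{
    public static void Main()
    {
        var set = new HashSet<Point> { new Point(1, 2), new Point(1, 2), new Point(2, 1) };
        Console.WriteLine(set.Count); // prints 2: the two equal points collapse into one entry
    }
}
```

The only rule the sketch relies on is that values that compare equal also hash equally; the actual numbers returned by Combine are not meant to be stable between runs.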

Adding items to a hash code

The Add method helps you deal with collections:

public override int GetHashCode()
{
    var hashCode = new HashCode();
    hashCode.Add(this.object1);
    foreach (var item in this.collection)
    {
        hashCode.Add(item);
    }
    return hashCode.ToHashCode();
}
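When a collection takes part in the hash code, Equals has to compare that collection element by element as well, otherwise equal-looking objects end up in different buckets. A possible sketch, assuming a made-up Playlist class whose equality is defined by its title and the order of its tracks:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical example type; the names are illustrative only.
public sealed class Playlist : IEquatable<Playlist>
{
    public string Title { get; }
    public IReadOnlyList<string> Tracks { get; }

    public Playlist(string title, IEnumerable<string> tracks)
    {
        Title = title;
        Tracks = tracks.ToList(); // private copy, so the hashed state cannot change later
    }

    // Two playlists are equal when the title and the track order match.
    public bool Equals(Playlist other) =>
        other != null && Title == other.Title && Tracks.SequenceEqual(other.Tracks);

    public override bool Equals(object obj) => Equals(obj as Playlist);

    public override int GetHashCode()
    {
        var hashCode = new HashCode();
        hashCode.Add(Title);
        foreach (var track in Tracks)
        {
            hashCode.Add(track);
        }
        return hashCode.ToHashCode();
    }
}

public static class AddDemo
{
    public static void Main()
    {
        var a = new Playlist("Road trip", new[] { "one", "two" });
        var b = new Playlist("Road trip", new[] { "one", "two" });
        Console.WriteLine(a.Equals(b));                        // True
        Console.WriteLine(a.GetHashCode() == b.GetHashCode()); // True
    }
}
```

Copying the tracks in the constructor is a design choice of this sketch: it keeps the state that feeds the hash code from changing while the object sits in a hash-based collection.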

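GetHashCode Made Easy

An alternative to System.HashCode that is very easy to use while still being fast. You can read the full blog post "GetHashCode Made Easy" for more details and comments.

Usage example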

public class SuperHero
{
    public int Age { get; set; }
    public string Name { get; set; }
    public List<string> Powers { get; set; }

    public override int GetHashCode() =>
        HashCode.Of(this.Name).And(this.Age).AndEach(this.Powers);
}

Implementation

public struct HashCode : IEquatable<HashCode>
{
    private const int EmptyCollectionPrimeNumber = 19;
    private readonly int value;

    private HashCode(int value) => this.value = value;

    public static implicit operator int(HashCode hashCode) => hashCode.value;

    public static bool operator ==(HashCode left, HashCode right) => left.Equals(right);

    public static bool operator !=(HashCode left, HashCode right) => !(left == right);

    public static HashCode Of<T>(T item) => new HashCode(GetHashCode(item));

    public static HashCode OfEach<T>(IEnumerable<T> items) =>
        items == null ? new HashCode(0) : new HashCode(GetHashCode(items, 0));

    public HashCode And<T>(T item) =>
        new HashCode(CombineHashCodes(this.value, GetHashCode(item)));

    public HashCode AndEach<T>(IEnumerable<T> items)
    {
        if (items == null)
        {
            return new HashCode(this.value);
        }

        return new HashCode(GetHashCode(items, this.value));
    }

    public bool Equals(HashCode other) => this.value.Equals(other.value);

    public override bool Equals(object obj)
    {
        if (obj is HashCode)
        {
            return this.Equals((HashCode)obj);
        }

        return false;
    }

    public override int GetHashCode() => this.value.GetHashCode();

    private static int CombineHashCodes(int h1, int h2)
    {
        unchecked
        {
            // Code copied from System.Tuple a good way to combine hashes.
            return ((h1 << 5) + h1) ^ h2;
        }
    }

    private static int GetHashCode<T>(T item) => item?.GetHashCode() ?? 0;

    private static int GetHashCode<T>(IEnumerable<T> items, int startHashCode)
    {
        var temp = startHashCode;

        var enumerator = items.GetEnumerator();
        if (enumerator.MoveNext())
        {
            temp = CombineHashCodes(temp, GetHashCode(enumerator.Current));

            while (enumerator.MoveNext())
            {
                temp = CombineHashCodes(temp, GetHashCode(enumerator.Current));
            }
        }
        else
        {
            temp = CombineHashCodes(temp, EmptyCollectionPrimeNumber);
        }

        return temp;
    }
}

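What makes a good algorithm?

Performance

The algorithm that computes a hash code needs to be fast. A simple algorithm is usually faster. One that does not allocate extra memory also puts less pressure on the garbage collector, which in turn improves performance.

In C# hash functions in particular, you will often see the unchecked keyword, which turns off overflow checking to improve performance.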
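As a small illustration of what unchecked changes, the sketch below multiplies past int.MaxValue in both modes. The class and method names are invented for the example. Note that C# arithmetic on non-constant values is already unchecked unless the project turns overflow checking on, so the keyword mainly protects the hash function from throwing in projects that do.

```csharp
using System;

public static class OverflowDemo
{
    public static int MultiplyUnchecked(int hash, int factor)
    {
        unchecked { return hash * factor; } // wraps around silently
    }

    public static int MultiplyChecked(int hash, int factor)
    {
        checked { return hash * factor; }   // throws OverflowException on overflow
    }

    public static void Main()
    {
        Console.WriteLine(MultiplyUnchecked(int.MaxValue, 2)); // prints -2

        try
        {
            MultiplyChecked(int.MaxValue, 2);
        }
        catch (OverflowException)
        {
            Console.WriteLine("overflow detected");
        }
    }
}
```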

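Deterministic

The hashing algorithm needs to be deterministic, i.e. given the same input it must always produce the same output.

Reducing collisions

The algorithm that computes a hash code needs to keep hash collisions to a minimum. A hash collision occurs when two calls to GetHashCode on two different objects produce the same hash code. Note that collisions are allowed (some people mistakenly believe they are not), but they should be kept to a minimum.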
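The point that collisions are permitted can be checked directly. In this sketch, a made-up Pair class uses a deliberately weak XOR hash, so (1, 2) and (2, 1) collide, yet a Dictionary still tells them apart because it falls back on Equals within a bucket.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical key type with a deliberately weak hash (plain XOR of the fields).
public sealed class Pair : IEquatable<Pair>
{
    public int A { get; }
    public int B { get; }

    public Pair(int a, int b) { A = a; B = b; }

    public bool Equals(Pair other) => other != null && A == other.A && B == other.B;
    public override bool Equals(object obj) => Equals(obj as Pair);
    public override int GetHashCode() => A ^ B; // (1, 2) and (2, 1) both hash to 3
}

public static class CollisionDemo
{
    public static void Main()
    {
        var lookup = new Dictionary<Pair, string>
        {
            [new Pair(1, 2)] = "first",
            [new Pair(2, 1)] = "second",
        };

        Console.WriteLine(lookup[new Pair(1, 2)]); // first
        Console.WriteLine(lookup[new Pair(2, 1)]); // second
    }
}
```

The lookups stay correct; what suffers with many collisions is speed, because more keys have to be compared with Equals.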

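Many hash functions contain magic numbers like 17 or 23. These are special prime numbers which, because of their mathematical properties, help reduce hash collisions compared with using non-prime numbers.

Hash uniformity

A good hash function should map the expected inputs as evenly as possible over its output range, i.e. it should produce a wide spread of hash values for inputs that are themselves evenly distributed. It should have hash uniformity.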
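One rough way to see the difference in spread is to count how many distinct hash codes a function produces over a small grid of inputs. The sketch below compares a plain XOR with the multiply-and-add scheme using 17 and 23; the helper names and the 100 x 100 grid are arbitrary choices for the illustration.

```csharp
using System;
using System.Collections.Generic;

public static class UniformityDemo
{
    public static int XorHash(int x, int y) => x ^ y;

    public static int MultiplyAddHash(int x, int y)
    {
        unchecked
        {
            int hash = 17;
            hash = hash * 23 + x;
            hash = hash * 23 + y;
            return hash;
        }
    }

    // Counts how many distinct hash codes a function yields for a size x size grid.
    public static int CountDistinct(Func<int, int, int> hash, int size)
    {
        var seen = new HashSet<int>();
        for (int x = 0; x < size; x++)
        {
            for (int y = 0; y < size; y++)
            {
                seen.Add(hash(x, y));
            }
        }
        return seen.Count;
    }

    public static void Main()
    {
        Console.WriteLine(CountDistinct(XorHash, 100));         // 128
        Console.WriteLine(CountDistinct(MultiplyAddHash, 100)); // 2377
    }
}
```

Neither function gives each of the 10,000 pairs its own code here, but the multiply-and-add version spreads them over far more values. With these inputs the multiplier 23 is smaller than the range of y, which is one reason a larger multiplier can do better.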

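Preventing DoS

In .NET Core, you get different hash codes (for strings in particular) every time you restart an application. This is a security feature that guards against denial-of-service (DoS) attacks. For .NET Framework, you should enable this feature by adding the following App.config file: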

<?xml version ="1.0"?>
<configuration>
    <runtime>
        <UseRandomizedStringHashAlgorithm enabled="1" />
    </runtime>
</configuration>

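Because of this feature, hash codes should never be used outside the application domain in which they were created, they should never be used as key fields in a collection, and they should never be persisted.

Read more about this here.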
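If a value does have to be stored or sent to another process, one option is to compute it with an algorithm you control instead of GetHashCode. The following is a sketch of 32-bit FNV-1a over the UTF-8 bytes of a string; the StableHash class name is made up, and whether FNV-1a suits a particular data set is an assumption to verify.

```csharp
using System;
using System.Text;

public static class StableHash
{
    // 32-bit FNV-1a over the UTF-8 bytes of the text.
    // The result depends only on the input, not on the process or runtime.
    public static uint Fnv1a32(string text)
    {
        if (text == null) throw new ArgumentNullException(nameof(text));

        unchecked
        {
            uint hash = 2166136261; // FNV offset basis
            foreach (byte b in Encoding.UTF8.GetBytes(text))
            {
                hash ^= b;
                hash *= 16777619;   // FNV prime
            }
            return hash;
        }
    }

    public static void Main()
    {
        Console.WriteLine(Fnv1a32("a").ToString("x8")); // e40c292c
    }
}
```

This is still not a cryptographic hash; it only removes the dependence on per-process randomization.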

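Cryptographically secure?

The algorithm does not have to be a cryptographic hash function. This means it does not have to satisfy the following conditions:

  • It is infeasible to generate a message that yields a given hash value.
  • It is infeasible to find two different messages with the same hash value.
  • A small change to a message should change the hash value so extensively that the new hash value appears uncorrelated with the old one (the avalanche effect).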

If you want to polyfill HashCode from netstandard2.1:

public static class HashCode
{
    public static int Combine(params object[] instances)
    {
        int hash = 17;

        foreach (var i in instances)
        {
            hash = unchecked((hash * 31) + (i?.GetHashCode() ?? 0));
        }

        return hash;
    }
}

Note: if used with a struct, it will allocate memory due to boxing.
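One way around the object[] boxing is a generic overload per argument count. This is only a sketch: the HashCodeShim name is made up, it reuses the 17/31 scheme from the polyfill above, and a struct that does not override GetHashCode may still be boxed by the call itself.

```csharp
using System;

// Hypothetical helper; uses the same 17/31 scheme as the params-based polyfill.
public static class HashCodeShim
{
    public static int Combine<T1, T2>(T1 value1, T2 value2)
    {
        unchecked
        {
            int hash = 17;
            // No object[] is created, so value-type arguments are not boxed here.
            hash = (hash * 31) + (value1 == null ? 0 : value1.GetHashCode());
            hash = (hash * 31) + (value2 == null ? 0 : value2.GetHashCode());
            return hash;
        }
    }

    public static void Main()
    {
        Console.WriteLine(Combine(1, 2)); // 16370, the same value the params version gives
    }
}
```

More overloads (three arguments, four arguments, and so on) would follow the same pattern.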

You can try adopting the approach used by the C++ Boost library. Something like this:

class HashUtil
{
    public static int HashCombine(int seed, int other)
    {
        unchecked
        {
            // 0x9e3779b9 is a uint literal; the cast keeps the whole sum an int.
            return other + (int)0x9e3779b9 + (seed << 6) + (seed >> 2);
        }
    }
}

Then:

class MyClass
{
    private string _field1;
    private int _field2;
    private AnotherClass _field3;
    private YetAnotherClass _field4;

    public override int GetHashCode()
    {
        int result = HashUtil.HashCombine(_field1.GetHashCode(), _field2);
        result = HashUtil.HashCombine(result, _field3.GetHashCode());
        return HashUtil.HashCombine(result, _field4.GetHashCode());
    }
}
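For comparison, Boost's own hash_combine also XORs the result back into the seed (seed ^= h + 0x9e3779b9 + (seed << 6) + (seed >> 2)) and works on unsigned values. Below is a sketch of that variant which also tolerates null fields; BoostStyleHash and CombineValue are names invented for this example, not part of any library.

```csharp
using System;

public static class BoostStyleHash
{
    // Follows boost::hash_combine: seed ^= h + 0x9e3779b9 + (seed << 6) + (seed >> 2).
    public static int Combine(int seed, int other)
    {
        unchecked
        {
            uint s = (uint)seed;
            s ^= (uint)other + 0x9e3779b9u + (s << 6) + (s >> 2);
            return (int)s;
        }
    }

    // Null-safe convenience wrapper: a null value contributes a hash of 0.
    public static int CombineValue<T>(int seed, T value) =>
        Combine(seed, value == null ? 0 : value.GetHashCode());

    public static void Main()
    {
        string name = null; // a field that happens to be null
        int result = CombineValue(0, name);
        result = Combine(result, 42);
        Console.WriteLine(result);
    }
}
```

Doing the arithmetic in uint makes the right shift logical rather than arithmetic, which matches the unsigned type Boost uses.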

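I want to add my latest findings to this thread, which I come back to so often.

My current Visual Studio/project setup offers a refactoring that automatically converts tuples to structs. It generates a GetHashCode function like this: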

public override int GetHashCode()
{
    int hashCode = -2088324004;
    hashCode = hashCode * -1521134295 + AuftragGesperrt.GetHashCode();
    hashCode = hashCode * -1521134295 + Auftrag_gesperrt_von.GetHashCode();
    hashCode = hashCode * -1521134295 + Auftrag_gesperrt_am.GetHashCode();
    return hashCode;
}

Edit: to clarify, Auftrag_gesperrt_von and Auftrag_gesperrt_am are properties. If the Microsoft developers use this function, it is probably not too bad a solution.