C++ 标准库中的哈希函数：从 `std::hash` 到自定义哈希器

1. 引言

在上一篇中，我们介绍了哈希表为什么能够实现 O(1) 查找。
核心秘密在于：哈希函数。

在 C++ 标准库中，哈希表容器（如 unordered_map、unordered_set）依赖一个统一的机制：std::hash。
那么 std::hash 是什么？它能处理哪些类型？如果我要存储自定义类型，应该怎么写哈希函数呢？
本文将一一揭开这些问题。

2. 什么是 `std::hash`

std::hash 是 C++ 标准库提供的 函数对象模板。它为常见的基础类型（如 int、double、string 等）定义了默认哈希函数。

简单理解：

std::hash<T> 是一个类模板。
它重载了 operator()，返回一个 size_t 类型的哈希值。
默认可以直接用于 unordered_map 或 unordered_set。

示例代码

#include <iostream>
#include <string>
#include <functional>
using namespace std;int main() {hash<int> hashInt;hash<string> hashStr;cout << "Hash(42) = " << hashInt(42) << endl;cout << "Hash(\"Alice\") = " << hashStr("Alice") << endl;return 0;
}

输出的结果是一个整数（具体值依赖编译器实现），这个整数就是键的哈希值。

3. `std::hash` 支持哪些类型

标准库中，以下常见类型都有默认实现：

所有基本数值类型：int, long, double, char …
指针类型：T*
std::string 与 std::u16string 等字符串类

因此，对于这些类型，你可以直接使用 unordered_map：

#include <unordered_map>
#include <string>
using namespace std;int main() {unordered_map<string, int> score = {{"Alice", 90},{"Bob", 85}};return 0;
}

无需额外定义哈希函数。

4. 为什么需要自定义哈希函数

如果你要存储 自定义类型（例如坐标点、复数、pair 等），标准库并不知道如何为其生成哈希值。
这时就需要 自己提供哈希函数。

5. 自定义哈希函数的方法

方法一：自定义函数对象

#include <iostream>
#include <unordered_set>
using namespace std;struct Point {int x, y;bool operator==(const Point& other) const {return x == other.x && y == other.y;}
};struct PointHash {size_t operator()(const Point& p) const {return hash<int>()(p.x) ^ (hash<int>()(p.y) << 1);}
};int main() {unordered_set<Point, PointHash> points;points.insert({1, 2});points.insert({3, 4});cout << "size = " << points.size() << endl;return 0;
}

这里：

operator== 用于判断两个对象是否相等。
PointHash 提供哈希值。
注意用位运算（如 ^ 和 <<）来混合不同字段，减少冲突。

方法二：使用 `lambda`（C++14 起）

#include <unordered_map>
#include <string>
#include <functional>
#include <utility>
using namespace std;int main() {auto pair_hash = [](const pair<int,int>& p) {return hash<int>()(p.first) ^ (hash<int>()(p.second) << 1);};unordered_map<pair<int,int>, string, decltype(pair_hash)> mp(10, pair_hash);mp[{1,2}] = "坐标(1,2)";return 0;
}

这种方式更灵活，但语法稍微复杂，需要 decltype(pair_hash) 指定哈希类型。

方法三：偏特化 `std::hash`

C++ 允许为自定义类型对 std::hash 模板进行偏特化：

#include <unordered_set>
#include <string>
using namespace std;struct Student {string name;int id;bool operator==(const Student& other) const {return name == other.name && id == other.id;}
};namespace std {template<>struct hash<Student> {size_t operator()(const Student& s) const {return hash<string>()(s.name) ^ (hash<int>()(s.id) << 1);}};
}int main() {unordered_set<Student> students;students.insert({"Alice", 1001});students.insert({"Bob", 1002});return 0;
}