How to find the intersection of two STL sets?

I have been trying to find the intersection between two std::set in C++, but I keep getting an error.

I created a small sample test for this:

#include <iostream>
#include <vector>
#include <algorithm>
#include <set>
using namespace std;


int main() {
set<int> s1;
set<int> s2;


s1.insert(1);
s1.insert(2);
s1.insert(3);
s1.insert(4);


s2.insert(1);
s2.insert(6);
s2.insert(3);
s2.insert(0);


set_intersection(s1.begin(),s1.end(),s2.begin(),s2.end());
return 0;
}

This program does not generate any output, but I expect to get a new set (let's call it s3) with the following values:

s3 = [ 1 , 3 ]

Instead, I am getting an error:

test.cpp: In function ‘int main()’:
test.cpp:19: error: no matching function for call to ‘set_intersection(std::_Rb_tree_const_iterator<int>, std::_Rb_tree_const_iterator<int>, std::_Rb_tree_const_iterator<int>, std::_Rb_tree_const_iterator<int>)’

What I understand from this error is that there is no overload of set_intersection that accepts Rb_tree_const_iterator<int> as parameters.

Furthermore, I suppose the std::set::begin() method returns an object of this type.

Is there a better way to find the intersection of two std::set in C++? Preferably a built-in function?


You haven't provided an output iterator for set_intersection:

template <class InputIterator1, class InputIterator2, class OutputIterator>
OutputIterator set_intersection ( InputIterator1 first1, InputIterator1 last1,
InputIterator2 first2, InputIterator2 last2,
OutputIterator result );

Fix this by doing something like

...;
set<int> intersect;
set_intersection(s1.begin(), s1.end(), s2.begin(), s2.end(),
std::inserter(intersect, intersect.begin()));

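You need std::inserter (which creates a std::insert_iterator) because the target set starts out empty, so the results have to be inserted rather than written over existing elements. std::back_inserter and std::front_inserter cannot be used here, since std::set has no push_back or push_front.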
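To make the fix above concrete, here is a minimal sketch that wraps it in a function. The name intersect_sets is made up for illustration; it is not a standard library function.

```cpp
#include <algorithm>
#include <iterator>
#include <set>

// Hypothetical helper (not part of the standard library): returns the
// elements common to both sets as a new std::set.
std::set<int> intersect_sets(const std::set<int>& a, const std::set<int>& b)
{
    std::set<int> result;
    // std::inserter calls result.insert(hint, value) for every element that
    // set_intersection writes, which is what std::set supports.
    std::set_intersection(a.begin(), a.end(), b.begin(), b.end(),
                          std::inserter(result, result.begin()));
    return result;
}
```

With the sets from the question, intersect_sets(s1, s2) should give a set holding 1 and 3.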

Have a look at the sample in the link: http://en.cppreference.com/w/cpp/algorithm/set_intersection

You need another container to store the intersection data. The code below should work:

std::vector<int> common_data;
set_intersection(s1.begin(),s1.end(),s2.begin(),s2.end(), std::back_inserter(common_data));

See std::set_intersection. You have to add an output iterator, where the result will be stored:

#include <iterator>
std::vector<int> s3;
set_intersection(s1.begin(),s1.end(),s2.begin(),s2.end(), std::back_inserter(s3));

See Ideone for the full listing.

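Just a comment here. I think it is time to add union and intersection operations to the set interface. Let's propose this for a future standard. I have been using the standard library for a long time, and each time I use the set operations I wish it were better. For some complicated set operations, like intersection, you may simply (more easily?) modify the following code: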

template <class InputIterator1, class InputIterator2, class OutputIterator>
OutputIterator set_intersection (InputIterator1 first1, InputIterator1 last1,
InputIterator2 first2, InputIterator2 last2,
OutputIterator result)
{
while (first1!=last1 && first2!=last2)
{
if (*first1<*first2) ++first1;
else if (*first2<*first1) ++first2;
else {
*result = *first1;
++result; ++first1; ++first2;
}
}
return result;
}

Copied from http://www.cplusplus.com/reference/algorithm/set_intersection/

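For example, if your output is a set, you can call output.insert(*first1). Furthermore, your function need not be a template. If your code can be shorter than a call to std::set_intersection, then go ahead with it.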
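A rough sketch of what such a modified, non-template version of the loop above might look like, assuming both inputs are std::set<int> and the output is a set as well. The name intersect_by_hand is made up for illustration.

```cpp
#include <set>

// Hypothetical non-template variant of the reference loop: walks both
// sorted sets in step and inserts each common element into the output set.
std::set<int> intersect_by_hand(const std::set<int>& a, const std::set<int>& b)
{
    std::set<int> output;
    auto first1 = a.begin();
    auto first2 = b.begin();
    while (first1 != a.end() && first2 != b.end()) {
        if (*first1 < *first2) ++first1;        // only in a, skip it
        else if (*first2 < *first1) ++first2;   // only in b, skip it
        else {
            output.insert(*first1);             // present in both sets
            ++first1; ++first2;
        }
    }
    return output;
}
```

Whether this is really shorter or clearer than calling std::set_intersection with std::inserter is a matter of taste.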

If you want the union of two sets, you can simply call setA.insert(setB.begin(), setB.end()); this is much simpler than the set_union approach. However, this will not work with vectors.
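A small sketch comparing the two ways of building a union, assuming std::set<int>. Both function names are made up for illustration.

```cpp
#include <algorithm>
#include <iterator>
#include <set>

// Union via range insert: std::set keeps its keys unique, so elements of b
// that are already in a are simply not inserted again.
std::set<int> union_by_insert(std::set<int> a, const std::set<int>& b)
{
    a.insert(b.begin(), b.end());
    return a;
}

// The same union via std::set_union, for comparison.
std::set<int> union_by_algorithm(const std::set<int>& a, const std::set<int>& b)
{
    std::set<int> result;
    std::set_union(a.begin(), a.end(), b.begin(), b.end(),
                   std::inserter(result, result.end()));
    return result;
}
```

A std::vector does not enforce uniqueness or ordering, so a range insert there would just add every element of the second range, duplicates included. That is presumably why the shortcut is limited to sets.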

The first (well-voted) comment on the accepted answer complains about missing operators for the existing std set operations.

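On one hand, I understand the lack of such operators in the standard library. On the other hand, it is easy to add them (for personal joy) if desired. I overloaded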

  • operator *() for intersection of sets
  • operator +() for union of sets.

Sample test-set-ops.cc:

#include <algorithm>
#include <iterator>
#include <set>


template <class T, class CMP = std::less<T>, class ALLOC = std::allocator<T> >
std::set<T, CMP, ALLOC> operator * (
const std::set<T, CMP, ALLOC> &s1, const std::set<T, CMP, ALLOC> &s2)
{
std::set<T, CMP, ALLOC> s;
std::set_intersection(s1.begin(), s1.end(), s2.begin(), s2.end(),
std::inserter(s, s.begin()));
return s;
}


template <class T, class CMP = std::less<T>, class ALLOC = std::allocator<T> >
std::set<T, CMP, ALLOC> operator + (
const std::set<T, CMP, ALLOC> &s1, const std::set<T, CMP, ALLOC> &s2)
{
std::set<T, CMP, ALLOC> s;
std::set_union(s1.begin(), s1.end(), s2.begin(), s2.end(),
std::inserter(s, s.begin()));
return s;
}


// sample code to check them out:


#include <iostream>


using namespace std;


template <class T>
ostream& operator << (ostream &out, const set<T> &values)
{
const char *sep = " ";
for (const T &value : values) {
out << sep << value; sep = ", ";
}
return out;
}


int main()
{
set<int> s1 { 1, 2, 3, 4 };
cout << "s1: {" << s1 << " }" << endl;
set<int> s2 { 0, 1, 3, 6 };
cout << "s2: {" << s2 << " }" << endl;
cout << "I: {" << s1 * s2 << " }" << endl;
cout << "U: {" << s1 + s2 << " }" << endl;
return 0;
}

Compiled and tested:

$ g++ -std=c++11 -o test-set-ops test-set-ops.cc


$ ./test-set-ops
s1: { 1, 2, 3, 4 }
s2: { 0, 1, 3, 6 }
I: { 1, 3 }
U: { 0, 1, 2, 3, 4, 6 }


$

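What I don't like is the copying of the return values in the operators. Maybe this could be solved using move assignment, but that is still beyond my skills.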

Due to my limited knowledge about these "new fancy" move semantics, I was concerned about the operator returns which might cause copies of the returned sets. Olaf Dietsche pointed out that these concerns are unnecessary as std::set is already equipped with move constructor/assignment.

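Although I believed him, I was thinking about how to check this out (for something like "self-convincing"). Actually, it is quite easy. As templates have to be provided in source code, you can simply step through them with the debugger. So I placed a breakpoint right at the return s; of operator *() and proceeded with single-stepping, which led me immediately into std::set::set(_myt&& _Right): et voilà, the move constructor. Thanks, Olaf, for the (my) enlightenment.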
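For anyone who prefers a check in code over stepping through with a debugger, here is a rough sketch. It assumes that the move constructor of std::set hands over the existing tree nodes, which is how the common standard library implementations behave, so the address of an element stays the same after the move. The function name move_keeps_nodes is made up for illustration.

```cpp
#include <set>
#include <utility>

// Hypothetical check: returns true if moving a std::set<int> transfers the
// existing nodes instead of copying the elements into new ones.
bool move_keeps_nodes()
{
    std::set<int> source{1, 3};
    const int* before = &*source.begin();      // address of the element 1
    std::set<int> target = std::move(source);  // move construction
    const int* after = &*target.begin();       // first element of the new set
    return before == after;                    // same address: nothing was copied
}
```

A copy would have to allocate new nodes, so the two addresses would differ in that case.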

For the sake of completeness, I implemented the corresponding assignment operators as well:

  • operator *=() for "destructive" intersection of sets
  • operator +=() for "destructive" union of sets.

Sample test-set-assign-ops.cc:

#include <iterator>
#include <set>


template <class T, class CMP = std::less<T>, class ALLOC = std::allocator<T> >
std::set<T, CMP, ALLOC>& operator *= (
std::set<T, CMP, ALLOC> &s1, const std::set<T, CMP, ALLOC> &s2)
{
auto iter1 = s1.begin();
for (auto iter2 = s2.begin(); iter1 != s1.end() && iter2 != s2.end();) {
if (*iter1 < *iter2) iter1 = s1.erase(iter1);
else {
if (!(*iter2 < *iter1)) ++iter1;
++iter2;
}
}
while (iter1 != s1.end()) iter1 = s1.erase(iter1);
return s1;
}


template <class T, class CMP = std::less<T>, class ALLOC = std::allocator<T> >
std::set<T, CMP, ALLOC>& operator += (
std::set<T, CMP, ALLOC> &s1, const std::set<T, CMP, ALLOC> &s2)
{
s1.insert(s2.begin(), s2.end());
return s1;
}


// sample code to check them out:


#include <iostream>


using namespace std;


template <class T>
ostream& operator << (ostream &out, const set<T> &values)
{
const char *sep = " ";
for (const T &value : values) {
out << sep << value; sep = ", ";
}
return out;
}


int main()
{
set<int> s1 { 1, 2, 3, 4 };
cout << "s1: {" << s1 << " }" << endl;
set<int> s2 { 0, 1, 3, 6 };
cout << "s2: {" << s2 << " }" << endl;
set<int> s1I = s1;
s1I *= s2;
cout << "s1I: {" << s1I << " }" << endl;
set<int> s2I = s2;
s2I *= s1;
cout << "s2I: {" << s2I << " }" << endl;
set<int> s1U = s1;
s1U += s2;
cout << "s1U: {" << s1U << " }" << endl;
set<int> s2U = s2;
s2U += s1;
cout << "s2U: {" << s2U << " }" << endl;
return 0;
}

Compiled and tested:

$ g++ -std=c++11 -o test-set-assign-ops test-set-assign-ops.cc


$ ./test-set-assign-ops
s1: { 1, 2, 3, 4 }
s2: { 0, 1, 3, 6 }
s1I: { 1, 3 }
s2I: { 1, 3 }
s1U: { 0, 1, 2, 3, 4, 6 }
s2U: { 0, 1, 2, 3, 4, 6 }


$

To keep the interface simple, you can copy/paste this template:

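// Note: the deduced return type (auto) requires C++14 or later.
template<typename Type>
auto setIntersection(const std::set<Type>& set0, const std::set<Type>& set1)
{
    std::set<Type> intersection;
    // Keep every element of set0 that is also present in set1.
    for (const auto& value : set0)
        if (set1.find(value) != set1.end())
            intersection.insert(value);
    return intersection;
}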

Then in your case:

intersection = setIntersection<int>(s1, s2);

or

intersection = setIntersection(s1, s2);