使用 std: : Vector 作为原始内存的视图

我正在使用一个外部库,它在某个时刻给我一个原始指针,指向一个整数数组和一个大小。

现在,我想使用 std::vector就地访问和修改这些值,而不是使用原始指针访问它们。

这里有一个人为的例子来解释这一点:

size_t size = 0;
int * data = get_data_from_library(size);   // raw data from library {5,3,2,1,4}, size gets filled in


std::vector<int> v = ????;                  // pseudo vector to be used to access the raw data


std::sort(v.begin(), v.end());              // sort raw data in place


for (int i = 0; i < 5; i++)
{
std::cout << data[i] << "\n";             // display sorted raw data
}

预期产出:

1
2
3
4
5

原因是我需要对这些数据应用来自 <algorithm>的算法(排序、交换元素等)。

另一方面,改变该矢量的大小永远不会改变,所以 push_backeraseinsert不需要在该矢量上工作。

我可以根据库中的数据构造一个向量,修改这个向量,然后将数据复制回库中,但这将是两个完整的副本,因为数据集可能非常大,我希望避免这样做。

19490 次浏览

Since the algorithm-library works with iterators you can keep the array.

For pointers and known array length

Here you can use raw pointers as iterators. They support all the opertations an iterator supports (increment, comparison for equality, value of, etc...):

#include <iostream>
#include <algorithm>


int *get_data_from_library(int &size) {
static int data[] = {5,3,2,1,4};


size = 5;


return data;
}




int main()
{
int size;
int *data = get_data_from_library(size);


std::sort(data, data + size);


for (int i = 0; i < size; i++)
{
std::cout << data[i] << "\n";
}
}

data points to the dirst array member like an iterator returned by begin() and data + size points to the element after the last element of the array like an iterator returned by end().

For arrays

Here you can use std::begin() and std::end()

#include <iostream>
#include <algorithm>


int main()
{
int data[] = {5,3,2,1,4};         // raw data from library


std::sort(std::begin(data), std::end(data));    // sort raw data in place


for (int i = 0; i < 5; i++)
{
std::cout << data[i] << "\n";   // display sorted raw data
}
}

But keep in mind that this only works, if data does not decay to a pointer, because then length information goes missing.

You can get iterators on raw arrays and use them in algorithms:

    int data[] = {5,3,2,1,4};
std::sort(std::begin(data), std::end(data));
for (auto i : data) {
std::cout << i << std::endl;
}

If you are working with raw pointers (ptr + size), then you can use the following technique:

    size_t size = 0;
int * data = get_data_from_library(size);
auto b = data;
auto e = b + size;
std::sort(b, e);
for (auto it = b; it != e; ++it) {
cout << *it << endl;
}

UPD: However, the above example is of bad design. The library returns us a raw pointer and we don't know where the underlying buffer is allocated and who is supposed to free it.

Usually, the caller provides a buffered for the function to fill the data. In that case, we can preallocate the vector and use its underlying buffer:

    std::vector<int> v;
v.resize(256); // allocate a buffer for 256 integers
size_t size = get_data_from_library(v.data(), v.size());
// shrink down to actual data. Note that no memory realocations or copy is done here.
v.resize(size);
std::sort(v.begin(), v.end());
for (auto i : v) {
cout << i << endl;
}

When using C++11 or above we can even make get_data_from_library() to return a vector. Thanks to move operations, there will be no memory copy.

You can't do this with a std::vector without making a copy. std::vector owns the pointer it has under the hood and allocates space through the allocator that is provided.

If you have acess to a compiler that has support for C++20 you could use std::span which was built for exactly this purpose. It wraps a pointer and size into a "container" that has the C++ container interface.

If not, you can use gsl::span which is what the standard version was based off of.

If you don't want to import another library you could trivially implement this yourself depending on what all functionality you want to have.

You actually could almost use std::vector for this, by abusing the custom allocator functionality to return a pointer to the memory you want to view. That wouldn't be guaranteed by the standard to work (padding, alignment, initialization of the returned values; you'd have to take pains when assigning the initial size, and for non-primitives you'd also need to hack up your constructors), but in practice I would expect it to given enough tweaks.

Never ever ever do that. It's ugly, surprising, hacky, and unnecessary. The standard library's algorithms are already designed to work as well with raw arrays as with vectors. See the other answers for details of that.

C++20's std::span

If you are able to use C++20, you could use std::span which is a pointer - length pair that gives the user a view into a contiguous sequence of elements. It is some sort of a std::string_view, and while both std::span and std::string_view are non-owning views, std::string_view is a read-only view.

From the docs:

The class template span describes an object that can refer to a contiguous sequence of objects with the first element of the sequence at position zero. A span can either have a static extent, in which case the number of elements in the sequence is known and encoded in the type, or a dynamic extent.

So the following would work:

#include <span>
#include <iostream>
#include <algorithm>


int main() {
int data[] = { 5, 3, 2, 1, 4 };
std::span<int> s{data, 5};


std::sort(s.begin(), s.end());


for (auto const i : s) {
std::cout << i << "\n";
}


return 0;
}

Check it out live

Since std::span is basically pointer - length pair, you can use in a following manner too:

size_t size = 0;
int *data = get_data_from_library(size);
std::span<int> s{data, size};

Note: Not all compilers support std::span. Check compiler support here.

UPDATE

If you are not able to use C++20, you could use gsl::span which is basically the base version of the C++ standard's std::span.

C++11 solution

If you are limited to C++11 standard, you can try implementing your own simple span class:

template<typename T>
class span {
T* ptr_;
std::size_t len_;


public:
span(T* ptr, std::size_t len) noexcept
: ptr_{ptr}, len_{len}
{}


T& operator[](int i) noexcept {
return *ptr_[i];
}


T const& operator[](int i) const noexcept {
return *ptr_[i];
}


std::size_t size() const noexcept {
return len_;
}


T* begin() noexcept {
return ptr_;
}


T* end() noexcept {
return ptr_ + len_;
}
};

Check out C++11 version live

The problem is that std::vector has to make a copy of the elements from the array you initialize it with as it has the ownership of the objects it contains.

To avoid this, you can use a slice object for an array (i.e., similar to what std::string_view is to std::string). You could write your own array_view class template implementation whose instances are constructed by taking a raw pointer to an array's first element and the array length:

#include <cstdint>


template<typename T>
class array_view {
T* ptr_;
std::size_t len_;
public:
array_view(T* ptr, std::size_t len) noexcept: ptr_{ptr}, len_{len} {}


T& operator[](int i) noexcept { return ptr_[i]; }
T const& operator[](int i) const noexcept { return ptr_[i]; }
auto size() const noexcept { return len_; }


auto begin() noexcept { return ptr_; }
auto end() noexcept { return ptr_ + len_; }
};

array_view doesn't store an array; it just holds a pointer to the beginning of the array and the length of that array. Therefore, array_view objects are cheap to construct and to copy.

Since array_view provides the begin() and end() member functions, you can use the standard library algorithms (e.g., std::sort, std::find, std::lower_bound, etc.) on it:

#define LEN 5


auto main() -> int {
int arr[LEN] = {4, 5, 1, 2, 3};


array_view<int> av(arr, LEN);


std::sort(av.begin(), av.end());


for (auto const& val: av)
std::cout << val << ' ';
std::cout << '\n';
}

Output:

1 2 3 4 5

Use std::span (or gsl::span) instead

The implementation above exposes the concept behind slice objects. However, since C++20 you can directly use std::span instead. In any case, you can use gsl::span since C++14.

You could use a std::reference_wrapper available since C++11:

#include <iostream>
#include <iterator>
#include <vector>
#include <algorithm>


int main()
{
int src_table[] = {5, 4, 3, 2, 1, 0};


std::vector< std::reference_wrapper< int > > dest_vector;


std::copy(std::begin(src_table), std::end(src_table), std::back_inserter(dest_vector));
// if you don't have the array defined just a pointer and size then:
// std::copy(src_table_ptr, src_table_ptr + size, std::back_inserter(dest_vector));


std::sort(std::begin(dest_vector), std::end(dest_vector));


std::for_each(std::begin(src_table), std::end(src_table), [](int x) { std::cout << x << '\n'; });
std::for_each(std::begin(dest_vector), std::end(dest_vector), [](int x) { std::cout << x << '\n'; });
}

Besides the other good suggestion about std::span coming in and gsl:span, including your own (lightweight) span class until then is easy enough already (feel free to copy):

template<class T>
struct span {
T* first;
size_t length;
span(T* first_, size_t length_) : first(first_), length(length_) {};
using value_type = std::remove_cv_t<T>;//primarily needed if used with templates
bool empty() const { return length == 0; }
auto begin() const { return first; }
auto end() const { return first + length; }
};


static_assert(_MSVC_LANG <= 201703L, "remember to switch to std::span");

Of special note is also the boost range library if you are interested in the more generic range concept: https://www.boost.org/doc/libs/1_60_0/libs/range/doc/html/range/reference/utilities/iterator_range.html.

Range concepts will also arrive in

Now I'd like to use std::vector to access and modify these values in place

You cannot. That's not what std::vector is for. std::vector manages its own buffer, which is always acquired from an allocator. It never takes ownership of another buffer (except from another vector of same type).

On the other hand, you also don't need to because ...

The reason is that I need to apply algorithms from (sorting, swaping elements etc.) on that data.

Those algorithms work on iterators. A pointer is an iterator to an array. You don't need a vector:

std::sort(data, data + size);

Unlike function templates in <algorithm>, some tools such as range-for,std::begin/std::end and C++20 ranges do not work with just a pair of iterators though, while they do work with containers such as vectors. It is possible to create a wrapper class for iterator + size that behaves as a range, and works with these tools. C++20 will introduce such wrapper into the standard library: std::span.

As others have pointed out, std::vector must own the underlying memory (short of messing with a custom allocator) so can't be used.

Others have also recommended c++20's span, however obviously that requires c++20.

I would recommend the span-lite span. To quote it's subtitle:

span lite - A C++20-like span for C++98, C++11 and later in a single-file header-only library

It provides a non-owning and mutable view (as in you can mutate elements and their order but not insert them) and as the quote says has no dependencies and works on most compilers.

Your example:

#include <algorithm>
#include <cstddef>
#include <iostream>


#include <nonstd/span.hpp>


static int data[] = {5, 1, 2, 4, 3};


// For example
int* get_data_from_library()
{
return data;
}


int main ()
{
const std::size_t size = 5;


nonstd::span<int> v{get_data_from_library(), size};


std::sort(v.begin(), v.end());


for (auto i = 0UL; i < v.size(); ++i)
{
std::cout << v[i] << "\n";
}
}

Prints

1
2
3
4
5

This also has the added upside if one day you do switch to c++20, you should just be able to replace this nonstd::span with std::span.

A full implementation of ArrayView written in C++:

template<typename T>
class ArrayView {
public:
using value_type = T;
using const_iterator = const T*;


ArrayView(T* ptr, size_t size) noexcept : ptr_(ptr), size_(size) {}
  

template <typename U, size_t N>
ArrayView(U (&buffer)[N]) noexcept : ArrayView(buffer, N) {}


// ArrayView<T> to ArraryView<const T>
// std::vector<T> to ArraryView<const T> or ArraryView<T>
template <
typename U,
// Container has data and size
typename std::enable_if<
std::is_convertible<decltype(std::declval<U>().data()), T*>::value &&
std::is_convertible<decltype(std::declval<U>().size()), std::size_t>::value
>::type* = nullptr
>
ArrayView(const U& u) noexcept : ArrayView(u.data(), u.size()) {}


T& operator[](int i) noexcept { return ptr_[i]; }
T const& operator[](int i) const noexcept { return  ptr_[i]; }
T* data() const noexcept { return ptr_; }
size_t size() const noexcept { return size_; };


T* begin() const noexcept { return this->data(); }
T* end() const noexcept { return this->data() + this->size(); }
const T* cbegin() const { return this->data(); }
const T* cend() const { return this->data() + this->size(); }


std::reverse_iterator<T*> rbegin() const {
return std::make_reverse_iterator(end());
}
std::reverse_iterator<T*> rend() const {
return std::make_reverse_iterator(begin());
}
std::reverse_iterator<const T*> crbegin() const {
return std::make_reverse_iterator(cend());
}
std::reverse_iterator<const T*> crend() const {
return std::make_reverse_iterator(cbegin());
}


ArrayView<T> subview(size_t offset, size_t size) const noexcept {
return offset < this->size() ? ArrayView<T>(this->data() + offset, std::min(size, this->size() - offset))
: ArrayView<T>(nullptr, 0);
}


ArrayView<T> subview(size_t offset) const noexcept {
return subview(offset, this->size());
}


private:
T* ptr_;
size_t size_;
};