P0401R5
Providing size feedback in the Allocator interface

Published Proposal,

This version:
https://wg21.link/P0401R5
Authors:
(Google)
Audience:
LWG
Project:
ISO/IEC JTC1/SC22/WG21 14882: Programming Language — C++

Abstract

Utilize size feedback from Allocator to reduce spurious reallocations

1. Introduction

This is a library paper to describe how size feedback can be used with allocators:

2. Motivation

Consider code adding elements to vector:

std::vector<int> v = {1, 2, 3};
// Expected: v.capacity() == 3

// Add an additional element, triggering a reallocation.
v.push_back(4);

Many allocators only allocate in fixed-sized chunks of memory, rounding up requests. Our underlying heap allocator received a request for 12 bytes (3 * sizeof(int)) while constructing v. For several implementations, this request is turned into a 16 byte region.

2.1. Why not realloc

realloc poses several problems problems:

For fixed-size allocators it makes more sense, and is much simpler for containers to adapt to, if the allocator is able to over-allocate on the initial request and inform the caller how much memory was made available.

2.2. Why not ask the allocator for the size

We could also explore APIs that answer the question: "If I ask for N bytes, how many do I actually get?" [jemalloc] and [TCMalloc] call this nallocx.

See also [P0901R5]'s discussion "nallocx: not as awesome as it looks."

3. Proposal

Wording relative to [N4868].

We propose a free function to return the allocation size simultaneously with an allocation.

Amend [memory.syn]:

namespace std {
  // ... elided ...

  // [allocator.traits], allocator traits
  template<class Alloc> struct allocator_traits;

  template<class Pointer>
  struct allocation_result {
    Pointer ptr;
    size_t count;
  };

  template<class Allocator>
  constexpr allocation_result<typename allocator_traits::pointer> allocate_at_least(
    Allocator& a, size_t n);

  // [default.allocator], the default allocator
  template<class T> class allocator;
  template<class T, class U>
    constexpr bool operator==(const allocator<T>&, const allocator<U>&) noexcept;

  // ... elided ...
}

Amend [allocator.traits.types] to ensure structured binding compatibility:

The class template allocation_result has the template parameters, data members, and special members specified above. It has no base classes or members other than those specified.

Insert [allocator.traits.other]:

template<class Allocator>
[[nodiscard]] constexpr allocation_result<typename allocator_traitss:pointer>
  allocate_at_least(Allocator& a, size_t n);

Amend [default.allocator.general]:

namespace std {
  template<class T> class allocator {
   public:
    using value_type                             = T;
    using size_type                              = size_t;
    using difference_type                        = ptrdiff_t;
    using propagate_on_container_move_assignment = true_type;
    using is_always_equal                        = true_type;

    constexpr allocator() noexcept;
    constexpr allocator(const allocator&) noexcept;
    template<class U> constexpr allocator(const allocator<U>&) noexcept;
    constexpr ~allocator();
    constexpr allocator& operator=(const allocator&) = default;

    [[nodiscard]] constexpr T* allocate(size_t n);
    [[nodiscard]] constexpr allocation_result<T*> allocate_at_least(size_t n);
    constexpr void deallocate(T* p, size_t n);
  };
}

Amend [allocator.members]:

Except for the destructor, member functions of the default allocator shall not introduce data races as a result of concurrent calls to those member functions from different threads. Calls to these functions that allocate or deallocate a particular unit of storage shall occur in a single total order, and each such deallocation call shall happen before the next allocation (if any) in this order.

[[nodiscard]] constexpr T* allocate(size_t n);
[[nodiscard]] constexpr allocation_result<T*> allocate_at_least(size_t n);
constexpr void deallocate(T* p, size_t n);

Amend [allocator.requirements.general], [tab:cpp17.allocator]. The return type X::pointer is used in the table for consistency with a.allocate but should be resolved according to the LWG issue.

a.allocate(n) X::pointer Memory is allocated for an array of n T and such an object is created but array elements are not constructed.

[Example 1: When reusing storage denoted by some pointer value p, launder(reinterpret_­cast<T*>(new (p) byte[n * sizeof(T)])) can be used to implicitly create a suitable array object and obtain a pointer to it. — end example]

allocate may throw an appropriate exception.

[Note 1: If n == 0, the return value is unspecified. — end note] See Note C, below.
a.allocate(n, y) X::pointer Same as a.allocate(n). The use of y is unspecified, but it is intended as an aid to locality. a.allocate(n)
a.allocate_at_least(n) allocation_result<X::pointer> Returns: allocation_result<X::pointer>{ptr, count} where p is memory is allocated for an array of count T and such an object is created but array elements are not constructed, such that count >= n. allocate_at_least may throw an appropriate exception. See Note C, below. See Note D, below.
a.deallocate(p,n) (not used)
  • Preconditions: p is a value returned by an earlier call to allocate that has not been invalidated by an intervening call to deallocate. n matches the value passed to allocate to obtain this memory.
  • Preconditions:
    • If memory was obtained by a call to a.allocate_at_least, let ret be the value returned and req be the value passed as the first argument. p is equal to ret.ptr and n is a value that satisfies req <= n <= ret.count .
    • Otherwise, p is a pointer value obtained from allocate(). n equals the value passed as the first argument to the invocation of allocate which returned p.
    p has not been invalidated by an intervening call to deallocate.
  • Throws: Nothing.

... elided ...

Note C: If n == 0, the return value is unspecified. Note D: An allocator need not support allocate_at_least, but no default is provided in allocator_traits. If an allocator has an allocate_at_least member, it must satisify the requirements.

Amend [version.syn]

#define __cpp_lib_allocate_at_least PLACEHOLDER // also in <memory>

4. Design Considerations

4.1. allocate selection

There are multiple approaches here:

4.2. Size Return Value

In [Prague], LEWG discussed the return value. For compatibility with the existing allocator APIs, which work in units of objects rather than bytes, this proposal chooses to continue to return an integer number of objects.

Additionally, for types with non-trivial alignment requirements, we must allocate storage for objects, rather than bytes, as raw bytes do not convey the appropriate alignment needs to the allocator.

For example: In the std::vector<T> case, many implementations use 3 pointers for representing the state of the container (begin, end, and capacity). If we preserved the precise value returned by the underlying allocator, we may not be able to legally form the capacity pointer. For these implementations, replacing the capacity pointer with a capacity in bytes would be an ABI break.

4.3. deallocate changes

We now require flexibility on the size we pass to deallocate. For container types using this allocator interface, they are faced with the choice of storing both the original size request as well as the provided size (requiring additional auxillary storage), or being unable to reproduce one during the call to deallocate.

As the true size is expected to be useful for the capacity of a string or vector instance, the returned size is available without additional storage requirements. The original (requested) size is unavailable, so we relax the deallocate size parameter requirements for these allocations.

In [P0901R5], we anticipate use cases where the user does not store enough information about the size to return a precise, byte-level count. We may be allocating T's and receive enough storage to allocate 3.5 objects. Requiring callers return the precise size may be impractical (std::vector frequently stores its capacity as a T*) without additionally auxillary storage.

In this paper, we may be similarly unable to return the true size back. In the [Prague] review of this paper, we discussed cases where we rounded down to the multiple of the size being allocated. Alternatively, custom data structures may use allocator to obtain storage, but quantize that. In Abseil’s [Cord], data is stored in chunks with a length-encoding tag at the start of each chunk. These tags quantize the capacity in AllocatedSizeToTag that can be represented to multiples of 16- and 32-bytes. This makes the encoded length more compact, but means we cannot reproduce the size the allocator provided us precisely.

4.4. Interaction with polymorphic_allocator

std::pmr::memory_resource is implemented using virtual functions. Adding new methods, such as the proposed allocate API would require taking an ABI break.

4.5. Zero-Sized Requests

In Prague, LEWG discussed the behavior of allocate_at_least(allocator, 0). This maximizes implementation freedom.

5. Revision History

5.1. R4 → R5

Applied wording feedback from [LWGNovTelecon].

5.2. R3 → R4

5.3. R2 → R3

Applied LEWG feedback from [Prague].

Poll: We prefer to return number of bytes.
SF F N A SA
0 0 6 6 5
POLL: Change the template parameter to sized_ptr_t to pointer type (to support fancy ptr).

Unanimous consent

POLL: We want to deduce the object type from the allocator (alloc_least_n [sic]).
SF F N A SA
4 8 1 0 0
POLL: We like the design, please revise and do initial wording review (with Pablo, and David Stone) send to LWG (tentatively Ready).
SF F N A SA
6 12 0 0 0

5.4. R1 → R2

Applied LEWG feedback from [Cologne].

Poll: We want a feature like this.
SF F N A SA
2 9 2 0 0
Poll: Name choice
5 allocate_with_size
4 overallocate
2 allocate_with_oversubscribe
6 allocate_oversize
14 allocate_at_least

Poll: Prefer a struct rather than a requirement of structured bindings.

SF F N A SA
2 8 0 2 1

References

Informative References

[Cologne]
Cologne Meeting Minutes. 2019-07-17. URL: http://wiki.edg.com/bin/view/Wg21cologne2019/P0401
[Cord]
absl::Cord. URL: https://github.com/abseil/abseil-cpp/blob/master/absl/strings/cord.h
[JEMALLOC]
jemalloc(3) - Linux man page. URL: http://jemalloc.net/jemalloc.3.html
[LWG3373]
{to,from}_chars_result and format_to_n_result need the 'we really mean what we say' wording. URL: https://cplusplus.github.io/LWG/lwg-active.html#3373
[LWGNovTelecon]
November 2020 LWG Review. URL: https://wiki.edg.com/bin/view/Wg21fall2020/P0401R4-2020-11-20
[N3495]
Ariane van der Steldt. inplace realloc. 7 December 2012. URL: https://wg21.link/n3495
[N4868]
Working Draft, Standard for Programming Language C++. 2020-10-18. URL: http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2020/n4868.pdf
[P0894R1]
realloc for C++. 2019-01-18. URL: http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2019/p0894r1.md
[P0901R5]
Size feedback in operator new. 2019-10-03. URL: http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2019/p0901r5.html
[P0901R5-Review]
P0901R5 LEWG Review Prague. 2020-02-10. URL: http://wiki.edg.com/bin/view/Wg21prague/P0901
[Prague]
P0401R2 LEWG Review Prague. 2020-02-14. URL: http://wiki.edg.com/bin/view/Wg21prague/P0401
[TCMalloc]
TCMalloc. URL: https://github.com/google/tcmalloc