| __BUILTIN_PREFETCH(3) | Library Functions Manual | __BUILTIN_PREFETCH(3) |
__builtin_prefetch —
GNU extension to prefetch memory
void
__builtin_prefetch(const void *addr, ...);
The __builtin_prefetch() function prefetches memory from addr. The rationale is
to minimize cache-miss latency by moving data into a cache before the data is
accessed. Possible use cases include frequently called sections of code in
which it is known that the data at a given address is likely to be accessed
soon.
In addition to addr, there are two optional stdarg(3) arguments, rw and locality, given in that order. Both must be compile-time constant integers.
The value of rw is either 0 or 1, corresponding to a read prefetch and a write prefetch, respectively. The default value of rw is 0.
The value of locality must be between 0 and 3, inclusive. The higher the value, the higher the temporal locality of the data. When locality is 0, it is assumed that there is little or no temporal locality in the data; after the access, the data need not be left in the cache. The default value of locality is 3.
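The two optional arguments can be illustrated with a minimal sketch. The helper names sum_streaming() and fill_buffer() and the prefetch distance of 16 elements are assumptions made for this example only; a suitable distance depends on the machine and the workload and should be measured.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical prefetch distance, in elements; tune by measurement. */
#define PREFETCH_AHEAD	16

/*
 * Hypothetical helper: sum an array that is read once and not reused.
 * rw = 0 requests a read prefetch; locality = 0 hints that the data
 * need not stay in the cache after the access.
 */
uint64_t
sum_streaming(const uint32_t *v, size_t n)
{
	uint64_t sum = 0;

	assert(v != NULL || n == 0);
	for (size_t i = 0; i < n; i++) {
		if (i + PREFETCH_AHEAD < n)
			__builtin_prefetch(&v[i + PREFETCH_AHEAD], 0, 0);
		sum += v[i];
	}
	return sum;
}

/*
 * Hypothetical helper: fill a buffer that is expected to be used again
 * soon. rw = 1 requests a write prefetch; locality = 3 (the default)
 * hints at high temporal locality.
 */
void
fill_buffer(uint32_t *v, size_t n, uint32_t value)
{
	assert(v != NULL || n == 0);
	for (size_t i = 0; i < n; i++) {
		if (i + PREFETCH_AHEAD < n)
			__builtin_prefetch(&v[i + PREFETCH_AHEAD], 1, 3);
		v[i] = value;
	}
}
```

Because rw and locality must be compile-time constants, they are written as literals here rather than passed in as function parameters.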
The __builtin_prefetch() function translates into prefetch instructions only if
the architecture supports them. If there is no support, no prefetch code is
generated: addr is evaluated only if it has side effects, and gcc(1) issues no
warning.
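Since the builtin is specific to the compiler, code that must also build with other compilers may want to hide it behind a macro. The following is a sketch under that assumption; the names PREFETCH_READ and sum_indirect() and the look-ahead of 4 entries are hypothetical and chosen only for illustration.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical wrapper: use the builtin where the compiler identifies
 * itself as GNU-compatible, and otherwise only evaluate the argument.
 */
#if defined(__GNUC__) || defined(__clang__)
#define PREFETCH_READ(p)	__builtin_prefetch((p), 0, 3)
#else
#define PREFETCH_READ(p)	((void)(p))
#endif

/*
 * Hypothetical helper: sum integers reached through an array of
 * pointers, hinting the target of a later entry while the current
 * one is being read.
 */
long
sum_indirect(const int *const *items, size_t n)
{
	long sum = 0;

	assert(items != NULL || n == 0);
	for (size_t i = 0; i < n; i++) {
		if (i + 4 < n)
			PREFETCH_READ(items[i + 4]);
		sum += *items[i];
	}
	return sum;
}
```

The result of the function is the same whether or not a prefetch instruction is emitted; only the timing may differ.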
The following optimization appears in the heavily used
cpu_in_cksum() function that calculates checksums
for the inet(4) headers:
	while (mlen >= 32) {
		__builtin_prefetch(data + 32);
		partial += *(uint16_t *)data;
		partial += *(uint16_t *)(data + 2);
		partial += *(uint16_t *)(data + 4);
		...
		partial += *(uint16_t *)(data + 28);
		partial += *(uint16_t *)(data + 30);
		data += 32;
		mlen -= 32;
		...
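The pattern in the excerpt above, processing 32 bytes per iteration while hinting at the next 32, can be tried in isolation with a sketch such as the following. This is not the cpu_in_cksum() implementation: the function name sum16_blocks() is hypothetical, the words are added in host byte order without any carry folding, trailing bytes beyond a multiple of 32 are ignored, and memcpy(3) replaces the pointer casts to avoid unaligned accesses.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical helper: add up the 16-bit words of a buffer, one
 * 32-byte block per iteration, prefetching the following block.
 */
uint32_t
sum16_blocks(const uint8_t *data, size_t mlen)
{
	uint32_t partial = 0;

	assert(data != NULL || mlen == 0);
	while (mlen >= 32) {
		/* Hint at the next block before working on this one. */
		__builtin_prefetch(data + 32);
		for (size_t off = 0; off < 32; off += 2) {
			uint16_t w;

			memcpy(&w, data + off, sizeof(w));
			partial += w;
		}
		data += 32;
		mlen -= 32;
	}
	return partial;
}
```

On the last block, data + 32 points one past the end of the buffer; the pointer itself is valid to form, and a prefetch of an address that cannot be accessed does not fault.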
Ulrich Drepper, What Every Programmer Should Know About Memory, https://www.akkadia.org/drepper/cpumemory.pdf, November 21, 2007.
This is a non-standard, compiler-specific extension.
| December 22, 2010 | NetBSD 11.0 |