Browse Source

Remove jemalloc 4 from our tree.

wolfbeast 4 years ago committed by Roy Tam
  1. 1
  2. 81
  3. 1
  4. 1
  5. 2
  6. 17
  7. 9
  8. 8
  9. 78
  10. 28
  11. 29
  12. 27
  13. 981
  14. 414
  15. 506
  16. 20
  17. 1
  18. 17
  19. 79
  20. 9
  21. 5611
  22. 1420
  23. 1797
  24. 250
  25. 0
  26. 10773
  27. 1970
  28. 16
  29. 5
  30. 2966
  31. 4
  32. 10
  33. 1516
  34. 45
  35. 651
  36. 25
  37. 274
  38. 96
  39. 37
  40. 21
  41. 86
  42. 118
  43. 239
  44. 357
  45. 35
  46. 1288
  47. 75
  48. 307
  49. 57
  50. 115
  51. 147
  52. 48
  53. 27
  54. 345
  55. 5
  56. 626
  57. 5
  58. 207
  59. 547
  60. 6
  61. 6
  62. 81
  63. 69
  64. 60
  65. 1003
  66. 366
  67. 318
  68. 246
  69. 115
  70. 51
  71. 201
  72. 469
  73. 75
  74. 787
  75. 338
  76. 114
  77. 266
  78. 28
  79. 45
  80. 103
  81. 45
  82. 66
  83. 22
  84. 57
  85. 20
  86. 247
  87. 59
  88. 6
  89. 12
  90. 24
  91. 63
  92. 402
  93. 272
  94. 89
  95. 3
  96. 327
  97. 26
  98. 12
  99. 3781
  100. 2
  101. Some files were not shown because too many files have changed in this diff Show More

aclocal.m4 vendored

@ -27,7 +27,6 @@ builtin(include, build/autoconf/icu.m4)dnl
builtin(include, build/autoconf/clang-plugin.m4)dnl
builtin(include, build/autoconf/alloc.m4)dnl
builtin(include, build/autoconf/ios.m4)dnl
builtin(include, build/autoconf/jemalloc.m4)dnl
builtin(include, build/autoconf/sanitize.m4)dnl


@ -1,81 +0,0 @@
dnl This Source Code Form is subject to the terms of the Mozilla Public
dnl License, v. 2.0. If a copy of the MPL was not distributed with this
dnl file, You can obtain one at
if test "$MOZ_BUILD_APP" != js -o -n "$JS_STANDALONE"; then
# Run jemalloc configure script
if test "$MOZ_MEMORY" && test -n "$MOZ_REPLACE_MALLOC"; then
ac_configure_args="--build=$build --host=$target --enable-stats --with-jemalloc-prefix=je_ --disable-valgrind"
# We're using memalign for _aligned_malloc in memory/build/mozmemory_wrap.c
# on Windows, so just export memalign on all platforms.
ac_configure_args="$ac_configure_args ac_cv_func_memalign=yes"
if test -n "$MOZ_REPLACE_MALLOC"; then
# When using replace_malloc, we always want valloc exported from jemalloc.
ac_configure_args="$ac_configure_args ac_cv_func_valloc=yes"
if test "${OS_ARCH}" = Darwin; then
# We also need to enable pointer validation on Mac because jemalloc's
# zone allocator is not used.
ac_configure_args="$ac_configure_args --enable-ivsalloc"
MANGLE="malloc posix_memalign aligned_alloc calloc realloc free memalign valloc malloc_usable_size"
if test -n "$MANGLE"; then
for mangle in ${MANGLE}; do
if test -n "$MANGLED"; then
ac_configure_args="$ac_configure_args --with-mangling=$MANGLED"
if test -z "$MOZ_TLS"; then
ac_configure_args="$ac_configure_args --disable-tls"
ac_configure_args="$ac_configure_args $var='`eval echo \\${${var}}`'"
# jemalloc's configure assumes that if you have CFLAGS set at all, you set
# all the flags necessary to configure jemalloc, which is not likely to be
# the case on Windows if someone is building Firefox with flags set in
# their mozconfig.
if test "$_MSC_VER"; then
ac_configure_args="$ac_configure_args CFLAGS="
# Force disable DSS support in jemalloc.
ac_configure_args="$ac_configure_args ac_cv_func_sbrk=false"
# Make Linux builds munmap freed chunks instead of recycling them.
ac_configure_args="$ac_configure_args --enable-munmap"
# Disable cache oblivious behavior that appears to have a performance
# impact on Firefox.
ac_configure_args="$ac_configure_args --disable-cache-oblivious"
if ! test -e memory/jemalloc; then
mkdir -p memory/jemalloc
# jemalloc's configure runs git to determine the version. But when building
# from a gecko git clone, the git commands it uses is going to pick gecko's
# information, not jemalloc's, which is useless. So pretend we don't have git
# at all. That will make jemalloc's configure pick the in-tree VERSION file.
) || exit 1

js/src/aclocal.m4 vendored

@ -25,7 +25,6 @@ builtin(include, ../../build/autoconf/zlib.m4)dnl
builtin(include, ../../build/autoconf/icu.m4)dnl
builtin(include, ../../build/autoconf/clang-plugin.m4)dnl
builtin(include, ../../build/autoconf/alloc.m4)dnl
builtin(include, ../../build/autoconf/jemalloc.m4)dnl
builtin(include, ../../build/autoconf/sanitize.m4)dnl
builtin(include, ../../build/autoconf/ios.m4)dnl


@ -151,7 +151,6 @@ case $cmd in
${TOPSRCDIR}/memory/ \
${TOPSRCDIR}/memory/build \
${TOPSRCDIR}/memory/fallible \
${TOPSRCDIR}/memory/jemalloc \
${TOPSRCDIR}/memory/mozalloc \
${TOPSRCDIR}/memory/mozjemalloc \


@ -2250,8 +2250,6 @@ AC_SUBST(JS_LIBRARY_NAME)
# Avoid using obsolete NSPR features


@ -36,8 +36,7 @@
* - jemalloc_stats
* - jemalloc_purge_freed_pages
* - jemalloc_free_dirty_pages
* (these functions are native to mozjemalloc, and have compatibility
* implementations for jemalloc3)
* (these functions are native to mozjemalloc)
* These functions are all exported as part of libmozglue (see
* $(topsrcdir)/mozglue/build/, with a few implementation
@ -87,21 +86,16 @@
* char* strdup_impl(const char *)
* That implementation would call malloc by using "malloc_impl".
* While mozjemalloc uses these "_impl" suffixed helpers, jemalloc3, being
* third-party code, doesn't, but instead has an elaborate way to mangle
* individual functions. See under "Run jemalloc configure script" in
* $(topsrcdir)/
* When building with replace-malloc support, the above still holds, but
* the malloc implementation and jemalloc specific functions are the
* replace-malloc functions from replace_malloc.c.
* The actual jemalloc/mozjemalloc implementation is prefixed with "je_".
* The actual mozjemalloc implementation is prefixed with "je_".
* Thus, when MOZ_REPLACE_MALLOC is defined, the "_impl" suffixed macros
* expand to "je_" prefixed function when building mozjemalloc or
* jemalloc3/mozjemalloc_compat, where MOZ_JEMALLOC_IMPL is defined.
* expand to "je_" prefixed function when building mozjemalloc, where
* MOZ_JEMALLOC_IMPL is defined.
* In other cases, the "_impl" suffixed macros follow the original scheme,
* except on Windows and MacOSX, where they would expand to "je_" prefixed
@ -153,9 +147,6 @@
# endif
/* All other jemalloc3 functions are prefixed with "je_" */
#define je_(a) je_ ## a
#if !defined(MOZ_MEMORY_IMPL)


@ -1,9 +0,0 @@
This is a copy of the jemalloc source code. It is intended to be left pristine.
Modifications to this code without coordinating with upstream are unacceptable,
and will be reverted. Integration modifications should be made in memory/build
whenever possible.
The canonical repository for this source code is git://
The information about the upstream repository and revision lives in
In order to update the code, you can run the script located in
the same directory.


@ -1,8 +0,0 @@
# jemalloc's configure runs git to determine the version. But when building
# from a gecko git clone, the git commands it uses is going to pick gecko's
# information, not jemalloc's, which is useless. So pretend we don't have git
# at all. That will make jemalloc's configure pick the in-tree VERSION file.
exit 1


@ -1,78 +0,0 @@
# -*- Mode: python; indent-tabs-mode: nil; tab-width: 40 -*-
# vim: set filetype=python:
# This Source Code Form is subject to the terms of the Mozilla Public
# License, v. 2.0. If a copy of the MPL was not distributed with this
# file, You can obtain one at
# This file cannot be built in unified mode because of symbol clash on arena_purge.
# Only OSX needs the zone allocation implementation,
# but only if replace-malloc is not enabled.
FINAL_LIBRARY = 'replace_jemalloc'
LOCAL_INCLUDES += ['src/include/msvc_compat']
LOCAL_INCLUDES += ['src/include/msvc_compat/C99']
if CONFIG['OS_TARGET'] == 'Linux':
# For mremap
CFLAGS += ['-std=gnu99']
DEFINES['abort'] = 'moz_abort'
# We allow warnings for third-party code that can be updated from upstream.


@ -1,28 +0,0 @@
version: '{build}'
CPU: x86_64
MSVC: amd64
CPU: i686
MSVC: x86
CPU: x86_64
CPU: i686
- set PATH=c:\msys64\%MSYSTEM%\bin;c:\msys64\usr\bin;%PATH%
- if defined MSVC call "c:\Program Files (x86)\Microsoft Visual Studio 14.0\VC\vcvarsall.bat" %MSVC%
- if defined MSVC pacman --noconfirm -Rsc mingw-w64-%CPU%-gcc gcc
- pacman --noconfirm -Suy mingw-w64-%CPU%-make
- bash -c "autoconf"
- bash -c "./configure"
- mingw32-make -j3
- file lib/jemalloc.dll
- mingw32-make -j3 tests
- mingw32-make -k check


@ -1,29 +0,0 @@
language: c
- os: linux
compiler: gcc
- os: linux
compiler: gcc
- gcc-multilib
- os: osx
compiler: clang
- os: osx
compiler: clang
- autoconf
- ./configure${EXTRA_FLAGS:+ CC="$CC $EXTRA_FLAGS"}
- make -j3
- make -j3 tests
- make check


@ -1,27 +0,0 @@
Unless otherwise specified, files in the jemalloc source distribution are
subject to the following license:
Copyright (C) 2002-2016 Jason Evans <>.
All rights reserved.
Copyright (C) 2007-2012 Mozilla Foundation. All rights reserved.
Copyright (C) 2009-2016 Facebook, Inc. All rights reserved.
Redistribution and use in source and binary forms, with or without
modification, are permitted provided that the following conditions are met:
1. Redistributions of source code must retain the above copyright notice(s),
this list of conditions and the following disclaimer.
2. Redistributions in binary form must reproduce the above copyright notice(s),
this list of conditions and the following disclaimer in the documentation
and/or other materials provided with the distribution.


@ -1,981 +0,0 @@
Following are change highlights associated with official releases. Important
bug fixes are all mentioned, but some internal enhancements are omitted here for
brevity. Much more detail can be found in the git revision history:
* 4.3.1 (November 7, 2016)
Bug fixes:
- Fix a severe virtual memory leak. This regression was first released in
4.3.0. (@interwq, @jasone)
- Refactor atomic and prng APIs to restore support for 32-bit platforms that
use pre-C11 toolchains, e.g. FreeBSD's mips. (@jasone)
* 4.3.0 (November 4, 2016)
This is the first release that passes the test suite for multiple Windows
configurations, thanks in large part to @glandium setting up continuous
integration via AppVeyor (and Travis CI for Linux and OS X).
New features:
- Add "J" (JSON) support to malloc_stats_print(). (@jasone)
- Add Cray compiler support. (@ronawho)
- Add/use adaptive spinning for bootstrapping and radix tree node
initialization. (@jasone)
Bug fixes:
- Fix large allocation to search starting in the optimal size class heap,
which can substantially reduce virtual memory churn and fragmentation. This
regression was first released in 4.0.0. (@mjp41, @jasone)
- Fix stats.arenas.<i>.nthreads accounting. (@interwq)
- Fix and simplify decay-based purging. (@jasone)
- Make DSS (sbrk(2)-related) operations lockless, which resolves potential
deadlocks during thread exit. (@jasone)
- Fix over-sized allocation of radix tree leaf nodes. (@mjp41, @ogaun,
- Fix over-sized allocation of arena_t (plus associated stats) data
structures. (@jasone, @interwq)
- Fix EXTRA_CFLAGS to not affect configuration. (@jasone)
- Fix a Valgrind integration bug. (@ronawho)
- Disallow 0x5a junk filling when running in Valgrind. (@jasone)
- Fix a file descriptor leak on Linux. This regression was first released in
4.2.0. (@vsarunas, @jasone)
- Fix static linking of jemalloc with glibc. (@djwatson)
- Use syscall(2) rather than {open,read,close}(2) during boot on Linux. This
works around other libraries' system call wrappers performing reentrant
allocation. (@kspinka, @Whissi, @jasone)
- Fix OS X default zone replacement to work with OS X 10.12. (@glandium,
- Fix cached memory management to avoid needless commit/decommit operations
during purging, which resolves permanent virtual memory map fragmentation
issues on Windows. (@mjp41, @jasone)
- Fix TSD fetches to avoid (recursive) allocation. This is relevant to
non-TLS and Windows configurations. (@jasone)
- Fix malloc_conf overriding to work on Windows. (@jasone)
- Forcibly disable lazy-lock on Windows (was forcibly *enabled*). (@jasone)
* 4.2.1 (June 8, 2016)
Bug fixes:
- Fix bootstrapping issues for configurations that require allocation during
tsd initialization (e.g. --disable-tls). (@cferris1000, @jasone)
- Fix gettimeofday() version of nstime_update(). (@ronawho)
- Fix Valgrind regressions in calloc() and chunk_alloc_wrapper(). (@ronawho)
- Fix potential VM map fragmentation regression. (@jasone)
- Fix opt_zero-triggered in-place huge reallocation zeroing. (@jasone)
- Fix heap profiling context leaks in reallocation edge cases. (@jasone)
* 4.2.0 (May 12, 2016)
New features:
- Add the arena.<i>.reset mallctl, which makes it possible to discard all of
an arena's allocations in a single operation. (@jasone)
- Add the stats.retained and stats.arenas.<i>.retained statistics. (@jasone)
- Add the --with-version configure option. (@jasone)
- Support --with-lg-page values larger than actual page size. (@jasone)
- Use pairing heaps rather than red-black trees for various hot data
structures. (@djwatson, @jasone)
- Streamline fast paths of rtree operations. (@jasone)
- Optimize the fast paths of calloc() and [m,d,sd]allocx(). (@jasone)
- Decommit unused virtual memory if the OS does not overcommit. (@jasone)
- Specify MAP_NORESERVE on Linux if [heuristic] overcommit is active, in order
to avoid unfortunate interactions during fork(2). (@jasone)
Bug fixes:
- Fix chunk accounting related to triggering gdump profiles. (@jasone)
- Link against librt for clock_gettime(2) if glibc < 2.17. (@jasone)
- Scale leak report summary according to sampling probability. (@jasone)
* 4.1.1 (May 3, 2016)
This bugfix release resolves a variety of mostly minor issues, though the
bitmap fix is critical for 64-bit Windows.
Bug fixes:
- Fix the linear scan version of bitmap_sfu() to shift by the proper amount
even when sizeof(long) is not the same as sizeof(void *), as on 64-bit
Windows. (@jasone)
- Fix hashing functions to avoid unaligned memory accesses (and resulting
crashes). This is relevant at least to some ARM-based platforms.
- Fix fork()-related lock rank ordering reversals. These reversals were
unlikely to cause deadlocks in practice except when heap profiling was
enabled and active. (@jasone)
- Fix various chunk leaks in OOM code paths. (@jasone)
- Fix malloc_stats_print() to print opt.narenas correctly. (@jasone)
- Fix MSVC-specific build/test issues. (@rustyx, @yuslepukhin)
- Fix a variety of test failures that were due to test fragility rather than
core bugs. (@jasone)
* 4.1.0 (February 28, 2016)
This release is primarily about optimizations, but it also incorporates a lot
of portability-motivated refactoring and enhancements. Many people worked on
this release, to an extent that even with the omission here of minor changes
(see git revision history), and of the people who reported and diagnosed
issues, so much of the work was contributed that starting with this release,
changes are annotated with author credits to help reflect the collaborative
effort involved.
New features:
- Implement decay-based unused dirty page purging, a major optimization with
mallctl API impact. This is an alternative to the existing ratio-based
unused dirty page purging, and is intended to eventually become the sole
purging mechanism. New mallctls:
+ opt.purge
+ opt.decay_time
+ arena.<i>.decay
+ arena.<i>.decay_time
+ arenas.decay_time
+ stats.arenas.<i>.decay_time
(@jasone, @cevans87)
- Add --with-malloc-conf, which makes it possible to embed a default
options string during configuration. This was motivated by the desire to
specify --with-malloc-conf=purge:decay , since the default must remain
purge:ratio until the 5.0.0 release. (@jasone)
- Add MS Visual Studio 2015 support. (@rustyx, @yuslepukhin)
- Make *allocx() size class overflow behavior defined. The maximum
size class is now less than PTRDIFF_MAX to protect applications against
numerical overflow, and all allocation functions are guaranteed to indicate
errors rather than potentially crashing if the request size exceeds the
maximum size class. (@jasone)
- jeprof:
+ Add raw heap profile support. (@jasone)
+ Add --retain and --exclude for backtrace symbol filtering. (@jasone)
- Optimize the fast path to combine various bootstrapping and configuration
checks and execute more streamlined code in the common case. (@interwq)
- Use linear scan for small bitmaps (used for small object tracking). In
addition to speeding up bitmap operations on 64-bit systems, this reduces
allocator metadata overhead by approximately 0.2%. (@djwatson)
- Separate arena_avail trees, which substantially speeds up run tree
operations. (@djwatson)
- Use memoization (boot-time-computed table) for run quantization. Separate
arena_avail trees reduced the importance of this optimization. (@jasone)
- Attempt mmap-based in-place huge reallocation. This can dramatically speed
up incremental huge reallocation. (@jasone)
Incompatible changes:
- Make opt.narenas unsigned rather than size_t. (@jasone)
Bug fixes:
- Fix stats.cactive accounting regression. (@rustyx, @jasone)
- Handle unaligned keys in hash(). This caused problems for some ARM systems.
(@jasone, @cferris1000)
- Refactor arenas array. In addition to fixing a fork-related deadlock, this
makes arena lookups faster and simpler. (@jasone)
- Move retained memory allocation out of the default chunk allocation
function, to a location that gets executed even if the application installs
a custom chunk allocation function. This resolves a virtual memory leak.
- Fix a potential tsd cleanup leak. (@cferris1000, @jasone)
- Fix run quantization. In practice this bug had no impact unless
applications requested memory with alignment exceeding one page.
(@jasone, @djwatson)
- Fix LinuxThreads-specific bootstrapping deadlock. (Cosmin Paraschiv)
- jeprof:
+ Don't discard curl options if timeout is not defined. (@djwatson)
+ Detect failed profile fetches. (@djwatson)
- Fix stats.arenas.<i>.{dss,lg_dirty_mult,decay_time,pactive,pdirty} for
--disable-stats case. (@jasone)
* 4.0.4 (October 24, 2015)
This bugfix release fixes another xallocx() regression. No other regressions
have come to light in over a month, so this is likely a good starting point
for people who prefer to wait for "dot one" releases with all the major issues
shaken out.
Bug fixes:
- Fix xallocx(..., MALLOCX_ZERO to zero the last full trailing page of large
allocations that have been randomly assigned an offset of 0 when
--enable-cache-oblivious configure option is enabled.
* 4.0.3 (September 24, 2015)
This bugfix release continues the trend of xallocx() and heap profiling fixes.
Bug fixes:
- Fix xallocx(..., MALLOCX_ZERO) to zero all trailing bytes of large
allocations when --enable-cache-oblivious configure option is enabled.
- Fix xallocx(..., MALLOCX_ZERO) to zero trailing bytes of huge allocations
when resizing from/to a size class that is not a multiple of the chunk size.
- Fix prof_tctx_dump_iter() to filter out nodes that were created after heap
profile dumping started.
- Work around a potentially bad thread-specific data initialization
interaction with NPTL (glibc's pthreads implementation).
* 4.0.2 (September 21, 2015)
This bugfix release addresses a few bugs specific to heap profiling.
Bug fixes:
- Fix ixallocx_prof_sample() to never modify nor create sampled small
allocations. xallocx() is in general incapable of moving small allocations,
so this fix removes buggy code without loss of generality.
- Fix irallocx_prof_sample() to always allocate large regions, even when
alignment is non-zero.
- Fix prof_alloc_rollback() to read tdata from thread-specific data rather
than dereferencing a potentially invalid tctx.
* 4.0.1 (September 15, 2015)
This is a bugfix release that is somewhat high risk due to the amount of
refactoring required to address deep xallocx() problems. As a side effect of
these fixes, xallocx() now tries harder to partially fulfill requests for
optional extra space. Note that a couple of minor heap profiling
optimizations are included, but these are better thought of as performance
fixes that were integral to disovering most of the other bugs.
- Avoid a chunk metadata read in arena_prof_tctx_set(), since it is in the
fast path when heap profiling is enabled. Additionally, split a special
case out into arena_prof_tctx_reset(), which also avoids chunk metadata
- Optimize irallocx_prof() to optimistically update the sampler state. The
prior implementation appears to have been a holdover from when
rallocx()/xallocx() functionality was combined as rallocm().
Bug fixes:
- Fix TLS configuration such that it is enabled by default for platforms on
which it works correctly.
- Fix arenas_cache_cleanup() and arena_get_hard() to handle
allocation/deallocation within the application's thread-specific data
cleanup functions even after arenas_cache is torn down.
- Fix xallocx() bugs related to size+extra exceeding HUGE_MAXCLASS.
- Fix chunk purge hook calls for in-place huge shrinking reallocation to
specify the old chunk size rather than the new chunk size. This bug caused
no correctness issues for the default chunk purge function, but was
visible to custom functions set via the "arena.<i>.chunk_hooks" mallctl.
- Fix heap profiling bugs:
+ Fix heap profiling to distinguish among otherwise identical sample sites
with interposed resets (triggered via the "prof.reset" mallctl). This bug
could cause data structure corruption that would most likely result in a
+ Fix irealloc_prof() to prof_alloc_rollback() on OOM.
+ Make one call to prof_active_get_unlocked() per allocation event, and use
the result throughout the relevant functions that handle an allocation
event. Also add a missing check in prof_realloc(). These fixes protect
allocation events against concurrent prof_active changes.
+ Fix ixallocx_prof() to pass usize_max and zero to ixallocx_prof_sample()
in the correct order.
+ Fix prof_realloc() to call prof_free_sampled_object() after calling
prof_malloc_sample_object(). Prior to this fix, if tctx and old_tctx were
the same, the tctx could have been prematurely destroyed.
- Fix portability bugs:
+ Don't bitshift by negative amounts when encoding/decoding run sizes in
chunk header maps. This affected systems with page sizes greater than 8
+ Rename index_t to szind_t to avoid an existing type on Solaris.
+ Add JEMALLOC_CXX_THROW to the memalign() function prototype, in order to
match glibc and avoid compilation errors when including both
jemalloc/jemalloc.h and malloc.h in C++ code.
+ Don't assume that /bin/sh is appropriate when running
during configuration.
+ Consider __sparcv9 a synonym for __sparc64__ when defining LG_QUANTUM.
+ Link tests to librt if it contains clock_gettime(2).
* 4.0.0 (August 17, 2015)
This version contains many speed and space optimizations, both minor and
major. The major themes are generalization, unification, and simplification.
Although many of these optimizations cause no visible behavior change, their
cumulative effect is substantial.
New features:
- Normalize size class spacing to be consistent across the complete size
range. By default there are four size classes per size doubling, but this
is now configurable via the --with-lg-size-class-group option. Also add the
--with-lg-page, --with-lg-page-sizes, --with-lg-quantum, and
--with-lg-tiny-min options, which can be used to tweak page and size class
settings. Impacts:
+ Worst case performance for incrementally growing/shrinking reallocation
is improved because there are far fewer size classes, and therefore
copying happens less often.
+ Internal fragmentation is limited to 20% for all but the smallest size
classes (those less than four times the quantum). (1B + 4 KiB)
and (1B + 4 MiB) previously suffered nearly 50% internal fragmentation.
+ Chunk fragmentation tends to be lower because there are fewer distinct run
sizes to pack.
- Add support for explicit tcaches. The "tcache.create", "tcache.flush", and
"tcache.destroy" mallctls control tcache lifetime and flushing, and the
MALLOCX_TCACHE(tc) and MALLOCX_TCACHE_NONE flags to the *allocx() API
control which tcache is used for each operation.
- Implement per thread heap profiling, as well as the ability to
enable/disable heap profiling on a per thread basis. Add the "prof.reset",
"prof.lg_sample", "", "",
"opt.prof_thread_active_init", "prof.thread_active_init", and
"" mallctls.
- Add support for per arena application-specified chunk allocators, configured
via the "arena.<i>.chunk_hooks" mallctl.
- Refactor huge allocation to be managed by arenas, so that arenas now
function as general purpose independent allocators. This is important in
the context of user-specified chunk allocators, aside from the scalability
benefits. Related new statistics:
+ The "stats.arenas.<i>.huge.allocated", "stats.arenas.<i>.huge.nmalloc",
"stats.arenas.<i>.huge.ndalloc", and "stats.arenas.<i>.huge.nrequests"
mallctls provide high level per arena huge allocation statistics.
+ The "arenas.nhchunks", "arenas.hchunk.<i>.size",
"stats.arenas.<i>.hchunks.<j>.nrequests", and
"stats.arenas.<i>.hchunks.<j>.curhchunks" mallctls provide per size class
- Add the 'util' column to malloc_stats_print() output, which reports the
proportion of available regions that are currently in use for each small
size class.
- Add "alloc" and "free" modes for for junk filling (see the "opt.junk"
mallctl), so that it is possible to separately enable junk filling for
allocation versus deallocation.
- Add the jemalloc-config script, which provides information about how
jemalloc was configured, and how to integrate it into application builds.
- Add metadata statistics, which are accessible via the "stats.metadata",
"stats.arenas.<i>.metadata.mapped", and
"stats.arenas.<i>.metadata.allocated" mallctls.
- Add the "stats.resident" mallctl, which reports the upper limit of
physically resident memory mapped by the allocator.
- Add per arena control over unused dirty page purging, via the
"arenas.lg_dirty_mult", "arena.<i>.lg_dirty_mult", and
"stats.arenas.<i>.lg_dirty_mult" mallctls.
- Add the "prof.gdump" mallctl, which makes it possible to toggle the gdump
feature on/off during program execution.
- Add sdallocx(), which implements sized deallocation. The primary
optimization over dallocx() is the removal of a metadata read, which often
suffers an L1 cache miss.
- Add missing header includes in jemalloc/jemalloc.h, so that applications
only have to #include <jemalloc/jemalloc.h>.
- Add support for additional platforms:
+ Bitrig
+ Cygwin
+ DragonFlyBSD
+ iOS
+ OpenBSD
+ OpenRISC/or1k
- Maintain dirty runs in per arena LRUs rather than in per arena trees of
dirty-run-containing chunks. In practice this change significantly reduces
dirty page purging volume.
- Integrate whole chunks into the unused dirty page purging machinery. This
reduces the cost of repeated huge allocation/deallocation, because it
effectively introduces a cache of chunks.
- Split the arena chunk map into two separate arrays, in order to increase
cache locality for the frequently accessed bits.
- Move small run metadata out of runs, into arena chunk headers. This reduces
run fragmentation, smaller runs reduce external fragmentation for small size
classes, and packed (less uniformly aligned) metadata layout improves CPU
cache set distribution.
- Randomly distribute large allocation base pointer alignment relative to page
boundaries in order to more uniformly utilize CPU cache sets. This can be
disabled via the --disable-cache-oblivious configure option, and queried via
the "config.cache_oblivious" mallctl.
- Micro-optimize the fast paths for the public API functions.
- Refactor thread-specific data to reside in a single structure. This assures
that only a single TLS read is necessary per call into the public API.
- Implement in-place huge allocation growing and shrinking.
- Refactor rtree (radix tree for chunk lookups) to be lock-free, and make
additional optimizations that reduce maximum lookup depth to one or two
levels. This resolves what was a concurrency bottleneck for per arena huge
allocation, because a global data structure is critical for determining
which arenas own which huge allocations.
Incompatible changes:
- Replace --enable-cc-silence with --disable-cc-silence to suppress spurious
warnings by default.
- Assure that the constness of malloc_usable_size()'s return type matches that
of the system implementation.
- Change the heap profile dump format to support per thread heap profiling,
rename pprof to jeprof, and enhance it with the --thread=<n> option. As a
result, the bundled jeprof must now be used rather than the upstream
(gperftools) pprof.
- Disable "opt.prof_final" by default, in order to avoid atexit(3), which can
internally deadlock on some platforms.
- Change the "arenas.nlruns" mallctl type from size_t to unsigned.
- Replace the "stats.arenas.<i>.bins.<j>.allocated" mallctl with
- Ignore MALLOC_CONF in set{uid,gid,cap} binaries.
- Ignore MALLOCX_ARENA(a) in dallocx(), in favor of using the
MALLOCX_TCACHE(tc) and MALLOCX_TCACHE_NONE flags to control tcache usage.
Removed features:
- Remove the *allocm() API, which is superseded by the *allocx() API.
- Remove the --enable-dss options, and make dss non-optional on all platforms
which support sbrk(2).
- Remove the "arenas.purge" mallctl, which was obsoleted by the
"arena.<i>.purge" mallctl in 3.1.0.
- Remove the unnecessary "opt.valgrind" mallctl; jemalloc automatically
detects whether it is running inside Valgrind.
- Remove the "stats.huge.allocated", "stats.huge.nmalloc", and
"stats.huge.ndalloc" mallctls.
- Remove the --enable-mremap option.
- Remove the "stats.chunks.current", "", and
"stats.chunks.high" mallctls.
Bug fixes:
- Fix the cactive statistic to decrease (rather than increase) when active
memory decreases. This regression was first released in 3.5.0.
- Fix OOM handling in memalign() and valloc(). A variant of this bug existed
in all releases since 2.0.0, which introduced these functions.
- Fix an OOM-related regression in arena_tcache_fill_small(), which could
cause cache corruption on OOM. This regression was present in all releases
from 2.2.0 through 3.6.0.
- Fix size class overflow handling for malloc(), posix_memalign(), memalign(),
calloc(), and realloc() when profiling is enabled.
- Fix the "arena.<i>.dss" mallctl to return an error if "primary" or
"secondary" precedence is specified, but sbrk(2) is not supported.
- Fix fallback lg_floor() implementations to handle extremely large inputs.
- Ensure the default purgeable zone is after the default zone on OS X.
- Fix latent bugs in atomic_*().
- Fix the "arena.<i>.dss" mallctl to handle read-only calls.
- Fix tls_model configuration to enable the initial-exec model when possible.
- Mark malloc_conf as a weak symbol so that the application can override it.
- Correctly detect glibc's adaptive pthread mutexes.
- Fix the --without-export configure option.
* 3.6.0 (March 31, 2014)
This version contains a critical bug fix for a regression present in 3.5.0 and
Bug fixes:
- Fix a regression in arena_chunk_alloc() that caused crashes during
small/large allocation if chunk allocation failed. In the absence of this
bug, chunk allocation failure would result in allocation failure, e.g. NULL
return from malloc(). This regression was introduced in 3.5.0.
- Fix backtracing for gcc intrinsics-based backtracing by specifying
-fno-omit-frame-pointer to gcc. Note that the application (and all the
libraries it links to) must also be compiled with this option for
backtracing to be reliable.
- Use dss allocation precedence for huge allocations as well as small/large
- Fix test assertion failure message formatting. This bug did not manifest on
x86_64 systems because of implementation subtleties in va_list.
- Fix inconsequential test failures for hash and SFMT code.
New features:
- Support heap profiling on FreeBSD. This feature depends on the proc
filesystem being mounted during heap profile dumping.
* 3.5.1 (February 25, 2014)
This version primarily addresses minor bugs in test code.
Bug fixes:
- Configure Solaris/Illumos to use MADV_FREE.
- Fix junk filling for mremap(2)-based huge reallocation. This is only
relevant if configuring with the --enable-mremap option specified.
- Avoid compilation failure if 'restrict' C99 keyword is not supported by the
- Add a configure test for SSE2 rather than assuming it is usable on i686
systems. This fixes test compilation errors, especially on 32-bit Linux
- Fix mallctl argument size mismatches (size_t vs. uint64_t) in the stats unit
- Fix/remove flawed alignment-related overflow tests.
- Prevent compiler optimizations that could change backtraces in the
prof_accum unit test.
* 3.5.0 (January 22, 2014)
This version focuses on refactoring and automated testing, though it also
includes some non-trivial heap profiling optimizations not mentioned below.
New features:
- Add the *allocx() API, which is a successor to the experimental *allocm()
API. The *allocx() functions are slightly simpler to use because they have
fewer parameters, they directly return the results of primary interest, and
mallocx()/rallocx() avoid the strict aliasing pitfall that
allocm()/rallocm() share with posix_memalign(). Note that *allocm() is
slated for removal in the next non-bugfix release.
- Add support for LinuxThreads.
Bug fixes:
- Unless heap profiling is enabled, disable floating point code and don't link
with libm. This, in combination with e.g. EXTRA_CFLAGS=-mno-sse on x64
systems, makes it possible to completely disable floating point register
use. Some versions of glibc neglect to save/restore caller-saved floating
point registers during dynamic lazy symbol loading, and the symbol loading
code uses whatever malloc the application happens to have linked/loaded
with, the result being potential floating point register corruption.
- Report ENOMEM rather than EINVAL if an OOM occurs during heap profiling
backtrace creation in imemalign(). This bug impacted posix_memalign() and
- Fix a file descriptor leak in a prof_dump_maps() error path.
- Fix prof_dump() to close the dump file descriptor for all relevant error
- Fix rallocm() to use the arena specified by the ALLOCM_ARENA(s) flag for
allocation, not just deallocation.
- Fix a data race for large allocation stats counters.
- Fix a potential infinite loop during thread exit. This bug occurred on
Solaris, and could affect other platforms with similar pthreads TSD
- Don't junk-fill reallocations unless usable size changes. This fixes a
violation of the *allocx()/*allocm() semantics.
- Fix growing large reallocation to junk fill new space.
- Fix huge deallocation to junk fill when munmap is disabled.
- Change the default private namespace prefix from empty to je_, and change
--with-private-namespace-prefix so that it prepends an additional prefix
rather than replacing je_. This reduces the likelihood of applications
which statically link jemalloc experiencing symbol name collisions.
- Add missing private namespace mangling (relevant when
--with-private-namespace is specified).
- Add and use JEMALLOC_INLINE_C so that static inline functions are marked as
static even for debug builds.
- Add a missing mutex unlock in a malloc_init_hard() error path. In practice
this error path is never executed.
- Fix numerous bugs in malloc_strotumax() error handling/reporting. These
bugs had no impact except for malformed inputs.
- Fix numerous bugs in malloc_snprintf(). These bugs were not exercised by
existing calls, so they had no impact.
* 3.4.1 (October 20, 2013)
Bug fixes:
- Fix a race in the "arenas.extend" mallctl that could cause memory corruption
of internal data structures and subsequent crashes.
- Fix Valgrind integration flaws that caused Valgrind warnings about reads of
uninitialized memory in:
+ arena chunk headers
+ internal zero-initialized data structures (relevant to tcache and prof
- Preserve errno during the first allocation. A readlink(2) call during
initialization fails unless /etc/malloc.conf exists, so errno was typically
set during the first allocation prior to this fix.
- Fix compilation warnings reported by gcc 4.8.1.
* 3.4.0 (June 2, 2013)
This version is essentially a small bugfix release, but the addition of
aarch64 support requires that the minor version be incremented.
Bug fixes:
- Fix race-triggered deadlocks in chunk_record(). These deadlocks were
typically triggered by multiple threads concurrently deallocating huge
New features:
- Add support for the aarch64 architecture.
* 3.3.1 (March 6, 2013)
This version fixes bugs that are typically encountered only when utilizing
custom run-time options.
Bug fixes:
- Fix a locking order bug that could cause deadlock during fork if heap
profiling were enabled.
- Fix a chunk recycling bug that could cause the allocator to lose track of
whether a chunk was zeroed. On FreeBSD, NetBSD, and OS X, it could cause
corruption if allocating via sbrk(2) (unlikely unless running with the
"dss:primary" option specified). This was completely harmless on Linux
unless using mlockall(2) (and unlikely even then, unless the
--disable-munmap configure option or the "dss:primary" option was
specified). This regression was introduced in 3.1.0 by the
mlockall(2)/madvise(2) interaction fix.
- Fix TLS-related memory corruption that could occur during thread exit if the
thread never allocated memory. Only the quarantine and prof facilities were
- Fix two quarantine bugs:
+ Internal reallocation of the quarantined object array leaked the old
+ Reallocation failure for internal reallocation of the quarantined object
array (very unlikely) resulted in memory corruption.
- Fix Valgrind integration to annotate all internally allocated memory in a
way that keeps Valgrind happy about internal data structure access.
- Fix building for s390 systems.
* 3.3.0 (January 23, 2013)
This version includes a few minor performance improvements in addition to the
listed new features and bug fixes.
New features:
- Add clipping support to lg_chunk option processing.
- Add the --enable-ivsalloc option.
- Add the --without-export option.
- Add the --disable-zone-allocator option.
Bug fixes:
- Fix "arenas.extend" mallctl to output the number of arenas.
- Fix chunk_recycle() to unconditionally inform Valgrind that returned memory
is undefined.
- Fix build break on FreeBSD related to alloca.h.
* 3.2.0 (November 9, 2012)
In addition to a couple of bug fixes, this version modifies page run
allocation and dirty page purging algorithms in order to better control
page-level virtual memory fragmentation.
Incompatible changes:
- Change the "opt.lg_dirty_mult" default from 5 to 3 (32:1 to 8:1).
Bug fixes:
- Fix dss/mmap allocation precedence code to use recyclable mmap memory only
after primary dss allocation fails.
- Fix deadlock in the "arenas.purge" mallctl. This regression was introduced
in 3.1.0 by the addition of the "arena.<i>.purge" mallctl.
* 3.1.0 (October 16, 2012)
New features:
- Auto-detect whether running inside Valgrind, thus removing the need to
manually specify MALLOC_CONF=valgrind:true.
- Add the "arenas.extend" mallctl, which allows applications to create
manually managed arenas.
- Add the ALLOCM_ARENA() flag for {,r,d}allocm().
- Add the "opt.dss", "arena.<i>.dss", and "stats.arenas.<i>.dss" mallctls,
which provide control over dss/mmap precedence.
- Add the "arena.<i>.purge" mallctl, which obsoletes "arenas.purge".
- Define LG_QUANTUM for hppa.
Incompatible changes:
- Disable tcache by default if running inside Valgrind, in order to avoid
making unallocated objects appear reachable to Valgrind.
- Drop const from malloc_usable_size() argument on Linux.
Bug fixes:
- Fix heap profiling crash if sampled object is freed via realloc(p, 0).
- Remove const from __*_hook variable declarations, so that glibc can modify
them during process forking.
- Fix mlockall(2)/madvise(2) interaction.
- Fix fork(2)-related deadlocks.
- Fix error return value for "thread.tcache.enabled" mallctl.
* 3.0.0 (May 11, 2012)
Although this version adds some major new features, the primary focus is on
internal code cleanup that facilitates maintainability and portability, most
of which is not reflected in the ChangeLog. This is the first release to
incorporate substantial contributions from numerous other developers, and the
result is a more broadly useful allocator (see the git revision history for
contribution details). Note that the license has been unified, thanks to
Facebook granting a license under the same terms as the other copyright
holders (see COPYING).
New features:
- Implement Valgrind support, redzones, and quarantine.
- Add support for additional platforms:
+ FreeBSD
+ Mac OS X Lion
+ MinGW
+ Windows (no support yet for replacing the system malloc)
- Add support for additional architectures:
+ SH4
+ Tilera
- Add support for cross compiling.
- Add nallocm(), which rounds a request size up to the nearest size class
without actually allocating.
- Implement aligned_alloc() (blame C11).
- Add the "thread.tcache.enabled" mallctl.
- Add the "opt.prof_final" mallctl.
- Update pprof (from gperftools 2.0).
- Add the --with-mangling option.
- Add the --disable-experimental option.
- Add the --disable-munmap option, and make it the default on Linux.
- Add the --enable-mremap option, which disables use of mremap(2) by default.
Incompatible changes:
- Enable stats by default.
- Enable fill by default.
- Disable lazy locking by default.
- Rename the "tcache.flush" mallctl to "thread.tcache.flush".
- Rename the "arenas.pagesize" mallctl to "".
- Change the "opt.lg_prof_sample" default from 0 to 19 (1 B to 512 KiB).
- Change the "opt.prof_accum" default from true to false.
Removed features:
- Remove the swap feature, including the "config.swap", "swap.avail",
"swap.prezeroed", "swap.nfds", and "swap.fds" mallctls.
- Remove highruns statistics, including the
"stats.arenas.<i>.bins.<j>.highruns" and
"stats.arenas.<i>.lruns.<j>.highruns" mallctls.
- As part of small size class refactoring, remove the "opt.lg_[qc]space_max",
"arenas.cacheline", "arenas.subpage", "arenas.[tqcs]space_{min,max}", and
"arenas.[tqcs]bins" mallctls.
- Remove the "arenas.chunksize" mallctl.
- Remove the "opt.lg_prof_tcmax" option.
- Remove the "opt.lg_prof_bt_max" option.
- Remove the "opt.lg_tcache_gc_sweep" option.
- Remove the --disable-tiny option, including the "config.tiny" mallctl.
- Remove the --enable-dynamic-page-shift configure option.
- Remove the --enable-sysv configure option.
Bug fixes:
- Fix a statistics-related bug in the "thread.arena" mallctl that could cause
invalid statistics and crashes.
- Work around TLS deallocation via free() on Linux. This bug could cause
write-after-free memory corruption.
- Fix a potential deadlock that could occur during interval- and
growth-triggered heap profile dumps.
- Fix large calloc() zeroing bugs due to dropping chunk map unzeroed flags.
- Fix chunk_alloc_dss() to stop claiming memory is zeroed. This bug could
cause memory corruption and crashes with --enable-dss specified.
- Fix fork-related bugs that could cause deadlock in children between fork
and exec.
- Fix malloc_stats_print() to honor 'b' and 'l' in the opts parameter.
- Fix realloc(p, 0) to act like free(p).
- Do not enforce minimum alignment in memalign().
- Check for NULL pointer in malloc_usable_size().
- Fix an off-by-one heap profile statistics bug that could be observed in
interval- and growth-triggered heap profiles.
- Fix the "epoch" mallctl to update cached stats even if the passed in epoch
is 0.
- Fix bin->runcur management to fix a layout policy bug. This bug did not
affect correctness.
- Fix a bug in choose_arena_hard() that potentially caused more arenas to be
initialized than necessary.
- Add missing "opt.lg_tcache_max" mallctl implementation.
- Use glibc allocator hooks to make mixed allocator usage less likely.
- Fix build issues for --disable-tcache.
- Don't mangle pthread_create() when --with-private-namespace is specified.
* 2.2.5 (November 14, 2011)
Bug fixes:
- Fix huge_ralloc() race when using mremap(2). This is a serious bug that
could cause memory corruption and/or crashes.
- Fix huge_ralloc() to maintain chunk statistics.
- Fix malloc_stats_print(..., "a") output.
* 2.2.4 (November 5, 2011)
Bug fixes:
- Initialize arenas_tsd before using it. This bug existed for 2.2.[0-3], as
well as for --disable-tls builds in earlier releases.
- Do not assume a 4 KiB page size in test/rallocm.c.
* 2.2.3 (August 31, 2011)
This version fixes numerous bugs related to heap profiling.
Bug fixes:
- Fix a prof-related race condition. This bug could cause memory corruption,
but only occurred in non-default configurations (prof_accum:false).
- Fix off-by-one backtracing issues (make sure that prof_alloc_prep() is
excluded from backtraces).
- Fix a prof-related bug in realloc() (only triggered by OOM errors).
- Fix prof-related bugs in allocm() and rallocm().
- Fix prof_tdata_cleanup() for --disable-tls builds.
- Fix a relative include path, to fix objdir builds.
* 2.2.2 (July 30, 2011)
Bug fixes:
- Fix a build error for --disable-tcache.
- Fix assertions in arena_purge() (for real this time).
- Add the --with-private-namespace option. This is a workaround for symbol
conflicts that can inadvertently arise when using static libraries.
* 2.2.1 (March 30, 2011)
Bug fixes:
- Implement atomic operations for x86/x64. This fixes compilation failures
for versions of gcc that are still in wide use.
- Fix an assertion in arena_purge().
* 2.2.0 (March 22, 2011)
This version incorporates several improvements to algorithms and data
structures that tend to reduce fragmentation and increase speed.
New features:
- Add the "stats.cactive" mallctl.
- Update pprof (from google-perftools 1.7).
- Improve backtracing-related configuration logic, and add the
--disable-prof-libgcc option.
Bug fixes:
- Change default symbol visibility from "internal", to "hidden", which
decreases the overhead of library-internal function calls.
- Fix symbol visibility so that it is also set on OS X.
- Fix a build dependency regression caused by the introduction of the .pic.o
suffix for PIC object files.
- Add missing checks for mutex initialization failures.
- Don't use libgcc-based backtracing except on x64, where it is known to work.
- Fix deadlocks on OS X that were due to memory allocation in
- Heap profiling-specific fixes:
+ Fix memory corruption due to integer overflow in small region index
computation, when using a small enough sample interval that profiling
context pointers are stored in small run headers.
+ Fix a bootstrap ordering bug that only occurred with TLS disabled.
+ Fix a rallocm() rsize bug.
+ Fix error detection bugs for aligned memory allocation.
* 2.1.3 (March 14, 2011)
Bug fixes:
- Fix a cpp logic regression (due to the "thread.{de,}allocatedp" mallctl fix
for OS X in 2.1.2).
- Fix a "thread.arena" mallctl bug.
- Fix a thread cache stats merging bug.
* 2.1.2 (March 2, 2011)
Bug fixes:
- Fix "thread.{de,}allocatedp" mallctl for OS X.
- Add missing jemalloc.a to build system.
* 2.1.1 (January 31, 2011)
Bug fixes:
- Fix aligned huge reallocation (affected allocm()).
- Fix the ALLOCM_LG_ALIGN macro definition.
- Fix a heap dumping deadlock.
- Fix a "thread.arena" mallctl bug.
* 2.1.0 (December 3, 2010)
This version incorporates some optimizations that can't quite be considered
bug fixes.
New features:
- Use Linux's mremap(2) for huge object reallocation when possible.
- Avoid locking in mallctl*() when possible.
- Add the "thread.[de]allocatedp" mallctl's.
- Convert the manual page source from roff to DocBook, and generate both roff
and HTML manuals.
Bug fixes:
- Fix a crash due to incorrect bootstrap ordering. This only impacted
--enable-debug --enable-dss configurations.
- Fix a minor statistics bug for mallctl("swap.avail", ...).
* 2.0.1 (October 29, 2010)
Bug fixes:
- Fix a race condition in heap profiling that could cause undefined behavior
if "opt.prof_accum" were disabled.
- Add missing mutex unlocks for some OOM error paths in the heap profiling
- Fix a compilation error for non-C99 builds.
* 2.0.0 (October 24, 2010)
This version focuses on the experimental *allocm() API, and on improved
run-time configuration/introspection. Nonetheless, numerous performance
improvements are also included.
New features:
- Implement the experimental {,r,s,d}allocm() API, which provides a superset
of the functionality available via malloc(), calloc(), posix_memalign(),
realloc(), malloc_usable_size(), and free(). These functions can be used to
allocate/reallocate aligned zeroed memory, ask for optional extra memory
during reallocation, prevent object movement during reallocation, etc.
more human-readable, and more flexible. For example:
is now:
- Port to Apple OS X. Sponsored by Mozilla.
- Make it possible for the application to control thread-->arena mappings via
the "thread.arena" mallctl.
- Add compile-time support for all TLS-related functionality via pthreads TSD.
This is mainly of interest for OS X, which does not support TLS, but has a
TSD implementation with similar performance.
- Override memalign() and valloc() if they are provided by the system.
- Add the "arenas.purge" mallctl, which can be used to synchronously purge all
dirty unused pages.
- Make cumulative heap profiling data optional, so that it is possible to
limit the amount of memory consumed by heap profiling data structures.
- Add per thread allocation counters that can be accessed via the
"thread.allocated" and "thread.deallocated" mallctls.
Incompatible changes:
- Remove JEMALLOC_OPTIONS and malloc_options (see MALLOC_CONF above).
- Increase default backtrace depth from 4 to 128 for heap profiling.
- Disable interval-based profile dumps by default.
Bug fixes:
- Remove bad assertions in fork handler functions. These assertions could
cause aborts for some combinations of configure settings.
- Fix strerror_r() usage to deal with non-standard semantics in GNU libc.
- Fix leak context reporting. This bug tended to cause the number of contexts
to be underreported (though the reported number of objects and bytes were
- Fix a realloc() bug for large in-place growing reallocation. This bug could
cause memory corruption, but it was hard to trigger.
- Fix an allocation bug for small allocations that could be triggered if
multiple threads raced to create a new run of backing pages.
- Enhance the heap profiler to trigger samples based on usable size, rather
than request size.
- Fix a heap profiling bug due to sometimes losing track of requested object
size for sampled objects.
* 1.0.3 (August 12, 2010)
Bug fixes:
- Fix the libunwind-based implementation of stack backtracing (used for heap
profiling). This bug could cause zero-length backtraces to be reported.
- Add a missing mutex unlock in library initialization code. If multiple
threads raced to initialize malloc, some of them could end up permanently
* 1.0.2 (May 11, 2010)
Bug fixes:
- Fix junk filling of large objects, which could cause memory corruption.
- Add MAP_NORESERVE support for chunk mapping, because otherwise virtual
memory limits could cause swap file configuration to fail. Contributed by
Jordan DeLong.
* 1.0.1 (April 14, 2010)
Bug fixes:
- Fix compilation when --enable-fill is specified.
- Fix threads-related profiling bugs that affected accuracy and caused memory
to be leaked during thread exit.
- Fix dirty page purging race conditions that could cause crashes.
- Fix crash in tcache flushing code during thread destruction.
* 1.0.0 (April 11, 2010)
This release focuses on speed and run-time introspection. Numerous
algorithmic improvements make this release substantially faster than its
New features:
- Implement autoconf-based configuration system.
- Add mallctl*(), for the purposes of introspection and run-time
- Make it possible for the application to manually flush a thread's cache, via
the "tcache.flush" mallctl.
- Base maximum dirty page count on proportion of active memory.
- Compute various additional run-time statistics, including per size class
statistics for large objects.
- Expose malloc_stats_print(), which can be called repeatedly by the
- Simplify the malloc_message() signature to only take one string argument,
and incorporate an opaque data pointer argument for use by the application
in combination with malloc_stats_print().
- Add support for allocation backed by one or more swap files, and allow the
application to disable over-commit if swap files are in use.
- Implement allocation profiling and leak checking.
Removed features:
- Remove the dynamic arena rebalancing code, since thread-specific caching
reduces its utility.
Bug fixes:
- Modify chunk allocation to work when address space layout randomization
(ASLR) is in use.
- Fix thread cleanup bugs related to TLS destruction.
- Handle 0-size allocation requests in posix_memalign().
- Fix a chunk leak. The leaked chunks were never touched, so this impacted
virtual memory usage, but not physical memory usage.
* linux_2008082[78]a (August 27/28, 2008)
These snapshot releases are the simple result of incorporating Linux-specific
support into the FreeBSD malloc sources.


@ -1,414 +0,0 @@
Building and installing a packaged release of jemalloc can be as simple as
typing the following while in the root directory of the source tree:
make install
If building from unpackaged developer sources, the simplest command sequence
that might work is:
make dist
make install
Note that documentation is not built by the default target because doing so
would create a dependency on xsltproc in packaged releases, hence the
requirement to either run 'make dist' or avoid installing docs via the various
install_* targets documented below.
=== Advanced configuration =====================================================
The 'configure' script supports numerous options that allow control of which
functionality is enabled, where jemalloc is installed, etc. Optionally, pass
any of the following arguments (not a definitive list) to 'configure':
Print a definitive list of options.
Set the base directory in which to install. For example:
./configure --prefix=/usr/local
will cause files to be installed into /usr/local/include, /usr/local/lib,
and /usr/local/man.
Use the specified version string rather than trying to generate one (if in
a git repository) or use existing the VERSION file (if present).
Embed one or more library paths, so that libjemalloc can find the libraries
it is linked to. This works only on ELF-based systems.
Mangle public symbols specified in <map> which is a comma-separated list of
name:mangled pairs.
For example, to use ld's --wrap option as an alternative method for
overriding libc's malloc implementation, specify something like:
Note that mangling happens prior to application of the prefix specified by
--with-jemalloc-prefix, and mangled symbols are then ignored when applying
the prefix.
Prefix all public APIs with <prefix>. For example, if <prefix> is
"prefix_", API changes like the following occur:
malloc() --> prefix_malloc()
malloc_conf --> prefix_malloc_conf
/etc/malloc.conf --> /etc/prefix_malloc.conf
This makes it possible to use jemalloc at the same time as the system
allocator, or even to use multiple copies of jemalloc simultaneously.
By default, the prefix is "", except on OS X, where it is "je_". On OS X,
jemalloc overlays the default malloc zone, but makes no attempt to actually
replace the "malloc", "calloc", etc. symbols.
Don't export public APIs. This can be useful when building jemalloc as a
static library, or to avoid exporting public APIs when using the zone
allocator on OSX.
Prefix all library-private APIs with <prefix>je_. For shared libraries,
symbol visibility mechanisms prevent these symbols from being exported, but
for static libraries, naming collisions are a real possibility. By
default, <prefix> is empty, which results in a symbol prefix of je_ .
Append <suffix> to the base name of all installed files, such that multiple
versions of jemalloc can coexist in the same installation directory. For
example, becomes libjemalloc<suffix>.so.0.
Embed <malloc_conf> as a run-time options string that is processed prior to
the malloc_conf global variable, the /etc/malloc.conf symlink, and the
MALLOC_CONF environment variable. For example, to change the default chunk
size to 256 KiB:
Disable code that silences non-useful compiler warnings. This is mainly
useful during development when auditing the set of warnings that are being
Enable assertions and validation code. This incurs a substantial
performance hit, but is very useful during application development.
Implies --enable-ivsalloc.
Enable code coverage support, for use during jemalloc test development.
Additional testing targets are available if this option is enabled:
These targets do not clear code coverage results from previous runs, and
there are interactions between the various coverage targets, so it is
usually advisable to run 'make clean' between repeated code coverage runs.
Disable statistics gathering functionality. See the "opt.stats_print"
option documentation for usage details.
Enable validation code, which verifies that pointers reside within
jemalloc-owned chunks before dereferencing them. This incurs a minor
performance hit.
Enable heap profiling and leak detection functionality. See the ""
option documentation for usage details. When enabled, there are several
approaches to backtracing, and the configure script chooses the first one
in the following list that appears to function correctly:
+ libunwind (requires --enable-prof-libunwind)
+ libgcc (unless --disable-prof-libgcc)
+ gcc intrinsics (unless --disable-prof-gcc)
Use the libunwind library ( for stack
Disable the use of libgcc's backtracing functionality.
Disable the use of gcc intrinsics for backtracing.
Statically link against the specified libunwind.a rather than dynamically
linking with -lunwind.
Disable thread-specific caches for small objects. Objects are cached and
released in bulk, thus reducing the total number of mutex operations. See
the "opt.tcache" option for usage details.
Disable virtual memory deallocation via munmap(2); instead keep track of
the virtual memory for later use. munmap() is disabled by default (i.e.
--disable-munmap is implied) on Linux, which has a quirk in its virtual
memory allocation algorithm that causes semi-permanent VM map holes under
normal jemalloc operation.
Disable support for junk/zero filling of memory, quarantine, and redzones.
See the "opt.junk", "", "opt.quarantine", and "opt.redzone" option
documentation for usage details.
Disable support for Valgrind.