本文源码基于CPython 3.10

Python实现了自己的内存管理，用以加快内存操作和减少内存碎片化。Python定义了一个阈值，小于等于这个阈值的小内存请求，由Python实现的这套内存管理来分配，大于则直接交给malloc()。

// Objects/obmalloc.c
#define SMALL_REQUEST_THRESHOLD 512

内存管理模型

Python的内存分配器分成如下层级

    Object-specific allocators
    _____   ______   ______       ________
   [ int ] [ dict ] [ list ] ... [ string ]       Python core         |
+3 | <----- Object-specific memory -----> | <-- Non-object memory --> |
    _______________________________       |                           |
   [   Python's object allocator   ]      |                           |
+2 | ####### Object memory ####### | <------ Internal buffers ------> |
    ______________________________________________________________    |
   [          Python's raw memory allocator (PyMem_ API)          ]   |
+1 | <----- Python memory (under PyMem manager's control) ------> |   |
    __________________________________________________________________
   [    Underlying general-purpose allocator (ex: C library malloc)   ]
 0 | <------ Virtual memory allocated for the python process -------> |

   =========================================================================
    _______________________________________________________________________
   [                OS-specific Virtual Memory Manager (VMM)               ]
-1 | <--- Kernel dynamic storage allocation & management (page-based) ---> |
    __________________________________   __________________________________
   [                                  ] [                                  ]
-2 | <-- Physical memory: ROM/RAM --> | | <-- Secondary storage (swap) --> |

按Python内存模型，一个典型的调用过程如下：

PyDict_New()          # 3层
  PyObject_GC_New()   # 2层
    PyObject_Malloc() # 2层
      new_arena()     # 1层
      malloc()        # 0层

第0层往下是操作系统和硬件的内存实现，就不在Python的讨论范畴了。

以下内容的前提是，python使用它自己的small-block内存分配器，当然，大多数情况下还是启用了的。

// pyconfig.h
/* Use Python's own small-block memory-allocator. */
#define WITH_PYMALLOC 1    // 默认是打开的

此时PYOBJ_ALLOC指向_PyObject_*APIs，而不是和PYRAW_ALLOC一样指向_PyMem_*APIs

// obmalloc.c
#define MALLOC_ALLOC {NULL, _PyMem_RawMalloc, _PyMem_RawCalloc, _PyMem_RawRealloc, _PyMem_RawFree}
#ifdef WITH_PYMALLOC
#  define PYMALLOC_ALLOC {NULL, _PyObject_Malloc, _PyObject_Calloc, _PyObject_Realloc, _PyObject_Free}
#endif

#define PYRAW_ALLOC MALLOC_ALLOC
#ifdef WITH_PYMALLOC
#  define PYOBJ_ALLOC PYMALLOC_ALLOC
#else
#  define PYOBJ_ALLOC MALLOC_ALLOC
#endif
#define PYMEM_ALLOC PYOBJ_ALLOC

几种内存单位

Python抽象出arena，pool，block这三种内存单位，对应关系如下。在使用本文这种内存管理机制的情况下，每次内存请求的目标是获取到一个足够大小的block。

通过数组和多条链表进行管理，使得内存分配开销从O(N)到O(1)。

pool通过freeblock和nextoffset字段以数组+单链表形式对block进行管理
arena通过freepools和address字段以数组+单链表形式对pool进行管理
usedpools[]全局变量通过存储多条双向链表维护当前已使用且还能继续分配block的pool
unused_arena_objects全局变量单链表管理已创建但未与（_PyObject_Arena.alloc()分配的）实际堆内存关联的arena_object
usable_arena全局变量双向链表按拥有空闲pool数量升序维护可用arena
arenas全局变量维护所有已创建的arena_object（不一定已经被_PyObject_Arena.alloc()分配了实际堆内存）

以上的总结性描述暂时还不好理解，我们可以在阅读下面内容时，再来回顾。

block在pool中的组织方式

block这个词语的指代稍微有些混乱，先做一点解释。

/* When you say memory, my mind reasons in terms of (pointers to) blocks */
typedef uint8_t block;

代码中的block就是uint8_t，但是代码写起来都是用block*指针，来指向前面所提到内存单元中的，不同类型的pool中，大小不一，但是都对齐到ALIGNMENT宏的整数倍个字节大小的内存块。以下的block一词，都指内存块，而不是单个uint8_t。

pool的结构

每个pool都维护相同大小的block，在较早的python中，block大小为8的倍数

 * For small requests we have the following table:
 *
 * Request in bytes     Size of allocated block      Size class idx
 * ----------------------------------------------------------------
 *        1-8                     8                       0
 *        9-16                   16                       1
 *       17-24                   24                       2
 *       25-32                   32                       3
 *       33-40                   40                       4
 *       41-48                   48                       5
 *       49-56                   56                       6
 *       57-64                   64                       7
 *       65-72                   72                       8
 *        ...                   ...                     ...
 *      497-504                 504                      62
 *      505-512                 512                      63
 *
 *      0, SMALL_REQUEST_THRESHOLD + 1 and up: routed to the underlying
 *      allocator.

但是较新版的源码中，在x86_64机器上是16的倍数了。

#if SIZEOF_VOID_P > 4
#define ALIGNMENT              16               /* must be 2^N */
#define ALIGNMENT_SHIFT         4
#else
#define ALIGNMENT               8               /* must be 2^N */
#define ALIGNMENT_SHIFT         3
#endif

比如ALIGNMENT为16的情况下，size class为0的pool长这样：

头部储存在pool里，保存一些重要信息。

obmalloc.c中pool_header的定义

struct pool_header {
    union { block *_padding;
            uint count; } ref;          // pool中已分配block的数量
    block *freeblock;                   // 指向下一个空闲block
    struct pool_header *nextpool;       // 下一个同size class的pool
    struct pool_header *prevpool;       
    uint arenaindex;                    // 该pool所属的arena
    uint szidx;                         // 当前的pool分配哪一类的空间, 等于usepools中的idx
    uint nextoffset;                    // 从未分配过的block偏移
    uint maxnextoffset;                 // nextoffset最大值，超过说明block都分配过，且如果没有block被释放，说明pool已满
};

typedef struct pool_header *poolp;

了解完pool_header结构体，再来看一个pool详细的样子，单个pool大小是4Kb，为大多数系统的页大小，但在新的CPython中有变化，USE_LARGE_POOLS定义下是16Kb。

我们先关注下面这个pool怎么以数组+单链表形式来维护block

pool三种状态

used：其中部分block被分配了，且至少有一个block没被分配。这也意味着一个pool至少有两个block。因为pool只有在需要内存的时候才被分配，所以used是一个pool的初始状态，而不是empty
full: 所有block都被分配了，pool会被从usedpools[]上取下，此时pool的prevpool和nextpool没有实际意义
empty：所有block都可用，pool会被从usedpools[]上取下，放回对应arena_object的单链表freepools里面，此时pool的prevpool没有实际意义

在pool中申请空间有几种情况，但是有一个原则很简单就是每次去找freeblock指向的内存块，再根据情况更新一些数据。

下面的内容涉及到usedpools[]对pool的管理，但我们主要关注pool怎么管理block

在未释放过block的pool中申请新的空间

每次freeblock都指向一个内容为NULL的block，表明目前 pool 中没有其他之前被申请过但是当前已经被释放了的 block 存在，新申请空间的话，需要从 pool 尾部寻找之前没有用到过的新的空间。

这时候的管理更像是数组形式，nextoffset和 maxnextoffset在这种情况下会被用来查看 pool 尾部是否还有剩余的未使用过的空间。

然后就是更新ref.count，nextoffset，freeblock了。每次freeblock更新为nextoffset指向的地址，所以这里的图中，freeblock指向倒数第3个块，而pool+nextoffset在倒数第2个block

这一部分的代码

static void
pymalloc_pool_extend(poolp pool, uint size)
{
    // freeblock为NULL时，扩展pool尾部
    if (UNLIKELY(pool->nextoffset <= pool->maxnextoffset)) {
        /* There is room for another block. */
        pool->freeblock = (block*)pool + pool->nextoffset;
        pool->nextoffset += INDEX2SIZE(size);
        *(block **)(pool->freeblock) = NULL;
        return;
    }

    /* Pool is full, unlink from used pools. */
    poolp next;
    next = pool->nextpool;
    pool = pool->prevpool;
    next->prevpool = pool;
    pool->nextpool = next;
}


static inline void*
pymalloc_alloc(void *ctx, size_t nbytes)
{
    // ...
    if (UNLIKELY(nbytes == 0)) {
        return NULL;
    }
    if (UNLIKELY(nbytes > SMALL_REQUEST_THRESHOLD)) {
        return NULL;
    }

    uint size = (uint)(nbytes - 1) >> ALIGNMENT_SHIFT;
    poolp pool = usedpools[size + size];
    block *bp;

    if (LIKELY(pool != pool->nextpool)) {
        /*
         * There is a used pool for this size class.
         * Pick up the head block of its free list.
         */
        ++pool->ref.count;
        bp = pool->freeblock;
        assert(bp != NULL);

        // 这里的freeblock赋值还是比较巧妙的，可以思考下
        if (UNLIKELY((pool->freeblock = *(block **)bp) == NULL)) {
            // Reached the end of the free list, try to extend it.
            pymalloc_pool_extend(pool, size);
        }
    }
    // ...
}

申请完毕后

再申请一个

此时nextoffset大于maxnextoffset了。

再申请

nextoffset 的值比 maxnextoffset 的大, 表示尾部已经没有多余的新的未使用过的空间可以使用了, 并且目前情况下当前的pool是满的了, 所以freeblock变成了一个空指针

因为pool已经满了, 我们需要把pool从usedpools全局变量管理的双向链表中移除。本例中这个pool的block都是8字节，所以这个pool本来位于usedpools[0]和usedpools[1]维护的一条双链表中。

在pool中释放block的核心代码如下：

    block *lastfree = pool->freeblock;
    *(block **)p = lastfree;
    pool->freeblock = (block *)p;
    pool->ref.count--;

其实就是把要释放的内存，插入到freeblock指向的单链表中的头部。链表每个节点都是没有用到的block，正好用来指向下一个空闲block的地址。

在满的 pool 中进行释放

释放如图位置的block。

step1，把内存块里第一个block的值设置为和 freeblock相同的值，图中因为是满的，所以设置为了NULL

step2, 让freeblock指向当前正在释放的block

减小ref.count的值

step3，检查pool释放内存之前是否是满的(通过检查当前正在释放的block是否空指针即可确定)，如果是, 把pool重新链接到usedpools中并返回, 如果不是则跳转到 step4

step4

检查当前arena中的所有的pools是否都为空的, 如果是的话, 释放整个arena
如果这是arena中唯一的有空余空间的pool，把arena加回到 usable_arenas 列表中
给usable_arenas进行排序, 确保更多空的pool的arena排在更后面

释放block的pymalloc_free(void *ctx, void *p)源码

// obmalloc.c

/* Free a memory block allocated by pymalloc_alloc().
   Return 1 if it was freed.
   Return 0 if the block was not allocated by pymalloc_alloc(). */
static inline int
pymalloc_free(void *ctx, void *p)
{
    assert(p != NULL);

#ifdef WITH_VALGRIND
    if (UNLIKELY(running_on_valgrind > 0)) {
        return 0;
    }
#endif

    poolp pool = POOL_ADDR(p);
    if (UNLIKELY(!address_in_range(p, pool))) {
        return 0;
    }
    /* We allocated this address. */

    /* Link p to the start of the pool's freeblock list.  Since         // 用过但释放了的block，在此收入freeblock单链表进行管理
     * the pool had at least the p block outstanding, the pool
     * wasn't empty (so it's already in a usedpools[] list, or
     * was full and is in no list -- it's not in the freeblocks
     * list in any case).
     */
    assert(pool->ref.count > 0);            /* else it was empty */
    block *lastfree = pool->freeblock;
    *(block **)p = lastfree;
    pool->freeblock = (block *)p;
    pool->ref.count--;

    if (UNLIKELY(lastfree == NULL)) {
        /* Pool was full, so doesn't currently live in any list:
         * link it to the front of the appropriate usedpools[] list.
         * This mimics LRU pool usage for new allocations and
         * targets optimal filling when several pools contain
         * blocks of the same size class.
         */
        insert_to_usedpool(pool);
        return 1;
    }

    /* freeblock wasn't NULL, so the pool wasn't full,
     * and the pool is in a usedpools[] list.
     */
    if (LIKELY(pool->ref.count != 0)) {
        /* pool isn't empty:  leave it in usedpools */
        return 1;
    }

    /* Pool is now empty:  unlink from usedpools, and
     * link to the front of freepools.  This ensures that
     * previously freed pools will be allocated later
     * (being not referenced, they are perhaps paged out).
     */
    insert_to_freepool(pool);
    return 1;
}

POOL_ADDR(p)的设计：因为pool保证对齐到4KiB（定义了USE_LARGE_POOL时对齐到16KiB）

#ifdef USE_LARGE_POOLS
#define POOL_BITS               14                  /* 16 KiB */
#else
#define POOL_BITS               12                  /* 4 KiB */
#endif
#define POOL_SIZE               (1 << POOL_BITS)

#define _Py_ALIGN_DOWN(p, a) ((void *)((uintptr_t)(p) & ~(uintptr_t)((a) - 1)))

#define POOL_ADDR(P) ((poolp)_Py_ALIGN_DOWN((P), POOL_SIZE))

简便起见，假如POOL_SIZE是0b00001000，一个block的地址是0b00011010，那么这里的计算方式就是

  0b00011010 & ~(0b00001000 - 1)
= 0b00011010 &  ~0b00000111
= 0b00011010 &   0b11111000
= 0b00011000

在有空余空间的 pool 中进行释放

其实步骤和上面一样，比如释放最后一个内存块，位于0x10ae3dff8

Step1 设置当前释放的内存块的值

Step2

更新freeblock指针

减小ref.count

Step3 显然pool之前不是满的，直接跳到Step4

Step4 返回

再来释放倒数第二个，也是同理

Step1

Step2

Step3 pool之前不满，跳到step4

Step4返回

在释放过block的pool中申请新的空间

和在未释放过block的pool中申请新的空间是一样的思路。从代码上来看

// pymalloc_alloc in  obmalloc.c

    ++pool->ref.count;
    bp = pool->freeblock;
    assert(bp != NULL);

    if (UNLIKELY((pool->freeblock = *(block **)bp) == NULL)) {
        // Reached the end of the free list, try to extend it.
        pymalloc_pool_extend(pool, size);
    }

和之前的区别在于这次从freeblock获取到的不是NULL了，就不用进入pymalloc_pool_extend()

pool在arena和usedpools中的组织方式

内存请求的过程

一个arena中存储了许多大小相同的pool，不同arena的pool大小不一定相同。
arenas[]保存了所有arena_object
usedpools[]是一个全局数组，其中的元素是个双向链表，两个元素为一组，将相同大小的pool串起来。一个size没有分配pool时，usedpools[size+size]里面的节点指向的pool的prevpool和nextpool都指向自己。

先来看看usedpools[]，内存请求到达pymalloc_alloc时，小于等于SMALL_REQUEST_THRESHOLD的请求，就是在usedpools里来找合适大小的pool。寻找的方式是

计算请求内存大小所属size class：uint size = (uint)(nbytes - 1) >> ALIGNMENT_SHIFT
在usedpools[size + size]上寻找pool

如果没有合适大小的pool，就会从arenas里请求新的pool，添加到对应位置的双向链表上，从这个新pool中申请block。

usedpools的初始定义

// usedpools初始化
#define PTA(x)  ((poolp )((uint8_t *)&(usedpools[2*(x)]) - 2*sizeof(block *)))
#define PT(x)   PTA(x), PTA(x)

static poolp usedpools[2 * ((NB_SMALL_SIZE_CLASSES + 7) / 8) * 8] = {
    PT(0), PT(1), PT(2), PT(3), PT(4), PT(5), PT(6), PT(7)
#if NB_SMALL_SIZE_CLASSES > 8
    , PT(8), PT(9), PT(10), PT(11), PT(12), PT(13), PT(14), PT(15)
#if NB_SMALL_SIZE_CLASSES > 16
    , PT(16), PT(17), PT(18), PT(19), PT(20), PT(21), PT(22), PT(23)
#if NB_SMALL_SIZE_CLASSES > 24
    , PT(24), PT(25), PT(26), PT(27), PT(28), PT(29), PT(30), PT(31)
#if NB_SMALL_SIZE_CLASSES > 32
    , PT(32), PT(33), PT(34), PT(35), PT(36), PT(37), PT(38), PT(39)
#if NB_SMALL_SIZE_CLASSES > 40
    , PT(40), PT(41), PT(42), PT(43), PT(44), PT(45), PT(46), PT(47)
#if NB_SMALL_SIZE_CLASSES > 48
    , PT(48), PT(49), PT(50), PT(51), PT(52), PT(53), PT(54), PT(55)
#if NB_SMALL_SIZE_CLASSES > 56
    , PT(56), PT(57), PT(58), PT(59), PT(60), PT(61), PT(62), PT(63)
#if NB_SMALL_SIZE_CLASSES > 64
#error "NB_SMALL_SIZE_CLASSES should be less than 64"
#endif /* NB_SMALL_SIZE_CLASSES > 64 */
#endif /* NB_SMALL_SIZE_CLASSES > 56 */
#endif /* NB_SMALL_SIZE_CLASSES > 48 */
#endif /* NB_SMALL_SIZE_CLASSES > 40 */
#endif /* NB_SMALL_SIZE_CLASSES > 32 */
#endif /* NB_SMALL_SIZE_CLASSES > 24 */
#endif /* NB_SMALL_SIZE_CLASSES > 16 */
#endif /* NB_SMALL_SIZE_CLASSES >  8 */
};

上面这个代码有点巧妙，我们看下usedpools@0x00007FFED35FF410刚初始化后里面长啥样

回顾一下pool_header定义就知道是怎么回事了。

比如0+0位置上维护的双链表，&usedpools[0+0]在0x00007FFED35FF410，而usedpools[0+0]为0x00007FFED35FF400，(pool_header*)0x00007FFED35FF400->nextpool 跨过两个指针大小的内存，刚好就指向&usedpools[0+0]了，也就是usedpools[size+size]->nextpool==&usedpools[size+size]

总之初始化的值使得正好所有的usedpools[size+size]的pool的prevpool和nextpool字段都指向自己了。

现在假设我们一开始请求一个10字节的内存块。

初始时没有能用的arena和pool，于是在new_arena()中：第一次创建的是16个新的arena，分配了16个arena_object结构体的空间，保存到全局变量arenas[]。但并没有立即分配16个ARENA_SIZE的空间作为arena。然后初始化unused_arena_objects指向的arena_object，此时也就是arenas[0]。

第一个可用的arena的第一个空闲pool被插入used_pools里。sentinel就是哨兵的意思，idx1指向的sentinel其实存的是used_pools[0]的地址（没理解的话查看上面内容）。

然后按照前面讲到的在未释放过block的pool中申请新的空间获得请求所需要的内存块。

arena中对pool的管理

arena管理pool的方式和pool管理block的方式非常相似。也是数组+单链表。

arena_object结构体：

源码如下，注释说得比较清楚

/* Record keeping for arenas. */
struct arena_object {
    /* The address of the arena, as returned by malloc.  Note that 0
     * will never be returned by a successful malloc, and is used
     * here to mark an arena_object that doesn't correspond to an
     * allocated arena.
     */
    uintptr_t address;

    /* Pool-aligned pointer to the next pool to be carved off. */
    block* pool_address;

    /* The number of available pools in the arena:  free pools + never-
     * allocated pools.
     */
    uint nfreepools;

    /* The total number of pools in the arena, whether or not available. */
    uint ntotalpools;

    /* Singly-linked list of available pools. */
    struct pool_header* freepools;

    /* Whenever this arena_object is not associated with an allocated
     * arena, the nextarena member is used to link all unassociated
     * arena_objects in the singly-linked `unused_arena_objects` list.
     * The prevarena member is unused in this case.
     *
     * When this arena_object is associated with an allocated arena
     * with at least one available pool, both members are used in the
     * doubly-linked `usable_arenas` list, which is maintained in
     * increasing order of `nfreepools` values.
     *
     * Else this arena_object is associated with an allocated arena
     * all of whose pools are in use.  `nextarena` and `prevarena`
     * are both meaningless in this case.
     */
    struct arena_object* nextarena;
    struct arena_object* prevarena;
};

当然，我这可以再翻译一下和做点解释

address

用来保存分配给arena地址。

pool_address

pool的地址，指向下一个要取出的pool的地址

因为malloc()返回给arena的地址不一定对齐到POOL_SIZE，所以如果真的没有对齐，那么arena会舍弃前后加起来共POOL_SIZE大小的空间（图中POOL_SIZE为4K），保证最开始时，pool_address指向已经对齐好的pool，随时可以分配。

arena的对齐实现：

在未释放过pool的arena中申请新的空间

按顺序，前面的所有pool都是used，pool_address负责指向空闲内存

申请一个新的pool，从pool_address上取

nfreepools减小，pool_address更新为下一个空余的pool

再申请一个，满了。

在arena中释放空间

pool变为empty时，pool从usedpools中移除，插入到与之对应的arena中，freepools所管理的单链表中。

比如我们释放pool3

pool3->nextpool设置为arena->freepools相同值。因为是满了之后第一个释放的，所以这个时候是0

freepools更新为pool3的地址，nfreepools增加

比如再释放一个pool63

在释放过pool的arena中申请新的空间

在arena中申请新的空间，先判断freepools是不是NULL。这里的情况下，nfreepools大于0，freepools不为空，从这个单链表中取pool。

更新freepools指针，和nfreepools

arena的管理

arena由arena_object结构体数组arenas[]和链表unused_arena_objects和usable_arenas进行管理。

单链表**unused_arena_objects**

维护arenas中所有pool都没被使用或都被释放了的arena_object，凡是放入这个链表的arena_object，其arena占用的pool都会被释放，但arena_object结构体本身不会被释放。

在 Python 2.5 之前, arena申请过的空间是从来不会被释放的, 这个策略是在 Python 2.5 之后引入的。

双向链表**usable_arenas** 维护有空闲pool的arena的arena_object

第一次内存请求前没有分配arena_object，usable_arenas == NULL，第一次分配时，分配16个arena_object存入arenas[]，之后每次扩容arenas，总共分配的arena_object数量翻倍。

由于是第一次分配，所以先构建unused_arena_objects单链表，留在这个链表中的arena_object一定是没分配实际内存的，也就是address字段是0

（白色为不关联实际内存，绿色表示有可用pool，红色表示所有pool已满）

new_arena()内从unused_arena_objects处获取arena_object并为之真正分配ARENA_SIZE的内存，做好初始化：对齐arena中的pool到POOL_SIZE，设置一些字段。

usable_arenas是个双向链表（下图没体现出来），维护包含空闲pool的arena，管理的arena_object按其nfreepools升序排列，获取pool时，优先从重度使用的arena中分配，这样可以让比较空的arena有更多释放内存的可能性。

16个arena都用完了的时候，arena_objects数量翻倍，然后做前面同样的初始化。

扩容用的realloc，不用管指针了？是的。因为既然前面的arena_objects对应的arena都满了，usable_arenas和unused_arena_objects就都为NULL了，自然也没有nextarena，prevarena指针指向原来的内存，realloc后，构建新的unused_arena_objects单链表

返回新的arena给usable_arenas

new_arena()源码

// obmalloc.c

/* Allocate a new arena.  If we run out of memory, return NULL.  Else
 * allocate a new arena, and return the address of an arena_object
 * describing the new arena.  It's expected that the caller will set
 * `usable_arenas` to the return value.
 */
static struct arena_object*
new_arena(void)
{
    struct arena_object* arenaobj;
    uint excess;        /* number of bytes above pool alignment */
    void *address;
    static int debug_stats = -1;

    if (debug_stats == -1) {
        const char *opt = Py_GETENV("PYTHONMALLOCSTATS");
        debug_stats = (opt != NULL && *opt != '\0');
    }
    if (debug_stats)
        _PyObject_DebugMallocStats(stderr);

    if (unused_arena_objects == NULL) {
        uint i;
        uint numarenas;
        size_t nbytes;

        /* Double the number of arena objects on each allocation.
         * Note that it's possible for `numarenas` to overflow.
         */
        numarenas = maxarenas ? maxarenas << 1 : INITIAL_ARENA_OBJECTS;     // 将最大arenas数量翻倍
        if (numarenas <= maxarenas)
            return NULL;                /* overflow */
#if SIZEOF_SIZE_T <= SIZEOF_INT
        if (numarenas > SIZE_MAX / sizeof(*arenas))
            return NULL;                /* overflow */
#endif
        nbytes = numarenas * sizeof(*arenas);
        arenaobj = (struct arena_object *)PyMem_RawRealloc(arenas, nbytes); // realloc内存
        if (arenaobj == NULL)
            return NULL;
        arenas = arenaobj;

        /* We might need to fix pointers that were copied.  However,        // Realloc后的指针无需处理
         * new_arena only gets called when all the pages in the             // 因为前面的arena满了，才需要realloc，
         * previous arenas are full.  Thus, there are *no* pointers         // 既然满了，就没有指针(unused_arena_objects 和 usable_arenas链表中的指针)
         * into the old array. Thus, we don't have to worry about           // 指向原来的arenas数组里。
         * invalid pointers.  Just to be sure, some asserts:
         */
        assert(usable_arenas == NULL);
        assert(unused_arena_objects == NULL);

        /* Put the new arenas on the unused_arena_objects list. */
        for (i = maxarenas; i < numarenas; ++i) {
            arenas[i].address = 0;              /* mark as unassociated */
            arenas[i].nextarena = i < numarenas - 1 ?
                                   &arenas[i+1] : NULL;
        }

        /* Update globals. */
        unused_arena_objects = &arenas[maxarenas];                          // unused_arena_objects指向arenas新扩容出来的部分头部
        maxarenas = numarenas;
    }

    /* Take the next available arena object off the head of the list. */
    assert(unused_arena_objects != NULL);
    arenaobj = unused_arena_objects;
    unused_arena_objects = arenaobj->nextarena;
    assert(arenaobj->address == 0);
    address = _PyObject_Arena.alloc(_PyObject_Arena.ctx, ARENA_SIZE);
#if WITH_PYMALLOC_RADIX_TREE
    if (address != NULL) {
        if (!arena_map_mark_used((uintptr_t)address, 1)) {
            /* marking arena in radix tree failed, abort */
            _PyObject_Arena.free(_PyObject_Arena.ctx, address, ARENA_SIZE);
            address = NULL;
        }
    }
#endif
    if (address == NULL) {
        /* The allocation failed: return NULL after putting the
         * arenaobj back.
         */
        arenaobj->nextarena = unused_arena_objects;
        unused_arena_objects = arenaobj;
        return NULL;
    }
    arenaobj->address = (uintptr_t)address;

    ++narenas_currently_allocated;
    ++ntimes_arena_allocated;
    if (narenas_currently_allocated > narenas_highwater)
        narenas_highwater = narenas_currently_allocated;
    arenaobj->freepools = NULL;
    /* pool_address <- first pool-aligned address in the arena
       nfreepools <- number of whole pools that fit after alignment */
    arenaobj->pool_address = (block*)arenaobj->address;
    arenaobj->nfreepools = MAX_POOLS_IN_ARENA;
    excess = (uint)(arenaobj->address & POOL_SIZE_MASK); // 当前arena地址没对齐到pool大小
    if (excess != 0) {
        --arenaobj->nfreepools;                          // 为了对齐pool，可用pool会减少一个
        arenaobj->pool_address += POOL_SIZE - excess;    // 进行对齐操作
    }
    arenaobj->ntotalpools = arenaobj->nfreepools;

    return arenaobj;
}

一些API

obmalloc中的调用链

_PyObject_Malloc和_PyObject_Calloc调用pymalloc_alloc，申请的内存大于SMALL_REQUEST_THRESHOLD的内存时，pymalloc_alloc返回NULL，于是前者又分别会调用PyMem_RawMalloc和PyMem_RawCalloc

参考源码_PyObject_Malloc和_PyObject_Calloc的实现

small request的调用链

PyObject_Malloc(size_t size)
    -> _PyObject.malloc  == (PYOBJ_ALLOC == PYMALLOC_ALLOC).malloc == _PyObject_Malloc
    _PyObject_Malloc(void *ctx, size_t nbytes)
        -> pymalloc_alloc(void *ctx, size_t nbytes) get memory from pool

bigger request的调用链

PyObject_Malloc(size_t size)
    -> _PyObject.malloc  == (PYOBJ_ALLOC == PYMALLOC_ALLOC).malloc == _PyObject_Malloc
    _PyObject_Malloc(void *ctx, size_t nbytes)
        get NULL from pymalloc_alloc
        -> PyMem_RawMalloc(size_t size)
            -> malloc(size_t size) of CRT

PyMem_系列

obmalloc.c中：

#define MALLOC_ALLOC {NULL, _PyMem_RawMalloc, _PyMem_RawCalloc, _PyMem_RawRealloc, _PyMem_RawFree}
#ifdef WITH_PYMALLOC
#  define PYMALLOC_ALLOC {NULL, _PyObject_Malloc, _PyObject_Calloc, _PyObject_Realloc, _PyObject_Free}
#endif

#define PYRAW_ALLOC MALLOC_ALLOC
#ifdef WITH_PYMALLOC
#  define PYOBJ_ALLOC PYMALLOC_ALLOC
#else
#  define PYOBJ_ALLOC MALLOC_ALLOC
#endif
#define PYMEM_ALLOC PYOBJ_ALLOC

默认是定义了WITH_PYMALLOC的，也就是说默认选用_PyObject_*函数族而不是_PyMem_Raw*

WITH_PYMALLOC

PYMEM_ALLOC→PYOBJ_ALLOC→PYMALLOC_ALLOC→_PyObject_*函数族

没有WITH_PYMALLOC

PYMEM_ALLOC→PYOBJ_ALLOC→MALLOC_ALLOC→_PyMem_Raw*函数族

TODO

bpo-37029: keep usable_arenas in sorted order without searching (#13… · python/cpython@1c263e3 (github.com)

内存管理模型#

几种内存单位#

block在pool中的组织方式#

pool的结构#

pool三种状态#

在未释放过block的pool中申请新的空间#

在满的 pool 中进行释放#

在有空余空间的 pool 中进行释放#

在释放过block的pool中申请新的空间#

pool在arena和usedpools中的组织方式#

内存请求的过程#

arena中对pool的管理#

在未释放过pool的arena中申请新的空间#

在arena中释放空间#

在释放过pool的arena中申请新的空间#

arena的管理#

一些API#

obmalloc中的调用链#

TODO#