dlmopen: From Getting Started to Giving Up

Posted on 2025-01-12, updated on 2025-12-02, filed under C/C++

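As of January 12, 2025, after two weeks of struggling, my plan to use dlmopen for loading third-party shared libraries on our serverless platform is officially bankrupt.

Overall, although dlmopen has existed in glibc for about two decades, it is still very immature today, and many details remain unsolved.

To actually use dlmopen you need a deep understanding of how glibc implements it, and you have to battle the possible bugs in unmerged patches. The return on that investment is poor.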

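Background

The design of the serverless platform is confidential, so I will skip it.

In short, in the final design the module execution engine can load several modules at once, and different modules need full symbol isolation from each other.

Each module can also load its own third-party libraries, so those need symbol isolation as well.

Symbol isolation between modules is handled by the compile engine and the execution engine, but symbol isolation for third-party libraries can only rely on dlmopen.

A defining trait of dlopen is that it caches opened libraries: opening the same library several times returns the same mapping in virtual memory. dlmopen takes a namespace as its first argument, and only namespace isolation lets the same library be loaded multiple times into different mappings.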

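dlopen experiment

Build two shared libraries, liba.so and libb.so, with libb.so depending on liba.so.

Then test which cases end up pointing at the same instance and which at different instances.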

liba.so

#include <stdio.h>

// Global variable
int global_var = 42;

// Return the value of the global variable
int get_global_var() {
    return global_var;
}

// Set the value of the global variable
void set_global_var(int value) {
    global_var = value;
    printf("global_var set to %d\n", global_var);
}

// Print the address of the global variable
void print_global_var_address() {
    printf("Address of global_var: %p\n", (void*)&global_var);
}

Build

gcc -shared -fPIC -o liba.so a.c

libb.so

#include <stdio.h>

// Declare the functions exported by liba.so
extern int get_global_var();
extern void set_global_var(int value);
extern void print_global_var_address();

void libb_function() {
    // Reference liba.so so that it becomes a dependency; this function is never actually called
    print_global_var_address();
    get_global_var();
    set_global_var(100);
}

Build

gcc -shared -fPIC -o libb.so b.c -L. -la && patchelf --set-rpath . libb.so

Main program

#include <stdio.h>
#include <dlfcn.h>

typedef int (*get_global_var_t)();
typedef void (*set_global_var_t)(int);
typedef void (*print_global_var_address_t)();

int main() {
    // Open liba.so, then open a second handle using one of the variants below
    void *handle1 = dlopen("./liba.so", RTLD_LAZY);
    //void *handle2 = dlopen("./copy_of_liba.so", RTLD_LAZY);
    //void *handle2 = dlopen("./liba.so", RTLD_LAZY);
    //void *handle2 = dlopen("/root/cpp_test/dlopen_so1/liba.so", RTLD_LAZY);
    //void *handle2 = dlopen("./liba1.so", RTLD_LAZY);
    void *handle2 = dlopen("./libb.so", RTLD_LAZY);

    if (!handle1 || !handle2) {
        fprintf(stderr, "Error loading library: %s\n", dlerror());
        return 1;
    }

    // Look up the function pointers
    get_global_var_t get_global_var1 = (get_global_var_t)dlsym(handle1, "get_global_var");
    set_global_var_t set_global_var1 = (set_global_var_t)dlsym(handle1, "set_global_var");
    print_global_var_address_t print_address1 = (print_global_var_address_t)dlsym(handle1, "print_global_var_address");

    get_global_var_t get_global_var2 = (get_global_var_t)dlsym(handle2, "get_global_var");
    set_global_var_t set_global_var2 = (set_global_var_t)dlsym(handle2, "set_global_var");
    print_global_var_address_t print_address2 = (print_global_var_address_t)dlsym(handle2, "print_global_var_address");

    // Print the address of the global variable
    printf("Handle 1:\n");
    print_address1();
    printf("Handle 2:\n");
    print_address2();

    // Modify the global variable through handle1
    printf("Setting global_var via handle1 to 100...\n");
    set_global_var1(100);

    // Read back the value of the global variable
    printf("Value of global_var via handle1: %d\n", get_global_var1());
    printf("Value of global_var via handle2: %d\n", get_global_var2());

    // Close the libraries
    dlclose(handle1);
    dlclose(handle2);

    return 0;
}

If both handles refer to the same instance, global_var has the same address and the same value.

The output looks like this:

Handle 1:
Address of global_var: 0x7fe80bf67028
Handle 2:
Address of global_var: 0x7fe80bf67028
Setting global_var via handle1 to 100...
global_var set to 100
Value of global_var via handle1: 100
Value of global_var via handle2: 100

If they are different instances, the opposite happens and the output looks like this:

Handle 1:
Address of global_var: 0x7f876b22d028
Handle 2:
Address of global_var: 0x7f876b228028
Setting global_var via handle1 to 100...
global_var set to 100
Value of global_var via handle1: 100
Value of global_var via handle2: 42

You can also check /proc/{pid}/maps to see how many times the library was loaded.

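Results

Scenario	Path passed to the second dlopen	Result
Copy of liba.so	./copy_of_liba.so	Two instances
Same liba.so again	./liba.so	Same instance
Absolute path to liba.so	/root/cpp_test/dlopen_so1/liba.so	Same instance
Symlink liba1.so -> liba.so	./liba1.so	Same instance
Open libb.so	./libb.so	Same instance

So unless the path resolves to a genuinely different file (a copy, as opposed to the same path, an absolute path, or a symlink), dlopen yields a single instance, even when the library is pulled in as a dependency of libb.so.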

dlmopen shortcomings

Limit on the number of namespaces

This one is visible right in man dlmopen, which is also a good place to learn how dlmopen is used.

#define _GNU_SOURCE
#include <dlfcn.h>

void *dlmopen (Lmid_t lmid, const char *filename, int flags);

The man page then explains the function:

This function performs the same task as dlopen()—the filename and flags arguments, as well as the return value, are the same, except for the differences noted below.

The dlmopen() function differs from dlopen() primarily in that it accepts an additional argument, lmid, that specifies the link-map list (also referred to as a namespace) in which the shared object should be loaded. By comparison, dlopen() adds the dynamically loaded shared object to the same namespace as the shared object from which the dlopen() call is made. The Lmid_t type is an opaque handle that refers to a namespace.

The lmid argument is either the ID of an existing namespace (which can be obtained using the dlinfo(3) RTLD_DI_LMID request) or one of the following special values:

LM_ID_BASE: Load the shared object in the initial namespace (i.e., the application's namespace).

LM_ID_NEWLM: Create a new namespace and load the shared object in that namespace. The object must have been correctly linked to reference all of the other shared objects that it requires, since the new namespace is initially empty.

If filename is NULL, then the only permitted value for lmid is LM_ID_BASE.

It then spells out the limitations:

NOTES

dlmopen() and namespaces

A link-map list defines an isolated namespace for the resolution of symbols by the dynamic linker. Within a namespace, dependent shared objects are implicitly loaded according to the usual rules, and symbol references are likewise resolved according to the usual rules, but such resolution is confined to the definitions provided by the objects that have been (explicitly and implicitly) loaded into the namespace.

The dlmopen() function permits object-load isolation—the ability to load a shared object in a new namespace without exposing the rest of the application to the symbols made available by the new object. Note that the use of the RTLD_LOCAL flag is not sufficient for this purpose, since it prevents a shared object's symbols from being available to any other shared object. In some cases, we may want to make the symbols provided by a dynamically loaded shared object available to (a subset of) other shared objects without exposing those symbols to the entire application. This can be achieved by using a separate namespace and the RTLD_GLOBAL flag.

The dlmopen() function also can be used to provide better isolation than the RTLD_LOCAL flag. In particular, shared objects loaded with RTLD_LOCAL may be promoted to RTLD_GLOBAL if they are dependencies of another shared object loaded with RTLD_GLOBAL. Thus, RTLD_LOCAL is insufficient to isolate a loaded shared object except in the (uncommon) case where one has explicit control over all shared object dependencies.

Possible uses of dlmopen() are plugins where the author of the plugin-loading framework can't trust the plugin authors and does not wish any undefined symbols from the plugin framework to be resolved to plugin symbols. Another use is to load the same object more than once. Without the use of dlmopen(), this would require the creation of distinct copies of the shared object file. Using dlmopen(), this can be achieved by loading the same shared object file into different namespaces.

The glibc implementation supports a maximum of 16 namespaces.

That last line is the key point: glibc supports at most 16 namespaces.

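Other implementations support an unlimited number of namespaces, for example https://github.com/robgjansen/elf-loader, but that project has only 19 stars and has not been updated in years.

gdb cannot resolve symbols inside the library

For example, take this code, which deliberately triggers a coredump: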

#include <stdio.h>

void say_hello() {
    int *ptr = 0;
    *ptr = 1;
    printf("Hello from shared library!\n");
}

Then open it with dlmopen:

#define _GNU_SOURCE
#include <dlfcn.h>
#include <stdio.h>
#include <link.h>

int main() {
    // Load the shared library into a new namespace
    void *handle = dlmopen(LM_ID_NEWLM, "./libhello.so", RTLD_LAZY);
    //void *handle = dlopen("./libhello.so", RTLD_LAZY);
    if (!handle) {
        fprintf(stderr, "Error loading library: %s\n", dlerror());
        return 1;
    }

    // Look up the function pointer
    void (*say_hello)() = dlsym(handle, "say_hello");
    if (!say_hello) {
        fprintf(stderr, "Error finding symbol: %s\n", dlerror());
        dlclose(handle);
        return 1;
    }

    // Call the function
    say_hello();

    // Close the library
    dlclose(handle);
    return 0;
}

Running under gdb, the symbols are lost:

(gdb) r
Starting program: /root/cpp_test/dlmopen/main 

Program received signal SIGSEGV, Segmentation fault.
0x00007ffff7fc7131 in ?? ()
(gdb) bt
#0  0x00007ffff7fc7131 in ?? ()
#1  0x00005555555551ed in main ()

With dlopen there is no problem:

(gdb) r
Starting program: /root/cpp_test/dlmopen/main 

Program received signal SIGSEGV, Segmentation fault.
0x00007ffff7fc7131 in say_hello () from ./libhello.so
(gdb) bt
#0  0x00007ffff7fc7131 in say_hello () from ./libhello.so
#1  0x000055555555527a in main ()

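Plenty of people have reported this problem: https://stackoverflow.com/questions/51592455/debugging-strategies-for-libraries-open-with-dlmopen

So few people used dlmopen back then that gdb simply did not support it. gdb only fixed this in 2021:

https://sourceware.org/git/gitweb.cgi?p=binutils-gdb.git;h=8d56636a0ecbe6c38bf52b0683326ee21693c548

You need to upgrade to at least gdb 13.1.

Building gdb statically is another hard problem. I tried for a long time without success, but luckily there is a ready-made GitHub project: https://github.com/guyush1/gdb-static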

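A newly opened namespace is completely empty, not a fork of the current one

This problem is also documented at https://stackoverflow.com/questions/33497784/unresolved-symbol-with-only-dlmopen-and-not-dlopen

From the man page alone, I did not really grasp dlmopen at first.

That changed when a business module hit the following case.

Shared library code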

#include <stdio.h>
#include <pthread.h>

void *function(void *arg){
    return NULL;
}

void say_hello() {
    printf("Hello from shared library!\n");
    pthread_t thread1;
    pthread_create(&thread1,NULL,function,(char*)"111");
}

Build the library and inspect its undefined symbols:

~ gcc -fPIC -shared -o libhello.so hello.c
~ ldd -r libhello.so 
    linux-vdso.so.1 (0x00007ffff11b2000)
    libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x00007fd512448000)
    /lib64/ld-linux-x86-64.so.2 (0x00007fd51264f000)
undefined symbol: pthread_create    (./libhello.so)

Main program

A small tweak of the previous code so that the main program depends on the pthread library:

#define _GNU_SOURCE
#include <dlfcn.h>
#include <stdio.h>
#include <link.h>
#include <pthread.h>

int main() {
    void* ptr = (void*) pthread_create; // make the main program depend on the pthread library
    // Load the shared library into a new namespace
    void *handle = dlmopen(LM_ID_NEWLM, "./libhello.so", RTLD_LAZY);
    if (!handle) {
        fprintf(stderr, "Error loading library: %s\n", dlerror());
        return 1;
    }

    // Look up the function pointer
    void (*say_hello)() = dlsym(handle, "say_hello");
    if (!say_hello) {
        fprintf(stderr, "Error finding symbol: %s\n", dlerror());
        dlclose(handle);
        return 1;
    }

    // Call the function
    say_hello();

    // Close the library
    dlclose(handle);
    return 0;
}

Build and check the dependencies:

~ gcc -g -Wall main.c -ldl -lpthread
~ ldd -r a.out 
    linux-vdso.so.1 (0x00007fff12d72000)
    libdl.so.2 => /lib/x86_64-linux-gnu/libdl.so.2 (0x00007fab172a7000)
    libpthread.so.0 => /lib/x86_64-linux-gnu/libpthread.so.0 (0x00007fab17284000)
    libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x00007fab17092000)
    /lib64/ld-linux-x86-64.so.2 (0x00007fab172c2000)

The pthread library is clearly pulled in, yet running the program fails:

~ ./a.out 
Hello from shared library!
./a.out: symbol lookup error: ./libhello.so: undefined symbol: pthread_create

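Well, this is a real hassle. It means every library already loaded has to be enumerated by hand. dl_iterate_phdr can do that; called from the main program, it reports only the objects in the default namespace.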

#include <iostream>
#include <link.h>

int callback(struct dl_phdr_info *info, size_t size, void *data) {
    if (info->dlpi_name && *info->dlpi_name) {
        std::cout << "Loaded library: " << info->dlpi_name << std::endl;
    }
    return 0;
}

int main() {
    dl_iterate_phdr(callback, nullptr);
    return 0;
}

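But shared libraries depend on one another, so you would have to use ldd -r to work out which symbols are missing and which symbols each library provides, build a dependency tree, and then load everything in order.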

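That is a real headache, but I soon found out there was no need to worry about it, because dlmopen has other problems of its own.

Incompatibility with pthread

The main symptom: once dlmopen has loaded a second copy of the pthread library, calling popen pins a CPU at 100%, strace shows no system calls, and the stack looks like this: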

#0  0x00007f04f463991e in __reclaim_stacks () from target:/lib/x86_64-linux-gnu/libpthread.so.0
#1  0x00007f04edbeec32 in fork () from target:/lib/x86_64-linux-gnu/libc.so.6
#2  0x00007f04edb8a774 in _IO_proc_open () from target:/lib/x86_64-linux-gnu/libc.so.6
#3  0x00007f04edb8a9f8 in popen () from target:/lib/x86_64-linux-gnu/libc.so.6
#4  0x0000000000cebd5b in __python_detail::PythonImpl::load_from_file_impl (async=<optimized out>) at python/python_loader.cpp:344
#5  0x00007f04ef925399 in uv__async_io (loop=0x4cf94e0 <__python_detail::py_loop>, w=<optimized out>, events=<optimized out>) at ../deps/uv/src/unix/async.c:150
#6  0x00007f04ef938198 in uv__io_poll (loop=loop@entry=0x4cf94e0 <__python_detail::py_loop>, timeout=-1) at ../deps/uv/src/unix/linux-core.c:431
#7  0x00007f04ef925c83 in uv_run (loop=loop@entry=0x4cf94e0 <__python_detail::py_loop>, mode=mode@entry=UV_RUN_DEFAULT) at ../deps/uv/src/unix/core.c:381
#8  0x0000000000cdf38e in __python_detail::PythonImpl::py_loader_impl_thread (ptr=0x7f03f817ad20) at python/python_loader.cpp:200
#9  0x00007f04f463a6db in start_thread () from target:/lib/x86_64-linux-gnu/libpthread.so.0
#10 0x00007f04edc2ba3f in clone () from target:/lib/x86_64-linux-gnu/libc.so.6

Searching around, I then found this: https://stackoverflow.com/questions/55008283/dlmopen-and-c-libraries

It is a bug in the glibc version of libpthreads, which results in libraries loaded with dlmopen returning duplicates for pthread_key_create, resulting in thread-specific storage being clobbered (same key means same memory location, it's like malloc returning the same memory area multiple times).

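Put simply, because the isolation is complete, each copy of libpthread hands out keys on its own, so pthread_key_create called in different namespaces can return the same key, and the thread-specific storage behind it gets mixed up.

It is as if malloc were wrapped in a shared library and loaded into different namespaces with dlmopen, and two malloc calls then returned the same address. That is bound to cause logical chaos.

Verification: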

#define _GNU_SOURCE
#include <assert.h>
#include <dlfcn.h>
#include <pthread.h>

typedef int (*pthread_key_create_t)(pthread_key_t *, void (*)(void*));
int main()
{
  pthread_key_t k1, k2;
  int rc;

  rc = pthread_key_create(&k1, NULL);
  void *h = dlmopen(LM_ID_NEWLM, "libpthread.so.0", RTLD_LAZY);
  assert(h != NULL);

  pthread_key_create_t fn = (pthread_key_create_t)dlsym(h, "pthread_key_create");
  assert(fn != NULL);

  rc = fn(&k2, NULL);
  assert(rc == 0);
  assert(k2 != k1);

  return 0;
}

Result:

~ gcc -g -Wall test.c -ldl -pthread
~ ./a.out 
a.out: test.c:21: main: Assertion `k2 != k1' failed.
Aborted (core dumped)

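This behavior arguably follows from dlmopen's semantics, but it is very unreasonable from a user's point of view.

In this issue, https://sourceware.org/bugzilla/show_bug.cgi?id=26955

the designers see two possible approaches:

One is full isolation only for what sits above glibc, which is not implemented yet (it is waiting on someone named Vivek to finish his patches)
The other is the current approach, full isolation of everything

I looked up Vivek's patches, a proof-of-concept implementation of RTLD_SHARED for dlmopen: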

The following patchset attempts an implementation for this: If an object is loaded with the new RTLD_SHARED flag we instead ensure that a "master" copy exists (and is flagged as no-delete) in the main namespace and a thin wrapper or clone is placed in the target namespace.

dlmopen will implicitly apply RTLD_SHARED to the libc/libpthread group

Nice, this is exactly what I need.

But the progress is worrying. According to https://patchwork.ozlabs.org/project/glibc/list/?submitter=73922, the author kept updating the series for 3 years and then went quiet in 2021.

The latest patch series is v14, [v14,0/7] Implementation of RTLD_SHARED for dlmopen, but that version still has some bugs. The author noted:

Looks like this has regressed wrt v12 - I suspect the reorganisation of the has_gnu_unique code, but not sure yet. Looking into it.

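He has his own shared library loading project, https://github.com/dezgeg/libcapsule/tree/master, and that project is still being updated intensively.

Summary

dlmopen has been around for many years, but it still needs more time to mature.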
