You may be familiar with "the LD_PRELOAD trick". This "trick" is used to implement things like heaptrack. By interposing a third library between an application and libc's malloc/free you can track the state of the heap and recognize errors like double frees and memory leaks. But this doesn't work for libraries loaded with RTLD_LOCAL, which is the default behavior of dlopen. Why not? Let's look at how this sort of linking works normally first, and then we can figure out why it goes wrong with RTLD_LOCAL.
Unless a program is completely statically linked, it will contain undefined symbols in its symbol table. You can see these by running readelf -s on the program. Even the most trivial programs, such as /bin/true, will have some.
    ~/dev/scratch$ readelf -s /bin/true

    Symbol table '.dynsym' contains 59 entries:
       Num:    Value          Size Type    Bind   Vis      Ndx Name
         0: 0000000000000000     0 NOTYPE  LOCAL  DEFAULT  UND
         1: 0000000000000000     0 FUNC    GLOBAL DEFAULT  UND free@GLIBC_2.2.5 (2)
         2: 0000000000000000     0 FUNC    GLOBAL DEFAULT  UND abort@GLIBC_2.2.5 (2)
         3: ....
Note the UND in the Ndx column for the undefined symbols. This output says that /bin/true expects to find the free@GLIBC_2.2.5 symbol in another library at runtime.
These other libraries come from DT_NEEDED entries in the ELF object's .dynamic section. readelf -d will show you these.
    ~/dev/scratch$ readelf -d /bin/true

    Dynamic section at offset 0x8c98 contains 27 entries:
      Tag                Type        Name/Value
      0x0000000000000001 (NEEDED)    Shared library: [libc.so.6]
      ....
/bin/true only lists a single library here; more complicated binaries can list several. Those libraries can have their own DT_NEEDED entries that point to other libraries. libc.so.6, for instance, will itself require the dynamic linker, ld-linux-x86-64.so.2. Running ldd on a binary will show you the full list.
    ~/dev/scratch$ ldd /bin/true
        linux-vdso.so.1 (0x00007fffe6099000)
        libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x00007f65a0af3000)
        /lib64/ld-linux-x86-64.so.2 (0x00007f65a0d1f000)
(The linux-vdso.so.1 library is provided by the kernel and is not important here.)
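A running process can also list the objects it has loaded, which gives roughly the same picture as ldd but from the inside. The following is a minimal sketch that assumes glibc on Linux and uses dl_iterate_phdr from link.h; the helper names print_object and list_loaded_objects are made up for this example.

```c
#define _GNU_SOURCE /* dl_iterate_phdr is a GNU extension */
#include <link.h>
#include <stdio.h>

/* Callback invoked once per loaded object. */
static int print_object(struct dl_phdr_info *info, size_t size, void *data)
{
    (void)size;
    int *count = data;
    /* The main program is reported with an empty name. */
    const char *name = (info->dlpi_name != NULL && info->dlpi_name[0] != '\0')
                           ? info->dlpi_name
                           : "(main executable)";
    printf("%2d: %s\n", *count, name);
    (*count)++;
    return 0; /* returning zero keeps the iteration going */
}

/* Hypothetical helper: prints every loaded object and returns how many there were. */
int list_loaded_objects(void)
{
    int count = 0;
    dl_iterate_phdr(print_object, &count);
    return count;
}
```

Calling list_loaded_objects() from main in a dynamically linked program should print the executable followed by its shared libraries, including the vdso.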
ELF objects specify the symbols they need and they specify any additional shared libraries they need, but they don't actually specify which symbols come from which objects. Instead, the dynamic linker searches all the objects, starting with the original executable and then proceeding through the DT_NEEDED entries in turn. The LD_DEBUG environment variable can be used to have the dynamic linker explain what it is doing.

For instance, for that abort@GLIBC_2.2.5 symbol at the top of this post, LD_DEBUG=all outputs the following:
    52530: symbol=abort; lookup in file=/bin/true
    52530: symbol=abort; lookup in file=/lib/x86_64-linux-gnu/libc.so.6
    52530: binding file /bin/true to /lib/x86_64-linux-gnu/libc.so.6 : normal symbol `abort' [GLIBC_2.2.5]
We see that the dynamic linker first looked for it in /bin/true, and then when it was not found there, it looked for it in libc.so.6. The symbol is defined there, so that one was used. Here's another symbol:
    52530: symbol=_dl_argv; lookup in file=/bin/true
    52530: symbol=_dl_argv; lookup in file=/lib/x86_64-linux-gnu/libc.so.6
    52530: symbol=_dl_argv; lookup in file=/lib64/ld-linux-x86-64.so.2
    52530: binding file /lib/x86_64-linux-gnu/libc.so.6 to /lib64/ld-linux-x86-64.so.2 : normal symbol `_dl_argv' [GLIBC_PRIVATE]
This one was not found in /bin/true, nor was it found in libc.so.6, so the search continued to the third library, where it was finally found in ld-linux-x86-64.so.2.
The magic of LD_PRELOAD is that it lets you insert additional libraries near the beginning of this search list. Symbols in the LD_PRELOADed libraries will then be preferred to symbols in the normally loaded libraries. For example:
    ~/dev/scratch$ LD_PRELOAD=~/dev/obj-rr/lib/rr/librrpreload.so ldd /bin/true
        linux-vdso.so.1 (0x00007ffeab1e9000)
        /home/khuey/dev/obj-rr/lib/rr/librrpreload.so (0x00007f2deb0fc000)
        libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x00007f2deaedd000)
        libdl.so.2 => /lib/x86_64-linux-gnu/libdl.so.2 (0x00007f2deaed7000)
        libpthread.so.0 => /lib/x86_64-linux-gnu/libpthread.so.0 (0x00007f2deaeb4000)
        /lib64/ld-linux-x86-64.so.2 (0x00007f2deb11c000)
Here rr's librrpreload.so library has been preloaded, and you can see it appears in the search list ahead of libc.so.6 and ld-linux-x86-64.so.2 (it also brings along its own new dependencies on libdl.so.2 and libpthread.so.0). If librrpreload.so contained a definition for the abort@GLIBC_2.2.5 symbol, then the dynamic linker would use the version in librrpreload.so and not the version in libc.so.6.
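The "first definition in the search list wins" rule can be modeled in a few lines of C. This is a toy sketch, not how the dynamic linker is implemented: toy_object, toy_lookup, and libpreload.so are names invented for illustration, and real lookups also deal with symbol versions, hash tables, and binding types, all of which are ignored here.

```c
#include <stddef.h>
#include <string.h>

#define TOY_MAX_SYMS 4

/* Hypothetical, simplified stand-in for a loaded ELF object. */
struct toy_object {
    const char *name;                  /* e.g. "libc.so.6" */
    const char *defined[TOY_MAX_SYMS]; /* symbols it defines, NULL-terminated */
};

/* Walk the search list in order; the first object defining the symbol wins. */
static const char *toy_lookup(const struct toy_object *list, size_t n,
                              const char *symbol)
{
    for (size_t i = 0; i < n; i++)
        for (size_t j = 0; j < TOY_MAX_SYMS && list[i].defined[j] != NULL; j++)
            if (strcmp(list[i].defined[j], symbol) == 0)
                return list[i].name;
    return NULL; /* not defined anywhere in the list */
}

/* Search list without a preload: the executable, then its dependencies. */
static const struct toy_object toy_plain[] = {
    { "/bin/true",            { NULL } },
    { "libc.so.6",            { "abort", "free", NULL } },
    { "ld-linux-x86-64.so.2", { "_dl_argv", NULL } },
};

/* Same list with a made-up preload library that also defines abort. */
static const struct toy_object toy_preloaded[] = {
    { "/bin/true",            { NULL } },
    { "libpreload.so",        { "abort", NULL } },
    { "libc.so.6",            { "abort", "free", NULL } },
    { "ld-linux-x86-64.so.2", { "_dl_argv", NULL } },
};
```

In this model, looking up abort in toy_plain yields libc.so.6, while the same lookup in toy_preloaded yields libpreload.so, because the preload sits earlier in the list.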
While sometimes replacing a symbol is sufficient, often LD_PRELOAD is used to wrap a symbol (as in heaptrack) with some additional logic. This requires a way for the wrapper function in the preload library to find the "original" symbol that would have been used if the preload library had not been present. The dynamic linker exposes a function dlsym that can do exactly that: it takes a handle argument that can have the special value RTLD_NEXT, which roughly means "find the next location of this symbol after me". So, a malloc-wrapping library can call dlsym(RTLD_NEXT, "malloc") to get the "normal" malloc, do any custom processing it wants, and then forward the call to the "normal" symbol.
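A wrapper along those lines might look like the sketch below. It assumes glibc, counts calls as a stand-in for "custom processing", and is not taken from heaptrack or rr; the accessor wrapped_malloc_calls is a hypothetical name, and the lazy initialization is deliberately simplistic (a real tool would need to think harder about threads and about allocations made during startup).

```c
#define _GNU_SOURCE /* needed for RTLD_NEXT on glibc */
#include <dlfcn.h>
#include <stdatomic.h>
#include <stddef.h>

/* The "normal" malloc, resolved on first use. */
static void *(*real_malloc)(size_t);
static atomic_ulong malloc_calls;

/* Hypothetical accessor so a tool could report the count. */
unsigned long wrapped_malloc_calls(void)
{
    return atomic_load(&malloc_calls);
}

/* Our definition is found first, because a preload sits early in the search list. */
void *malloc(size_t size)
{
    if (real_malloc == NULL) {
        /* Resume the symbol search after the object that contains this code. */
        real_malloc = (void *(*)(size_t))dlsym(RTLD_NEXT, "malloc");
        if (real_malloc == NULL)
            return NULL; /* no later definition is visible from here */
    }
    atomic_fetch_add(&malloc_calls, 1); /* the custom processing */
    return real_malloc(size);           /* forward to the normal symbol */
}
```

Built as a shared object (for example with cc -shared -fPIC wrap.c -o libwrap.so -ldl, where the file names are placeholders) and loaded through LD_PRELOAD, this definition of malloc would be bound ahead of the one in libc.so.6.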
You might wonder how dlsym implements the "after me" part of that description. After all, there's no "me" parameter to dlsym. dlsym actually looks at the return address on the stack to determine what the calling ELF object is. It can then find that object in the search list and resume the search after that object.
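The idea of mapping a return address back to an ELF object can be tried out with public APIs. The sketch below is only an analogy for what dlsym does internally: it assumes glibc's dladdr and the GCC/Clang builtin __builtin_return_address, and calling_object is a name invented for this example.

```c
#define _GNU_SOURCE /* dladdr is a GNU extension */
#include <dlfcn.h>
#include <stddef.h>

/* Returns the path of the object containing our caller, or NULL if unknown.
   noinline keeps a real call frame so the return address is meaningful. */
__attribute__((noinline))
const char *calling_object(void)
{
    Dl_info info;
    void *return_address = __builtin_return_address(0);

    /* dladdr returns 0 when the address is not inside any loaded object. */
    if (dladdr(return_address, &info) == 0 || info.dli_fname == NULL)
        return NULL;
    return info.dli_fname;
}
```

Called from the main program this reports the executable; called from code inside a shared library it reports that library instead, which is the kind of information an "after me" search needs.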
The way dynamic symbol lookup works may seem a bit brittle. Libraries have to be careful not to step on each other's toes by using the same symbol names, or the order in which they are searched needs to be managed to ensure that symbol lookups bind to the right values. But if LD_PRELOAD is not used, the symbol search results are all effectively determined at build time, so bugs in dynamic linking tend to be rare. Dynamic loading with dlopen changes that though.
dlopen allows for the construction of different library load orders at runtime. dlopen also introduces the concept of "scopes". There are RTLD_GLOBAL and RTLD_LOCAL (the default) options for dlopen. RTLD_GLOBAL adds the loaded library to the normal symbol search list (now called the global scope or scope 0) as if it had been loaded with DT_NEEDED at application startup. RTLD_LOCAL, on the other hand, adds the loaded library to a search list that is specific to the current ELF object (called the local scope or scope 1). RTLD_LOCAL is also transitive, meaning that the DT_NEEDED dependencies of a library opened with RTLD_LOCAL will themselves be loaded as RTLD_LOCAL and not added to the global scope.
This is very poorly documented in the man pages but it is visible when using LD_DEBUG to see what the dynamic linker is doing.
The existence of the local scope is fine for most purposes. Typically, when a library is loaded via dlopen, symbols in it will be looked up by using dlsym with the handle to that specific library, so the search order does not matter. RTLD_LOCAL also ensures that symbols loaded as part of one dynamically loaded library don't interfere with another dynamically loaded library loaded later. And since the global scope is searched first before the local scope, symbol lookups during the dlopen behave as one would expect.
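The typical handle-based pattern looks roughly like this. It is a minimal sketch: load_local_function and the unary_fn type are names made up for the example, and the usage below assumes a glibc system where libm.so.6 is installed.

```c
#include <dlfcn.h>
#include <stddef.h>

typedef double (*unary_fn)(double);

/* Load a library into a local scope and fetch one function from it by handle.
   On success the handle is stored in *handle_out so the caller can dlclose it. */
unary_fn load_local_function(const char *library, const char *symbol,
                             void **handle_out)
{
    /* RTLD_LOCAL is the default; it is spelled out here for clarity. */
    void *handle = dlopen(library, RTLD_NOW | RTLD_LOCAL);
    if (handle == NULL)
        return NULL;

    /* Looking up through the handle does not depend on the global search order. */
    unary_fn fn = (unary_fn)dlsym(handle, symbol);
    if (fn == NULL) {
        dlclose(handle);
        return NULL;
    }
    *handle_out = handle;
    return fn;
}
```

For example, load_local_function("libm.so.6", "cos", &handle) should return a usable pointer to cos without adding libm's symbols to the global scope.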
Where this does cause problems though is in conjunction with symbol wrapping and LD_PRELOAD. Suppose a symbol in one of the RTLD_LOCAL-loaded libraries is interposed by an LD_PRELOADed library (which, by definition, is in the global scope). The symbol search as part of the dlopen will bind to the preloaded library, because that is at the beginning of the global scope and far earlier than anything in the local scope. When that symbol is actually executed, the preloaded library will use dlsym to try to find the "normal" symbol to forward the call to. But the local scope for the preloaded library is different than the local scope for the binary that called dlopen. None of the RTLD_LOCAL-loaded libraries will be in scope for the preloaded library, and the dlsym(RTLD_NEXT, ...) call will fail, leaving the preloaded library unable to forward the call.
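This failure can be reproduced in a toy model of the two scopes. Everything here is a simplification invented for illustration: the object names, the single made-up symbol wrapped_symbol, and the rule that a lookup searches the global scope first and then the local scope of the object doing the lookup. It is not the dynamic linker's real data structure.

```c
#include <stddef.h>
#include <string.h>

enum { APP, PRELOAD, LIBC, EXTENSION, LIBSTDCXX, N_OBJECTS };
#define END (-1)

static const char *const toy_name[N_OBJECTS] = {
    "app", "libpreload.so", "libc.so.6", "extension.so", "libstdc++.so.6",
};
/* Which object defines the (made-up) symbol; NULL means it defines nothing. */
static const char *const toy_defines[N_OBJECTS] = {
    NULL, "wrapped_symbol", NULL, NULL, "wrapped_symbol",
};

static const int toy_global[]       = { APP, PRELOAD, LIBC, END };
static const int toy_global_fixed[] = { APP, PRELOAD, LIBSTDCXX, LIBC, END };
static const int toy_dlopened[]     = { EXTENSION, LIBSTDCXX, END };
static const int toy_empty[]        = { END };

/* Local scope used for lookups made by each object. */
static const int *const toy_local[N_OBJECTS] = {
    toy_empty, toy_empty, toy_empty, toy_dlopened, toy_dlopened,
};

/* Search the global scope, then the caller's local scope. With after_caller
   set, entries up to and including the caller are skipped (like RTLD_NEXT). */
static const char *toy_resolve(const int *global, int caller,
                               const char *symbol, int after_caller)
{
    const int *scopes[2] = { global, toy_local[caller] };
    int skipping = after_caller;

    for (int s = 0; s < 2; s++) {
        for (const int *o = scopes[s]; *o != END; o++) {
            if (skipping) {
                if (*o == caller)
                    skipping = 0;
                continue;
            }
            if (toy_defines[*o] != NULL && strcmp(toy_defines[*o], symbol) == 0)
                return toy_name[*o];
        }
    }
    return NULL;
}
```

In this model the extension's reference binds to libpreload.so, but the preload's own "next" lookup returns NULL because libstdc++.so.6 is only in a local scope it cannot see. With toy_global_fixed, where libstdc++.so.6 sits in the global scope, the same lookup succeeds.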
This happened to us in rr issue #3304. librrpreload.so wraps certain libstdc++.so symbols (to disable rdrand in std::random_device). In this issue the primary application was python, which is a C, not C++, program, so it does not load libstdc++.so via DT_NEEDED. The user's python script loads a python extension, which is a binary that is dlopened with RTLD_LOCAL. That extension, in turn, does use libstdc++.so, which is transitively loaded with RTLD_LOCAL. The conditions for this failure are now present: when the interposed function is called, execution ends up in librrpreload.so's wrapper, and when it attempts to dlsym(RTLD_NEXT, ...) the symbol, that fails, because libstdc++.so is not present in the global scope.
dlsym determines which binary's scopes to use the same way it determines the current binary for RTLD_NEXT: by using the return address on the stack. This precludes an actual solution, because in order to do the correct dlsym lookup we need to use the scope of the library that loaded the RTLD_LOCAL libraries. But for RTLD_NEXT we need to start searching from the preloaded library, and since both are determined by the same address, there's no way to do both.
Barring a future dynamic linker API, we've settled for recognizing this situation and printing an error message telling the user to force libstdc++.so into the global scope (and ironically enough, the easiest way to do that is via LD_PRELOAD).