Hooking The Linux System Call Table

Hooking the Linux System Call Table | Tyler Nichols https://www.tnichols.org/2015/10/19/Hooking-the-Linux-Syst...
2015-10-19 | LINUX | KERNEL | SYSTEM CALLS | HOOKING
The Linux kernel maintains a table of pointers that reference various functions made
available to user space as a way of invoking privileged kernel functionality from
unprivileged user space applications. These functions are collectively known as system
calls.
Any legitimate software looking to hook kernel space functions should first consider
using existing infrastructure designed for such uses, like the Linux kernel tracepoints
framework or the Linux security module framework. Rootkits are about the only
reasonable application of these techniques, for some value of reasonable.
This code is an unintentional by-product of a project I was on at work. Considering the
pedagogical value of such an endeavor, I decided to strip out all the code responsible
for hooking the syscall table, distill it down into a single loadable kernel module that
can be easily understood on its own, and write it up.
You can find the source code here.
This code was written and tested on Ubuntu 14.04 LTS using the standard Ubuntu
Linux 3.13.x kernel.
And without further ado, let’s get started.
Introduction
Hooking the Linux system call table from within a loadable kernel module is not all
that difficult. After all, we are running with kernel privileges. We can do whatever we
want. We can dereference and overwrite any memory address at will.
This, of course, doesn’t mean that our reckless memory overwriting isn’t going to
cause problems. Done improperly, it almost certainly will.
That said, there are a few basic things we need to do:
1. Locate the system call table
2. Mark the segment of memory containing the system call table as writeable
   - By default, the syscall table is marked as read-only
3. Find the offset of the pointer to the function we want to hook in the syscall table
   - We will be targeting the “write” system call in this tutorial
4. Overwrite the appropriate pointer-sized entry in the syscall table (4 bytes on
   32-bit x86, 8 bytes on x86-64) with the address of a function we define, thus
   completing the hook
5. Mark the syscall table as read-only again
   - Even rootkits should clean up after themselves; don’t leave the place
     looking like a pig sty
Step number 1 is going to be the most difficult step by far.
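To make steps 3 through 5 concrete without touching a real kernel, here is a small user space sketch of the same idea: a table of function pointers, one entry of which is saved, overwritten, and later restored. Everything in it (handler_fn, NR_WRITE, real_write, install_hook and so on) is a made-up stand-in for illustration, not the module's actual code, and there is no write protection to deal with here.

```c
/* Hypothetical stand-in for a system call handler signature. */
typedef long (*handler_fn)(long arg);

/* Made-up table indices, playing the role of the kernel's __NR_* values. */
enum { NR_READ = 0, NR_WRITE = 1, NR_COUNT = 2 };

long real_read(long arg)  { return arg; }
long real_write(long arg) { return arg * 2; }

/* The "system call table": an array of function pointers. */
handler_fn table[NR_COUNT] = { real_read, real_write };

handler_fn original_write;  /* saved so the hook can be undone later */
int hook_calls;             /* counts how many times the hook has run */

/* Replacement handler: note the call, then defer to the original. */
long new_write(long arg)
{
    hook_calls++;
    return original_write(arg);
}

/* Save the old pointer, then overwrite the slot with our function. */
void install_hook(void)
{
    original_write = table[NR_WRITE];
    table[NR_WRITE] = new_write;
}

/* Put the saved pointer back, leaving the table as we found it. */
void remove_hook(void)
{
    table[NR_WRITE] = original_write;
}
```

Callers that go through table[NR_WRITE] never notice the difference, which is exactly what makes this technique attractive to rootkits.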
The Main Challenge
There are some flimsy mechanisms in place to discourage LKMs (loadable kernel
modules) from tampering with the syscall table for (hopefully obvious) security
reasons.
First and foremost, the static portion of the Linux kernel - i.e. the portion that doesn’t
reside in loadable kernel modules - does not export the syscall table symbol.
Why?
Because LKMs have no earthly business messing with the syscall table. The only valid
reason for an LKM to overwrite system call pointers is to corrupt the behavior of the
operating system, most often for concealment of malicious software.
Since the kernel does not export the syscall table symbol, we need to find it ourselves.
We do this by manually reading in and scanning the System.map-$(uname -r) file,
looking for the “sys_call_table” address. Once we have retrieved the address, we
simply need to find the appropriate offset for it based on the system call we’re trying
to hook, dereference it, and write to it.
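As a rough illustration of what "find the appropriate offset and dereference it" means, the sketch below treats a raw address as the base of an array of pointers and computes where entry number nr lives. The helper names table_slot and slot_offset are hypothetical, and the code runs in user space against an ordinary array rather than the real table.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Treat a raw address (such as one parsed from System.map) as the base of
 * an array of pointers and return the address of entry number nr.
 */
void **table_slot(uintptr_t table_addr, unsigned int nr)
{
    void **table = (void **)table_addr;
    return &table[nr];
}

/* Byte offset of entry nr: one pointer-sized slot per system call. */
size_t slot_offset(unsigned int nr)
{
    return (size_t)nr * sizeof(void *);
}
```

Writing through the pointer returned by table_slot() is the user space equivalent of overwriting one system call entry.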
This tutorial will show you how to hook system calls from a loadable kernel module
(LKM) in the Linux kernel, complete with a code walkthrough. The code presented here
has been tested and is known to work reliably.
Implementation
Although this code base has a few hundred lines to it, it’s actually very simple.
Much of the code simply handles logistics - nothing more. The two largest functions in
this example are responsible for 1) acquiring the version of the currently running
kernel so we can identify the correct System.map-$(uname -r) file to read from and 2)
reading in the System.map-$(uname -r) file line by line, checking each full line read to
see if it contains “sys_call_table”.
That’s it.
Once we’ve got the address of the sys call table, it’s trivial to overwrite. Let’s take a
look.
General Structure
There are a few things going on in this module. Much of the code comprises helper
functions that read files and parse strings. Other than the helpers, we have our
new_write() function, which is the function we hook into the sys call table, and our
standard __init and __exit functions for a loadable kernel module.
Important Defines
Toward the top of the code, you will notice:
#define PROC_V "/proc/version"
#define BOOT_PATH "/boot/System.map-"
#define MAX_VERSION_LEN 256
PROC_V is the file path to the /proc virtual filesystem location that contains version
information of the currently running kernel.
BOOT_PATH is the file path to the System.map-$(uname -r) file that we are looking for
sans appended version information. We have to retrieve the kernel version before we
can finish constructing this string.
MAX_VERSION_LEN is the maximum length of the version information buffer used to
store information read from the PROC_V address. We also use this define as a
maximum buffer length for storing newline-separated strings in System.map-$(uname
-r) as we parse it looking for the “sys_call_table” entry.
__init and __exit macros
In Linux loadable kernel modules, the function decorated with the __init macro (and
registered with module_init()) is the entry point to the module when it’s loaded, and
the function decorated with the __exit macro (and registered with module_exit()) is
the destructor function that’s executed when the module is unloaded.
Since it only takes a couple lines of code to place our hooks in this simple example, we
perform our dirty work directly in these functions. We’ll come back to these functions
in a few minutes.
Helper Functions
char *acquire_kernel_version (char *buf)
Reads version info from PROC_V and chops it down to just the string we want. We
need our version info to be in the same format that’s produced by $(uname -r).
First things first, we declare some variables:
struct file *proc_version;
char *kernel_version;
mm_segment_t oldfs;
Next, we have to change the legal virtual address space of this process to include the
kernel data segment. If we skip this step, the call to read the file will fail the user space
virtual address check performed by the kernel. In short, this allows us to read file
contents into kernel memory later on:
oldfs = get_fs();
set_fs (KERNEL_DS);
Once we’re set up to read data into kernel space without causing a fault, we open the
PROC_V file for reading and prepare our buffer:
proc_version = filp_open(PROC_V, O_RDONLY, 0);
if (IS_ERR(proc_version) || (proc_version == NULL)) {
    set_fs(oldfs); /* restore the address limit before bailing out */
    return NULL;
}
memset(buf, 0, MAX_VERSION_LEN);
And finally, we read in the contents of PROC_V up to the size of our buffer, leaving
the last byte zeroed so the result is always a terminated string:
vfs_read(proc_version, buf, MAX_VERSION_LEN - 1, &(proc_version->f_pos));
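We then tokenize the version information to extract just the information we want. The
piece of information we want is located in the third space-separated column that is
output by PROC_V, so we call strsep() once per column and keep the third token:
strsep(&buf, " ");                  /* skip "Linux" */
strsep(&buf, " ");                  /* skip "version" */
kernel_version = strsep(&buf, " "); /* the $(uname -r)-style string */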
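The same column extraction is easy to try out in user space. The sketch below is an assumption-laden stand-in rather than the module's code: third_field is a hypothetical helper, and it uses the standard strchr() instead of the kernel's strsep() so that it builds with any C compiler. Like strsep(), it modifies the buffer in place.

```c
#include <string.h>

/*
 * Return the third space-separated field of buf, or NULL if there are
 * fewer than three fields. buf is modified in place.
 */
char *third_field(char *buf)
{
    char *field = buf;
    char *end;
    int i;

    /* Skip over the first two fields. */
    for (i = 0; i < 2; i++) {
        char *space = strchr(field, ' ');
        if (space == NULL)
            return NULL;
        field = space + 1;
    }

    /* Terminate the third field at the next space, if there is one. */
    end = strchr(field, ' ');
    if (end != NULL)
        *end = '\0';
    return field;
}
```

Fed a line shaped like the contents of /proc/version ("Linux version <release> ..."), this hands back just the release string that System.map file names are built from.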

Close out the file elegantly:
filp_close(proc_version, 0);
Set the legally addressable virtual memory segment back to user space:
set_fs(oldfs);
Return the pointer to the final token produced by our calls to strsep():
return kernel_version;
And there we have it. We can now rely on this helper to gather the version information
we need for us.
int find_sys_call_table (char *kern_ver)
Given the $(uname -r)-style kernel version, this function builds the System.map-<version>
file name by appending kern_ver to BOOT_PATH, opens the file for reading, and reads the
file line by line.
First, we declare the variables we’re going to need, as usual:
char system_map_entry[MAX_VERSION_LEN];
int i = 0;
/*
* Holds the /boot/System.map-<version> file name as we build it
*/
char *filename;
/*
* Length of the System.map filename, terminating NULL included
*/
size_t filename_length = strlen(kern_ver) + strlen(BOOT_PATH) + 1;
/*
* This will point to our /boot/System.map-<version> file
*/
struct file *f = NULL;
mm_segment_t oldfs;
Here is the old memory address segment trick to switch from allowing only user space
references to also allowing kernel space references:
oldfs = get_fs();
set_fs (KERNEL_DS);
Some basic logging:
printk(KERN_EMERG "Kernel version: %s\n", kern_ver);
Allocate space for the System.map file name so we can build it:
filename = kmalloc(filename_length, GFP_KERNEL);
if (filename == NULL) {
    printk(KERN_EMERG "kmalloc failed on System.map-<version> filename allocation\n");
    set_fs(oldfs); /* restore the address limit before bailing out */
    return -1;
}
Zero out the memory in preparation for constructing the file name just to be safe:
memset(filename, 0, filename_length);
Build the “/boot/System.map-<version>” file name:
strncpy(filename, BOOT_PATH, strlen(BOOT_PATH));
strncat(filename, kern_ver, strlen(kern_ver));
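For comparison, here is a user space sketch of the same string construction done in a single bounded call. The helper name build_map_path is hypothetical, and using snprintf() is a substitution on my part rather than what the module does. It does have the advantage of reporting truncation instead of silently relying on the buffer having been sized correctly.

```c
#include <stddef.h>
#include <stdio.h>

#define BOOT_PATH "/boot/System.map-"

/*
 * Write BOOT_PATH followed by kern_ver into out (capacity cap bytes).
 * Returns 0 on success, -1 if the result would not fit.
 */
int build_map_path(char *out, size_t cap, const char *kern_ver)
{
    int n = snprintf(out, cap, "%s%s", BOOT_PATH, kern_ver);

    /* snprintf reports the length it wanted; compare it to the capacity. */
    return (n < 0 || (size_t)n >= cap) ? -1 : 0;
}
```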
Open the System.map file for reading:
f = filp_open(filename, O_RDONLY, 0);
if (IS_ERR(f) || (f == NULL)) {
    printk(KERN_EMERG "Error opening System.map-<version> file: %s\n", filename);
    kfree(filename); /* don't leak the file name or the address limit */
    set_fs(oldfs);
    return -1;
}
Zero out the system_map_entry buffer to be safe. The system_map_entry buffer is
going to be used to store each line in the System.map file as we iterate through it so
we can check it for the sys_call_table entry:
memset(system_map_entry, 0, MAX_VERSION_LEN);
We read the file one character at a time until we have read an entire line. We
determine that we’ve read an entire line by 1) checking for a newline (‘\n’) character or
2) checking to see if we have read in the maximum amount of data that our buffer can
hold (bounded by MAX_VERSION_LEN).
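The character-at-a-time loop is easier to reason about in isolation. The following user space sketch reads from an in-memory string instead of a file; next_line and its parameters are hypothetical names chosen for illustration. It stops at a newline, at the end of the input, or when the output buffer is full, and it always leaves the buffer terminated.

```c
#include <stddef.h>

/*
 * Copy the next line of src (starting at *pos) into out, one character at
 * a time. The newline is consumed but not stored. cap must be at least 1.
 * Returns the number of characters stored in out.
 */
size_t next_line(const char *src, size_t *pos, char *out, size_t cap)
{
    size_t i = 0;

    /* Leave room for the terminator: stop once i + 1 reaches cap. */
    while (src[*pos] != '\0' && i + 1 < cap) {
        char c = src[(*pos)++];
        if (c == '\n')
            break;
        out[i++] = c;
    }
    out[i] = '\0'; /* the buffer is always a valid C string */
    return i;
}
```

Reserving the last byte for the terminator is the important design choice here: any later string search over the buffer is then guaranteed to stop inside it.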
Once we have read in an entire line, we do a basic substring search (strstr()) to see if
our system_map_entry buffer contains the string “sys_call_table”. If it does, we
allocate some space to store the address in. Each line of the System.map file is in the
format:
<address> <type> <symbol name>
so we tokenize (strsep()) the system_map_entry buffer, which returns a pointer to the
first space-separated column in the line we’ve just read. That is, we get a pointer
straight to the address of the “sys_call_table” symbol, as per the System.map format
shown above.
Once we’ve got that pointer, we simply copy it into sys_string and then invoke kstrtoul
on sys_string to convert sys_string - which contains a string representation of the hex
address of the “sys_call_table” symbol as pulled from System.map-<version> - to an
unsigned long (4 bytes on 32-bit x86, 8 bytes on x86-64) address using base 16 (hex)
representation and write the value to our global syscall_table pointer:
while (vfs_read(f, system_map_entry + i, 1, &f->f_pos) == 1) {
    /*
     * If we've read an entire line or maxed out our buffer,
     * check to see if we've just read the sys_call_table entry.
     * We stop one byte short of the end so the zeroed buffer
     * always stays NUL-terminated for strstr().
     */
    if (system_map_entry[i] == '\n' || i >= MAX_VERSION_LEN - 2) {
        /* Reset the "column"/"character" counter for the row */
        i = 0;
        if (strstr(system_map_entry, "sys_call_table") != NULL) {
            char *sys_string;
            char *system_map_entry_ptr = system_map_entry;
            sys_string = kmalloc(MAX_VERSION_LEN, GFP_KERNEL);
            if (sys_string == NULL) {
                filp_close(f, 0);
                set_fs(oldfs);
                kfree(filename);
                return -1;
            }
            memset(sys_string, 0, MAX_VERSION_LEN);
            /* The first space-separated token is the hex address */
            strncpy(sys_string, strsep(&system_map_entry_ptr, " "), MAX_VERSION_LEN);
            kstrtoul(sys_string, 16, &syscall_table);
            printk(KERN_EMERG "syscall_table retrieved\n");
            kfree(sys_string);
            break;
        }
        memset(system_map_entry, 0, MAX_VERSION_LEN);
        continue;
    }
    i++;
}
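To experiment with the parsing step outside the kernel, here is a user space sketch that handles a single System.map-style line. It is not the module's code: parse_map_line is a hypothetical helper, strtoull() stands in for kstrtoul(), and the addresses used to exercise it are made-up samples. Unlike a plain substring search, it insists on an exact symbol match, so a line for a similarly named symbol such as ia32_sys_call_table is rejected.

```c
#include <stdlib.h>
#include <string.h>

/*
 * Parse one "<address> <type> <symbol>" line. If the symbol equals name,
 * store the address in *addr and return 1; otherwise return 0.
 */
int parse_map_line(const char *line, const char *name, unsigned long long *addr)
{
    char *end;
    unsigned long long value = strtoull(line, &end, 16);
    size_t len = strlen(name);

    /* Expect: hex digits, a space, one type character, a space. */
    if (end == line || end[0] != ' ' || end[1] == '\0' || end[2] != ' ')
        return 0;
    end += 3; /* now points at the symbol name */

    /* Require an exact match, optionally followed by a newline. */
    if (strncmp(end, name, len) != 0)
        return 0;
    if (end[len] != '\0' && end[len] != '\n')
        return 0;

    *addr = value;
    return 1;
}
```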
Once we’re done doing all that, we clean up after ourselves by closing out our file
handle, changing the addressable virtual memory segment back to user space, and
returning.
filp_close(f, 0);
set_fs(oldfs);
kfree(filename);
return 0;
At this point, the syscall_table pointer - which was declared to be global to the module
- now contains the address of the system call table as taken from
/boot/System.map-<version> and is ready to be dereferenced.
Placing the hooks
The __init onload function is the entry point to the module and is where our primary
logic resides since it’s so simple. After we allocate the required storage, we invoke the
find_sys_call_table() function with the result of an invocation to
acquire_kernel_version() passed in as an argument. By combining the two helpers
discussed previously, we are able to collect all the prerequisite information we need to
place our hooks:
char *kernel_version = kmalloc(MAX_VERSION_LEN, GFP_KERNEL);
find_sys_call_table(acquire_kernel_version(kernel_version));
After find_sys_call_table() returns, the global unsigned long syscall_table variable that
we declared at the top of our C file is populated and ready for manipulation.
However, there is one little caveat left: the memory address where sys_call_table
resides is not writeable. The processor itself will raise an exception if you try to write
to it all willy-nilly.
So what do we do? We use the Linux paravirtualization system to change bit 16
of the CR0 register. The CR0 register is one of the control registers in the x86
processor that affects basic CPU functionality. Bit 16 of the CR0 register (mask
0x10000) is the “Write Protect” bit that indicates to the processor that it cannot write
to read-only memory pages, even when running in kernel mode (ring 0). This is why the
CPU will raise an exception if you try to write to syscall_table right off the bat.
Even though the CPU will refuse to write to read-only memory pages when the WP bit
of the CR0 register is set, we are the kernel. We can just toggle that bit and continue
on our way.
Using the write_cr0 and read_cr0 helpers along with a bitmask for setting the
WP bit (bit 16 in the CR0 register) to 0, we can trivially disable write protection as shown
below.
Once that’s done, we simply dereference the appropriate offset for the system call we
want to overwrite by using the kernel-defined __NR_* indices, of which there is exactly 1
for each and every system call in the system. Using these predefined offsets, we write
the address of our new_write() function over the address of the system call write()
function:
if (syscall_table != NULL) {
    write_cr0 (read_cr0 () & (~ 0x10000));
    original_write = (void *)syscall_table[__NR_write];
    syscall_table[__NR_write] = &new_write;
    write_cr0 (read_cr0 () | 0x10000);
    printk(KERN_EMERG "[+] onload: sys_call_table hooked\n");
} else {
    printk(KERN_EMERG "[-] onload: syscall_table is NULL\n");
}
kfree(kernel_version);
return 0;
Once we overwrite our target system call function pointer, we re-enable write protect
in the CR0 register and exit the __init function successfully.
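The bit arithmetic around the table write can be checked on its own, without going anywhere near a real control register. In this user space sketch the CR0_WP macro and the two helper names are hypothetical; they simply compute the values that would be handed to write_cr0().

```c
#define CR0_WP (1UL << 16) /* Write Protect is bit 16 of CR0, i.e. 0x10000 */

/* Value to write to CR0 so that writes to read-only pages are allowed. */
unsigned long cr0_without_wp(unsigned long cr0)
{
    return cr0 & ~CR0_WP;
}

/* Value to write to CR0 to turn write protection back on. */
unsigned long cr0_with_wp(unsigned long cr0)
{
    return cr0 | CR0_WP;
}
```

Only bit 16 changes in either direction; every other CR0 flag passes through untouched, which is why the read-modify-write pattern is used instead of writing a constant.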
Removing the hooks
In order to keep our system in a clean and stable state, we want to remove our hooks
gracefully when the module is unloaded. The __exit onunload() function behaves very
similarly to the __init onload() function since it also has to toggle the write protect bit in
CR0. The onunload function even writes to the exact same offset into the
sys_call_table array as the onload function did.
The only difference is that the onunload function writes the address of the original
write() function over the address of our new_write() function, putting everything back
to the way it was before we came along:
if (syscall_table != NULL) {
    write_cr0 (read_cr0 () & (~ 0x10000));
    syscall_table[__NR_write] = original_write;
    write_cr0 (read_cr0 () | 0x10000);
    printk(KERN_EMERG "[+] onunload: sys_call_table unhooked\n");
} else {
    printk(KERN_EMERG "[-] onunload: syscall_table is NULL\n");
}
printk(KERN_INFO "Goodbye world!\n");
© 2015 - 2017 Tyler Nichols
