Forays into Systems Programming: "Who Watches the Watchmen?"-Writing a GNU/Linux x86_64 Debugger with Rust and the Nix Crate

What does it mean to be robust? If a given piece of code cannot handle an edge case or correctly manage its memory or other resources, or resolves to terminate itself due to behavior deemed dangerous by the host, or by its own decisions, then what may be said about that code other than that it is flimsy, brittle, and weak? Systems programming requires one to adhere to a specific design that avoids these traits, but low-level C programming does not facilitate such a design easily.

A debugger stands as a product of systems programming that might show what a robust program should be like. It reins in buggy messes of code well enough to allow developers to peek inside and tweak running processes. However, a debugger, like any piece of a system, falls prey to the same mistakes as any other piece of software programmed in low-level languages. What is one to do when the debugger itself has bugs? In the words of Alan Moore, “who watches the watchmen?”

For a Rust developer, the answer is an extremely pessimistic compiler and fine-grained, exhaustive control flow.

In this series of blog posts, the goal is not only to show how one would write such a minimum viable debugger, but also to learn about the x86_64 instruction set architecture, as well as some GNU/Linux particulars. The Rust language will be explored as well, with thorough digressions concerning coding constructs, design patterns, and relevant third-party Rust code.

This series is divided into three parts: the first covers the general groundwork and debugger logic basics; the second is a refactoring of the code, exploring more of the language and well-established Rust design patterns; the third implements a complicated feature (breakpoints), which delves into ISA and operating system specifics, as well as DWARF debugging information. Readers are welcome to read in any order, but it is recommended to progress from this post to the last. Also, feel free to follow along with the codebase at this repo: https://github.com/acolite-d/tracee-db/blob/main/README.md. The first commit will be showcased in this opening blog post.

General Workflow of a Debugger

Most programmers know the common use cases of a debugger: stepping through a program instruction by instruction, setting breakpoints to peek at program state at specific lines or functions, and viewing said state by inspecting registers, variables, stack frames, and so on. But some might wonder what technology allows this to happen. In a Linux environment, the backbone of a debugger consists of a variety of ptrace system calls.

ptrace, or “process trace,” is a complex facility through which a “tracer” process, given a target process ID, a message, and arguments, can view and manipulate the state of a running process, or “tracee.” While the common system calls excel at doing one thing very well, like reading or writing to a file, forking a process, or obtaining resource IDs, ptrace is unique in that it is a multi-faceted interface built into a single system call, making it a Swiss Army knife of sorts. The functionality you tap into depends on the message you send to it: PTRACE_SINGLESTEP, PTRACE_POKEDATA, PTRACE_PEEKDATA, PTRACE_GETREGS, etc. Through this interface, debuggers like gdb allow users to track a program’s state at runtime.

Thankfully, we only need to use a handful of messages to produce a very basic debugger. All of the following use a single ptrace system call each!

“Stepping” through a process, or executing one instruction at a time
Continue running a process from where it left off
Viewing CPU registers contents of a currently running process
Reading a word from tracee memory at a given address
Writing a word to tracee memory with a given address and value
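As a rough sketch of how the five operations above line up with ptrace requests, the snippet below pairs each one with the request name documented in the ptrace(2) man page. The BasicOp enum and ptrace_request function are hypothetical names made up for illustration; they are not part of the nix crate or of the debugger's codebase, and no system call is made here.

```rust
/// The five basic debugger operations listed above (hypothetical type).
#[derive(Debug, Clone, Copy, PartialEq)]
enum BasicOp {
    Step,
    Continue,
    ViewRegisters,
    ReadWord,
    WriteWord,
}

/// Returns the name of the ptrace(2) request that backs each operation.
fn ptrace_request(op: BasicOp) -> &'static str {
    match op {
        BasicOp::Step => "PTRACE_SINGLESTEP",
        BasicOp::Continue => "PTRACE_CONT",
        BasicOp::ViewRegisters => "PTRACE_GETREGS",
        BasicOp::ReadWord => "PTRACE_PEEKDATA",
        BasicOp::WriteWord => "PTRACE_POKEDATA",
    }
}

fn main() {
    let ops = [
        BasicOp::Step,
        BasicOp::Continue,
        BasicOp::ViewRegisters,
        BasicOp::ReadWord,
        BasicOp::WriteWord,
    ];

    // Print the operation-to-request table.
    for op in ops {
        println!("{:?} -> {}", op, ptrace_request(op));
    }
}
```

Because the match has no wildcard arm, adding a sixth operation to the enum later would force this function to be updated before the program compiles again.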

Laying the Groundwork with the Nix Crate, Forking to Tracer-Tracee

However, before we can start using ptrace, we first need to establish the tracer and tracee processes. When one invokes a debugger like gdb or lldb, they usually do so by specifying the executable they wish to debug. This is sometimes done at invocation of the debugger, gdb <executable-to-debug>, or even by attaching to a running process ID with --pid=PID. There is also the ability to run the debugger without specifying anything at all, and prompt the user at runtime for what they wish to debug.

So, let’s begin by first creating a Rust binary crate for our project using cargo new --bin traceedb. Navigating to our Cargo.toml, we can add the nix crate as a dependency, requesting features for ptrace, as well as process and signal for forking and handling signals. Nix is a nifty third-party crate that helps us write code that works across various distributions of Linux and tap directly into a lot of APIs, including not only ptrace but also process and signal handling. A link to the documentation is included in the Cargo.toml below.

[package]
name = "traceedb"
version = "0.1.0"
edition = "2021"


[dependencies]
# https://docs.rs/nix/0.27.1/nix/
nix = { version = "0.27.1", features = ["process", "ptrace", "signal"] }

Now, given an executable from the command line arguments, we can begin execution of the debugger by first forking our process into two, running execv in the child with the given executable, and allowing the parent to run debugging logic with the child PID returned from fork. The result is a tracer-tracee relationship where the parent process is the debugger that has the handle to its child, which runs the executable specified.

For those familiar with C and common UNIX system calls for processes, this is accomplished through something like:

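#include <stdio.h>
#include <sys/types.h>
#include <unistd.h>

/* Defined elsewhere: exec the target program, and run the debugger loop */
void run_target(const char* program);
void run_debugger(pid_t child);

int main(int argc, char** argv)
{
    if (argc < 2) {
        fprintf(stderr, "Expected a program name as argument\n");
        return -1;
    }

    pid_t child = fork();
    
    if (child == 0) {
        /* In the child: fork returned 0 */
        run_target(argv[1]);
    } 
    else if (child > 0) {
        /* In the parent: fork returned the PID of the child */
        run_debugger(child);
    }
    else {
        /* Negative return value: the fork failed */
        fprintf(stderr, "Failed to fork!\n");
        return -1;
    }

    return 0;
}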

Now, when I was a younger programmer, I always thought fork() was extremely ambiguous when looking at its invocation in C code. The return value of fork implicitly dictates whether you are parent or child. In the parent, fork returns a positive value that is the PID of the child. In the child, it returns 0, and a negative value indicates a failure to fork. But who would know that from looking at the C code itself? If one were to introduce someone to such a system call without first showing them any relevant man pages or concepts of process genesis, it is not hard to imagine that people would be confused.

The nix crate, however, makes everything a bit more transparent when it comes to forking.

use std::env;
use std::ffi::{CStr, CString};

use nix::sys::ptrace;
use nix::unistd::{execv, fork, ForkResult, Pid};

fn main() {
  // The first command line argument is the path of the program to debug
  let prog_name = env::args()
      .nth(1)
      .map(|arg| CString::new(arg).unwrap())
      .expect("Expected a program name as argument");

  match unsafe { fork() } {
      Ok(ForkResult::Parent { child: child_pid, .. }) => {
          println!("Spawned child process {}", child_pid);
          run_debugger(child_pid); // TODO
      }

      Ok(ForkResult::Child) => {
        println!("Running traceable target program {:?}", prog_name);

        // Ask the kernel to let our parent trace this process
        ptrace::traceme().expect("Ptrace failed, cannot debug!");
        
        // Replace child process with target executable from here
        let _ = execv(&prog_name, &[] as &[&CStr])
            .expect("Failed to spawn process");
      }

      Err(_) => panic!("Failed to fork process, exiting..."),
  }
}

The return value of fork with nix is a Rust enumeration wrapped in a fallible Result that succinctly dictates the control flow and possible failure of forking. It not only describes the fact that one result is for the child and the other for the parent, but also indicates that the operation could fail, producing either Ok(ForkResult) or Err(Errno). Using powerful match syntax, strong typing, and Rust’s stance on satisfying all possible paths on a match, you have the following advantages:

// If you do not handle the case where the fork() might fail and produce error
// in the match, the compiler will error out and tell you to exhaust all possible
// paths

match unsafe { fork() } {
    Ok(ForkResult::Parent { child: child_pid, .. }) => {
        println!("Spawned child process {}", child_pid);
        run_debugger(child_pid); // TODO
    }

    Ok(ForkResult::Child) => {
      println!("Running traceable target program {:?}", prog_name);

      ptrace::traceme().expect("Ptrace failed, cannot debug!");
      
      // Replace child process with target executable from here
      let _ = execv(&prog_name, &[] as &[&CStr])
          .expect("Failed to spawn process");
    }
    
    // w/o "Err(_) =>" match arm, all possible paths not handled, you get
    // compiler error
    
    //    error[E0004]: non-exhaustive patterns: `Err(_)` not covered
    //   --> src/main.rs:34:15
    //    |
    //34  |         match unsafe { fork() } {
    //    |               ^^^^^^^^^^^^^^^^^ pattern `Err(_)` not covered
     
     
   // In C, you would still compile and not handle the error return of fork!
   // So your code would think it did fork, but in the end it would have
   // a negative PID that when used would cause problems.
   
   // In addition, you can have more specific arms to handle other failure
   // cases, but a failure to fork is most of the time a sign to just stop
   // https://docs.rs/nix/latest/nix/errno/enum.Errno.html
   
   Err(Errno::EAGAIN) => { /* code to handle hitting a process limit */ }
   
   Err(Errno::ENOMEM) => { /* code to handle running out of memory */ }
   
   Err(_) => { /* code to handle all other errors, the "throwaway arm" */ }
}

Not only does Rust make everything explicit enough to dispense with a cursory glance in the man pages, it also slaps one’s wrist in the event where you do not handle the cases for failure of the fork call. This explicitness and desire to exhaust all possible code paths branching from the return of a system call is why using Rust for systems level programming is so attractive.
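To make the contrast with the raw C return value concrete, here is a minimal sketch of the kind of translation a wrapper can perform: it turns the integer that fork() hands back into a type the compiler can check exhaustively. ForkOutcome and classify_fork are hypothetical names for illustration only. This is not how nix is actually implemented, and nothing is forked here; the integers are simply sample return values.

```rust
/// A simplified stand-in for the idea behind nix's ForkResult (hypothetical type).
#[derive(Debug, PartialEq)]
enum ForkOutcome {
    /// We are the parent; `child` is the PID of the new process.
    Parent { child: i32 },
    /// We are the newly created child.
    Child,
}

/// Interprets a raw fork()-style return value the way the man page describes it.
fn classify_fork(ret: i32) -> Result<ForkOutcome, String> {
    match ret {
        0 => Ok(ForkOutcome::Child),
        pid if pid > 0 => Ok(ForkOutcome::Parent { child: pid }),
        _ => Err(String::from("fork failed, consult errno for the reason")),
    }
}

fn main() {
    // Sample return values: child, parent, and failure.
    for raw in [0, 4242, -1] {
        match classify_fork(raw) {
            Ok(ForkOutcome::Parent { child }) => println!("{}: parent of {}", raw, child),
            Ok(ForkOutcome::Child) => println!("{}: child", raw),
            Err(reason) => println!("{}: error ({})", raw, reason),
        }
    }
}
```

Once the three cases are distinct variants rather than ranges of one integer, forgetting the failure case becomes a compile error instead of a latent bug.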

Let’s move on to the next piece of the debugger. We have established the tracer-tracee relationship, with the child replacing itself with the program we want to debug (tracee), and the parent obtaining the handle to that child (tracer). Now we must create a means to listen to the tracee for a change in its behavior. When the process is stopped, we should take that as an opportunity to ask the user for a command to execute, whether it be stepping through or exiting or breakpoints or anything else. But processes may come in a variety of states, so we have to be careful to handle these cases as well as we can.

This is where, again, Rust shines.

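// Additional imports needed by the debugger loop
use nix::sys::signal::Signal;
use nix::sys::wait::{waitpid, WaitStatus};

fn run_debugger(target_pid: Pid) {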
    println!("Entering debugging loop...");

    'await_process: loop {
        let wait_status = waitpid(target_pid, None);
        match wait_status {
            Ok(WaitStatus::Stopped(_, Signal::SIGTRAP))
            | Ok(WaitStatus::Stopped(_, Signal::SIGSTOP)) => 
            'await_user: loop {
            
                // Here is where we want to accept user commands
                // i.e. when the tracee stops in place for the 
                // developer to do things. For now though,
                // we will continue execution with step
                
                ptrace::step(target_pid, None)
                    .expect("Failed to step through process!");
                // Go back to the outer loop and wait on the tracee again
                continue 'await_process;
            }
            
            // handle other states of tracee here ...
        }
    }
}

Here we set up a match block along with labeled loops. The outer loop awaits a change in the tracee by listening in on its status with a waitpid() call. The inner loop is where we will prompt the user, and both loops are labeled so that we can direct control flow at a particular loop. In C, break and continue only ever apply to the innermost enclosing loop; to affect the outer loop, one might have to rely on gotos or an alternative scheme. Here, we can target different loops based on the logic we want: continue 'await_process jumps back to waiting on the tracee, continue 'await_user asks for another command, and break 'await_process leaves the debugger loop entirely. Specifying both a loop for the process and one for the user is important for commands, which may require us to go back to waiting on the process (step, continue) or just accept another command (inspecting the process, viewing registers, reading, and writing do not require us to wait).

The second part of this debugger loop is the result of the wait and how we handle different return values. nix::sys::wait::waitpid(), like the aforementioned fork wrapper from nix, has a very verbose return value that is again wrapped in a Result, but has a plethora of different variants.
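To see how labeled break and continue steer control between the two loops without needing a real tracee, here is a small simulation. The simulate function and its scripted command list are assumptions made purely for illustration: the "stop" log entry stands in for waitpid() reporting that the tracee stopped, and no ptrace calls are made.

```rust
/// Simulates the two nested debugger loops with scripted commands.
/// Returns a log of what happened, in order.
fn simulate(commands: &[&str]) -> Vec<String> {
    let mut log = Vec::new();
    let mut commands = commands.iter();
    let mut stops = 0;

    'await_process: loop {
        // Stand-in for waitpid(): pretend the tracee has stopped again.
        stops += 1;
        log.push(format!("stop {}", stops));

        'await_user: loop {
            match commands.next().copied() {
                // Inspecting state leaves the tracee stopped: prompt again.
                Some("reg") => {
                    log.push("registers".to_string());
                    continue 'await_user;
                }
                // Stepping resumes the tracee: go back to waiting on it.
                Some("step") => {
                    log.push("step".to_string());
                    continue 'await_process;
                }
                // Quitting (or running out of input) leaves both loops.
                Some("quit") | None => break 'await_process,
                Some(other) => {
                    log.push(format!("unknown: {}", other));
                    continue 'await_user;
                }
            }
        }
    }

    log
}

fn main() {
    for line in simulate(&["reg", "step", "step", "quit"]) {
        println!("{}", line);
    }
}
```

Each "step" produces a new "stop" entry because control returns to the top of the outer loop, while "reg" stays inside the inner loop and triggers no new wait.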

pub enum WaitStatus {
    Exited(Pid, i32),
    Signaled(Pid, Signal, bool),
    Stopped(Pid, Signal),
    PtraceEvent(Pid, Signal, c_int),
    PtraceSyscall(Pid),
    Continued(Pid),
    StillAlive,
}

They vary in many different ways, as you can see. In the end, our debugger’s robustness will be very dependent on how we handle (or do not handle) these different return values of wait. In addition, most have a Signal attached to them that refers to the signal that caused them to enter their current state. These are also characterized as an enumeration in nix.

#[repr(i32)]
#[non_exhaustive]
pub enum Signal {
    SIGHUP,
    SIGINT,
    SIGQUIT,
    SIGILL,
    SIGTRAP,
    SIGABRT,
    SIGBUS,
    SIGFPE,
    SIGKILL,
    SIGUSR1,
    SIGSEGV,
    SIGUSR2,
    SIGPIPE,
    SIGALRM,
    SIGTERM,
    SIGSTKFLT,
    SIGCHLD,
    SIGCONT,
    SIGSTOP,
    SIGTSTP,
    SIGTTIN,
    SIGTTOU,
    SIGURG,
    SIGXCPU,
    SIGXFSZ,
    SIGVTALRM,
    SIGPROF,
    SIGWINCH,
    SIGIO,
    SIGPWR,
    SIGSYS,
}

This all looks very intimidating, but is ultimately a wrapper around existing return values of these UNIX system calls. We would have worked with the same complexity in C with the standard UNIX library, albeit without the power of Rust. The nix crate affords us fine-tuned, explicit handling of the state of our tracee process in a Linux environment through the use of a few Rust enumerations (tagged unions on a lower level). We do not have to handle every possible variant of the above outright (that would be a lot of work!); we can start out small and have a few throwaway arms of the match statement that stop the debugger with a panic triggered by the todo!() macro.

    'await_process: loop {
        let wait_status = waitpid(target_pid, None);

        match wait_status {
            Ok(WaitStatus::Stopped(_, Signal::SIGTRAP))
            | Ok(WaitStatus::Stopped(_, Signal::SIGSTOP)) => 
            'await_user: loop {
                
                ptrace::step(target_pid, None)
                    .expect("Failed to step through process!");
                // Go back to the outer loop and wait on the tracee again
                continue 'await_process;
            }
            
            

            Ok(WaitStatus::Exited(_, ..)) => {
                println!("The target program finished execution.");
                break;
            }

            Ok(_unhandled) => {
                dbg!(_unhandled);
                todo!();
            }

            Err(_) => {
                panic!("failed to wait for target program!");
            }
        }
    }

The _unhandled throwaway match arm, when reached, prints the debug string of the wait status, then stops the debugger with a panic, informing us that the tracee reached a state that we haven’t handled yet in our tracer. Considering that we haven’t implemented an arm for any Stopped(Pid, Signal) other than SIGTRAP and SIGSTOP, our debugger is nowhere near robust enough to consider usable! These statuses cover processes that have been forcibly stopped by behavior like segmentation faults caused by bad pointer dereferences, floating point errors, division by zero, and so on. These are states that programs, particularly buggy ones, may enter on a frequent basis! Let’s, at the very least, implement an arm that will handle the dreaded segfault.

//...


    Ok(WaitStatus::Stopped(_, Signal::SIGSEGV)) => {
        eprintln!("Child process received SIGSEGV, check your pointers!");
        break;
    }
    
// ...

You could then implement arms in a similar way for a floating point exception (SIGFPE) and so on. For each of the cases you cover, you increase the overall robustness of the debugger. Debuggers need to be robust; they need to handle the possible scenarios triggered by errant code! Rust facilitates all of this with ease when compared to C/C++.

Let’s test what we have so far. Here is a small bit of assembly that I will use to demonstrate the debugger code.

# test.S
.global  _start                 
.data                           
msg:
    .ascii    "John Dorman here!\n" 
    len = . - msg   

.text                           

_start:
    mov    $4, %eax              
    mov    $1, %ebx             
    mov    $msg, %ecx            
    mov    $len, %edx                                          
    int    $0x80                                 
    mov    $1, %al               
    mov    $0, %ebx              
    int    $0x80
    
# Prints "John Dorman here!" to screen.
# to compile, choose an assembler NASM, GAS, link and run
# ex. as --64 test.S -o test.o && ld -m elf_x86_64 test.o -o test

Running cargo run -- test with the compiled assembly will print the message denoting that the tracee has finished executing the 8 instructions it has. However, if you were to make a modification that would cause a segfault, say, changing the third instruction to a null pointer dereference mov $(0x0), %ecx, then it executes the segfault path we just created.

traceedb$ cargo run -- test
Entering debugging loop...
Running traceable target program "test"
The target program finished execution.

# triggering a null dereference, segfault
traceedb$ cargo run -- test
Child process received SIGSEGV, check your pointers!

Parsing User Input, Implementing Trivial Commands

Now that we have a program that can effectively set up the tracer-tracee relationship, along with a main debugger loop that handles tracee status and signals, we can start defining the commands that a user can use to inspect and debug their program. Let’s start with a simple Rust enumeration.

enum Command {
    Step, // "step" or "s"
    Continue, // "continue" or "c"
    ViewRegisters, // "register" or "reg"
    HelpMe, // "help" or "h"
    Quit, // "quit" or "q"
    Unknown, // anything typed by user that is not one of strings above
}

The idea here is to use an enumeration and Rust pattern matching to trigger the correct functionality per command. The user will enter commands via a prompt at standard input, and this string input will be matched to the right enumeration value. In typical gdb/lldb fashion, we will allow for verbose and non-verbose forms, so one could enter either “step” or “s” at the command line, and each will be parsed into Command::Step. However, if the user enters some nonsense into the prompt, we should return Command::Unknown, where we can inform the user that they might be having a stroke.

use std::io::{stdin, stdout, Write};

fn prompt_user_input() -> Command {
    print!("> ");
    // Flush so the prompt shows up before we block on input
    stdout().flush().unwrap();

    let mut user_input = String::new();

    while let Err(_) = stdin().read_line(&mut user_input) {
        eprintln!(
            "Err: Failed to read user input!"
        );
        user_input.clear();
    }

    // Remove any whitespace around command with trim()
    let input_str = user_input.trim();

    match input_str {
        // Match against the string, match against both versions of command
        "s" | "step"        => Command::Step,
        "c" | "continue"    => Command::Continue,
        "reg" | "register"  => Command::ViewRegisters,
        "q" | "quit"        => Command::Quit,
        "h" | "help"        => Command::HelpMe,

        // If we fail to match against the above, then the input 
        // is not understood and unknown
        _ => Command::Unknown,
    }
}

This is a naive way to implement a parser for debugger commands, where we assume commands are single words with no arguments. It will suffice for now. However, as commands grow more complex, an actual lexer-parser may be necessary.

Every time our tracee stops, say, when the user steps through a program instruction, or when a breakpoint in the code is hit, the user is given the opportunity to enter a command. In addition, some commands (such as viewing registers) simply do not run the tracee at all, and immediately prompt the user for another command. Based on the command input, we execute the right code to reflect the user’s wishes. Going back to our tracer loop, let’s use this newly created prompt_user_input function at the point where our tracee is trapped or stops.


    match wait_status {
        Ok(WaitStatus::Stopped(_, Signal::SIGTRAP))
        | Ok(WaitStatus::Stopped(_, Signal::SIGSTOP)) 
          => 'await_user: loop {
          match prompt_user_input() {
            Command::Quit => {
                todo!() // The todo!() macro will compile, amounts to panic!()
            }

            Command::Step => {
                todo!()
            }

            Command::Continue => {
                todo!()
            }
            
            Command::ViewRegisters => {
                todo!()
            }
            
            Command::HelpMe => {
                todo!()
            }

            Command::Unknown => {
                todo!()
            }
        }
  }

In Rust, we can leave todo!() blocks in places where we have work to do, and still compile the program. Reaching any of these branches of the match will trigger a Rust panic and cause the program to come to a stop.

Now, as previously mentioned, most of the simpler commands can be implemented with just one ptrace system call. Simply execute the syscall you need, then jump to the correct loop. Stepping and continuing resume the tracee, so we go back to awaiting the process for a signal, while viewing registers or entering some nonsense leaves the tracee stopped, so we just prompt the user again.

  Command::Quit => {
      ptrace::kill(target_pid)
          .expect("Failed to kill process!");
      break 'await_process;
  }
  
  Command::Step => {
      ptrace::step(target_pid, None)
          .expect("Failed to step through process!");
      continue 'await_process;
  }

  Command::Continue => {
      ptrace::cont(target_pid, None)
        .expect("Failed to continue running the process");
        
      continue 'await_process;
  }
  
  Command::ViewRegisters => {
    let regs = ptrace::getregs(target_pid)
          .expect("Failed to get register status using ptrace!");
    
    // There are more registers than this, tdl
    println!(
        "%RIP: {:#0x}\n\
        %RAX: {:#0x}\n%RBX {:#0x}\n%RCX: {:#0x}\n%RDX: {:#0x}\n\
        %RBP: {:#0x}\n%RSP: {:#0x}\n%RSI: {:#0x}\n%RDI: {:#0x}",
        regs.rip, regs.rax, regs.rbx, regs.rcx, regs.rdx,
        regs.rbp, regs.rsp, regs.rsi, regs.rdi
    );
    
    continue 'await_user;
  }
  
  Command::Unknown => {
      eprintln!("Err: Unknown command, please input an available command!");
      continue 'await_user;
  }
  
  // Command::HelpMe is omitted because it is just printing 
  // a generic help message about available commands

Let’s go ahead and test some of these commands now. Note that you can run step exactly 8 times, which is the number of instructions in the assembly file. Also note the location of the print statement, as well as register values like the instruction pointer.

nextdb$ cargo run -- test
Spawned child process 109456
Entering debugging loop...
Running traceable target program "test"
> s
> s
> s
> s
> s
John Dorman here!
> s
> s
> s
The target program finished execution.

traceedb$ cargo run -- test
Spawned child process 111066
Entering debugging loop...
Running traceable target program "test"
> s
> s
> s
> s
> s
John Dorman here!
> reg
%RIP: 0x401016
%RAX: 0x12
%RBX 0x1
%RCX: 0x402000
%RDX: 0x12
%RBP: 0x0
%RSP: 0x7ffc141ba870
%RSI: 0x0
%RDI: 0x0
> c

Everything looks good! Some other simple commands include reading and writing words from and to the tracee. These commands require some additional data, and therefore some argument parsing. The commands for reading and writing accept hex values for addresses and values. These are in turn converted to *mut c_void pointers for the parameters of ptrace::read and ptrace::write.

enum Command {
    Step,
    Continue,
    ViewRegisters,
    Read(*mut c_void),
    Write(*mut c_void, *mut c_void),
    Quit,
    Unknown,
}


// in prompt_user_input

// Get an iterator split by whitespace for every term following command
let mut term_iter = user_input.split_whitespace();

// Get the first of the iter as the command, let the resultant iterator
// be an argument iterator
let (command, mut args_iter) = (term_iter.nth(0).unwrap(), term_iter);

"r" | "read" => {
    if let Some(read_addr) = args_iter.nth(0) {
        if let Ok(parsed_addr) = usize::from_str_radix(read_addr, 16) {
            Command::Read(parsed_addr as *mut c_void)
        } else {
            println!("Failed to parse to usize!");
            Command::Unknown
        }
    } else {
        println!("Failed to get operand!");
        Command::Unknown
    }
},

"w" | "write" => {
    if let (Some(write_addr), Some(write_word)) = (args_iter.nth(0), args_iter.nth(0)) {
        if let (Ok(parsed_addr), Ok(parsed_word)) = (
                usize::from_str_radix(write_addr, 16),                 
                usize::from_str_radix(write_word, 16)
            ) {
            Command::Write(
                parsed_addr as *mut c_void, 
                parsed_word as *mut c_void
            )
        } else {
            println!("Failed to parse write!");
            Command::Unknown
        }
    } else {
        println!("Insufficient args!");
        Command::Unknown
    }
}

// in debugger loop for matching against user input
Command::Read(addr) => {
    let res = ptrace::read(target_pid, addr)
        .expect("Failed to send PTRACE_PEEK message!");
        
    println!("@{:#0x}: {:#0x}", addr as usize, res);
}

Command::Write(addr, word) => {
    unsafe { 
        ptrace::write(target_pid, addr, word)
        .expect("Failed to send PTRACE_POKE message!"); 
    }
}
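As an alternative to the nested if-let chains in the parsing code, the same argument handling can be flattened into a single match over the first three whitespace-separated terms. The sketch below is only an illustration under a few assumptions: ParsedCommand, parse_hex, and parse_line are hypothetical names, addresses are kept as plain usize values rather than *mut c_void, an optional 0x prefix is accepted, and extra trailing arguments are rejected.

```rust
/// Simplified command type for this sketch (hypothetical; holds plain integers).
#[derive(Debug, PartialEq)]
enum ParsedCommand {
    Read(usize),
    Write(usize, usize),
    Unknown,
}

/// Parses a hex term such as "7fff7982d2a0" or "0x7fff7982d2a0".
fn parse_hex(term: &str) -> Option<usize> {
    let digits = term.strip_prefix("0x").unwrap_or(term);
    usize::from_str_radix(digits, 16).ok()
}

/// Parses one line of user input into a read or write command.
fn parse_line(line: &str) -> ParsedCommand {
    let mut terms = line.split_whitespace();

    // Look at the command and up to two arguments in one go.
    match (terms.next(), terms.next(), terms.next()) {
        (Some("r" | "read"), Some(addr), None) => match parse_hex(addr) {
            Some(addr) => ParsedCommand::Read(addr),
            None => ParsedCommand::Unknown,
        },
        (Some("w" | "write"), Some(addr), Some(word)) => {
            match (parse_hex(addr), parse_hex(word)) {
                (Some(addr), Some(word)) => ParsedCommand::Write(addr, word),
                _ => ParsedCommand::Unknown,
            }
        }
        // Missing arguments, bad hex, or an unrecognized command.
        _ => ParsedCommand::Unknown,
    }
}

fn main() {
    for line in ["read 7fff7982d2a0", "write 0x10 feefee", "read zz", "write 10"] {
        println!("{:?} -> {:?}", line, parse_line(line));
    }
}
```

Keeping the parsed values as integers until the point of the ptrace call also means the parser can be tested without any raw pointers involved.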

You can test these commands by writing to some real location in the process’s memory space, like the stack, and reading that back.

traceedb$ cargo run -- test
Spawned child process 111927
Entering debugging loop...
Running traceable target program "test"
> s
> s
> s
> reg 
%RIP: 0x40100f
%RAX: 0x4
%RBX 0x1
%RCX: 0x402000
%RDX: 0x0
%RBP: 0x0
%RSP: 0x7fff7982d2a0
%RSI: 0x0
%RDI: 0x0
> write 7fff7982d2a0 feefee
> read 7fff7982d2a0
@0x7fff7982d2a0: 0xfeefee
>

Moving On

We now have 236 lines of code that produce a rudimentary model of a debugger. So far we have been so consumed with the practicality of what we were doing that we haven’t really considered how our code might evolve and change over time, and whether or not such changes would be easily applied to the current code. Our Rust code is also in need of some polish in a general sense. In the next blog post, we will attempt to refactor some of the shoddier aspects of the debugger thus far!

Writing a GNU/Linux x86_64 Debugger in Rust (Part 2): A Rust-ic Refactoring
