FCDL

Author

lch361

Version

2.0.0-x86_64

License

GPL-2.0

Note: This is an in-depth overview of my project from a developer’s standpoint. If you’d like to know what FCDL is and what it can do, see README.md.

History of development

Development began in 2023, back in high school, when I was actively exploring Linux and the free software written around it. Eventually, I discovered the Linux Kernel Module Programming Guide [1]. I had always thought of OS development as extremely hard, though possible, work, but this book was a gentle introduction to the world of kernel programming. That’s when I decided to seriously learn how to write Linux kernel modules.

Unfortunately, at the time I was using a binary Linux distribution with a precompiled kernel. That makes kernel development possible yet tedious, and I wanted the sources at hand to explore the kernel API anyway. So I created a QEMU virtual machine with Gentoo in it, made a backup in case my kernel code messed up the system, and started developing there.

Thus, FCDL 1.0.0-x86_64 was born. FCDL is the name of my module and stands for “File Creation/Deletion Logger” (yeah, I wasn’t very original with naming back then…). Version 1.0.0 is the initial stable release, and x86_64 signifies that the module was made for one specific CPU architecture (developing for multiple architectures is complicated, and in the next sections you’ll see why).

Of course, while I had completed my first kernel module, it wasn’t perfect and had many design problems. So a couple of years later, with more experience in programming and engineering, I returned to this project and improved it heavily. The module became more performant, gained more functionality, and got an actual README.md documenting its usage. This is how FCDL 2.0.0-x86_64 came to be. As of today, I consider this project finished.

I am writing this article to share my knowledge and experience with other explorers, because software isn’t just code and executables: it’s also invaluable knowledge and experience.

Architecture

To catch filesystem events, FCDL intercepts specific system calls. To output information about those events, it uses a character device. Data flows through kernel space like this:

System calls -> buffer -> character device
Data flow diagram for the whole FCDL module

The next sections describe each entity in more detail.

Intercepting system calls

FCDL uses the following system calls [2]:

  • openat: opens a file at a specific path;
  • unlinkat: deletes a file at a specific path;
  • renameat2: moves a file from one path to another.

To intercept kernel functions, and thus system calls, FCDL uses Kprobes [3].

According to the syscall(2) man page [2], on x86_64:

  • System call arguments are passed in the following registers, in order: rdi, rsi, rdx, r10, r8, r9.
  • The return value is placed in rax.
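The register convention above can be modeled in plain userspace C. This is only an illustrative sketch: struct regs_model and both helper functions are hypothetical names, not part of FCDL or the kernel, and the success check assumes the usual Linux convention that a failed call returns a small negative value (-errno) in rax, which holds for the three system calls listed earlier.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical userspace stand-in for a register snapshot; only the
 * registers named in syscall(2) for x86_64 are modeled. */
struct regs_model {
	unsigned long rdi, rsi, rdx, r10, r8, r9; /* arguments 1..6 */
	unsigned long rax;                        /* return value   */
};

/* Return the n-th (0-based) system call argument. */
static unsigned long syscall_arg(const struct regs_model *regs, size_t n)
{
	assert(n < 6);
	switch (n) {
	case 0: return regs->rdi;
	case 1: return regs->rsi;
	case 2: return regs->rdx;
	case 3: return regs->r10;
	case 4: return regs->r8;
	default: return regs->r9;
	}
}

/* Assumption: failures are reported as negative values in rax. */
static int syscall_succeeded(const struct regs_model *regs)
{
	return (long)regs->rax >= 0;
}
```

For openat(dirfd, pathname, flags, mode), for example, the path pointer is the second argument, so in this model it would be read with syscall_arg(regs, 1), that is, from rsi.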

FCDL only needs to record events from successful system calls, but it also needs to see their arguments. So each system call has to be intercepted both at its start and at its end. Furthermore, registers may change while the system call executes, so regular Kprobes won’t do. That’s why I chose Kretprobes with an entry_handler and shared data pointing at the old registers for this task (see implementation).

Together, Kretprobes and system calls form the following algorithm:

System call flowchart
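The algorithm can be approximated in userspace as a pair of callbacks sharing per-call data. This is a simplified model under stated assumptions, not FCDL's code and not the kernel's Kretprobe API: all names are hypothetical, the entry callback copies the path argument instead of keeping a pointer to the old registers, and "logging" is reduced to updating two variables.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical register snapshot: one argument and the return value. */
struct regs_model { unsigned long rsi; long rax; };

/* Per-call data shared between the entry and the return callback. */
struct call_data { const char *path; };

static size_t logged;            /* number of events recorded */
static const char *last_logged;  /* path of the latest event  */

/* Runs when the system call starts: the arguments are still intact. */
static void on_entry(struct call_data *data, const struct regs_model *regs)
{
	assert(data != NULL && regs != NULL);
	data->path = (const char *)regs->rsi;
}

/* Runs when the system call returns: argument registers may have been
 * overwritten by now, so only the saved data and rax are trusted. */
static void on_return(const struct call_data *data,
		      const struct regs_model *regs)
{
	assert(data != NULL && regs != NULL);
	if (regs->rax < 0)
		return;              /* failed call: no event */
	last_logged = data->path;
	logged++;
}
```

The point of the split is that the decision (did the call succeed?) is only available at return time, while the information worth recording is only reliably available at entry time.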

Writing through the character device

The kernel API allows creating character devices with custom operations, which are invoked in response to specific system calls. FCDL defines the following operations for /dev/fcdl:

  • open

    Enable pushing events to the buffer and disable module removal. If another process already has the character device open, return an error: FCDL was designed to have one reader, because sending each event to multiple readers would be complicated.

  • read

    Peek and pop one event from the buffer and send it to userspace.

  • release (close)

    Disable pushing events to the buffer, then clear it and enable module removal.

The buffer is enabled and disabled this way so that, when nobody is reading /dev/fcdl, the OS works as usual.
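The single-reader rule and the enable/disable switch can be sketched with one atomic flag. This is a userspace illustration only: the function names are hypothetical, and returning -EBUSY is an assumption, since the text above only says that an error is returned.

```c
#include <assert.h>
#include <errno.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical flag: true while one reader holds the device open. */
static atomic_bool reader_present;

/* open: admit the first reader only. */
static int model_open(void)
{
	bool expected = false;

	if (!atomic_compare_exchange_strong(&reader_present, &expected, true))
		return -EBUSY;       /* assumed error code */
	return 0;
}

/* Producers would check this before pushing an event. */
static bool events_enabled(void)
{
	return atomic_load(&reader_present);
}

/* release: stop collecting events (clearing the buffer is omitted). */
static void model_release(void)
{
	assert(atomic_load(&reader_present)); /* release without open is a bug */
	atomic_store(&reader_present, false);
}
```

Using compare-and-exchange rather than a separate load and store means two processes racing to open the device cannot both see "no reader" and both succeed.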

Synchronizing the buffer

All events that arrive faster than the character device can read them and send them to userspace are stored in the buffer. The buffer is implemented as a circular buffer because it acts as a FIFO queue: this module should prioritize events that occurred earlier. Let start be a field storing the first element’s index, and end a field storing the last element’s index.

Let our buffer operations be:

  • push: add one element at the end;
  • peek: view one element at the start;
  • pop: remove one element at the start.

Additionally, I wanted these operations to be blocking:

  • push would wait until buffer is not full;
  • peek would wait until buffer is not empty.

That’s why my circular buffer implementation has two semaphores signaling these states. It also has a mutex for operations that mutate the buffer.
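A minimal userspace sketch of such a buffer, assuming POSIX semaphores and a pthread mutex in place of their kernel counterparts. The names, the int payload and the capacity are illustrative, and unlike the description above, end here holds the index of the next free slot rather than the last element.

```c
#include <assert.h>
#include <pthread.h>
#include <semaphore.h>
#include <stddef.h>

#define RING_CAPACITY 4    /* arbitrary size for the sketch */

struct ring {
	int items[RING_CAPACITY];
	unsigned start;        /* index of the oldest element       */
	unsigned end;          /* index of the next free slot       */
	sem_t free_slots;      /* counts slots available to push    */
	sem_t used_slots;      /* counts elements available to peek */
	pthread_mutex_t lock;  /* guards start and end              */
};

static void ring_init(struct ring *r)
{
	assert(r != NULL);
	r->start = r->end = 0;
	sem_init(&r->free_slots, 0, RING_CAPACITY);
	sem_init(&r->used_slots, 0, 0);
	pthread_mutex_init(&r->lock, NULL);
}

/* Blocks while the buffer is full. */
static void ring_push(struct ring *r, int item)
{
	sem_wait(&r->free_slots);
	pthread_mutex_lock(&r->lock);
	r->items[r->end] = item;
	r->end = (r->end + 1) % RING_CAPACITY;
	pthread_mutex_unlock(&r->lock);
	sem_post(&r->used_slots);
}

/* Blocks while the buffer is empty; does not remove the element. */
static int ring_peek(struct ring *r)
{
	sem_wait(&r->used_slots);
	return r->items[r->start];
}

/* Must follow a ring_peek made by the single consumer. */
static void ring_pop(struct ring *r)
{
	pthread_mutex_lock(&r->lock);
	r->start = (r->start + 1) % RING_CAPACITY;
	pthread_mutex_unlock(&r->lock);
	sem_post(&r->free_slots);
}
```

In this sketch the two semaphores carry the "not full" and "not empty" conditions, so the mutex only has to cover the short index updates.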

Considering the data flow diagram shown earlier, our buffer is also a multi-producer, single-consumer channel. This allows us to implement synchronization more efficiently: producers (system call kretprobes) only use the push operation, and the consumer (character device) only uses peek and pop.

Mutual exclusion of buffer operations

First operation | Second operation | Are they mutually exclusive? | Explanation
peek | push | No | Push only modifies end, keeping start immutable. Peek only reads start.
pop | push | Yes | Both operations read both start and end; push modifies end, pop modifies start. A mutex is required to avoid data races on both variables.
peek | pop | Yes | Peek reads start, pop modifies it. We’ve got a single consumer, so these operations are never concurrent, only consecutive, and need no lock between them.

That’s why only push and pop lock and unlock the mutex in my implementation.

Improving performance with futures

Those of you bold or curious enough to explore my messy source code may have noticed this interesting struct used in the buffer:

struct event_future {
	struct event value;
	struct semaphore initialized;
};

Actually, my buffer doesn’t store bare event objects. It couples each of them with a semaphore, forming a future: a value that is not initialized yet, but will be ready in the future.

Considering our previously stated buffer operations:

  1. push adds an uninitialized future;
  2. the future can be initialized without locking the whole buffer;
  3. peek waits for the future to be initialized;
  4. pop after peek leaves the semaphore at 0 (uninitialized).

Item 2 is especially important, because it makes FCDL’s concurrency much faster. Below are two diagrams showing the difference between:

  1. holding the buffer lock during every push while the item is being initialized;
  2. holding the buffer lock during every push only long enough to return the future.
Kretprobe timeline, old version, without futures
Kretprobe timeline, new version, with futures

The new version spends far less time holding the buffer lock, which makes it significantly more performant.
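The future-based variant can be sketched in userspace as follows. This is an approximation under assumptions, not the module's code: it uses POSIX semaphores and a pthread mutex, struct event_model stands in for struct event, and all function names are hypothetical.

```c
#include <assert.h>
#include <pthread.h>
#include <semaphore.h>
#include <stddef.h>

#define SLOTS 4                     /* arbitrary size for the sketch */

struct event_model { int id; };     /* stand-in for struct event */

struct future_model {
	struct event_model value;
	sem_t initialized;              /* 0 = not ready, 1 = ready */
};

struct future_ring {
	struct future_model slots[SLOTS];
	unsigned start, end;
	sem_t free_slots, used_slots;
	pthread_mutex_t lock;
};

static void fr_init(struct future_ring *r)
{
	assert(r != NULL);
	r->start = r->end = 0;
	sem_init(&r->free_slots, 0, SLOTS);
	sem_init(&r->used_slots, 0, 0);
	pthread_mutex_init(&r->lock, NULL);
	for (size_t i = 0; i < SLOTS; i++)
		sem_init(&r->slots[i].initialized, 0, 0);
}

/* Short critical section: only reserves a slot and hands it out. */
static struct future_model *fr_push(struct future_ring *r)
{
	struct future_model *f;

	sem_wait(&r->free_slots);
	pthread_mutex_lock(&r->lock);
	f = &r->slots[r->end];
	r->end = (r->end + 1) % SLOTS;
	pthread_mutex_unlock(&r->lock);
	sem_post(&r->used_slots);
	return f;
}

/* Runs outside the lock, so slow initialization blocks nobody else. */
static void future_fulfil(struct future_model *f, int id)
{
	f->value.id = id;
	sem_post(&f->initialized);
}

/* Waits for a reserved slot, then for its value to become ready. */
static const struct event_model *fr_peek(struct future_ring *r)
{
	struct future_model *f;

	sem_wait(&r->used_slots);
	f = &r->slots[r->start];
	sem_wait(&f->initialized);      /* leaves the semaphore at 0 again */
	return &f->value;
}

static void fr_pop(struct future_ring *r)
{
	pthread_mutex_lock(&r->lock);
	r->start = (r->start + 1) % SLOTS;
	pthread_mutex_unlock(&r->lock);
	sem_post(&r->free_slots);
}
```

Queue order is fixed at reservation time, so a producer that finishes initializing later than its neighbour does not reorder events; the consumer simply waits on the oldest future until it is ready.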

Developing userspace interface

In the Linux kernel, pretty formatting is discouraged: translating UNIX timestamps into formatted time strings, separating entries with '\n', and so on. That’s because the primary users of the kernel API are userspace programs, and rarely users themselves.

This is why /dev/fcdl outputs information about events in the binary format described in README.md, leaving the formatting job to userspace. If a program or library wants to consume this file itself, it can use the binary output directly. If a user wants to read the events, they can use the fcdl-cli utility, which I developed as a quick way to demonstrate the capabilities of FCDL and as a reference implementation.
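As an illustration of the kind of formatting that belongs in userspace, here is a small sketch that turns a UNIX timestamp into a readable string. It is not taken from fcdl-cli: the function name and the output format are assumptions, and it deliberately says nothing about the record layout, which is documented in README.md.

```c
#define _POSIX_C_SOURCE 200809L     /* for gmtime_r */
#include <assert.h>
#include <stddef.h>
#include <time.h>

/* Format seconds since the UNIX epoch as "YYYY-MM-DD HH:MM:SS" in UTC.
 * Returns the number of characters written, or 0 on failure. */
static size_t format_timestamp(time_t seconds, char *out, size_t out_size)
{
	struct tm parts;

	assert(out != NULL);
	if (gmtime_r(&seconds, &parts) == NULL)
		return 0;
	return strftime(out, out_size, "%Y-%m-%d %H:%M:%S", &parts);
}
```

With a 32-byte buffer, format_timestamp(0, buf, sizeof buf) fills buf with 1970-01-01 00:00:00. A reader built this way can change its output format freely without touching the kernel module.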

Conclusion

FCDL is not just a Linux kernel module. It’s not just an experimental free program that everyone can use. It was a journey. My journey to learn the inner workings of the OS, system calls, kprobes, and character devices. My experience with implementing and improving multithreaded code. My experience with reading the documentation and trying to implement everything the right way. And lastly, my will to share this experiment with others.

If you think FCDL’s design can be improved, or you have found a bug in the behaviour of this module, feel free to leave a comment, contact me, or, even better, open issues and pull requests in my repository. Thank you for reading!

References