RustOS: A Microkernel Built in Rust
RustOS is a simple operating system for the x86_64 architecture, built for educational purposes. It is written from the ground up in Rust and avoids the standard library (no_std) to interface directly with the hardware. It uses the Limine bootloader and supports basic features such as preemptive multitasking, memory management, and a custom FAT32 filesystem driver.
Technical Stack
- Languages: Rust (Nightly), x86_64 Assembly
- Bootloader: Limine
- Development Tools: Cargo, QEMU (for testing), custom Makefile
How I went about building RustOS
And how you can build your own OS too
To start out I decided what programming language I would use. I went with Rust as it is fast and memory safe, and I wanted to improve my skills with the language. Next I used a lot of online resources to learn about OS development; I found the OSDev Wiki to be incredibly helpful. From there I picked Limine as the bootloader and set up QEMU to test the kernel. I also used make to automate building and running the kernel.
When starting out, the hardest part is having little to no feedback about what is happening. Therefore I tried to get the framebuffer working first. To do this I ported the Drawing In a Linear Framebuffer tutorial from the OSDev Wiki to Rust.
Once this was working it was very helpful, as I could now see what was happening in the kernel. I also implemented serial communication and a logger to help with debugging.
// Writing a byte to the serial port
unsafe {
    core::arch::asm!(
        "out dx, al",
        in("dx") SERIAL_PORT,
        in("al") byte,
        options(nomem, nostack, preserves_flags)
    );
}
Once this was implemented I wanted to get the keyboard working. However, like all things in OS development, it was not nearly as simple as I had hoped. What I was missing was the interrupt descriptor table (IDT), which handles all the interrupts for the CPU.
This table essentially maps interrupt numbers to the functions that should be called when that interrupt is triggered. For instance, when a key is pressed the keyboard sends an interrupt to the CPU, and the IDT tells the CPU which handler function to call. The most complex part is saving the state of the CPU before calling the interrupt handler and restoring it afterwards. That is what the following assembly code does.
isr_common_stub:
    // Save the FPU/SSE state and mark the snapshot as valid
    fxsave [rip + INTERRUPT_FPU_SNAPSHOT]
    mov byte ptr [rip + INTERRUPT_FPU_SNAPSHOT_VALID], 1
    // Save the general-purpose registers
    push r15; push r14; push r13; push r12
    push r11; push r10; push r9;  push r8
    push rbp; push rdi; push rsi; push rdx
    push rcx; push rbx; push rax
    mov rdi, rsp
    call exception_handler
    // On return, rax contains the new rsp
    mov rsp, rax
    // Restore registers
    pop rax; pop rbx; pop rcx; pop rdx
    pop rsi; pop rdi; pop rbp; pop r8
    pop r9;  pop r10; pop r11; pop r12
    pop r13; pop r14; pop r15
    // Restore the FPU/SSE state if the snapshot is valid
    cmp byte ptr [rip + INTERRUPT_FPU_SNAPSHOT_VALID], 0
    je 2f
    fxrstor [rip + INTERRUPT_FPU_SNAPSHOT]
    mov byte ptr [rip + INTERRUPT_FPU_SNAPSHOT_VALID], 0
2:
    // Discard the interrupt number and error code pushed before the stub ran
    add rsp, 16
    iretq
Now that the IDT was working I could implement the keyboard driver. This is pretty straightforward (a nice surprise). The following code is what I came up with. It checks whether the interrupt number is 33 (the keyboard interrupt) and then reads the scancode from the PS/2 data port. It also checks if the byte actually came from the mouse and, if so, calls the mouse handler instead.
if num == 33 {
    let status = ps2_status();
    let scancode: u8;
    unsafe {
        asm!("in al, dx", out("al") scancode, in("dx") 0x60u16);
    }
    // If the AUX bit is set, this byte belongs to the mouse; keep it out of the keyboard queue.
    if (status & 0x20) != 0 {
        crate::io::mouse::on_irq_byte(scancode);
    } else {
        crate::io::keyboard::push_scancode(scancode);
    }
}
This pushes the scancode to a queue. The keyboard driver then reads from this queue and converts the scancode to a character.
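The conversion step can be sketched with a lookup table. This is a hypothetical, minimal keymap covering only the top number row of scancode set 1; the actual RustOS keymap is larger and also tracks modifier keys:

```rust
// Hypothetical sketch: translate a few PS/2 scancode set 1 "make" codes to
// characters. The table layout is an assumption, not RustOS's actual keymap.
const KEYMAP: [Option<char>; 16] = [
    None,      // 0x00: invalid
    None,      // 0x01: Esc
    Some('1'), Some('2'), Some('3'), Some('4'), Some('5'),
    Some('6'), Some('7'), Some('8'), Some('9'), Some('0'),
    Some('-'), Some('='),
    None,      // 0x0E: Backspace
    None,      // 0x0F: Tab
];

fn scancode_to_char(scancode: u8) -> Option<char> {
    // The top bit marks a key release ("break" code); ignore those.
    if scancode & 0x80 != 0 {
        return None;
    }
    KEYMAP.get(scancode as usize).copied().flatten()
}
```

Key releases come in as the same scancode with bit 7 set, which is why the driver has to filter them out before producing characters.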
Memory Management
Memory management, known by its other name "suffering", is one of the most important and most complex parts of an operating system. However, getting memory management working, especially when using Rust, unlocks a lot of potential.
For my OS I decided to use a 3 tiered memory management system.
The Bitmap Allocator
The bitmap allocator is used to track physical memory. It uses the data from Limine to determine which regions of memory are free and which are in use. The bitmap stores this information in a vector of booleans with each boolean representing a 4KB block of memory. Then it performs a next fit algorithm to find a free block of memory when requested.
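A minimal sketch of this idea, with illustrative names rather than RustOS's actual API, looks like the following. Each bool tracks one 4 KiB frame, and the next-fit search resumes from wherever the previous allocation stopped:

```rust
const FRAME_SIZE: u64 = 4096;

struct BitmapAllocator {
    bitmap: Vec<bool>, // true = frame in use
    next: usize,       // where the last search stopped (next-fit)
}

impl BitmapAllocator {
    fn new(frames: usize) -> Self {
        Self { bitmap: vec![false; frames], next: 0 }
    }

    /// Next-fit: scan forward from the last allocation, wrapping around
    /// once before giving up.
    fn allocate_frame(&mut self) -> Option<u64> {
        let n = self.bitmap.len();
        for i in 0..n {
            let idx = (self.next + i) % n;
            if !self.bitmap[idx] {
                self.bitmap[idx] = true;
                self.next = idx + 1;
                return Some(idx as u64 * FRAME_SIZE);
            }
        }
        None // out of physical memory
    }

    fn free_frame(&mut self, addr: u64) {
        self.bitmap[(addr / FRAME_SIZE) as usize] = false;
    }
}
```

Next-fit avoids rescanning the start of the bitmap on every allocation, which matters when the early frames are almost always in use.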

This way we can keep track of which memory is free and which is in use. However now we have a couple of problems we need to solve.
- How do we allocate larger chunks of memory?
- How do we assign specific memory addresses to specific tasks?
- How do we prevent tasks from accessing each other’s memory?
Paging System
The paging system is used to solve the problems mentioned above. It is a system that allows us to assign specific memory addresses to specific tasks and prevent tasks from accessing each other’s memory.
In RustOS we use a Higher Half Direct Map. All this means is that we map the physical memory to the higher half of the virtual memory. This way we can easily access physical memory by just adding the offset to the virtual address. It also supports memory protection using page table flags.
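The translation itself is just an offset in each direction. The offset value below is illustrative; Limine reports the real HHDM base at boot:

```rust
// Sketch of higher-half direct map translation. The constant is an
// assumed example value, not the offset Limine actually hands the kernel.
const HHDM_OFFSET: u64 = 0xFFFF_8000_0000_0000;

fn phys_to_virt(phys: u64) -> u64 {
    phys + HHDM_OFFSET
}

fn virt_to_phys(virt: u64) -> u64 {
    virt - HHDM_OFFSET
}
```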
bitflags! {
    #[derive(Clone, Copy, Debug, PartialEq, Eq)]
    pub struct PageTableFlags: u64 {
        const PRESENT         = 1 << 0;  // Page is present in memory
        const WRITABLE        = 1 << 1;  // Page is writable
        const USER_ACCESSIBLE = 1 << 2;  // User-mode can access this page
        const WRITE_THROUGH   = 1 << 3;  // Write-through caching enabled
        const NO_CACHE        = 1 << 4;  // Caching disabled
        const ACCESSED        = 1 << 5;  // Set by CPU on read
        const DIRTY           = 1 << 6;  // Set by CPU on write
        const HUGE_PAGE       = 1 << 7;  // Map 2MB/1GB page
        const GLOBAL          = 1 << 8;  // Global page (not flushed from TLB)
        const NO_EXECUTE      = 1 << 63; // Execute disable
    }
}
Then when we need to map a page the following happens:
- Index Calculation: The virtual address is bit-shifted to find the index for each level of the page table.
- Table Traversal: It walks down the page-table levels (P4 → P3 → P2 → P1).
- Fault Tolerance: If a table at any level (P3, P2, or P1) doesn't exist yet, we do the following:
  - Allocate a new physical frame from the Bitmap Allocator
  - Zero out that frame (avoids garbage data)
  - Link the parent table entry to this new physical frame
- Final Entry: The P1 entry is set to point to the target physical address with the specified flags.
- TLB Invalidation: It runs the invlpg assembly instruction to tell the CPU that a translation has changed, clearing the Translation Lookaside Buffer (TLB) cache entry for that address.
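The index calculation step above can be sketched as a pure function. On x86_64 with 4-level paging, each level's index is 9 bits of the virtual address, and the low 12 bits are the offset within the page:

```rust
// Sketch of the index calculation for 4-level paging. Each page table has
// 512 entries, so each index is 9 bits (mask 0x1FF) of the virtual address.
fn page_table_indices(virt: u64) -> (usize, usize, usize, usize) {
    let p4 = ((virt >> 39) & 0x1FF) as usize; // bits 39..48
    let p3 = ((virt >> 30) & 0x1FF) as usize; // bits 30..39
    let p2 = ((virt >> 21) & 0x1FF) as usize; // bits 21..30
    let p1 = ((virt >> 12) & 0x1FF) as usize; // bits 12..21
    (p4, p3, p2, p1)
}
```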
Dynamic Heap Allocator
The top level is the Global Allocator. For this kernel I used the LockedHeap from the linked_list_allocator crate. It is a simple allocator that uses a linked list to keep track of free blocks of memory.
The allocator supports dynamic growth by requesting more memory from the kernel when needed. This is done by calling the sys_sbrk method.
pub fn sys_sbrk(increment: isize) -> *mut u8 {
    let heap_size = ALLOCATOR.0.lock().size();
    let old_end_of_heap = HEAP_START + heap_size as u64;
    if increment == 0 {
        return old_end_of_heap as *mut u8;
    }
    if increment > 0 {
        let mut mapper = paging::get_active_mapper();
        let new_end_of_heap = old_end_of_heap + increment as u64;
        let mut map_ptr = (old_end_of_heap + 0xFFF) & !0xFFF; // Align to the next page boundary
        while map_ptr < new_end_of_heap {
            let phys_frame = allocate_frame().expect("Failed to allocate frame for sys_sbrk");
            let flags: paging::PageTableFlags =
                paging::PageTableFlags::PRESENT | paging::PageTableFlags::WRITABLE;
            mapper.map(map_ptr, phys_frame, flags);
            map_ptr += 0x1000;
        }
        unsafe {
            ALLOCATOR.0.lock().extend(increment as usize);
        }
    }
    old_end_of_heap as *mut u8
}
The Result
We can now use Rust’s global allocator attribute to enable standard Rust collection types like Vec, String, and Box.
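To illustrate how the attribute works, here is a self-contained sketch that registers a throwaway bump allocator (backed by a fixed static arena, not the kernel's heap) as the global allocator. RustOS uses linked_list_allocator's LockedHeap instead; this stand-in only shows the mechanism that makes Vec, String, and Box work:

```rust
use core::alloc::{GlobalAlloc, Layout};
use core::cell::UnsafeCell;
use core::sync::atomic::{AtomicUsize, Ordering};

const ARENA_SIZE: usize = 1 << 20; // 1 MiB scratch arena (illustrative)

struct Arena(UnsafeCell<[u8; ARENA_SIZE]>);
unsafe impl Sync for Arena {}
static ARENA: Arena = Arena(UnsafeCell::new([0; ARENA_SIZE]));

// A bump allocator hands out memory linearly and never reclaims it;
// a real kernel allocator (like LockedHeap) tracks free blocks instead.
struct BumpAllocator {
    next: AtomicUsize,
}

unsafe impl GlobalAlloc for BumpAllocator {
    unsafe fn alloc(&self, layout: Layout) -> *mut u8 {
        let base = ARENA.0.get() as usize;
        loop {
            let cur = self.next.load(Ordering::Relaxed);
            // Round the start of the allocation up to the requested alignment.
            let start = (base + cur + layout.align() - 1) & !(layout.align() - 1);
            let end = start + layout.size();
            if end > base + ARENA_SIZE {
                return core::ptr::null_mut(); // arena exhausted
            }
            if self
                .next
                .compare_exchange(cur, end - base, Ordering::Relaxed, Ordering::Relaxed)
                .is_ok()
            {
                return start as *mut u8;
            }
        }
    }

    unsafe fn dealloc(&self, _ptr: *mut u8, _layout: Layout) {
        // Bump allocators leak; freeing is where a real allocator does its work.
    }
}

#[global_allocator]
static GLOBAL: BumpAllocator = BumpAllocator { next: AtomicUsize::new(0) };
```

Once the attribute is in place, every heap allocation in the program routes through this type, which is exactly how the kernel's LockedHeap gets wired in.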
Preemptive Multitasking
The first step to get preemptive multitasking working is to get the timer interrupt working. After that we can implement the scheduler. All the scheduler does is keep track of the current task and swap tasks when the timer interrupt fires. It is a little more complex than that, as we need to save the state of the current task and restore the state of the next one.
PIT Timer Interrupts
The PIT (Programmable Interval Timer) is a hardware timer that can be used to generate interrupts at regular intervals. For RustOS I used a frequency of 1000Hz. This means that the timer interrupt is called 1000 times per second.
pub const TICKS_PER_SECOND: u64 = 1000;
const PIT_BASE_FREQUENCY: u64 = 1193180;

pub fn init_timer() {
    let divisor: u16 = (PIT_BASE_FREQUENCY / TICKS_PER_SECOND) as u16;
    unsafe {
        // Command port: Select Channel 0, Square Wave Mode
        core::arch::asm!("out 0x43, al", in("al") 0x36u8, options(nomem, nostack));
        // Data port: Send low byte then high byte of divisor
        core::arch::asm!("out 0x40, al", in("al") (divisor & 0xFF) as u8, options(nomem, nostack));
        core::arch::asm!("out 0x40, al", in("al") ((divisor >> 8) & 0xFF) as u8, options(nomem, nostack));
    }
}
The Context Switch
This is handled in five steps:
- Start Preemption/Save State: When we get an interrupt to switch tasks, the assembly stub has already pushed the general-purpose registers to the stack, so we just save the stack pointer into the current Task struct.
- Save FPU/SSE State: These registers are used for floating point arithmetic and vector operations and are not saved by the assembly stub, so we save them manually. If we don't, the FPU/SSE state of the previous task leaks into the next task, which can cause some weird results.
- Pick Next Task: The scheduler.schedule function iterates through a VecDeque of tasks. It uses a Round Robin approach, but with a "Sleep" check: sleeping tasks are skipped until they are woken.
- Restore State: We restore the FPU/SSE state of the next task, then restore its stack pointer.
- Resume Execution: We return from the interrupt and the next task resumes execution.
The Scheduler
The system uses a simple round robin scheduler. This is one of the simplest schedulers to implement and a good starting point for a preemptive multitasking system. I made two small modifications to it. The first was adding a sleep state, which allows a task to request to be put to sleep and not be scheduled until it is woken up. The second was adding a yield function, which lets a task voluntarily give up the rest of its time slice so the next task can run immediately.
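The round robin pick with the sleep check can be sketched as follows. This is an illustrative stand-in (using std collections for brevity; in the kernel it would be alloc::collections::VecDeque, and each task would also carry its saved stack pointer and FPU state):

```rust
use std::collections::VecDeque;

#[derive(Debug, PartialEq)]
enum TaskState {
    Ready,
    Sleeping,
}

struct Task {
    id: usize,
    state: TaskState,
}

/// Rotate through the queue until a Ready task is found. Sleeping tasks are
/// skipped but stay queued, so they run again once they are woken.
fn schedule(tasks: &mut VecDeque<Task>) -> Option<usize> {
    for _ in 0..tasks.len() {
        let task = tasks.pop_front().unwrap();
        let id = task.id;
        let ready = task.state == TaskState::Ready;
        tasks.push_back(task);
        if ready {
            return Some(id);
        }
    }
    None // every task is asleep: fall back to the idle task
}
```

Returning None when everything sleeps is what makes the idle task below necessary: something always has to be running when the timer interrupt returns.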
// The yield function lets a task voluntarily give up the rest of its time slice.
pub fn yield_now() {
    unsafe {
        core::arch::asm!("int 0x20"); // Manually trigger the Timer/Scheduler IRQ
    }
}
// The idle task runs when there are no other tasks to run.
pub fn idle_task() -> ! {
    loop {
        // Increment a global counter of "idle time"
        unsafe { crate::globals::IDLE_TICKS.fetch_add(1, core::sync::atomic::Ordering::Relaxed) };
        // Use 'hlt' to stop the CPU until the next interrupt.
        // This is crucial: 'hlt' makes the loop tick at the same rate as the timer.
        unsafe {
            core::arch::asm!("hlt");
        }
    }
}