operation-hack/STARTING_IN_C.md
segfault ee4a8a65e1 Add a getting started guide
This commit adds a very simple guide to getting started with a kernel in C. It uses GRUB, Multiboot 2, and CLANG as the compiler.
It does not provide instructions on building a cross-compiler, and instead uses CLANG's builtin `-target` flag.
2025-02-27 17:59:50 +00:00

Starting with Multiboot in C

This is a simple guide to getting started with kernel development in C using Multiboot 2. Please do not copy and paste this tutorial's code. If you absolutely need the same code in your kernel, type it out: typing something, even from a tutorial, helps you understand it far better than pasting does. If you have any questions, feel free to ask someone.

Setting up Dependencies

First, install Clang and the basic tools we will be using. On Debian- or Ubuntu-based systems, you can run this:

sudo apt install build-essential clang nasm qemu-system-x86 xorriso

Let's break it down.

  • build-essential provides basic build tools like make, which we will be using for build management. With a Makefile in place, we only need to run make to build our kernel. Pretty simple, right?
  • clang is a C compiler built on the LLVM toolchain. It is fast, and while you could also use gcc, Clang makes it very easy to cross-compile. More on that later.
  • nasm is my assembler of choice. We will be writing the assembly for this tutorial in Intel-syntax x86 assembly, and personally, I like NASM.
  • qemu-system-x86 is the emulator. QEMU emulates a whole PC in software, so we can boot our kernel without touching real hardware, and it is very easy to set up.
  • xorriso is a tool for generating ISOs. We will not be calling it directly, but grub-mkrescue, a tool we will be using, invokes it. If grub-mkrescue is missing on your system, it is usually provided by the grub-common package, and BIOS-bootable images also need grub-pc-bin.

Now, make a directory for your project and cd into it. Let's call it myos, though you can change this easily.

Hello C

In this section, we are going to execute a basic C program. It won't print anything to the screen for now; we'll get to that later. I am choosing Multiboot 2 as our boot protocol, with GRUB as the bootloader, as it is fairly simple, proven to work, and easy to set up.

Make a file called start.asm in your project directory.

This will be our header code to tell Multiboot 2 about our kernel. Type this into start.asm.

; Multiboot 2 magic number. This is defined in the Multiboot spec.
MAGIC equ 0xE85250D6

; States our architecture as i386
ARCH equ 0

LEN equ (header_end - header_start)
CHECKSUM equ 0x100000000 - (MAGIC + ARCH + LEN)

section .multiboot2
; Align to 8 bytes (64 bits)
align 8
header_start:
	dd MAGIC
	dd ARCH
	dd LEN
	dd CHECKSUM  

	; REQUIRED MULTIBOOT 2 END TAG
	dw 0
	dw 0
	dd 8
; end symbol so we can calculate length dynamically
header_end:

Note that in the future, you might want to add more Multiboot tags. Every tag must start on an 8-byte boundary, or the bootloader will reject the header. You can ensure this by putting an align 8 directive before each tag.
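
To see why the CHECKSUM expression works, here is a minimal host-side sketch in C (it is not part of the kernel). The Multiboot 2 spec requires the magic, architecture, header length, and checksum fields to sum to zero as a 32-bit unsigned value. The function name mb2_checksum and the constant names are made up for illustration, and the length of 24 assumes the header above: four 4-byte fields plus the 8-byte end tag.

```c
#include <stdint.h>

/* Values from the header in start.asm. */
#define MB2_HEADER_MAGIC 0xE85250D6u
#define MB2_ARCH_I386    0u

/* Hypothetical helper: returns the value that makes
 * magic + arch + len + checksum wrap around to 0 in 32 bits. */
uint32_t mb2_checksum(uint32_t magic, uint32_t arch, uint32_t len) {
    /* Unsigned arithmetic wraps modulo 2^32, so negating the sum
     * is the same as 0x100000000 - (magic + arch + len). */
    return (uint32_t)(0u - (magic + arch + len));
}
```

For the 24-byte header above, mb2_checksum(MB2_HEADER_MAGIC, MB2_ARCH_I386, 24) gives 0x17ADAF12, and adding all four fields together wraps to zero.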

Now, we need to add a stack. You should know what a stack is, but if not, look over here.

section .bss
align 16
stack_bottom:
	resb 16384
stack_top:
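
If the stack_bottom and stack_top labels look backwards, remember that the x86 stack grows downward: a push moves esp down before storing. Here is a rough host-side model of that in C. The names (model_stack, model_push, and so on) are made up, and the array only stands in for the resb 16384 reservation.

```c
#include <stdint.h>
#include <stddef.h>

#define STACK_SIZE  16384
#define STACK_SLOTS (STACK_SIZE / sizeof(uint32_t))

/* Stands in for the 16 KiB reserved in .bss. */
static uint32_t model_stack[STACK_SLOTS];

/* Index of the "stack pointer", starting one past the end (stack_top). */
static size_t model_sp = STACK_SLOTS;

/* Like a 32-bit push: move the stack pointer down, then store.
 * No overflow check, just like the real stack. */
void model_push(uint32_t value) {
    model_sp -= 1;
    model_stack[model_sp] = value;
}

/* Like a 32-bit pop: load, then move the stack pointer back up. */
uint32_t model_pop(void) {
    uint32_t value = model_stack[model_sp];
    model_sp += 1;
    return value;
}

/* Bytes currently in use, i.e. stack_top minus the stack pointer. */
size_t model_used_bytes(void) {
    return (STACK_SLOTS - model_sp) * sizeof(uint32_t);
}
```

This is why the startup code loads esp with stack_top rather than stack_bottom: the first push lands just below the top, and usage grows toward the bottom.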

And now, for the initial code.

; Tells NASM that our primary C kernel exists elsewhere and will be linked.
extern kmain

section .text
global _start:function (_start.end - _start)
_start:
	; Initialise the stack pointer
	mov esp, stack_top
	
	; Push the magic number to the stack.
	push eax
	
	; Push the Multiboot 2 structure to the stack.
	push ebx

	; Call the main kernel function
	call kmain

	; In case kmain returns, we must disable interrupts,
	; then loop. We loop on the `hlt` instruction, which
	; on x86 pauses the CPU until the next interrupt.

	; Disable interrupts
	cli
.hang:
	; Halt until the next interrupt
	hlt

	; A non-maskable interrupt can still wake the CPU, so loop back
	jmp .hang
.end:

Now hopefully the above code made at least some sense to you. If not, Google stuff. We can assemble the file with this command: nasm -felf32 start.asm -o start.o

Now, for the C code. For now, we will just loop:

void kmain(void) {
	while (1) {}
}

You might notice two things about this code's main function. First, the return type is void, not int. Second, it is called kmain. Both are because it is not a normal main function like you might see in a userspace C program: normally, main is called by startup code on behalf of the operating system and returns an exit status to it. In this case, we are the operating system, so our own _start calls kmain and nothing is waiting for a return value. There are a lot more details here I won't really get into, but if you are interested, you can Google it. We can save this file in main.c.
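
Since _start pushes eax and then ebx before calling kmain, a kmain using the standard 32-bit x86 C calling convention could receive them as arguments, last-pushed first. The sketch below is one possible way to validate them, not something this guide's kernel does yet. 0x36D76289 is the value the Multiboot 2 spec says a compliant bootloader leaves in eax; the function name boot_handoff_ok is hypothetical, and treating address 0 as invalid is my own assumption.

```c
#include <stdint.h>

/* Value a Multiboot 2 bootloader places in eax before jumping to us. */
#define MB2_BOOTLOADER_MAGIC 0x36D76289u

/* Hypothetical check that a kmain(uint32_t info_addr, uint32_t magic)
 * could run first. info_addr is what was in ebx (pushed last, so it
 * is the first argument); magic is what was in eax.
 * Returns 1 if the handoff looks valid, 0 otherwise. */
int boot_handoff_ok(uint32_t info_addr, uint32_t magic) {
    if (magic != MB2_BOOTLOADER_MAGIC) {
        return 0; /* not loaded by a Multiboot 2 bootloader */
    }
    if (info_addr == 0 || (info_addr & 7u) != 0) {
        return 0; /* the boot information structure is 8-byte aligned */
    }
    return 1;
}
```

If the check fails, there is not much a kernel can do this early other than hang, which is what returning to _start already does.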

To compile the C code, you can use this command: clang -target i686-elf -c main.c -o kernel.o -std=gnu99 -ffreestanding -O2 -Wall -Wextra

You might notice that I am using the Clang compiler as opposed to the more common GCC. This is because Clang can target other architectures out of the box, so we don't have to build a cross-compiler. You should probably build a cross-compiler anyway, but using Clang's -target flag is fine for now.
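
One way to confirm that the -target and -ffreestanding flags actually took effect is to look at the macros the compiler predefines. Here is a small sketch, assuming Clang or GCC: both define __i386__ when targeting 32-bit x86, and the C standard defines __STDC_HOSTED__ as 0 for freestanding builds. The function names are made up.

```c
/* Returns 1 when built with -ffreestanding (no hosted C library
 * is assumed), 0 otherwise. */
int built_freestanding(void) {
#if defined(__STDC_HOSTED__) && __STDC_HOSTED__ == 0
    return 1;
#else
    return 0;
#endif
}

/* Returns 1 when the compiler is generating 32-bit x86 code,
 * 0 otherwise. */
int targets_i386(void) {
#if defined(__i386__)
    return 1;
#else
    return 0;
#endif
}
```

Compiled with the flags from the command above, both should return 1. In a kernel you could turn the same conditions into #error directives so that a build with the wrong flags fails early.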

Now, you should have two files, kernel.o and start.o, and we need to link these two object files together. In order to do that, we need a linker script. Let's save this in linker.ld.

ENTRY(_start)

SECTIONS
{
	/*
	Begin loading the kernel at 2 MB
	*/
	. = 2M;

	/*
	Put the Multiboot 2 header first, as it must sit within the first
	32 KiB of the executable for the bootloader to find it. Then, we put
	.text, which is where the executable code is stored.
	*/
	.text BLOCK(4K) : ALIGN(4K)
	{
		*(.multiboot2)
		*(.text)
	}

	/*
	Read only data.
	*/
	.rodata BLOCK(4K) : ALIGN(4K)
	{
		*(.rodata)
	}

	/*
	Read and write data (initialized)
	*/
	.data BLOCK(4K) : ALIGN(4K)
	{
		*(.data)
	}

	/*
	Read and write data (uninitialized) and the stack (defined in start.asm)
	*/
	.bss BLOCK(4K) : ALIGN(4K)
	{
		*(COMMON)
		*(.bss)
	}
}
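
To make the BLOCK(4K) and ALIGN(4K) parts more concrete: each output section starts at the next 4 KiB boundary after the previous one ends. Here is a rough sketch of that arithmetic in C. The helper names align_up and next_section_start are hypothetical, the section size in the example is invented, and the real addresses always come from the linker.

```c
#include <stdint.h>

#define KERNEL_BASE 0x200000u /* ". = 2M" in linker.ld */
#define PAGE_SIZE   0x1000u   /* 4K */

/* Round addr up to the next multiple of align (a power of two). */
uint32_t align_up(uint32_t addr, uint32_t align) {
    return (addr + align - 1u) & ~(align - 1u);
}

/* Where the next 4K-aligned section begins, given where the current
 * one starts and how many bytes it holds. */
uint32_t next_section_start(uint32_t start, uint32_t size) {
    return align_up(start + size, PAGE_SIZE);
}
```

For example, if .text held 0x1234 bytes, .rodata would begin at next_section_start(KERNEL_BASE, 0x1234), which is 0x202000.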

Now, we can link it with the command clang -target i686-elf -T linker.ld -o kernel.elf -ffreestanding -O2 -nostdlib start.o kernel.o -lgcc. This will link it into an ELF file called kernel.elf. Note that we use Clang again for linking. On my system at least, it internally calls the host ld with the correct flags for the target. Again, this is sub-optimal, and you should build your own cross-compiler instead.

Now, make a file called grub.cfg and type out the following code:

menuentry "MyOS" {
	multiboot2 /boot/kernel.elf
}

Then, you can run this small script, or paste the commands into your shell one by one, to build the ISO. You might have encountered ISOs before: they are disc images, and they can be made bootable.

mkdir -p isodir/boot/grub
cp kernel.elf isodir/boot/
cp grub.cfg isodir/boot/grub/
grub-mkrescue -o myos.iso isodir

And now, you get to run it in QEMU with the command qemu-system-x86_64 -cdrom myos.iso. And for the moment of truth... you see a black screen!

This is expected behavior, as our kernel's main function is empty except for the infinite loop.

Hello A

While a black screen with a blinking cursor is interesting and all, let's try something a little cooler. Try changing your C code to something like this:

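#include <stdint.h>

void kmain(void) {
	// 0xB8000 is the location of the VGA text buffer.
	// volatile stops the compiler from optimizing the write away.
	volatile uint16_t *buffer = (volatile uint16_t *)0xB8000;

	// Each cell is 16 bits: the character in the low byte and the
	// color attribute in the high byte. 3 is the color code for cyan,
	// so we shift it left by eight to make it the foreground color.
	buffer[0] = 'A' | (3 << 8);

	// Infinite loop
	while (1) {}
}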

You should see the cyan letter A in the top left corner!
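
If you want to experiment with other colors, it helps to build the 16-bit cell from its parts. In VGA text mode, the low byte of a cell holds the character and the high byte holds the attribute, with the foreground color in its low four bits and the background color above that. The helper names vga_entry and vga_put_string below are made up, and the buffer is passed as a parameter so the code can be tried on an ordinary array before pointing it at 0xB8000.

```c
#include <stdint.h>
#include <stddef.h>

/* A few of the 16 standard text-mode color codes. */
enum { VGA_BLACK = 0, VGA_CYAN = 3, VGA_WHITE = 15 };

/* Pack a character and its colors into one text-mode cell. */
uint16_t vga_entry(char c, uint8_t fg, uint8_t bg) {
    uint8_t attr = (uint8_t)((bg << 4) | (fg & 0x0F));
    return (uint16_t)((uint16_t)(unsigned char)c | ((uint16_t)attr << 8));
}

/* Write a NUL-terminated string into consecutive cells and return
 * the number of cells written. No bounds checking or newline
 * handling: the caller must make sure the string fits. */
size_t vga_put_string(volatile uint16_t *buffer, const char *s,
                      uint8_t fg, uint8_t bg) {
    size_t i = 0;
    while (s[i] != '\0') {
        buffer[i] = vga_entry(s[i], fg, bg);
        i++;
    }
    return i;
}
```

In the kernel, calling vga_put_string((volatile uint16_t *)0xB8000, "Hello", VGA_CYAN, VGA_BLACK) would write the text starting at the top-left corner.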

Now, at this point, your directory is probably a mess. This is what a simple but organized kernel tree might look like:

.
├── config/
│   ├── grub.cfg
│   └── linker.ld
├── Makefile
├── src/
│   └── asm/
│       └── start.asm

Guidelines for organizing and writing a Makefile coming soon!