How to create an operating system for x86 within a month from 0?
Tutorial by Geri
This tutorial explains how to make an OS within a month, starting from 0. It does not require
special knowledge, since all the source code snippets and information are available on the internet
with simple searches. Of course it requires special knowledge in the sense that you must be a REAL
programmer to do it (and not some random SQL scripter): you must have REAL knowledge about computers
(RAM, HDD, video cards). It is not enough to have just read about them, you must have experience.

About x86:
x86 is obsolete crap, a dinosaur that has already died out, but we still have a lot of skeletons
around, as it dominated the market from the 90s to the beginning of the 2010s. Then the ARM
architecture killed it, and nowadays 95% of the chips sold are ARM. We must understand that x86 was
able to gain such popularity among corporations in the past because it is basically a hardware
obfuscator: its opcodes were extremely difficult, and due to the instruction set it was almost
impossible to reverse engineer or disassemble a program written for it.
Of course nowadays we have high level languages such as C, so simply disassembling a
program will not magically recreate the source code in any way - and we have had disassemblers for
x86 for a long time, so its architecture is no longer an advantage in this respect.
So x86 is basically an extremely complex instruction set. The encoding of the opcodes is extremely
complex and was never even officially released in full, only in fragments. The whole architecture
was designed in the hope that accurate replicas would be impossible, which would give Intel endless
domination of its market. However, corporations were able to reverse engineer the architecture and
create compatible solutions. There were a lot of x86 manufacturers. Still, understanding the x86
platform well enough to just list which opcode does what, and what each special IO location and
special interrupt does, would probably fill a one million page book, and we have not even talked
about the graphics chips, which introduced the same **** again. x86 was basically developed in a
way that anybody was able to jump up and add some bullshit instruction at will, with random
length and random parameters, randomly manipulating registers, flags and memory locations.
x86 opcodes are extremely complex. They can have prefixes, and they work with memory segments (this
means locations being added to other locations and being multiplied based on additional bytes in
the opcode). The part of the opcode that describes the opcode ID can also have various lengths, and
the register/segment bytes act differently on almost every instruction and can pull in additional
data that is needed just to interpret the instruction itself. This is not documented, not
deterministic, and it does not follow any logic or sense. This was one of the reasons it was hard
to disassemble the software written for it in early times. Later they added the FPU (as if fixed
point numbers were not good enough), new opcodes, SIMD (MMX, SSE, 3DNow), 64-bit mode, and more
SIMD crap like AVX (which is probably not even used anywhere in the entire world), and removed
some of the old ones.
Today's x86 CPUs are not native x86 CPUs, because their size would be bigger than your head if they
implemented all of this garbage ****. Instead, they are RISC CPUs with a built-in hardware emulator
that decodes and executes the x86 instructions, and their efficiency is far below 1%. Oh, and we
have not even talked about the fact that x86 CPUs have multiple operation modes, including the
original 16 bit real mode, where everything starts up, and the protected mode with two kinds of
security model (one for system level, and one for applications). Then there are multiple kinds of
interrupt handlers, BIOS calls, multiple disk access models written since the 80s, and input
device conventions.

Then why not ARM?
People believe that ARM is a free platform that can be an alternative to x86. This is a
misconception. ARM is not a free platform, it is basically not even a platform. On ARM, every
hardware developer creates his own basic IO system, his own interrupt and DMA system, his own
graphics and input systems, and the ARM CPU generations are not backward compatible at the system
level. So you basically cannot just create an operating system for ARM, because every single piece
of hardware needs a different kernel with different drivers (manufacturers typically compile their
own Linux versions with their own drivers). ARM also caught the same cancer that hit x86: random
people started to jump up and add their new opcodes. Originally ARM had a fixed length instruction
set, which is now only a memory of the past, because Thumb mode added another length to the
instructions. They also added an FPU (for WHY the ****?!), SIMD (VFP, or how the **** it is
called), VFP3, 64 bit with backward user mode compatibility, paging and interrupt systems just as
extremely complex as the ones we have on x86, and the same shitty GPUs with billions of transistors
to paint a fucking pixel on the screen, because they could. Now with ARM you get the same slow, big,
1 billion transistor hardware opcode emulator **** that you get with x86, but you cannot even
write an operating system for it, since it has no unified standard IO for dealing with the hardware
and software in the computer, so you cannot sell a bootable OS image.
(Hopefully both architectures will be killed by subleq.)

WHAT AM I DOING WITH MY LIFE?
You must decide for what purpose you are writing the OS. If you just want to randomly obsess over
phone books, you do not need to write an operating system. You certainly should not waste
years on any platform JUST because you want to impress your mom with what brave things
you can print on the screen using just a floppy (btw in 2017 you should not use a floppy).
1. You want to sell your OS
You must think about which market you want to sell your OS on. There are a lot of markets. Will
your OS fly to Mars on a space rocket? Will your OS run a pantsu vending machine in
Japan? Will it drive a game console on a TV? The possibilities are basically unlimited, and
a good OS can run ALL of these, but you still must have a concept to SELL your OS
on some market.
2. You want your own code for trust reasons
Maybe you, or your corporation, want your own kernel code, because the system security is a
3. You want to learn, or to show some good thing in your resume/university application
In this case you better not put half naked ASCII anime girls in the boot loader like I did.
Or do it, who gives a ****.
Whatever the purpose is, the principles are the same:
1. Your OS has to work on EVERY computer that can be expected to run it. You must decide on
an amount of RAM, a CPU with a certain speed, and an age of the CPU. If your answer to this
question is something like 2 gbyte of RAM, 20 gbyte of hard disk space and a 24 inch monitor,
well, then you are basically already doing it wrong.
2. Again: your OS has to work on EVERY computer that can be expected to run it. It cannot
hang because you expected something to be implemented in way Y, but it is implemented in way X,
or it is not implemented at all.
3. You must test on a lot of computers, not just 1 or 2. Remember, if you bring it to
a potential buyer to show it, and it does not fully work, then there are no excuses.
It is capitalism, your product is measured by what it produces. If nothing, then no
business for you.
4. If there is a feature that works on 100% of computers, and there is a 2x faster
alternative that works on 99% of computers, but you cannot detect WHEN it is supported,
then you are stuck with the first solution. 1% does not sound like a lot, but it is:
if you ignorantly implement 100 features like this, at the end you will have no
hardware that is able to run your OS, since every machine will lack something.

Decide the type of your OS
Nowadays most OSes are optimized for touchscreens and single task operation. This
does not mean that your OS has to be like this. There are a lot of conventions for creating
an OS (maybe you will come up with your own):
1. DOS/unix style: in most cases this basically means a text mode only operating
system, possibly with one task at a time. This means navigating with cd commands
and entering commands to copy files.
Advantages: it is very simple to do, the software is very small, and it is portable
to very old computers due to the very low memory (a few tens of kbytes) and CPU demands.
Disadvantages: you will probably not be able to sell it, and there are
already thousands of nix clones out there (most of them without any usable stuff, however).
2. BASIC: this means having a BASIC interpreter (or another programming language) as
the operating system. When the computer turns on, BASIC comes up, where you
type your simple program code that can be run (or start other programs
from disk). This was popular in the Z80/Commodore times in the 80s (in Eastern Europe
even in the 90s), and was also very popular for educational computers.
Advantages: very simple design with a small memory footprint, and you can port it
very easily to other architectures while keeping compatibility.
Disadvantages: it is difficult to make it user friendly, and after all it is 2017, it MUST
be done in a user friendly way. Another disadvantage is the speed of the interpreter
(assuming you will not write a full compiler).
3. Using GNU userland as tools/GUI
Using the GNU userland basically makes your software a Linux protocol clone
that is slower than Linux and has nothing working properly. It basically turns your
system into a package manager of GNU programs that have thousands of dependencies
just to paint an icon. There are already hundreds of OSes designed for this
purpose, and you cannot really achieve anything with this concept. You will also
have to write special drivers for X (for example graphics drivers) to
allow it to work. You will also need gcc or llvm to run on your system
if you want to be able to compile stuff for it.
4. Building your own desktop operating system
Desktop operating systems are very good, but creating a paint program, a wave player,
games, and basically the whole GUI system will not fit into one month.
This is not the way to go at all, unless you already have experience
and half-done GUIs that can be reused from previous projects (like
5. Use DAWN operating system as a GUI
Dawn operating system is an OS designed to run on the SUBLEQ architecture
to fight against hardware imperialism. It is designed to allow easy
implementation for hardware and emulator creators: it requires no interrupts
and no DMA, and the hardware is easy. The SUBLEQ architecture has only one instruction
(subtract, then jump if less than or equal to zero: B-=A; if(B<=0) jmp C; else jmp+=24).
The hardware emulation is very easy and is described in start/help/hardware.txt.
Everything is pretty much just a few memory locations where you
pass a byte, for example when the disk is accessed.
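
The single instruction described above can be sketched in a few lines of C. This is only a minimal
illustration, not the Dawn emulator: the names subleq_vm, subleq_step and subleq_run are made up
here, memory is indexed by 64 bit cell instead of by byte (so the fall-through advance is 3 cells,
which would match the +24 in the text if each operand is 8 bytes), and treating a negative jump
target as "halt" is an assumption of this sketch. The memory mapped hardware from hardware.txt is
not modelled at all.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical minimal SUBLEQ machine. One cell is one int64_t, so an
   instruction (A, B, C) occupies 3 cells. */
typedef struct {
    int64_t *mem;   /* memory, indexed by cell */
    size_t cells;   /* number of cells in mem */
    size_t ip;      /* cell index of the current instruction */
} subleq_vm;

/* Runs one instruction: mem[B] -= mem[A]; if (mem[B] <= 0) ip = C; else ip += 3.
   Returns 1 to continue, 0 when the machine stops (bad address, or a negative
   jump target, which this sketch treats as "halt"). */
int subleq_step(subleq_vm *vm)
{
    if (vm->ip >= vm->cells || vm->cells - vm->ip < 3)
        return 0;                       /* instruction outside memory */

    int64_t a = vm->mem[vm->ip];
    int64_t b = vm->mem[vm->ip + 1];
    int64_t c = vm->mem[vm->ip + 2];
    if (a < 0 || b < 0 || (uint64_t)a >= vm->cells || (uint64_t)b >= vm->cells)
        return 0;                       /* operand outside memory */

    /* Subtract through unsigned arithmetic so that overflow wraps around
       instead of being undefined behaviour. */
    vm->mem[b] = (int64_t)((uint64_t)vm->mem[b] - (uint64_t)vm->mem[a]);

    if (vm->mem[b] <= 0) {
        if (c < 0)
            return 0;                   /* halt convention of this sketch */
        vm->ip = (size_t)c;
    } else {
        vm->ip += 3;
    }
    return 1;
}

/* Steps until the machine stops or max_steps instructions have continued.
   Returns the number of steps that asked to continue. */
size_t subleq_run(subleq_vm *vm, size_t max_steps)
{
    size_t n = 0;
    while (n < max_steps && subleq_step(vm))
        n++;
    return n;
}
```

A real emulator would spend nearly all of its time in a loop like subleq_step, which is why the
efficiency of this loop decides the performance of the whole system.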
Using DAWN operating system ( http://DawnOS.tk ) gives you compatibility
with other kernels using Dawn, since Dawn runs their stuff
inside itself on the subleq architecture. It has a GUI and every basic thing that you
will need, from a C compiler to a paint program, and it also has SMP support.
The disadvantage is basically the emulation itself, which must be done efficiently
to have decent performance. In this tutorial we assume you chose this.

x86 cpu modes
Your OS can run in 16 bit mode, where you can access 640k of RAM with segmentation
(16 bit segment*16 + 16 bit offset), or, if you discard the segments, you will
have 64k, which can be enough for a BASIC or a unix clone, but will not be
enough for running DawnOS, since it needs 384 mbyte of RAM. However, there is
a mode of x86 called UNREAL mode, where you can use 32 bit memory addresses
and 32 bit registers, so you can see up to approx. 3 gbyte of RAM (no, not all the 4).
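
The segment*16+offset rule can be written down as a small C helper. This is just a sketch of the
arithmetic, and the function names real_mode_linear and linear_to_real_mode are hypothetical, they
do not come from SmallerC or any other tool mentioned here.

```c
#include <stdint.h>

/* Real mode address: 16 bit segment * 16 + 16 bit offset. */
uint32_t real_mode_linear(uint16_t segment, uint16_t offset)
{
    return ((uint32_t)segment << 4) + offset;
}

/* Splits a linear address below 1 mbyte into a segment:offset pair with the
   smallest possible offset (0..15). Returns 0 on success, -1 if the address
   cannot be expressed this way. */
int linear_to_real_mode(uint32_t linear, uint16_t *segment, uint16_t *offset)
{
    if (linear > 0xFFFFFu)
        return -1;
    *segment = (uint16_t)(linear >> 4);
    *offset = (uint16_t)(linear & 0xFu);
    return 0;
}
```

Note that many segment:offset pairs point to the same byte: 0x07C0:0x0000 and 0x0000:0x7C00 both
give 0x7C00. The highest pair, 0xFFFF:0xFFFF, gives 0x10FFEF, which is slightly above 1 mbyte.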
There is a C compiler called SmallerC, created by Alexey Frunze. It is capable
of handling unreal mode and can produce general .exe files that can be loaded and
tested with DOS, or booted with alexfru's boot loader called bootprog.
It is possible to make your own boot loader, since alexfru took care to create
a flexible boot loader. However, he only created a stub boot loader that must be
modified from disk to disk (or you should create CHS autodetection).
Or you can use gcc to build your software, and use the grub boot loader (also mostly used
for Linux distributions). This will require 32 bit mode, and you lose the ability
to use BIOS interrupts (unless you start to step back to real mode, or implement
v86 tasks, which you will just not do within a month). With SmallerC, you will
be able to use BIOS interrupts, and you will not have to deal with protected
mode. However, you lose performance, and you will need to add inline assembly.
You can also use an assembler, you basically do not need a C compiler. Assembler is very
handy in most cases. You do not need to be an ASM pro to create an OS,
and everything is well documented, so you can find everything you need on the net.
No special things are needed to write an OS in it, it is almost identical
to writing any more conventional program in it.
Deciding the compiler also decides the memory model you will use. From this point on, you
will need to rewrite the entire kernel if you did not choose wisely, and then you will probably
not fit into the 1 month time limit.

Protected mode vs real mode on x86
The biggest differences between these two modes are:
1. In real mode, you can use BIOS based interrupts (for graphics, disk access).
2. In protected mode, you can use paging. This means the CPU has hardware accelerated
memory address translation for the software running on your kernel, so it uses virtual
memory addresses that are mapped to real addresses. This is done by passing the address of
arrays (the translation tables) to control registers.
3. So doing the memory management in protected mode is a lot more difficult, but when (if)
you need to do address translations or software relocations in real mode, that will be
difficult too. Both will bring up problems. However, for a Dawn emulator, or for
a BASIC interpreter operating system, you do not need memory management or address translation
at all, so you can go with real mode.
4. However, in real mode, you cannot have 64 bit mode. To have 64 bit mode, you need to
enter protected mode first, and then enter long (64 bit) mode. 64 bit mode
is however only supported on the newer x86 processors, beginning with the last generation of
Pentium 4 (Pentium D) CPUs and the Athlon 64 (K8).
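
Whether a given CPU has 64 bit mode can be checked at run time with the CPUID instruction. The
sketch below only decodes the register values, it does not execute CPUID itself (that part needs
inline assembly and depends on your compiler). The struct and function names are hypothetical. The
bit positions are the ones documented by Intel and AMD: in EDX of leaf 1, bit 0 is the FPU, bit 15
is CMOV and bit 25 is SSE, and in EDX of leaf 0x80000001, bit 29 is long mode. It assumes the caller
has already verified that CPUID exists and that the extended leaf is available, and passes 0 for
the extended value otherwise.

```c
#include <stdint.h>

/* Feature flags this sketch cares about. */
typedef struct {
    int fpu;        /* on-chip x87 FPU */
    int cmov;       /* CMOV instruction */
    int sse;        /* SSE SIMD */
    int long_mode;  /* 64 bit (long) mode */
} x86_features;

/* leaf1_edx:     EDX after CPUID with EAX=1.
   ext_leaf1_edx: EDX after CPUID with EAX=0x80000001, or 0 if that leaf is
                  not available on the CPU. */
x86_features decode_x86_features(uint32_t leaf1_edx, uint32_t ext_leaf1_edx)
{
    x86_features f;
    f.fpu = (int)((leaf1_edx >> 0) & 1u);
    f.cmov = (int)((leaf1_edx >> 15) & 1u);
    f.sse = (int)((leaf1_edx >> 25) & 1u);
    f.long_mode = (int)((ext_leaf1_edx >> 29) & 1u);
    return f;
}
```

Keeping the decoding separate from the instruction itself makes it possible to test the logic on
any machine, and to fall back to "no features" on CPUs that have no CPUID at all.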
Let us assume you chose DawnOS and SmallerC to create your OS. Decide the minimal hardware requirements:
-DawnOS needs 256 mbyte of memory, and an additional 64 mbyte is reserved for hardware access. This does
not mean that you need to have all of this memory, since if you write an emulator, you can
cache the data out to the HDD. But since very old computers would be too slow to
emulate Dawn at decent speeds anyway, we can draw a line and say that 384 mbyte of RAM is a realistic
-384 mbyte of RAM cannot realistically be installed in computers with 486 CPUs or earlier
designs, so they can be ruled out immediately. They would not have enough speed anyway. 5x86
or newer architecture CPUs can have this amount of RAM, however. Even if a 586 CPU
would possibly be too slow to emulate Dawn (with a full naive emulation), we have to draw the limit
at supporting a 586 class CPU. This means we cannot rely on SIMD (SSE). If we use it, we must detect it with cpuid.
If the compiler supports multiple CPUs, we should specify that we are building for 586 CPUs.
The CMOV instruction is also not supported on these old architecture CPUs. From the 586 on, we have
multiple manufacturers and models, including Intel, AMD, Cyrix, VIA, Transmeta, SIS, IDT, Rise,
DMP, RDC, NexGen.
-Some older CPUs lack some features you might use - for example, they lack an FPU, or lack
the CPUID instruction. These are extremely rarely used nowadays - such as the NexGen, or old
Cyrix models without CPUID. It is not worth caring about these CPUs alone, but CPUs for industrial
and embedded systems can use cores similar to these, or can even use these old designs by
licensing them. This basically makes it logical to buy one of these old CPUs, and test whether
it is able to run your code or not. Even if it seems illogical to care about these old
designs, it can save you a lot of headache if your operating system can work in environments
where an unknown x86 clone with an extremely old feature set is working. Do not spend too much
time on these CPUs, but be aware of them, and make sure that your OS can at least boot on them.
-Select a recommended machine requirement. This should be a realistic and reasonable machine
that is able to run your operating system in a responsive way. If you write a unix clone or a
bootable BASIC OS, this can be a much slower machine than for an operating system that runs
a GNU userland or emulates DAWN OS. Test what the minimum is that can still run the OS
properly, and DO NOT make assumptions: always test how it works on the specified lowest machine
to ensure it will actually be usable.

Starting the OS on x86:
Your boot loader is limited to 512 bytes, and it must be placed in the first sector of the HDD.
This is called the MBR. It should be able to load your operating system kernel from the disk (I suggest
placing your OS into the first 1 mbyte of your disk, and the rest can be any file system).
You must be aware that the boot sector must be specially prepared: the end contains the partition
table, and it needs some special data to be recognized as a real bootable disk. A lot of BIOSes
have issues with this, and you will have to play around a lot to get it working everywhere. x86 is not
deterministic in such cases, and it is basically not properly documented. There are a lot of loaders
on the internet to teach you how to do it, but you will still have to run tests and try things randomly
to get it working on various computers and reach the best results.
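
The special preparation mentioned above can be checked from C on a sector image before you write it
to a disk. The sketch below assumes the classic MBR layout: four 16 byte partition entries starting
at offset 446, and the bytes 0x55 0xAA at offsets 510 and 511. The function names are hypothetical.
Note that with this layout only the first 446 bytes are left for the loader code itself.

```c
#include <stdint.h>

enum {
    MBR_SIZE = 512,         /* size of the boot sector */
    MBR_PART_TABLE = 446,   /* offset of the first of 4 partition entries */
    MBR_PART_ENTRY = 16,    /* size of one partition entry */
    MBR_SIGNATURE = 510     /* offset of the 0x55 0xAA boot signature */
};

/* Returns 1 if the 512 byte sector ends with the 0x55 0xAA signature. */
int mbr_has_signature(const uint8_t *sector)
{
    return sector[MBR_SIGNATURE] == 0x55 && sector[MBR_SIGNATURE + 1] == 0xAA;
}

/* Returns the index (0..3) of the first entry flagged active (0x80),
   or -1 if no entry is flagged. */
int mbr_active_partition(const uint8_t *sector)
{
    for (int i = 0; i < 4; i++)
        if (sector[MBR_PART_TABLE + i * MBR_PART_ENTRY] == 0x80)
            return i;
    return -1;
}

/* Returns the first sector (LBA) of the given entry. The value is stored
   as a little-endian 32 bit number at offset 8 inside the entry. */
uint32_t mbr_partition_start(const uint8_t *sector, int index)
{
    const uint8_t *e = sector + MBR_PART_TABLE + index * MBR_PART_ENTRY + 8;
    return (uint32_t)e[0] | ((uint32_t)e[1] << 8) |
           ((uint32_t)e[2] << 16) | ((uint32_t)e[3] << 24);
}
```

Running checks like these on the image file is cheap, and it rules out the most basic mistakes
before you start the trial and error on real BIOSes.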
Some people use two stage boot loaders, where the first stage just loads the second stage
that enters protected mode (if such things are needed), sets up graphics, etc. This is not
necessary. Your boot loader should use CHS. https://en.wikipedia.org/wiki/INT_13H

Disk access:
x86 has multiple ways to access the disk. The first way is the CHS tuple (int 0x13). It works through a BIOS
interrupt, and the disk is addressed with cylinder/head/sector variables to read/write one
512 byte sector at the specified memory address. This is the most compatible way. It can access
2 floppy and 2 hard disks, but the accessible size is limited to an approx. 8 gbyte data area (1024*63*255*512).
Also, you must first ask the BIOS through the same interrupt for the maximal CHS values.
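
The CHS arithmetic is easy to get wrong, so here is a small sketch of it in C. It assumes you
already got the number of heads and sectors per track from the BIOS. The struct and function names
are hypothetical. The packing of the cylinder and sector into CX follows the usual int 0x13
convention (CH holds the low 8 bits of the cylinder, CL holds the sector in bits 0-5 and the top 2
cylinder bits in bits 6-7), and the head goes into DH.

```c
#include <stdint.h>

typedef struct {
    uint16_t cylinder;  /* 0..1023 */
    uint8_t head;       /* 0..heads-1 */
    uint8_t sector;     /* 1..sectors_per_track (sectors are 1-based) */
} chs_addr;

/* Size in bytes of a disk with the given geometry and 512 byte sectors. */
uint64_t chs_capacity_bytes(uint32_t cylinders, uint32_t heads,
                            uint32_t sectors_per_track)
{
    return (uint64_t)cylinders * heads * sectors_per_track * 512u;
}

/* Converts a linear sector number to CHS. Returns 0 on success, -1 if the
   geometry is invalid or the sector is not reachable with CHS. */
int lba_to_chs(uint32_t lba, uint32_t heads, uint32_t sectors_per_track,
               chs_addr *out)
{
    if (heads == 0 || heads > 256 ||
        sectors_per_track == 0 || sectors_per_track > 63)
        return -1;

    uint32_t cylinder = lba / (heads * sectors_per_track);
    if (cylinder > 1023)
        return -1;

    out->cylinder = (uint16_t)cylinder;
    out->head = (uint8_t)((lba / sectors_per_track) % heads);
    out->sector = (uint8_t)(lba % sectors_per_track + 1);
    return 0;
}

/* Packs cylinder and sector the way int 0x13 expects them in CX. */
uint16_t chs_pack_cx(chs_addr a)
{
    return (uint16_t)(((a.cylinder & 0xFFu) << 8) |
                      ((a.cylinder >> 2) & 0xC0u) |
                      (a.sector & 0x3Fu));
}
```

With the maximal geometry from the text, chs_capacity_bytes(1024, 255, 63) gives 8422686720 bytes,
which is the approx. 8 gbyte limit.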
Additionally, on HDDs you can use LBA, which originally limited the disk to roughly 128 gbyte (28 bit LBA), but it
was extended later to allow a larger maximal size. You can also implement an ATA handler, where you
copy the data from/to IO ports to access the disk. I recommend CHS, because it will be implemented
properly on all computers. If you run out of the 8 gbyte, you can use the LBA method for the bigger
addresses. CHS will give you a performance of 5-10 mbyte/second, which should be more than enough
for generic usage. If you need it faster than this, you will have to write DMA. But then you will
not fit into the one month limit. If the interrupt fails, try another one, especially on real hardware.

Memory management:
As I have already mentioned, you need to decide between using real mode and protected mode,
and I listed the advantages and disadvantages of both. There are additional problems you should be aware of:
1. One is the A20 gate, which acts as a memory barrier that keeps DOS programs from reaching memory locations
above 1 mbyte. SmallerC will handle this for you in unreal mode, but if you use assembly, you must find out
how to turn the barrier off (and you must design a careful method to see whether it is set or not anyway, since
most modern Intel chips lack it).
2. x86 - just like other architectures - has "holes" in the memory, where the hardware devices
are mapped in. The first hole begins at 640k and goes up to 1 mbyte, where the VGA devices and some
other legacy hardware are mapped in. The second begins at 15 mbyte and goes up to 16 mbyte, where the
ISA hardware is mapped in. The third begins at 3 gbyte and goes up to 4 gbyte, where all the rest of the
hardware is mapped in (for example the SVGA frame buffer).

Graphics:
You will not write drivers for the graphics cards, but there is the pretty much standardized VESA VBE for SVGA
cards, which works almost the same way on every computer. It can be accessed through an interrupt offered by
the video card BIOS. If VESA is not present, you can fall back to traditional VGA methods to render.
http://www.delorie.com/djgpp/doc/ug/graphics/vesa.html
Some SVGA cards will give you a banked frame buffer, where you must switch the memory banks to access the
whole area. You must detect this through the VESA init (it will give you data back). If the VESA init seems
to fail, then it is an old/incomplete VESA implementation, and you must request a true color 640x480
mode to see whether it is possible to set it or not. If it still fails, then use the VGA fallback.
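
With a banked frame buffer, every pixel access first needs to find out which bank the pixel is in.
The sketch below shows only this arithmetic. It assumes you already read the bytes per scan line
(pitch) and the window granularity in bytes from the mode information returned by VBE. The names
vbe_bank_pos and vbe_locate_pixel are hypothetical, and the actual bank switch (a VBE call through
int 0x10) is not shown.

```c
#include <stdint.h>

typedef struct {
    uint32_t bank;    /* window position, in granularity units */
    uint32_t offset;  /* byte offset of the pixel inside the window */
} vbe_bank_pos;

/* Finds the bank and the offset inside the bank for pixel (x, y).
   pitch is the number of bytes per scan line, granularity is the bank
   step in bytes. Returns 0 on success, -1 on invalid granularity. */
int vbe_locate_pixel(uint32_t x, uint32_t y, uint32_t pitch,
                     uint32_t bytes_per_pixel, uint32_t granularity,
                     vbe_bank_pos *out)
{
    if (granularity == 0)
        return -1;

    /* Byte position of the pixel as if the frame buffer was linear. */
    uint64_t linear = (uint64_t)y * pitch + (uint64_t)x * bytes_per_pixel;

    out->bank = (uint32_t)(linear / granularity);
    out->offset = (uint32_t)(linear % granularity);
    return 0;
}
```

Be careful with 3 byte pixels: a single pixel can start in one bank and end in the next one, so
either write such pixels byte by byte, or check that offset + bytes_per_pixel still fits into the
window. Also switch banks only when the bank number really changes, because the switch is slow.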
To access graphics, you need int 0x10.

Mouse and keyboard:
Mouse and keyboard are emulated by the BIOS in PS/2 compatible mode. Or at least the keyboard is, because
the mouse will pretty much be broken when it is a USB mouse. USB is difficult, you will not be able to
do it within the 1 month time limit. Keyboard data arrives on port 0x60, and port 0x64 tells you whether the
packet was sent by the mouse (portin(0x64)&32) or by the keyboard. The keyboard sends simple scancodes to deal
with. The mouse must first be enabled by enabling the secondary PS/2 port; check the corresponding
tutorial on how to ask the mouse to start sending mouse packets to you. Debugging the mouse will need at
least 2-3 computers, since emulators will just fool you into believing that it works when it does not.

Sound:
Sound - I have not implemented this, because I do not care - can be done with AC97. AC97 is not a chip,
it is actually a protocol. The first version of AC97 will usually work with almost all integrated
sound hardware, which can be driven in fixed 48 khz stereo mode (which will probably be fine for you).
You may need interrupts for this (or maybe you can do without them too).
Remember that the init may hang your system, so users must be able to disable it.

x86 performance issues with locations:
NOTE: on x86 you can have performance issues when the code/data sits at certain memory locations.
Beware especially of 2k and 4k data boundaries. Unaligned memory access is supported on x86,
but it can make your code speed fall by 80% if you accidentally hit such a location without being
aware of it. High level languages will hide this, but compilers like SmallerC, or most assemblers, will not do
it for you. Basically, if you think your code is slow for some unknown reason, just try padding
the code with a few nops before your function.

SMP:
Cores will wake up for this sequence:
"MOV ESI, 0xFEE00300\n"    /* local APIC interrupt command register */
"MOV EAX, 0x000C4500\n"    /* INIT for all cores except the current one */
"MOV [ESI], EAX\n"
*a bit of wait loop here*
"MOV EAX, start pointer goes here\n"    /* startup value, see below */
"MOV [ESI], EAX\n"
*a bit of wait loop here*
"MOV [ESI], EAX\n"         /* the startup is sent a second time */
*a bit of wait loop here*
where your start pointer is 0x000C4600|((new eip/16)/256).
The 0th core is unaffected by this sequence. The new IP must be on a 4k boundary. A core can detect its ID number from the
APIC ID returned by cpuid, and can decrease power consumption with the PAUSE instruction.
If cpuid returns 0 as the APIC ID, it is an old dual core configuration, and you must assume 1.
The user should be able to skip SMP init, if you implement it, because it may cause trouble on some computers.
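
The start pointer value from the sequence above can be computed and validated in C before it is
written to the APIC. This is only a sketch with hypothetical names. It assumes the start address
must be on a 4k boundary and below 1 mbyte, because only 8 bits are available for the page number,
and that the APIC ID is taken from bits 24-31 of EBX after CPUID with EAX=1.

```c
#include <stdint.h>

#define ICR_INIT_ALL_BUT_SELF    0x000C4500u  /* first value written in the text */
#define ICR_STARTUP_ALL_BUT_SELF 0x000C4600u  /* base of the start pointer value */

/* Builds the value to write to 0xFEE00300 for starting the other cores at
   start_addr. (eip/16)/256 from the text is the same as eip >> 12.
   Returns 0 if the address is not usable. */
uint32_t smp_startup_value(uint32_t start_addr)
{
    if (start_addr & 0xFFFu)
        return 0;                   /* not on a 4k boundary */
    if (start_addr >= 0x100000u)
        return 0;                   /* page number would not fit into 8 bits */
    return ICR_STARTUP_ALL_BUT_SELF | (start_addr >> 12);
}

/* Extracts the APIC ID from EBX as returned by CPUID with EAX=1. */
uint8_t apic_id_from_ebx(uint32_t ebx)
{
    return (uint8_t)(ebx >> 24);
}
```

For example, a start routine placed at 0x8000 gives the value 0x000C4608. Since the woken cores
begin in real mode at that address, the routine there has to be 16 bit code.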
This will bring up all the cores, even the disabled ones, so do not be surprised if you get 8 cores on a 6 core machine.

Time:
Memory location 0x46c contains the value of an 18.2 hz timer.

Conclusion:
Implementing and debugging the mouse will need 2-3 days. Keyboard ASCII charset conversion will need 1-2 days.
SVGA/VGA/VBE will need 2-3 days. Disk access will need 2-3 days. The boot loader will take 1 week. The rest of
the OS will take an additional 1-2 weeks (if you use Dawn, or if you write a simple BASIC interpreter or a simple
unix clone). Your OS will be bootable and usable within a month on basically almost all computers around.
Do not expect anything to work perfectly. There are tens of thousands of motherboards, CPU models, graphics chips
and peripherals, and all of them will behave differently. Having your OS running in qemu is not enough, real
life tests will have to be done to ensure everything is fine (for example you expect the character to be white,
but on a real world computer the bit may be interpreted as the blinking bit, etc).
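
The blinking example can be avoided by building the text mode attribute byte carefully. The sketch
below assumes the standard VGA color text mode layout: the low 4 bits are the foreground color,
bits 4-6 are the background color, and bit 7 is either blink or a bright background, depending on
how the hardware is configured. The function names are hypothetical.

```c
#include <stdint.h>

/* Builds an attribute byte that never sets bit 7, so it looks the same
   whether the hardware treats that bit as blink or as bright background.
   fg is 0..15, bg is limited to 0..7. */
uint8_t vga_attr(uint8_t fg, uint8_t bg)
{
    return (uint8_t)(((bg & 0x07u) << 4) | (fg & 0x0Fu));
}

/* Builds one 16 bit text mode cell: attribute in the high byte,
   character code in the low byte. */
uint16_t vga_cell(char ch, uint8_t attr)
{
    return (uint16_t)(((uint16_t)attr << 8) | (uint8_t)ch);
}

/* Returns 1 if the attribute uses the ambiguous bit 7. */
int vga_attr_may_blink(uint8_t attr)
{
    return (attr & 0x80u) != 0;
}
```

If you limit yourself to the 8 dark background colors, the same attribute will look identical in
qemu and on real machines, no matter which way bit 7 is interpreted.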
Do not spend too much time on x86, because it has died out. After you succeed, you might try some ARM platform,
which is less braindead (but remember, it will not have unified IO systems at all).