So, to simplify the filing system a bit, I'm taking both approaches: using three-letter names for the actual directories and creating human-readable symlinks to them. The root directory will have a '.hidden' file, in the same format used by Nautilus, which lists the app, dev, env, etc. directories to be hidden. This introduces a problem, though, in that symlinks don't work on several filesystems, like FAT32. To work around this limitation, I want to add a new '.symbolic' file that lists symbolic files. These are just normal text files with a path stored in them. When you try to access a file that doesn't exist, the system will look for a '.symbolic' file and check whether the name of that file is listed in it before it decides to toss back an error.
When you try to copy or move symlinks from one filesystem to another, the conversion will be done automatically.
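Here's a rough sketch of how that lookup could work, written in Python just for illustration. A few things in it are my assumptions and not settled design: the '.symbolic' file holds one name per line, the stored path is relative to the directory containing the symbolic file, and the names `read_symbolic_names`, `resolve`, and `demo` are made up for this example.

```python
import os
import tempfile

def read_symbolic_names(directory):
    """Return the names listed in directory/.symbolic (empty set if absent).
    Assumption: one name per line."""
    listing = os.path.join(directory, ".symbolic")
    if not os.path.isfile(listing):
        return set()
    with open(listing, encoding="utf-8") as f:
        return {line.strip() for line in f if line.strip()}

def resolve(root, relpath, max_hops=16):
    """Walk relpath below root, following text-file 'symlinks' on the way."""
    current = root
    parts = [p for p in relpath.split("/") if p]
    hops = 0
    while parts:
        name = parts.pop(0)
        candidate = os.path.join(current, name)
        if os.path.isfile(candidate) and name in read_symbolic_names(current):
            hops += 1
            if hops > max_hops:
                raise OSError("too many levels of symbolic files")
            with open(candidate, encoding="utf-8") as f:
                target = f.read().strip()
            # Assumption: the stored path is relative to the directory
            # that holds the symbolic file.
            parts = [p for p in target.split("/") if p] + parts
            continue
        if not os.path.exists(candidate):
            raise FileNotFoundError(candidate)
        current = candidate
    return current

def demo():
    """Build a throwaway tree and resolve a path through a symbolic file."""
    with tempfile.TemporaryDirectory() as root:
        os.mkdir(os.path.join(root, "app"))
        with open(os.path.join(root, "app", "hello.txt"), "w", encoding="utf-8") as f:
            f.write("hi")
        with open(os.path.join(root, "applications"), "w", encoding="utf-8") as f:
            f.write("app\n")
        with open(os.path.join(root, ".symbolic"), "w", encoding="utf-8") as f:
            f.write("applications\n")
        return os.path.relpath(resolve(root, "applications/hello.txt"), root)
```

In `demo()`, 'applications' is a plain text file containing 'app', so 'applications/hello.txt' ends up pointing at the real 'app/hello.txt'.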
In other news, I'm seriously considering the use of PE files as my executable format. The reasons being:
- They're much faster to load when they don't need to be rebased.
- They allow the use of ordinal numbers for symbols, which can be used to make the files smaller and load faster.
- The files have DOS stubs located at the start of the file; on 8/16-bit embedded systems, I can just use this instead of the whole PE file, resulting in much more compact binaries. On the other hand, ELF files do not support 8/16-bit architectures (and no, simply ignoring the upper 16 bits of all addresses is not 'support', it's wasteful)
- It is very easy to embed metainformation and resources into PE files. While you can do this for ELF files, there's not really a strict standard for it (that I know of), and exposing those resources to the program itself is tedious.
- PE is like, the 'official' executable format of .NET and Mono. As far as I know, ELF does not currently support this.
EDIT:
On second thought, I've decided to stick with ELF, but with an exception. There will be a 'tiny-elf' file prepended to each file that is used for 8 and 16-bit architectures, like how the MZ executable is prepended to PE/COFF.
This 'tiny-elf' format was made specifically with 8/16-bit architectures in mind, both with and without an MMU. For that reason, the format was designed to be very conscious of size. Each file begins with the following 16-byte header:
Code:
u16 signature ; "‰e" (0x8965)
u8 headersize ; headersize + 1 = header size, max = 256 octets
u8 endianness ; endianness of the machine (and file)
u16 machinetype
u4 protection ; link only and allowed rwx page protection
u4 extrasections ; which extra optional sections to use
u8 emptysections ; bitfields for which sections are empty
u16 entrypoint ; entry point for the program
u16 checksum
u32 filesize ; size of the file in octets
The endianness can be unknown (0x00), big (0x01), or little (0x02). All other fields in the file will use this endianness.
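To make the layout concrete, here's a minimal Python sketch of reading that header. Treat the details as assumptions where the description above doesn't pin them down: I'm assuming the signature is stored as the bytes 0x89, 0x65 in that order regardless of endianness, that 'protection' is the high nibble and 'extrasections' the low nibble of their shared byte, and that a file with unknown endianness (0x00) is simply rejected. `parse_header` is a made-up name.

```python
import struct

SIGNATURE = b"\x89e"                     # 0x89 0x65
ENDIAN_PREFIX = {0x01: ">", 0x02: "<"}   # big, little (struct module prefixes)

def parse_header(data):
    """Parse the fixed 16-byte tiny-elf header into a dict."""
    if len(data) < 16:
        raise ValueError("file too short for a 16-byte header")
    if data[0:2] != SIGNATURE:
        raise ValueError("bad signature")
    prefix = ENDIAN_PREFIX.get(data[3])
    if prefix is None:
        raise ValueError("unknown endianness")   # assumption: reject 0x00
    # machinetype, packed nibbles, emptysections, entrypoint, checksum, filesize
    machinetype, flags, emptysections, entrypoint, checksum, filesize = \
        struct.unpack_from(prefix + "HBBHHI", data, 4)
    return {
        "header_size": data[2] + 1,       # stored value is size - 1
        "endianness": data[3],
        "machinetype": machinetype,
        "protection": flags >> 4,         # assumption: high nibble
        "extrasections": flags & 0x0F,    # assumption: low nibble
        "emptysections": emptysections,
        "entrypoint": entrypoint,
        "checksum": checksum,
        "filesize": filesize,
    }
```

The two single-byte fields (header size and endianness) are read before anything else, since every multi-byte field after them depends on the endianness byte.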
The machine type must be an 8 or 16-bit machine ID. Two special IDs are reserved for unknown (0x00) and custom (0x01). For custom IDs, how the files are identified is up to the implementation. This may be used in the case where the number of unique 16-bit IDs has run out, so that a field in an extended header may be used instead.
The uppermost bit of the 'protection' field specifies whether the file is for linking only. For programs and overlays, this bit should be set to 0. For libraries and relocatable object files, it should be set to 1. The meanings of the allowed protection flags are as follows:
Code:
r-- read protection allowed, has .rwtext
rw- read and write protections allowed, has .rwtext and .text
r-x read and execute protections allowed, has .rwtext and .data
rwx all protections allowed, has .text + .rodata + .rwtext + .data
-w- same as 'rw-'
-wx same as 'rwx'
--x same as 'r-x'
--- same as 'r--'
The 'r' flag pretty much exists only for aesthetic reasons and is ignored. The 'extrasections' field specifies which optional extra sections are used:
Code:
bit 3 - .bss
bit 2 - .rel.*
bit 1 - .rela.*
bit 0 - .symtab
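Put together, the 'protection' and 'extrasections' nibbles tell a loader which sections a file can have. Here's a small Python sketch of that mapping. Only the uppermost bit of 'protection' is fixed above, so the positions of the r, w, and x bits (bits 2, 1, and 0) are an assumption on my part, and `sections_for` is a made-up name.

```python
# Main sections allowed for each (w, x) combination; 'r' is ignored.
MAIN_SECTIONS = {
    (False, False): [".rwtext"],
    (True, False): [".rwtext", ".text"],
    (False, True): [".rwtext", ".data"],
    (True, True): [".text", ".rodata", ".rwtext", ".data"],
}
EXTRA_BITS = [(3, ".bss"), (2, ".rel.*"), (1, ".rela.*"), (0, ".symtab")]

def sections_for(protection, extrasections):
    """Return (link_only, section names) for the two 4-bit fields."""
    link_only = bool(protection & 0x8)   # uppermost bit of the nibble
    w = bool(protection & 0x2)           # assumption: bit 1 is 'w'
    x = bool(protection & 0x1)           # assumption: bit 0 is 'x'
    names = list(MAIN_SECTIONS[(w, x)])
    names += [name for bit, name in EXTRA_BITS if extrasections & (1 << bit)]
    return link_only, names
```

For example, a program with 'rw-' protection plus '.bss' and '.symtab' comes out as '.rwtext', '.text', '.bss', '.symtab'.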
The 'emptysections' field tells whether each section is empty (1) or not empty (0). Each bit represents the following sections:
Code:
bit 7 = index 0 = .text (read-only text section)
bit 6 = index 1 = .rodata (read-only data section)
bit 5 = index 2 = .rwtext (writable text section)
bit 4 = index 3 = .data (writable data section)
bit 3 = index 4 = .bss
bit 2 = index 5 = .rel.* (static relocation table)
bit 1 = index 6 = .rela.* (dynamic relocation table)
bit 0 = index 7 = .symtab (symbol table)
The purpose of this is apparent for the '.bss' section, but it also has other uses. For example, someone using a JIT compiler might want to allocate space for a '.rwtext' section at run time while keeping their main code in the '.text' section.
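Since bit 7 maps to index 0, decoding the 'emptysections' byte is just a matter of walking the bits from the top down. A quick Python sketch (the function name is made up):

```python
SECTION_NAMES = [".text", ".rodata", ".rwtext", ".data",
                 ".bss", ".rel.*", ".rela.*", ".symtab"]

def empty_sections(emptysections):
    """Return the names of sections flagged as empty (bit set to 1)."""
    return [name for index, name in enumerate(SECTION_NAMES)
            if emptysections & (0x80 >> index)]   # bit 7 is index 0
```

With the JIT example above, a file reserving an empty '.rwtext' and an empty '.bss' would have bits 5 and 3 set, i.e. 0x28.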
The checksum is the sum of all 16-bit values in the file (within the bounds of the 'filesize' field).
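The description leaves a few details open, so this Python sketch fills them in with assumptions that may not match the final format: the sum wraps modulo 2^16, the checksum field itself (offset 10 in the header) counts as zero while summing, and an odd trailing byte is padded with a zero byte. `checksum16` is a made-up name.

```python
CHECKSUM_OFFSET = 10   # signature(2) + headersize(1) + endianness(1)
                       # + machinetype(2) + flags(1) + emptysections(1)
                       # + entrypoint(2)

def checksum16(data, filesize, byteorder="little"):
    """Sum all 16-bit words in data[:filesize], wrapping at 16 bits."""
    if filesize < 16 or len(data) < filesize:
        raise ValueError("filesize does not fit the data")
    body = bytearray(data[:filesize])
    # Assumption: the checksum field is treated as zero while summing.
    body[CHECKSUM_OFFSET:CHECKSUM_OFFSET + 2] = b"\x00\x00"
    if len(body) % 2:
        body.append(0)   # assumption: pad an odd trailing byte with zero
    total = 0
    for i in range(0, len(body), 2):
        total += int.from_bytes(body[i:i + 2], byteorder)
    return total & 0xFFFF   # assumption: wrap modulo 2**16
```

Anything past 'filesize' is left out of the sum, so appending a full ELF file after the stub doesn't change the stub's checksum.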
After the header comes an array of section info entries. These entries always appear in the indexed order for the sections and have the following format:
Code:
u16 loadaddress ; address to load the section at
u16 size ; size of the section in octets
After the array of section info entries is the raw data for each section in the same order.
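Here's a rough Python sketch of reading the section table and the raw data behind it. It leans on two assumptions that aren't spelled out above: the table has one entry per section that is present in the file (the caller passes those names in index order), and a section flagged as empty has an entry but no raw bytes in the file. `parse_sections` and its parameters are made up for the example.

```python
import struct

def parse_sections(data, header_size, present, empty=(), prefix="<"):
    """Read (loadaddress, size) entries, then slice out each section's data.

    present: section names in index order (assumption: one entry each).
    empty:   names flagged in 'emptysections' (assumption: no raw data).
    prefix:  struct byte-order prefix, "<" little or ">" big.
    """
    entries = []
    offset = header_size                  # table starts right after the header
    for name in present:
        loadaddress, size = struct.unpack_from(prefix + "HH", data, offset)
        entries.append((name, loadaddress, size))
        offset += 4
    sections = {}
    for name, loadaddress, size in entries:
        raw = b""
        if name not in empty:
            raw = data[offset:offset + size]
            if len(raw) != size:
                raise ValueError("section %s is truncated" % name)
            offset += size                # raw data follows in the same order
        sections[name] = {"loadaddress": loadaddress, "size": size, "data": raw}
    return sections
```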
Each relocation in the relocation tables has the following format:
Code:
u8 sectionindex ; index of the section that the relocation is in
u8 type ; the type of relocation
u16 offset ; offset of the relocation
The only currently specified relocation type is null (0x00) for entries removed from the table.
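Each relocation entry is a fixed 4 bytes, so a table can be read in one pass. A small Python sketch that also skips the null entries; `Relocation` and `parse_relocations` are made-up names, and any type value other than 0x00 is purely hypothetical at this point.

```python
import struct
from typing import NamedTuple

class Relocation(NamedTuple):
    sectionindex: int   # index of the section the relocation is in
    type: int           # relocation type (only 0x00, null, is specified)
    offset: int         # offset of the relocation

RELOC_NULL = 0x00

def parse_relocations(table, prefix="<"):
    """Parse a relocation table, dropping entries marked as removed."""
    if len(table) % 4:
        raise ValueError("relocation table is not a multiple of 4 bytes")
    relocs = [Relocation(*fields)
              for fields in struct.iter_unpack(prefix + "BBH", table)]
    return [r for r in relocs if r.type != RELOC_NULL]
```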
The '.symtab' section contains symbols for all of the relocations in the format of null-terminated strings.
Anything after the offset pointed to by the 'filesize' field is ignored by 8/16-bit loaders. This way, a normal ELF or ELF64 file can immediately follow the stub.
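For a 32/64-bit loader going the other way, finding the real ELF file is just a matter of skipping 'filesize' octets and checking for the usual ELF magic (0x7F 'E' 'L' 'F'). A minimal Python sketch; `split_stub` is a made-up name, and requiring the ELF file to start exactly at the 'filesize' offset is my reading of "immediately follow".

```python
ELF_MAGIC = b"\x7fELF"

def split_stub(data, filesize):
    """Return (stub bytes, trailing ELF bytes or None)."""
    stub = data[:filesize]
    rest = data[filesize:]
    if rest.startswith(ELF_MAGIC):
        return stub, rest
    return stub, None
```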
---------------------------
EDIT#2:
Many people may think this is a dumb idea, but I'm going to redefine the *.c extension to mean 'compilable file' instead of 'C code'. There are well over 100 different programming languages that spam the use of several file extensions I need, like '.f', '.m', and '.r'. It makes far more sense in my eyes to just specify all of them as generic 'compilable files' and select the proper language when passing them to the compiler.