function args overwritten
Posted: Sun Apr 21, 2024 5:24 pm
by maxtyson123
https://github.com/maxtyson123/MaxOS/tr ... Management
After successfully mapping an address:
Code: Select all
physical = {MaxOS::memory::physical_address_t *} 0xfee00000
virtual_address = {MaxOS::memory::virtual_address_t *} 0xffffffff7ee00000
My logging function causes the OS to break, but the reason for this is odd.
Using GDB, I inspected my function:
Code: Select all
void _kprintf_internal(uint8_t type, const char* file, int line, const char* func, const char* format, ...)
...
type = {uint8_t} 0 '\000' [0x0]
file = {const char *} 0xffffffff801b0ba0 "/home/max/MaxOS/kernel/src/memory/physical.cpp"
line = {int} 293 [0x125]
func = {const char *} 0xffffffff801b0c63 "map"
format = {const char *} 0xffffffff801b0c97 "Mapped: 0x%x to 0x%x\n"
This is correct, but one layer deeper, in the first line of the function (which prints a header), the arguments are wrong:
Code: Select all
pre_kprintf(const char* file, int line, const char* func, uint8_t type)
...
file = {const char *} 0xffffffff80103f7d ""
line = {int} -1 [0xffffffff]
func = {const char *} 0xffffffff801c5a18 "p\226\033\200\377\377\377\377"
type = {uint8_t} 224 '\340' [0xe0]
Which suggests to me that somehow I've overwritten something somewhere with my page mapping?
Registers at the time of crash
Code: Select all
rax 0xffffffff801b0ba0 -2145711200
rbx 0x2 2
rcx 0x0 0
rdx 0xffffffff801b0c63 -2145711005
rsi 0x125 293
rdi 0xffffffff801b0ba0 -2145711200
rbp 0xffffffff801c21d0 0xffffffff801c21d0
rsp 0xffffffff801c20f8 0xffffffff801c20f8
r8 0xffffffff801b0c97 -2145710953
r9 0xfee00000 4276092928
r10 0x0 0
r11 0x0 0
r12 0x0 0
r13 0x0 0
r14 0x0 0
r15 0x0 0
rip 0xffffffff801010cc 0xffffffff801010cc <MaxOS::hardwarecommunication::InterruptManager::HandleException0x03()>
eflags 0x200083 [ ID IOPL=0 SF CF ]
cs 0x8 8
ss 0x10 16
ds 0x10 16
es 0x10 16
fs 0x10 16
gs 0x10 16
fs_base 0x0 0
gs_base 0x0 0
k_gs_base 0x0 0
cr0 0x80010011 [ PG WP ET PE ]
cr2 0x0 0
cr3 0x1bc000 [ PDBR=444 PCID=0 ]
cr4 0x20 [ PAE ]
cr8 0x0 0
efer 0x500 [ LMA LME ]
mxcsr 0x1f80 [ IM DM ZM OM UM PM ]
Re: function args overwritten
Posted: Sun Apr 21, 2024 6:00 pm
by maxtyson123
What I've noticed is that whenever the ACPI is mapped (either identity-mapped or to the higher half), the mapping of the APIC fails (this is the mapping mentioned in this post). However, when I don't map the ACPI, the APIC can be mapped and read from (but then obviously I encounter a page fault when trying to read from the ACPI later on).
Re: function args overwritten
Posted: Sun Apr 21, 2024 7:17 pm
by Octocontrabass
maxtyson123 wrote:Which suggests to me that somehow I've overwritten something somewhere with my page mapping?
You can use "info mem" and "info tlb" in the QEMU monitor to check the page mappings. This works even when GDB has halted your kernel, so if there are any unexpected mappings, you can track down exactly which part of your code is creating them.
Re: function args overwritten
Posted: Sun Apr 21, 2024 7:35 pm
by maxtyson123
After doing that I can see that the mappings are correct but the issue still persists:
Code: Select all
ffffffff7ee00000: 00000000fee00000 --------W (That's my APIC)
fffffffffffe1000: 000000007ffe1000 --------W (That's my ACPI)
Re: function args overwritten
Posted: Sun Apr 21, 2024 7:40 pm
by Octocontrabass
maxtyson123 wrote:After doing that
Did you try both? They're a bit buggy, so sometimes one or the other will disagree with the CPU.
And while you're looking, check for anything else unusual, like the same memory being mapped more than once.
maxtyson123 wrote:I can see that the mappings are correct but the issue still persists:
I notice you're not using the va_end macro. That's undefined behavior. Maybe the mappings are fine and it's undefined behavior that causes your kernel to blow up.
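For readers unfamiliar with the pairing being discussed, here is a minimal hosted sketch of a variadic function that calls va_end before returning. The name format_message and the use of std::vsnprintf are assumptions for illustration only; this is not the MaxOS _kprintf_internal implementation.

```cpp
#include <cassert>
#include <cstdarg>
#include <cstddef>
#include <cstdio>

// Hypothetical helper (not MaxOS code): formats into a caller-supplied buffer
// and returns the number of characters the full message needs.
int format_message(char* out, std::size_t size, const char* format, ...)
{
    assert(out != nullptr && format != nullptr);

    va_list args;
    va_start(args, format);   // start reading the arguments after 'format'
    int written = std::vsnprintf(out, size, format, args);
    va_end(args);             // every va_start needs a matching va_end before returning
    return written;
}
```

The point of the sketch is only the structure: va_end is reached on every path out of the function after va_start has run.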
Re: function args overwritten
Posted: Sun Apr 21, 2024 7:50 pm
by maxtyson123
Octocontrabass wrote:Did you try both? They're a bit buggy, so sometimes one or the other will disagree with the CPU.
And while you're looking, check for anything else unusual, like the same memory being mapped more than once.
I did try both, and the memory regions are present in both. The only odd thing I noticed was that some of these regions are larger than my page size:
Code: Select all
ffffff0000000000-ffffff0000002000 0000000000002000 -rw
ffffff7f80000000-ffffff7f80001000 0000000000001000 -rw
ffffff7fbfc00000-ffffff7fbfc01000 0000000000001000 -rw
ffffff7fbfdfe000-ffffff7fbfe00000 0000000000002000 -rw
ffffff7fbfffd000-ffffff7fc0000000 0000000000003000 -rw
ffffff7fffbf7000-ffffff7fffbf8000 0000000000001000 -rw
ffffff7fffc00000-ffffff7fffc02000 0000000000002000 -rw
ffffff7ffffff000-ffffff8000000000 0000000000001000 -rw
I couldn't find any duplicates, but here is the raw output if you need it:
https://pastebin.com/kLu4AuRH
Octocontrabass wrote:I notice you're not using the va_end macro. That's undefined behavior. Maybe the mappings are fine and it's undefined behavior that causes your kernel to blow up.
I changed my code to use va_end and it is still not working, which is as expected, as it fails in the function before the params are set up. Thank you for pointing that out to me though.
Re: function args overwritten
Posted: Sun Apr 21, 2024 8:22 pm
by Octocontrabass
It looks like physical address 0x210000 has been mapped 146 times. That's a bit more than double.
Use GDB to stop your kernel at different points so you can check "info mem"/"info tlb" and see where your page tables are getting overwritten.
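Spotting a physical address that is mapped many times is easier to do mechanically than by eye. The following is a rough sketch, assuming the dump has been saved as text in the "virtual: physical flags" shape of the "info tlb" lines quoted earlier in this thread. The function name count_physical_mappings is made up for this example.

```cpp
#include <cstddef>
#include <cstdint>
#include <map>
#include <sstream>
#include <stdexcept>
#include <string>

// Counts how many virtual pages point at each physical address.
// Assumed line shape: "<virtual hex>: <physical hex> <flags>".
std::map<std::uint64_t, int> count_physical_mappings(const std::string& dump)
{
    std::map<std::uint64_t, int> counts;
    std::istringstream lines(dump);
    std::string line;

    while (std::getline(lines, line)) {
        std::size_t colon = line.find(':');
        if (colon == std::string::npos) {
            continue;  // not a mapping line
        }
        try {
            // std::stoull skips leading spaces and stops at the first non-hex character.
            std::uint64_t physical = std::stoull(line.substr(colon + 1), nullptr, 16);
            ++counts[physical];
        } catch (const std::exception&) {
            continue;  // no hex number after the colon, skip the line
        }
    }
    return counts;
}
```

Any physical address whose count is greater than 1 is mapped at more than one virtual address and is worth a closer look.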
maxtyson123 wrote:which is as expected, as it fails in the function before the params are set up.
Undefined behavior can cause failures anywhere, even in places you don't expect.
Re: function args overwritten
Posted: Sun Apr 21, 2024 9:13 pm
by maxtyson123
Octocontrabass wrote:
It looks like physical address 0x210000 has been mapped 146 times. That's a bit more than double.
Ahh I see, I was looking at the wrong column oops.
The TLB only seems to be "dirty" like that after it crashes. For instance, it is as expected after writing the new page, and even during interrupt handling of the crash.
Also worth noting: during this interrupt handling it just cycles through all the exceptions (i.e. it starts at 0x1 and, after holding down F7 {shortcut for step into}, it gets to 0x1D).
This is where it crashes if that's helpful.
Re: function args overwritten
Posted: Sun Apr 21, 2024 9:19 pm
by maxtyson123
I also notice that when running with the debugger it is filled with 0x20..... instead of 0x21......
Re: function args overwritten
Posted: Sun Apr 21, 2024 9:37 pm
by Octocontrabass
maxtyson123 wrote:The TLB only seems to be "dirty" like that after it crashes. For instance, it is as expected after writing the new page, and even during interrupt handling of the crash.
Are you sure? There are tools you can use to quickly compare text to find differences.
maxtyson123 wrote:Also worth noting: during this interrupt handling it just cycles through all the exceptions (i.e. it starts at 0x1 and, after holding down F7 {shortcut for step into}, it gets to 0x1D).
Does that happen when you intentionally cause an exception to test your exception handlers? If it does, there's something wrong with your exception handlers.
maxtyson123 wrote:This is where it crashes if that's helpful.
I don't see anything obviously wrong there. Perhaps your stack is growing too large and overwriting something important? Stepping through individual instructions instead of source code might be more helpful.
Re: function args overwritten
Posted: Mon Apr 22, 2024 12:54 am
by MichaelPetch
I didn't run your code, but I did see this, which has a fair amount of undefined behaviour.
Code: Select all
// Load the GDTR
asm volatile("lgdt %0" : : "m" (gdtr));
_kprintf("Loaded GDT\n");
// Reload the segment registers
asm volatile("\
mov $0x10, %ax \n\
mov %ax, %ds \n\
mov %ax, %es \n\
mov %ax, %fs \n\
mov %ax, %gs \n\
mov %ax, %ss \n\
\n\
pop %rdi \n\
push $0x8 \n\
push %rdi \n\
");
You clobber RDI and RAX without telling the compiler. You also assume that the return address is still at the top of the stack (and it may not be). You could do something like this instead:
Code: Select all
asm volatile (
"lgdt %0\n\t"
"mov $0x10, %%eax\n\t"
"mov %%eax, %%ss\n\t"
"mov %%eax, %%es\n\t"
"mov %%eax, %%ds\n\t"
"mov %%eax, %%fs\n\t"
"mov %%eax, %%gs\n\t"
"push $0x08\n\t"
"lea 1f(%%rip), %%rax\n\t" // Take the address of label 1 (a plain push of 1f(%%rip) would push the bytes stored there)
"push %%rax\n\t"
"retfq\n" // Perform an indirect 16:64 JMP using retfq to jump to label 1:
"1:\n\t" :: "m"(gdtr) : "memory", "rax");
_kprintf("Loaded GDT\n");
This likely hasn't anything to do with your issue, but undefined behaviour can misbehave in mysterious ways.
Re: function args overwritten
Posted: Mon Apr 22, 2024 4:54 am
by maxtyson123
Octocontrabass wrote:
Are you sure? There are tools you can use to quickly compare text to find differences.
The only difference between before the page is mapped and after the handler is run is that ffffffff801b0000: 00000000001b0000 now has the dirty flag
Octocontrabass wrote:
Does that happen when you intentionally cause an exception to test your exception handlers? If it does, there's something wrong with your exception handlers.
Yea, my handlers have worked previously so I don't think that is the cause.
Octocontrabass wrote:
I don't see anything obviously wrong there. Perhaps your stack is growing too large and overwriting something important? Stepping through individual instructions instead of source code might be more helpful.
How would you recommend catching this? By watching RSP in memory view?
MichaelPetch wrote:
I didn't run your code, but I did see this, which has a fair amount of undefined behaviour.
Ahh yes, that's my old 32-bit code that is no longer used.
Re: function args overwritten
Posted: Mon Apr 22, 2024 8:11 am
by Octocontrabass
maxtyson123 wrote:The only difference between before the page is mapped and after the handler is run is that ffffffff801b0000: 00000000001b0000 now has the dirty flag
So the page didn't actually get mapped?
maxtyson123 wrote:How would you recommend catching this? By watching RSP in memory view?
The easiest way is a guard page. If your OS crashes due to accessing the guard page, you know the stack overflowed. If you can't set that up for whatever reason, watching RSP in the debugger works too. Debugging at the instruction level instead of source level might also help.
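As a rough illustration of the guard page idea, the sketch below shows only the arithmetic involved, assuming 4 KiB pages and a stack that grows downward. All names are hypothetical and none of this is MaxOS code. A real kernel would write the modified entry back into its page table and invalidate the stale TLB entry (for example with invlpg) afterwards.

```cpp
#include <cassert>
#include <cstdint>

const std::uint64_t kPageSize = 0x1000;    // 4 KiB pages (assumption)
const std::uint64_t kPresent  = 1ull << 0; // "present" bit of an x86-64 page table entry

// Address of the page directly below the lowest stack address.
// The stack grows downward, so an overflow touches this page first.
inline std::uint64_t guard_page_address(std::uint64_t stack_bottom)
{
    assert(stack_bottom % kPageSize == 0);  // the stack is expected to be page aligned
    return stack_bottom - kPageSize;
}

// Returns the entry with the present bit cleared, so any access page-faults.
inline std::uint64_t make_not_present(std::uint64_t entry)
{
    return entry & ~kPresent;
}

// True if a faulting address (CR2 in the page fault handler) lies in the guard page.
inline bool hit_guard_page(std::uint64_t fault_address, std::uint64_t stack_bottom)
{
    std::uint64_t guard = guard_page_address(stack_bottom);
    return fault_address >= guard && fault_address < stack_bottom;
}
```

With something like this in place, a page fault handler that sees hit_guard_page return true for the address in CR2 can report a stack overflow instead of a generic crash.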
Re: function args overwritten
Posted: Wed Apr 24, 2024 2:45 am
by maxtyson123
Sorry, I've been away for a bit. I did some quick testing and found that it always fails after the second page is mapped, no matter the address.
As soon as that second entry is set, the next print fails to execute.