function args overwritten

Posted: Sun Apr 21, 2024 5:24 pm
by maxtyson123
https://github.com/maxtyson123/MaxOS/tr ... Management
After successfully mapping an address:

Code: Select all

physical = {MaxOS::memory::physical_address_t *} 0xfee00000 
virtual_address = {MaxOS::memory::virtual_address_t *} 0xffffffff7ee00000 
My logging function causes the OS to break, but the reason for this is odd.

Using GDB, I inspected my function:

Code: Select all

void _kprintf_internal(uint8_t type, const char* file, int line, const char* func, const char* format, ...)
...
type = {uint8_t} 0 '\000' [0x0]
file = {const char *} 0xffffffff801b0ba0 "/home/max/MaxOS/kernel/src/memory/physical.cpp"
line = {int} 293 [0x125]
func = {const char *} 0xffffffff801b0c63 "map"
format = {const char *} 0xffffffff801b0c97 "Mapped: 0x%x to 0x%x\n"
This is correct, but when diving a layer deeper, into the first line of the function (a call that prints a header), the arguments are wrong:

Code: Select all

pre_kprintf(const char* file, int line, const char* func, uint8_t type)
...
file = {const char *} 0xffffffff80103f7d ""
line = {int} -1 [0xffffffff]
func = {const char *} 0xffffffff801c5a18 "p\226\033\200\377\377\377\377"
type = {uint8_t} 224 '\340' [0xe0]
Which suggests to me that somehow I've overwritten something somewhere with my page mapping?

Registers at the time of the crash:

Code: Select all

rax            0xffffffff801b0ba0  -2145711200
rbx            0x2                 2
rcx            0x0                 0
rdx            0xffffffff801b0c63  -2145711005
rsi            0x125               293
rdi            0xffffffff801b0ba0  -2145711200
rbp            0xffffffff801c21d0  0xffffffff801c21d0
rsp            0xffffffff801c20f8  0xffffffff801c20f8
r8             0xffffffff801b0c97  -2145710953
r9             0xfee00000          4276092928
r10            0x0                 0
r11            0x0                 0
r12            0x0                 0
r13            0x0                 0
r14            0x0                 0
r15            0x0                 0
rip            0xffffffff801010cc  0xffffffff801010cc <MaxOS::hardwarecommunication::InterruptManager::HandleException0x03()>
eflags         0x200083            [ ID IOPL=0 SF CF ]
cs             0x8                 8
ss             0x10                16
ds             0x10                16
es             0x10                16
fs             0x10                16
gs             0x10                16
fs_base        0x0                 0
gs_base        0x0                 0
k_gs_base      0x0                 0
cr0            0x80010011          [ PG WP ET PE ]
cr2            0x0                 0
cr3            0x1bc000            [ PDBR=444 PCID=0 ]
cr4            0x20                [ PAE ]
cr8            0x0                 0
efer           0x500               [ LMA LME ]
mxcsr          0x1f80              [ IM DM ZM OM UM PM ]


Re: function args overwritten

Posted: Sun Apr 21, 2024 6:00 pm
by maxtyson123
What I've noted is that whenever the ACPI is mapped (either to identity or to the higher half), the mapping of the APIC fails (this is the mapping mentioned in this post). However, when I don't map the ACPI, the APIC can be mapped and read from (but then obviously I encounter a page fault trying to read from the ACPI later on).

Re: function args overwritten

Posted: Sun Apr 21, 2024 7:17 pm
by Octocontrabass
maxtyson123 wrote:Which suggests to me that somehow I've overwritten something somewhere with my page mapping?
You can use "info mem" and "info tlb" in the QEMU monitor to check the page mappings. This works even when GDB has halted your kernel, so if there are any unexpected mappings, you can track down exactly which part of your code is creating them.

Re: function args overwritten

Posted: Sun Apr 21, 2024 7:35 pm
by maxtyson123
After doing that I can see that the mappings are correct but the issue still persists:

Code: Select all

ffffffff7ee00000: 00000000fee00000 --------W         (That's my APIC)
fffffffffffe1000: 000000007ffe1000 --------W         (That's my ACPI)
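
When cross-checking an `info tlb` entry by hand, it can help to split the virtual address into its page-table indices and compare them against the entries the mapping code actually wrote. A minimal sketch, assuming 4-level paging with 4 KiB pages (the register dump shows PAE and LMA set; nothing in the thread says which page sizes MaxOS uses). `PageIndices` and `split_virtual_address` are hypothetical names for illustration, not MaxOS code:

```cpp
#include <cstdint>

// Indices into each paging level for one virtual address
// (assumption: x86-64, 4-level paging, 4 KiB pages).
struct PageIndices {
    uint16_t pml4;   // bits 47..39
    uint16_t pdpt;   // bits 38..30
    uint16_t pd;     // bits 29..21
    uint16_t pt;     // bits 20..12
    uint16_t offset; // bits 11..0
};

// Hypothetical helper: each table index is 9 bits wide, the page offset is 12 bits.
constexpr PageIndices split_virtual_address(uint64_t vaddr) {
    return PageIndices{
        static_cast<uint16_t>((vaddr >> 39) & 0x1FF),
        static_cast<uint16_t>((vaddr >> 30) & 0x1FF),
        static_cast<uint16_t>((vaddr >> 21) & 0x1FF),
        static_cast<uint16_t>((vaddr >> 12) & 0x1FF),
        static_cast<uint16_t>(vaddr & 0xFFF),
    };
}
```

For the APIC address 0xffffffff7ee00000 this gives PML4 511, PDPT 509, PD 503, PT 0. If two mappings share the upper indices, they also share the tables at those levels.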

Re: function args overwritten

Posted: Sun Apr 21, 2024 7:40 pm
by Octocontrabass
maxtyson123 wrote:After doing that
Did you try both? They're a bit buggy, so sometimes one or the other will disagree with the CPU.

And while you're looking, check for anything else unusual, like the same memory being mapped more than once.
maxtyson123 wrote:I can see that the mappings are correct but the issue still persists:
I notice you're not using the va_end macro. That's undefined behavior. Maybe the mappings are fine and it's undefined behavior that causes your kernel to blow up.

Re: function args overwritten

Posted: Sun Apr 21, 2024 7:50 pm
by maxtyson123
Octocontrabass wrote:Did you try both? They're a bit buggy, so sometimes one or the other will disagree with the CPU.

And while you're looking, check for anything else unusual, like the same memory being mapped more than once.
I did try both, and the memory regions for both are present. The only odd thing I noticed was that some of these regions were larger than my page size:

Code: Select all

ffffff0000000000-ffffff0000002000 0000000000002000 -rw
ffffff7f80000000-ffffff7f80001000 0000000000001000 -rw
ffffff7fbfc00000-ffffff7fbfc01000 0000000000001000 -rw
ffffff7fbfdfe000-ffffff7fbfe00000 0000000000002000 -rw
ffffff7fbfffd000-ffffff7fc0000000 0000000000003000 -rw
ffffff7fffbf7000-ffffff7fffbf8000 0000000000001000 -rw
ffffff7fffc00000-ffffff7fffc02000 0000000000002000 -rw
ffffff7ffffff000-ffffff8000000000 0000000000001000 -rw
I couldn't find any double-ups, but here is the raw output if you need it: https://pastebin.com/kLu4AuRH
Octocontrabass wrote:I notice you're not using the va_end macro. That's undefined behavior. Maybe the mappings are fine and it's undefined behavior that causes your kernel to blow up.
I changed my code to use va_end and it is still not working, which is as expected, as it fails in the function before the params are set up. Thank you for pointing that out to me though.

Re: function args overwritten

Posted: Sun Apr 21, 2024 8:22 pm
by Octocontrabass
maxtyson123 wrote:I couldn't find any double ups, but here is the raw output if you need https://pastebin.com/kLu4AuRH
It looks like physical address 0x210000 has been mapped 146 times. That's a bit more than double.

Use GDB to stop your kernel at different points so you can check "info mem"/"info tlb" and see where your page tables are getting overwritten.
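
Spotting repeats by eye in a long dump is error-prone, so counting them mechanically may help. A rough sketch, assuming the dump lines have the `<virtual>: <physical> <flags>` shape shown earlier in the thread; `count_physical_mappings` is a hypothetical name, and lines that don't match that shape are skipped:

```cpp
#include <cstdint>
#include <cstdio>
#include <map>
#include <sstream>
#include <string>

// Counts how many virtual pages point at each physical address in text
// shaped like "ffffffff7ee00000: 00000000fee00000 --------W".
std::map<uint64_t, unsigned> count_physical_mappings(const std::string& tlb_dump) {
    std::map<uint64_t, unsigned> counts;
    std::istringstream lines(tlb_dump);
    std::string line;
    while (std::getline(lines, line)) {
        unsigned long long virt = 0, phys = 0;
        // Two hex numbers separated by a colon; anything after them is ignored.
        if (std::sscanf(line.c_str(), "%llx: %llx", &virt, &phys) == 2) {
            ++counts[phys];
        }
    }
    return counts;
}
```

Any physical address with a count above 1 is worth checking. Running this on dumps taken at different breakpoints shows when the repeats first appear.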
maxtyson123 wrote:which is as expected, as it fails in the function before the params are set up.
Undefined behavior can cause failures anywhere, even in places you don't expect.

Re: function args overwritten

Posted: Sun Apr 21, 2024 9:13 pm
by maxtyson123
Octocontrabass wrote: It looks like physical address 0x210000 has been mapped 146 times. That's a bit more than double.
Ahh I see, I was looking at the wrong column oops.

The TLB only seems to be "dirty" like that after it crashes. For instance, it is as expected after writing the new page, and even during interrupt handling of the crash.
Also worth noting: the interrupt handling just cycles through all the exceptions (i.e. it starts at 0x1, and after holding down F7, the shortcut for step into, it gets to 0x1D).

This is where it crashes if that's helpful.
Image

Re: function args overwritten

Posted: Sun Apr 21, 2024 9:19 pm
by maxtyson123
I also notice that when running with the debugger it is filled with 0x20..... instead of 0x21......

Re: function args overwritten

Posted: Sun Apr 21, 2024 9:37 pm
by Octocontrabass
maxtyson123 wrote:The TLB only seems to be "dirty" like that after it crashes, for instance, it is as expected after writing the new page, and even during interrupt handling of the crash.
Are you sure? There are tools you can use to quickly compare text to find differences.
maxtyson123 wrote:Also to be noted with this interrupt handling it just cycles through all the exceptions (ie starts at 0x1 and after holding down F7 {short cut for step in} it gets to 0x1D)
Does that happen when you intentionally cause an exception to test your exception handlers? If it does, there's something wrong with your exception handlers.
maxtyson123 wrote:This is where it crashes if that's helpful.
I don't see anything obviously wrong there. Perhaps your stack is growing too large and overwriting something important? Stepping through individual instructions instead of source code might be more helpful.

Re: function args overwritten

Posted: Mon Apr 22, 2024 12:54 am
by MichaelPetch
I didn't run your code, but I did see this, which has a fair amount of undefined behaviour.

Code: Select all

    // Load the GDTR
    asm volatile("lgdt %0" : : "m" (gdtr));

    _kprintf("Loaded GDT\n");

    // Reload the segment registers
    asm volatile("\
        mov $0x10, %ax \n\
        mov %ax, %ds \n\
        mov %ax, %es \n\
        mov %ax, %fs \n\
        mov %ax, %gs \n\
        mov %ax, %ss \n\
        \n\
        pop %rdi \n\
        push $0x8 \n\
        push %rdi \n\
    ");
You clobber RDI and RAX without telling the compiler. You also assume that the return address is still at the top of the stack (and it may not be). You could do something like this instead:

Code: Select all

     asm volatile (
        "lgdt %0\n\t"
        "mov $0x10, %%eax\n\t"
        "mov %%eax, %%ss\n\t"
        "mov %%eax, %%es\n\t"
        "mov %%eax, %%ds\n\t"
        "mov %%eax, %%fs\n\t"
        "mov %%eax, %%gs\n\t"
        "push $0x08\n\t"
        "lea 1f(%%rip), %%rax\n\t" // Take the address of label 1 (a plain push would read the bytes stored there)
        "push %%rax\n\t"
        "retfq\n" // Perform an indirect 16:64 JMP using retfq to jump to label 1:
        "1:\n\t" :: "m"(gdtr) : "memory", "rax");

    _kprintf("Loaded GDT\n");
This likely hasn't anything to do with your issue, but undefined behaviour can misbehave in mysterious ways.

Re: function args overwritten

Posted: Mon Apr 22, 2024 4:54 am
by maxtyson123
Octocontrabass wrote: Are you sure? There are tools you can use to quickly compare text to find differences.
The only difference between before the page is mapped and after the handler is run is that ffffffff801b0000: 00000000001b0000 now has the dirty flag.
Octocontrabass wrote: Does that happen when you intentionally cause an exception to test your exception handlers? If it does, there's something wrong with your exception handlers.
Yea, my handlers have worked previously so I don't think that is the cause.
Octocontrabass wrote: I don't see anything obviously wrong there. Perhaps your stack is growing too large and overwriting something important? Stepping through individual instructions instead of source code might be more helpful.
How would you recommend catching this? By watching RSP in memory view?
MichaelPetch wrote: I didn't run your code but I did see this that has a fair amount of undefined behaviour.
Ahh yes, that's my old 32-bit code that is no longer used.

Re: function args overwritten

Posted: Mon Apr 22, 2024 8:11 am
by Octocontrabass
maxtyson123 wrote:The only difference between before the page is mapped and after the handler is run is that ffffffff801b0000: 00000000001b0000 now has the dirty flag
So the page didn't actually get mapped?
maxtyson123 wrote:How would you recommend catching this? By watching RSP in memory view?
The easiest way is a guard page. If your OS crashes due to accessing the guard page, you know the stack overflowed. If you can't set that up for whatever reason, watching RSP in the debugger works too. Debugging at the instruction level instead of source level might also help.
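
A guard page is just a deliberately unmapped page directly below the lowest stack page, so an overflow raises a page fault instead of silently overwriting whatever sits underneath. A small sketch of the check a page-fault handler could make against the faulting address (CR2 on x86-64). `GuardedStack`, `hit_guard_page` and `stack_bytes_left` are hypothetical names, and the 4 KiB page size and downward-growing layout are assumptions:

```cpp
#include <cstdint>

constexpr uint64_t kPageSize = 4096; // assumption: 4 KiB pages

// Hypothetical description of a downward-growing stack whose page
// directly below `lowest_usable` is left unmapped as the guard.
struct GuardedStack {
    uint64_t lowest_usable; // lowest mapped stack address, page aligned
    uint64_t top;           // initial RSP (one past the highest usable byte)
};

// True when the faulting address lies inside the guard page,
// which indicates the stack overflowed.
constexpr bool hit_guard_page(const GuardedStack& stack, uint64_t fault_address) {
    return fault_address >= stack.lowest_usable - kPageSize &&
           fault_address < stack.lowest_usable;
}

// Headroom below the current RSP, handy as a debugger watch expression.
constexpr uint64_t stack_bytes_left(const GuardedStack& stack, uint64_t rsp) {
    return rsp - stack.lowest_usable;
}
```

Without a guard page, watching the equivalent of `stack_bytes_left` shrink towards zero while stepping is the manual version of the same check.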

Re: function args overwritten

Posted: Wed Apr 24, 2024 2:45 am
by maxtyson123
Sorry, been away for a bit. I did some quick testing and found that it always fails after the second page is mapped, no matter the address.
As soon as that second entry is set, the next print fails to execute.