OSDev.org

The Place to Start for Operating System Developers

All times are UTC - 6 hours




 Post subject: Weird page fault triggering bug...
Posted: Fri Mar 17, 2023 3:33 pm
Member

Joined: Mon Jul 30, 2018 2:58 am
Posts: 45
So I'm having this very strange bug that triggers a page fault. Apparently, these are the instructions that trigger it:
Code:
(0) [0x000000402f42] 0008:0000000000402f42 (unk. ctxt): add eax, 0x2345e064       ; 0564e04523
<bochs:8> s
Next at t=1451281218
(0) [0x000000402f47] 0008:0000000000402f47 (unk. ctxt): add dword ptr ds:[eax], eax ; 0100

It makes sense, as 0x2345e064 is not a valid address. The problem, however, is that when I look at my kernel.exe in IDA, it shows the following instructions at the same spot:
Code:
.text:00402F42                 add     eax, offset word_40E064
.text:00402F47                 movzx   eax, word ptr [eax]

It looks like the code is being modified while the OS is running...

What's more interesting, I started having this issue after changing this line of code:
Code:
terminal_print(term, "\tAvailable memory: %uMB\n", memory_get_available()/1024/1024);

to this one:
Code:
terminal_print(term, "\tAvailable memory: %uMB\n", memory_get_available()/1000000);

in a function that has absolutely nothing to do with the function that triggers the page fault.

I have absolutely no idea what is going on here. Can you help me understand it?


 
 Post subject: Re: Weird page fault triggering bug...
Posted: Fri Mar 17, 2023 3:44 pm
Member

Joined: Mon Mar 25, 2013 7:01 pm
Posts: 5100
You've got a pointer somewhere that points to something it shouldn't, and at some point something uses that pointer to write the value 0x12345 into memory.

It's impossible to narrow it down further without more information.


 
 Post subject: Re: Weird page fault triggering bug...
Posted: Fri Mar 17, 2023 3:50 pm
Member

Joined: Mon Jul 30, 2018 2:58 am
Posts: 45
Octocontrabass wrote:
You've got a pointer somewhere that points to something it shouldn't, and at some point something uses that pointer to write the value 0x12345 into memory.

It's impossible to narrow it down further without more information.


Is there a way to make bochsdbg inform me when memory at a particular address gets modified?


 
 Post subject: Re: Weird page fault triggering bug...
Posted: Fri Mar 17, 2023 3:54 pm
Member

Joined: Mon Mar 25, 2013 7:01 pm
Posts: 5100
Yes, it's called a watchpoint. Here's the documentation for that.


 
 Post subject: Re: Weird page fault triggering bug...
Posted: Fri Mar 17, 2023 4:17 pm
Member

Joined: Mon Jul 30, 2018 2:58 am
Posts: 45
So it looks like this address is written to only once, when the kernel is being loaded by the bootloader, and the value is already wrong at that point. Maybe there is a bug in my bootloader's load_kernel function:
Code:
load_kernel: ; ret eax = entry point , edx = ImageBase, edi = SizeOfImage
  mov eax, [0x00010000+0x3C];PE offset

  xor ecx, ecx
  mov ecx, [0x00010000+eax+0x50] ;SizeOfImage
  push ecx; push SizeOfImage

  xor ecx, ecx
  mov cx, [0x00010000+eax+6] ;number of section

  mov edx, [0x00010000+eax+0x34] ;image base
  push edx ; push ImageBase

  xor ebx, ebx
  mov bx, [0x00010000+eax+0x14];optional Header Size
  add ebx, 0x00010000+0x18
  add ebx, eax; ebx - section table

  mov eax,[0x00010000+eax+0x28]
  add eax, edx ;eax - AddressOfEntryPoint

  .l1:

  push ecx

  mov ecx, [ebx+0x10];SizeOfRawData
  mov edi, [ebx+0xC];VirtualAddress
  mov esi, [ebx+0x14];PointerToRawData

  add edi, edx
  add esi, 0x00010000

  cld
  rep movsb

  add ebx, 0x28
  pop ecx

  cmp ecx, 5 ; has no effect: loop ignores the flags
  loop .l1 ; decrement ecx, repeat until every section is copied

  pop edx ; edx = ImageBase
  pop edi ; edi = SizeOfImage

  ret
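
For readers who find the header offsets hard to follow in assembly, here is a minimal Python model of what the routine above appears to do. It is only a sketch: it assumes a PE32 image (ImageBase at optional-header offset 0x1C), and `make_demo_memory` builds a made-up one-section image purely for illustration, it is not the poster's kernel.exe.

```python
import struct

LOAD_ADDR = 0x10000  # where the thread says kernel.exe sits before the copy


def load_kernel(mem: bytearray, file_base: int = LOAD_ADDR):
    """Copy every PE section from its raw file position to ImageBase + VirtualAddress.

    Returns (entry_point, image_base, size_of_image), mirroring eax/edx/edi.
    """
    def u16(addr):
        return struct.unpack_from("<H", mem, addr)[0]

    def u32(addr):
        return struct.unpack_from("<I", mem, addr)[0]

    nt = file_base + u32(file_base + 0x3C)   # e_lfanew -> "PE\0\0" signature
    num_sections = u16(nt + 0x06)            # NumberOfSections
    opt_size = u16(nt + 0x14)                # SizeOfOptionalHeader
    entry_rva = u32(nt + 0x28)               # AddressOfEntryPoint
    image_base = u32(nt + 0x34)              # ImageBase (PE32 layout)
    size_of_image = u32(nt + 0x50)           # SizeOfImage
    section = nt + 0x18 + opt_size           # first section header

    for _ in range(num_sections):
        virt = u32(section + 0x0C)           # VirtualAddress
        raw_size = u32(section + 0x10)       # SizeOfRawData
        raw_ptr = u32(section + 0x14)        # PointerToRawData
        src = file_base + raw_ptr
        dst = image_base + virt
        mem[dst:dst + raw_size] = mem[src:src + raw_size]
        section += 0x28                      # each section header is 0x28 bytes

    return image_base + entry_rva, image_base, size_of_image


def make_demo_memory() -> bytearray:
    """Build a tiny, hypothetical one-section PE32 image at LOAD_ADDR."""
    mem = bytearray(0x500000)
    base = LOAD_ADDR
    struct.pack_into("<I", mem, base + 0x3C, 0x80)     # e_lfanew
    nt = base + 0x80
    mem[nt:nt + 4] = b"PE\0\0"
    struct.pack_into("<H", mem, nt + 0x06, 1)          # NumberOfSections
    struct.pack_into("<H", mem, nt + 0x14, 0xE0)       # SizeOfOptionalHeader
    struct.pack_into("<I", mem, nt + 0x28, 0x1000)     # AddressOfEntryPoint
    struct.pack_into("<I", mem, nt + 0x34, 0x400000)   # ImageBase
    struct.pack_into("<I", mem, nt + 0x50, 0x2000)     # SizeOfImage
    sec = nt + 0x18 + 0xE0
    struct.pack_into("<I", mem, sec + 0x0C, 0x1000)    # VirtualAddress
    struct.pack_into("<I", mem, sec + 0x10, 0x200)     # SizeOfRawData
    struct.pack_into("<I", mem, sec + 0x14, 0x400)     # PointerToRawData
    mem[base + 0x400:base + 0x404] = b"\x05\x64\xe0\x40"  # marker bytes
    return mem
```

Calling `load_kernel(make_demo_memory())` returns `(0x401000, 0x400000, 0x2000)` and leaves the marker bytes at 0x401000. The model only copies bytes, like the assembly does, so whatever is wrong in the source buffer at 0x10000 ends up at the destination unchanged. A fuller loader would normally also zero the part of each section beyond SizeOfRawData.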


 
 Post subject: Re: Weird page fault triggering bug...
Posted: Fri Mar 17, 2023 4:29 pm
Member

Joined: Mon Mar 25, 2013 7:01 pm
Posts: 5100
That function is supposed to copy the kernel from one location to another, right?

Have you checked to see if it's already corrupt before the copy happens?


 
 Post subject: Re: Weird page fault triggering bug...
Posted: Fri Mar 17, 2023 5:03 pm
Member

Joined: Mon Jul 30, 2018 2:58 am
Posts: 45
Octocontrabass wrote:
That function is supposed to copy the kernel from one location to another, right?

Right. The kernel.exe file is present at 0x10000.

Octocontrabass wrote:
Have you checked to see if it's already corrupt before the copy happens?

I'm working on it right now. There are some interesting things happening. It looks like the load_kernel function breaks the file it is supposed to be copying. The instruction is still correct before it's copied to its destination:
Code:
<bochs:14> disasm 0x00000012342
0000000000012342: (                    ): add eax, 0x0040e064       ; 0564e04000

Then after the load_kernel function is called this happens:
Code:
<bochs:17> disasm 0x00000012342
0000000000012342: (                    ): add eax, 0x2345e064       ; 0564e04523
<bochs:18> disasm 0x000000402f42
0000000000402f42: (                    ): add eax, 0x2345e064       ; 0564e04523

The code that was supposed to be copied unchanged was modified in place, and the modified code was then copied to 0x402f42.
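
Comparing the two encodings byte by byte shows which physical addresses were overwritten. The short Python sketch below uses only the bytes and addresses quoted in this thread; the helper names `changed_addresses` and `imm32` are made up for illustration.

```python
import struct

FILE_LOAD_ADDR = 0x10000   # kernel.exe location given in the thread
INSN_ADDR = 0x12342        # address disassembled in the Bochs session

before = bytes.fromhex("0564e04000")  # add eax, 0x0040e064
after = bytes.fromhex("0564e04523")   # add eax, 0x2345e064


def changed_addresses(old: bytes, new: bytes, base: int):
    """Return the physical addresses at which the two encodings differ."""
    return [base + i for i, (a, b) in enumerate(zip(old, new)) if a != b]


def imm32(insn: bytes) -> int:
    """Decode the little-endian 32-bit immediate that follows opcode 0x05."""
    return struct.unpack_from("<I", insn, 1)[0]


diff = changed_addresses(before, after, INSN_ADDR)
print([hex(a) for a in diff])             # ['0x12345', '0x12346']
print(hex(diff[0] - FILE_LOAD_ADDR))      # 0x2345
print(hex(imm32(after)))                  # 0x2345e064
print(struct.pack("<I", 0x12345).hex())   # 45230100
```

The first changed byte is at physical address 0x12345, which is file offset 0x2345. A dword 0x00012345 stored there is the byte sequence 45 23 01 00. The first two bytes replace the top half of the immediate, and the following 01 00 is exactly what Bochs decoded as `add dword ptr ds:[eax], eax` in the first post. This fits the earlier suggestion that something writes the value 0x12345 into memory.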


 
 Post subject: Re: Weird page fault triggering bug...
Posted: Fri Mar 17, 2023 5:19 pm
Member

Joined: Mon Jul 30, 2018 2:58 am
Posts: 45
I've found the problem!

Before loading the kernel, I was enabling A20 with the following function:
Code:
enable_a20:
  ;set a20 http://wiki.osdev.org/A20 (Fast A20 Gate)
  pushad ;test a20
  mov edi,0x112345  ;odd megabyte address.
  mov esi,0x012345  ;even megabyte address.
  mov [esi],esi     ;making sure that both addresses contain different values.
  mov [edi],edi     ;(if the A20 line is cleared, both pointers refer to address 0x012345, which would then contain 0x112345 (edi))
  cmpsd             ;compare the values at both addresses to see if they're equal.
  popad
  jne A20_on        ;if not equal, the A20 line is set.
   
  in al, 0x92 ;enable a20
  test al, 2
  jnz A20_on
  or al, 2
  and al, 0xFE
  out 0x92, al
   
  A20_on:

  ret


Code:
  mov edi,0x112345  ;odd megabyte address.
  mov esi,0x012345  ;even megabyte address.
  mov [esi],esi     ;making sure that both addresses contain different values.
  mov [edi],edi     ;(if the A20 line is cleared, both pointers refer to address 0x012345, which would then contain 0x112345 (edi))

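This looks suspicious, right? =P~ The test writes its marker values to 0x012345, which lies inside the kernel.exe image sitting at 0x10000, and it never restores the original bytes.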

What a stupid bug to make!
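
To make the failure mode concrete, here is a small Python model of the A20 test. It is a sketch under assumptions, not the poster's actual fix: the `Memory` class is a toy in which a disabled A20 line forces address bit 20 to zero, and `a20_test_preserving` is a hypothetical variant that saves and restores the two dwords it touches.

```python
import struct

A20_BIT = 1 << 20


class Memory:
    """Toy physical memory: with A20 disabled, address bit 20 is forced to 0."""

    def __init__(self, a20_enabled: bool, size: int = 2 * 1024 * 1024):
        self.a20_enabled = a20_enabled
        self.ram = bytearray(size)

    def _phys(self, addr: int) -> int:
        return addr if self.a20_enabled else addr & ~A20_BIT

    def read32(self, addr: int) -> int:
        return struct.unpack_from("<I", self.ram, self._phys(addr))[0]

    def write32(self, addr: int, value: int) -> None:
        struct.pack_into("<I", self.ram, self._phys(addr), value)


def a20_test_clobbering(mem, low=0x012345, high=0x112345):
    """Same logic as the routine above: the test values stay in memory."""
    mem.write32(low, low)
    mem.write32(high, high)
    return mem.read32(low) != mem.read32(high)   # True means A20 is on


def a20_test_preserving(mem, low=0x012345, high=0x112345):
    """Hypothetical variant: save both dwords first, restore them afterwards."""
    saved_low, saved_high = mem.read32(low), mem.read32(high)
    mem.write32(low, low)
    mem.write32(high, high)
    enabled = mem.read32(low) != mem.read32(high)
    mem.write32(high, saved_high)
    mem.write32(low, saved_low)
    return enabled


# Original bytes at 0x12342: add eax, 0x0040e064 / movzx eax, word ptr [eax]
mem = Memory(a20_enabled=True)
mem.ram[0x12342:0x1234A] = bytes.fromhex("0564e040000fb700")
a20_test_clobbering(mem)
print(mem.ram[0x12342:0x1234A].hex())   # 0564e04523010000
```

The printed bytes start with 05 64 e0 45 23 and 01 00, the same corrupted instructions Bochs showed. Running `a20_test_preserving` on a fresh `Memory` leaves the bytes untouched and still reports the A20 state correctly. Another option would be to pick test addresses that are known not to overlap anything already loaded.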

