OSDev.org

The Place to Start for Operating System Developers

All times are UTC - 6 hours




 Post subject: No Idea How To Debug This
PostPosted: Sat Dec 29, 2018 7:26 pm 
Offline

Joined: Sat Dec 29, 2018 6:35 pm
Posts: 8
Hello, first post here!

Full disclosure: this is for a class. I am having a problem with my OS.
Everything works within the kernel, but when I try to run a program outside
the kernel (the shell), I experience a plethora of undefined behavior.
I defined a custom service routine for interrupt 21. It works fine in the kernel,
but it seems to cause a processor panic when called from the shell. Loops seem
to cause undefined behavior in the shell as well. I tried to get help from another programmer,
and although he couldn't figure it out either, he got a processor panic
when he tried to run a loop in the shell (I did not). Other interrupts seem to work
fine for the most part. I think this has something to do with the IVT, and more
specifically interrupt 21, but I am unsure, because similar problems can still arise
when the ISR for interrupt 21 is neither initialized nor called. Even if it does have something to do with the IVT,
I do not know how I would go about debugging it (I tried looking through the memory view in the emulator,
but I am unsure whether I was looking in the right place), and if it's not the IVT, I have no idea what it is.

I have been stuck on this issue for quite a while and need to move on in the assignment
(the professor is unavailable). If anyone can help me figure out how to debug this or figure out
what the problem is, it would be very helpful. Here is the OS: https://nofile.io/f/0ewTS042E9Y/OS.zip

And this is the emulator it's meant to target: https://github.com/mdblack/simulator


The compiler is bcc (Bruce's C Compiler). All the included build scripts should work on Linux, and there are debug.bat and
build.bat for Windows Subsystem for Linux.


Thank you
- cgbsu


P.S. For some reason it needs memcpy, even though I don't call it and no C standard library is linked.
This seems to be some sort of optimization, so I implemented it. If
anyone knows how to get bcc to stop doing this, please let me know. I've wondered if it has something to do with the problem.


 Post subject: Re: No Idea How To Debug This
PostPosted: Sat Dec 29, 2018 9:34 pm 
Offline
Member
Member

Joined: Mon Mar 25, 2013 7:01 pm
Posts: 5099
When the provided disk image loads your "shell" program, it places the stack inside the EBDA. (See here for details.) The BIOS relies on the EBDA having specific contents, and it will misbehave if they're overwritten. Additionally, when the BIOS writes to the EBDA, it may be overwriting your program's stack.

I also saw some self-modifying code that doesn't clear the prefetch queue. It may fail on some CPUs.

I'm not sure if there are any other problems; I can't debug any further with an unreliable stack.


 Post subject: Re: No Idea How To Debug This
PostPosted: Mon Dec 31, 2018 11:39 am 
Offline

Joined: Sat Dec 29, 2018 6:35 pm
Posts: 8


I am unsure whether I am overwriting anything. However, we don't enter protected mode, and the professor said the following:

"The segment should be a multiple of 0x1000 (remember that a segment of 0x1000 means a base memory location of 0x10000). 0x0000 should not be used because it is reserved for interrupt vectors. 0x1000 also should not be used because your kernel lives there and you do not want to overwrite it. Segments above 0xA000 are unavailable because the original IBM-PC was limited to 640k of memory."

I assume nothing aside from the regions he mentioned contains anything that could easily be corrupted. I have experimented with changing the segment,
but to little avail.


 Post subject: Re: No Idea How To Debug This
PostPosted: Mon Dec 31, 2018 1:01 pm 
Offline
Member
Member

Joined: Mon Mar 25, 2013 7:01 pm
Posts: 5099
cgbsu wrote:
I am unsure whether I am overwriting anything. However, we don't enter protected mode, and the professor said the following:

I am very certain that you're overwriting the EBDA. Since you're not switching to protected mode, the BIOS interrupt handlers can still access the EBDA. (And on real hardware, the BIOS will use SMM to access the EBDA regardless of CPU mode.)

cgbsu wrote:
"The segment should be a multiple of 0x1000 (remember that a segment of 0x1000 means a base memory location of 0x10000).

That's not a requirement of the hardware, but it makes it easier to keep track of which portions of memory you're using and avoids trouble with ISA DMA.
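
For reference, real-mode address translation is segment times 16 plus offset, which is why a segment of 0x1000 corresponds to a base of 0x10000. A minimal sketch in C follows; the helper name `linear_address` is made up for illustration and is not part of the course code. It uses `unsigned long` because `int` is only 16 bits under a 16-bit compiler such as bcc.

```c
/* Real-mode address translation: linear = segment * 16 + offset.
   linear_address is a hypothetical helper name used only for this sketch. */
unsigned long linear_address(unsigned int segment, unsigned int offset)
{
    /* Widen to unsigned long before shifting so the 20-bit result
       does not overflow a 16-bit int. */
    return ((unsigned long)(segment & 0xFFFFu) << 4)
         + (unsigned long)(offset & 0xFFFFu);
}
```

With this helper, `linear_address(0x1000, 0x0000)` is 0x10000, and segment 0x9000 with offset 0xFFFF reaches 0x9FFFF, the very top of conventional memory.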

cgbsu wrote:
0x0000 should not be used because it is reserved for interrupt vectors.

It also contains the BDA, another structure that you must not overwrite.

cgbsu wrote:
I assume nothing aside from the regions he mentioned has anything that could easily be corrupted.

Your assumption is incorrect. Your simulator is using the Bochs BIOS, which places the EBDA at address 0x9FC00 by default. (The location may change depending on how it's configured.)
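
Because the EBDA location can vary, it is usually read at run time: the 16-bit word at physical address 0x040E in the BDA normally holds the EBDA segment. The sketch below only shows the arithmetic once that word has been read. The helper names are hypothetical, and the stack pointer value in the usage note is an assumed example, not a value taken from the disk image.

```c
/* Hypothetical helpers for illustration only.
   bda_word_40e is the 16-bit value read from physical address 0x040E. */
unsigned long ebda_base(unsigned int bda_word_40e)
{
    /* The BDA stores the EBDA location as a segment, so shift left by 4. */
    return (unsigned long)(bda_word_40e & 0xFFFFu) << 4;
}

/* Returns 1 if the top of the stack at ss:sp lies at or above the start of
   the EBDA (and below 0xA0000), 0 otherwise. */
int stack_top_in_ebda(unsigned int ss, unsigned int sp, unsigned int bda_word_40e)
{
    unsigned long top = ((unsigned long)(ss & 0xFFFFu) << 4)
                      + (unsigned long)(sp & 0xFFFFu);
    unsigned long ebda = ebda_base(bda_word_40e);

    return top >= ebda && top <= 0x9FFFFUL;
}
```

With the Bochs BIOS default of 0x9FC0 in that word, `ebda_base(0x9FC0)` is 0x9FC00. A stack at 9000:FFF0 (an assumed SP) would be reported as inside the EBDA, while one at 2000:FFF0 would not.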

cgbsu wrote:
I have experimented with changing the segment
but not to much avail.

That means there are additional problems, so I've decided to take another look. Your shell program returns from main! How can it return with no return address on the stack?


 Post subject: Re: No Idea How To Debug This
PostPosted: Thu Jan 03, 2019 2:32 pm 
Offline

Joined: Sat Dec 29, 2018 6:35 pm
Posts: 8


I don't mean to sound like I'm opposing what you're saying; I'm just trying to figure out how to solve this problem. I'm somewhat confused
as to why everyone else in the class used this region of memory (which he told us to use, so I'm assuming his version uses this
region as well) and had no problems, while I do. I'm thinking the simplest solution may be to find a way to go to
protected mode, though I'm not sure if that is what you're proposing.

If it's not, I just ran a test:

If I'm not misunderstanding, the EBDA is an area of memory that contains data structures in certain parts of it. According to what you said, memory should be free from the end of the kernel to 0x9FC00. The wiki page you linked says there is a guaranteed region of free memory at 0x7E00. I tried loading the program there and still found issues.

Also, I'm trying to figure out how the shell returned (could it be one of the interrupts putting something into AL?). I put no returns in the shell program.


 Post subject: Re: No Idea How To Debug This
PostPosted: Thu Jan 03, 2019 3:53 pm 
Offline
Member
Member

Joined: Mon Mar 25, 2013 7:01 pm
Posts: 5099
cgbsu wrote:
I don't mean to sound like I'm opposing what you're saying; I'm just trying to figure out how to solve this problem. I'm somewhat confused
as to why everyone else in the class used this region of memory (which he told us to use, so I'm assuming his version uses this
region as well) and had no problems, while I do.

I might not have made myself clear. Placing the stack in the EBDA is just one of the problems I found, but I don't know if it's the reason why your program doesn't work.

cgbsu wrote:
I'm thinking the simplest solution may be to find a way to go to
protected mode, though I'm not sure if that is what you're proposing.

I'm not. Switching to protected mode is not simple either; I think you can find an easier solution.

cgbsu wrote:
Also, I'm trying to figure out how the shell returned (could it be one of the interrupts putting something into AL?). I put no returns in the shell program.

In C, the return statement is optional for functions that return void. A return statement is implied at the end of the function.


 Post subject: Re: No Idea How To Debug This
PostPosted: Thu Jan 03, 2019 4:51 pm 
Offline

Joined: Sat Dec 29, 2018 6:35 pm
Posts: 8
Thank you for the reply.

Octocontrabass wrote:
cgbsu wrote:
I don't mean to sound like I'm opposing what you're saying; I'm just trying to figure out how to solve this problem. I'm somewhat confused
as to why everyone else in the class used this region of memory (which he told us to use, so I'm assuming his version uses this
region as well) and had no problems, while I do.

I might not have made myself clear. Placing the stack in the EBDA is just one of the problems I found, but I don't know if it's the reason why your program doesn't work.

cgbsu wrote:
I'm thinking the simplest solution may be to find a way to go to
protected mode, though I'm not sure if that is what you're proposing.

I'm not. Switching to protected mode is not simple either; I think you can find an easier solution.



I understand now. I think it most likely is not the reason, at least
not entirely, given that it is as difficult to enter protected mode as
you said and the EBDA is 0x0 to 0x000FFFFF according to
the wiki page and other sources, and 0xFFFF is the max sector addressable
by a 16-bit value passed to int 13. Without going into protected mode,
there isn't a way to avoid writing within the EBDA (if you're going to write something).



Octocontrabass wrote:
cgbsu wrote:
Also, I'm trying to figure out how the shell returned (could it be one of the interrupts putting something into AL?). I put no returns in the shell program.

In C, the return statement is optional for functions that return void. A return statement is implied at the end of the function.



As for the return, bcc is being used, so main is declared without a return type.
Code:
main() {
    /*Code goes here.*/
}

I thought this might be a semantic difference put there on purpose
to imply that main does not return, but maybe it does. If so, I guess
I would need somewhere to put that data?


 Post subject: Re: No Idea How To Debug This
PostPosted: Thu Jan 03, 2019 6:26 pm 
Offline
Member
Member

Joined: Mon Mar 25, 2013 7:01 pm
Posts: 5099
cgbsu wrote:
I understand now. I think it most likely is not the reason, at least
not entirely, given that it is as difficult to enter protected mode as
you said and the EBDA is 0x0 to 0x000FFFFF according to
the wiki page and other sources, and 0xFFFF is the max sector addressable
by a 16-bit value passed to int 13. Without going into protected mode,
there isn't a way to avoid writing within the EBDA (if you're going to write something).

I'm not sure what you're talking about. The EBDA is 0x9FC00 to 0x9FFFF in your simulator, with similar addresses on other computers. Most of the rest of memory, from 0x600 to 0x9FBFF, is free for your OS and programs. Sector addresses are irrelevant here since these are memory addresses.
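
One way to guard against this kind of mistake is to check a planned load address and size against the free range before using it. The sketch below hard-codes the 0x600 to 0x9FBFF range quoted above for this simulator. The names `FREE_LOW`, `FREE_HIGH`, and `region_is_free` are hypothetical, and the check does not account for the kernel's own segment at 0x1000, which must also be avoided.

```c
/* Free conventional memory in this simulator, per the range quoted above.
   All names here are hypothetical and exist only for this sketch. */
#define FREE_LOW  0x00600UL   /* first free byte after the IVT and BDA */
#define FREE_HIGH 0x9FBFFUL   /* last byte before the EBDA at 0x9FC00 */

/* Returns 1 if the region [start, start + length - 1] lies entirely inside
   the free range, 0 otherwise. */
int region_is_free(unsigned long start, unsigned long length)
{
    unsigned long end;

    if (length == 0) {
        return 0;               /* an empty region is treated as invalid */
    }
    end = start + length - 1;
    if (end < start) {
        return 0;               /* arithmetic wrapped around */
    }
    return start >= FREE_LOW && end <= FREE_HIGH;
}
```

For example, a 64 KiB region at 0x20000 passes the check, while a 64 KiB region at 0x90000 fails because its last 1 KiB overlaps the EBDA.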

cgbsu wrote:
As for the return, bcc is being used, so main is declared without a return type.
Code:
main() {
    /*Code goes here.*/
}

I thought this might be a semantic difference put there on purpose
to imply that main does not return, but maybe it does. If so, I guess
I would need somewhere to put that data?

In a more complete OS, the C library would provide a wrapper function that calls main, so main has something to return to. The wrapper function doesn't return. Instead, it uses a system call to tell the kernel to end the program after main returns.

Since you don't have a wrapper function like that, you can't let main return.
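
A rough sketch of such a wrapper is shown below, written so it can be compiled and run on an ordinary host. Every name in it is hypothetical: `sys_exit` stands in for whatever "terminate program" service the kernel might expose (for instance through the custom interrupt 21 handler), and `program_main` stands in for the shell's main. On the real target the wrapper must never return, so it would end in an infinite loop. That loop is left out here only so the example finishes when run on a host.

```c
/* Hosted simulation of a program entry wrapper. All names are hypothetical. */

int exit_requested = 0;     /* records that the "system call" was made */

/* Stand-in for a "terminate program" system call. In a real kernel this
   would not return to the caller. */
void sys_exit(void)
{
    exit_requested = 1;
}

/* Stand-in for the program's main(). */
int program_main(void)
{
    return 0;
}

/* Entry point the loader would jump to instead of jumping to main directly. */
void program_entry(void)
{
    program_main();   /* main now has a caller to return to */
    sys_exit();       /* ask the kernel to end the program */
    /* On the real target, add an infinite loop here, e.g. for (;;) { },
       so control can never run past the end of this function. */
}
```

Calling `program_entry()` runs `program_main()` and then sets `exit_requested` to 1. Without a kernel service like this, an infinite loop at the end of main is the simpler option.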


 Post subject: Re: No Idea How To Debug This
PostPosted: Thu Jan 03, 2019 6:47 pm 
Offline

Joined: Sat Dec 29, 2018 6:35 pm
Posts: 8
Octocontrabass wrote:
cgbsu wrote:
I understand now. I think it most likely is not the reason, at least
not entirely, given that it is as difficult to enter protected mode as
you said and the EBDA is 0x0 to 0x000FFFFF according to
the wiki page and other sources, and 0xFFFF is the max sector addressable
by a 16-bit value passed to int 13. Without going into protected mode,
there isn't a way to avoid writing within the EBDA (if you're going to write something).

I'm not sure what you're talking about. The EBDA is 0x9FC00 to 0x9FFFF in your simulator, with similar addresses on other computers. Most of the rest of memory, from 0x600 to 0x9FBFF, is free for your OS and programs. Sector addresses are irrelevant here since these are memory addresses.



I was viewing it incorrectly. I thought 0x0 to 0xFFFFF was the EBDA and 0x04 - 0x0497 was the BDA, with 0x0 to 0xA0000 being the part with the most
stuff crammed into it (basically, I was thinking of the EBDA as the larger category encompassing these things). My bad.


Octocontrabass wrote:
cgbsu wrote:
As for the return, bcc is being used, so main is declared without a return type.
Code:
main() {
    /*Code goes here.*/
}

I thought this might be a semantic difference put there on purpose
to imply that main does not return, but maybe it does. If so, I guess
I would need somewhere to put that data?

In a more complete OS, the C library would provide a wrapper function that calls main, so main has something to return to. The wrapper function doesn't return. Instead, it uses a system call to tell the kernel to end the program after main returns.

Since you don't have a wrapper function like that, you can't let main return.


There are while( 1 ); loops, sometimes commented out, at the end of both the kernel's and the shell's main. The gunk in both main procedures is test code, and I have commented it out and uncommented it a bunch. Both while( 1 ) loops (which are there for similar reasons, though the professor told us it had to do with interpreting the next piece of memory as an instruction) have been uncommented simultaneously, which usually just gives different undefined behavior.


 Post subject: Re: No Idea How To Debug This
PostPosted: Thu Jan 03, 2019 8:30 pm 
Offline
Member
Member

Joined: Mon Mar 25, 2013 7:01 pm
Posts: 5099
cgbsu wrote:
There are while( 1 ); loops, sometimes commented out, at the end of both the kernel's and the shell's main. The gunk in both main procedures is test code, and I have commented it out and uncommented it a bunch. Both while( 1 ) loops (which are there for similar reasons, though the professor told us it had to do with interpreting the next piece of memory as an instruction) have been uncommented simultaneously, which usually just gives different undefined behavior.

Those infinite loops prevent the main function from returning with nothing to return to. You should leave them uncommented.

Aside from that and the EBDA thing, I didn't see any other problems. I'd like to see a disk image rebuilt to fix those two issues, but I'm away from my development system for the rest of the week so I wouldn't be able to debug it until then.


 Post subject: Re: No Idea How To Debug This
PostPosted: Tue Jan 08, 2019 12:22 pm 
Offline

Joined: Sat Dec 29, 2018 6:35 pm
Posts: 8


Did you get a chance to revisit it yet?


 Post subject: Re: No Idea How To Debug This
PostPosted: Tue Jan 08, 2019 2:16 pm 
Offline
Member
Member

Joined: Mon Mar 25, 2013 7:01 pm
Posts: 5099
I did, but I don't see any other issues. I'd have to see your current code to be able to figure out why it still doesn't work.


 Post subject: Re: No Idea How To Debug This
PostPosted: Wed Jan 09, 2019 9:51 am 
Offline

Joined: Sat Dec 29, 2018 6:35 pm
Posts: 8


It hasn't changed. Sometimes I comment out makeInterrupt21 in KernelInitizlize(), and I mess around with the main() functions, commenting stuff in and out.


 Post subject: Re: No Idea How To Debug This
PostPosted: Thu Jan 10, 2019 2:34 am 
Offline
Member
Member

Joined: Mon Mar 25, 2013 7:01 pm
Posts: 5099
If you'd like me to debug further, please provide a build that incorporates fixes to the two issues I mentioned earlier:

  • The stack overlapping the EBDA due to loading the shell at 0x90000
  • The shell program returning from main() instead of halting with an infinite loop


 Post subject: Re: No Idea How To Debug This
PostPosted: Thu Jan 10, 2019 12:12 pm 
Offline

Joined: Sat Dec 29, 2018 6:35 pm
Posts: 8


Done

https://nofile.io/f/50HTF9goGwV/OS1.zip


Also, I tried to redo the project according to the professor's simpler guidelines, and I am getting similar problems:

https://nofile.io/f/sDawmZQ0QXC/OS2.zip

