OSDev.org

The Place to Start for Operating System Developers
All times are UTC - 6 hours




 Post subject: Where to load the GDT?
PostPosted: Thu Jan 25, 2024 2:34 pm 
Joined: Sat Jul 15, 2023 11:26 am
Posts: 7
I've been looking at examples and it isn't clear to me where you're supposed to load the GDT. I've seen some examples where the GDT is a global variable and the structure is loaded like so:


Code:
#pragma pack(push, 1)
struct Descriptor
{
    std::uint16_t segment_limit_low;
    std::uint16_t base_address_low;
    std::uint8_t base_address_mid;
    std::uint8_t type : 4;
    std::uint8_t system : 1;
    std::uint8_t descriptor_privilege_level : 1;
    std::uint8_t present : 1;
    std::uint8_t segment_limit_high : 4;
    std::uint8_t available : 1;
    std::uint8_t d_or_b : 1;
    std::uint8_t granularity : 1;
    std::uint8_t base_address_high;
};
#pragma pack(pop)

struct [[gnu::packed]] DescriptorPointer
{
    std::uint16_t limit;
    void *base;
};

void load_gdt(DescriptorPointer gdtr)
{
    asm volatile(
        "cli;"
        "lgdtl %0;"
        "sti;" ::"m"(gdtr));
}

// Global Variable
auto gdt = Lib::Array<Descriptor, 7> {};

void gdt_init()
{
    auto gdtr = DescriptorPointer { .limit = 0xFFFF, .base = &gdt };
    load_gdt(gdtr);
}


Or should I hardcode the address of `gdt`, like this:

Code:
void gdt_init()
{

    auto gdtr = DescriptorPointer { .limit = 0xFFFF, .base = std::bit_cast<std::uint16_t*>(0xFFFF) };
    load_gdt(gdtr);
}


Thoughts?


 Post subject: Re: Where to load the GDT?
PostPosted: Thu Jan 25, 2024 8:48 pm 

Joined: Mon Mar 25, 2013 7:01 pm
Posts: 5146
Once you add SMP support to your kernel, you're probably going to want a separate GDT for each CPU (so you can have a separate TSS and, in 32-bit mode, TLS pointer), so you'll dynamically allocate memory for the GDT anyway.

If you're not at that point yet, or you want a temporary GDT for before you've allocated the per-CPU GDT, it makes the most sense to use a symbolic reference to an address chosen by the linker. A global variable, like in your first example, is one way of doing that.

Using an integer constant doesn't really make sense. The GDT doesn't need to be at a specific fixed address, so you might as well let the linker figure out a good address for you.

jarebear wrote:
Code:
    asm volatile(
        "cli;"
        "lgdtl %0;"
        "sti;" ::"m"(gdtr));

Why would interrupts be enabled before you've set up a GDT? Why would you want to enable interrupts immediately after setting up the GDT?

jarebear wrote:
Code:
.limit = 0xFFFF

Allowing the CPU to try to use memory outside your GDT as segment descriptors sounds like a really bad idea. You should set the limit according to the actual size of your GDT.
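
As a minimal sketch of that last point, the limit can be derived from the table itself instead of being hardcoded. The names `kGdtEntries`, `gdt`, and `gdt_limit` are illustrative and not taken from the code above, and the table is assumed to be an array of raw 8-byte descriptors.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>

// Hypothetical table: 7 raw 8-byte descriptors at an address chosen by the linker.
constexpr std::size_t kGdtEntries = 7;
std::array<std::uint64_t, kGdtEntries> gdt{};

// The GDTR limit is the offset of the last valid byte: size in bytes minus one.
constexpr std::uint16_t gdt_limit(std::size_t entry_count) {
    return static_cast<std::uint16_t>(entry_count * sizeof(std::uint64_t) - 1);
}

// Fails to compile if the limit ever drifts away from the real table size.
static_assert(gdt_limit(kGdtEntries) == sizeof(gdt) - 1,
              "limit must track the table size");
```

With seven descriptors this gives a limit of 55, so a selector that points past the table faults instead of being read from unrelated memory.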


 Post subject: Re: Where to load the GDT?
PostPosted: Fri Jan 26, 2024 9:02 am 

Joined: Wed Aug 30, 2017 8:24 am
Posts: 1605
The GDT needs to live for the entire time it is loaded (so in most cases the entire run time of the kernel). The GDT pointer however does not. I put the GDT in my CPU descriptor, which is a data structure allocated for each CPU. Except the BSP's CPU descriptor is statically allocated in the data section, so no allocation can fail for it. Personally, I also don't really hold with packed structures and bit fields. I simply declare the GDT as an array of 64-bit numbers, initialize what I can at compile time, initialize the rest at boot time and load it in assembler like this:
Code:
# void init_gdt_asm(const uint64_t *gdt, size_t gdt_size)
.global init_gdt_asm
.type init_gdt_asm, @function
init_gdt_asm:
  subq $16, %rsp
  decw %si
  movq %rdi, 8(%rsp)
  movw %si, 6(%rsp)
  lgdt 6(%rsp)
  addq $16, %rsp
  retq
.size init_gdt_asm, . - init_gdt_asm
For all the other CPUs, you do need to allocate memory anyway (so you can't do SMP until your allocator is working), so allocating the GDT and TSS along with everything else is perfectly OK.
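
For readers who prefer C++ to assembly, the 10-byte operand that the routine above builds on the stack can be sketched as follows. `make_gdtr_image` is a hypothetical helper, not part of the post. It assumes 64-bit mode (a 2-byte limit followed by an 8-byte base) and only builds the bytes; it does not execute `lgdt`.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>

// Builds the in-memory operand for lgdt in 64-bit mode:
// bytes 0..1 hold the limit, bytes 2..9 hold the linear base address.
// Bytes are written one at a time, least significant first, so the result
// does not depend on the byte order of the machine running this code.
std::array<std::uint8_t, 10> make_gdtr_image(std::uint64_t base,
                                             std::size_t size_in_bytes) {
    std::array<std::uint8_t, 10> image{};
    const std::uint16_t limit = static_cast<std::uint16_t>(size_in_bytes - 1);
    image[0] = static_cast<std::uint8_t>(limit & 0xFF);
    image[1] = static_cast<std::uint8_t>(limit >> 8);
    for (std::size_t i = 0; i < 8; ++i) {
        image[2 + i] = static_cast<std::uint8_t>((base >> (8 * i)) & 0xFF);
    }
    return image;
}
```

This mirrors the `decw %si` step (size minus one) and the placement of the limit at offset 6 and the base at offset 8 in the assembly version. As noted above, the operand is only needed for the duration of the `lgdt` instruction, while the table it points to must stay alive.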

_________________
Carpe diem!


 Post subject: Re: Where to load the GDT?
PostPosted: Tue Feb 06, 2024 4:35 pm 
Joined: Sat Jul 15, 2023 11:26 am
Posts: 7
Octocontrabass wrote:
Once you add SMP support to your kernel, you're probably going to want a separate GDT for each CPU (so you can have a separate TSS and, in 32-bit mode, TLS pointer), so you'll dynamically allocate memory for the GDT anyway.

If you're not at that point yet, or you want a temporary GDT for before you've allocated the per-CPU GDT, it makes the most sense to use a symbolic reference to an address chosen by the linker. A global variable, like in your first example, is one way of doing that.

Using an integer constant doesn't really make sense. The GDT doesn't need to be at a specific fixed address, so you might as well let the linker figure out a good address for you.


Thanks! This is what I ended up doing!

My only other question is the layout of my segment descriptor. I have it laid out like this:

Code:
#pragma pack(push, 1)
    struct Entry
    {
        std::uint16_t segment_limit_low;
        std::uint16_t base_address_low;
        std::uint8_t base_address_mid;
        std::uint8_t type : 4;
        std::uint8_t system : 1;
        std::uint8_t descriptor_privilege_level : 1;
        std::uint8_t present : 1;
        std::uint8_t segment_limit_high : 4;
        std::uint8_t available : 1;
        std::uint8_t d_or_b : 1;
        std::uint8_t granularity : 1;
        std::uint8_t base_address_high;
    };
#pragma pack(pop)


But I'm afraid that may pose a problem. According to the Intel manual (Section 3.4.5), a segment descriptor is two 32-bit parts, with the least significant bits being the segment limit. Because the struct is packed, the order of the members matters. Reads are performed from lowest to highest address, and struct members are laid out in sequential order. So should the lowest part of my struct be `base_address_mid`, because that's where the 32-bit offset begins?

Am I making sense?


 Post subject: Re: Where to load the GDT?
PostPosted: Tue Feb 06, 2024 6:05 pm 
Joined: Sat Jul 15, 2023 11:26 am
Posts: 7
nullplan wrote:
[...] Personally, I also don't really hold with packed structures and bit fields. I simply declare the GDT as an array of 64-bit numbers [...]


Thanks! I've seen people do it that way. My problem, though, is that I'm concerned with the memory layout of a packed struct. The Intel manual shows the descriptor as two separate 32-bit values, so I had thought that this affected how the descriptor was read.


 Post subject: Re: Where to load the GDT?
PostPosted: Tue Feb 06, 2024 7:22 pm 

Joined: Mon Mar 25, 2013 7:01 pm
Posts: 5146
jarebear wrote:
Code:
        std::uint8_t descriptor_privilege_level : 1;

The DPL is two bits, not one.

jarebear wrote:
Code:
        std::uint8_t available : 1;
        std::uint8_t d_or_b : 1;

You're missing one bit between these two bits.

jarebear wrote:
The Intel manual shows the descriptor as two separate 32-bit values, so I had thought that this affected how the descriptor was read.

It doesn't. How could it? The memory doesn't remember how you wrote the values, it only remembers the values.

(Okay, technically it could impact alignment, since a packed struct has byte-alignment and an array of 64-bit integers has 64-bit alignment, but that only affects speed.)
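
A small sketch of that point: writing the low and high 32-bit halves separately leaves exactly the same eight bytes in memory as writing one 64-bit value. `halves_match_whole` is a hypothetical name used only for illustration, and the check assumes a little-endian host such as x86.

```cpp
#include <cstdint>
#include <cstring>

// Returns true if storing a descriptor as two 32-bit halves (low half first)
// produces the same 8 bytes as storing it as a single 64-bit value.
// Assumption: little-endian host, as on x86.
bool halves_match_whole(std::uint64_t descriptor) {
    const std::uint32_t halves[2] = {
        static_cast<std::uint32_t>(descriptor),        // bytes 0..3
        static_cast<std::uint32_t>(descriptor >> 32),  // bytes 4..7
    };
    const std::uint64_t whole = descriptor;
    return std::memcmp(halves, &whole, sizeof whole) == 0;
}
```

The two-row diagram in the manual is therefore just a drawing convention: the lower row is bytes 0 to 3 and the upper row is bytes 4 to 7 of the same 8-byte entry.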


 Post subject: Re: Where to load the GDT?
PostPosted: Wed Feb 07, 2024 9:59 am 
Joined: Sat Jul 15, 2023 11:26 am
Posts: 7
Octocontrabass wrote:
It doesn't. How could it? The memory doesn't remember how you wrote the values, it only remembers the values.

(Okay, technically it could impact alignment, since a packed struct has byte-alignment and an array of 64-bit integers has 64-bit alignment, but that only affects speed.)

Bear with me. My question is about how endianness might affect how the segment descriptor is read.

Take for example two representations that occupy 2 bytes of memory. A struct that looks like:

Code:
struct [[gnu::packed]] Packed {
  std::uint8_t x;
  std::uint8_t y : 4;
  std::uint8_t z : 4;
};

Packed p = { .x = 0x12, .y = 0x3, .z = 0x4 };


and an unsigned 2 byte value.

Code:
std::uint16_t value = { 0x1234 };


These are laid out in memory differently.

Here's the generated assembly
Code:
0x555555555129 <main()>                     endbr64
0x55555555512d <main()+4>                   push   rbp
0x55555555512e <main()+5>                   mov    rbp,rsp
0x555555555131 <main()+8>                   movzx  eax,WORD PTR [rip+0xecc]        # 0x555555556004
0x555555555138 <main()+15>                  mov    WORD PTR [rbp-0x2],ax
0x55555555513c <main()+19>                  mov    WORD PTR [rbp-0x4],0x1234
0x555555555142 <main()+25>                  mov    eax,0x0
0x555555555147 <main()+30>                  pop    rbp
0x555555555148 <main()+31>                  ret


Here are the initializations.

Code:
0x555555555131 <main()+8>                   movzx  eax,WORD PTR [rip+0xecc]        # 0x555555556004
0x555555555138 <main()+15>                  mov    WORD PTR [rbp-0x2],ax
0x55555555513c <main()+19>                  mov    WORD PTR [rbp-0x4],0x1234


You can see that these values differ in byte arrangement because of endianness. Demonstrated in gdb, here is the struct:

Code:
# This is the struct
0x555555556004: 0x4312
(gdb) x/1xh $rbp - 2
0x7fffffffdeae: 0x4312
(gdb) x/2xb $rbp - 2
0x7fffffffdeae: 0x12    0x43


And here is the `std::uint16_t`

Code:
(gdb) x/1xh $rbp - 4
0x7fffffffdeac: 0x1234
(gdb) x/2xb $rbp - 4
0x7fffffffdeac: 0x34    0x12


The key difference being these two lines:

Code:
(gdb) x/2xb $rbp - 2
0x7fffffffdeae: 0x12    0x43


and

Code:
(gdb) x/2xb $rbp - 4
0x7fffffffdeac: 0x34    0x12


The struct and the `std::uint16_t` are laid out differently.

But back to my original question: the segment limit (bits 0 - 15) occupies the two least significant bytes, and reads from memory (from my understanding) go from lowest to highest address. So, should I have placed my struct member `std::uint16_t segment_limit_low` differently to account for this?


 Post subject: Re: Where to load the GDT?
PostPosted: Wed Feb 07, 2024 11:12 am 

Joined: Tue Apr 03, 2018 2:44 am
Posts: 403
jarebear wrote:
Octocontrabass wrote:
It doesn't. How could it? The memory doesn't remember how you wrote the values, it only remembers the values.

(Okay, technically it could impact alignment, since a packed struct has byte-alignment and an array of 64-bit integers has 64-bit alignment, but that only affects speed.)

Bear with me. My question is about how endianness might affect how the segment descriptor is read.

Take for example two representations that occupy 2 bytes of memory. A struct that looks like:

Code:
struct [[gnu::packed]] Packed {
  std::uint8_t x;
  std::uint8_t y : 4;
  std::uint8_t z : 4;
};

Packed p = { .x = 0x12, .y = 0x3, .z = 0x4 };


and an unsigned 2 byte value.

Code:
std::uint16_t value = { 0x1234 };


These are laid out in memory differently.

...

But back to my original question: the segment limit (bits 0 - 15) occupies the two least significant bytes, and reads from memory (from my understanding) go from lowest to highest address. So, should I have placed my struct member `std::uint16_t segment_limit_low` differently to account for this?


This is why people don't like bitfields and packed structures (myself included).

You're assuming a layout for the bit fields, based on some mental image in your head.

But as far as I know, the C standard dictates no specific layout for bitfields, and doesn't specify packed structures at all.

All my code treats descriptors as array of bytes, and uses masks/shifts to update specific fields as required. In fact, for the GDT, I only have the constant kernel/user code/data segments, which are fixed plus null descriptor, current thread TSS descriptor and a double fault TSS descriptor (7 descriptors in total), which are initialised once with constant values, then not touched again.

Intel didn't design their CPU structures with C in mind, it seems.
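
A rough sketch of the mask-and-shift approach described above, using a 64-bit integer rather than a byte array for brevity. `make_descriptor` and its parameter names are hypothetical; the bit positions follow the legacy segment descriptor layout in the Intel manual.

```cpp
#include <cstdint>

// Packs the scattered descriptor fields into one 64-bit entry.
// access: the byte holding type, S, DPL and P.
// flags:  the nibble holding AVL, L, D/B and G.
constexpr std::uint64_t make_descriptor(std::uint32_t base, std::uint32_t limit,
                                        std::uint8_t access, std::uint8_t flags) {
    return (static_cast<std::uint64_t>(limit) & 0xFFFFull)                // limit 15:0  -> bits 0..15
         | ((static_cast<std::uint64_t>(base) & 0xFFFFFFull) << 16)       // base 23:0   -> bits 16..39
         | (static_cast<std::uint64_t>(access) << 40)                     // access byte -> bits 40..47
         | (((static_cast<std::uint64_t>(limit) >> 16) & 0xFull) << 48)   // limit 19:16 -> bits 48..51
         | ((static_cast<std::uint64_t>(flags) & 0xFull) << 52)           // flags       -> bits 52..55
         | (((static_cast<std::uint64_t>(base) >> 24) & 0xFFull) << 56);  // base 31:24  -> bits 56..63
}

// A 64-bit ring 0 code segment: present, readable code, G and L set.
static_assert(make_descriptor(0, 0xFFFFF, 0x9A, 0xA) == 0x00AF9A000000FFFFull,
              "code segment encoding");
```

Because the function is `constexpr`, a table of constant descriptors can be built entirely at compile time and never touched again, which matches the "initialised once with constant values" approach.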


 Post subject: Re: Where to load the GDT?
PostPosted: Wed Feb 07, 2024 12:32 pm 

Joined: Wed Aug 30, 2017 8:24 am
Posts: 1605
thewrongchristian wrote:
This is why people don't like bitfields and packed structures (myself included).

You're assuming a layout for the bit fields, based on some mental image in your head.

But as far as I know, the C standard dictates no specific layout for bitfields, and doesn't specify packed structures at all.

That is one good point. There is consensus among the ABIs as to how structures work, but less so for bit fields. Sometimes they are allocated from the bottom, sometimes from the top. Sometimes the base type matters, sometimes it doesn't beyond signedness. Indeed, according to ISO C, the only base types guaranteed to work are _Bool (since C99), signed int, unsigned int, and int. Everything else is an extension.

And because of all of this, an array it is for me.

Another point is of course that
Code:
gdt[1] = 0x00af9a000000ffff;

is just way shorter than
Code:
gdt[1].base_hi = 0;
gdt[1].flags = FLG_GRANULARITY | FLG_LONGMODE;
gdt[1].limit_hi = 0xf;
gdt[1].access = ACCESS_PRESENT | ACCESS_DPL_0 | ACCESS_NON_SYS_SEG;
gdt[1].type = TYPE_READABLE_CODE_SEG;
gdt[1].base_mid = 0;
gdt[1].base_lo = 0;
gdt[1].limit_lo = 0xffff;
Granted, the latter may be more readable, but even all of those words only help you with a CPU manual in hand, and in that case you are still no worse off having to decode the entire thing.
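
To show what "decoding the entire thing" amounts to, here is a small sketch that splits a raw entry back into its fields. `DecodedDescriptor` and `decode_descriptor` are hypothetical names, not something from the posts above.

```cpp
#include <cstdint>

// The logical fields of a legacy segment descriptor, reassembled.
struct DecodedDescriptor {
    std::uint32_t base;    // base 31:0
    std::uint32_t limit;   // limit 19:0
    std::uint8_t  access;  // type, S, DPL, P
    std::uint8_t  flags;   // AVL, L, D/B, G
};

constexpr DecodedDescriptor decode_descriptor(std::uint64_t raw) {
    return DecodedDescriptor{
        // base: bits 16..39 are base 23:0, bits 56..63 are base 31:24
        static_cast<std::uint32_t>(((raw >> 16) & 0xFFFFFF) | (((raw >> 56) & 0xFF) << 24)),
        // limit: bits 0..15 are limit 15:0, bits 48..51 are limit 19:16
        static_cast<std::uint32_t>((raw & 0xFFFF) | (((raw >> 48) & 0xF) << 16)),
        static_cast<std::uint8_t>((raw >> 40) & 0xFF),
        static_cast<std::uint8_t>((raw >> 52) & 0xF),
    };
}
```

Applied to `0x00af9a000000ffff`, this yields base 0, limit 0xFFFFF, access byte 0x9A and flags 0xA, which is the same information the eight field assignments spell out.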

_________________
Carpe diem!


 Post subject: Re: Where to load the GDT?
PostPosted: Wed Feb 07, 2024 1:01 pm 
Joined: Sat Jul 15, 2023 11:26 am
Posts: 7
thewrongchristian wrote:
This is why people don't like bitfields and packed structures (myself included).

You're assuming a layout for the bit fields, based on some mental image in your head.

But as far as I know, the C standard dictates no specific layout for bitfields, and doesn't specify packed structures at all.

Yeah, I checked and it is implementation-defined. But just out of curiosity, I made a simple test program:

Code:
#include <cstdint>

namespace A {
#pragma pack(push, 1)
struct Descriptor {
  std::uint16_t segment_limit_low;
  std::uint16_t base_address_low;
  std::uint8_t base_address_mid;
  std::uint8_t type : 4;
  std::uint8_t system : 1;
  std::uint8_t descriptor_privilege_level : 2;
  std::uint8_t present : 1;
  std::uint8_t segment_limit_high : 4;
  std::uint8_t available : 1;
  std::uint8_t reserved : 1;
  std::uint8_t d_or_b : 1;
  std::uint8_t granularity : 1;
  std::uint8_t base_address_high;
};
#pragma pack(pop)
} // namespace A

namespace B {
struct Descriptor {
  void set_segment_limit_low(std::uint16_t segment_limit_low) {
    // Clear only bits 0..15; ANDing with 0 would wipe the whole entry.
    m_entry &= ~std::uint64_t{0xFFFF};
    m_entry |= segment_limit_low;
  }

  void set_dpl(std::uint8_t dpl) {
    std::uint64_t mask = 0xFFFF9FFFFFFFFFFF;
    m_entry &= mask;
    std::uint64_t temp = (static_cast<std::uint64_t>(dpl) << 45);
    m_entry |= temp;
  }

  void set_granularity(std::uint8_t granularity) {
    // Granularity is bit 55, so the mask must clear bit 55 (0xBF would clear bit 54).
    std::uint64_t mask = 0xFF7FFFFFFFFFFFFF;
    m_entry &= mask;
    std::uint64_t temp = (static_cast<std::uint64_t>(granularity & 0x1) << 55);
    m_entry |= temp;
  }

  void set_base_address_high(std::uint8_t base_address_high) {
    std::uint64_t mask = 0x00FFFFFFFFFFFFFF;
    m_entry &= mask;
    std::uint64_t temp = (static_cast<std::uint64_t>(base_address_high) << 56);
    m_entry |= temp;
  }

private:
  std::uint64_t m_entry;
};
} // namespace B

int main() {
  auto a = A::Descriptor{};
  auto b = B::Descriptor{};

  a.segment_limit_low = 0x3124;
  a.descriptor_privilege_level = 0x3;
  a.granularity = 0x1;
  a.base_address_high = 0x89;

  b.set_segment_limit_low(0x3124);
  b.set_dpl(0x3);
  b.set_granularity(0x1);
  b.set_base_address_high(0x89);

  return 0;
}


And, yep, they're the same.



Values in gdb:

Code:
(gdb) x/1xg &a 
0x7fffffffde70: 0x8980600000003124
(gdb) x/1xg &b.m_entry
0x7fffffffde68: 0x8980600000003124


nullplan wrote:
Sometimes they are allocated from the bottom, sometimes from the top

This was my one worry. But it seems like it's working lol.
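
Hand-written 64-bit masks like the ones in the setters are easy to get off by one bit, so one possible alternative is to derive the mask from the field's position and width. This is only a sketch; `set_bits` is a hypothetical helper and assumes `width` is less than 64.

```cpp
#include <cstdint>

// Replaces the `width`-bit field starting at bit `shift` with `value`,
// leaving every other bit of `entry` unchanged. Assumes width < 64.
constexpr std::uint64_t set_bits(std::uint64_t entry, unsigned shift,
                                 unsigned width, std::uint64_t value) {
    const std::uint64_t field_mask = ((std::uint64_t{1} << width) - 1) << shift;
    return (entry & ~field_mask) | ((value << shift) & field_mask);
}

// DPL is the 2-bit field at bits 45..46; granularity is the single bit 55.
static_assert(set_bits(0, 45, 2, 0x3) == 0x0000600000000000ull, "DPL field");
static_assert(set_bits(0, 55, 1, 0x1) == 0x0080000000000000ull, "granularity bit");
```

Each setter then reduces to one call, for example `m_entry = set_bits(m_entry, 45, 2, dpl);`.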


 Post subject: Re: Where to load the GDT?
PostPosted: Wed Feb 07, 2024 7:48 pm 

Joined: Mon Mar 25, 2013 7:01 pm
Posts: 5146
Bitfield layout is defined by the ABI, so you don't have to worry about it as long as you're not trying to use the same struct with different ABIs.

You probably aren't going to use an x86 segment descriptor struct for more than one ABI. (And even if you do, it'll be x86 ABIs that all handle bitfields the same way.)
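
If you want the build to tell you when that assumption stops holding, one option is to compare the bitfield struct against a known-good raw value. The following is a sketch only: `BitfieldDescriptor`, `raw_bits`, and `layout_matches_hardware` are hypothetical names, and the check assumes a little-endian ABI that allocates bitfields from the least significant bit, as GCC and Clang do on x86.

```cpp
#include <cstdint>
#include <cstring>

#pragma pack(push, 1)
struct BitfieldDescriptor {
    std::uint16_t limit_low;
    std::uint16_t base_low;
    std::uint8_t  base_mid;
    std::uint8_t  type : 4;
    std::uint8_t  system : 1;
    std::uint8_t  dpl : 2;
    std::uint8_t  present : 1;
    std::uint8_t  limit_high : 4;
    std::uint8_t  available : 1;
    std::uint8_t  long_mode : 1;
    std::uint8_t  d_or_b : 1;
    std::uint8_t  granularity : 1;
    std::uint8_t  base_high;
};
#pragma pack(pop)

static_assert(sizeof(BitfieldDescriptor) == 8, "descriptor must be 8 bytes");

// Copies the struct's bytes into an integer without aliasing problems.
std::uint64_t raw_bits(const BitfieldDescriptor &d) {
    std::uint64_t raw = 0;
    std::memcpy(&raw, &d, sizeof raw);
    return raw;
}

// Builds a 64-bit ring 0 code segment through the bitfields and checks
// that the bytes match the value the CPU expects.
bool layout_matches_hardware() {
    BitfieldDescriptor d{};
    d.limit_low = 0xFFFF;
    d.type = 0xA;        // readable code segment
    d.system = 1;        // S = 1: code/data, not a system segment
    d.present = 1;
    d.limit_high = 0xF;
    d.long_mode = 1;
    d.granularity = 1;
    return raw_bits(d) == 0x00AF9A000000FFFFull;
}
```

Calling `layout_matches_hardware()` once during early boot, or from a host-side unit test built with the same compiler and target, would catch a compiler or ABI that lays the bitfields out differently.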

