Where to load the GDT?

jarebear · **Joined:** Sat Jul 15, 2023 11:26 am **Posts:** 7

I've been looking at examples and it isn't clear to me where you're supposed to load the GDT. I've seen some examples where the GDT is a global variable and the structure is loaded like so:

Code:

#pragma pack(push, 1)
struct Descriptor
{
    std::uint16_t segment_limit_low;
    std::uint16_t base_address_low;
    std::uint8_t base_address_mid;
    std::uint8_t type : 4;
    std::uint8_t system : 1;
    std::uint8_t descriptor_privilege_level : 1;
    std::uint8_t present : 1;
    std::uint8_t segment_limit_high : 4;
    std::uint8_t available : 1;
    std::uint8_t d_or_b : 1;
    std::uint8_t granularity : 1;
    std::uint8_t base_address_high;
};
#pragma pack(pop)

struct [[gnu::packed]] DescriptorPointer
{
    std::uint16_t limit;
    void *base;
};

void load_gdt(DescriptorPointer gdtr)
{
    asm volatile(
        "cli;"
        "lgdtl %0;"
        "sti;" ::"m"(gdtr));
}

// Global Variable
auto gdt = Lib::Array<Descriptor, 7> {};

void gdt_init()
{
    auto gdtr = DescriptorPointer { .limit = 0xFFFF, .base = &gdt };
    load_gdt(gdtr);
}

Or should I hardcode the address of `gdt` like

Code:

void gdt_init()
{

    auto gdtr = DescriptorPointer { .limit = 0xFFFF, .base = reinterpret_cast<void *>(0xFFFF) };
    load_gdt(gdtr);
}

Thoughts?

Octocontrabass · **Joined:** Mon Mar 25, 2013 7:01 pm **Posts:** 5146

Once you add SMP support to your kernel, you're probably going to want a separate GDT for each CPU (so you can have a separate TSS and, in 32-bit mode, TLS pointer), so you'll dynamically allocate memory for the GDT anyway.

If you're not at that point yet, or you want a temporary GDT for before you've allocated the per-CPU GDT, it makes the most sense to use a symbolic reference to an address chosen by the linker. A global variable, like in your first example, is one way of doing that.

Using an integer constant doesn't really make sense. The GDT doesn't need to be at a specific fixed address, so you might as well let the linker figure out a good address for you.

jarebear wrote:

Code:

    asm volatile(
        "cli;"
        "lgdtl %0;"
        "sti;" ::"m"(gdtr));

Why would interrupts be enabled before you've set up a GDT? Why would you want to enable interrupts immediately after setting up the GDT?

jarebear wrote:

Code:

.limit = 0xFFFF

Allowing the CPU to try to use memory outside your GDT as segment descriptors sounds like a really bad idea. You should set the limit according to the actual size of your GDT.
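As a rough illustration of both points, here is a minimal sketch that derives the limit from the table size (size in bytes minus one) and loads the GDT without touching the interrupt flag. It assumes 64-bit mode (so the pseudo-descriptor is 10 bytes), a GCC/Clang-style toolchain, and a GDT stored as a `std::array` of 64-bit entries; `make_gdtr` is a hypothetical helper name, not something from the original code.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>

#pragma pack(push, 1)
// Operand for LGDT in 64-bit mode: 16-bit limit followed by a 64-bit base.
struct DescriptorPointer {
    std::uint16_t limit;
    std::uint64_t base;
};
#pragma pack(pop)

static_assert(sizeof(DescriptorPointer) == 10, "LGDT expects a 10-byte operand in 64-bit mode");

// Hypothetical helper: build the pseudo-descriptor from the table itself,
// so the limit always matches the real size of the GDT.
template <std::size_t N>
DescriptorPointer make_gdtr(const std::array<std::uint64_t, N>& gdt) {
    static_assert(N > 0 && N * sizeof(std::uint64_t) <= 0x10000, "GDT size out of range");
    return DescriptorPointer{
        static_cast<std::uint16_t>(N * sizeof(std::uint64_t) - 1),  // limit = size in bytes - 1
        static_cast<std::uint64_t>(reinterpret_cast<std::uintptr_t>(gdt.data()))};
}

#if defined(__GNUC__) && (defined(__x86_64__) || defined(__i386__))
// Privileged instruction: only meaningful in ring 0. No cli/sti here, on the
// assumption that interrupts stay disabled until the IDT is set up as well.
inline void load_gdt(const DescriptorPointer& gdtr) {
    asm volatile("lgdt %0" : : "m"(gdtr) : "memory");
}
#endif
```

With a seven-entry table, `make_gdtr` yields a limit of 55 (7 * 8 - 1) instead of 0xFFFF.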

nullplan · **Joined:** Wed Aug 30, 2017 8:24 am **Posts:** 1605

The GDT needs to live for the entire time it is loaded (so in most cases the entire run time of the kernel). The GDT pointer however does not. I put the GDT in my CPU descriptor, which is a data structure allocated for each CPU. Except the BSP's CPU descriptor is statically allocated in the data section, so no allocation can fail for it. Personally, I also don't really hold with packed structures and bit fields. I simply declare the GDT as an array of 64-bit numbers, initialize what I can at compile time, initialize the rest at boot time and load it in assembler like this:

Code:

# void init_gdt_asm(const uint64_t *gdt, size_t gdt_size)
.global init_gdt_asm
.type init_gdt_asm, @function
init_gdt_asm:
  subq $16, %rsp
  decw %si
  movq %rdi, 8(%rsp)
  movw %si, 6(%rsp)
  lgdt 6(%rsp)
  addq $16, %rsp
  retq
.size init_gdt_asm, . - init_gdt_asm

For all the other CPUs, you do need to allocate memory anyway (so you can't do SMP until your allocator is working), so allocating the GDT and TSS along with everything else is perfectly OK.
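A minimal sketch of that arrangement, under some assumptions: `CpuDescriptor`, `bsp_cpu` and `allocate_ap_cpu` are hypothetical names, `new (std::nothrow)` stands in for whatever allocator the kernel actually provides, and copying the BSP's entries as a starting point is just one possible choice.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>
#include <new>

constexpr std::size_t kGdtEntries = 7;

// Hypothetical per-CPU structure. The GDT lives inside it, so it stays valid
// for as long as the CPU descriptor itself does.
struct CpuDescriptor {
    std::array<std::uint64_t, kGdtEntries> gdt{};
    // A real kernel would also keep the TSS and other per-CPU state here.
};

// The boot processor's descriptor is a plain global: the linker picks its
// address and no allocation can fail for it.
CpuDescriptor bsp_cpu{};

// Application processors get theirs from the allocator once it works.
// Returns nullptr if the allocation fails.
CpuDescriptor* allocate_ap_cpu() {
    CpuDescriptor* cpu = new (std::nothrow) CpuDescriptor{};
    if (cpu != nullptr) {
        cpu->gdt = bsp_cpu.gdt;  // start from the BSP's entries (an assumption)
    }
    return cpu;
}
```

Each CPU would then build its own GDT pointer from `cpu->gdt` on its stack, since the pointer does not need to outlive the `lgdt` instruction.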

jarebear · **Joined:** Sat Jul 15, 2023 11:26 am **Posts:** 7

Octocontrabass wrote:

Once you add SMP support to your kernel, you're probably going to want a separate GDT for each CPU (so you can have a separate TSS and, in 32-bit mode, TLS pointer), so you'll dynamically allocate memory for the GDT anyway.

If you're not at that point yet, or you want a temporary GDT for before you've allocated the per-CPU GDT, it makes the most sense to use a symbolic reference to an address chosen by the linker. A global variable, like in your first example, is one way of doing that.

Using an integer constant doesn't really make sense. The GDT doesn't need to be at a specific fixed address, so you might as well let the linker figure out a good address for you.

Thanks! This is what I ended up doing!

My only other question is the layout of my segment descriptor. I have it laid out like this:

Code:

#pragma pack(push, 1)
    struct Entry
    {
        std::uint16_t segment_limit_low;
        std::uint16_t base_address_low;
        std::uint8_t base_address_mid;
        std::uint8_t type : 4;
        std::uint8_t system : 1;
        std::uint8_t descriptor_privilege_level : 1;
        std::uint8_t present : 1;
        std::uint8_t segment_limit_high : 4;
        std::uint8_t available : 1;
        std::uint8_t d_or_b : 1;
        std::uint8_t granularity : 1;
        std::uint8_t base_address_high;
    };
#pragma pack(pop)

But I'm afraid that may pose a problem. According to the Intel manual (Section 3.4.5), a segment descriptor consists of two 32-bit parts, with the least significant bits holding the segment limit. Because the struct is packed, the order of the members matters. Reads are performed from lowest to highest address, and struct members are laid out in sequential order. So should the lowest part be my `base_address_mid`, because that's where the 32-bit offset begins?

Am I making sense?

jarebear · **Joined:** Sat Jul 15, 2023 11:26 am **Posts:** 7

nullplan wrote:

[...] Personally, I also don't really hold with packed structures and bit fields. I simply declare the GDT as an array of 64-bit numbers [...]

Thanks! I've seen people do it that way. My problem, though, is that I'm concerned with the memory layout of a packed struct. The Intel manual shows the descriptor as two separate 32-bit values, so I had thought that this affected how the descriptor was read.

Octocontrabass · **Joined:** Mon Mar 25, 2013 7:01 pm **Posts:** 5146

jarebear wrote:

Code:

std::uint8_t descriptor_privilege_level : 1;

The DPL is two bits, not one.

jarebear wrote:

Code:

        std::uint8_t available : 1;
        std::uint8_t d_or_b : 1;

You're missing one bit between these two bits.

jarebear wrote:

The Intel manual shows the descriptor as two separate 32-bit values, so I had thought that this affected how the descriptor was read.

It doesn't. How could it? The memory doesn't remember how you wrote the values, it only remembers the values.

(Okay, technically it could impact alignment, since a packed struct has byte-alignment and an array of 64-bit integers has 64-bit alignment, but that only affects speed.)
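A small sketch of that point, assuming a little-endian machine such as x86: storing the descriptor as two 32-bit halves (low half at the lower address, as in the manual's figure) leaves exactly the same eight bytes in memory as storing it as one 64-bit integer. The function name and the example value are arbitrary.

```cpp
#include <cstdint>
#include <cstring>

// Returns true when the two-halves view and the single 64-bit view of a
// descriptor produce identical bytes in memory. This holds on little-endian
// machines, which includes every x86 CPU.
bool halves_match_whole() {
    const std::uint64_t whole = 0x00af9a000000ffffULL;  // arbitrary example value

    // Low dword at offset 0, high dword at offset 4.
    const std::uint32_t halves[2] = {
        static_cast<std::uint32_t>(whole),
        static_cast<std::uint32_t>(whole >> 32)};

    static_assert(sizeof(halves) == sizeof(whole), "both views must cover 8 bytes");
    return std::memcmp(&whole, halves, sizeof(whole)) == 0;
}
```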

jarebear · **Joined:** Sat Jul 15, 2023 11:26 am **Posts:** 7

Octocontrabass wrote:

It doesn't. How could it? The memory doesn't remember how you wrote the values, it only remembers the values.

(Okay, technically it could impact alignment, since a packed struct has byte-alignment and an array of 64-bit integers has 64-bit alignment, but that only affects speed.)

Bear with me. My question is regarding how the endian-ness might affect how the segment descriptor is read.

Take for example two representations that occupy 2 bytes of memory. A struct that looks like:

Code:

struct [[gnu::packed]] Packed {
  std::uint8_t x;
  std::uint8_t y : 4;
  std::uint8_t z : 4;
};

Packed p = { .x = 0x12, .y = 0x3, .z = 0x4 };

and an unsigned 2 byte value.

Code:

std::uint16_t value = { 0x1234 };

These are laid out in memory differently.

Here's the generated assembly

Code:

0x555555555129 <main()>                     endbr64
0x55555555512d <main()+4>                   push   rbp
0x55555555512e <main()+5>                   mov    rbp,rsp
0x555555555131 <main()+8>                   movzx  eax,WORD PTR [rip+0xecc]        # 0x555555556004
0x555555555138 <main()+15>                  mov    WORD PTR [rbp-0x2],ax
0x55555555513c <main()+19>                  mov    WORD PTR [rbp-0x4],0x1234
0x555555555142 <main()+25>                  mov    eax,0x0
0x555555555147 <main()+30>                  pop    rbp
0x555555555148 <main()+31>                  ret

Here are the initializations.

Code:

0x555555555131 <main()+8>                   movzx  eax,WORD PTR [rip+0xecc]        # 0x555555556004
0x555555555138 <main()+15>                  mov    WORD PTR [rbp-0x2],ax
0x55555555513c <main()+19>                  mov    WORD PTR [rbp-0x4],0x1234

You can see that these values differ in byte arrangement because of endian-ness. Demonstrated in gdb, here is the struct:

Code:

# This is the struct
0x555555556004: 0x4312
(gdb) x/1xh $rbp - 2
0x7fffffffdeae: 0x4312
(gdb) x/2xb $rbp - 2
0x7fffffffdeae: 0x12    0x43

And here is the `std::uint16_t`

Code:

(gdb) x/1xh $rbp - 4
0x7fffffffdeac: 0x1234
(gdb) x/2xb $rbp - 4
0x7fffffffdeac: 0x34    0x12

The key difference being these two lines:

Code:

(gdb) x/2xb $rbp - 2
0x7fffffffdeae: 0x12    0x43

and

Code:

(gdb) x/2xb $rbp - 4
0x7fffffffdeac: 0x34    0x12

The struct and `std::uint16_t` are laid out differently.

But back to my original question: the segment limit (bits 0 - 15) occupies the two least significant bytes, and reads from memory (from my understanding) go from lowest to highest address. So, should I have put my struct member `std::uint16_t segment_limit_low` to account for this?

thewrongchristian · **Joined:** Tue Apr 03, 2018 2:44 am **Posts:** 403

jarebear wrote:

Octocontrabass wrote:

It doesn't. How could it? The memory doesn't remember how you wrote the values, it only remembers the values.

(Okay, technically it could impact alignment, since a packed struct has byte-alignment and an array of 64-bit integers has 64-bit alignment, but that only affects speed.)

Bear with me. My question is regarding how the endian-ness might affect how the segment descriptor is read.

Take for example two representations that occupy 2 bytes of memory. A struct that looks like:

Code:

struct [[gnu::packed]] Packed {
  std::uint8_t x;
  std::uint8_t y : 4;
  std::uint8_t z : 4;
};

Packed p = { .x = 0x12, .y = 0x3, .z = 0x4 };

and an unsigned 2 byte value.

Code:

std::uint16_t value = { 0x1234 };

These are laid out in memory differently.

...

But back to my original question: the segment limit (bits 0 - 15) occupies the two least significant bytes, and reads from memory (from my understanding) go from lowest to highest address. So, should I have put my struct member `std::uint16_t segment_limit_low` to account for this?

This is why people don't like bitfields and packed structures (myself included).

You're assuming a layout for the bit fields, based on some mental image in your head.

But as far as I know, the C standard dictates no specific layout for bitfields, and doesn't specify packed structures at all.

All my code treats descriptors as arrays of bytes, and uses masks/shifts to update specific fields as required. In fact, for the GDT, I only have the constant kernel/user code/data segments (which are fixed), plus the null descriptor, the current thread TSS descriptor and a double fault TSS descriptor (7 descriptors in total), which are initialised once with constant values, then not touched again.

Intel didn't design their CPU structures with C in mind, it seems.
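For illustration, a minimal sketch of the byte-array approach with masks and shifts, using the DPL field as the example. It assumes the standard descriptor layout, where byte 5 is the access byte and the DPL sits in its bits 5-6; `RawDescriptor`, `set_dpl` and `get_dpl` are hypothetical names, not taken from anyone's kernel.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

using RawDescriptor = std::array<std::uint8_t, 8>;

constexpr std::size_t kAccessByte = 5;  // byte offset of the access byte
constexpr unsigned kDplShift = 5;       // DPL occupies bits 5-6 of that byte
constexpr unsigned kDplMask = 0x3u << kDplShift;

// Replace the DPL field and leave every other bit of the descriptor alone.
void set_dpl(RawDescriptor& d, std::uint8_t dpl) {
    assert(dpl <= 3);  // DPL is a two-bit field
    const unsigned cleared = d[kAccessByte] & ~kDplMask;
    d[kAccessByte] = static_cast<std::uint8_t>(cleared | ((dpl & 0x3u) << kDplShift));
}

// Read the DPL field back out.
std::uint8_t get_dpl(const RawDescriptor& d) {
    return static_cast<std::uint8_t>((d[kAccessByte] >> kDplShift) & 0x3u);
}
```

Because every access goes through explicit byte offsets and shifts, the result does not depend on how the compiler lays out bit fields or on struct packing.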

nullplan · **Joined:** Wed Aug 30, 2017 8:24 am **Posts:** 1605

thewrongchristian wrote:

This is why people don't like bitfields and packed structures (myself included).

You're assuming a layout for the bit fields, based on some mental image in your head.

But as far as I know, the C standard dictates no specific layout for bitfields, and doesn't specify packed structures at all.

This is one good point. There is also consensus among the ABIs as to how structures work. Less so for bit fields. Sometimes they are allocated from the bottom, sometimes from the top. Sometimes the base type matters, sometimes it doesn't beyond signedness. Indeed, according to ISO C, the only allowed base types are _Bool, signed int, unsigned int, and int. Everything else is an extension.

And because of all of this, an array it is for me.

Another point is of course that

Code:

gdt[1] = 0x00af9a000000ffff;

is just way shorter than

Code:

gdt[1].base_hi = 0;
gdt[1].flags = FLG_GRANULARITY | FLG_LONGMODE;
gdt[1].limit_hi = 0xf;
gdt[1].access = ACCESS_PRESENT | ACCESS_DPL_0 | ACCESS_NON_SYS_SEG;
gdt[1].type = TYPE_READABLE_CODE_SEG;
gdt[1].base_mid = 0;
gdt[1].base_lo = 0;
gdt[1].limit_lo = 0xffff;

Granted, the latter may be more readable, but even all of those words only help you with a CPU manual in hand, and in that case you are still no worse off having to decode the entire thing.
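One possible middle ground, sketched here under the standard descriptor layout: a `constexpr` helper that assembles the 64-bit value from named parts, so the table stays an array of plain integers while the fields remain readable at the call site. `make_descriptor` and its parameter split (access byte, 4-bit flags nibble) are assumptions for this example, not something from the posts above.

```cpp
#include <cstdint>

// Assemble a segment descriptor from its logical fields.
//   base   : 32-bit base address
//   limit  : 20-bit segment limit
//   access : access byte (present, DPL, S, type), bits 40-47
//   flags  : upper nibble of byte 6 (G, D/B, L, AVL), bits 52-55
constexpr std::uint64_t make_descriptor(std::uint32_t base, std::uint32_t limit,
                                        std::uint8_t access, std::uint8_t flags) {
    return (static_cast<std::uint64_t>(limit & 0xFFFFu))                  // limit 15:0
         | (static_cast<std::uint64_t>(base & 0xFFFFFFu) << 16)           // base 23:0
         | (static_cast<std::uint64_t>(access) << 40)                     // access byte
         | (static_cast<std::uint64_t>((limit >> 16) & 0xFu) << 48)       // limit 19:16
         | (static_cast<std::uint64_t>(flags & 0xFu) << 52)               // flags nibble
         | (static_cast<std::uint64_t>((base >> 24) & 0xFFu) << 56);      // base 31:24
}

// The long-mode kernel code segment from the one-line version:
// access 0x9A = present, DPL 0, non-system, readable code; flags 0xA = G | L.
static_assert(make_descriptor(0, 0xFFFFF, 0x9A, 0xA) == 0x00af9a000000ffffULL,
              "helper must reproduce the hand-written constant");
```

Since the helper is `constexpr`, entries built this way can still be initialized at compile time.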

jarebear · **Joined:** Sat Jul 15, 2023 11:26 am **Posts:** 7

thewrongchristian wrote:

This is why people don't like bitfields and packed structures (myself included).

You're assuming a layout for the bit fields, based on some mental image in your head.

But as far as I know, the C standard dictates no specific layout for bitfields, and doesn't specify packed structures at all.

Yeah, I checked, and the layout is implementation-defined rather than fixed by the standard. But just out of curiosity, I made a simple test program:

Code:

#include <cstdint>

namespace A {
#pragma pack(push, 1)
struct Descriptor {
  std::uint16_t segment_limit_low;
  std::uint16_t base_address_low;
  std::uint8_t base_address_mid;
  std::uint8_t type : 4;
  std::uint8_t system : 1;
  std::uint8_t descriptor_privilege_level : 2;
  std::uint8_t present : 1;
  std::uint8_t segment_limit_high : 4;
  std::uint8_t available : 1;
  std::uint8_t reserved : 1;
  std::uint8_t d_or_b : 1;
  std::uint8_t granularity : 1;
  std::uint8_t base_address_high;
};
#pragma pack(pop)
} // namespace A

namespace B {
struct Descriptor {
  void set_segment_limit_low(std::uint16_t segment_limit_low) {
    m_entry &= ~static_cast<std::uint64_t>(0xFFFF); // clear only bits 0-15
    m_entry |= segment_limit_low;
  }

  void set_dpl(std::uint8_t dpl) {
    std::uint64_t mask = 0xFFFF9FFFFFFFFFFF;
    m_entry &= mask;
    std::uint64_t temp = (static_cast<std::uint64_t>(dpl) << 45);
    m_entry |= temp;
  }

  void set_granularity(std::uint8_t granularity) {
    std::uint64_t mask = 0xFF7FFFFFFFFFFFFF; // clear bit 55 (granularity)
    m_entry &= mask;
    std::uint64_t temp = (static_cast<std::uint64_t>(granularity) << 55);
    m_entry |= temp;
  }

  void set_base_address_high(std::uint8_t base_address_high) {
    std::uint64_t mask = 0x00FFFFFFFFFFFFFF;
    m_entry &= mask;
    std::uint64_t temp = (static_cast<std::uint64_t>(base_address_high) << 56);
    m_entry |= temp;
  }

private:
  std::uint64_t m_entry;
};
} // namespace B

int main() {
  auto a = A::Descriptor{};
  auto b = B::Descriptor{};

  a.segment_limit_low = 0x3124;
  a.descriptor_privilege_level = 0x3;
  a.granularity = 0x1;
  a.base_address_high = 0x89;

  b.set_segment_limit_low(0x3124);
  b.set_dpl(0x3);
  b.set_granularity(0x1);
  b.set_base_address_high(0x89);

  return 0;
}

And, yep, they're the same.

Values in gdb:

Code:

(gdb) x/1xg &a  
0x7fffffffde70: 0x8980600000003124
(gdb) x/1xg &b.m_entry 
0x7fffffffde68: 0x8980600000003124

nullplan wrote:

Sometimes they are allocated from the bottom, sometimes from the top

This was my one worry. But it seems like it's working lol.

Octocontrabass · **Joined:** Mon Mar 25, 2013 7:01 pm **Posts:** 5146

Bitfield layout is defined by the ABI, so you don't have to worry about it as long as you're not trying to use the same struct with different ABIs.

You probably aren't going to use an x86 segment descriptor struct for more than one ABI. (And even if you do, it'll be x86 ABIs that all handle bitfields the same way.)
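If you want the build to flag an unexpected ABI anyway, a sketch along these lines can pin the assumption down. It assumes an x86 target where bit fields are allocated from the least significant bit upward (as GCC and Clang do there); the struct is the corrected one from the test program above, and `bitfield_layout_matches` is a hypothetical name.

```cpp
#include <cstdint>
#include <cstring>

#pragma pack(push, 1)
struct Descriptor {
    std::uint16_t segment_limit_low;
    std::uint16_t base_address_low;
    std::uint8_t base_address_mid;
    std::uint8_t type : 4;
    std::uint8_t system : 1;
    std::uint8_t descriptor_privilege_level : 2;
    std::uint8_t present : 1;
    std::uint8_t segment_limit_high : 4;
    std::uint8_t available : 1;
    std::uint8_t reserved : 1;
    std::uint8_t d_or_b : 1;
    std::uint8_t granularity : 1;
    std::uint8_t base_address_high;
};
#pragma pack(pop)

// Size is checkable at compile time.
static_assert(sizeof(Descriptor) == 8, "a segment descriptor must be exactly 8 bytes");

// Bit placement has to be checked at run time: fill in a few fields and
// compare the raw bytes with the value the hardware layout requires.
bool bitfield_layout_matches() {
    Descriptor d{};
    d.segment_limit_low = 0x3124;
    d.descriptor_privilege_level = 0x3;
    d.granularity = 0x1;
    d.base_address_high = 0x89;

    std::uint64_t raw = 0;
    std::memcpy(&raw, &d, sizeof(raw));
    return raw == 0x8980600000003124ULL;
}
```

The expected constant is the same value gdb printed for both descriptors in the test program.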
