OSDev.org

The Place to Start for Operating System Developers

All times are UTC - 6 hours




 Post subject: Re: PCI Configuration Process
PostPosted: Sat Sep 11, 2021 2:23 am 
Member

Joined: Tue Sep 15, 2020 8:07 am
Posts: 264
Location: London, UK
Octocontrabass wrote:
bloodline wrote:
PCI Config offset 14 (BAR1) can be configured via registers CF8, CF4, and CF3... Which is oddly cryptic as such "registers" don't appear to be located anywhere...

I took another look at the datasheet and it's in there, section 3.2, but it's not something you can manipulate in software: it's controlled by whether resistors are wired up to specific lines of the memory data bus.

But, QEMU is supposed to emulate a card with BAR1 enabled. Are you sure the BARs you're looking at belong to the Cirrus card?


Doh! :oops: Yup, when I did that run, I did have the -vga std option set #-o
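For anyone who trips over the same thing: one way to confirm which adapter QEMU is really presenting is to read the vendor/device ID dword at config offset 0 before trusting any BAR. What follows is only a sketch under assumptions. The function and enum names are made up for illustration, the actual port I/O (write the address to 0xCF8, read the data from 0xCFC) is left out so the snippet builds anywhere, and the ID pairs are the ones QEMU should report for -vga cirrus, -vga std and -vga virtio, so verify them against lspci or the QEMU monitor.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical names, for illustration only. */
enum qemu_vga { VGA_UNKNOWN, VGA_CIRRUS, VGA_STD, VGA_VIRTIO };

/* Build the value written to CONFIG_ADDRESS (I/O port 0xCF8) for PCI
 * configuration mechanism #1. The register is then read via port 0xCFC. */
uint32_t pci_config_address(uint8_t bus, uint8_t dev, uint8_t func, uint8_t offset)
{
    assert(dev < 32 && func < 8);
    return 0x80000000u                 /* enable bit */
         | ((uint32_t)bus  << 16)
         | ((uint32_t)dev  << 11)
         | ((uint32_t)func << 8)
         | (offset & 0xFCu);           /* dword-aligned register offset */
}

/* Classify config dword 0: vendor ID in bits 15:0, device ID in bits 31:16. */
enum qemu_vga identify_vga(uint32_t id_dword)
{
    uint16_t vendor = (uint16_t)(id_dword & 0xFFFFu);
    uint16_t device = (uint16_t)(id_dword >> 16);

    if (vendor == 0x1013 && device == 0x00B8) return VGA_CIRRUS; /* Cirrus GD5446 */
    if (vendor == 0x1234 && device == 0x1111) return VGA_STD;    /* QEMU std VGA  */
    if (vendor == 0x1AF4 && device == 0x1050) return VGA_VIRTIO; /* virtio GPU    */
    return VGA_UNKNOWN;
}
```

BAR1 sits at config offset 0x14, so for a card at bus 0, device 2, function 0 the address dword would be pci_config_address(0, 2, 0, 0x14). Checking identify_vga() first would have caught the -vga std mix-up straight away.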

But still, your pointing out my errors has helped me far more than the documentation did :D

Quote:
bloodline wrote:
Anyway, I might give up on the old Cirrus chip, follow @thewrongchristian's advice, and try to find some documentation for QEMU's virtio display adaptor...

The VIRTIO specification should help.



I have been looking at this one; it's perhaps more impenetrable than the Cirrus document :lol: But just setting -vga virtio as an option in QEMU has afforded a slight speed improvement... So that's the route to go down.

_________________
CuriOS: A single address space GUI based operating system built upon a fairly pure Microkernel/Nanokernel. Download latest bootable x86 Disk Image: https://github.com/h5n1xp/CuriOS/blob/main/disk.img.zip
Discord:https://discord.gg/zn2vV2Su


 Post subject: Re: PCI Configuration Process
PostPosted: Tue Sep 14, 2021 2:27 am 
Member

Joined: Wed Oct 01, 2008 1:55 pm
Posts: 3191
thewrongchristian wrote:
rdos wrote:
Octocontrabass wrote:
Probably for firmware. Actual GPU drivers use DMA to move things around, they usually don't access the framebuffer directly. (Does "memory schedules" refer to a type of DMA?)


Yes. The kernel driver will construct a memory schedule of work to be done, and then the PCIe device will read & write the schedule with DMA (bus mastering).

Anyway, this seems to explain why performance with LFB is slower than using the GPU interface. Something that appears to be a bit illogical at first. However, BARs never have the same performance as bus mastering. It also explains why the LFB should only be written and not read. When doing a read of a BAR, the CPU will need to wait for the PCIe device to read the contents from local RAM and then send it back as a PCIe transaction. With writing using the correct caching settings & a decently implemented PCIe device, the CPU shouldn't need to wait for the PCIe device to handle the request.


I think you're overthinking this.

This is the CL GD5446 we're talking about, a value PCI GFX chipset from the 1990s. Its "GPU" was a simple blitter, and all the VRAM was on the device side of the PCI bus, and relied on write combining for burst performance when writing to the FB.


Certainly, but I'm wondering why modern Intel graphics chips have such poor performance, and why AMD chips tend to perform a lot better. A poor implementation of the LFB via BARs could certainly explain it. Since Intel assumes everybody will use the GPU interface, they didn't bother with providing speed in the BAR interface.
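The "write the LFB, never read it" rule quoted above is usually handled with a shadow buffer in ordinary RAM. Here is a rough sketch of the idea, not anybody's actual driver: struct surface and the surface_* functions are hypothetical names, and the lfb pointer is assumed to be the BAR-mapped framebuffer (ideally mapped write-combining). All reads hit the shadow copy, and only the dirty rows are pushed to the device.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical surface: 'shadow' is cached system RAM, 'lfb' is the
 * framebuffer as seen through the BAR. */
struct surface {
    uint32_t          *shadow;
    volatile uint32_t *lfb;
    uint32_t width, height;            /* visible size in pixels        */
    uint32_t pitch;                    /* pixels per row                */
    uint32_t dirty_top, dirty_bottom;  /* dirty rows, [top, bottom)     */
};

/* Draw into the shadow copy only and remember which rows changed. */
void surface_put(struct surface *s, uint32_t x, uint32_t y, uint32_t argb)
{
    if (x >= s->width || y >= s->height)
        return;
    s->shadow[(size_t)y * s->pitch + x] = argb;
    if (s->dirty_top == s->dirty_bottom) {      /* nothing dirty yet */
        s->dirty_top = y;
        s->dirty_bottom = y + 1;
    } else {
        if (y < s->dirty_top)        s->dirty_top = y;
        if (y + 1 > s->dirty_bottom) s->dirty_bottom = y + 1;
    }
}

/* Reads (blending, scrolling, ...) are served from RAM, never from the BAR. */
uint32_t surface_get(const struct surface *s, uint32_t x, uint32_t y)
{
    return s->shadow[(size_t)y * s->pitch + x];
}

/* Push the dirty rows to the device using writes only. */
void surface_flush(struct surface *s)
{
    assert(s->shadow && s->lfb);
    for (uint32_t y = s->dirty_top; y < s->dirty_bottom; y++)
        for (uint32_t x = 0; x < s->width; x++)
            s->lfb[(size_t)y * s->pitch + x] = s->shadow[(size_t)y * s->pitch + x];
    s->dirty_top = s->dirty_bottom = 0;
}
```

Whether this closes the gap on any particular Intel or AMD part is a separate question. The sketch only removes CPU reads of the BAR from the picture.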


 Post subject: Re: PCI Configuration Process
PostPosted: Tue Sep 14, 2021 12:07 pm 
Member

Joined: Tue Apr 03, 2018 2:44 am
Posts: 401
rdos wrote:
thewrongchristian wrote:
rdos wrote:
Anyway, this seems to explain why performance with LFB is slower than using the GPU interface. Something that appears to be a bit illogical at first. However, BARs never have the same performance as bus mastering. It also explains why the LFB should only be written and not read. When doing a read of a BAR, the CPU will need to wait for the PCIe device to read the contents from local RAM and then send it back as a PCIe transaction. With writing using the correct caching settings & a decently implemented PCIe device, the CPU shouldn't need to wait for the PCIe device to handle the request.


I think you're overthinking this.

This is the CL GD5446 we're talking about, a value PCI GFX chipset from the 1990s. Its "GPU" was a simple blitter, and all the VRAM was on the device side of the PCI bus, and relied on write combining for burst performance when writing to the FB.


Certainly, but I'm wondering why modern Intel graphics chips have such poor performance, and why AMD chips tend to perform a lot better. A poor implementation of the LFB via BARs could certainly explain it. Since Intel assumes everybody will use the GPU interface, they didn't bother with providing speed in the BAR interface.


Doesn't the Intel GPU operate exclusively via the regular shared system RAM?

So, writing to the framebuffer is just a case of writing to the physical RAM you've indicated to the GPU to pull the framebuffer contents from. As I understand it, BAR2 indicates the physical memory address of this shared framebuffer RAM, but I've not poked it. I'm still mostly QEMU based at the moment.

The performance (or lack thereof) in the Intel GPU will be a function of the GPU itself. AMD are probably just better at GPUs than Intel.


 Post subject: Re: PCI Configuration Process
PostPosted: Wed Sep 15, 2021 7:22 am 
Member

Joined: Wed Oct 01, 2008 1:55 pm
Posts: 3191
thewrongchristian wrote:
So, writing to the framebuffer is just a case of writing to the physical RAM you've indicated to the GPU to pull the framebuffer contents from. As I understand it, BAR2 indicates the physical memory address of this shared framebuffer RAM, but I've not poked it. I'm still mostly QEMU based at the moment.


No, when you read or write to a BAR area you create PCIe requests to the GPU, which it needs to serve in real time. It will typically map some of its local RAM to the BAR, and then it is the responsibility of the GPU to route between those. If you do a quick, pipelined solution, this could indeed end up highly inefficient. I know because I have implemented BARs myself, and I decided I needed to do bus-mastering requests to main memory to achieve the throughput I wanted. If I let the CPU read the BAR, it's too slow, but when the PCIe card uses bus mastering, I can read it from main memory very fast.
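To make the bus-mastering side a bit more concrete, here is a minimal sketch of the kind of "memory schedule" mentioned earlier in the thread: a ring of descriptors in main memory that the driver fills and the device walks with DMA. Everything here is an assumption for illustration. The descriptor layout, the ownership bit and the names are invented, and no real device uses exactly this format.

```c
#include <assert.h>
#include <stdint.h>

#define RING_SIZE            8u   /* hypothetical; must be a power of two */
#define DESC_OWNED_BY_DEVICE 1u   /* hypothetical ownership flag          */

/* One unit of work: "transfer 'length' bytes at physical address 'phys_addr'". */
struct dma_desc {
    uint64_t phys_addr;
    uint32_t length;
    uint32_t flags;
};

struct dma_ring {
    struct dma_desc desc[RING_SIZE]; /* a real driver puts this in DMA-visible memory */
    uint32_t head;                   /* next slot the driver fills                    */
    uint32_t tail;                   /* oldest slot not yet completed by the device   */
};

/* Queue one buffer for the device. Returns 0 on success, -1 if the ring is full. */
int ring_submit(struct dma_ring *r, uint64_t phys, uint32_t len)
{
    if (r->head - r->tail == RING_SIZE)
        return -1;
    struct dma_desc *d = &r->desc[r->head % RING_SIZE];
    assert(!(d->flags & DESC_OWNED_BY_DEVICE));
    d->phys_addr = phys;
    d->length = len;
    /* A real driver needs a write barrier here so the device never sees
     * the ownership bit before the other fields. */
    d->flags = DESC_OWNED_BY_DEVICE;
    r->head++;
    return 0;
}

/* Collect descriptors the device has handed back (ownership bit cleared). */
unsigned ring_reap(struct dma_ring *r)
{
    unsigned done = 0;
    while (r->tail != r->head &&
           !(r->desc[r->tail % RING_SIZE].flags & DESC_OWNED_BY_DEVICE)) {
        r->tail++;
        done++;
    }
    return done;
}
```

The CPU only ever touches main memory here. The device fetches the descriptors and moves the data itself, which is why this path avoids the stalls that CPU reads through a BAR suffer from.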


 Post subject: Re: PCI Configuration Process
PostPosted: Wed Sep 15, 2021 11:41 am 
Member

Joined: Mon Mar 25, 2013 7:01 pm
Posts: 5099
thewrongchristian wrote:
Doesn't the Intel GPU operate exclusively via the regular shared system RAM?

Yes. Recent ones participate in the cache coherency protocol, too.

thewrongchristian wrote:
As I understand it, BAR2 indicates the physical memory address of this shared framebuffer RAM, but I've not poked it.

BAR2 provides a window into the GPU's view of RAM according to the GPU's page tables. On a recent GPU with coherent shared memory, I would expect it to be slower than directly accessing the memory from the CPU's view.
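Since the thread keeps coming back to what a given BAR actually describes, here is a small sketch of decoding one from its raw config-space value. The struct and function names are hypothetical. The "probe" arguments are assumed to be what you read back after writing 0xFFFFFFFF to the BAR (and to the following BAR slot when the first one turns out to be 64-bit), with the original values restored afterwards.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical result type, for illustration only. */
struct bar_info {
    bool     is_io;
    bool     is_64bit;
    bool     prefetchable;
    uint64_t base;
    uint64_t size;   /* 0 means the BAR is not implemented */
};

/* lo/hi: original BAR dwords. lo_probe/hi_probe: readback after writing
 * all ones. 'hi' and 'hi_probe' are ignored unless the BAR is 64-bit. */
struct bar_info decode_bar(uint32_t lo, uint32_t hi,
                           uint32_t lo_probe, uint32_t hi_probe)
{
    struct bar_info b = {0};

    if (lo & 1u) {                               /* I/O space BAR */
        uint32_t mask = lo_probe & ~0x3u;
        b.is_io = true;
        b.base = lo & ~0x3u;
        if (mask != 0) {
            if ((mask & 0xFFFF0000u) == 0)       /* device decodes 16 bits only */
                mask |= 0xFFFF0000u;
            b.size = (uint32_t)(~mask + 1u);
        }
        return b;
    }

    b.is_64bit     = ((lo >> 1) & 3u) == 2u;     /* type field, bits 2:1 */
    b.prefetchable = (lo & 8u) != 0;             /* bit 3                */
    b.base         = lo & ~0xFu;

    uint64_t mask = lo_probe & ~0xFu;
    if (b.is_64bit) {
        b.base |= (uint64_t)hi << 32;
        mask   |= (uint64_t)hi_probe << 32;
    }
    if (mask != 0) {
        if (!b.is_64bit)
            mask |= 0xFFFFFFFF00000000ull;       /* extend a 32-bit mask */
        b.size = ~mask + 1;                      /* lowest writable bit = size */
    }
    return b;
}
```

For example, decode_bar(0xFC000008, 0, 0xFE000008, 0) describes a prefetchable 32-bit memory BAR of 32 MiB at 0xFC000000. The numbers are made up for the example and are not taken from any particular card.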

