What is the cache\caching?

Coconut9 · **Posted:** Mon Oct 16, 2017 7:53 am

I read Intel's manual and I think I understand the basic idea of what caches are, but I have a lot of questions. Can someone explain to me simply the problem, the solution, how I can use it, what I need to check, and the pitfalls?

Octocontrabass · **Joined:** Mon Mar 25, 2013 7:01 pm **Posts:** 5145

Does this answer any of your questions?

Coconut9 · **Posted:** Mon Oct 16, 2017 11:33 am

Octocontrabass wrote:

Does this answer any of your questions?

Most of them, but I want an example of the instructions that manage caching, and I also want to know if a program can do something with the caches that creates a problem in the kernel.
I have no problem with disabling caches entirely. How do I do that?

stlw · **Joined:** Fri Apr 04, 2008 6:43 am **Posts:** 357

ARISTOS wrote:

Most of them, but I want an example of the instructions that manage caching, and I also want to know if a program can do something with the caches that creates a problem in the kernel.

Caches exist for performance only; the CPU works hard to keep them coherent with memory, so you won't notice any logical difference.
Of course, there are also non-coherent memory regions where caches can hurt; that is exactly why memory types were invented.
These regions should be set explicitly to the UNCACHEABLE memory type through either paging or the MTRRs.

For example, the local APIC space is a sort of memory-mapped device, and it must be accessed through UNCACHEABLE operations.
Violating this rule may lead to unpredictable behavior, up to a machine-check exception and system halt.

ARISTOS wrote:

I have no problem with disabling caches entirely. How do I do that?

By setting the CR0.CD bit to 1 and running the WBINVD instruction.
This will lead to a >1000x slowdown in execution.

Coconut9 · **Posted:** Mon Oct 16, 2017 10:20 pm

stlw wrote:

ARISTOS wrote:

Most of them, but I want an example of the instructions that manage caching, and I also want to know if a program can do something with the caches that creates a problem in the kernel.

Caches exist for performance only; the CPU works hard to keep them coherent with memory, so you won't notice any logical difference.
Of course, there are also non-coherent memory regions where caches can hurt; that is exactly why memory types were invented.
These regions should be set explicitly to the UNCACHEABLE memory type through either paging or the MTRRs.

For example, the local APIC space is a sort of memory-mapped device, and it must be accessed through UNCACHEABLE operations.
Violating this rule may lead to unpredictable behavior, up to a machine-check exception and system halt.

ARISTOS wrote:

I have no problem with disabling caches entirely. How do I do that?

By setting the CR0.CD bit to 1 and running the WBINVD instruction.
This will lead to a >1000x slowdown in execution.

So how do I manage them in a multicore environment? I cannot find it anywhere.

Brendan · **Posted:** Tue Oct 17, 2017 2:52 am

Hi,

ARISTOS wrote:

Can someone explain to me simply the problem, the solution, how I can use it, what I need to check, and the pitfalls?

ARISTOS wrote:

So how do I manage them in a multicore environment? I cannot find it anywhere.

It's hard to answer your questions without having any idea why you're asking them; so...

Why do you think there is a problem that needs to be solved; and why do you think caches need to be managed?

Cheers,

Brendan

Coconut9 · **Posted:** Tue Oct 17, 2017 7:09 am

Brendan wrote:

It's hard to answer your questions without having any idea why you're asking them; so...

Why do you think there is a problem that needs to be solved; and why do you think caches need to be managed?

My question is: "Can my kernel have errors because of the caches that I should know about and can fix, or will I not have to deal with it?"

BrightLight · **Posted:** Tue Oct 17, 2017 7:51 am

ARISTOS wrote:

My question is: "Can my kernel have errors because of the caches that I should know about and can fix, or will I not have to deal with it?"

Normally, caching doesn't cause any problems. As others have said, caching is mostly about performance and nothing more, so you should never disable it: doing so will slow down your code horribly on real hardware (but not so much on emulators, because there's no reason for emulators to implement caching). Basically, when you write to memory, the CPU may write to a cache instead of the actual RAM to speed up the write, and when you read that location back it will likewise read from the cache rather than the actual RAM to speed up the read. Of course, this does harm in some cases, the main one being memory-mapped I/O. Take any MMIO hardware, the I/O APIC for example: it expects data to be fed to it directly and not to a cache. If the I/O APIC memory range had some kind of caching enabled, then sometimes when you write to it the data would stay in the cache and not go to the I/O APIC itself. As such, caching should be disabled for any MMIO hardware present.

The caching of specific memory ranges is generally done via MTRR, and the firmware sets this up for you, so you shouldn't need to mess with it. Of course, PAT also lets you configure caching of specific memory ranges, but the firmware doesn't do this for you because it doesn't set up paging for you.

Generally, you shouldn't worry about the exact configuration of the cache; all you should do is make sure bits 30 (CD) and 29 (NW) of the CR0 register are both zero, which enables caching globally with the normal write policy (write-back wherever the memory type allows it). That should be enough to let you achieve decent performance on real hardware, while avoiding problems with MMIO peripherals as well.
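
To make the CR0 bits above concrete, here is a minimal sketch in C of how the two flags combine. The enum and function names are made up for illustration; only the bit positions (29 and 30) come from the post, and the meaning of each combination follows Intel's manual as I understand it. Actually reading and writing CR0 needs privileged `mov` instructions, which are left out so the snippet compiles as ordinary code.

```c
#include <stdint.h>

/* CR0 cache-control bits. */
#define CR0_NW (UINT64_C(1) << 29) /* Not Write-through */
#define CR0_CD (UINT64_C(1) << 30) /* Cache Disable */

/* Hypothetical names for the four CD/NW combinations. */
enum cr0_cache_mode {
    CACHE_NORMAL,        /* CD=0, NW=0: caching enabled */
    CACHE_INVALID,       /* CD=0, NW=1: rejected by the CPU (#GP) */
    CACHE_NO_FILL,       /* CD=1, NW=0: no new lines are brought in */
    CACHE_NO_FILL_NO_WT  /* CD=1, NW=1: no fills, coherency not maintained */
};

/* Classify a CR0 value by its CD and NW bits. */
static enum cr0_cache_mode cr0_decode_cache_mode(uint64_t cr0)
{
    int cd = (cr0 & CR0_CD) != 0;
    int nw = (cr0 & CR0_NW) != 0;

    if (!cd)
        return nw ? CACHE_INVALID : CACHE_NORMAL;
    return nw ? CACHE_NO_FILL_NO_WT : CACHE_NO_FILL;
}

/* Value to load into CR0 so caching is enabled; other bits are untouched. */
static uint64_t cr0_enable_caching(uint64_t cr0)
{
    return cr0 & ~(CR0_CD | CR0_NW);
}
```

A kernel would read CR0, pass it through something like `cr0_enable_caching`, and write the result back early in boot.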

Coconut9 · **Posted:** Tue Oct 17, 2017 11:03 am

omarrx024 wrote:

Of course, this does harm in some cases, the main one being memory-mapped I/O. Take any MMIO hardware, the I/O APIC for example: it expects data to be fed to it directly and not to a cache. If the I/O APIC memory range had some kind of caching enabled, then sometimes when you write to it the data would stay in the cache and not go to the I/O APIC itself. As such, caching should be disabled for any MMIO hardware present.

OK, but I did not understand this. Can I disable caching only for some addresses (for example the APIC's)? If not, when I want to write to the APIC must I disable the caches, write to the APIC and enable them again, or use some bypass instruction such as MOVNTI? By the way, I know how MOVNTI works, but how do the SFENCE, MFENCE and LFENCE instructions work?

Korona · **Joined:** Thu May 17, 2007 1:27 pm **Posts:** 999

The FENCE instructions have almost nothing to do with the cache (well, they are implemented on top of the cache coherence protocol but the visible effects of the instructions are independent of caching).

As omarrx024 said, you can disable caching to specific regions using the MTRRs and the PAT. However, in almost all situations, the BIOS already performs the correct configurations for you. The only case in which you have to do the management yourself is if you want to turn regular RAM into WC memory to allow non-snooping PCI transactions to that memory. If you do not know that you need to do that, you probably don't want to do it.

The only type of cache you need to manage is the TLB using the invlpg instruction, but that is not related to RAM caching.

Brendan · **Posted:** Tue Oct 17, 2017 8:26 pm

Hi,

ARISTOS wrote:

Brendan wrote:

It's hard to answer your questions without having any idea why you're asking them; so...

Why do you think there is a problem that needs to be solved; and why do you think caches need to be managed?

My question is: "can my kernel have error because of the caches that I should know and can fix or I would not have to deal with it?"

For normal (instruction, data) caches; there's a few special cases where you have to be aware of the caches; but you'd know if you're doing something like that, and if you're not doing something like that then you can ignore caches. One example would be RAM bandwidth benchmarking or code to test for RAM faults; where you don't want to accidentally measure/test the cache instead of measuring/testing RAM (and you'd have to either disable caches somehow or use CLFLUSH). The only other example I can think of is high-end security algorithms where caches can become a side-channel (and could potentially leak information to an attacker).

For normal (instruction, data) caches; there's also some cases where you don't need to care about caches, but could potentially improve performance if you are aware of caches. The main example of this is "software prefetching" (and that's a relatively complex topic on its own). Another example might be page colouring/cache colouring, which is an optimisation intended to improve the cache efficiency a little, where you need to determine the characteristics of the cache (total size, associativity) to figure out the perfect number of colours to have (but where "wrong number of colours" only reduces the theoretical benefits a tiny little bit).

For TLBs (which are like a cache of page translations), you do have to be a lot more aware because they aren't cache coherent - you are responsible for invalidating TLBs where necessary (and can corrupt data if you don't invalidate TLBs when you must).
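
As a rough illustration of the TLB invalidation just described, here is a small C sketch that walks a virtual address range and invalidates every 4 KiB page it touches. The function names and the callback design are assumptions made for this example; in a real kernel the callback would execute `invlpg`, which is privileged, so a recording stand-in is used here to keep the snippet usable outside a kernel.

```c
#include <stddef.h>
#include <stdint.h>

#define PAGE_SIZE UINT64_C(4096)

/* Callback that invalidates the TLB entry for one page.
 * In a kernel this could be (GCC syntax):
 *   asm volatile("invlpg (%0)" :: "r"(vaddr) : "memory");  */
typedef void (*invalidate_fn)(uint64_t vaddr, void *ctx);

/* Invalidate every page overlapping [start, start + len).
 * Assumes start + len does not wrap around. Returns the page count. */
static size_t tlb_invalidate_range(uint64_t start, uint64_t len,
                                   invalidate_fn invalidate, void *ctx)
{
    if (len == 0)
        return 0;

    uint64_t first = start & ~(PAGE_SIZE - 1);
    uint64_t last  = (start + len - 1) & ~(PAGE_SIZE - 1);
    size_t count = 0;

    for (uint64_t page = first; ; page += PAGE_SIZE) {
        invalidate(page, ctx);
        count++;
        if (page == last)
            break;
    }
    return count;
}

/* Stand-in callback: remembers the last page it was asked to invalidate. */
static void record_page(uint64_t vaddr, void *ctx)
{
    *(uint64_t *)ctx = vaddr;
}
```

Note that `invlpg` only affects the CPU it runs on, so on a multi-CPU system the other CPUs have to be told to do the same (typically with an inter-processor interrupt).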

The other thing that is vaguely related is the CPU's write-combining buffer; which is used for non-temporal stores, and for some "special case" devices (often video RAM and nothing else) if the OS enables/configures "write-combining"; where you might need fences (e.g. MFENCE, SFENCE) if and only if other CPUs might care about the order that writes are actually done.

ARISTOS wrote:

OK, but I did not understand this. Can I disable caching only for some addresses (for example the APIC's)? If not, when I want to write to the APIC must I disable the caches, write to the APIC and enable them again, or use some bypass instruction such as MOVNTI? By the way, I know how MOVNTI works, but how do the SFENCE, MFENCE and LFENCE instructions work?

To disable caches globally, there's a flag in CR0 (which mostly only prevents new stuff from being cached and doesn't prevent old stuff that's already in caches from remaining in caches; so if you disable caches in CR0 you also need to use WBINVD to get rid of any old stuff that's already in caches).

To change "caching type" (including changing an area to the "uncached" type) for a specific area of the physical address space you can use MTRRs (but you don't need to care about this because firmware is responsible for ensuring safe defaults and things like IO APIC and local APIC will already be "uncached", and if you do modify MTRRs it's a little complicated because you have to ensure all CPUs end up configured the same and there's a strict order things need to be done to avoid problems while MTRRs are being changed).
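
For anyone curious what an MTRR setting looks like, here is a hedged C sketch that only computes the two values for one variable-range MTRR (the `IA32_MTRR_PHYSBASEn` / `IA32_MTRR_PHYSMASKn` pair in Intel's manual). The struct and function names are invented for this example, and it deliberately does not write any MSRs: the actual update has to follow the strict sequence on all CPUs mentioned above. It assumes the range is a power of two in size and aligned to that size.

```c
#include <stdbool.h>
#include <stdint.h>

/* Memory type encodings used by the MTRRs. */
enum mtrr_type {
    MTRR_UC = 0, /* uncached */
    MTRR_WC = 1, /* write-combining */
    MTRR_WT = 4, /* write-through */
    MTRR_WP = 5, /* write-protected */
    MTRR_WB = 6  /* write-back */
};

/* Hypothetical container for the two MSR values of one variable range. */
struct mtrr_pair {
    uint64_t physbase;
    uint64_t physmask;
};

#define MTRR_MASK_VALID (UINT64_C(1) << 11)

/* Encode one range. phys_bits is the CPU's physical address width
 * (reported by CPUID leaf 0x80000008). Returns false if the range
 * cannot be expressed by a single variable-range MTRR. */
static bool mtrr_encode(uint64_t base, uint64_t size, enum mtrr_type type,
                        unsigned phys_bits, struct mtrr_pair *out)
{
    if (phys_bits < 32 || phys_bits > 52)
        return false;
    if (size < 4096 || (size & (size - 1)) != 0)
        return false;                 /* must be a power of two, >= 4 KiB */
    if (size > (UINT64_C(1) << phys_bits))
        return false;
    if (base & (size - 1))
        return false;                 /* base must be aligned to size */

    uint64_t addr_mask = ((UINT64_C(1) << phys_bits) - 1) & ~UINT64_C(0xFFF);
    if (base & ~addr_mask)
        return false;                 /* base beyond the address width */

    out->physbase = base | (uint64_t)type;
    out->physmask = (~(size - 1) & addr_mask) | MTRR_MASK_VALID;
    return true;
}
```

For example, a 256 MiB write-combining window at 0xE0000000 on a CPU with 36 physical address bits would be encoded with `mtrr_encode(0xE0000000, 0x10000000, MTRR_WC, 36, &pair)`.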

To change "caching type" (including changing an area to the "uncached" type) for specific virtual page/s, there's flags in page table entries (and for newer CPUs, a table called PAT that determines what the bits actually do).

Note: for newer CPUs (excluding Pentium, 80486, etc) there are actually 5 possible "caching types" (write-back, write-through, read-only, write-combining and uncached); and there's a bunch of rules that determine how the CPU merges "caching type from MTRR" with "caching type from page tables" to find the resulting caching type; where those rules can mostly be summarised as "the result is whatever caches the least", except for a few corner cases involving "write-combining plus uncached" where "write-combining" takes precedence.

Cheers,

Brendan

Coconut9 · **Posted:** Tue Oct 17, 2017 10:44 pm

Brendan wrote:

To change "caching type" (including changing an area to the "uncached" type) for specific virtual page/s, there's flags in page table entries (and for newer CPUs, a table called PAT that determines what the bits actually do).

OK, I think you are telling me that if I set the "Cache Disable" (4th) bit in the page directory entry, the physical address specified in that entry will be read/written without caches, won't it?

Korona · **Joined:** Thu May 17, 2007 1:27 pm **Posts:** 999

Read the Intel manual for the exact interactions with the MTRRs. It's not that simple.

And let me reiterate my warning: You do not want to disable caches for RAM, unless you know exactly why you're doing it.

Brendan · **Posted:** Wed Oct 18, 2017 12:51 am

Hi,

ARISTOS wrote:

Brendan wrote:

To change "caching type" (including changing an area to the "uncached" type) for specific virtual page/s, there's flags in page table entries (and for newer CPUs, a table called PAT that determines what the bits actually do).

OK, I think you are telling me that if I set the "Cache Disable" (4th) bit in the page directory entry, the physical address specified in that entry will be read/written without caches, won't it?

Sort of. It's better to think of it as a "no fill mode" (and not a literal "cache disabled"), where cache misses don't cause the cache line to be fetched into the cache (but anything that is already in the cache remains in the cache and can still cause cache hits). This allows for some bizarre hackery (e.g. "cache as RAM" where you load stuff into the cache then enter "no fill mode" without doing WBINVD, and whatever was preloaded into the cache stays in the cache and acts like RAM; which is a little useful for firmware before it configures the memory controller and can use RAM like normal).

Note that there's actually 3 bits in page table entries now. Originally (80486?) there were 2 bits (CD/"cache disable" and NW/"Not Write-through") and 3 "caching types" (uncached, write-back and write-through). Then (P6?) they added PAT/"Page Attribute Table" and used a third bit plus the original 2 bits as a 3-bit index into a table, where the actual "caching type" for a page is determined by whatever value is in the corresponding entry in the table. This allowed them to add a few more "caching types" (read-only and write-combining) that could be chosen if your OS supported PAT (by reconfiguring the table, and possibly by using the new 3rd bit). For backward compatibility the default values in the PAT (e.g. when the CPU is first turned on) are chosen so that the old 2 bits still end up choosing what they used to choose.
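
The 3-bit index into the PAT can be sketched in C roughly as follows. The helper names are invented for this example. The bit positions are those of a 4 KiB page table entry (where the bits are called PWT, PCD and PAT), and the power-on PAT value and type encodings are taken from Intel's manual as far as I recall them, so treat this as an illustration rather than a reference.

```c
#include <stdint.h>

/* Cache-control bits in a 4 KiB page table entry. */
#define PTE_PWT (UINT64_C(1) << 3)
#define PTE_PCD (UINT64_C(1) << 4)
#define PTE_PAT (UINT64_C(1) << 7)

/* Power-on value of the PAT MSR: entries 0..3 are WB, WT, UC-, UC,
 * and entries 4..7 repeat them, so the old two bits keep their meaning. */
#define PAT_POWER_ON_DEFAULT UINT64_C(0x0007040600070406)

/* Type encodings stored in each PAT entry. */
enum pat_type {
    PAT_UC = 0, PAT_WC = 1, PAT_WT = 4,
    PAT_WP = 5, PAT_WB = 6, PAT_UC_MINUS = 7
};

/* Build the 3-bit PAT index from a page table entry. */
static unsigned pte_pat_index(uint64_t pte)
{
    unsigned pwt = (pte & PTE_PWT) ? 1u : 0u;
    unsigned pcd = (pte & PTE_PCD) ? 1u : 0u;
    unsigned pat = (pte & PTE_PAT) ? 1u : 0u;
    return (pat << 2) | (pcd << 1) | pwt;
}

/* Look up the type stored in one of the eight PAT entries. */
static enum pat_type pat_lookup(uint64_t pat_msr, unsigned index)
{
    return (enum pat_type)((pat_msr >> ((index & 7u) * 8u)) & 7u);
}

/* Return a PAT value with one entry replaced (e.g. to add write-combining). */
static uint64_t pat_set_entry(uint64_t pat_msr, unsigned index,
                              enum pat_type type)
{
    unsigned shift = (index & 7u) * 8u;
    return (pat_msr & ~(UINT64_C(0xFF) << shift)) | ((uint64_t)type << shift);
}
```

With the default table, a page with only PCD set resolves to entry 2; an OS that wants write-combining pages could put `PAT_WC` into an otherwise redundant entry such as entry 4 and select it with the PAT bit.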

Of course there's very few sane reasons to want to mess with cache disable/no fill mode, or write-through or PAT or anything else. For memory mapped devices it's better to use MTRRs (and firmware will set them appropriately for almost everything, and the only thing you might want to change is setting video card's memory as "write-combining"); and for normal RAM you always want "write-back" for performance (and it already will be "write-back"). The most common reason to mess with PAT is that there's so many video cards that you run out of MTRRs (and have to resort to using PAT to make some video card's memory "write-combining").

Cheers,

Brendan

Octocontrabass · **Joined:** Mon Mar 25, 2013 7:01 pm **Posts:** 5145

Brendan wrote:

Originally (80486?) there were 2 bits (CD/"cache disable" and NW/"Not Write-through") and 3 "caching types" (uncached, write-back and write-through).

Originally in the 486, there were two bits to control the overall cache behavior (CE/"cache enable" and WT/"write through") and three valid combinations of these bits (cache fully enabled, no cache fills, and no cache fills plus write hits don't update main memory). Very early in the 486's lifespan, Intel inverted their meanings and renamed them CD/NW, but otherwise they behaved the same.

There were two more bits to control the cache policy (PCD/"page cache disable" and PWT/"page write through") and three different cache policies that could be specified with these bits (write-back, write-through, and uncached). Switching between write-back and write-through only affected chipset write-back caches, since the 486's internal cache didn't support write-back.

Later 486 models could use the internal cache in write-back mode if the chipset supported it. (Some competitors to the 486 used the invalid CD/NW bit combination to enable write-back, for chipsets that didn't support Intel's cache policy system.)
