Hi,
ARISTOS wrote:
Brendan wrote:
It's hard to answer your questions without having any idea why you're asking them; so...
Why do you think there is a problem that needs to be solved; and why do you think caches need to be managed?
My question is: "Can my kernel have errors caused by the caches that I should know about and fix, or do I not have to deal with them?"
For normal (instruction, data) caches, there are a few special cases where you have to be aware of the caches; but you'd know if you're doing something like that, and if you're not then you can ignore caches. One example is RAM bandwidth benchmarking or code that tests for RAM faults, where you don't want to accidentally measure/test the cache instead of RAM (so you'd have to either disable caches somehow or use CLFLUSH). The only other example I can think of is high-end security algorithms, where caches can become a side-channel (and could potentially leak information to an attacker).
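To make the CLFLUSH case a bit more concrete, here's a rough sketch of one step of a RAM test that flushes the buffer before reading it back. The helper names (`flush_range`, `test_pattern`) and the 64-byte line size are assumptions of mine for illustration; real code should get the line size from CPUID, and hardware prefetchers can still pull lines back into the cache behind your back.

```c
#include <stddef.h>
#include <stdint.h>
#ifdef __SSE2__
#include <emmintrin.h>   /* _mm_clflush, _mm_mfence */
#endif

/* Assumption: 64-byte cache lines. Real code should read the CLFLUSH
   line size from CPUID instead of hard-coding it. */
#define CACHE_LINE_SIZE 64u

/* Hypothetical helper: write back and evict every cache line covering
   [start, start + len), so later reads are less likely to hit the cache. */
static void flush_range(const void *start, size_t len)
{
#ifdef __SSE2__
    uintptr_t addr = (uintptr_t)start & ~(uintptr_t)(CACHE_LINE_SIZE - 1);
    uintptr_t end  = (uintptr_t)start + len;

    for (; addr < end; addr += CACHE_LINE_SIZE)
        _mm_clflush((const void *)addr);
    _mm_mfence();   /* order the flushes against the accesses that follow */
#else
    (void)start;    /* not built for x86/SSE2: nothing to do in this sketch */
    (void)len;
#endif
}

/* Hypothetical RAM test step: write a pattern, flush it out of the caches,
   then read it back. Returns the number of mismatching words. */
static size_t test_pattern(volatile uint32_t *buf, size_t words, uint32_t pattern)
{
    size_t bad = 0;

    for (size_t i = 0; i < words; i++)
        buf[i] = pattern;

    flush_range((const void *)buf, words * sizeof *buf);

    for (size_t i = 0; i < words; i++)
        if (buf[i] != pattern)
            bad++;

    return bad;
}
```

CLFLUSH writes modified lines back before evicting them, so on working RAM `test_pattern()` should report zero mismatches.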
For normal (instruction, data) caches; there's also some cases where you don't need to care about caches, but could potentially improve performance if you are aware of caches. The main example of this is "software prefetching" (and that's a relatively complex topic on its own). Another example might be page colouring/cache colouring, which is an optimisation intended to improve the cache efficiency a little, where you need to determine the characteristics of the cache (total size, associativity) to figure out the perfect number of colours to have (but where "wrong number of colours" only reduces the theoretical benefits a tiny little bit).
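As a rough illustration of the colouring arithmetic, a common formulation is "one colour per page-sized slice of a cache way". The function names below are hypothetical, and this is only a sketch under that assumption (it ignores things like hashed/sliced last-level caches):

```c
#include <stdint.h>

/* Number of colours, assuming a physically indexed cache:
   colours = cache_size / (associativity * page_size).
   Returns 1 when a single way is no larger than a page (no colouring). */
static unsigned colour_count(uint64_t cache_size, unsigned ways, uint64_t page_size)
{
    uint64_t way_size = cache_size / ways;
    uint64_t n = way_size / page_size;

    return n ? (unsigned)n : 1u;
}

/* Colour of the physical page containing phys_addr. */
static unsigned page_colour(uint64_t phys_addr, uint64_t page_size, unsigned colours)
{
    return (unsigned)((phys_addr / page_size) % colours);
}
```

For example, a 256 KiB 8-way cache with 4 KiB pages gives 8 colours, and the page at physical address 0x9000 (page number 9) has colour 1. A physical page allocator could then try to give each virtual page a physical page with a matching colour.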
For TLBs (which are like a cache of page translations), you do have to be a lot more aware because they aren't cache coherent - you are responsible for invalidating TLBs where necessary (and can corrupt data if you don't invalidate TLBs when you must).
The other thing that is vaguely related is the CPU's write-combining buffer; which is used for non-temporal stores, and for some "special case" devices (often video RAM and nothing else) if the OS enables/configures "write-combining"; where you might need fences (e.g. MFENCE, SFENCE) if and only if other CPUs might care about the order that writes are actually done.
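To show where a fence would go, here's a minimal sketch that fills a buffer with non-temporal stores (MOVNTI via the `_mm_stream_si32` intrinsic) and then uses SFENCE before telling another CPU the data is ready. The `fill_nt`/`publish` names and the `ready` flag are made up for the example; it assumes a compiler that provides `<emmintrin.h>` and falls back to plain stores otherwise.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdatomic.h>
#ifdef __SSE2__
#include <emmintrin.h>   /* _mm_stream_si32 (MOVNTI), _mm_sfence */
#endif

/* Hypothetical helper: fill dst[] with non-temporal stores, which go
   through the write-combining buffers instead of the normal caches. */
static void fill_nt(int32_t *dst, size_t count, int32_t value)
{
    for (size_t i = 0; i < count; i++) {
#ifdef __SSE2__
        _mm_stream_si32((int *)&dst[i], value);
#else
        dst[i] = value;   /* not built for x86/SSE2: plain store */
#endif
    }
#ifdef __SSE2__
    /* Non-temporal stores are weakly ordered; SFENCE makes sure they are
       globally visible before any store that comes after the fence. */
    _mm_sfence();
#endif
}

/* Hypothetical producer: write the data, then set a flag that another
   CPU polls. Without the SFENCE in fill_nt() the other CPU could see
   ready == 1 before it sees the data. */
static void publish(int32_t *dst, size_t count, int32_t value, atomic_int *ready)
{
    fill_nt(dst, count, value);
    atomic_store_explicit(ready, 1, memory_order_release);
}
```

If nothing else cares about the order (e.g. a single CPU writing and later reading its own buffer), the fence isn't needed.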
ARISTOS wrote:
OK, but I did not understand this. Can I disable caching only for some addresses (for example the APIC's)? If not, when I want to write to the APIC must I disable the caches, write to the APIC and enable them again, or use some bypass instruction such as MOVNTI? By the way, I know how MOVNTI works, but how do the SFENCE, MFENCE and LFENCE instructions work?
To disable caches globally, there's the CD ("cache disable") flag in CR0 (which mostly only prevents new stuff from being cached and doesn't prevent old stuff that's already in caches from remaining in caches; so if you disable caches in CR0 you also need to use WBINVD to get rid of any old stuff that's already in caches).
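For the CR0 part, here's a small sketch of just the bit manipulation. CD is bit 30 and NW is bit 29 of CR0; the function names are hypothetical, and actually reading/writing CR0 and executing WBINVD only works at CPL=0, so those steps are only shown as a comment.

```c
#include <stdint.h>

#define CR0_NW (1ull << 29)   /* Not Write-through */
#define CR0_CD (1ull << 30)   /* Cache Disable */

/* CR0 value for "no new cache fills": CD = 1, NW = 0. */
static uint64_t cr0_caches_off(uint64_t cr0)
{
    return (cr0 | CR0_CD) & ~CR0_NW;
}

/* CR0 value for normal caching: CD = 0, NW = 0. */
static uint64_t cr0_caches_on(uint64_t cr0)
{
    return cr0 & ~(CR0_CD | CR0_NW);
}

/* Kernel-only sequence (not executed here), on each CPU:
 *   1. cr0 = <read CR0>
 *   2. <write CR0> = cr0_caches_off(cr0)
 *   3. WBINVD    ; write back and invalidate what is already cached
 */
```

This is almost never something a kernel needs to do in normal operation; it's mostly relevant while reprogramming MTRRs or for the RAM testing case.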
To change "caching type" (including changing an area to the "uncached" type) for a specific area of the physical address space you can use MTRRs (but you don't need to care about this because firmware is responsible for ensuring safe defaults and things like IO APIC and local APIC will already be "uncached", and if you do modify MTRRs it's a little complicated because you have to ensure all CPUs end up configured the same and there's a strict order things need to be done to avoid problems while MTRRs are being changed).
To change "caching type" (including changing an area to the "uncached" type) for specific virtual page/s, there's flags in page table entries (and for newer CPUs, a table called PAT that determines what the bits actually do).
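As a sketch of the page table side: the flags are PWT (bit 3), PCD (bit 4) and, on CPUs with PAT, the PAT bit (bit 7 in a 4 KiB page table entry), which together form a 3-bit index into the PAT. The helper names below are hypothetical, and the table is the power-on default PAT; an OS that reprograms the PAT MSR would have different contents.

```c
#include <stdint.h>

#define PTE_PWT (1ull << 3)   /* page-level write-through */
#define PTE_PCD (1ull << 4)   /* page-level cache disable */
#define PTE_PAT (1ull << 7)   /* PAT bit in a 4 KiB PTE (large pages use bit 12) */

enum mem_type { MT_UC, MT_UC_MINUS, MT_WC, MT_WT, MT_WP, MT_WB };

/* Power-on default contents of the PAT (assumes the OS has not changed it). */
static const enum mem_type default_pat[8] = {
    MT_WB, MT_WT, MT_UC_MINUS, MT_UC,
    MT_WB, MT_WT, MT_UC_MINUS, MT_UC
};

/* PAT index selected by a 4 KiB PTE: PAT:PCD:PWT as a 3-bit number. */
static unsigned pat_index(uint64_t pte)
{
    return ((pte & PTE_PAT) ? 4u : 0u)
         | ((pte & PTE_PCD) ? 2u : 0u)
         | ((pte & PTE_PWT) ? 1u : 0u);
}

/* Mark a 4 KiB PTE "uncached": PCD = 1, PWT = 1, PAT = 0 selects
   entry 3, which is UC in the default PAT. */
static uint64_t pte_make_uncached(uint64_t pte)
{
    return (pte & ~PTE_PAT) | PTE_PCD | PTE_PWT;
}
```

If you change these flags on a mapping that's already in use, the old translation may still be in the TLB, so the page would need to be invalidated afterwards.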
Note: for newer CPUs (excluding Pentium, 80486, etc) there are actually 5 possible "caching types" (write-back, write-through, write-protected, write-combining and uncached); and there's a bunch of rules that determine how the CPU merges "caching type from MTRR" with "caching type from page tables" to find the resulting caching type; where those rules can mostly be summarised as "the result is whatever caches the least", except for a few corner cases involving "write-combining plus uncached" where "write-combining" takes precedence.
Cheers,
Brendan