Hi,
oscoder wrote:
2. Caching flags in a page table entry. I (roughly) understand what the write-through flag does. What I don't get is what the use cases are. When would this be used? Is there anything I *should* be using it for? If I don't have this flag set, when does data get written to physical memory?
There are two uses. The first is for memory-mapped devices - if you don't use MTRRs to control cacheability (e.g. because you ran out of variable-range MTRRs) then you can use the "slightly less good" paging flags instead.
The second case is cache management for normal software. Caches rely on the assumption that recently used data is more likely to be used again soon; but sometimes that assumption is false, and in those cases caches become less efficient. For an example, imagine your application is logging data, where data that's added to the log is not likely to be used again soon. If that "not likely to be used again" data is cached, then "more likely to be used again" data has to be evicted from the cache to make room for it, and the performance of your application suffers because of those evictions. For old CPUs (that don't support CLFLUSH or non-temporal moves) the application could ask the kernel to make the log's pages "uncached" to avoid this problem. Of course for newer CPUs it's easier to use CLFLUSH and/or non-temporal moves (and/or prefetching) for cache management instead.
oscoder wrote:
3. Shared memory and caching on multi-processor systems. When writing to a page mapped into the address space of two processors/cores, do I need to do anything to make sure both processors "see" it? (eg enabling the write-through bit) By shared memory I don't mean global memory - eg I mean memory available to two or more processes, but not to *every* process
For normal caches (excluding TLBs), it's all cache coherent and you don't need to do anything. However, you may need to be aware of store-forwarding problems. Store forwarding is where a value being stored is forwarded directly to a later load; this can cause problems in extremely rare cases (typically involving using memory alone for synchronisation) if another CPU modifies the value after it was stored but before it was loaded. For a pathological case, consider something like this code (which might be waiting until another CPU finishes doing some work and modifies "foo" to say the work was finished):
Code:
.wait:
mov [foo],eax ;This store
cmp eax,[foo] ;..may be forwarded to here, causing it to work like "cmp eax,eax" without reading from memory
je .wait ;..and turning this into an infinite loop
oscoder wrote:
4. Paging structures and caching. When changing a paging structure mapped into virtual memory, should I do anything to make sure it's written to physical memory too? Or is calling the INVLPG instruction good enough?
You have this backwards. You modify physical memory, then use INVLPG to tell the CPU it needs to update the TLB entry from physical memory. The INVLPG instruction only affects one CPU, and when there are multiple CPUs that could have the old translation in their TLBs you need to do INVLPG on all of them. This is called "multi-CPU TLB shootdown" and typically involves sending an "inter-processor interrupt"/IPI to the other CPUs (where the IPI handler does the INVLPG). Because this is expensive there are multiple tricks to avoid it in various cases, starting with "lazy TLB invalidation".
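The sequence might be sketched like this in kernel-style C (not runnable code - everything here except the INVLPG instruction itself is a hypothetical name standing in for whatever your kernel provides):

Code:
```c
/* Flush one stale TLB entry on the current CPU */
static inline void invlpg(void *virt)
{
    __asm__ volatile("invlpg (%0)" :: "r"(virt) : "memory");
}

void unmap_page(uint64_t *pte, void *virt)
{
    *pte = 0;                             /* 1. modify the page table in RAM */
    invlpg(virt);                         /* 2. flush this CPU's TLB entry */
    send_ipi_to_others(IPI_INVLPG, virt); /* 3. hypothetical: IPI so every
                                                other CPU's handler does
                                                INVLPG too */
    wait_for_acks();                      /* 4. hypothetical: wait until all
                                                CPUs have acknowledged */
}
```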
Cheers,
Brendan