Page cache
Page 1 of 1

Author:  sarah99 [ Fri Mar 29, 2019 5:29 am ]
Post subject:  Page cache

What is the best way to implement page cache?

Binary search OR Hashtables?

Which is better? I'm thinking of going with binary search tree as described on interviewbit

Author:  Korona [ Fri Mar 29, 2019 6:19 am ]
Post subject:  Re: Page cache

First of all, implementing a page cache consists of more than just having a file offset → physical page map. However, for the rest of this post, I'll restrict myself to such these maps. I will also assume that the map should be sparse, if it is expected to be dense, a simple linear array will be the best implementation.

Plain binary search trees do not really offer good time complexities (they only guarantee a trivial O(n) complexity for insertion, deletion and search). Balanced binary search trees (with red-black trees being by far the most common) improve this situation and offer O(log n) insertion, deletion and search complexities. However, due to their overhead (at least two pointers per node) they are not really suited for integer → integer maps unless you need their ordering properties (i.e., the ability to find predecessors and successors of given nodes). B-trees improve this situation considerably.

Hash-tables are more space efficient than binary search trees.

However, the superior data structure for page caches are neither binary search trees nor hash tables. Radix trees have very desirable properties for a page cache: they offer O(log d) insertion, deletion and search, where d is the word size. Furthermore, they store contiguous keys very efficiently: if a subregion of a file is densely populated (in the page cache), the radix subtree of that region degrades into a linear array -- the best possible storage strategy. Finally, the most important feature of radix trees is that (with a careful implementation) search can be done concurrently with modification: the radix tree does not need to be locked exclusively while a thread performs a search. This is not true for binary search trees (or B-trees) and somewhat complicated (at least if you want reasonable performance) in the case of hash tables.

Indeed, the x86 page tables are implemented as (simple) radix trees, with a fixed number of bits per level. More efficient designs can use a variable number of bits per level and thus avoid to store (almost) empty levels.

Author:  sarah99 [ Sat Mar 30, 2019 1:13 am ]
Post subject:  Re: Page cache

@Korona thanks for replying. I will consider using Radix trees. Let me learn more about radix trees first.

Page 1 of 1 All times are UTC - 6 hours
Powered by phpBB © 2000, 2002, 2005, 2007 phpBB Group