LFU, short for Least Frequently Used, is a cache eviction algorithm: when the cache is full, the entry that has been accessed least often is evicted.
It requires three data structures. The first is a hash table that caches the key/value pairs, so that given a key we can retrieve the cache entry in O(1). The second is a doubly linked list for each frequency of access. The maximum frequency is capped at the cache size to avoid creating an ever-growing number of frequency lists: with a cache of maximum size 4 we end up with at most 4 different frequencies. Each frequency has a doubly linked list that keeps track of the cache entries belonging to that particular frequency.
The third data structure links these frequency lists together. It can be either an array or another linked list, so that when a cache entry is accessed it can be promoted to the next frequency list in O(1).
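A minimal sketch of this three-structure design in Python (the class name and get/put interface are assumptions; `collections.OrderedDict` stands in for each doubly linked list, since it gives O(1) removal and O(1) tail insertion):

```python
from collections import defaultdict, OrderedDict

class LFUCache:
    def __init__(self, capacity):
        self.capacity = capacity
        self.entries = {}                           # hash table: key -> (value, freq)
        self.freq_lists = defaultdict(OrderedDict)  # freq -> keys at that frequency, in order
        self.min_freq = 0                           # lowest non-empty frequency

    def _promote(self, key):
        """Move a key to the next frequency list in O(1)."""
        value, freq = self.entries[key]
        new_freq = min(freq + 1, self.capacity)     # frequency capped at the cache size
        del self.freq_lists[freq][key]
        if not self.freq_lists[freq] and new_freq != freq:
            del self.freq_lists[freq]
            if self.min_freq == freq:
                self.min_freq = new_freq
        self.freq_lists[new_freq][key] = None       # re-attach at the tail of the new list
        self.entries[key] = (value, new_freq)

    def get(self, key):
        if key not in self.entries:
            return None
        self._promote(key)
        return self.entries[key][0]

    def put(self, key, value):
        if self.capacity <= 0:
            return
        if key in self.entries:
            self.entries[key] = (value, self.entries[key][1])
            self._promote(key)
            return
        if len(self.entries) >= self.capacity:
            # evict the oldest key in the lowest-frequency list (FIFO tie-break)
            victim, _ = self.freq_lists[self.min_freq].popitem(last=False)
            del self.entries[victim]
        self.entries[key] = (value, 1)
        self.freq_lists[1][key] = None
        self.min_freq = 1
```

Every operation touches only hash-table lookups and list head/tail manipulations, so both get and put run in O(1).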
The main difference is that LRU considers only recency: it evicts whichever page was used longest ago.
LFU considers frequency as well: before evicting the oldest page it checks how often each page has been used, and a page with a larger access count is kept even if it is old. If all candidate pages have the same frequency, the tie is broken FIFO, i.e. the oldest of them is evicted.
Let's consider a constant stream of cache requests with a cache capacity of 3, see below:
A, B, C, A, A, A, A, A, A, A, A, A, A, A, B, C, D
If we just consider a Least Recently Used (LRU) cache, with a HashMap + doubly linked list implementation giving O(1) eviction time and O(1) load time, we would have the following elements cached while processing the caching requests above:
[A]
[A, B]
[A, B, C]
[B, C, A] <- a stream of As keeps A at the most-recently-used end of the list.
[C, A, B]
[A, B, C]
[B, C, D] <- here, we evict A, we can do better!
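The trace above can be reproduced with a small LRU sketch; `lru_trace` is a hypothetical helper, and `OrderedDict` again plays the role of the HashMap + doubly linked list:

```python
from collections import OrderedDict

def lru_trace(requests, capacity):
    cache = OrderedDict()               # front = least recently used, back = most recent
    for key in requests:
        if key in cache:
            cache.move_to_end(key)      # a hit refreshes recency
        else:
            if len(cache) >= capacity:
                cache.popitem(last=False)  # evict the least recently used entry
            cache[key] = None
    return list(cache)

stream = ["A", "B", "C"] + ["A"] * 11 + ["B", "C", "D"]
print(lru_trace(stream, 3))             # -> ['B', 'C', 'D'], i.e. A was evicted
```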
When you look at this example, you can easily see that we can do better: given the higher expected chance of requesting an A in the future, we should not evict it even if it was least recently used. Counting the accesses for each key in the stream gives:
A - 12
B - 2
C - 2
D - 1
A Least Frequently Used (LFU) cache takes advantage of this information by keeping track of how many times each entry has been requested and using that count in its eviction algorithm.
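For the stream above, those access counts can be tallied directly, e.g. with Python's `Counter`:

```python
from collections import Counter

stream = ["A", "B", "C"] + ["A"] * 11 + ["B", "C", "D"]
counts = Counter(stream)
print(counts)   # Counter({'A': 12, 'B': 2, 'C': 2, 'D': 1})
```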
Constant-time insertion only works for the LRU because we use a doubly linked list and insertion is ordered by time: we simply add the element at the tail of the list. In a priority-queue-based LFU we can still maintain constant-time lookup through the hash map, but we cannot have constant-time insertion: an element's position depends on how often it has been used, so each element stores its usage count, and placing a new or updated item in order by that count costs O(log n) in a heap.
In LFU, the best way to keep entries ordered by their dynamically changing priority is, of course, a priority queue. It might be a binary heap, a Fibonacci heap, or any other implementation. The order of insertion does not matter here; our priority is how often each element has been used, and that count is associated with each node.
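A sketch of such a priority-queue-based LFU, using Python's `heapq` (the class name and get/put interface are assumptions). Since a binary heap cannot cheaply reprioritize an entry in place, stale heap rows are skipped lazily at eviction time; among equal counts, the oldest row pops first:

```python
import heapq

class HeapLFU:
    def __init__(self, capacity):
        self.capacity = capacity
        self.entries = {}   # key -> [count, value]: O(1) lookup
        self.heap = []      # (count, tick, key) rows; stale rows skipped lazily
        self.tick = 0       # tie-breaker among equal counts

    def _push(self, key, count):
        self.tick += 1
        heapq.heappush(self.heap, (count, self.tick, key))

    def get(self, key):
        if key not in self.entries:
            return None
        entry = self.entries[key]
        entry[0] += 1               # O(1) count update ...
        self._push(key, entry[0])   # ... but O(log n) heap insertion
        return entry[1]

    def put(self, key, value):
        if key in self.entries:
            self.entries[key][1] = value
            self.get(key)           # treat the update as an access
            return
        if len(self.entries) >= self.capacity:
            while True:             # pop until a non-stale row surfaces
                count, _, victim = heapq.heappop(self.heap)
                if victim in self.entries and self.entries[victim][0] == count:
                    del self.entries[victim]
                    break
        self.entries[key] = [1, value]
        self._push(key, 1)
```

Lookup stays O(1) via the hash map; insertion and eviction are O(log n) amortized, matching the trade-off described above.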
We have only two things to change in our caching implementation to switch from an LRU to an LFU:
1- The eviction policy
2- The data structure used to store elements
To create an LFU cache this way, we use a hash map and a treap. A treap is a combination of a binary search tree and a priority queue: each node carries two keys, and the structure maintains the binary-search-tree property on one of them and the heap property on the other. Have a look at this graph:
The values of the hash map point to treap nodes. In the treap nodes, the numbers in the small blue rectangles are the access counts; this is the priority-queue part of the treap. NOTE that child nodes always have a higher priority than their parents, but there is no ordering between siblings.
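A possible sketch of the hash map + treap approach in Python (all names are illustrative): the cache key provides the BST ordering, the access count provides min-heap ordering, so the root is always the least frequently used node and is the eviction victim. Bumping a count is done by deleting the node and reinserting it with count + 1:

```python
class TreapNode:
    def __init__(self, key, value):
        self.key = key        # BST ordering: lookup/delete by cache key
        self.count = 1        # heap ordering: minimum access count at the root
        self.value = value
        self.left = self.right = None

def _rotate_right(n):
    l = n.left
    n.left, l.right = l.right, n
    return l

def _rotate_left(n):
    r = n.right
    n.right, r.left = r.left, n
    return r

def insert(root, node):
    if root is None:
        return node
    if node.key < root.key:
        root.left = insert(root.left, node)
        if root.left.count < root.count:    # restore min-heap property
            root = _rotate_right(root)
    else:
        root.right = insert(root.right, node)
        if root.right.count < root.count:
            root = _rotate_left(root)
    return root

def delete(root, key):
    # standard treap delete: find the node, rotate it down to a leaf, drop it
    if root is None:
        return None
    if key < root.key:
        root.left = delete(root.left, key)
    elif key > root.key:
        root.right = delete(root.right, key)
    elif root.left is None:
        return root.right
    elif root.right is None:
        return root.left
    elif root.left.count < root.right.count:
        root = _rotate_right(root)
        root.right = delete(root.right, key)
    else:
        root = _rotate_left(root)
        root.left = delete(root.left, key)
    return root

class TreapLFU:
    def __init__(self, capacity):
        self.capacity = capacity
        self.nodes = {}    # hash map: key -> treap node
        self.root = None

    def get(self, key):
        if key not in self.nodes:
            return None
        node = self.nodes[key]
        self.root = delete(self.root, key)  # detach, bump count, reinsert
        node.left = node.right = None
        node.count += 1
        self.root = insert(self.root, node)
        return node.value

    def put(self, key, value):
        if key in self.nodes:
            self.nodes[key].value = value
            self.get(key)
            return
        if len(self.nodes) >= self.capacity:
            victim = self.root              # root has the minimum access count
            self.root = delete(self.root, victim.key)
            del self.nodes[victim.key]
        node = TreapNode(key, value)
        self.nodes[key] = node
        self.root = insert(self.root, node)
```

Note that ties between siblings are broken arbitrarily here (no FIFO guarantee), and every access costs O(log n) expected for the delete/reinsert, in line with the priority-queue discussion above.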
One drawback: if the LFU cache runs for a long time, the turnaround time for older entries that are not popular anymore can be very long, since their accumulated counts keep them ahead of newer entries.