To improve efficiency, browsers keep copies of requested resources on users’ devices,
and ISPs may keep copies on intermediate proxy servers. This temporary storage, called a
cache, holds the served resources for a fixed duration. (Incidentally, this is why users
seeking updated content need to perform a hard refresh, which bypasses the browser cache
and loads the most recent version of a webpage.)
While page caching greatly improves page load speeds, it bypasses the server.
Since no request reaches the server, no log entry is created. This loss of information leads to
the under-reporting of metrics such as the count of page requests.
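It is estimated that caching accounts for as much as a third of all page views. Not
only does this result in a loss of data for web analytics, but it also introduces a bias,
because the requests served from a cache are not a random sample of all requests: pages
that are revisited frequently are more likely to be cached and are therefore undercounted
more heavily.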
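
The effect of caching on server logs can be illustrated with a small simulation. The following is a minimal sketch, not a model of any real browser or proxy: the names `OriginServer`, `TtlCache`, and `count_views`, the URL, and the 60-second lifetime are all assumptions chosen for illustration. It places a fixed-duration cache in front of a server and compares the number of page views a user actually made with the number of entries that reach the server log.

```python
from dataclasses import dataclass, field


@dataclass
class OriginServer:
    """Hypothetical web server that writes one log entry per request it receives."""
    log: list = field(default_factory=list)

    def get(self, url, now):
        self.log.append((now, url))  # only requests that reach the server are logged
        return f"content of {url}"


@dataclass
class TtlCache:
    """Hypothetical cache that keeps each resource for a fixed duration (ttl_seconds)."""
    origin: OriginServer
    ttl_seconds: float
    store: dict = field(default_factory=dict)

    def get(self, url, now):
        entry = self.store.get(url)
        if entry is not None and now - entry[0] < self.ttl_seconds:
            return entry[1]  # cache hit: the server is bypassed, so nothing is logged
        body = self.origin.get(url, now)  # cache miss or expired copy: ask the server
        self.store[url] = (now, body)
        return body


def count_views(request_times, ttl_seconds, url="/index.html"):
    """Return (actual page views, page views recorded in the server log)."""
    origin = OriginServer()
    cache = TtlCache(origin, ttl_seconds)
    for t in request_times:
        cache.get(url, t)
    return len(request_times), len(origin.log)


if __name__ == "__main__":
    # Five views of the same page at t = 0, 10, 20, 70 and 80 seconds.
    actual, logged = count_views([0, 10, 20, 70, 80], ttl_seconds=60)
    print(f"actual views: {actual}, logged views: {logged}")
```

In this run only the requests at 0 and 70 seconds reach the server, so the log records two of the five views. The size of the gap depends entirely on the assumed request pattern and cache lifetime; the sketch shows the mechanism, not a realistic estimate.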