Chapter 7: Problem 49
Suppose that Zipf's law holds for accesses to a 10,000-movie video server. If the server holds the most popular 1000 movies in memory and the remaining 9000 on disk, give an expression for the fraction of all references that will be to memory. Write a little program to evaluate this expression numerically.
Short Answer
Step by step solution
Understanding Zipf's Law
Setting Up the Expression
Calculating the Total Access Probability Sum
Calculating the Access Probability to Top 1000 Movies
Expressing the Desired Fraction
Writing a Program to Evaluate the Expression
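A minimal sketch of such a program, assuming the classical form of Zipf's law in which the probability of accessing the movie of rank \(i\) is proportional to \(\frac{1}{i}\). The function names `harmonic` and `memory_fraction` are illustrative and do not come from the textbook.

```python
def harmonic(n, s=1.0):
    """Return the sum of 1 / i**s for i = 1..n (the harmonic number when s = 1)."""
    return sum(1.0 / i**s for i in range(1, n + 1))


def memory_fraction(in_memory, total, s=1.0):
    """Fraction of references that go to the `in_memory` most popular items.

    Assumes Zipf weights 1 / i**s; the normalization constant cancels
    in the ratio, so only the two partial sums are needed.
    """
    return harmonic(in_memory, s) / harmonic(total, s)


if __name__ == "__main__":
    fraction = memory_fraction(1000, 10_000)
    print(f"Fraction of references served from memory: {fraction:.4f}")
```

Passing a different value for `s` evaluates the same ratio for a generalized Zipf exponent.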
Key Concepts
These are the key concepts you need to understand to accurately answer the question.
Access Probability and Zipf's Insight
- For instance, the most popular movie (let's say rank 1) is accessed with relatively high probability, which diminishes as we move to rank 2, rank 3, and so forth.
- This probability is inversely proportional to the rank raised to a power: \(p_i \propto \frac{1}{i^s}\), where \(i\) is the rank and \(s\) is the Zipf parameter. The classical form of Zipf's law takes \(s = 1\).
Unveiling the Harmonic Series
- The \(N\)-th partial sum of the harmonic series is \(H_N = \sum_{i=1}^{N} \frac{1}{i}\), the sum of the reciprocals of the first \(N\) positive integers.
- In our scenario with movies, it is the sum of the unnormalized weights \(\frac{1}{i}\) from the first to the last movie; dividing each weight by this sum turns the weights into probabilities.
Understanding Memory Fraction in Data Caching
To calculate this fraction:
- We first sum the Zipf weights of the movies stored in memory, the top 1000, and call this partial sum \(C_{1000}\).
- Next, we compare this partial sum \(C_{1000}\) against the overall sum for all 10,000 movies (given by \(C\)).
- The resulting quotient, \(F_{memory} = \frac{C_{1000}}{C}\), is the fraction of all references that are served from memory.
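As a rough check, assuming the classical exponent \(s = 1\) so that both sums are harmonic numbers, the standard approximation \(H_N \approx \ln N + \gamma\) (with \(\gamma \approx 0.5772\), the Euler-Mascheroni constant) gives:

\[
F_{memory} = \frac{C_{1000}}{C} = \frac{\sum_{i=1}^{1000} \frac{1}{i}}{\sum_{i=1}^{10000} \frac{1}{i}} \approx \frac{\ln 1000 + \gamma}{\ln 10000 + \gamma} \approx \frac{7.485}{9.788} \approx 0.765
\]

Under these assumptions, roughly three quarters of all references go to memory even though only 10% of the movies are held there.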
Role of the Normalization Constant
- In our example, the normalization constant \(C\) is the harmonic sum over all 10,000 items; dividing each weight \(\frac{1}{i}\) by \(C\) ensures that the access probabilities sum to 1.
- Without this normalization, the raw weights would sum to more than 1 and could not be read as probabilities, leading to inaccurate predictions and inefficient resource allocation.
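The effect of normalization can be checked numerically. The sketch below assumes weights of \(\frac{1}{i}\) (Zipf exponent 1) and uses illustrative variable names that are not part of the original solution.

```python
import math

N = 10_000  # number of movies in the example

# Unnormalized Zipf weights 1/i for ranks 1..N (assumes exponent s = 1).
weights = [1.0 / i for i in range(1, N + 1)]

# Normalization constant: the harmonic sum over all items.
C = math.fsum(weights)

# Dividing each weight by C yields a proper probability distribution.
probs = [w / C for w in weights]

print(f"Sum of raw weights:     {C:.4f}")
print(f"Sum of normalized p_i:  {math.fsum(probs):.4f}")
```

The raw weights add up to well over 1, while the normalized values add up to 1 within floating-point error.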