CACHING.md
Caching

Caching is a mechanism for temporarily storing frequently accessed data so we don't need to re-fetch it from the database on every request. Caching improves performance by reducing the number of queries to the database, especially for data that doesn't change frequently.

Backend caching

1. Using decorators

Caching the entire response

# caching the entire response with Django's cache_page decorator
from django.views.decorators.cache import cache_page
from django.utils.decorators import method_decorator
from django_filters.rest_framework import DjangoFilterBackend
from rest_framework import filters, generics
from rest_framework.pagination import PageNumberPagination

from .models import Course
from .serializers import CourseSerializer

class CustomPageNumberPagination(PageNumberPagination):
    page_size = 5
    page_size_query_param = 'perpage'
    max_page_size = 10

@method_decorator(cache_page(60 * 15), name="dispatch")
class CourseList(generics.ListAPIView):
    queryset = Course.objects.all()
    serializer_class = CourseSerializer
    pagination_class = CustomPageNumberPagination
    filter_backends = [DjangoFilterBackend, filters.SearchFilter]
    search_fields = ["title", "description", "slug", "subject__title", "subject__description"]
    filterset_fields = ["course_start_date", "level"]

@method_decorator(cache_page(60 * 15), name="dispatch")
class SingleCourse(generics.RetrieveAPIView):
    queryset = Course.objects.all()
    serializer_class = CourseSerializer

2. DRF-specific caching with drf-extensions

  • Installation

pip install drf-extensions

from rest_framework.views import APIView
from rest_framework.response import Response
from rest_framework_extensions.cache.decorators import cache_response

class CourseList(APIView):
    @cache_response(timeout=60 * 15)
    def get(self, request):
        # fetch and return course data
        return Response({"message": "This response is cached!"})

Invalidation

# cache invalidation via Django signals
from django.core.cache import cache
from django.db.models.signals import post_save, post_delete
from django.dispatch import receiver

@receiver([post_save, post_delete], sender=Course)
def invalidate_course_cache(sender, instance, **kwargs):
    # deleting both keys keeps the list and detail views consistent
    cache.delete('course_list')
    cache.delete(f'single_course_{instance.pk}')

The snippets above are a general overview. Now let's look deeper at how it really works, what the key components are, and where the cached data is stored.

1. @cache_response(timeout=seconds, key_func=key_func)

  • timeout: determines how long the cached data stays valid; after the timeout duration, the cached entry is deleted.
  • key_func: defines how the cache key is generated. Keys act like primary keys that identify which cached data to return; every cached entry is looked up by its key. If key_func is not provided, DRF uses the request URL as the key.

Custom key_func:

def key_func(view_instance, view_method, request, args, kwargs):
    key_parts = [
        'course_list',
        str(request.query_params),
        str(request.user.id if request.user.is_authenticated else 'anonymous')
    ]
    return ':'.join(key_parts)

A custom key is defined by a function. In the snippet above, the data is saved under course_list plus dynamic parts like the query params and the user id. Let's say a user with id 100 requests data with the query param search=machine; the key will look like course_list:search=machine:100. DRF checks whether data with this key exists in the cache; if so, it returns it. If not, it hits the database, generates the key, saves the data in the cache, and returns the result to the user.

Now, if key_func is not provided, the key is based on the request URL and would look like http://127.0.0.1:8000/courses/?search=machine&userid=100.
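The key-building logic can be exercised outside of Django. The sketch below uses stand-in FakeRequest/FakeUser classes (assumptions for illustration, not DRF classes) to show what the custom key_func produces:

```python
# Stand-ins for DRF's request and user objects (illustrative only).
class FakeUser:
    def __init__(self, user_id):
        self.id = user_id
        self.is_authenticated = user_id is not None

class FakeRequest:
    def __init__(self, query_params, user):
        self.query_params = query_params
        self.user = user

def key_func(view_instance, view_method, request, args, kwargs):
    key_parts = [
        'course_list',
        str(request.query_params),
        str(request.user.id if request.user.is_authenticated else 'anonymous'),
    ]
    return ':'.join(key_parts)

req = FakeRequest({'search': 'machine'}, FakeUser(100))
print(key_func(None, None, req, (), {}))
# course_list:{'search': 'machine'}:100
```

Note that the query params are stringified as-is, so the real key is a little noisier than the simplified course_list:search=machine:100 form, but the structure is the same.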

Caching Decorator (@cache_response):

  • The @cache_response decorator caches the HTTP response generated by the view method.
  • The decorator intercepts the response before it's sent to the client; instead of regenerating the response on every request, it stores the response in the configured cache backend (e.g., Redis).
  • The cached response is stored for 60*15 seconds (15 minutes). If the same request is made again within 15 minutes, the cached version is returned, significantly reducing the time taken to process the request.
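The interception described above can be sketched in plain Python. This is a conceptual sketch of what a response-caching decorator does, not drf-extensions' actual implementation; a dict stands in for the cache backend:

```python
import functools
import time

_cache = {}  # stand-in for Redis / Django's cache backend

def cache_response(timeout, key_func):
    def decorator(view_method):
        @functools.wraps(view_method)
        def wrapper(*args, **kwargs):
            key = key_func(*args, **kwargs)
            entry = _cache.get(key)
            if entry is not None and entry[1] > time.monotonic():
                return entry[0]  # cache hit: skip the view body entirely
            response = view_method(*args, **kwargs)  # cache miss: run the view
            _cache[key] = (response, time.monotonic() + timeout)
            return response
        return wrapper
    return decorator

calls = []

@cache_response(timeout=60 * 15, key_func=lambda pk: f"single_course_{pk}")
def get_course(pk):
    calls.append(pk)  # tracks real "database" hits
    return {"id": pk, "title": "Demo"}

get_course(1)
get_course(1)      # served from the cache
print(len(calls))  # 1 -- the view body ran only once
```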

Key Function (key_func):

  • The key_func generates a unique cache key for each request, which keeps the cache for one user or query separate from others. For example, it can include the query parameters (filters and search terms), the user ID (if authenticated), and other relevant data in the key. A user querying with different filters or search terms therefore gets a separate cached response for each unique query.

Single view caching

@cache_response(timeout=60*15, key_func=lambda view, view_method, request, args, kwargs: f"single_course_{kwargs.get('pk')}")

For a single (detail) view, the key is generated from the object's id, for instance single_course_1.

Cache Invalidation (Post Save and Delete)

@receiver([post_save, post_delete], sender=Course)
def invalidate_course_cache(sender, instance, **kwargs):
    # delete both keys so outdated data doesn't persist
    cache.delete('course_list')
    cache.delete(f'single_course_{instance.pk}')

When a course is saved (post_save) or deleted (post_delete), the cache needs to be cleared so the outdated data doesn't persist.

  • cache.delete('course_list'): This deletes the cache for the course list. If a course is added, deleted, or updated, this ensures that the list will be recalculated when requested again.

  • cache.delete(f'single_course_{instance.pk}'): This deletes the cache for a single course based on its primary key. If a course is modified or deleted, the cached version of that specific course will be invalidated.
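The invalidation flow can be simulated end-to-end with a dict standing in for the cache backend. The helper names here (cache_course_list, cache_single_course) are illustrative, not Django APIs:

```python
cache = {}  # stand-in for the real cache backend

def cache_course_list(data):
    cache['course_list'] = data

def cache_single_course(pk, data):
    cache[f'single_course_{pk}'] = data

def invalidate_course_cache(sender, instance, **kwargs):
    # mirrors the post_save/post_delete receiver above
    cache.pop('course_list', None)
    cache.pop(f'single_course_{instance.pk}', None)

class Course:
    def __init__(self, pk):
        self.pk = pk

cache_course_list([{"id": 1}])
cache_single_course(1, {"id": 1})
invalidate_course_cache(Course, Course(1))  # as if Course 1 was just saved
print(sorted(cache))  # [] -- both stale entries are gone
```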

Pagination and Caching

Since we're caching the course list, each page of the paginated results gets its own cache key based on the pagination and query parameters. Following the course_list:params:user_id format above, the key for the first page might look like:

course_list:title=python&level=beginner&page=1:123

class CustomPageNumberPagination(PageNumberPagination):
    page_size = 5
    page_size_query_param = 'perpage'
    max_page_size = 10

    def get_paginated_response(self, data):
        response = super().get_paginated_response(data)
        response.data['page'] = self.page.number
        response.data['page_size'] = self.page_size
        return response
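To see how per-page keys stay separate, here is a small sketch; build_key is illustrative, not a DRF function:

```python
# Including pagination params in the cache key gives every page its own entry.
def build_key(query_params, user_id):
    filters = '&'.join(f'{k}={v}' for k, v in sorted(query_params.items()))
    return f'course_list:{filters}:{user_id}'

page1 = build_key({'title': 'python', 'level': 'beginner', 'page': 1}, 123)
page2 = build_key({'title': 'python', 'level': 'beginner', 'page': 2}, 123)
print(page1)           # course_list:level=beginner&page=1&title=python:123
print(page1 != page2)  # True -- each page caches separately
```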

Where the cached data will be stored

By default, Django uses the memory of the server the project is running on: locally, that means your machine's RAM; on a deployed server, the server's RAM. Using the application server's RAM for caching is not an optimal approach, at least once the project's scale is large.
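A common remedy is an external cache server such as Redis. Assuming Django 4.0+ (which ships a built-in Redis cache backend) and a Redis instance reachable at the given LOCATION, the settings.py entry might look like:

```python
# settings.py -- assumes Django 4.0+ and a Redis server at LOCATION
CACHES = {
    "default": {
        "BACKEND": "django.core.cache.backends.redis.RedisCache",
        "LOCATION": "redis://127.0.0.1:6379/1",
    }
}
```

With this in place, cache_page, @cache_response, and the low-level cache API all read and write through Redis instead of the application server's RAM.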

Content Delivery Network (CDN) Caching

For applications that serve media like images or videos, a CDN can be used to cache these assets at edge locations closer to the user, reducing latency and improving loading times.

Django's Default Cache Backend

If we haven't configured any custom cache backend in Django, it defaults to a local-memory cache (LocMemCache), which is per-process. This means:

  • Cached data is stored in the server's memory (RAM).
  • If the server restarts, the cache is lost.
  • The cache is not shared between multiple instances of your application (if you scale horizontally with multiple servers).
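For reference, the default behavior is equivalent to configuring this in settings.py explicitly (a config sketch of what Django falls back to):

```python
# settings.py -- the per-process, in-memory cache Django uses by default
CACHES = {
    "default": {
        "BACKEND": "django.core.cache.backends.locmem.LocMemCache",
    }
}
```

This is fine for development, but for the reasons listed above a shared backend like Redis is the usual choice in production.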