double-checked locking
tags: learning programming
content
- say we have a program that checks a cache; if the cache is invalid, it refreshes the cache
    if cache_valid():
        return get_cache()
    return refresh_cache()
- with multiple threads, if several threads reach this code when the cache has expired, we don’t want all of them to refresh the cache
  - the fact that it’s cached means refresh_cache might be an expensive call (time / resources)
  - and we don’t want race conditions / data corruption etc.
- so we add a lock around refresh_cache
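    import threading

    # threading.Lock, not asyncio.Lock: these are threads, and asyncio.Lock cannot be used in a plain with statement
    lock = threading.Lock()  # created once and shared by all threads

    if cache_valid():
        return get_cache()
    with lock:
        return refresh_cache()
- the above code works when a single thread reaches it: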
  - if the cache is valid, it gets the cache and returns
  - if the cache is invalid, it gets the lock, refreshes, and returns
- when multiple threads reach it while the cache is invalid:
  - all threads try to get the lock, but only one acquires it
  - the one with the lock refreshes the cache and releases the lock
  - then the other threads, which are still waiting for the lock, acquire it in turn and each refreshes the cache again
- as mentioned in ^why-cache, if refresh_cache takes 30 seconds to run, then each waiting thread is still blocked for at least 30 seconds (its own refresh, plus the refreshes of the threads ahead of it)!
- the cache works as expected (avoids expensive calls) when a thread arrives and the cache is valid
- the cache isn't doing anything when the cache is invalid: expensive calls are still made multiple times!
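- a minimal sketch of the problem above, assuming a dict-based cache and a cheap stand-in for the expensive call. The names `cache`, `refresh_count`, `get_data_single_check` and the `barrier` are made up for the demo; the barrier only forces every thread past the validity check before any of them refreshes, so the outcome is repeatable:

```python
import threading

N_THREADS = 4
lock = threading.Lock()
barrier = threading.Barrier(N_THREADS)  # demo only: lines the threads up after the check
cache = {"valid": False, "value": None}
refresh_count = 0  # how many times the "expensive" call ran


def refresh_cache():
    """Stand-in for the expensive call; only ever called while holding the lock."""
    global refresh_count
    refresh_count += 1
    cache["value"] = "fresh data"
    cache["valid"] = True
    return cache["value"]


def get_data_single_check():
    if cache["valid"]:
        return cache["value"]
    barrier.wait()  # every thread has now seen an invalid cache
    with lock:
        return refresh_cache()  # no re-check: each waiting thread refreshes again


threads = [threading.Thread(target=get_data_single_check) for _ in range(N_THREADS)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(refresh_count)  # prints 4: one refresh per thread
```

the lock prevents concurrent refreshes, but not repeated ones.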
- hence we need another check:
    if cache_valid():
        return get_cache()
    with lock:
        if cache_valid():  # re-check: another thread may have refreshed while we waited
            return get_cache()
        return refresh_cache()
- and this is double-checked locking
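- a runnable sketch of the double-checked version, under the same kind of assumptions: the dict-based `cache`, the counter `refresh_count`, the `results` list, `get_data_double_checked` and the demo-only `barrier` are illustrative names, not part of any library. All threads are forced to see an invalid cache first, yet only one of them refreshes:

```python
import threading

N_THREADS = 4
lock = threading.Lock()
barrier = threading.Barrier(N_THREADS)  # demo only: lines the threads up after the first check
cache = {"valid": False, "value": None}
refresh_count = 0  # how many times the "expensive" call ran
results = []  # what each thread got back


def refresh_cache():
    """Stand-in for the expensive call; only ever called while holding the lock."""
    global refresh_count
    refresh_count += 1
    cache["value"] = "fresh data"
    cache["valid"] = True
    return cache["value"]


def get_data_double_checked():
    if cache["valid"]:  # first check, without the lock (fast path)
        return cache["value"]
    barrier.wait()  # every thread has now seen an invalid cache
    with lock:
        if cache["valid"]:  # second check, under the lock
            return cache["value"]
        return refresh_cache()


def worker():
    results.append(get_data_double_checked())


threads = [threading.Thread(target=worker) for _ in range(N_THREADS)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(refresh_count)  # prints 1: only the first thread to get the lock refreshes
```

note: the unlocked first check is fine for a simple flag like this in CPython; in some other languages the checked field needs extra guarantees (for example `volatile` in Java) for the pattern to be correct.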
- in other places, it might be written like this:
    if not cache_valid():  # first check
        with lock:
            if not cache_valid():  # second check
                return refresh_cache()
            return get_cache()
    return get_cache()
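- one way the nested form could be packaged for reuse, as a sketch only: `ExpiringCache`, `load_config` and the 60-second TTL are hypothetical names and values chosen for illustration, and "valid" is assumed to mean "not yet expired". It falls through to a single return instead of returning inside the lock, but the two checks are the same:

```python
import threading
import time


class ExpiringCache:
    """Hypothetical helper: caches one value and reloads it after ttl_seconds."""

    def __init__(self, loader, ttl_seconds):
        self._loader = loader
        self._ttl = ttl_seconds
        self._lock = threading.Lock()
        self._value = None
        self._expires_at = float("-inf")  # already expired, so the first get() loads

    def _valid(self):
        return time.monotonic() < self._expires_at

    def get(self):
        if not self._valid():  # first check, without the lock
            with self._lock:
                if not self._valid():  # second check, under the lock
                    self._value = self._loader()
                    # set the expiry last, so a "valid" cache always has a value
                    self._expires_at = time.monotonic() + self._ttl
        return self._value


calls = []  # records each run of the expensive loader


def load_config():
    calls.append(1)
    time.sleep(0.05)  # pretend this is slow
    return {"feature": "on"}


config_cache = ExpiringCache(load_config, ttl_seconds=60)
threads = [threading.Thread(target=config_cache.get) for _ in range(8)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(len(calls))  # prints 1: eight callers, one load
```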