double-checked locking

tags: learning programming

content

  • say we have a program that checks a cache; if the cache is invalid, it refreshes the cache
if cache_valid():
	return get_cache()
return refresh_cache()
  • with multiple threads, if several of them reach this code just as the cache expires, we don’t want every thread to refresh the cache
    • the fact that it’s cached means refresh_cache is probably an expensive call (time / resources)
    • and we don’t want race conditions, data corruption, etc.
  • so we add a lock around refresh_cache
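import threading

# threading.Lock, not asyncio.Lock: asyncio.Lock is for coroutines and needs "async with"
lock = threading.Lock()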
 
if cache_valid():
	return get_cache()
with lock:
	return refresh_cache()
  • the above code works; when a single thread reaches it:
    • if the cache is valid, it gets the cache and returns
    • if the cache is invalid, it acquires the lock, refreshes, and returns
  • when multiple threads reach it while the cache is invalid:
    • all threads try to acquire the lock; only one gets it
    • the one holding the lock refreshes the cache and releases the lock
    • then the other threads, still waiting on the lock, acquire it one at a time and each refreshes the cache again
  • as mentioned in ^why-cache, if refresh_cache takes 30 seconds to run, then each thread still spends 30 seconds on its own refresh, on top of the time it spent waiting for the lock!
    • the cache works as expected (avoiding expensive calls) when a thread arrives and the cache is valid
    • but when the cache is invalid, the cache does nothing for us: the expensive call is still made once per thread!
  • hence we need a second check, inside the lock:
if cache_valid():
	return get_cache()
with lock:
	if cache_valid():
		return get_cache()
	return refresh_cache()
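  • and this is double-checked locking: a thread that waited for the lock checks the cache again once it gets in, finds it already refreshed, and returns without refreshing a second time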
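
To see the difference, here is a small runnable sketch that counts how often the expensive refresh runs under each version. The `Cache` class, `refresh_count`, the 0.05 s sleep, and the `run` helper are all made up for illustration; only the two lookup patterns come from the notes above.

```python
import threading
import time


class Cache:
    """Toy cache; refresh_count records how often the expensive call ran."""

    def __init__(self):
        self.value = None
        self.refresh_count = 0
        self.lock = threading.Lock()

    def cache_valid(self):
        return self.value is not None

    def refresh_cache(self):
        time.sleep(0.05)  # stand-in for the expensive call
        self.refresh_count += 1  # only ever called while holding self.lock
        self.value = "fresh"
        return self.value

    def get_single_check(self):
        if self.cache_valid():
            return self.value
        with self.lock:
            return self.refresh_cache()

    def get_double_check(self):
        if self.cache_valid():  # first check, no lock
            return self.value
        with self.lock:
            if self.cache_valid():  # second check, under the lock
                return self.value
            return self.refresh_cache()


def run(method_name, n_threads=4):
    """Start n_threads at once against an empty cache; return the refresh count."""
    cache = Cache()
    barrier = threading.Barrier(n_threads)

    def worker():
        barrier.wait()  # release all threads together while the cache is invalid
        getattr(cache, method_name)()

    threads = [threading.Thread(target=worker) for _ in range(n_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return cache.refresh_count


single = run("get_single_check")
double = run("get_double_check")
print(single, double)  # double is always 1; single is usually n_threads
```

The double-checked count is exactly 1 because every refresh happens under the lock and right after a failed validity check. The single-check count depends on thread timing, so it is reported rather than promised.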

  • in other places, it might be written like this:

if not cache_valid():  # first check
	with lock:
		if not cache_valid():  # second check
			return refresh_cache()
		return get_cache()
return get_cache()
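
The nested form fits naturally into a small helper. Below is a minimal sketch of a time-based cache built on it; `TTLCache`, `loader`, and `ttl_seconds` are hypothetical names chosen for this example, and validity is assumed to mean "not yet expired".

```python
import threading
import time


class TTLCache:
    """Caches the result of loader() for ttl_seconds (hypothetical helper)."""

    def __init__(self, loader, ttl_seconds):
        self._loader = loader
        self._ttl = ttl_seconds
        self._lock = threading.Lock()
        self._value = None
        self._expires_at = float("-inf")  # nothing loaded yet, so invalid

    def _valid(self):
        return time.monotonic() < self._expires_at

    def get(self):
        if not self._valid():  # first check, no lock
            with self._lock:
                if not self._valid():  # second check, under the lock
                    self._value = self._loader()
                    self._expires_at = time.monotonic() + self._ttl
        return self._value


# usage: the loader runs once, the second get() is served from the cache
calls = []


def load():
    calls.append(1)
    return len(calls)


cache = TTLCache(load, ttl_seconds=60)
first = cache.get()
second = cache.get()
```

The value is stored before the expiry time, so a thread that passes the unlocked first check never sees a valid timestamp paired with a missing value.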

up

down

reference