Improve Lock and Double `get` from getOrSet
Description
Activity
Luis MajanoDecember 11, 2023 at 5:35 PM
I have reverted some changes and updated it to just do the gets quietly.

Brad WoodNovember 29, 2023 at 7:54 PM
Hold up, can we discuss this change before it’s pushed through? As the original author of this method, the lock and the double get serve a very specific and important purpose. I’m thinking there is some confusion around what this method does and why it does it, which we should get on the same page prior to just removing it.
This method primarily is designed to ensure multiple threads don’t execute the produce UDF at the same time (which could be costly) which makes only one thread perform the logic to populate the cache while other competing threads wait in line (because of the lock) and once they enter the lock, they will find (via the second get) that the value has not already been created.
So, for example if there is a very large query that takes up to 20 seconds to generate and a fresh cache comes online and 5 threads all go to get this item from cache, we want a few very specific things to happen.
only one thread runs the expensive 20 second query
Under no circumstances should the other threads “dogpile” on and run the expensive query as well
All threads but the first should wait patiently in line until their turn to get into the lock to avoid this duplication of effort
All threads but the first, upon entering the lock, will find the cache now has the value and they can just get it and return (only one thread ever performs both of the in-lock operations! The rest short circuit)
No other threads but those initial ones will ever reach the lock. Once the cache is warmed, all subsequent threads will skip the lock entirely as the first get will return the value immediately meaning only a single cache hit once the cache is warmed.
Now to address the wording of the ticket:
The double lock in
getOrSet
is redundant,
There is only a single lock. In reality, it’s a “double checked lock”.
also includes a double get request for the object
That is absolutely correct, and by design. The trade off for a thread-safe mechanism that avoids a dog-pile affect on startup is to perform the check twice, which is a fast operation to avoid running the potentially-expensive produce UDF more than once. The 1-2 ms for the extra get far outweighs running a 20 second query a second time for no reason, And, in fact, if you let all the threads dogpile and run the query at the same time, it will probably take 40 seconds. It is actually faster overall for the subsequent threads to wait their turn while the first thread produces the value.
this means three cache connections, two of which are inside of the lock.
Again, that is correct and by design. In order to ensure the atomicity of the operation, the code run inside the lock must be run by only one thread at a time. Otherwise, atomicity cannot be guaranteed. And again, ONLY the first thread ever runs the produce so for the rest of the threads wait, they will only run a single operation in the lock-- the get which now will return a value.
Now, I would guess this change is prompted by seeing lock timeouts in your app. If you don’t want the guarantee of atomicity, then you should just use the get()
and set()
methods directly in your code. If you want the guarantee of atomicity and you're getting lock timeouts, that means you have values which take longer to produce than the 45 seconds CacheBox is willing to wait which means two things:
you should deal with the possibility that the data you want may take longer to create than you want and handle the exception
or we should make the lock timeout configurable if you are willing to wait more than 45 seconds for the value.
The double lock in
getOrSet
is redundant, and also includes a double get request for the object. In a distributed cache, especially, this means three cache connections, two of which are inside of the lock.