
Add request limit cache #1275

Merged · JesseAbram merged 17 commits into master on Feb 6, 2025
Conversation

@JesseAbram (Member) commented Jan 31, 2025

This originally started off as a threaded test for multiple txs. While writing it, I found that the increment_or_wipe_request_limit function made a reserve_key kvdb call that did not handle the tx overload well and caused failures. This was solved by adding a cache and moving the request limit into it.

The cache probably should have existed before; I will open an issue about moving things out of the kvdb that shouldn't have been stored there. #1280

I will also open an issue to create a deletion strategy for the request limit, as it is currently O(n) (this problem existed with the kvdb too). #1281

We should also maybe merge the other app_state hashmaps into the cache. #1282

@JesseAbram changed the title from "Overload signing test" to "Add Cache" on Feb 4, 2025
@JesseAbram marked this pull request as ready for review on February 5, 2025 at 00:21
@ameba23 (Contributor) left a comment

The signing stress test is great ⭐

And the cache is definitely an improvement, since this stuff does not need to be stored on disk.

But I would have done it a bit differently - rather than a generalised cache, I would have separate data structures for the different things we need to store.

So for the request limit I would have the AppState property be:

request_limit: Arc<RwLock<HashMap<VerifyingKey, RequestLimitStorage>>>

and have similar properties for the other things we need to store (we could put them in a sub-struct if AppState is starting to get messy).

The advantages:

  • Less time spent waiting for a lock to become free: with everything behind a single RwLock, the chances of it being locked are much higher. I think putting them all behind the same Arc is fine.
  • Faster, because you don't need to serialize and deserialize values.
  • The type system guarantees that entries are well formed - impossible to have a failed lookup or corrupt value.
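For illustration, a minimal sketch of what that could look like - the RequestLimitStorage fields and the VerifyingKey alias here are stand-ins for the example, not the crate's actual definitions:

```rust
use std::collections::HashMap;
use std::sync::{Arc, RwLock};

// Stand-in types for the sketch; the real crate defines its own.
type VerifyingKey = [u8; 33];

#[derive(Clone)]
struct RequestLimitStorage {
    block_number: u32,
    request_amount: u32,
}

#[derive(Clone, Default)]
struct AppState {
    // One typed map per concern instead of a single HashMap<String, Vec<u8>>.
    request_limit: Arc<RwLock<HashMap<VerifyingKey, RequestLimitStorage>>>,
}

impl AppState {
    // Typed lookup: no serde round trip, and a missing key is just None.
    fn request_limit_for(&self, key: &VerifyingKey) -> Option<RequestLimitStorage> {
        self.request_limit.read().ok()?.get(key).cloned()
    }
}
```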

impl AppState {
    /// Setup AppState with given secret keys
    pub fn new(
        configuration: Configuration,
        kv_store: KvManager,
        pair: sr25519::Pair,
        x25519_secret: StaticSecret,
        cache: Cache,
@ameba23 (Contributor):

nit: It looks like we always call this constructor with the cache being empty - so it would be slightly less code to create a new cache here rather than have it be passed in.

@JesseAbram (Member, Author):

Ya, but if we want to test some pre-existing state it would be easier to pass it in... however, I did create the unsafe calls for that.

@HCastano (Collaborator):

I would also initialize it to Default internally instead of passing it in. There are methods to write into the cache which can be used for testing.
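A sketch of that suggestion, with stand-in types rather than the crate's own:

```rust
use std::collections::HashMap;
use std::sync::{Arc, RwLock};

// Stand-in for the Cache alias discussed further down in this review.
type Cache = Arc<RwLock<HashMap<String, Vec<u8>>>>;

struct AppState {
    cache: Cache,
}

impl AppState {
    // No `cache` parameter: every call site passed an empty cache anyway.
    fn new() -> Self {
        Self { cache: Cache::default() }
    }
}
```

Tests that need pre-populated state would then write through the cache methods after construction.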

crates/threshold-signature-server/src/lib.rs - two outdated review threads, resolved
self.clear_poisoned_cache();
let cache =
    self.cache.read().map_err(|_| anyhow!("Error getting read_from_cache lock"))?;
Ok(cache[key].clone())
@ameba23 (Contributor):

We should definitely use .get here and return an Option, as this will panic if the key does not exist.

Also, it gives us a way to check whether the key exists and get it if it does in one go. Otherwise we end up locking the RwLock once for the existence check, unlocking, then immediately locking again to get the value.
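A sketch of the suggested shape, assuming the same anyhow-based error handling as the snippet above:

```rust
use std::collections::HashMap;
use std::sync::{Arc, RwLock};

use anyhow::anyhow;

struct AppState {
    cache: Arc<RwLock<HashMap<String, Vec<u8>>>>,
}

impl AppState {
    // One lock acquisition: `get` checks existence and fetches the value in
    // one go, and a missing key becomes None rather than a panic.
    fn read_from_cache(&self, key: &str) -> anyhow::Result<Option<Vec<u8>>> {
        let cache =
            self.cache.read().map_err(|_| anyhow!("Error getting read_from_cache lock"))?;
        Ok(cache.get(key).cloned())
    }
}
```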

@@ -344,6 +352,46 @@ impl AppState {
))
}

pub fn write_to_cache(&self, key: String, value: Vec<u8>) -> anyhow::Result<()> {
@ameba23 (Contributor):

nit: I'm not super keen on using anyhow here, as strictly speaking this is library code. It probably doesn't really matter, but in the best case we'd have a CacheError.
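A sketch of such an error type, assuming the thiserror crate; the variant names are illustrative, not from the PR:

```rust
use thiserror::Error;

// A dedicated error type instead of anyhow, so library callers can match on
// the failure cases.
#[derive(Debug, Error)]
pub enum CacheError {
    #[error("Cache lock poisoned")]
    LockPoisoned,
    #[error("No value found for key `{0}`")]
    NotFound(String),
}
```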

@JesseAbram (Member, Author):

I'm OK with that. I saw your method; I think I was planning a method with multiple errors and then never ended up writing one.

@JesseAbram (Member, Author) commented Feb 5, 2025

The signing stress test is great ⭐

And the cache is definitely an improvement, since this stuff does not need to be stored on disk.

But I would have done it a bit differently - rather than a generalised cache, I would have separate data structures for the different things we need to store.

So for the request limit I would have the AppState property be:

request_limit: Arc<RwLock<HashMap<VerifyingKey, RequestLimitStorage>>>

and have similar properties for the other things we need to store (we could put them in a sub-struct if AppState is starting to get messy).

The advantages:

  • Less time spent waiting for a lock to become free: with everything behind a single RwLock, the chances of it being locked are much higher. I think putting them all behind the same Arc is fine.
  • Faster, because you don't need to serialize and deserialize values.
  • The type system guarantees that entries are well formed - impossible to have a failed lookup or corrupt value.

Hmmmm, I see the benefit. The issue is that it means writing a lot more code: instead of a generalised get or write call, it becomes multiple functions per storage unit. I'm on the fence. Actually, don't respond here - let's talk about it on #1282.

@HCastano (Collaborator) left a comment

I'm with Peg here - we should be leveraging the type system more, and also defining the cache a bit better (but I guess that's being discussed in #1282).


@@ -344,6 +352,46 @@ impl AppState {
))
}

pub fn write_to_cache(&self, key: String, value: Vec<u8>) -> anyhow::Result<()> {
@HCastano (Collaborator):

All these public methods should be documented.

        self.cache.clear_poison()
    }
}

// TODO delete from cache
@HCastano (Collaborator):

Leftover TODO

@@ -344,6 +352,46 @@ impl AppState {
))
}

pub fn write_to_cache(&self, key: String, value: Vec<u8>) -> anyhow::Result<()> {
    self.clear_poisoned_cache();
@HCastano (Collaborator):

Not sure we should just be bypassing this check every time we try and write.

For example, let's say there's a thread that's updating the request limit and it keeps panicking when trying to update the limit. By bypassing this we could end up with inconsistent data here, potentially leading to the limit just being ignored.
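For comparison, a sketch of the stricter alternative - surfacing the poisoned state to the caller instead of clearing it on every write (names hypothetical):

```rust
use std::collections::HashMap;
use std::sync::RwLock;

struct Cache {
    inner: RwLock<HashMap<String, Vec<u8>>>,
}

impl Cache {
    // Refuse to write through a poisoned lock; the caller decides whether
    // clearing the poison and retrying is actually safe.
    fn write(&self, key: String, value: Vec<u8>) -> Result<(), String> {
        let mut map = self
            .inner
            .write()
            .map_err(|_| "cache lock poisoned; refusing to write".to_string())?;
        map.insert(key, value);
        Ok(())
    }
}
```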

@JesseAbram (Member, Author):

Only for one block, which would be better than a poisoned cache forever making it unable to sign.

Comment on lines 261 to 262
/// A global cache type for the TSS
pub type Cache = Arc<RwLock<HashMap<String, Vec<u8>>>>;
@HCastano (Collaborator):

Kinda strange to have this between the struct definition and the implementation block.

Comment on lines 261 to 262
/// A global cache type for the TSS
pub type Cache = Arc<RwLock<HashMap<String, Vec<u8>>>>;
@HCastano (Collaborator):

By using a Vec<u8> value we lose type guarantees here for no real benefit. We know exactly what type(s) we want to store, so IMO we should just use them directly.

@@ -602,7 +601,7 @@ pub async fn check_for_key(account: &str, kv: &KvManager) -> Result<bool, UserEr
/// Checks the request limit
pub async fn request_limit_check(
rpc: &LegacyRpcMethods<EntropyConfig>,
kv_store: &KvManager,
app_state: &AppState,
@HCastano (Collaborator):

I'd try to limit the scope of what we pass in here. We don't need the whole of AppState, just the cache.

@JesseAbram (Member, Author):

Currently all the helper functions are on AppState; #1282 should handle that by creating a sub-impl for the cache.

@@ -429,6 +440,84 @@ async fn signature_request_with_derived_account_works() {
clean_tests();
}

#[tokio::test]
#[serial]
async fn signature_request_overload() {
@HCastano (Collaborator):

What are the actual assertions needed for this test? I don't see anything (either success or failure)


@HCastano (Collaborator) commented Feb 5, 2025

@JesseAbram

@dvdplm suggested that we can probably get away with a non-poisoning lock here instead of explicitly ignoring the poisoned lock. parking_lot::RwLock fits the description here, so maybe consider using that?
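For reference, a minimal sketch of the parking_lot variant (assuming the parking_lot crate as a dependency):

```rust
use std::collections::HashMap;

use parking_lot::RwLock;

struct Cache {
    inner: RwLock<HashMap<String, Vec<u8>>>,
}

impl Cache {
    // parking_lot locks never poison: `write()` returns the guard directly,
    // so there is no Result to handle and no clear_poison step.
    fn write(&self, key: String, value: Vec<u8>) {
        self.inner.write().insert(key, value);
    }

    fn read(&self, key: &str) -> Option<Vec<u8>> {
        self.inner.read().get(key).cloned()
    }
}
```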

@JesseAbram (Member, Author):
@JesseAbram

@dvdplm suggested that we can probably get away with a non-poisoning lock here instead of explicitly ignoring the poisoned lock. parking_lot::RwLock fits the description here, so maybe consider using that?

From the docs: "No poisoning, the lock is released normally on panic." That doesn't help the initial concern of the data not being updated... which, again, is mostly being off by one, which isn't a big deal - but locking up the signing flow is. Anyway, I can look into it more on Monday and make an issue.

@HCastano (Collaborator) commented Feb 5, 2025

From the docs: "No poisoning, the lock is released normally on panic." That doesn't help the initial concern of the data not being updated... which, again, is mostly being off by one, which isn't a big deal - but locking up the signing flow is. Anyway, I can look into it more on Monday and make an issue.

One of the benefits here is that we lean into the fact that we're not offering any guarantees around a non-poisoned lock state - as opposed to hiding that from the caller by clearing the poisoned state internally

@JesseAbram (Member, Author):

From the docs: "No poisoning, the lock is released normally on panic." That doesn't help the initial concern of the data not being updated... which, again, is mostly being off by one, which isn't a big deal - but locking up the signing flow is. Anyway, I can look into it more on Monday and make an issue.

One of the benefits here is that we lean into the fact that we're not offering any guarantees around a non-poisoned lock state - as opposed to hiding that from the caller by clearing the poisoned state internally

Right, but that means we either do this for all the RwLocks here or just this one? I'm writing this one; I know that a poisoned lock should not lock up the request limit, but I'm not sure about the other RwLocks in app_state. Would you rather be explicit about this or have two different RwLock libraries?

@JesseAbram changed the title from "Add Cache" to "Add Request limit cache" on Feb 5, 2025
@JesseAbram changed the title from "Add Request limit cache" to "Add request limit cache" on Feb 5, 2025
@HCastano (Collaborator) commented Feb 5, 2025

Right, but that means we either do this for all the RwLocks here or just this one? I'm writing this one; I know that a poisoned lock should not lock up the request limit, but I'm not sure about the other RwLocks in app_state. Would you rather be explicit about this or have two different RwLock libraries?

Not sure - depends on the other data. Let's leave it as is for now, and make a concrete decision in #1282?

@JesseAbram (Member, Author):

Right, but that means we either do this for all the RwLocks here or just this one? I'm writing this one; I know that a poisoned lock should not lock up the request limit, but I'm not sure about the other RwLocks in app_state. Would you rather be explicit about this or have two different RwLock libraries?

Not sure - depends on the other data. Let's leave it as is for now, and make a concrete decision in #1282?

Lol, pretty much what I was thinking. Also, to be honest, talking about this mutex problem is fun - I'd like to bring it up on our next call... just 'cause it's fun.

@JesseAbram merged commit 0fed38b into master on Feb 6, 2025 · 8 checks passed
@JesseAbram deleted the overloaded-signing-test branch on February 6, 2025 at 01:37