Mint: Add Acquired<T> type for compile-time row lock enforcement #1446

Merged
crodas merged 7 commits from feature/improve-database-layer-and-expectations into main 2025-12-28 13:50:52 +00:00
crodas commented 2025-12-18 03:51:24 +00:00 (Migrated from github.com)

Description

Fixes #1404 and #1066

Add a LockedRows tracking system to enforce that resources must be read (and locked) before being modified within a transaction. This prevents race conditions and ensures data consistency.

Key changes:

  • Add locked_row module with RowId enum and LockedRows tracker
  • Change add_mint_quote to return the quote and lock it
  • Change increment_mint_quote_amount_paid to take quote by value instead of quote_id, returning the updated quote
  • Change increment_mint_quote_amount_issued to take quote by value instead of quote_id, returning the updated quote
  • Add get_proofs_states to ProofsTransaction trait
  • Auto-lock quotes when fetched via get_mint_quote, get_melt_quote, and related lookup methods
  • Add state transition checks before updating proof states in melt and swap sagas

This ensures the database layer can reject modifications to resources that weren't properly read first, forcing callers to always work with fresh data from the database.


Notes to the reviewers


Suggested CHANGELOG Updates

CHANGED

ADDED

REMOVED

FIXED


Checklist

### Description Fixes #1404 and #1066 Add a LockedRows tracking system to enforce that resources must be read (and locked) before being modified within a transaction. This prevents race conditions and ensures data consistency. Key changes: - Add locked_row module with RowId enum and LockedRows tracker - Change add_mint_quote to return the quote and lock it - Change increment_mint_quote_amount_paid to take quote by value instead of quote_id, returning the updated quote - Change increment_mint_quote_amount_issued to take quote by value instead of quote_id, returning the updated quote - Add get_proofs_states to ProofsTransaction trait - Auto-lock quotes when fetched via get_mint_quote, get_melt_quote, and related lookup methods - Add state transition checks before updating proof states in melt and swap sagas This ensures the database layer can reject modifications to resources that weren't properly read first, forcing callers to always work with fresh data from the database. ----- ### Notes to the reviewers <!-- In this section you can include notes directed to the reviewers, like explaining why some parts of the PR were done in a specific way --> ----- ### Suggested [CHANGELOG](https://github.com/cashubtc/cdk/blob/main/CHANGELOG.md) Updates <!-- Please do not edit the actual changelog but note what you changed here. --> #### CHANGED #### ADDED #### REMOVED #### FIXED ---- ### Checklist * [x] I followed the [code style guidelines](https://github.com/cashubtc/cdk/blob/main/CODE_STYLE.md) * [x] I ran `just final-check` before committing
thesimplekid commented 2025-12-18 14:40:26 +00:00 (Migrated from github.com)

This seems to introduce unnecessary complexity by implementing locking at the application level when we should rely on the locking mechanisms the database already provides. I also think this introduces redundant database calls that could have a performance impact. We would be better served by having more test coverage to ensure we are using the database locking primitives correctly and that there are no race conditions, rather than implementing our own locking in addition to the database locking.

I do like the API change to pass quotes by value rather than by ID.

This seems to introduce unnecessary complexity by implementing locking at the application level when we should rely on the locking mechanisms the database already provides. I also think this introduces redundant database calls that could have a performance impact. We would be better served by having more test coverage to ensure we are using the database locking primitives correctly and that there are no race conditions, rather than implementing our own locking in addition to the database locking. I do like the API change to pass quotes by value rather than by ID.
crodas commented 2025-12-18 14:56:52 +00:00 (Migrated from github.com)

This seems to introduce unnecessary complexity by implementing locking at the application level when we should rely on the locking mechanisms the database already provides. I also think this introduces redundant database calls that could have a performance impact. We would be better served by having more test coverage to ensure we are using the database locking primitives correctly and that there are no race conditions, rather than implementing our own locking in addition to the database locking.

That is not happening in this PR. Instead, I introduce a mechanism for the database to track locked resources. This PR tracks local locked resources.

The entire code could be ignored in release builds. This code primarily serves as a tool to let developers know whenever they are modifying a potentially stale resource. That's why it forces the developer to read and lock it before updating it.

I do like the API change to pass quotes by value rather than by ID.

Great!

> This seems to introduce unnecessary complexity by implementing locking at the application level when we should rely on the locking mechanisms the database already provides. I also think this introduces redundant database calls that could have a performance impact. We would be better served by having more test coverage to ensure we are using the database locking primitives correctly and that there are no race conditions, rather than implementing our own locking in addition to the database locking. That is not happening in this PR. Instead, I introduce a mechanism for the database to track locked resources. This PR tracks local locked resources. The entire code could be ignored in release builds. This code primarily serves as a tool to let developers know whenever they are modifying a potentially stale resource. That's why it forces the developer to read and lock it before updating it. > I do like the API change to pass quotes by value rather than by ID. Great!
thesimplekid commented 2025-12-18 15:08:18 +00:00 (Migrated from github.com)

The entire code could be ignored in release builds.

Then it should be in tests instead of adding complexity to the code base.

> The entire code could be ignored in release builds. Then it should be in tests instead of adding complexity to the code base.
thesimplekid (Migrated from github.com) reviewed 2025-12-21 11:34:03 +00:00
thesimplekid (Migrated from github.com) commented 2025-12-21 11:31:58 +00:00

Why was this test removed and there a similar test elsewhere, as this is something that should be tests?

Why was this test removed and there a similar test elsewhere, as this is something that should be tests?
crodas commented 2025-12-22 03:53:06 +00:00 (Migrated from github.com)

What about now, @thesimplekid ? In 3ed143c0573229c665e9f8af845ee76f6c3f0a1f, I introduced a Rust generic type for reserved-for-update records from the database.

I will create a follow-up PR if this gets accepted, which would do the same, but for proofs. That should eliminate all race conditions, since it would shift responsibility for synchronization and concurrency management to the storage layer.

What about now, @thesimplekid ? In 3ed143c0573229c665e9f8af845ee76f6c3f0a1f, I introduced a Rust generic type for reserved-for-update records from the database. I will create a follow-up PR if this gets accepted, which would do the same, but for proofs. That should eliminate all race conditions, since it would shift responsibility for synchronization and concurrency management to the storage layer.
thesimplekid commented 2025-12-22 09:14:30 +00:00 (Migrated from github.com)

That should eliminate all race conditions

I don't think we currently have any race conditions that this eliminates. It's to protect us in the future from introducing race conditions by making the api more strict.

it would shift responsibility for synchronization and concurrency management to the storage layer.

This is already the case.

> That should eliminate all race conditions I don't think we currently have any race conditions that this eliminates. It's to protect us in the future from introducing race conditions by making the api more strict. > it would shift responsibility for synchronization and concurrency management to the storage layer. This is already the case.
thesimplekid commented 2025-12-22 09:19:47 +00:00 (Migrated from github.com)

What about now, @thesimplekid ? In 3ed143c, I introduced a Rust generic type for reserved-for-update records from the database.

I think this api change is much better then the runtime check before. I'm traveling for the holidays so might take a few days to really review but think the overall approach is sound.

> What about now, @thesimplekid ? In 3ed143c, I introduced a Rust generic type for reserved-for-update records from the database. I think this api change is much better then the runtime check before. I'm traveling for the holidays so might take a few days to really review but think the overall approach is sound.
thesimplekid (Migrated from github.com) reviewed 2025-12-22 10:01:40 +00:00
thesimplekid (Migrated from github.com) commented 2025-12-22 09:56:28 +00:00

I think the increment should not happen here but in the mint and we change the db fn from increment to set amount paid.

I think the increment should not happen here but in the mint and we change the db fn from increment to set amount paid.
@ -1000,3 +1060,2 @@
#[instrument(skip(self))]
async fn increment_mint_quote_amount_paid(
async fn update_mint_quote(
thesimplekid (Migrated from github.com) commented 2025-12-22 09:57:04 +00:00

Same comment as amount paid

Same comment as amount paid
thesimplekid (Migrated from github.com) commented 2025-12-22 09:58:21 +00:00

Another option would be just to remove these two fns for a more general update_mint_quote fn that can do both.

Another option would be just to remove these two fns for a more general update_mint_quote fn that can do both.
crodas (Migrated from github.com) reviewed 2025-12-23 14:48:28 +00:00
crodas (Migrated from github.com) commented 2025-12-23 14:48:27 +00:00

I think we test in the mint codebase, and we agreed to move this test off the database.

I think we test in the mint codebase, and we agreed to move this test off the database.
thesimplekid (Migrated from github.com) reviewed 2025-12-23 16:29:36 +00:00
thesimplekid (Migrated from github.com) commented 2025-12-23 16:29:16 +00:00

I don't think the change-set pattern is appropriate here. The database must be the synchronization point for duplicate detection in multi-instance deployments. In-memory validation creates:

  • Inconsistent error types (domain vs DB errors)
  • False sense of correctness (passes in-memory, fails at DB)
I don't think the change-set pattern is appropriate here. The database must be the synchronization point for duplicate detection in multi-instance deployments. In-memory validation creates: - Inconsistent error types (domain vs DB errors) - False sense of correctness (passes in-memory, fails at DB)
@ -1357,2 +1239,4 @@
let current_time = unix_time();
quote.paid_time = Some(current_time);
quote.payment_preimage = payment_proof.clone();
query(r#"UPDATE melt_quote SET state = :state, paid_time = :paid_time, payment_preimage = :payment_preimage WHERE id = :id"#)?
thesimplekid (Migrated from github.com) commented 2025-12-23 16:21:57 +00:00

Removing this check and relying on the in-memory duplicate check in add_payment() breaks multi-instance deployments. If two instances load the same quote, both will pass the in-memory check, but one will fail here with a generic DB constraint error instead of DuplicatePaymentId. The callers in ln.rs:92 and mod.rs:761 explicitly catch DuplicatePaymentId to handle this race gracefully that no longer works.

Removing this check and relying on the in-memory duplicate check in add_payment() breaks multi-instance deployments. If two instances load the same quote, both will pass the in-memory check, but one will fail here with a generic DB constraint error instead of DuplicatePaymentId. The callers in ln.rs:92 and mod.rs:761 explicitly catch DuplicatePaymentId to handle this race gracefully that no longer works.
thesimplekid (Migrated from github.com) commented 2025-12-23 16:24:07 +00:00

This DuplicatePaymentId catch only works for single-instance races (caught by add_payment() in-memory check). In multi-instance deployments, both instances pass add_payment(), then one fails at update_mint_quote() with a DB constraint error which is not DuplicatePaymentId.

This DuplicatePaymentId catch only works for single-instance races (caught by add_payment() in-memory check). In multi-instance deployments, both instances pass add_payment(), then one fails at update_mint_quote() with a DB constraint error which is not DuplicatePaymentId.
thesimplekid (Migrated from github.com) commented 2025-12-23 16:25:06 +00:00

This has the same multi instance issue.

This has the same multi instance issue.
crodas (Migrated from github.com) reviewed 2025-12-23 17:01:37 +00:00
@ -1357,2 +1239,4 @@
let current_time = unix_time();
quote.paid_time = Some(current_time);
quote.payment_preimage = payment_proof.clone();
query(r#"UPDATE melt_quote SET state = :state, paid_time = :paid_time, payment_preimage = :payment_preimage WHERE id = :id"#)?
crodas (Migrated from github.com) commented 2025-12-23 17:01:36 +00:00

We are not, remember the quote is read as exclusive, signaling any other instance of this intent to update.

Also, the INSERT would fail on duplicate entry.

We are not, remember the quote is read as exclusive, signaling any other instance of this intent to update. Also, the INSERT would fail on duplicate entry.
crodas (Migrated from github.com) reviewed 2025-12-23 17:02:11 +00:00
crodas (Migrated from github.com) commented 2025-12-23 17:02:11 +00:00

Keep in mind this is for Acquired not MintQuote.

Keep in mind this is for Acquired<MintQuote> not MintQuote.
thesimplekid (Migrated from github.com) reviewed 2025-12-23 17:12:01 +00:00
@ -1357,2 +1239,4 @@
let current_time = unix_time();
quote.paid_time = Some(current_time);
quote.payment_preimage = payment_proof.clone();
query(r#"UPDATE melt_quote SET state = :state, paid_time = :paid_time, payment_preimage = :payment_preimage WHERE id = :id"#)?
thesimplekid (Migrated from github.com) commented 2025-12-23 17:12:01 +00:00

Also, the INSERT would fail on duplicate entry.

With a different error. But yeah you're right using the for update we shouldn't hit this.

> Also, the INSERT would fail on duplicate entry. With a different error. But yeah you're right using the for update we shouldn't hit this.
thesimplekid (Migrated from github.com) reviewed 2025-12-23 17:14:29 +00:00
thesimplekid (Migrated from github.com) commented 2025-12-23 17:14:29 +00:00

This comment also is incorrect and using for update makes it work.

This comment also is incorrect and using for update makes it work.
crodas (Migrated from github.com) reviewed 2025-12-23 17:50:09 +00:00
@ -1357,2 +1239,4 @@
let current_time = unix_time();
quote.paid_time = Some(current_time);
quote.payment_preimage = payment_proof.clone();
query(r#"UPDATE melt_quote SET state = :state, paid_time = :paid_time, payment_preimage = :payment_preimage WHERE id = :id"#)?
crodas (Migrated from github.com) commented 2025-12-23 17:50:09 +00:00

I would make it use the same error

I would make it use the same error
thesimplekid (Migrated from github.com) reviewed 2025-12-27 17:21:38 +00:00
thesimplekid (Migrated from github.com) commented 2025-12-27 17:11:59 +00:00

Can you link it? When removing tests especially an important one we should be careful that is intentional and there is the expected test converge and we know where it is.

Can you link it? When removing tests especially an important one we should be careful that is intentional and there is the expected test converge and we know where it is.
@ -241,6 +244,79 @@ pub async fn process_melt_change(
Ok((Some(change_sigs), tx))
thesimplekid (Migrated from github.com) commented 2025-12-27 17:04:14 +00:00

We should log that its locked.

We should log that its locked.
thesimplekid (Migrated from github.com) commented 2025-12-27 17:05:32 +00:00

I think we're over using inline in general and here specifically its an async fn with multiple db calls I don't think there is a benefit to inline that.

```suggestion ``` I think we're over using inline in general and here specifically its an async fn with multiple db calls I don't think there is a benefit to inline that.
crodas commented 2025-12-27 17:28:01 +00:00 (Migrated from github.com)

Can you link it? When removing tests especially an important one we should be careful that is intentional and there is the expected test converge and we know where it is.

You are right, I will add unit tests for check_state_transition in crates/cdk-common/src/state.rs

> Can you link it? When removing tests especially an important one we should be careful that is intentional and there is the expected test converge and we know where it is. You are right, I will add unit tests for `check_state_transition` in `crates/cdk-common/src/state.rs`
crodas (Migrated from github.com) reviewed 2025-12-28 01:48:07 +00:00
@ -241,6 +244,79 @@ pub async fn process_melt_change(
Ok((Some(change_sigs), tx))
crodas (Migrated from github.com) commented 2025-12-28 01:48:06 +00:00

Fixed in 9f5b58eb95

Fixed in 9f5b58eb95e26a381103b20f13ae8db25cbcc9c1
crodas commented 2025-12-28 01:48:53 +00:00 (Migrated from github.com)

Can you link it? When removing tests especially an important one we should be careful that is intentional and there is the expected test converge and we know where it is.

7ca6c92cce

> Can you link it? When removing tests especially an important one we should be careful that is intentional and there is the expected test converge and we know where it is. 7ca6c92cce304b25ab0d2382559fc7f967f17bfa
thesimplekid (Migrated from github.com) approved these changes 2025-12-28 13:45:46 +00:00
Sign in to join this conversation.
No reviewers
No milestone
No project
No assignees
1 participant
Notifications
Due date
The due date is invalid or out of range. Please use the format "yyyy-mm-dd".

No due date set.

Dependencies

No dependencies set.

Reference
cashubtc/cdk!1446
No description provided.