Update bookmark logic for transactions and order_refunds #197

Open
wants to merge 3 commits into
base: master

prijendev
Contributor

@prijendev prijendev commented Feb 3, 2025

Description of change

  • Update bookmark logic for transactions and order_refunds. More details can be found here.

To synchronize transactions, we first fetch recent order records based on the transaction_orders bookmark. We then make API calls for each fetched order ID to retrieve the associated transaction records. The transaction_orders bookmark is based on the updated_at value of the order records, while the transactions bookmark uses the created_at value from the transaction records as the replication key. At the end of extraction, we update the bookmarks with the maximum replication key values encountered.
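The flow described above can be sketched roughly as follows. This is a simplified illustration, not the tap's actual code: the function name `sync_transactions`, the flat `state` dictionary, and the in-memory `orders` / `transactions_by_order` inputs are all assumptions standing in for the real API calls and state layout.

```python
from datetime import datetime


def parse(ts: str) -> datetime:
    """Parse an ISO-8601 timestamp; a trailing 'Z' is treated as UTC."""
    return datetime.fromisoformat(ts.replace("Z", "+00:00"))


def sync_transactions(orders, transactions_by_order, state):
    """Pre-fix behaviour (hypothetical sketch): two independent bookmarks.

    orders: order dicts with "id" and "updated_at".
    transactions_by_order: maps an order id to its transaction dicts,
        standing in for the per-order API call.
    state: flat dict with "transaction_orders" and "transactions" bookmarks.
    """
    order_bookmark = parse(state["transaction_orders"])
    txn_bookmark = parse(state["transactions"])
    max_order, max_txn = order_bookmark, txn_bookmark
    emitted = []

    for order in orders:
        updated = parse(order["updated_at"])
        if updated < order_bookmark:
            continue  # order is outside the parent window
        max_order = max(max_order, updated)

        for txn in transactions_by_order.get(order["id"], []):
            created = parse(txn["created_at"])
            if created >= txn_bookmark:  # child filter uses its own bookmark
                emitted.append(txn)
                max_txn = max(max_txn, created)

    # Both bookmarks advance to the maximum replication key value seen.
    new_state = {
        "transaction_orders": max_order.isoformat(),
        "transactions": max_txn.isoformat(),
    }
    return emitted, new_state
```

Because the two bookmarks advance independently, the child bookmark can move past the end of the parent window.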

The extraction began on 2024-11-29T11:02, using the following state:

{ "bookmarks": { "transactions": { "created_at": "2024-11-29T10:33:55.000000Z", "transaction_orders": { "updated_at": "2024-11-29T10:33:36.000000Z" } } } }
The API call was made to fetch orders within the date range 2024-11-29T10:33:36+00:00 to 2024-11-29T11:05:24+00:00. However, the customer's updated order record had an updated_at value of 2024-11-29T11:06:49, which fell outside the current extraction window and was therefore missed.

At 11:05:27, all eligible orders had been fetched, and API calls began to retrieve the transactions for each order. At 11:07:30, an API call was made for order X, which by then had a new transaction created at 2024-11-29T11:06:52. This transaction record was fetched successfully, and since its created_at value was the maximum seen so far, the transactions bookmark was updated accordingly. The resulting bookmarks were:

{ "bookmarks": { "transactions": { "created_at": "2024-11-29T11:06:52.000000Z", "transaction_orders": { "updated_at": "2024-11-29T11:05:15.000000Z" } } } }

In the next extraction window, orders were fetched within the range 2024-11-29T11:05:15 to 2024-11-29T11:35:36. This included the expected order ID 5677413597302 and its transactions. However, one transaction from this order had a created_at value of 2024-11-29T11:06:49, which was less than the transactions bookmark of 2024-11-29T11:06:52. As a result, this transaction was ignored and not written to the output, even though it was fetched from the API.
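The skipped record can be reproduced with the timestamps quoted above. This is a minimal sketch: the helper names `parse` and `is_emitted` are hypothetical, and the transaction's created_at is assumed to be in UTC, since the description gives it without an offset.

```python
from datetime import datetime


def parse(ts: str) -> datetime:
    """Parse an ISO-8601 timestamp; a trailing 'Z' is treated as UTC."""
    return datetime.fromisoformat(ts.replace("Z", "+00:00"))


def is_emitted(record: dict, bookmark: datetime) -> bool:
    """A record is written only if its replication key is at or after the bookmark."""
    return parse(record["created_at"]) >= bookmark


# Bookmark left behind by the first extraction.
transactions_bookmark = parse("2024-11-29T11:06:52.000000Z")

# Transaction fetched in the second extraction (UTC assumed).
late_transaction = {
    "order_id": 5677413597302,
    "created_at": "2024-11-29T11:06:49Z",
}

skipped = not is_emitted(late_transaction, transactions_bookmark)
print(skipped)  # True: 11:06:49 is earlier than the 11:06:52 bookmark
```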

To resolve this issue, we use the transaction_orders bookmark (parent) as the bookmark for the transactions stream (child). This approach ensures consistency between the parent and child extraction windows and prevents valid transactions from being skipped.
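One way to picture the fix is to filter child records against the parent bookmark instead of a separate child bookmark. The sketch below is an illustration under assumptions, not the code in this PR: `emit_transactions` and the `id` labels are hypothetical, and timestamps are assumed to be UTC.

```python
from datetime import datetime


def parse(ts: str) -> datetime:
    """Parse an ISO-8601 timestamp; a trailing 'Z' is treated as UTC."""
    return datetime.fromisoformat(ts.replace("Z", "+00:00"))


def emit_transactions(transactions, parent_bookmark: str):
    """Keep child records created at or after the parent (transaction_orders) bookmark."""
    cutoff = parse(parent_bookmark)
    return [t for t in transactions if parse(t["created_at"]) >= cutoff]


# Parent bookmark at the start of the second extraction window.
parent_bookmark = "2024-11-29T11:05:15.000000Z"

# Transactions returned for the order; "late" and "seen" are illustrative labels.
fetched = [
    {"id": "late", "created_at": "2024-11-29T11:06:49Z"},
    {"id": "seen", "created_at": "2024-11-29T11:06:52Z"},
]

emitted_ids = [t["id"] for t in emit_transactions(fetched, parent_bookmark)]
print(emitted_ids)  # ['late', 'seen']
```

In this sketch the 11:06:52 transaction is emitted a second time. How the tap or the target handles such repeats is not stated in this description.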

QA steps

  • Resync the transactions and order_refunds streams and verify that no records are missing.
  • Sync the tap with all streams and verify that the bookmarks of the other streams are unchanged.

Risks

Rollback steps

  • revert this branch

AI generated code

https://internal.qlik.dev/general/ways-of-working/code-reviews/#guidelines-for-ai-generated-code

  • this PR has been written with the help of GitHub Copilot or another generative AI tool

@prijendev prijendev changed the title Update bookmark logic Update bookmark logic for transactions and order_refunds Feb 3, 2025
@prijendev prijendev marked this pull request as ready for review February 3, 2025 09:10