Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Problems when updating items for continuous ingestion #475

Open
KlausIllmayer opened this issue Oct 16, 2024 · 7 comments
Open

Problems when updating items for continuous ingestion #475

KlausIllmayer opened this issue Oct 16, 2024 · 7 comments
Assignees
Labels
bug Something isn't working in review

Comments

@KlausIllmayer
Copy link
Contributor

This is based upon the old discussion at https://gitlab.gwdg.de/sshoc/sshoc-marketplace-backend/-/issues/85#note_536637

Scenario: for continuous ingestion we like to call the PUT call and it should merge as explained in the above mentioned discussion

Problem: I'm not able to get to the expected behavior in the case when there is a change of the item on SSHOMP in between - this change gets lost but in my understanding it should be merged

Steps to reproduce:

  1. Created a test source as administrator: `POST /api/sources'
  2. Created a tool as System importer (aka ingestion pipeline): POST /api/tools-services using minimal data (label, description, source and sourceItemId)
  3. Approved this tool as administrator: PUT /api/tools-services/{persistent-id} and removing source and sourceItemId (as described in the mentioned disucssion)
  4. Changed the approved tool as administrator: PUT /api/tools-services/{persistent-id} and added a value in accessibleAt which did not appear by ingest pipeline
  5. Update the tool as System importer: PUT /api/tools-services/{persistent-id} and changed the description but still no value in accessibleAt

As a result of step 5 I would expect that for the new ingested version of the tool the description is changed (based on the change of the System importer) and that the accessibleAt value stays (based on the change on the SSHOMP by the administrator). But when looking on the return values the accessibleAt value is not there.

@tparkola is there a chance that you look into this behavior and into the mentioned old discussion so that you may able to tell me if I'm doing something wrong or if my expectation is wonrg or if there is a bug in the described update mechanism?

@KlausIllmayer KlausIllmayer added the bug Something isn't working label Oct 16, 2024
@tparkola
Copy link
Contributor

Can you check what is the payload for in the last PUT call (5.)? From what I see - if there is no information on the accessibleAt in this request's payload then the tool is updated to have no such values. This is most probably why you get what you described.

@KlausIllmayer
Copy link
Contributor Author

That is true, in the PUT from step 5 there is no accessibleAt present. Seems that I was a little bit unclear with my description. The expectation is, that if an update by the ingest process (= step 5) happens, that the changes that were done in the marketplace (= step 4) stay, as they were done after the last ingest (= step 2). Otherwise, we would lose the curated information from the marketplace users, which would be a bad thing. And in the linked issue I understand the comment of Michał that the different versions are handled and merged (he speaks about three versions: the last ingested, the change in marketplace and the current ingested version).
We don't like to intervene at the source of the ingest (and in most cases we can't) that they add additional information. Instead we expect, that the changes of the marketplace are handled at least on the same level as the changes at the source.
@tparkola I hope, it makes clearer what is expected?

@tparkola
Copy link
Contributor

tparkola commented Nov 5, 2024

@KlausIllmayer I added PATCH method to the /api/{category}/{persistent-id} endpoint. This should work as described in the discussion mentioned by you earlier (https://gitlab.gwdg.de/sshoc/sshoc-marketplace-backend/-/issues/85#note_536637) , i.e. when moderating you should not provide source and sourceItemId, but when ingesting then you should provide source and sourceItemId (as this is the way the ingest is identified). So now, please use/test PATCH in the ingestion pipelines and do not use PUT. A comment related to PUT: it was doing some comparison of three versions in specific scenarios, as mentioned by you, but the comparison was not proper and cause problems that you indicated. So I removed this behaviour from PUT. This way PUT does what is should - update the resource provided in the request. Now PATCH should be used in ingestion pipelines.
IMPORTANT: Because these changes are quite extensive, please also make sure to do testing of other parts of the MP, including user interface, moderation, etc.

@KlausIllmayer
Copy link
Contributor Author

I tested the PATCH endpoint by using the scenario described in the first post of this issue. But instead of using the PUT in step 5 I now used the PATCH. But I still get a result from this patch call, that does ignore the changes done on the marketplace (in this case, adding an accessibleAt-value which is my scenario not present in the source). Maybe I misunderstand something - can you try to reproduce my scenario to see, if you have the same effect?

@KlausIllmayer
Copy link
Contributor Author

Okay, I need to take back my last comment - I tested it again and now it works. Seems that last week I did something wrong, sorry.

But when testing it now I found another issue:

I take the same approach as in my very first approach which now works. But for step 5 (by ingest) I additionally added an accessibleAt-value which is different to the one that was applied in step 4 (by administrator). I expected that this will raise the "conflict-at-source" scenario, which means, that the property {"conflict-at-source": "true"} is added to the response of the PATCH call. But this was not the case. @tparkola Could it be, that this behaviour is missing for the PATCH call? If so, would it be possible to add it again (it was already in the old PUT-call implemented).

Side note: it looks like, that it also works with the current approve-workflow, where we don't remove source and sourceItemId and instead use the endpoint PUT /api/{category}/{persistentId}/revert - we need still to test this intensively, as it also means that in future edits by contributors and moderators the source and sourceItemId stays on the item. @tparkola what do you think about this approach with the revert-endpoint and do you have a clue why it works even without not removing source and sourceItemId?

@tparkola
Copy link
Contributor

tparkola commented Jan 8, 2025

I had a look at the revert - it looks like it has never taken into consideration information about source or sourceItemId as it was introduced based on #435. Does that explain everything?

@KlausIllmayer
Copy link
Contributor Author

Thanks for your answer. It explains it a little bit, but the question itself is a bit different. I reformulate it:

  • Do you think, that using the revert-endpoint is still a good way to approve items? Or will this have a negative effect on the implemented PATCH mechanism? In my understanding it could mean, that by approving items with revert (thus not removing source and sourceItemId - and you explicit told us in this comment that we should give away source and sourceItemId when moderating/approving) the PATCH will give wrong results as it can't differ between the version of the item that comes from the source and the versions that where updated on the MP.

And there was also a second question in my last comment, about missing the conflict-at-source-property. Could you kindly elaborate also on this question?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working in review
Projects
None yet
Development

No branches or pull requests

2 participants