metadata gets added in the archive without timestamp update #30

Closed
yarikoptic opened this issue Feb 15, 2024 · 5 comments

@yarikoptic
Member

This needs to be troubleshot and possibly escalated to dandi-archive so it can be fixed there.

Now that the diff is included in the error, recent failure reports seem to always show that metadata was presumably added, e.g.:

2024-02-13T01:53:01-0500 [ERROR   ] backups2datalad: Job failed on input <Dandiset 000344/draft>:                                                                                                                                                                  
  + Exception Group Traceback (most recent call last):                                                                                                                                                                                                             
  |   File "/home/dandi/miniconda3/envs/dandisets-2/lib/python3.10/site-packages/backups2datalad/aioutil.py", line 176, in dowork                                                                                                                                  
  |     outp = await func(inp)                                                                                                                                                                                                                                     
  |   File "/home/dandi/miniconda3/envs/dandisets-2/lib/python3.10/site-packages/backups2datalad/datasetter.py", line 159, in update_dandiset                                                                                                                      
  |     changed = await self.sync_dataset(                                                                                                                                                                                                                         
  |   File "/home/dandi/miniconda3/envs/dandisets-2/lib/python3.10/site-packages/backups2datalad/datasetter.py", line 203, in sync_dataset                                                                                                                         
  |     await syncer.sync_assets()                                                                                                                                                                                                                                 
  |   File "/home/dandi/miniconda3/envs/dandisets-2/lib/python3.10/site-packages/backups2datalad/syncer.py", line 77, in sync_assets                                                                                                                               
  |     report = await async_assets(                                                                                                                                                                                                                               
  |   File "/home/dandi/miniconda3/envs/dandisets-2/lib/python3.10/site-packages/backups2datalad/asyncer.py", line 499, in async_assets                                                                                                                            
  |     async with (                                                                                                                                                                                                                                               
  |   File "/home/dandi/miniconda3/envs/dandisets-2/lib/python3.10/site-packages/anyio/_backends/_asyncio.py", line 664, in __aexit__                                                                                                                              
  |     raise BaseExceptionGroup(                                                                                                                                                                                                                                  
  | exceptiongroup.ExceptionGroup: unhandled errors in a TaskGroup (1 sub-exception)                                                                                                                                                                               
  +-+---------------- 1 ----------------                                                                                                                                                                                                                           
    | Traceback (most recent call last):                                                                                                                                                                                                                           
    |   File "/home/dandi/miniconda3/envs/dandisets-2/lib/python3.10/site-packages/backups2datalad/asyncer.py", line 257, in process_blob                                                                                                                          
    |     raise UnexpectedChangeError(                                                                                                                                                                                                                             
    | backups2datalad.util.UnexpectedChangeError: Dandiset 000344: Metadata for asset README.md was changed/added but draft timestamp was not updated on server:                                                                                                   
    |                                                                                                                                                                                                                                                              
    | Metadata diff:                                                                                                                                                                                                                                               
    |                                                                                                                                                                                                                                                              
    | --- old-metadata                                                                                                                                                                                                                                             
    | +++ new-metadata                                                                                                                                                                                                                                             
    | @@ -1,2 +1,37 @@                                                                                                                                                                                                                                             
    | -null                                                                                                                                                                                                                                                        
    | -...                                                                                                                                                                                                                                                         
    | +asset_id: 4ec64802-d979-4917-b7f9-c025c9aeecb1                                                                                                                                                                                                              
    | +blob: bc2f5c33-6022-4249-807b-a6623793d3cf                                                                                                                                                                                                                  
    | +created: '2022-10-10T19:16:58.431070+00:00'                                                                                                                                                                                                                 
    | +metadata:                                                                                                                                                                                                                                                   
    | +  '@context': https://raw.githubusercontent.com/dandi/schema/master/releases/0.6.3/context.json                                                                                                                                                             
    | +  access:                                                                                                                                                                                                                                                   
    | +  - schemaKey: AccessRequirements                                                                                                                                                                                                                           
    | +    status: dandi:OpenAccess                                                                                                                                                                                                                                
    | +  blobDateModified: '2022-10-10T15:15:28.417630-04:00'                                                                                                                                                                                                      
    | +  contentSize: 121                                                                                                                                                                                                                                          
    | +  contentUrl:                                                                                                                             

and so on. 
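
For readers unfamiliar with this kind of report, here is a minimal sketch of how an old/new metadata diff with the same `old-metadata` / `new-metadata` headers could be produced. It is not the backups2datalad implementation: the helper name `metadata_diff` is hypothetical, and it serializes with the standard-library `json` module, whereas the report above appears to be YAML.

```python
import difflib
import json


def metadata_diff(old, new):
    """Return a unified diff between two metadata values (hypothetical helper).

    Assumption: metadata is JSON-serializable; a missing record is None,
    which serializes to the single line "null".
    """
    old_lines = json.dumps(old, indent=2, sort_keys=True).splitlines()
    new_lines = json.dumps(new, indent=2, sort_keys=True).splitlines()
    return "\n".join(
        difflib.unified_diff(
            old_lines,
            new_lines,
            fromfile="old-metadata",
            tofile="new-metadata",
            lineterm="",  # lines carry no newline, so none is added to headers
        )
    )


if __name__ == "__main__":
    # Placeholder record, not real Archive data.
    print(metadata_diff(None, {"asset_id": "example-asset-id", "contentSize": 121}))
```

An asset that was never backed up shows up as a removed `null` line followed by the whole new record as additions, which matches the shape of the diff in the report.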
@jwodder
Member

jwodder commented Feb 16, 2024

@yarikoptic In this specific case, I think what happened was that Dandiset 000344 was originally backed up prior to #21 being merged, and so the README.md asset couldn't be downloaded (but the rest of the assets were committed successfully? Not sure how that happened). Now that we're running backups2datalad again after #21, the program sees that README.md is in the Dandiset on the Archive but not in the backup, and so it treats it as a new asset, but since the Dandiset's modified timestamp hasn't changed, we get an error.

You can fix the backup by running backups2datalad update-from-backup with the --mode force option.
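
The failure mode described above can be sketched roughly as follows. This is an illustration of the reasoning only, not the actual backups2datalad code: `BackupState` and `check_asset` are hypothetical names, and the local `UnexpectedChangeError` is a stand-in for the exception seen in the traceback.

```python
from dataclasses import dataclass, field
from datetime import datetime, timedelta, timezone


class UnexpectedChangeError(Exception):
    """Stand-in for the exception named in the traceback."""


@dataclass
class BackupState:
    # Hypothetical record of what the previous backup run stored.
    dandiset_modified: datetime
    asset_metadata: dict = field(default_factory=dict)  # asset path -> metadata


def check_asset(path, remote_metadata, remote_modified, state):
    """Return True if the asset must be written to the backup.

    Raises UnexpectedChangeError when the asset differs from the backup
    even though the Dandiset's modified timestamp has not advanced.
    """
    saved = state.asset_metadata.get(path)  # None if never backed up
    if saved == remote_metadata:
        return False  # backup already matches the Archive
    if remote_modified <= state.dandiset_modified:
        raise UnexpectedChangeError(
            f"Metadata for asset {path} was changed/added "
            "but draft timestamp was not updated on server"
        )
    return True


if __name__ == "__main__":
    t = datetime(2024, 2, 13, tzinfo=timezone.utc)
    state = BackupState(dandiset_modified=t)  # README.md missing from backup
    # A later timestamp makes the new asset an ordinary update.
    print(check_asset("README.md", {"contentSize": 121}, t + timedelta(days=1), state))
```

Under these assumptions, an asset that an earlier run failed to download looks like a newly added one, and an unchanged Dandiset timestamp turns that into an error rather than a normal update.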

@yarikoptic
Member Author

Thank you! I can confirm that 000344 and 000729, for which we currently have similar errors, are both embargoed. Rerunning with --mode force for them to see if that resolves it.

@yarikoptic
Member Author

I have rerun with --mode verify and it came out clean. Good. Let's consider this addressed.

@yarikoptic
Member Author

This keeps happening to other dandisets (now 000731), I guess for the same reason. I might just need to rerun with --mode force for all embargoed dandisets.

@yarikoptic
Member Author

Update: FWIW, I did that, but it seems it didn't do anything (no new commit to the superdataset):

> backups2datalad -l WARNING --backup-root /mnt/backup/dandi --config tools/backups2datalad.cfg.yaml update-from-backup --workers 5 -e '000(026|108|243)$' --mode force 000067 000041 000044 000061 000056 000126 000053 000054 000027 000020 000115 000023 000109 000142 000008 000035 000034 000165 000045 000206 000127 000128 000130 000138 000139 000140 000166 000121 000217 000004 000005 000006 000007 000009 000010 000011 000015 000013 000019 000055 000213 000218 000221 000230 000292 000293 000296 000223 000231 000173 000064 000147 000398 000350 000233 000039 000049 000454 000402 000238 000235 000237 000236 000447 000252 000462 000473 000481 000448 000483 000249 000544 000540 000036 000489 000548 000549 000550 000209 000535 000207 000465 000554 000491 000114 000488 000404 000226 000239 000458 000113 000449 000122 000024 000070 000071 000222 000157 000212 000219
> grep -v 'nothing to save, working tree clean'
> git pull --commit --no-edit
> datalad push -J 5
dandi@drogon:/mnt/backup/dandi/dandisets/000731$
dandi@drogon:/mnt/backup/dandi/dandisets/000731$
dandi@drogon:/mnt/backup/dandi/dandisets/000731$ cd ..
dandi@drogon:/mnt/backup/dandi/dandisets$ git show --stat
commit b651695ce45c9d0a269e85dec4c45b87aa238d2d (HEAD -> draft, github/draft)
Author: DANDI Team <[email protected]>
Date:   Mon Feb 19 14:38:45 2024 -0500

    1 updated (000731)

    000731:
     - [backups2datalad] 4 files added

 000731 | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

which should be all OK, as long as we do not get new reports of this kind.
