Skip to content
This repository has been archived by the owner on Mar 15, 2023. It is now read-only.

Refactor dict_deep_update function. #482

Open
4 tasks
h-mayorquin opened this issue Apr 26, 2022 · 1 comment
Open
4 tasks

Refactor dict_deep_update function. #482

h-mayorquin opened this issue Apr 26, 2022 · 1 comment
Assignees
Labels
low priority Not currently essential for any projects but would be great for a contributor to work on refactoring writing for the long term

Comments

@h-mayorquin
Copy link
Collaborator

While working on PR #481 me and @CodyCBakerPhD were discussing about the poor variable name in this very important function. Right now, we have:

d: dict
dictionary to update
u: dict
dictionary to update from

Which go against the design principle that variable names should be as meaningful as possible so they work as documentation themselves.

Right now PR #481 has modified the internal names of the function but we are leaving the keywords arguments untouched as that might break some script that rely on this. I am writing this issue as a record of intent for modifying this in the future, propose some break down of the tasks to do, write some other concerns that I have and general discussion.

Proposal for to-do:

  • Do a PR changing the keyword arguments but also adding logic so the previous keyword arguments still work but throw a deprecation warning.
  • The auxiliary function append_replace_dict_in_list suffers from the sames issues and should be refactored as well.
  • Moreover, the append_replace_dict_in_list seems to be actually doing more than one thing, so it should be break down for the different actions that it actually performs.
  • Modify the unit_tests so they reflect some use cases with metadata or source_schemas so they work better as functional documentation.

I am also in general concerned about all the functionality that we are adding to the function. We have many parameters that change the behavior of the function:

  1. list_dict_deep_update (A boolean) for example, controls whether lists are appended/extended or just overwritten.
  2. compare_key (a string that equals "name") controls where properties in our metadata have to match so they are updated.

I checked in our library and as far as I could find we don't really use the function without the defaults. My impression is that we are keeping this for backwards compatibility. Given that the inclusion of this parameters account for a large chunk of the complexity of this function and makes a central part of our code base more difficult to understand and modify I think we should also start thinking on dropping them (with the usual warning period of course).

Am I missing something?

@h-mayorquin h-mayorquin added low priority Not currently essential for any projects but would be great for a contributor to work on refactoring writing for the long term labels Apr 26, 2022
@h-mayorquin h-mayorquin self-assigned this Apr 26, 2022
@CodyCBakerPhD
Copy link
Member

Task list sounds fine. Deprecation to removal should probably be on a 6-month schedule from time of merge.

Moreover, the append_replace_dict_in_list seems to be actually doing more than one thing, so it should be break down for the different actions that it actually performs.

I'd do that in a separate PR

Modify the unit_tests so they reflect some use cases with metadata or source_schemas so they work better as functional documentation.

Never a bad thing to have more explicit tests.

I checked in our library and as far as I could find we don't really use the function without the defaults. My impression is that we are keeping this for backwards compatibility. Given that the inclusion of this parameters account for a large chunk of the complexity of this function and makes a central part of our code base more difficult to understand and modify I think we should also start thinking on dropping them (with the usual warning period of course).

Am I missing something?

I know it's a complicated function, but it's also a very important one that has undergone a lot of work over the past years to allow it to handle all that extra functionality. I don't think we should reduce the amount of things it's capable of, but we can certainly adjust any internal code to be more understandable.

The current state of the function does all the things that have been needed in the past, which is not specific to our local usage within NWBCT. In fact, most of the changes have been needed by several past conversion projects in order to properly update their specific, highly unique structure. Unit tests in https://github.com/catalystneuro/nwb-conversion-tools/blob/main/tests/test_internals/test_json_schema_utils.py were usually updated with minimal examples showcasing the changes.

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
low priority Not currently essential for any projects but would be great for a contributor to work on refactoring writing for the long term
Projects
None yet
Development

No branches or pull requests

2 participants