-
Notifications
You must be signed in to change notification settings - Fork 0
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
33 zooma endp #1
base: main
Are you sure you want to change the base?
Conversation
…he port / requirements.txt update)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks Milena, looks good mostly. I have added few comments. Let me know if you have any questions/comments on the chat.
harmonise/match.py
Outdated
pass | ||
|
||
def get_field_dict(self, url): | ||
z_cl = ZoomaClient() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Always good to use self explanatory names. For very short lived names it is acceptable sometimes. Here I would name this zooma_client rather than z_cl.
harmonise/match.py
Outdated
self.field_dict['semanticTags'] = el['semanticTags'] | ||
self.field_dict['confidence'] = el['confidence'] | ||
except Exception as e: | ||
print(e) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should use a proper logging library. Default python logging module will do fine.
@@ -1,12 +1,29 @@ | |||
id,name,label,description,type,values,parent,annotations,tags |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We are not only receiving the 'labels' but a csv file that could contain field 'type', 'description', etc...
So the original file contains the possible CSV format. You can add more labels to the same file format.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
do you suggest me to add more 'columns' there?
print(f"Request failed with status code: {response.status_code}; file path: {file_path}") | ||
|
||
assert len(outp_json) > 0, "Empty json" | ||
assert len(outp_json) == 29, "Wrong size json" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Test cases should be easy to understand. Eg. what does 29 here means, does it need a comment there to explain this, or self explanatory constant will describe it?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
as I'm not going to change csv file for the test I'm expecting the same result about the result json size. But if service here (f'http://www.ebi.ac.uk/spot/zooma/v2/api/services/annotate?propertyValue={label}') will be changed the result will be changed also
how is it better to explain here?
tests/test_endpoints.py
Outdated
|
||
first_5_elements = dict(islice(outp_json.items(), 5)) | ||
|
||
expected_values = { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What happen if zooma has new knowledge and there is another mapping in zooma output.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
possibly we can check json keys only, like:
'Age at present'
'Age at the agreement date',
etc. ...
import requests | ||
|
||
|
||
class ZoomaClient: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This class looks static, there are few ways we can improve this to make it better OOP
- the URL could be a class variable accepting in the constructor (eg. base_url)
- Rename get_json should have better name and could accept arguments (eg. field_label) and then construct the final API call url from base_url
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
do you mean:
field_label_json = zooma_client.field_label()
if field_label_json is not None:
for i, el in enumerate(field_label_json):
?
harmonise/match.py
Outdated
|
||
for label in labels: | ||
if len(label) != 0: | ||
fm_cl = FieldMatchingService() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
FieldMatchingService and ZoomaClient doing two different things?
Also need to extract the URL to a variable or a config.
'Alcohol consumption habits', 'Birthdate'] | ||
|
||
for key, value in first_5_elements.items(): | ||
assert key in expected_values, f"Unexpected key in json: {key}" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
now I'm checking only keys in the json
https://app.zenhub.com/workspaces/cohort-atlas-642e179715832600114e7867/issues/gh/ebibiosamples/cohort-atlas/33