Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

33 zooma endp #1

Open
wants to merge 13 commits into
base: main
Choose a base branch
from
Open

33 zooma endp #1

wants to merge 13 commits into from

Conversation

Copy link
Member

@theisuru theisuru left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks Milena, looks good mostly. I have added few comments. Let me know if you have any questions/comments on the chat.

pass

def get_field_dict(self, url):
z_cl = ZoomaClient()
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Always good to use self explanatory names. For very short lived names it is acceptable sometimes. Here I would name this zooma_client rather than z_cl.

self.field_dict['semanticTags'] = el['semanticTags']
self.field_dict['confidence'] = el['confidence']
except Exception as e:
print(e)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should use a proper logging library. Default python logging module will do fine.

@@ -1,12 +1,29 @@
id,name,label,description,type,values,parent,annotations,tags
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We are not only receiving the 'labels' but a csv file that could contain field 'type', 'description', etc...
So the original file contains the possible CSV format. You can add more labels to the same file format.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

do you suggest me to add more 'columns' there?

print(f"Request failed with status code: {response.status_code}; file path: {file_path}")

assert len(outp_json) > 0, "Empty json"
assert len(outp_json) == 29, "Wrong size json"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Test cases should be easy to understand. Eg. what does 29 here means, does it need a comment there to explain this, or self explanatory constant will describe it?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

as I'm not going to change csv file for the test I'm expecting the same result about the result json size. But if service here (f'http://www.ebi.ac.uk/spot/zooma/v2/api/services/annotate?propertyValue={label}') will be changed the result will be changed also
how is it better to explain here?


first_5_elements = dict(islice(outp_json.items(), 5))

expected_values = {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What happen if zooma has new knowledge and there is another mapping in zooma output.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

possibly we can check json keys only, like:
'Age at present'
'Age at the agreement date',
etc. ...

import requests


class ZoomaClient:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This class looks static, there are few ways we can improve this to make it better OOP

  • the URL could be a class variable accepting in the constructor (eg. base_url)
  • Rename get_json should have better name and could accept arguments (eg. field_label) and then construct the final API call url from base_url

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

do you mean:

    field_label_json = zooma_client.field_label()

    if field_label_json is not None:
        for i, el in enumerate(field_label_json):

?


for label in labels:
if len(label) != 0:
fm_cl = FieldMatchingService()
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

FieldMatchingService and ZoomaClient doing two different things?
Also need to extract the URL to a variable or a config.

'Alcohol consumption habits', 'Birthdate']

for key, value in first_5_elements.items():
assert key in expected_values, f"Unexpected key in json: {key}"
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

now I'm checking only keys in the json

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants