Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Update of the TangoShutter Hardware object and class #834

Merged
merged 11 commits into from
Jan 22, 2024

Conversation

HilbertCorsair
Copy link
Contributor

@HilbertCorsair HilbertCorsair commented Jan 10, 2024

This code was based on a model provided by Antonia Beteva @beteva and it aims to be a general tango shutter class.
The specific HardwareObject values have to be mapped to existing predefined values in the dictionary within the values tags of the xml configuration file corresponding to the specific shutter Hardware object.
This version uses the AbstractShutter class.

Copy link
Member

@beteva beteva left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Well done with you first PR @HilbertCorsair and thank you for handling this. I'm requesting few cosmetic changes and a question about eval.

Comment on lines 74 to 75
"""Get the values dict from the xml config file. It will be used to map specific state values to existing standard states defined """
self.properties_dict = eval(f'dict( {self.get_property("values")} )')
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I do not know what the self.get_property("values") returns, but maybe by using literal_eval instead of eval you could get your dictionary.
from ast import literal_eval
self.properties_dict = literal_eval(self.get_property("values"))
It will be great if you could add this in the xml example at the beginning of the file

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Using eval (even literal_eval) is always a bit of a red flag for me...

Wouldn't it be possible to use XML features for this?

Otherwise I would suggest writing some JSON in the XML and use Python's JSON parser. Probably way cleaner and safer.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why do you think that using literal_eval would be problematic ?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think using literal_eval (and of course eval) is a red flag, in the sense that:

  1. very often there is a better technical solution (better code practice);
  2. if there is no other way then it should be done with careful consideration because it could cause issues (as stated in the Python documentation for this function, and of course not as "worrisome" as eval).

In this case: for (1.) I do not know for sure what a better way could be because I have not understood which kind of input we are trying to evaluate (as an educated guess I suggested the JSON parser, it should be a very good fit to read a dict-like input); and for (2.) since we control the input (i.e. it does not come from 3rd party, for example the users of MXCuBE) chances of issues are extremely low.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see, thanks. Yes one could in this case use for instance json.loads. I think we consider that literal_eval is rather harmless in our context (reading literals from a configuration file we provide our selfs) I would otherwise generally agree. The aim is to replace the entire XML based configuration file solution and then this construct would disappear.

Copy link
Contributor Author

@HilbertCorsair HilbertCorsair Jan 11, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is indeed the less elegant part of the code. I also realized in this form the TangoShutter class always requires a values tags in the configuration.
I used json.loads() to get the dictionary from within the <values> tag and I also added conditional statements to make the use of the <values> tag necessary only for the specific cases where the tango shutter states are not covered by the TangoShutter class.
I also added extra details in the xml configuration example.
Let me know what you think

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Having JSON hidden within XML sounds pretty cringe-making. Could we not instead prioritise the work of moving to JSON entirely?

Copy link
Member

@marcus-oscarsson marcus-oscarsson Jan 15, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree that we should make it a priority to move to YAML configuration file with more explicitly defined configuration file so that one does not need to do this sort of operation. To make things easier I'd simply stick with ast.literal_eval but I have no strong opinion in this case. Loading a dictionary from a string with json.loads is also fine and I understand the comment from @fabcor-maxiv. I would not change it now, unless you would like to be more consistent, then go for ast.literal_eval

@HilbertCorsair
Copy link
Contributor Author

HilbertCorsair commented Jan 11, 2024

Thank you very much for your input. I made changes to the code according to you comments.
This latest version of the code is a bit more elegant and it can also handle cases in which using a values tag in the xml configuration file isn't necessary.

return self.state_value_str
def init(self):
"""Initilise the predefined values"""
AbstractShutter.init(self)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why not super().init() ?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You're right! I missed that one.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Haven't used any analysis tools like black or pylint for this particular code. It might be a good idea to do so.

@beteva
Copy link
Member

beteva commented Jan 12, 2024

Hi @HilbertCorsair, this gets better and better. Did you run black and pylint (or similar) ?

@HilbertCorsair
Copy link
Contributor Author

In this latest push I made the change mentioned by @beteva and also used black to analyse and format the code.
In summary with this new class there are 2 kinds of tango shutters: those whit conventional values and those who require a json dictionary in the values tag of the xml configuration file, to map the local specific values to conventional ones :)

@fabcor-maxiv
Copy link
Contributor

Did you run black and pylint (or similar) ?

Seems like the GitHub Actions workflows were not allowed to run. GitHub was waiting for some manual approval first. I have now given this approval (which I did not know I had the permission to do).

We can now see there are some linting/formatting issues.

@HilbertCorsair Do you have the pre-commit hooks installed in your local development environment?

@HilbertCorsair
Copy link
Contributor Author

I have just now installed git pre-commit and updated it. I tried to re-commit the code but now I get this error : failed to find interpreter for Builtin discover of python_spec='python3.8'
stderr: (none).
It seems to be looking for python 3.8 but my conda environment of the mxcube core uses python 3.10.12
Should I modify the pre-comit-config.yml file to use python 3.10 instead of 3.8?

@fabcor-maxiv
Copy link
Contributor

I have just now installed git pre-commit and updated it. I tried to re-commit the code but now I get this error : failed to find interpreter for Builtin discover of python_spec='python3.8'
stderr: (none).
It seems to be looking for python 3.8 but my conda environment of the mxcube core uses python 3.10.12

Oh... I had not noticed this... It did not show up as an issue for me probably because I happen to have a Python 3.8 installed as well.

Should I modify the pre-comit-config.yml file to use python 3.10 instead of 3.8?

@HilbertCorsair I guess you can try to modify locally, and see if it allows you to move forward with your work. Otherwise you can also uninstall the pre-commit hooks for now (the GitHub workflow will run the same checks anyway).

In a separate pull request we should probably address this inconsistency. I opened a ticket: #836

@HilbertCorsair
Copy link
Contributor Author

OK I changes the pre-commit config file to use python 3.10 and managed to push the commit. I made no changes to the code other than removing 2 empty lines. The code should work on python 3.8 as well as it doesn't contain any features specific to python 3.10
The broader issue is now whether I should chance my conda environment to python3.8 or keep using python3.10 ? SInce Python 3.10 was installed by default with the mxcube web app conda environment I would be tempted to stick to this newer version.

@fabcor-maxiv
Copy link
Contributor

fabcor-maxiv commented Jan 15, 2024

OK I changes the pre-commit config file to use python 3.10 and managed to push the commit. I made no changes to the code other than removing 2 empty lines. The code should work on python 3.8 as well as it doesn't contain any features specific to python 3.10 The broader issue is now whether I should chance my conda environment to python3.8 or keep using python3.10 ? SInce Python 3.10 was installed by default with the mxcube web app conda environment I would be tempted to stick to this newer version.

@HilbertCorsair See my previous comment.

The Python version topic should have been further addressed in separate ticket and pull request. Now that you have introduced this unrelated change in the branch I do not know how we should proceed. If it were just me I would tell you to remove this change from the pull request (not with a revert commit, but with a hard modification of the commits on the branch and then a force-push of the branch), but I am not sure it is in line with the policy for this repository. Maybe a squash and merge would be fine if we do a revert commit, I do not know.

Keep using Python 3.10. Keep your modifications of the pre-commit hooks local to your computer, for now until this is fixed in the repository.

@marcus-oscarsson
Copy link
Member

Looks pretty good to me, I agree with @fabcor-maxiv that its nice if you can remove the commit that changes the pre-commit hook.

You don't have to install it locally you can simply run it with the pre-commit command if that makes things easier for you.

This reverts commit 557ff3c.

Going back to pre - git hooks.
@HilbertCorsair
Copy link
Contributor Author

OK, I have uninstalled local git hooks and reverted my last commit.

@marcus-oscarsson
Copy link
Member

I think its more or less fine, just a run of black and it should pass, did you try to run pre-commit in the console ?

@HilbertCorsair
Copy link
Contributor Author

I noticed there was a mixed line endings issue. The file was edited on different editors.

@fabcor-maxiv
Copy link
Contributor

@marcus-oscarsson Do you know how come the GitHub Actions workflows do not run automatically? Seems like I can trigger them manually, but it needs manual re-trigger after each modification of the pull request.

@HilbertCorsair
Copy link
Contributor Author

@marcus-oscarsson
Copy link
Member

@fabcor-maxiv, I think its because @HilbertCorsair have not yet had any PR merged, ill see If I can find the setting somewhere.

@HilbertCorsair, There still seems to be some black linting error, did you run black one last time before comiting ?

@marcus-oscarsson
Copy link
Member

🥇 thanks !

@marcus-oscarsson marcus-oscarsson merged commit c9cf5a1 into mxcube:develop Jan 22, 2024
7 of 9 checks passed
@beteva
Copy link
Member

beteva commented Jan 24, 2024

@fabcor-maxiv I still do not understand why using json and not ast. This is a quote from the json documentation

Be cautious when parsing JSON data from untrusted sources. A malicious JSON string may cause the decoder to consume considerable CPU and memory resources

We may need to have something like <values>{"FOO": ("MYFOO", HardwareObjectState.READY}</values> and it seems, at least for me, that json.loads does not handle well tuples, while the ast.literal_eval is just fine.

@fabcor-maxiv
Copy link
Contributor

fabcor-maxiv commented Jan 25, 2024

@beteva My advice was based on the pieces of info I had (understood) at the time. I did not know we might want to have tuples, for example. This is what I wrote:

I have not understood which kind of input we are trying to evaluate (as an educated guess I suggested the JSON parser, it should be a very good fit to read a dict-like input)

Before that I wrote that I would prefer if we used XML all the way:

Wouldn't it be possible to use XML features for this?


On AST vs. JSON: Yes, both of them have their (security) risk. Seems like AST can lead to a crash, while JSON only to high CPU and memory usage. Both are bad. And both are unlikely to happen in our case since we are not reading from untrusted sources.


Now that I am brought to think about this topic again and more deeply, I am quite warm (but cautious) to the idea of having configuration as Python files (that we read with ast.literal_eval). All the formats have their pros and cons. JSON and XML are not human-friendly. YAML is so complex and has so many pitfalls. For example, this is how the configuration is done for our Sphinx documentation build: https://github.com/mxcube/mxcubecore/blob/dcb38fdc53a34af67f5ff39f3e2bac9fbb64167d/docs/conf.py

We could consider something like this instead of migrating to YAML. I very much prefer writing and reading Python rather than XML, YAML, or JSON.

@rhfogh
Copy link
Collaborator

rhfogh commented Jan 25, 2024

A few points:

  • Can't we agree on one format for configuration files, and slowly get rid of the others - reducing the clutter instead of increasing it? For now we have TOML, JSON, XML, YAML, and now someone wants to add literal Python to the mix. The more formats we introduce, the harder it will be to ever get things simplified - just see how hard it is to get rid of XML.
  • Can we agree that supporting multiple parallel file formats for configuration is a bad idea? And that nesting one syntax inside another - like putting JSON strings inside XML files, is a very bad idea?
  • Once we have agreed, can we stick to the agreement instead of reopening the discussion every time someone wants to write a new file? YAML may or may not have been the best choice, and we might decide we want to change it, but until we do we should at least be aware of what we had already agreed on. People are still arguing as if we were sitting with a blank sheet of paper and making new choices.

There is a good overview of the weaknesses of different formats in https://pypi.org/project/strictyaml/. It shows pretty clearly that there is no single good solution (including the author's own).

GPhL will be using the full Yaml 1.2 for our own configuration (likely) and for our own inter-program communication (especially) since we need the ability to represent object graphs. For the general MXCuBE fiels we would adapt to whatever was agreed. But (with the above web site in mind) StrictYAML - which is a lot simpler than full YAML - would seem to be a good choice for general configuration files and would get my vote.

@fabcor-maxiv
Copy link
Contributor

@rhfogh Thanks for providing some context which I was not aware of (having joined the project relatively recently).

I will note that JSON is a subset of YAML. Every valid piece of JSON is also a valid piece of YAML. I do not know if it is also valid strictyaml, though.

@rhfogh
Copy link
Collaborator

rhfogh commented Jan 25, 2024

@fabcor-maxiv I know you have joined recently - and the less technical parts of my post ;-) were not directed at you as much as at the entire discussion. I should add that while we did at one point settle on YAML as the next format, it is not something that generated much commitment or work from anybody (and where I still have not kept my promise to make it easier to move from XML to YAML). Whether we should decide on another format after all (like when we should finally move out of XML) is definitely a legitimate question - which has now been raised. I just hope that we can agree on something and then apply that agreement going forward.

On the purely technical side it is true that YAML subsumes JSON, but if you stick to JSON you lose the advantages of YAML, i.e. the comments and the human readability and editability. All you get it the ability to use the same parser.

@marcus-oscarsson
Copy link
Member

marcus-oscarsson commented Jan 25, 2024

It's of-course a very important topic and I really appreciate the fresh view from @fabcor-maxiv. I think we did at some point even consider Python for configuration as well. I do agree with @rhfogh that even though its taken us some time to get started we have decided for YAML and I think we should stick to that unless there are apparent drawbacks that are solved with another solution.

I simply by experience, without having read https://pypi.org/project/strictyaml/, also think that there are no good single solution to configuration file formats. Perhaps simply proven by the amount of formats there are :)

I thought we could bring this up on Monday on our next meeting, but I was hoping the topic could be more how to proceed with the XML to YAML conversion rather than discussing which format to use. We will leave some room for the format discussion but then I think we should carry on, with one format and a roadmap on how to get there :)

@fabcor-maxiv
Copy link
Contributor

rather than discussing which format to use

I knew a migration from XML to YAML was in the cards; I did not know there was already an agreement. I am fine with YAML, no problem from my side. I will have a look at "strict YAML", that seems interesting.

@marcus-oscarsson
Copy link
Member

:), @fabcor-maxiv not so easy to know. We had and will have new developers coming and going and its natural that one can't know whats been decided in the past and for instance technical details that are maybe not well documented. I think we all appreciate to be challenged by/presented new angels and ideas, I think its important for the project, so please continue to voice your mind :). We also need to be better to document especially technical bits to make it easier for newcomers, and you already started a nice job on that !

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants