Multiline strings are ugly after dumping #240
Comments
Or shorter still:
Most dumpers have to guess how to dump a string. This is one of those cases where it's hard to guess. I do agree the single-quote style is ugly here. If someone wants to create a PR for scalar style guessing (with lots of tests), we'd be happy to review it and consider integrating it. Be aware that the same logic needs to be applied to both pyyaml and libyaml. |
def str_presenter(dumper, data):
    try:
        dlen = len(data.splitlines())
        if dlen > 1:
            # more than one line: ask for literal block style
            return dumper.represent_scalar('tag:yaml.org,2002:str', data, style='|')
    except TypeError as ex:
        return dumper.represent_scalar('tag:yaml.org,2002:str', data)
    return dumper.represent_scalar('tag:yaml.org,2002:str', data)
Tried adding this using
If anyone can give me a hint on where I have to make the changes, I can put in some time to make the PR. |
My issue was related to #121. It's painful to spend lots of time finding out what the reason is. |
Nice. If there is a fundamental issue, we could check another language that has yaml load/dump (like golang) and supports it. Or ruamel-yaml. |
Actually, this issue is the only reason why I can't use pyyaml on my project, for me changing from |
OK, I've added this to https://github.com/yaml/pyyaml/projects/9, so we'll look at that for the next release, though I can't say when that will happen. |
Some suggestions: Separate the concerns of representation from data, otherwise you will end up with a mess of code that is hard to maintain. E.g. the | symbol is presentation; it is metadata about the layout of the data. Similarly, references are metadata. I think most people would consider comments to be data in the context of yaml, but they can also be treated as metadata because they provide info about surrounding data or context. So at load time the metadata must be loaded and stored, and then used at output time. And as mentioned, there may be third-party open source libs out there, e.g. in Java, Go or Javascript (or even in Python), that have already solved this problem. This is one of the purposes of open source: sharing knowledge. There is no reason to reinvent the wheel here; use them as inspiration. |
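One way to read the suggestion above, as a minimal sketch and not something PyYAML ships: carry the presentation choice as metadata on the value, and let a representer honor it at dump time. `LiteralStr`, `literal_str_representer` and `StyleAwareDumper` are hypothetical names introduced for illustration; only `represent_scalar`, `add_representer`, `yaml.SafeDumper`, `yaml.dump` and `yaml.safe_load` are PyYAML calls. The sketch covers the dump side only; recording the original style at load time is not attempted here.

```python
import yaml


class LiteralStr(str):
    """Hypothetical marker type: a str that asks to be dumped in literal block style."""


def literal_str_representer(dumper, data):
    # The style is taken from the value's type, not guessed from its content.
    return dumper.represent_scalar("tag:yaml.org,2002:str", str(data), style="|")


class StyleAwareDumper(yaml.SafeDumper):
    """Hypothetical private dumper, so the global SafeDumper is left untouched."""


StyleAwareDumper.add_representer(LiteralStr, literal_str_representer)

doc = {"plain": "one line", "script": LiteralStr("echo one\necho two\n")}
out = yaml.dump(doc, Dumper=StyleAwareDumper)
print(out)
```

Plain `str` values still go through the default style guessing; only values explicitly wrapped in the marker type request the `|` style.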
@schollii I am not sure you understand the issue. Let me repeat: the pyyaml parser voluntarily changes the original yaml markup when doing a dump. It shouldn't do this, or at least this behavior should be configurable. Our programs do not depend on the presentation layer; the people who read and edit yaml files do. |
@melezhik right. Since we are probably adding a better config system in the next release, the rough plan is to add a config option for the preferred format of multiline strings. I also suspect it should be easy to configure this with a custom function that provides the preference. |
Found this on StackOverflow as a quick fix for my requirement:
parsed = {'fdir_root': '/mnt/c/engDev/git_mf/ipyrun/tests/examples/line_graph_batch',
          'fpth_config': '/mnt/c/engDev/git_mf/ipyrun/tests/examples/line_graph_batch/config-shell_handler.json',
          'title': '# Plot Straight Lines\n### example RunApp',
          'configs': []}
import yaml

def str_presenter(dumper, data):
    """configures yaml for dumping multiline strings
    Ref: https://stackoverflow.com/questions/8640959/how-can-i-control-what-scalar-form-pyyaml-uses-for-my-data"""
    if len(data.splitlines()) > 1:  # check for multiline string
        return dumper.represent_scalar('tag:yaml.org,2002:str', data, style='|')
    return dumper.represent_scalar('tag:yaml.org,2002:str', data)

yaml.add_representer(str, str_presenter)
yaml.representer.SafeRepresenter.add_representer(str, str_presenter)  # to use with safe_dump
s = yaml.dump(parsed, indent=2)  # , sort_keys=True)
print(s)
>>> configs: []
>>> fdir_root: /mnt/c/engDev/git_mf/ipyrun/tests/examples/line_graph_batch
>>> fpth_config: /mnt/c/engDev/git_mf/ipyrun/tests/examples/line_graph_batch/config-shell_handler.json
>>> title: |-
>>>   # Plot Straight Lines
>>>   ### example RunApp |
Also #121 (comment) |
Slight tweak, better handles strings ending in a newline and might be a bit faster:
def str_presenter(dumper, data):
    """configures yaml for dumping multiline strings
    Ref: https://stackoverflow.com/questions/8640959/how-can-i-control-what-scalar-form-pyyaml-uses-for-my-data"""
    if data.count('\n') > 0:  # check for multiline string
        return dumper.represent_scalar('tag:yaml.org,2002:str', data, style='|')
    return dumper.represent_scalar('tag:yaml.org,2002:str', data) |
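To make the difference between the two multiline checks concrete, here is a small pure-Python comparison. The helper names `is_multiline_splitlines` and `is_multiline_count` are made up for this illustration; the two expressions inside them are the ones used in the snippets above.

```python
def is_multiline_splitlines(data: str) -> bool:
    # Check used in the StackOverflow version.
    return len(data.splitlines()) > 1


def is_multiline_count(data: str) -> bool:
    # Check used in the tweaked version.
    return data.count("\n") > 0


samples = ["one line", "one line\n", "two\nlines", "carriage\rreturn"]
results = {s: (is_multiline_splitlines(s), is_multiline_count(s)) for s in samples}

for sample, (by_splitlines, by_count) in results.items():
    print(repr(sample), by_splitlines, by_count)
```

A single line that ends in a newline is only caught by the `count` check, because `splitlines()` returns one element for it. The checks also disagree in the other direction for a bare `\r`, which `splitlines()` treats as a line boundary.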
Sorry to necro this, but I wanted to save others the headache. This solution only works if you do not have trailing spaces on any of your lines. If there is a trailing space somewhere, you'll see the original behavior, with the string getting all messed up and "\n"s everywhere. To prevent this from accidentally occurring, you can strip them with this modification to @cjw296's code:
def str_presenter(dumper, data):
    if data.count('\n') > 0:
        data = "\n".join([line.rstrip() for line in data.splitlines()])  # Remove any trailing spaces, then put it back together again
        return dumper.represent_scalar('tag:yaml.org,2002:str', data, style='|')
    return dumper.represent_scalar('tag:yaml.org,2002:str', data)
This behavior took far longer than I'd care to admit to track down. |
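The fallback described above can be reproduced with a short sketch. `DemoDumper` and `literal_presenter` are hypothetical names used only here; the sketch assumes PyYAML's emitter refuses block style when a space directly precedes a line break, which is what the comment above reports.

```python
import yaml


def literal_presenter(dumper, data):
    # Request literal block style for multiline strings only; the emitter may still refuse it.
    if "\n" in data:
        return dumper.represent_scalar("tag:yaml.org,2002:str", data, style="|")
    return dumper.represent_scalar("tag:yaml.org,2002:str", data)


class DemoDumper(yaml.SafeDumper):
    """Hypothetical private dumper so the global SafeDumper is not modified."""


DemoDumper.add_representer(str, literal_presenter)

clean = "first line\nsecond line"
dirty = "first line \nsecond line"  # note the space before the newline

clean_out = yaml.dump({"text": clean}, Dumper=DemoDumper)
dirty_out = yaml.dump({"text": dirty}, Dumper=DemoDumper)
print(clean_out)
print(dirty_out)
```

Both outputs load back to the original strings, so the fallback is lossless; it is only the readability of the dumped file that suffers.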
Is there a plan to add this to the library, maybe as a style? It's a very common use case, and it's a shame we have to resort to these hacks to get proper yaml. |
This converts "|" blocks to "|-". To preserve the style we can use:
    # Remove any trailing spaces messing up the output.
    block = "\n".join([line.rstrip() for line in data.splitlines()])
    if data.endswith("\n"):
        block += "\n"
    return dumper.represent_scalar("tag:yaml.org,2002:str", block, style="|") |
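Putting the pieces from this thread together, a complete presenter might look like the sketch below. It is an assembly of the snippets above, not an official PyYAML recipe; `BlockDumper` is a hypothetical name, and registering on a private `SafeDumper` subclass instead of globally is a choice made for this example.

```python
import yaml


def str_presenter(dumper, data):
    if "\n" in data:
        # Trailing spaces make the emitter fall back to a quoted style, so strip them.
        block = "\n".join(line.rstrip() for line in data.splitlines())
        # splitlines() drops the final newline; restore it so "|" does not become "|-".
        if data.endswith("\n"):
            block += "\n"
        return dumper.represent_scalar("tag:yaml.org,2002:str", block, style="|")
    return dumper.represent_scalar("tag:yaml.org,2002:str", data)


class BlockDumper(yaml.SafeDumper):
    """Hypothetical private dumper; the global dumpers keep their default behavior."""


BlockDumper.add_representer(str, str_presenter)

keep_out = yaml.dump({"text": "a \nb\n"}, Dumper=BlockDumper)
strip_out = yaml.dump({"text": "a \nb"}, Dumper=BlockDumper)
print(keep_out)
print(strip_out)
```

Note that this version is lossy by design: trailing whitespace on each line is removed before dumping, so the loaded value can differ from the original string.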
Here's a PR for discussion: |
This is what we use in ramen to avoid this issue: |
This could output much shorter