Improve error message when _ convention is not used in catalog.yml #3555

Closed
noklam opened this issue Jan 25, 2024 · 3 comments
@noklam
Contributor

noklam commented Jan 25, 2024

Description

https://linen-slack.kedro.org/t/16334283/hi-all-i-m-looking-into-kedro-0-19-2-and-i-created-a-project#84a17399-4031-4976-93d4-11c71d5a465a

The error looks like this:
AttributeError: 'str' object has no attribute 'items'

William Caicedo
1 day ago
Hi all, I’m looking into Kedro 0.19.2 and I created a project with Spark support. I have just one spark.SparkDataset in my catalog and I’m getting this error:

File "/opt/conda/envs/personas/lib/python3.10/site-packages/kedro/io/data_catalog.py", line 83, in _resolve_credentials
    return {k: _map_value(k, v) for k, v in config.items()}
AttributeError: 'str' object has no attribute 'items'

I normally don’t use credentials but in the past this wasn’t an issue. Any ideas what I’m doing wrong?
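
The failure can be reproduced in isolation. The sketch below is a hypothetical reduction built around the single line shown in the traceback; `resolve_entry` is an illustrative stand-in, not Kedro's actual `_resolve_credentials`. It only shows why a top-level string in the catalog ends up at `.items()`.

```python
def resolve_entry(config):
    """Hypothetical stand-in: walk a dataset's config as if it were a dict."""
    # Like the line in the traceback, this assumes `config` has `.items()`.
    return {k: v for k, v in config.items()}


catalog = {
    "dataset1": {"type": "spark.SparkDataset", "filepath": ""},
    "exhibitor": "abc",  # top-level string, treated like a dataset entry
}

errors = {}
for name, entry in catalog.items():
    try:
        resolve_entry(entry)
    except AttributeError as exc:
        errors[name] = str(exc)

print(errors)
# {'exhibitor': "'str' object has no attribute 'items'"}
```

Note that the message names neither the offending key nor the catalog file, which is why it is hard to act on.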

Context

When using OmegaConfigLoader, users need to follow the underscore convention (e.g. ${_xxxx}) for template variables in the catalog. Forgetting to do so produces an obscure error, which is confusing, particularly for users upgrading from an older version of Kedro.

Steps to Reproduce

  1. Run kedro new to create an empty project
  2. Use the following catalog.yml:
dataset1:
  type: spark.SparkDataset
  metadata: ${exhibitor}
  filepath: ""

dataset2:
  type: spark.SparkDataset
  metadata: ${exhibitor}
  filepath: ""

exhibitor: abc
  3. Start ipython, then run %load_ext kedro.ipython. An error is raised when the catalog is initialised.
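
For comparison, here is a rough sketch of why renaming `exhibitor` to `_exhibitor` avoids the error. It assumes the loader resolves `${...}` placeholders and then drops top-level keys that start with `_` before the catalog is built. The regex-based `resolve` and `load_catalog` helpers are hypothetical stand-ins written for this illustration; the real interpolation is handled by OmegaConf.

```python
import re

PLACEHOLDER = re.compile(r"\$\{(\w+)\}")


def resolve(value, root):
    """Replace ${name} placeholders in strings with top-level values of root."""
    if isinstance(value, dict):
        return {k: resolve(v, root) for k, v in value.items()}
    if isinstance(value, str):
        return PLACEHOLDER.sub(lambda m: str(root[m.group(1)]), value)
    return value


def load_catalog(raw):
    """Resolve placeholders, then drop top-level keys that start with '_'."""
    resolved = resolve(raw, raw)
    return {k: v for k, v in resolved.items() if not k.startswith("_")}


raw = {
    "dataset1": {
        "type": "spark.SparkDataset",
        "metadata": "${_exhibitor}",
        "filepath": "",
    },
    "_exhibitor": "abc",  # template variable, not a dataset
}

catalog = load_catalog(raw)
print(catalog)
# {'dataset1': {'type': 'spark.SparkDataset', 'metadata': 'abc', 'filepath': ''}}
```

With the underscore prefix, only dict-shaped dataset entries remain by the time the catalog is initialised.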

@ankatiyar
Contributor

For context, there are two situations:

  • Entries in the catalog that are not dicts, e.g.:
my_var: "whatever"

It fails with this unhelpful error message because we assume the catalog entries are dicts.

We could check for this at the config loader stage and/or display a more helpful error message.

  • Catalog entries that are dicts, e.g.:
my_var:
  name: "whatever"

Here the error message is a bit more informative, but it could still be improved:

kedro.io.core.DatasetError: An exception occurred when parsing config for dataset 'my_var':
'type' is missing from dataset catalog configuration

@astrojuanlu
Member

Currently mentioned in the catalog docs: https://docs.kedro.org/en/latest/configuration/advanced_configuration.html#catalog

Ideally, the traceback in both cases from #3555 (comment) should be:

kedro.io.core.DatasetError: An exception occurred when parsing config for dataset 'my_var':
'type' is missing from dataset catalog configuration.
Did you mean to define a template variable? If so, prefix it with `_` as explained in https://docs.kedro.org/en/latest/configuration/advanced_configuration.html#catalog

However, it was raised that template values are a more advanced configuration, and also that performing this validation might be tricky.
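
As a rough illustration of the validation discussed above, the sketch below checks both situations (a non-dict entry, and a dict entry without `type`) and appends the suggested hint. `CatalogConfigError`, `HINT`, and `validate_catalog` are hypothetical names for this example only; none of them are part of Kedro, and a real check would have to deal with the complications mentioned above.

```python
class CatalogConfigError(Exception):
    """Hypothetical error type used only in this sketch."""


HINT = (
    "Did you mean to define a template variable? "
    "If so, prefix it with `_`."
)


def validate_catalog(catalog):
    """Raise a descriptive error for entries that cannot be datasets."""
    for name, entry in catalog.items():
        if name.startswith("_"):
            continue  # assumed to be a template variable
        if not isinstance(entry, dict):
            kind = type(entry).__name__
            raise CatalogConfigError(
                f"Catalog entry '{name}' is not a mapping (got {kind}). {HINT}"
            )
        if "type" not in entry:
            raise CatalogConfigError(
                f"'type' is missing from catalog entry '{name}'. {HINT}"
            )


messages = []
for bad in ({"my_var": "whatever"}, {"my_var": {"name": "whatever"}}):
    try:
        validate_catalog(bad)
    except CatalogConfigError as exc:
        messages.append(str(exc))

print("\n".join(messages))
```

Both cases now name the offending key and point to the underscore convention, instead of surfacing an `AttributeError` from deeper in the stack.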

@ankatiyar
Contributor

Close this in favour of #3910?
