A simple and stupid module for converting JSON into Python dataclasses with annotated attributes.
Although this module has no external dependencies, by default it outputs Python code that imports the attr and cattr packages.
Specifically, it writes import statements, attr.dataclass-decorated classes, and a utility method on the outermost class for structuring data with cattr.structure.
Class attributes are declared in name order, each annotated with its built-in or generated class type. Class names for nested objects are generated automatically from attribute names, and each class is declared above the classes that depend on it.
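The ordering described above can be illustrated with a small sketch. This is not the module's actual implementation: the function names (annotation_for, emit_class) and the naming rule that turns "slides" into "Slide" are assumptions made for illustration only. It shows why a depth-first walk naturally places nested classes above the classes that reference them.

```python
from typing import Any, Dict, List


def annotation_for(key: str, value: Any, lines: List[str]) -> str:
    """Return an annotation for value, emitting nested classes into lines first."""
    if isinstance(value, dict):
        class_name = key.capitalize()
        emit_class(class_name, value, lines)
        return class_name
    if isinstance(value, list) and value:
        if all(isinstance(item, dict) for item in value):
            # Merge the keys of every item so no attribute is missed
            merged: Dict[str, Any] = {}
            for item in value:
                merged.update(item)
            # Hypothetical naming rule: drop a trailing "s" ("slides" -> "Slide")
            item_key = key[:-1] if key.endswith("s") else key
            return f"List[{annotation_for(item_key, merged, lines)}]"
        return f"List[{annotation_for(key, value[0], lines)}]"
    # Scalars use their built-in type name: str, int, float, bool
    return type(value).__name__


def emit_class(name: str, obj: Dict[str, Any], lines: List[str]) -> None:
    """Append a class definition for obj; nested classes are appended before it."""
    fields = [f"    {key}: {annotation_for(key, obj[key], lines)}" for key in sorted(obj)]
    lines.extend(["@dataclass", f"class {name}:", *fields, ""])


sample = {
    "slideshow": {
        "author": "Yours Truly",
        "slides": [
            {"title": "Wake up to WonderWidgets!", "type": "all"},
            {"items": ["Why"], "title": "Overview", "type": "all"},
        ],
        "title": "Sample Slide Show",
    }
}
lines: List[str] = []
emit_class("Root", sample, lines)
print("\n".join(lines))
```

Because field annotations are computed before the enclosing class is appended, Slide is emitted first, then Slideshow, then Root.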
For optional attributes, the output code requires manual updates: add Optional to the typing import line, wrap the annotations in Optional[], provide default values, and reorder the attributes so they are declared last.
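As an illustration of those manual updates, here is roughly what a class with one optional attribute might look like after editing. The sketch uses the standard-library dataclass decorator so that it runs without extra packages; that substitution is an assumption for the example, and the same changes would apply to an attr.dataclass-decorated class.

```python
from dataclasses import dataclass
from typing import List, Optional  # Optional added to the typing import


@dataclass
class Slide:
    # Required attributes stay first, still sorted by name
    title: str
    type: str
    # Optional attribute: wrapped in Optional[], given a default, declared last
    items: Optional[List[str]] = None


without_items = Slide(title="Wake up to WonderWidgets!", type="all")
with_items = Slide(title="Overview", type="all", items=["Who buys WonderWidgets"])
print(without_items)
print(with_items)
```

Attributes with defaults have to follow those without, which is why the optional attribute moves to the end rather than staying in name order.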
Run the dataclassify module with the -h/--help argument to see the complete command line usage.
The following example uses HTTPie to generate classes from httpbin's JSON endpoint.
$ https httpbin.org/json | tee /dev/stderr | python3 -m dataclassify > slideshow_models.py
{
    "slideshow": {
        "author": "Yours Truly",
        "date": "date of publication",
        "slides": [
            {
                "title": "Wake up to WonderWidgets!",
                "type": "all"
            },
            {
                "items": [
                    "Why <em>WonderWidgets</em> are great",
                    "Who <em>buys</em> WonderWidgets"
                ],
                "title": "Overview",
                "type": "all"
            }
        ],
        "title": "Sample Slide Show"
    }
}
$ cat slideshow_models.py
from typing import List
import attr
import cattr


@attr.dataclass
class Slide:
    items: List[str]
    title: str
    type: str


@attr.dataclass
class Slideshow:
    author: str
    date: str
    slides: List[Slide]
    title: str


@attr.dataclass
class Root:
    slideshow: Slideshow

    @classmethod
    def instantiate(cls, obj):
        return cattr.structure(obj, cls)
Call the same function that the main script runs, specifying name, infile and outfile.
from dataclassify import generate_dataclasses
generate_dataclasses("HAR", "web-session.har", "har_models.py")
Generate native Python (>= 3.7) dataclasses, with no external dependencies, from an API JSON response.
import json
import urllib.request

import dataclassify

# Override the default decorator
dataclassify.decorator = "@dataclass"

# Generate class lines and update referenced annotation types
with urllib.request.urlopen("http://example.com") as response:
    class_definition_lines = dataclassify.classify_dict("APIModel", json.load(response))

# Set imports after class lines are generated to include referenced types
preface = [
    "from dataclasses import dataclass",
    f"from typing import {', '.join(sorted(dataclassify.annotation_types))}",
    "",
    "",
]

with open("api_models.py", "w") as wf:
    wf.write("\n".join(preface + class_definition_lines + [""]))
For dataclass scripts generated by the main module, class instances can be created by calling the instantiate method on the outermost class.
For example, assuming the packages it depends on are installed and the optional attributes have been updated, this creates nested instances from the script output in the CLI usage example.
import json
import urllib.request
from slideshow_models import Root
with urllib.request.urlopen("https://httpbin.org/json") as response:
    inst = Root.instantiate(json.load(response))
inst
# Root(slideshow=Slideshow(author='Yours Truly', date='date of publication', slides=[Slide(title='Wake up to WonderWidgets!', type='all', items=None), Slide(title='Overview', type='all', items=['Why <em>WonderWidgets</em> are great', 'Who <em>buys</em> WonderWidgets'])], title='Sample Slide Show'))