0.0.18 release.
Added hl7 tests and cleaned up code as a result.

Update json serializer, as a result indent no longer supported. Misc code refactor.

Update links and benchmark info.
chaseastewart committed Jan 14, 2024
1 parent 842899f commit 79098e2
Showing 11 changed files with 671 additions and 221 deletions.
27 changes: 12 additions & 15 deletions README.md
@@ -32,10 +32,10 @@ Provides a python native version of [FHIR-Converter](https://github.com/microsof

 The key features are:

-* **Fastish**: Leverages Cython where possible
+* **Fastish**: Minimize overhead outside the rendering engine
 * **Move fast**: Designed to be extensibile. Use the thin rendering API or leverage the builtin parts
-* **Easy**: Designed to be easy to use, extend and deploy.
-* **Robust**: Get production-ready code.
+* **Easy**: Designed to be easy to use, extend and deploy
+* **Robust**: Get production-ready code

 Limitations:
 * **Only CDA->FHIR** is currently builtin. Additional work is needed to implement the filters, etc to support FHIR->FHIR and HL7v2->FHIR and back.
@@ -80,15 +80,13 @@ $ pip install python-fhir-converter


 ## Basic Usage
-See [examples](./scripts/examples.py) for more indepth usage / usecases.
+See [examples](https://github.com/chaseastewart/fhir-converter/blob/main/scripts/examples.py) for more indepth usage / usecases.

 ```python
 from fhir_converter.renderers import CcdaRenderer

 # Render the file to string using the rendering defaults
 with open("data/sample/ccda/ccd.ccda") as xml_in:
-    # indent is provided, any other kwargs supported by dump may be provided
-    print(CcdaRenderer().render_fhir_string("CCD", xml_in, indent=1))
+    print(CcdaRenderer().render_fhir_string("CCD", xml_in))
 ```

 ## Command line interface
@@ -119,23 +117,22 @@ Final Memory: 37M

 ## Templates

-Templates can be loaded from any python-liquid supported mechanism. To make packaging easier a ResourceLoader is provided. When a rendering environment is not provided, templates will be loaded from the [module](/fhir_converter/templates/). To ease the creation of user defined templates a TemplateSystemLoader is provided that allows templates to be loaded from a primary and optionally default location. This allows user defined templates to reference templates in the default location. The example user defined [templates](data/templates/ccda) reuse the default section / header templates.
+Templates can be loaded from any python-liquid supported mechanism. To make packaging easier a [ResourceLoader](https://github.com/chaseastewart/fhir-converter/blob/main/fhir_converter/loaders.py#L119) is provided. When a rendering environment is not provided, templates will be loaded from the module [resources](https://github.com/chaseastewart/fhir-converter/tree/main/fhir_converter/templates/ccda). To ease the creation of user defined templates a [TemplateSystemLoader](https://github.com/chaseastewart/fhir-converter/blob/main/fhir_converter/loaders.py#L21) is provided that allows templates to be loaded from a primary and optionally default location. This allows user defined templates to reference templates in the default location. The example user defined [templates](https://github.com/chaseastewart/fhir-converter/tree/main/data/templates/ccda) reuse the default section / header templates.


 ## Benchmark

-You can run the [benchmark](./scripts/benchmark.py) from the root of the source tree. Test rig is a 14-inch, 2021 Macbook Pro with the binned M1 PRO not in low power mode.
+You can run the [benchmark](https://github.com/chaseastewart/fhir-converter/blob/main/scripts/benchmark.py) from the root of the source tree. Test rig is a 16-inch, 2023 Macbook Pro with the M3 Pro not in low power mode. Python version is 3.12.1.
 ```text
 Ordered by: cumulative time
 ncalls tottime percall cumtime percall filename:lineno(function)
-3 0.000 0.000 16.998 5.666 ../scripts/benchmark.py:75(render_samples)
-22 0.003 0.000 16.997 0.773 ../fhir-converter/fhir_converter/renderers.py:187(render_files_to_dir)
-484 0.002 0.000 16.968 0.035 ../fhir-converter/fhir_converter/renderers.py:220(render_to_dir)
-484 0.010 0.000 16.842 0.035 ../fhir-converter/fhir_converter/renderers.py:93(render_fhir)
-484 0.003 0.000 14.674 0.030 ../fhir-converter/fhir_converter/renderers.py:117(render_to_fhir)
+3 0.000 0.000 12.273 4.091 ./scripts/benchmark.py:75(render_samples)
+22 0.003 0.000 12.272 0.558 ./fhir-converter/fhir_converter/renderers.py:187(render_files_to_dir)
+484 0.002 0.000 12.258 0.025 ./fhir-converter/fhir_converter/renderers.py:220(render_to_dir)
+484 0.010 0.000 12.172 0.025 ./fhir-converter/fhir_converter/renderers.py:93(render_fhir)
+484 0.003 0.000 12.004 0.025 ./fhir-converter/fhir_converter/renderers.py:117(render_to_fhir)
 ```
 The test fixture profiles the converter using a single thread. The samples are rendered using all of the builtin templates along with the handful of user defined templates. The percall time is relative to the rendering template being used, the number of files being rendered (there is some warm up) and the size of the files to be rendered. In a 60 minute period in similar conditions a little over 100K CDA documents could be rendered into FHIR bundles. Note: including the original CDA document in the bundle as a DocumentReference adds noticable overhead to the render. Omitting this via a user defined template is recommended if this is not required for your usecase.


 ## Related Projects
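
As a rough cross-check of the throughput statement in the Benchmark section, the cumulative times from the two profiles can be extrapolated to documents per hour. This is only a back-of-envelope sketch: it assumes the 484 `render_to_dir` calls are representative of a real workload, and it ignores the overhead the profiler itself adds. The helper name `docs_per_hour` is made up for illustration.

```python
# Figures copied from the two profiles in the README diff.
RENDERS = 484             # ncalls reported for render_to_dir
OLD_CUMTIME_S = 16.998    # render_samples cumtime in the removed rows
NEW_CUMTIME_S = 12.273    # render_samples cumtime in the added rows


def docs_per_hour(renders: int, cumtime_s: float) -> float:
    """Extrapolate single-threaded throughput to a 60 minute window."""
    return renders / cumtime_s * 3600


old_rate = docs_per_hour(RENDERS, OLD_CUMTIME_S)
new_rate = docs_per_hour(RENDERS, NEW_CUMTIME_S)

print(f"old profile: {old_rate:,.0f} documents/hour")
print(f"new profile: {new_rate:,.0f} documents/hour")
print(f"ratio: {OLD_CUMTIME_S / NEW_CUMTIME_S:.2f}x")
```

Under these assumptions the removed rows work out to slightly more than 100K documents per hour, which agrees with the unchanged README sentence, while the added rows suggest a higher rate. The test rig and Python version changed in the same commit, so the ratio cannot be attributed to the code changes alone.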
22 changes: 4 additions & 18 deletions fhir_converter/__main__.py
@@ -1,15 +1,15 @@
 import argparse
 import os
 import sys
-from collections.abc import Mapping, Sequence
+from collections.abc import Sequence
 from datetime import datetime
 from functools import partial
 from pathlib import Path
 from shutil import get_terminal_size
 from textwrap import dedent, indent
 from time import time
 from traceback import print_exception
-from typing import Any, Optional
+from typing import Optional

 from liquid import Environment
 from psutil import Process
@@ -24,7 +24,7 @@
     render_files_to_dir,
     render_to_dir,
 )
-from fhir_converter.utils import mkdir_if_not_exists, rmdir_if_empty
+from fhir_converter.utils import mkdir, rmdir_if_empty


 def main(argv: Sequence[str], prog: Optional[str] = None) -> None:
@@ -61,7 +61,6 @@ def get_renderer(args: argparse.Namespace) -> DataRenderer:
     return partial(
         CcdaRenderer(get_user_defined_environment(args)).render_fhir,
         args.template_name,
-        **get_user_defined_options(args),
     )


@@ -73,15 +72,8 @@ def get_user_defined_environment(args: argparse.Namespace) -> Optional[Environme
     return None


-def get_user_defined_options(args: argparse.Namespace) -> Mapping[str, Any]:
-    options = {}
-    if args.indent:
-        options["indent"] = args.indent
-    return options
-
-
 def render(render: DataRenderer, args: argparse.Namespace) -> None:
-    to_dir_created = mkdir_if_not_exists(args.to_dir)
+    to_dir_created = mkdir(args.to_dir)
     try:
         if args.from_dir:
             render_files_to_dir(
@@ -163,12 +155,6 @@ def get_argparser(prog: Optional[str] = None) -> argparse.ArgumentParser:
         help="The liquid template to use when rendering the file",
         required=True,
     )
-    parser.add_argument(
-        "--indent",
-        type=int,
-        metavar="<int>",
-        help="The indentation amount or level. 0 is none.",
-    )
     parser.add_argument(
         "--continue_on_error",
         action="store_true",
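
With `--indent` removed from the CLI and `indent` no longer passed through to the serializer, any indentation has to be applied after rendering. A minimal sketch using only the standard library follows. It assumes the rendered output is a compact, valid JSON string. The `rendered` value is a hand-written stand-in rather than real converter output, and `reindent` is a hypothetical helper name.

```python
import json

# Stand-in for the string returned by render_fhir_string (assumption:
# the renderer emits compact, valid JSON).
rendered = '{"resourceType":"Bundle","type":"batch","entry":[]}'


def reindent(json_text: str, indent: int = 2) -> str:
    """Parse JSON text and serialize it again with the given indentation."""
    return json.dumps(json.loads(json_text), indent=indent)


pretty = reindent(rendered)
print(pretty)
```

For files already written by the CLI, `python -m json.tool <file>` gives a similar result from a shell. Re-serializing costs an extra parse per document, so it is probably best kept to debugging rather than bulk conversion.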
38 changes: 11 additions & 27 deletions fhir_converter/filters.py
@@ -19,15 +19,13 @@
     with_context,
 )
 from liquid.undefined import Undefined
-from pyjson5 import dumps as json5_dumps
+from pyjson5 import dumps as json_dumps

 from fhir_converter.hl7 import (
     Hl7DtmPrecision,
-    get_ccda_components,
-    get_ccda_section_template_ids,
+    get_ccda_section,
     get_template_id_key,
     hl7_to_fhir_dtm,
-    is_template_id,
     to_fhir_dtm,
 )
 from fhir_converter.utils import to_list
@@ -59,7 +57,7 @@ def wrapper(val: object, *args: Any, **kwargs: Any) -> Any:
 def to_json_string(data: Any) -> str:
     if isinstance(data, Undefined) or not data:
         return ""
-    return json5_dumps(data)
+    return json_dumps(data)


 @liquid_filter
@@ -132,41 +130,27 @@ def get_property(


 @mapping_filter
-def get_first_ccda_sections_by_template_id(data: Mapping, template_ids: Any) -> Mapping:
+def get_first_ccda_sections_by_template_id(msg: Mapping, template_ids: Any) -> Mapping:
     sections, search_template_ids = {}, list(
         filter(None, str_arg(template_ids).split("|"))
     )
-    if search_template_ids and data:
-        components = get_ccda_components(data)
-        if components:
-            for template_id in search_template_ids:
-                template_id_key = get_template_id_key(template_id)
-                for component in components:
-                    for id in get_ccda_section_template_ids(component):
-                        if is_template_id(id, template_id):
-                            sections[template_id_key] = component["section"]
-                            break
-                    if template_id_key in sections:
-                        break
+    for template_id in search_template_ids:
+        section = get_ccda_section(msg, search_template_ids=[template_id])
+        if section:
+            sections[get_template_id_key(template_id)] = section
     return sections


 @mapping_filter
 def get_ccda_section_by_template_id(
-    data: Mapping, template_id: Any, *template_ids: Any
+    msg: Mapping, template_id: Any, *template_ids: Any
 ) -> Mapping:
     search_template_ids = [template_id]
     if template_ids:
         search_template_ids += template_ids

     search_template_ids = list(filter(None, map(str_arg, flatten(search_template_ids))))
-    if search_template_ids and data:
-        for component in get_ccda_components(data):
-            for id in get_ccda_section_template_ids(component):
-                for template_id in search_template_ids:
-                    if is_template_id(id, template_id):
-                        return component["section"]
-    return {}
+    section = get_ccda_section(msg, search_template_ids)
+    return section or {}


 @with_context
116 changes: 74 additions & 42 deletions fhir_converter/hl7.py
@@ -1,7 +1,7 @@
 from __future__ import annotations

 from collections.abc import Mapping, MutableMapping, Sequence
-from datetime import datetime, timedelta, tzinfo
+from datetime import datetime, timedelta, timezone
 from enum import IntEnum
 from math import copysign
 from re import compile as re_compile
@@ -10,24 +10,7 @@

 from fhir_converter.utils import merge_mappings, parse_json, to_list

-DTM_REGEX = re_compile(r"(\d+(?:\.\d+)?)(?:([+-]\d{2})(\d{2}))?")
-
-
-class UTCOffset(tzinfo):
-    def __init__(self, minutes) -> None:
-        self.minutes = minutes
-
-    def utcoffset(self, _) -> timedelta:
-        return timedelta(minutes=self.minutes)
-
-    def tzname(self, _) -> str:
-        minutes = abs(self.minutes)
-        return "{0}{1:02}{2:02}".format(
-            "-" if self.minutes < 0 else "+", minutes // 60, minutes % 60
-        )
-
-    def dst(self, _) -> timedelta:
-        return timedelta(0)
+DTM_REGEX = re_compile(r"(\d+(?:\.\d*)?)(?:([+-]\d{2})(\d{2}))?")


 class FhirDtmPrecision(IntEnum):
@@ -57,20 +40,20 @@ class Hl7DtmPrecision(IntEnum):
     def fhir_precision(self) -> FhirDtmPrecision:
         return FhirDtmPrecision[self.name]

-    @classmethod
-    def from_dtm(cls, dtm: str) -> Hl7DtmPrecision:
+    @staticmethod
+    def from_dtm(dtm: str) -> Hl7DtmPrecision:
         _len = len(dtm)
-        if _len > Hl7DtmPrecision.SEC:
+        if _len >= Hl7DtmPrecision.MILLIS:
             return Hl7DtmPrecision.MILLIS
-        elif _len > Hl7DtmPrecision.MIN:
+        elif _len == Hl7DtmPrecision.SEC:
             return Hl7DtmPrecision.SEC
-        elif _len > Hl7DtmPrecision.HOUR:
+        elif _len == Hl7DtmPrecision.MIN:
             return Hl7DtmPrecision.MIN
-        elif _len > Hl7DtmPrecision.DAY:
+        elif _len == Hl7DtmPrecision.HOUR:
             return Hl7DtmPrecision.HOUR
-        elif _len > Hl7DtmPrecision.MONTH:
+        elif _len == Hl7DtmPrecision.DAY:
             return Hl7DtmPrecision.DAY
-        elif _len > Hl7DtmPrecision.YEAR:
+        elif _len == Hl7DtmPrecision.MONTH:
             return Hl7DtmPrecision.MONTH
         elif _len == Hl7DtmPrecision.YEAR:
             return Hl7DtmPrecision.YEAR
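
The switch from `>` to `==` in `from_dtm` means a string must now have exactly the length of a precision level, where the old comparisons rounded an in-between length up to the next level. The sketch below illustrates that rule in isolation. The enum members are not visible in the diff, so the values here (character counts of `YYYY`, `YYYYMM`, and so on) are assumptions, as are the `DtmPrecision` and `precision_from_dtm` names and the `ValueError` raised for unmatched lengths.

```python
from enum import IntEnum


class DtmPrecision(IntEnum):
    # Assumed values: the number of characters needed to write each
    # precision in an HL7 DTM string. Not taken from the diff.
    YEAR = 4
    MONTH = 6
    DAY = 8
    HOUR = 10
    MIN = 12
    SEC = 14
    MILLIS = 16


def precision_from_dtm(dtm: str) -> DtmPrecision:
    """Map the length of a DTM string to a precision, matching exactly."""
    length = len(dtm)
    # Anything at least as long as MILLIS keeps fractional seconds.
    if length >= DtmPrecision.MILLIS:
        return DtmPrecision.MILLIS
    for precision in (
        DtmPrecision.SEC,
        DtmPrecision.MIN,
        DtmPrecision.HOUR,
        DtmPrecision.DAY,
        DtmPrecision.MONTH,
        DtmPrecision.YEAR,
    ):
        if length == precision:
            return precision
    # Assumption: the diff does not show what happens for other lengths.
    raise ValueError(f"Unsupported HL7 datetime length: {length}")


print(precision_from_dtm("20240114").name)
```

With exact matching, a five character input such as `20245` no longer passes as a month-level value. Whether the real method raises or falls back in that case is not shown in this hunk.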
@@ -93,7 +76,7 @@ def parse_hl7_dtm(hl7_input: str) -> Hl7ParsedDtm:
     if tzh and tzm:
         minutes = int(tzh) * 60.0
         minutes += copysign(int(tzm), minutes)
-        tzinfo = UTCOffset(minutes)
+        tzinfo = timezone(timedelta(minutes=minutes))
     else:
         tzinfo = None
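
The replacement of the hand-written `UTCOffset` class with `datetime.timezone` can be tried in isolation. The sketch below reuses the new `DTM_REGEX` pattern and the offset arithmetic from the hunk above. The `parse_offset` name, the use of `fullmatch`, and the `ValueError` are assumptions, since the rest of `parse_hl7_dtm` is not visible in the diff.

```python
from datetime import timedelta, timezone
from math import copysign
from re import compile as re_compile

# Same pattern as the new DTM_REGEX in the diff.
DTM_REGEX = re_compile(r"(\d+(?:\.\d*)?)(?:([+-]\d{2})(\d{2}))?")


def parse_offset(hl7_input: str):
    """Split an HL7 DTM string into its digits and an optional tzinfo."""
    match = DTM_REGEX.fullmatch(hl7_input.strip())
    if not match:
        raise ValueError(f"Malformed HL7 datetime {hl7_input!r}")
    dtm, tzh, tzm = match.groups()
    if tzh and tzm:
        minutes = int(tzh) * 60.0
        # copysign puts the minute part on the same side of UTC as the hours
        minutes += copysign(int(tzm), minutes)
        return dtm, timezone(timedelta(minutes=minutes))
    return dtm, None


dtm, tz = parse_offset("20240114093000-0530")
print(dtm, tz)
```

One edge case the sketch inherits from the arithmetic in the diff: for an offset like `-0030` the hour part is zero, `int("-00") * 60.0` is `0.0`, and `copysign` then yields a positive thirty minutes. Offsets of that shape are rare in practice, but the sign is lost when they occur.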

@@ -162,16 +145,19 @@ def to_fhir_dtm(dt: datetime, precision: Optional[FhirDtmPrecision] = None) -> s
     return iso_dtm[: FhirDtmPrecision.YEAR]


-def parse_fhir(json_input: str, encoding: str = "utf-8") -> MutableMapping:
-    json_data = parse_json(json_input, encoding)
-    unique_entrys: dict[str, dict] = {}
-    for entry in json_data.get("entry", []):
-        key = get_fhir_entry_key(entry)
-        if key in unique_entrys:
-            merge_mappings(unique_entrys[key], entry)
-        else:
-            unique_entrys[key] = entry
-    json_data["entry"] = list(unique_entrys.values())
+def parse_fhir(json_input: str) -> MutableMapping:
+    json_data = parse_json(json_input)
+    if json_data:
+        entries = to_list(json_data.get("entry", []))
+        if len(entries) > 1:
+            unique_entrys: dict[str, dict] = {}
+            for entry in entries:
+                key = get_fhir_entry_key(entry)
+                if key in unique_entrys:
+                    merge_mappings(unique_entrys[key], entry)
+                else:
+                    unique_entrys[key] = entry
+            json_data["entry"] = list(unique_entrys.values())
     return json_data
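
The new `parse_fhir` only deduplicates when the bundle has more than one entry, and it tolerates `entry` being a single mapping. The following sketch reproduces that flow on plain dictionaries. `entry_key` and `merge` are hypothetical stand-ins for `get_fhir_entry_key` and `merge_mappings`, whose bodies are not shown in the diff, and the list wrapping approximates what `to_list` is assumed to do.

```python
def entry_key(entry: dict) -> str:
    # Hypothetical key: resourceType plus id. The real get_fhir_entry_key
    # is only partly visible in the diff.
    resource = entry.get("resource", {})
    return f'{resource.get("resourceType", "")}_{resource.get("id", "")}'


def merge(target: dict, source: dict) -> dict:
    # Hypothetical stand-in for merge_mappings: recurse into nested dicts,
    # otherwise let the later value win.
    for key, value in source.items():
        if isinstance(value, dict) and isinstance(target.get(key), dict):
            merge(target[key], value)
        else:
            target[key] = value
    return target


def dedupe_entries(bundle: dict) -> dict:
    entries = bundle.get("entry", [])
    if not isinstance(entries, list):
        entries = [entries]  # assumed behaviour of to_list()
    if len(entries) > 1:
        unique: dict = {}
        for entry in entries:
            key = entry_key(entry)
            if key in unique:
                merge(unique[key], entry)  # fold duplicates together
            else:
                unique[key] = entry
        bundle["entry"] = list(unique.values())
    return bundle


bundle = dedupe_entries({
    "resourceType": "Bundle",
    "entry": [
        {"resource": {"resourceType": "Patient", "id": "1", "gender": "female"}},
        {"resource": {"resourceType": "Observation", "id": "9"}},
        {"resource": {"resourceType": "Patient", "id": "1", "birthDate": "1980-01-01"}},
    ],
})
print(len(bundle["entry"]))
```

Because the dictionary preserves insertion order, merged entries keep the position of their first occurrence in the bundle.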


@@ -189,22 +175,68 @@ def get_fhir_entry_key(entry: Mapping) -> str:
     )


-def get_ccda_components(data: Mapping) -> Sequence:
+def get_ccda_section(
+    ccda: Mapping, search_template_ids: Sequence[str]
+) -> Optional[Mapping]:
+    """get_ccda_section Gets the POCD_MT000040.Section
+    from the ClinicalDocument that matches one of the templateIds
+    See https://github.com/HL7/CDA-core-2.0/tree/master/schema
+    Arguments:
+        ccda (Mapping): The ccda document as a map
+        search_template_ids (Sequence): The templateIds
+    Returns:
+        The section from the document if present
+    """
+    if search_template_ids:
+        for component in get_ccda_component3(ccda):
+            for id in get_component3_section_templateId(component):
+                for template_id in search_template_ids:
+                    if is_template_id(id, template_id):
+                        return component["section"]
+    return None
+
+
+def get_ccda_component3(ccda: Mapping) -> Sequence:
+    """get_ccda_component3 Gets the POCD_MT000040.Component3
+    from the ClinicalDocument.
+    See https://github.com/HL7/CDA-core-2.0/tree/master/schema
+    Arguments:
+        ccda (Mapping): The ccda document as a map
+    Returns:
+        The Component3 elements from the document, otherwise []
+    """
     return to_list(
-        data.get("ClinicalDocument", {})
+        ccda.get("ClinicalDocument", {})
         .get("component", {})
         .get("structuredBody", {})
         .get("component", [])
     )


-def get_ccda_section_template_ids(component: Mapping) -> Sequence:
+def get_component3_section_templateId(component: Mapping) -> Sequence:
+    """get_component3_section_template_id Gets the templateId
+    from the POCD_MT000040.Component3.
+    See https://github.com/HL7/CDA-core-2.0/tree/master/schema
+    Arguments:
+        component (Mapping): The component3 as a map
+    Returns:
+        The templateId from the component3, otherwise []
+    """
     return to_list(component.get("section", {}).get("templateId", []))


 def get_template_id_key(template_id: str) -> str:
     return re_sub(r"[^A-Za-z0-9]", "_", template_id)


-def is_template_id(id: dict, template_id: str) -> bool:
+def is_template_id(id: Mapping, template_id: str) -> bool:
     return template_id == id.get("root", "").strip()
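
To see how the new lookup behaves, the helpers above can be condensed into one function and run against a small hand-built document. This is a sketch, not the module itself: `to_list` is a hypothetical stand-in for `fhir_converter.utils.to_list` (assumed to wrap non-list values in a list), `find_section` is an illustrative name, and the sample document and template ids are invented for the example.

```python
from re import sub as re_sub
from typing import Any, Mapping, Optional, Sequence


def to_list(value: Any) -> list:
    # Hypothetical stand-in for fhir_converter.utils.to_list.
    if value is None:
        return []
    return value if isinstance(value, list) else [value]


def find_section(ccda: Mapping, template_ids: Sequence[str]) -> Optional[Mapping]:
    """Return the first section whose templateId root matches any id."""
    if not template_ids:
        return None
    components = to_list(
        ccda.get("ClinicalDocument", {})
        .get("component", {})
        .get("structuredBody", {})
        .get("component", [])
    )
    for component in components:
        section = component.get("section", {})
        for tid in to_list(section.get("templateId", [])):
            if tid.get("root", "").strip() in template_ids:
                return component["section"]
    return None


def template_id_key(template_id: str) -> str:
    # Same substitution as get_template_id_key in the diff.
    return re_sub(r"[^A-Za-z0-9]", "_", template_id)


# Invented sample: one section with a single templateId, one with a list.
doc = {"ClinicalDocument": {"component": {"structuredBody": {"component": [
    {"section": {"templateId": {"root": "1.2.3"}, "title": "Allergies"}},
    {"section": {"templateId": [{"root": " 4.5.6 "}], "title": "Medications"}},
]}}}}

found = find_section(doc, ["4.5.6"])
print(template_id_key("4.5.6"), found["title"] if found else None)
```

Note the iteration order: components are walked in document order, so when several ids are passed the result is the first matching section in the document, not the section for the first id in the list. The filter that needs one section per id calls the lookup once per id for that reason.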