0.0.18 release.

Added hl7 tests and cleaned up the code as a result. Updated the JSON serializer; as a result, `indent` is no longer supported. Miscellaneous code refactoring. Updated links and benchmark info.
chaseastewart · Jan 14, 2024 · 79098e2
1 parent 842899f

Showing 11 changed files with 671 additions and 221 deletions.
diff --git a/README.md b/README.md
@@ -32,10 +32,10 @@ Provides a python native version of [FHIR-Converter](https://github.com/microsof
 
 The key features are:
 
-* **Fastish**: Leverages Cython where possible 
+* **Fastish**: Minimize overhead outside the rendering engine 
 * **Move fast**: Designed to be extensibile. Use the thin rendering API or leverage the builtin parts
-* **Easy**: Designed to be easy to use, extend and deploy.
-* **Robust**: Get production-ready code.
+* **Easy**: Designed to be easy to use, extend and deploy
+* **Robust**: Get production-ready code
 
 Limitations:
 * **Only CDA->FHIR** is currently builtin. Additional work is needed to implement the filters, etc to support FHIR->FHIR and HL7v2->FHIR and back.
@@ -80,15 +80,13 @@ $ pip install python-fhir-converter
 
 
 ## Basic Usage
-See [examples](./scripts/examples.py) for more indepth usage / usecases.
+See [examples](https://github.com/chaseastewart/fhir-converter/blob/main/scripts/examples.py) for more indepth usage / usecases.
 
 ```python
 from fhir_converter.renderers import  CcdaRenderer
 
-# Render the file to string using the rendering defaults
 with open("data/sample/ccda/ccd.ccda") as xml_in:
-    # indent is provided, any other kwargs supported by dump may be provided
-    print(CcdaRenderer().render_fhir_string("CCD", xml_in, indent=1))
+    print(CcdaRenderer().render_fhir_string("CCD", xml_in))
 ```
 
 ## Command line interface
@@ -119,23 +117,22 @@ Final Memory: 37M
 
 ## Templates
 
-Templates can be loaded from any python-liquid supported mechanism. To make packaging easier a ResourceLoader is provided. When a rendering environment is not provided, templates will be loaded from the [module](/fhir_converter/templates/). To ease the creation of user defined templates a TemplateSystemLoader is provided that allows templates to be loaded from a primary and optionally default location. This allows user defined templates to reference templates in the default location. The example user defined [templates](data/templates/ccda) reuse the default section / header templates.
+Templates can be loaded from any python-liquid supported mechanism. To make packaging easier a [ResourceLoader](https://github.com/chaseastewart/fhir-converter/blob/main/fhir_converter/loaders.py#L119) is provided. When a rendering environment is not provided, templates will be loaded from the module [resources](https://github.com/chaseastewart/fhir-converter/tree/main/fhir_converter/templates/ccda). To ease the creation of user defined templates a [TemplateSystemLoader](https://github.com/chaseastewart/fhir-converter/blob/main/fhir_converter/loaders.py#L21) is provided that allows templates to be loaded from a primary and optionally default location. This allows user defined templates to reference templates in the default location. The example user defined [templates](https://github.com/chaseastewart/fhir-converter/tree/main/data/templates/ccda) reuse the default section / header templates.
 
 
 ## Benchmark
 
-You can run the [benchmark](./scripts/benchmark.py) from the root of the source tree. Test rig is a 14-inch, 2021 Macbook Pro with the binned M1 PRO not in low power mode.
+You can run the [benchmark](https://github.com/chaseastewart/fhir-converter/blob/main/scripts/benchmark.py) from the root of the source tree. Test rig is a 16-inch, 2023 Macbook Pro with the M3 Pro not in low power mode. Python version is 3.12.1.
 ```text
    Ordered by: cumulative time
 
    ncalls  tottime  percall  cumtime  percall filename:lineno(function)
-        3    0.000    0.000   16.998    5.666 ../scripts/benchmark.py:75(render_samples)
-       22    0.003    0.000   16.997    0.773 ../fhir-converter/fhir_converter/renderers.py:187(render_files_to_dir)
-      484    0.002    0.000   16.968    0.035 ../fhir-converter/fhir_converter/renderers.py:220(render_to_dir)
-      484    0.010    0.000   16.842    0.035 ../fhir-converter/fhir_converter/renderers.py:93(render_fhir)
-      484    0.003    0.000   14.674    0.030 ../fhir-converter/fhir_converter/renderers.py:117(render_to_fhir)
+        3    0.000    0.000   12.273    4.091 ./scripts/benchmark.py:75(render_samples)
+       22    0.003    0.000   12.272    0.558 ./fhir-converter/fhir_converter/renderers.py:187(render_files_to_dir)
+      484    0.002    0.000   12.258    0.025 ./fhir-converter/fhir_converter/renderers.py:220(render_to_dir)
+      484    0.010    0.000   12.172    0.025 ./fhir-converter/fhir_converter/renderers.py:93(render_fhir)
+      484    0.003    0.000   12.004    0.025 ./fhir-converter/fhir_converter/renderers.py:117(render_to_fhir)
 ```
-The test fixture profiles the converter using a single thread. The samples are rendered using all of the builtin templates along with the handful of user defined templates. The percall time is relative to the rendering template being used, the number of files being rendered (there is some warm up) and the size of the files to be rendered. In a 60 minute period in similar conditions a little over 100K CDA documents could be rendered into FHIR bundles. Note: including the original CDA document in the bundle as a DocumentReference adds noticable overhead to the render. Omitting this via a user defined template is recommended if this is not required for your usecase.
 
 
 ## Related Projects

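Since `indent` is no longer accepted by the renderer in this release, callers who still want pretty-printed output can re-indent the rendered string themselves. The following is a minimal sketch, assuming the string returned by `render_fhir_string` is standard JSON; the `reindent` helper and the sample payload are hypothetical and not part of the library.

```python
import json


def reindent(fhir_json: str, indent: int = 2) -> str:
    """Pretty-print an already-rendered JSON string (hypothetical helper)."""
    # Parse and re-serialize with the standard library; key order is preserved.
    return json.dumps(json.loads(fhir_json), indent=indent)


# Stand-in for the output of CcdaRenderer().render_fhir_string("CCD", xml_in)
compact = '{"resourceType":"Bundle","type":"batch","entry":[]}'
print(reindent(compact))
```

This keeps the rendering path free of formatting options and moves the cost of indentation to the callers that actually need it.
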
diff --git a/fhir_converter/__main__.py b/fhir_converter/__main__.py
@@ -1,15 +1,15 @@
 import argparse
 import os
 import sys
-from collections.abc import Mapping, Sequence
+from collections.abc import Sequence
 from datetime import datetime
 from functools import partial
 from pathlib import Path
 from shutil import get_terminal_size
 from textwrap import dedent, indent
 from time import time
 from traceback import print_exception
-from typing import Any, Optional
+from typing import Optional
 
 from liquid import Environment
 from psutil import Process
@@ -24,7 +24,7 @@
     render_files_to_dir,
     render_to_dir,
 )
-from fhir_converter.utils import mkdir_if_not_exists, rmdir_if_empty
+from fhir_converter.utils import mkdir, rmdir_if_empty
 
 
 def main(argv: Sequence[str], prog: Optional[str] = None) -> None:
@@ -61,7 +61,6 @@ def get_renderer(args: argparse.Namespace) -> DataRenderer:
     return partial(
         CcdaRenderer(get_user_defined_environment(args)).render_fhir,
         args.template_name,
-        **get_user_defined_options(args),
     )
 
 
@@ -73,15 +72,8 @@ def get_user_defined_environment(args: argparse.Namespace) -> Optional[Environme
     return None
 
 
-def get_user_defined_options(args: argparse.Namespace) -> Mapping[str, Any]:
-    options = {}
-    if args.indent:
-        options["indent"] = args.indent
-    return options
-
-
 def render(render: DataRenderer, args: argparse.Namespace) -> None:
-    to_dir_created = mkdir_if_not_exists(args.to_dir)
+    to_dir_created = mkdir(args.to_dir)
     try:
         if args.from_dir:
             render_files_to_dir(
@@ -163,12 +155,6 @@ def get_argparser(prog: Optional[str] = None) -> argparse.ArgumentParser:
         help="The liquid template to use when rendering the file",
         required=True,
     )
-    parser.add_argument(
-        "--indent",
-        type=int,
-        metavar="<int>",
-        help="The indentation amount or level. 0 is none.",
-    )
     parser.add_argument(
         "--continue_on_error",
         action="store_true",

diff --git a/fhir_converter/filters.py b/fhir_converter/filters.py
@@ -19,15 +19,13 @@
     with_context,
 )
 from liquid.undefined import Undefined
-from pyjson5 import dumps as json5_dumps
+from pyjson5 import dumps as json_dumps
 
 from fhir_converter.hl7 import (
     Hl7DtmPrecision,
-    get_ccda_components,
-    get_ccda_section_template_ids,
+    get_ccda_section,
     get_template_id_key,
     hl7_to_fhir_dtm,
-    is_template_id,
     to_fhir_dtm,
 )
 from fhir_converter.utils import to_list
@@ -59,7 +57,7 @@ def wrapper(val: object, *args: Any, **kwargs: Any) -> Any:
 def to_json_string(data: Any) -> str:
     if isinstance(data, Undefined) or not data:
         return ""
-    return json5_dumps(data)
+    return json_dumps(data)
 
 
 @liquid_filter
@@ -132,41 +130,27 @@ def get_property(
 
 
 @mapping_filter
-def get_first_ccda_sections_by_template_id(data: Mapping, template_ids: Any) -> Mapping:
+def get_first_ccda_sections_by_template_id(msg: Mapping, template_ids: Any) -> Mapping:
     sections, search_template_ids = {}, list(
         filter(None, str_arg(template_ids).split("|"))
     )
-    if search_template_ids and data:
-        components = get_ccda_components(data)
-        if components:
-            for template_id in search_template_ids:
-                template_id_key = get_template_id_key(template_id)
-                for component in components:
-                    for id in get_ccda_section_template_ids(component):
-                        if is_template_id(id, template_id):
-                            sections[template_id_key] = component["section"]
-                            break
-                    if template_id_key in sections:
-                        break
+    for template_id in search_template_ids:
+        section = get_ccda_section(msg, search_template_ids=[template_id])
+        if section:
+            sections[get_template_id_key(template_id)] = section
     return sections
 
 
 @mapping_filter
 def get_ccda_section_by_template_id(
-    data: Mapping, template_id: Any, *template_ids: Any
+    msg: Mapping, template_id: Any, *template_ids: Any
 ) -> Mapping:
     search_template_ids = [template_id]
     if template_ids:
         search_template_ids += template_ids
-
     search_template_ids = list(filter(None, map(str_arg, flatten(search_template_ids))))
-    if search_template_ids and data:
-        for component in get_ccda_components(data):
-            for id in get_ccda_section_template_ids(component):
-                for template_id in search_template_ids:
-                    if is_template_id(id, template_id):
-                        return component["section"]
-    return {}
+    section = get_ccda_section(msg, search_template_ids)
+    return section or {}
 
 
 @with_context

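Both filters above now delegate to `get_ccda_section`. The lookup can be sketched on a toy document as follows. This is a self-contained approximation that mirrors the traversal in this commit rather than importing the library: `find_section`, `as_list`, the sample document, and its template ids are illustrative only, and `as_list` stands in for `to_list` from `fhir_converter.utils`, whose exact behavior is assumed here.

```python
from collections.abc import Mapping, Sequence
from typing import Any, Optional


def as_list(value: Any) -> list:
    # Assumed stand-in for fhir_converter.utils.to_list: wrap non-lists in a list.
    if isinstance(value, list):
        return value
    return [value]


def find_section(
    ccda: Mapping, search_template_ids: Sequence[str]
) -> Optional[Mapping]:
    """Return the first section whose templateId root matches any search id."""
    components = as_list(
        ccda.get("ClinicalDocument", {})
        .get("component", {})
        .get("structuredBody", {})
        .get("component", [])
    )
    for component in components:
        # A section may carry a single templateId (a dict) or several (a list).
        for tid in as_list(component.get("section", {}).get("templateId", [])):
            if tid.get("root", "").strip() in search_template_ids:
                return component["section"]
    return None


# Toy document; the template ids are made up for illustration.
doc = {
    "ClinicalDocument": {
        "component": {
            "structuredBody": {
                "component": [
                    {"section": {"templateId": {"root": "1.2.3.1"}, "title": "Allergies"}},
                    {"section": {"templateId": [{"root": "1.2.3.2"}], "title": "Medications"}},
                ]
            }
        }
    }
}

section = find_section(doc, ["1.2.3.2"])
print(section["title"] if section else "no match")
```

Returning `None` when nothing matches lets each filter decide on its own fallback, which is why `get_ccda_section_by_template_id` ends with `section or {}`.
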
diff --git a/fhir_converter/hl7.py b/fhir_converter/hl7.py
@@ -1,7 +1,7 @@
 from __future__ import annotations
 
 from collections.abc import Mapping, MutableMapping, Sequence
-from datetime import datetime, timedelta, tzinfo
+from datetime import datetime, timedelta, timezone
 from enum import IntEnum
 from math import copysign
 from re import compile as re_compile
@@ -10,24 +10,7 @@
 
 from fhir_converter.utils import merge_mappings, parse_json, to_list
 
-DTM_REGEX = re_compile(r"(\d+(?:\.\d+)?)(?:([+-]\d{2})(\d{2}))?")
-
-
-class UTCOffset(tzinfo):
-    def __init__(self, minutes) -> None:
-        self.minutes = minutes
-
-    def utcoffset(self, _) -> timedelta:
-        return timedelta(minutes=self.minutes)
-
-    def tzname(self, _) -> str:
-        minutes = abs(self.minutes)
-        return "{0}{1:02}{2:02}".format(
-            "-" if self.minutes < 0 else "+", minutes // 60, minutes % 60
-        )
-
-    def dst(self, _) -> timedelta:
-        return timedelta(0)
+DTM_REGEX = re_compile(r"(\d+(?:\.\d*)?)(?:([+-]\d{2})(\d{2}))?")
 
 
 class FhirDtmPrecision(IntEnum):
@@ -57,20 +40,20 @@ class Hl7DtmPrecision(IntEnum):
     def fhir_precision(self) -> FhirDtmPrecision:
         return FhirDtmPrecision[self.name]
 
-    @classmethod
-    def from_dtm(cls, dtm: str) -> Hl7DtmPrecision:
+    @staticmethod
+    def from_dtm(dtm: str) -> Hl7DtmPrecision:
         _len = len(dtm)
-        if _len > Hl7DtmPrecision.SEC:
+        if _len >= Hl7DtmPrecision.MILLIS:
             return Hl7DtmPrecision.MILLIS
-        elif _len > Hl7DtmPrecision.MIN:
+        elif _len == Hl7DtmPrecision.SEC:
             return Hl7DtmPrecision.SEC
-        elif _len > Hl7DtmPrecision.HOUR:
+        elif _len == Hl7DtmPrecision.MIN:
             return Hl7DtmPrecision.MIN
-        elif _len > Hl7DtmPrecision.DAY:
+        elif _len == Hl7DtmPrecision.HOUR:
             return Hl7DtmPrecision.HOUR
-        elif _len > Hl7DtmPrecision.MONTH:
+        elif _len == Hl7DtmPrecision.DAY:
             return Hl7DtmPrecision.DAY
-        elif _len > Hl7DtmPrecision.YEAR:
+        elif _len == Hl7DtmPrecision.MONTH:
             return Hl7DtmPrecision.MONTH
         elif _len == Hl7DtmPrecision.YEAR:
             return Hl7DtmPrecision.YEAR
@@ -93,7 +76,7 @@ def parse_hl7_dtm(hl7_input: str) -> Hl7ParsedDtm:
     if tzh and tzm:
         minutes = int(tzh) * 60.0
         minutes += copysign(int(tzm), minutes)
-        tzinfo = UTCOffset(minutes)
+        tzinfo = timezone(timedelta(minutes=minutes))
     else:
         tzinfo = None
 
@@ -162,16 +145,19 @@ def to_fhir_dtm(dt: datetime, precision: Optional[FhirDtmPrecision] = None) -> s
     return iso_dtm[: FhirDtmPrecision.YEAR]
 
 
-def parse_fhir(json_input: str, encoding: str = "utf-8") -> MutableMapping:
-    json_data = parse_json(json_input, encoding)
-    unique_entrys: dict[str, dict] = {}
-    for entry in json_data.get("entry", []):
-        key = get_fhir_entry_key(entry)
-        if key in unique_entrys:
-            merge_mappings(unique_entrys[key], entry)
-        else:
-            unique_entrys[key] = entry
-    json_data["entry"] = list(unique_entrys.values())
+def parse_fhir(json_input: str) -> MutableMapping:
+    json_data = parse_json(json_input)
+    if json_data:
+        entries = to_list(json_data.get("entry", []))
+        if len(entries) > 1:
+            unique_entrys: dict[str, dict] = {}
+            for entry in entries:
+                key = get_fhir_entry_key(entry)
+                if key in unique_entrys:
+                    merge_mappings(unique_entrys[key], entry)
+                else:
+                    unique_entrys[key] = entry
+            json_data["entry"] = list(unique_entrys.values())
     return json_data
 
 
@@ -189,22 +175,68 @@ def get_fhir_entry_key(entry: Mapping) -> str:
     )
 
 
-def get_ccda_components(data: Mapping) -> Sequence:
+def get_ccda_section(
+    ccda: Mapping, search_template_ids: Sequence[str]
+) -> Optional[Mapping]:
+    """get_ccda_section Gets the POCD_MT000040.Section
+    from the ClinicalDocument that matches one of the templateIds
+
+    See https://github.com/HL7/CDA-core-2.0/tree/master/schema
+
+    Arguments:
+        ccda (Mapping): The ccda document as a map
+        search_template_ids (Sequence): The templateIds
+
+    Returns:
+        The section from the document if present
+    """
+    if search_template_ids:
+        for component in get_ccda_component3(ccda):
+            for id in get_component3_section_templateId(component):
+                for template_id in search_template_ids:
+                    if is_template_id(id, template_id):
+                        return component["section"]
+    return None
+
+
+def get_ccda_component3(ccda: Mapping) -> Sequence:
+    """get_ccda_component3 Gets the POCD_MT000040.Component3
+    from the ClinicalDocument.
+
+    See https://github.com/HL7/CDA-core-2.0/tree/master/schema
+
+    Arguments:
+        ccda (Mapping): The ccda document as a map
+
+    Returns:
+        The Component3 elements from the document, otherwise []
+    """
     return to_list(
-        data.get("ClinicalDocument", {})
+        ccda.get("ClinicalDocument", {})
         .get("component", {})
         .get("structuredBody", {})
         .get("component", [])
     )
 
 
-def get_ccda_section_template_ids(component: Mapping) -> Sequence:
+def get_component3_section_templateId(component: Mapping) -> Sequence:
+    """get_component3_section_templateId Gets the templateId
+    from the POCD_MT000040.Component3.
+
+    See https://github.com/HL7/CDA-core-2.0/tree/master/schema
+
+    Arguments:
+        component (Mapping): The component3 as a map
+
+    Returns:
+        The templateId from the component3, otherwise []
+    """
     return to_list(component.get("section", {}).get("templateId", []))
 
 
 def get_template_id_key(template_id: str) -> str:
     return re_sub(r"[^A-Za-z0-9]", "_", template_id)
 
 
-def is_template_id(id: dict, template_id: str) -> bool:
+def is_template_id(id: Mapping, template_id: str) -> bool:
     return template_id == id.get("root", "").strip()