Release v2.0.0 2020-10-29 · cu-mkp/manuscript-object

Second major release of manuscript-object with significant changes to the core object code.

All of the central code is now in two files, with one auxiliary file (utils.py).
update.py usage remains unchanged.
The titular "manuscript object" is now a class called Manuscript inside a file called manuscript.py. Each entry is turned into an object of the Entry class inside a file called entry.py.

Some highlights:

much faster (update_entries() is, as before, the longest step)
increased verbosity during generation
manuscript and entry modules can be imported and used interactively
Manuscript and Entry classes control their own behavior
- e.g. generating and updating derivatives happens inside the Manuscript class
- update.py works as before, but simply calls the update methods inside Manuscript
- this means if you want to generate the derivative output in a Python shell and interact with it as a string or table, you can do so by importing manuscript and running one of the derivative generation methods
- derivative generation takes place in 2 steps: generation and then writing. This enables checks for correctness before writing to disk
All xml is converted to lxml.etree objects for easier and more consistent parsing
text renditions of editorial tags are created using an XSLT stylesheet
- this stylesheet takes parameters, so if you don't want to render del tags as <-TEXT->, for example, you can just set that to "false()"
Where possible, functions are reused rather than duplicated to make bugs easier to track down, e.g., there is only one function that converts a string to an lxml.etree Element.
the Entry class is very flexible:
- there are different methods to take a valid lxml.etree Element, a string of well-formed XML, or a filepath to a valid XML file
- folio and identity arguments are optional
- only one version of each entry is given at a time (handling tc, tcn, and tl versions is done by the Manuscript object, not the Entry)
- if you want to test or inspect the contents of a txt or xml file, you can load it as an Entry object in a Python shell and look at its text and properties there instead of opening the file manually
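
The two-step derivative flow described above (generate, check, then write) can be sketched roughly as follows. This is only an illustration, not the actual Manuscript API: `generate_derivative` and `write_derivative` are hypothetical names, and the standard library's `xml.etree.ElementTree` stands in for lxml.etree so the snippet runs without extra dependencies.

```python
import tempfile
import xml.etree.ElementTree as ET  # stand-in for lxml.etree (assumption, keeps the sketch stdlib-only)
from pathlib import Path


def generate_derivative(xml_strings):
    """Step 1 (hypothetical helper): build the derivative in memory as a string."""
    lines = []
    for xml in xml_strings:
        root = ET.fromstring(xml)  # raises ParseError if the XML is not well-formed
        lines.append("".join(root.itertext()).strip())
    return "\n".join(lines)


def write_derivative(text, path):
    """Step 2 (hypothetical helper): write an already-checked derivative to disk."""
    path = Path(path)
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(text, encoding="utf-8")
    return path


# Made-up sample entries, used only for illustration.
entries = [
    "<entry><head>Sand casting</head></entry>",
    "<entry><ab>Mix <m>sand</m> with <m>wine</m>.</ab></entry>",
]

derivative = generate_derivative(entries)

# Because nothing has touched the disk yet, the result can be inspected or
# validated here, e.g. one output line per entry.
assert len(derivative.split("\n")) == len(entries)

out_path = write_derivative(derivative, Path(tempfile.mkdtemp()) / "txt" / "all.txt")
```

Keeping generation separate from writing means a failed check leaves any existing files on disk untouched.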

To do:

implementing more automated spot- and unit-tests
sophisticated search function for Manuscript
making sure type annotations are useful and correct (e.g., specificity of "xml") - see use in
https://github.com/cu-mkp/manuscript-object/blob/94d158d814bf9a62071a11845a9b2938d561ab3e/entry.py#L10
optional arguments to Manuscript specifying which entries you want to generate
function to inspect the context around a particular term
visualization engine
thesaurus

see also any open issues: https://github.com/cu-mkp/manuscript-object/issues
