Commit

Merge pull request #270 from gilch/macro-mistakes
Macro mistakes
gilch authored Nov 10, 2024
2 parents 72e199f + 0021e5d commit 5f85883
Showing 16 changed files with 5,247 additions and 1,066 deletions.
4 changes: 2 additions & 2 deletions docs/command_line_reference.rst
Original file line number Diff line number Diff line change
@@ -45,7 +45,7 @@ For example, using `hissp.reader.transpile`, a package name, and module names,

.. code-block:: console
$ alias lisspt="lissp -c '(hissp..transpile : :* [##1:] sys..argv)'"
$ alias lisspt="lissp -c '(H#transpile : :* [##1:] sys..argv)'"
$ lisspt pkg foo # Transpiles pkg/foo.lissp to pkg/foo.py in a package context.
$ lisspt pkg.sub foo # Transpiles pkg/sub/foo.lissp to .py in subpackage context.
$ lisspt "" foo bar # foo.lissp, bar.lissp to foo.py, bar.py without a package.
@@ -54,7 +54,7 @@ or using `hissp.reader.transpile_file`, a file name, and a package name,

.. code-block:: console
$ alias lissptf="lissp -c '(hissp.reader..transpile_file : :* [##1:] sys..argv)'"
$ alias lissptf="lissp -c '(H#reader.transpile_file : :* [##1:] sys..argv)'"
$ lissptf spam.lissp # Transpile a single file without a package.
$ cd pkg
$ lissptf eggs.lissp pkg # must declare the package name
2 changes: 1 addition & 1 deletion docs/conf.py
@@ -98,4 +98,4 @@
# so a file named "default.css" will overwrite the builtin "default.css".
html_static_path = ["_static"]

intersphinx_mapping = {"python": ("https://docs.python.org/3", None)}
intersphinx_mapping = {"python": ("https://docs.python.org/3.10", None)}
56 changes: 46 additions & 10 deletions docs/glossary.rst
@@ -20,9 +20,12 @@ Glossary
`Injection` of a Python statement is only valid at the top level.

doorstop
A `discarded item` used to "hold open" a bracket trail.
Typically ``_#/`` in Lissp,
but may be a `Unicode token` with a comment.
A `discarded item` used to "hold open" a bracket trail
or avoid a ``))`` in line.
Any discarded item used this way is functionally a doorstop,
but, in Lissp, the typical style starts with ``_#/``
and may continue with a label of what form the next ``)`` is closing,
like ``_#/foo``, similar to XML tags.

abstract syntax tree
ast
@@ -125,7 +128,9 @@ Glossary

atom
A `form` that is either the empty tuple ``()`` or not of type `tuple`.
`is_atomic` tests for atoms.
Atoms are the leaf elements of Hissp's syntax trees,
while non-empty tuples are the nodes.
`is_node` tests for the non-leaves, so its negation tests for atoms.

form
An object meant for evaluation;
@@ -165,6 +170,33 @@ Glossary
not linked-lists or vectors,
hence “params tuple” when written with a tuple.

standard
nonstandard
The standard language is a disciplined subset with full generality.
Standard (`readerless mode`) Hissp uses `str atom`\ s only for
`control word`\ s and `symbol`\ s
(which include imports and attribute access)
and avoids other `Python injection`\ s.
Standard Lissp also uses `str atom`\ s for `string literal fragment`\ s.
(Standard readerless mode instead compiles string literals exclusively
via the quote `special form`, or nested in `set`, `dict`, or `list` `atom`\ s.)
Other Python injections are considered nonstandard.
Nonstandard constructions should be used sparingly and with care.
Metaprograms are not necessarily expected to handle nonstandard Python injections,
because that would require processing the much more complicated language
of Python expressions, but not all nonstandard injections are problematic.
The bundled tags and macros mostly avoid nonstandard injections in expansions,
but (with the notable exception of `mix`)
allow them where they would be no worse than an opaque
`fully-qualified identifier`,
or in a few cases where the user writes part of the injection.
Standard Hissp also avoids importing the ``hissp`` package outside of
metaprograms (and direct helpers not otherwise called) to preserve the
`standalone property`.
Standard atom types are those the compiler has a literal notation for.
Use of nonstandard types can result in a `pickle expression` or a crash
during compilation (if the atom is unpickleable).

injection
Either a `Python injection` or a `Hissp injection`, depending on context.

@@ -178,22 +210,22 @@ Glossary
making them less useful or risking errors.
However, the compiler only targets a subset of Python expressions.
Injection transcends that limitation.
Injection of identifiers is considered standard in Hissp,
Injection of identifiers is considered `standard` in Hissp,
so is not discouraged.
A Lissp `Unicode token` reads as a `string literal fragment`,
rather than as a `quote`\ d `str atom`,
making them an example of injection as well.
This usage is standard in Lissp.
This usage is `standard` in Lissp.

hissp injection
Any `atom` of non-standard type (or the use thereof),
Any `atom` of `nonstandard` type (or the use thereof),
i.e., anything the compiler doesn't have a literal notation for,
which it would have to attempt to emit as a `pickle expression`.
This includes instances of standard types without a literal notation
(e.g., `float` is a standard type, but `math.nan` has no literal)
or collections containing nonstandard elements or cycles.
A macroexpansion may be an injection.
Besides macroexpansions, in readerless mode,
A `macro expansion` may be an injection.
Besides macro expansions, in readerless mode,
this almost always requires the use of non-literal notation,
(i.e., notation not accepted by `ast.literal_eval`).
In Lissp, this almost always requires the use of a `tagging token`.
@@ -266,10 +298,14 @@ Glossary
a type of `str atom` used for identifiers.

symbol
A `module handle` or a `Python fragment` containing an identifier.
A `module handle` or a `Python fragment` containing an
`identifier<str.isidentifier>`.
(Possibly with `qualification`.)
A symbol is always a `str atom`.
`is_symbol` tests for symbols.
Some identifiers are `reserved<keyword.iskeyword>` in Python and
can't be used as variable/attribute names
(`not`, `None`, `class`, etc.) These still count as symbols.

munging
The process of replacing characters invalid in a Python identifier
44 changes: 26 additions & 18 deletions docs/lissp_whirlwind_tour.rst
@@ -393,9 +393,9 @@ Lissp Whirlwind Tour
;; only meant for use at read time, but they're allowed to survive to
;; run time for debugging purposes.
#> spam=eggs
>>> # Kwarg('spam', 'eggs')
... __import__('pickle').loads(b'ccopy_reg\n_reconstructor\n(chissp.reader\nKwarg\nc__builtin__\nobject\nNtR(dVk\nVspam\nsVv\nVeggs\nsb.')
Kwarg('spam', 'eggs')
>>> # Kwarg(k='spam', v='eggs')
... __import__('pickle').loads(b'ccopy_reg\n_reconstructor\n(chissp.reader\nKwarg\nc__builtin__\ntuple\n(Vspam\nVeggs\nttR.')
Kwarg(k='spam', v='eggs')


;; use ; for a COMMENT TOKEN (like this one)
@@ -531,8 +531,8 @@ Lissp Whirlwind Tour
'QzDIGITxFOUR_2'

#> '\.
>>> 'QzFULLxSTOP_'
'QzFULLxSTOP_'
>>> 'QzDOT_'
'QzDOT_'

#> '\\
>>> 'QzBSOL_'
@@ -1532,17 +1532,20 @@ Lissp Whirlwind Tour
('operator..add', 1, ('operator..add', 2, 3))


;; Five of the helpers are predicates for inspecting code.
;; Some of the helpers are predicates for inspecting code.
#> (pprint..pp
#.. (list
#.. (itertools..starmap
#.. (lambda xy (|| x y.__name__))
#.. (filter (lambda x (|x[1]| |x[0]|))
#.. (itertools..product '(:control symbol "string" 'quoted () 1 '2)
#.. (|| hissp..is_atomic
#.. hissp..is_control
#.. (itertools..product '(:control re. "string" 'quoted () 1 '2)
#.. (|| hissp..is_control
#.. hissp..is_import
#.. hissp..is_node
#.. hissp..is_str
#.. hissp..is_symbol
#.. hissp..is_hissp_string
#.. hissp..is_lissp_unicode
#.. hissp..is_string_literal))))))
>>> __import__('pprint').pp(
... list(
@@ -1559,7 +1562,7 @@ Lissp Whirlwind Tour
... ),
... __import__('itertools').product(
... (':control',
... 'symbol',
... 're.',
... "('string')",
... ('quote',
... 'quoted',),
@@ -1568,21 +1571,26 @@ Lissp Whirlwind Tour
... ('quote',
... (2),),),
... (
... __import__('hissp').is_atomic,
... __import__('hissp').is_control,
... __import__('hissp').is_import,
... __import__('hissp').is_node,
... __import__('hissp').is_str,
... __import__('hissp').is_symbol,
... __import__('hissp').is_hissp_string,
... __import__('hissp').is_lissp_unicode,
... __import__('hissp').is_string_literal))))))
[(':control', 'is_atomic'),
(':control', 'is_control'),
('symbol', 'is_atomic'),
('symbol', 'is_symbol'),
("('string')", 'is_atomic'),
[(':control', 'is_control'),
(':control', 'is_str'),
('re.', 'is_import'),
('re.', 'is_str'),
('re.', 'is_symbol'),
("('string')", 'is_str'),
("('string')", 'is_hissp_string'),
("('string')", 'is_lissp_unicode'),
("('string')", 'is_string_literal'),
(('quote', 'quoted'), 'is_node'),
(('quote', 'quoted'), 'is_hissp_string'),
((), 'is_atomic'),
(1, 'is_atomic')]
(('quote', 2), 'is_node')]


;; Macros only work as invocations, not arguments!
4,568 changes: 3,845 additions & 723 deletions docs/macro_tutorial.rst

Large diffs are not rendered by default.

12 changes: 9 additions & 3 deletions docs/primer.rst
@@ -1559,9 +1559,9 @@ If you see one of these, make sure you used enough ``#``\ s on your tag.
.. code-block:: REPL
#> base=6
>>> # Kwarg('base', 6)
... __import__('pickle').loads(b'ccopy_reg\n_reconstructor\n(chissp.reader\nKwarg\nc__builtin__\nobject\nNtR(dVk\nVbase\nsVv\nI6\nsb.')
Kwarg('base', 6)
>>> # Kwarg(k='base', v=6)
... __import__('pickle').loads(b'ccopy_reg\n_reconstructor\n(chissp.reader\nKwarg\nc__builtin__\ntuple\n(Vbase\nI6\nttR.')
Kwarg(k='base', v=6)
The :term:`stararg token`\ s ``*=`` and ``**=`` also evaluate to a `Kwarg` object
and unpack the argument at that position,
@@ -1588,6 +1588,12 @@ Notice the ``.#``\ s required here.
>>> ['c', 'B', 'a']
['c', 'B', 'a']
;; Kwarg is a NamedTuple subclass, so they also count as pairs.
;; We used them directly before, but they each needed a #.
#> builtins..sorted##**=(reverse=True key=.#str.lower) (a B c)
>>> ['c', 'B', 'a']
['c', 'B', 'a']
;; A mapping object works as well, of course.
;; The .# makes a read-time dict object here.
#> builtins..sorted##**=.#(dict : reverse True key str.lower) (a B c)
121 changes: 90 additions & 31 deletions docs/style_guide.rst
@@ -539,7 +539,9 @@ Your code should look like these examples, recursively applied to subforms:
a 1 ; May be better for linewise version control.
b 2 ; Use this style sparingly.
c 3
_#/)
_#/dict) ;Doorstops are allowed to label what they're
; closing, like XML tags. Not needed for a
; form this short; a _#/ would have sufficed.
(function arg1 ;Bad. : not first. Weird extra levels.
arg2
@@ -707,7 +709,7 @@ not just the fact that it's a call:
- operator..sub ; Sometimes worth it, but
* operator..mul ; use this style sparingly.
/ operator..truediv
_#/) ;Doorstop holding ) on this line.
_#/.update) ;Doorstop holding ) on this line.
(.update (globals) ;Preferred. Standard style.
: + operator..add
@@ -1074,6 +1076,16 @@ Headings begin with four semicolons and a space ``;;;; Foo Bar``,
fit on one line,
and are written in ``Title Case`` by default.

Other Lisp dialects may use quadruple-semicolon comments for module-level comments,
as a category distinct from top-level commentary.
In Lissp,
module-level commentary should instead appear in the module's docstring,
or, in the case of implementation details,
in triple-semicolon comments near the top of the file,
usually immediately before or after the module docstring.
(E.g., license boilerplate.)
Quadruple semicolon comments are exclusively for headings.

Headings are for the :term:`top level` only;
they aren't nested in :term:`form`\ s;
they get their own line and start at the beginning of it.
@@ -1280,7 +1292,7 @@ should start with the `demunge`\ d name in doubled backticks
;; in a lexical scope surrounding ``e``.
;; ...
For `tag`\ s, use the number of hashes required for its minimum arity.
For :term:`tag`\ s, use the number of hashes required for its minimum arity.
The demunged names should be followed by the pronunciation in single quotes,
if it's not obvious from the identifier:

@@ -1527,29 +1539,44 @@ especially for small functions.
Aliasing and Imports
::::::::::::::::::::

Avoid repeating the name of the containing module or package when writing definitions,
because they may be accessed through an alias or as a module attribute.

The programmer should not have to guess
what an `alias<_macro_.alias>` means when jumping into an unfamiliar file.
Use consistent aliases within a project.
Usually, this means the alias is the module name, but not its containing packages,
unless there is a shorter well-known name in the community
(like ``np#`` for NumPy or ``op#`` for operators)
or for an internal module well-known within your project.
The bundled aliases can be considered well-known in Lissp.

Avoid reassigning attributes from other modules as globals
without a very good reason.
Yes, Python does this all the time.
Avoid assigning attributes of other modules to globals
without a good reason.
(A good reason might be to present a clean public interface in
``__init__.py``.)
Yes, Python code does this all the time.
It's how `from` works at the :term:`top level`.
Just access them as attributes from the module they belong to.
Just access them as attributes directly from the module they belong to.
This improves readability,
and for internal project modules,
improves reloadability during REPL-driven development.
Otherwise, instead of just refreshing the module with the updated definition,
every module reassigning it would have to be reloaded as well.

Aliases are also preferred over assigning modules as globals
Avoid using the same name for a module and one of its definitions.
This is an anti-pattern.
Yes, the standard library does this a lot,
with `datetime.datetime` being a notorious example.
In Python code, ``import datetime`` vs. ``from datetime import datetime``
is a common source of confusion.
And the module and class can't both be used as globals without renaming one of them,
which unfortunately discourages the use of the module object at all.

A set of variables with a common prefix (or suffix) is a code smell,
suggesting they should be members of a namespace (or other data structure)
with the prefix name.
If they're already members of a common namespace, the prefix is redundant
and should be removed.

Aliases are also preferred over assigning modules to globals
(although this is less of a problem).
They have the advantage of never colliding
with your locals or global function names,
@@ -1661,8 +1688,10 @@ The End of the Line

Ending brackets should also end the line.
That's what lets us indent and see the tree structure clearly.
Readability is mainly laid out on the page.
It's OK to have single ``)``'s inside the line,
but don't overdo it:
but don't overdo it.
Nesting more than a few levels in a single line can get confusing.

.. code-block:: Lissp
@@ -1689,12 +1718,39 @@ then the tree structure is clear from the indents:
(len xs))
"on average.")
A train of ``)``'s within a line is almost never acceptable.
A train of closing ``)`` tokens within a line is almost never acceptable.
A rare exception might be in something like an `enjoin`_,
because the structure of the string is more important for readability
than the structure of the tree,
but even then, limit it to three ``)))``.

Remember,
this rule is so we can indent to clearly see the tree structure of the code,
so it only applies to *closing tokens*,
not to the ``)`` character *per se*.
Brackets within an object token don't count.
E.g., whatever bracket structure you like within a :term:`Unicode token` is fine.
Symbols can also have ``)`` characters as long as they're escaped.

The empty tuple ``()`` is *technically* not an object token,
because it's made of an open and close token,
but this is an implementation detail
and it easily could have been a token in its own right.
Regardless, it *is* considered an :term:`atom`, which doesn't count as a node
(because atoms are leaves in the syntax tree).
A ``())`` is OK even if it's not at the end of the line,
but putting it there is usually preferred:

.. code-block:: Lissp
(print (.get items "key 1" ()) (.get items ())) ;OK. Only one node ) inside.
(print (.get items "key 1" ()) ;Preferred. )'s end the line.
(.get items ()))
(print "))) is 3 right parentheses") ;No problem. No closing tokens inside.
Semantic groups should be kept together.
Closing brackets inside a pair can happen in `cond`,
for example:
@@ -1717,6 +1773,12 @@ even in an implied group:
(gt (len xs) (len ys)) (print ">")
:else (print "0"))))
(defun compare (xs ys) ;OK. But use doorstops sparingly.
(cond (lt (len xs) (len ys) _#/lt) (print "<") ; 3 nesting levels in line is pushing it.
(gt (len xs) (len ys) _#/gt) (print ">")
:else (print "0"))))
(defun compare (xs ys) ;Bad. No groups. Can't tell if from then.
(cond (lt (len xs) (len ys))
(print "<")
@@ -1725,20 +1787,13 @@ even in an implied group:
:else
(print "0"))))
(defun compare (xs ys) ;OK. Use discard comments sparingly.
(cond (lt (len xs) (len ys))
(print "<")
_#:elif->(gt (len xs) (len ys)) ;Unambiguous, but unaligned.
(print ">")
:else (print "0")))) ; No internal ), so 1 line is OK. Still grouped.
(defun compare (xs ys) ;OK. Better.
(defun compare (xs ys) ;OK.
(cond (lt (len xs) (len ys))
(print "<")
;; else if ;The styling comment is not optional;
(gt (len xs) (len ys)) ; it's needed for separating groups.
;; else if ;A styling comment isn't optional here;
(gt (len xs) (len ys)) ; it's required to separate groups.
(print ">")
:else (print "0"))))
:else (print "0")))) ;Still grouped. 1 line OK--no internal ).
(defun compare (xs ys) ;Preferred. Keep cond simple.
(let (lxs (len xs)
@@ -1755,8 +1810,9 @@ implementing a single easily-testable concept
or perhaps a few very closely related ones.
Build up a vocabulary of definitions
so the requisite function becomes easily expressible.
Function definition bodies should be no more than 10 lines,
and usually no more than 5.
Function definition bodies should
usually be no more than 5 lines,
but may occasionally be several times that.
That's not counting docstrings, comments, or assertions.
(:term:`Params` aren't in the body.)

@@ -1766,6 +1822,11 @@ once the pure functional bits have been factored out.
At that point, lexical locality is more important for readability,
so it's better to leave them long than to break them up.

The bodies of nested lexical closures are also counted separately,
but any lines in their :term:`params` do count as part of their enclosing body.
This pattern behaves more like a class with methods,
and refactoring to that form may make it more easily testable.

Don't break up a single concept just to get under the line quota,
but consider if it could be refactored into a data structure,
or expressed with a more concise macro or tag notation.
@@ -1787,7 +1848,7 @@ you're not likely to consider exhaustively,
so you need a good reason to nail down at least one of them.
It just gets worse from there. The factorial sequence grows pretty quickly.
Why not make it easy and use meaningful names instead of meaningless positions?
Sort the kwonly parameters and (argument pairs) lexicographically by default.
Sort the kwonly parameters (and argument pairs) lexicographically by default.
You may have a good reason for some other order. E.g.,
if arguments or defaults need to be evaluated in a certain order due to side effects.
Document this kind of thing with a comment when it's not obvious,
@@ -1883,8 +1944,6 @@ or it may be possible to configure it to ignore violations in strings or comment
The Limits of Length
::::::::::::::::::::

Readability is mainly laid out on the page.

The optimal length for a line in a block of English text is thought to be around
50-75 characters, given the limitations of the human eye.
More than that, and it gets difficult to find the next line in the return sweep.
@@ -2027,7 +2086,7 @@ You need only modify forms containing long lines (or contained in long lines).
(function1 arg1 arg2 (function2 arg1
arg2))
;; Very bad! Confusing style. Break on `)` first!
;; Very bad! Confusing stair-step style. Break on `)` first!
(function1 arg1 (function2 arg1
arg2) arg2 (function3 arg1 arg2))
@@ -2039,7 +2098,7 @@ You need only modify forms containing long lines (or contained in long lines).
arg4)
:else (function3 arg1 arg2))
;; Bad. Confusing style.
;; Bad. Confusing stair-step style.
(cond (test1 x) (function1 arg1 arg2)
(test2 argument1
argument2
2 changes: 1 addition & 1 deletion requirements-dev.txt
@@ -2,7 +2,7 @@
# Copyright 2019, 2021, 2022, 2024 Matthew Egan Odendahl
# SPDX-License-Identifier: Apache-2.0
hypothesis==5.43.3
coverage==5.3
coverage==7.6.4
pytest==6.2.5
pytest-cov==2.10.1
sybil==2.0.1
25 changes: 18 additions & 7 deletions src/hissp/__init__.py
@@ -24,24 +24,30 @@
# / znxrf tbbq cenpgvpr / orfg cenpgvpr vg orgenlf.:Pnfgyrf ohvyg / va gur nve / juvgure gurl qb orybat?: Ryrtnapr /~
# gura rkprcgvba: Sbez / orsber qrgnvy: jurapr haqre gurz,:Sbhaqngvbaf nccrne.:Znxr gur evtug jnl boivbhf,:zrqvgngr~
# ba guvf.: --Mn Mra bs Uvffc:~
"""It's Python with a `Lissp`.
R"""It's Python with a `Lissp`.
See the GitHub project for complete documentation and tests.
https://github.com/gilch/hissp
``__init__.py`` imports several utilities for convenience, including
``__init__.py`` defines a few functions meant for use as
:term:`fully-qualified tag`\ s and imports several utilities for
convenience, including,
* from :mod:`hissp.compiler`:
* `Compiler`
* `is_atomic`
* `evaluate`
* `execute`
* `is_control`
* `is_import`
* `is_node`
* `is_str`
* `is_symbol`
* `readerless`
* `macroexpand`
* `macroexpand1`
* `macroexpand_all`
* `readerless`
* from :mod:`hissp.munger`:
@@ -52,6 +58,7 @@
* `transpile`
* `is_hissp_string`
* `is_lissp_unicode`
* `is_string_literal`
* and `hissp.repl.interact`
@@ -61,16 +68,20 @@
"""
from hissp.compiler import (
Compiler,
is_atomic,
evaluate,
execute,
is_control,
is_import,
is_node,
is_str,
is_symbol,
readerless,
macroexpand,
macroexpand1,
macroexpand_all,
readerless,
)
from hissp.munger import demunge, munge
from hissp.reader import transpile, is_hissp_string, is_string_literal
from hissp.reader import transpile, is_hissp_string, is_lissp_unicode, is_string_literal
from hissp.repl import interact

# Hissp must be importable to compile macros.lissp the first time.
164 changes: 110 additions & 54 deletions src/hissp/compiler.py
@@ -15,12 +15,12 @@
from collections.abc import Iterable, Sequence
from contextlib import contextmanager, suppress
from contextvars import ContextVar
from functools import wraps
from functools import partial, wraps
from itertools import chain, starmap, takewhile
from pprint import pformat
from traceback import format_exc
from types import ModuleType
from typing import Any, NewType, TypeAlias, TypeVar
from typing import Any, NewType, TypeAlias, TypeGuard, TypeVar
from warnings import warn

PAIR_WORDS = {":*": "*", ":**": "**", ":?": ""}
@@ -33,7 +33,7 @@
_PARAM_INDENT = f"\n{len('(lambda ')*' '}"

Env: TypeAlias = dict[str, Any]
ENV: ContextVar[Env | None] = ContextVar("ENV", default=None)
ENV: ContextVar[Env] = ContextVar("ENV")
"""
Expansion environment.
@@ -60,12 +60,12 @@


@contextmanager
def macro_context(env: Env | None):
def macro_context(env: Env):
"""Sets `ENV` during macroexpansions.
Does nothing if ``env`` is ``None`` or already the current context.
Does nothing if ``env`` is already the current context.
"""
if env is None or ENV.get() is env:
if ENV.get(None) is env:
yield
else:
token = ENV.set(env)
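
The hunk above drops the `None` default from `ENV`, so a bare `ENV.get()` now raises `LookupError` when no context is active; that is why the check becomes `ENV.get(None)`. The sketch below illustrates the pattern in isolation. It is not the project's code: the hunk is cut off after `ENV.set(env)`, so the `try`/`finally` reset is an assumption based on the usual `contextvars` idiom.

```python
from contextlib import contextmanager
from contextvars import ContextVar
from typing import Any

Env = dict[str, Any]
ENV: ContextVar[Env] = ContextVar("ENV")  # no default, as in the new version


@contextmanager
def macro_context(env: Env):
    """Set ENV for the duration of the block, unless it is already current."""
    if ENV.get(None) is env:  # get(None) avoids LookupError when ENV is unset
        yield
    else:
        token = ENV.set(env)
        try:  # assumed: restore the previous state on exit
            yield
        finally:
            ENV.reset(token)


demo_env = {"__name__": "demo"}  # hypothetical environment for illustration
with macro_context(demo_env):
    inside = ENV.get()  # safe here: the variable is set
outside = ENV.get(None)  # None again once the context has exited
```

Without a default, "no environment" and "environment set" are distinguished by whether the variable is set at all, so `Env | None` no longer has to appear in the type.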
@@ -179,21 +179,21 @@ def compile_form(self, form) -> str:
`tuple` and `str` have special evaluation rules,
otherwise it's an `atom` that represents itself.
"""
if type(form) is tuple and form:
if is_node(form):
return self.tuple_(form)
if type(form) is str and not form.startswith(":"):
if is_str(form) and not form.startswith(":"):
return self.fragment(form)
return self.atomic(form)

@_trace
def tuple_(self, form: tuple) -> str:
"""Compile `call`, `macro`, or `special` forms."""
match form:
case [["lambda", params, *body] as head] if type(
head
) is tuple and not self.parameters(params):
case [["lambda", params, *body] as head] if (
is_node(head) and not self.parameters(params)
):
return self.body(body) # progn optimization
case head, *_ if type(head) is str:
case head, *_ if is_str(head):
return self.special(form)
return self.call(form)

@@ -368,12 +368,12 @@ def expand_macro(self, form: tuple) -> str | Sentinel:
return _SENTINEL

@classmethod
def get_macro(cls, symbol, env: Env):
def get_macro(cls, symbol: object, env: Env):
"""Returns the macro function for ``symbol`` given the ``env``.
Returns ``None`` if ``symbol`` isn't a macro identifier.
"""
if type(symbol) is not str or symbol.startswith(":"):
if not is_str(symbol) or symbol.startswith(":"):
return None
return cls._get_macro(symbol, env)

@@ -484,7 +484,7 @@ def call(self, form: Iterable) -> str:
(singles := [*map(self.compile_form, takewhile(lambda a: a != ":", form))]),
starmap(self._pair_arg, pairs := [*_pairs(form)]),
)
if type(head) is str and head.startswith("."):
if is_str(head) and head.startswith("."):
if singles or pairs[0][0] == ":?":
return "{}.{}({})".format(next(args), head[1:], _join_args(*args))
raise CompileError("self must be paired with :?")
@@ -567,7 +567,7 @@ def atomic(self, form) -> str:
case = type(form)
if case is set and not form:
return "{*''}" # "set()" could be shadowed. "{}" is a dict.
if case is tuple and form:
if is_node(form):
return self._lisp_normal_form(form)
if case in {dict, list, set}:
return self._collection(form)
@@ -653,36 +653,91 @@ def _pairs(it: Iterable[T]) -> Iterable[tuple[T, T]]:
raise CompileError("incomplete pair") from None


def _resolve_env(env: Env | None = None) -> Env:
if env is not None or (env := ENV.get(None)) is not None:
return env
return inspect.currentframe().f_back.f_back.f_globals
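
A minimal, self-contained sketch of the fallback order `_resolve_env` implements: an explicit `env` wins, then the current `ENV`, then the globals of whoever called the public function. `public_helper` is a hypothetical stand-in for callers such as `readerless`; the two `f_back` hops only land on the right frame when `_resolve_env` is called directly from such a function.

```python
import inspect
from contextvars import ContextVar
from typing import Any, Optional

Env = dict[str, Any]
ENV: ContextVar[Env] = ContextVar("ENV")


def _resolve_env(env: Optional[Env] = None) -> Env:
    # 1. explicit argument, 2. current ENV, 3. the caller's caller's globals
    if env is not None or (env := ENV.get(None)) is not None:
        return env
    return inspect.currentframe().f_back.f_back.f_globals


def public_helper(env: Optional[Env] = None) -> Env:
    """Hypothetical public function; the frame that calls it supplies the globals."""
    return _resolve_env(env)


explicit = {"__name__": "explicit"}
from_argument = public_helper(explicit)  # the explicit env is returned as-is
from_caller = public_helper()  # falls back to this module's globals
```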


def readerless(form: object, env: Env | None = None) -> str:
"""Compile a Hissp form to Python without evaluating it.
Uses the current `ENV` for context, unless an alternative is provided.
(Creates a temporary environment if neither is available.)
Returns the Python in a string.
Returns the compiled Python in a string.
Unless an alternative ``env`` is specified, uses the current `ENV`
(available in a `macro_context`) when available, otherwise uses the
calling frame's globals.
"""
if env is None and (env := ENV.get()) is None:
env = {"__name__": "__main__"}
return Compiler(env=env, evaluate=False).compile([form])
return Compiler(env=_resolve_env(env), evaluate=False).compile([form])


def is_atomic(form: object) -> bool:
"""Determines if form is an `atom`."""
return type(form) is not tuple or form == ()
def evaluate(form: object, env: Env | None = None):
"""Convenience function to evaluate a Hissp form.
Unless an alternative ``env`` is specified, uses the current `ENV`
(available in a `macro_context`) when available, otherwise uses the
calling frame's globals.
>>> evaluate(('operator..mul',6,7))
42
"""
env = _resolve_env(env)
return eval(readerless(form, env), env)

def is_symbol(form: object) -> bool:

def execute(*forms: object, env: Env | None = None) -> str:
"""Convenience function to compile and execute Hissp forms.
Returns the compiled Python in a string.
Unless an alternative ``env`` is specified, uses the current `ENV`
(available in a `macro_context`) when available, otherwise uses the
calling frame's globals.
>>> print(execute(
... ('hissp.._macro_.define','FACTOR',7,),
... ('hissp.._macro_.define','result',('operator..mul','FACTOR',6,),),
... ))
# hissp.._macro_.define
__import__('builtins').globals().update(
FACTOR=(7))
<BLANKLINE>
# hissp.._macro_.define
__import__('builtins').globals().update(
result=__import__('operator').mul(
FACTOR,
(6)))
>>> result
42
"""
return Compiler(env=(_resolve_env(env))).compile(forms)


def is_str(form: object) -> TypeGuard[str]:
"""Determines if form is a `str atom`. (Not a `str` subtype.)"""
return type(form) is str


def is_node(form: object) -> TypeGuard[tuple]:
"""Determines if form is a nonempty tuple (not an `atom`)."""
return type(form) is tuple and form != ()


def is_symbol(form: object) -> TypeGuard[str]:
"""Determines if form is a `symbol`."""
return (type(form) is str and form != "") and all(
return (is_str(form) and form != "") and all(
part.isidentifier() for part in f"{form}_".replace("..", ".", 1).split(".")
)


def is_control(form) -> bool:
"""Determines if form is a `control word`."""
return type(form) is str and form.startswith(":")
def is_import(form: object) -> TypeGuard[str]:
"""Determines if form is a `module handle` or has `full qualification`."""
return is_symbol(form) and (".." in form or form.endswith("."))


def _resolve_env(e: Env | None = None, _e=ENV.get, _cf=inspect.currentframe) -> Env:
return (_cf().f_back.f_back.f_globals if _e() is None else _e()) if e is None else e
def is_control(form: object) -> TypeGuard[str]:
"""Determines if form is a `control word`."""
return is_str(form) and form.startswith(":")


def macroexpand1(form, env: Env | None = None):
@@ -694,7 +749,7 @@ def macroexpand1(form, env: Env | None = None):
(available in a `macro_context`) when available, otherwise uses the
calling frame's globals.
"""
if type(form) is not tuple or not form or form[0] in ["quote", "lambda"]:
if not is_node(form) or form[0] in ["quote", "lambda"]:
return form
head, *tail = form
env = _resolve_env(env)
@@ -704,21 +759,25 @@ def macroexpand1(form, env: Env | None = None):
return macro(*tail)


def macroexpand(form, env: Env | None = None):
def macroexpand(form, env: Env | None = None, *, preprocess=lambda x: x):
"""Repeatedly macroexpand outermost form until not a macro form.
If form is not a macro form, returns it unaltered.
Unless an alternative ``env`` is specified, uses the current `ENV`
(available in a `macro_context`) when available, otherwise uses the
calling frame's globals.
``preprocess`` (which defaults to identity function) is called on
the form before each expansion step.
"""
env = _resolve_env(env)
while True:
expanded = macroexpand1(form, env)
if expanded is form:
return form
form = expanded
with macro_context(_resolve_env(env)):
while True:
form = preprocess(form)
expanded = macroexpand1(form)
if expanded is form:
return form
form = expanded
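
The new `preprocess` hook runs before every expansion step, including the final one that finds nothing left to expand. The toy model below is a sketch rather than the real compiler: a plain dict (`MACROS`, hypothetical) replaces the `_macro_` lookup in `env`, and the `macro_context` handling is left out.

```python
def is_node(form):
    return type(form) is tuple and form != ()


# Hypothetical macro table standing in for the env's _macro_ namespace.
MACROS = {
    "twice": lambda x: ("double", x),
    "double": lambda x: ("operator..add", x, x),
}


def macroexpand1(form):
    """Expand the outermost form once; return it unaltered if not a macro form."""
    if not is_node(form) or form[0] in ["quote", "lambda"]:
        return form
    head, *tail = form
    macro = MACROS.get(head) if type(head) is str else None
    return form if macro is None else macro(*tail)


def macroexpand(form, *, preprocess=lambda x: x):
    """Expand repeatedly, calling ``preprocess`` before each step."""
    while True:
        form = preprocess(form)
        expanded = macroexpand1(form)
        if expanded is form:
            return form
        form = expanded


seen = []


def record(form):
    """Example preprocess hook: log each intermediate form, return it unchanged."""
    seen.append(form)
    return form


result = macroexpand(("twice", 3), preprocess=record)
```

Here `seen` ends up holding all three stages of the expansion, which is the kind of tracing or rewriting a hook like this allows.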


def macroexpand_all(
@@ -746,26 +805,23 @@ def macroexpand_all(
(available in a `macro_context`) when available, otherwise uses the
calling frame's globals.
"""
env = _resolve_env(env)
exp = postprocess(macroexpand(preprocess(form), env))
if type(exp) is not tuple or not exp or exp[0] == "quote":
return exp
if exp[0] != "lambda":
return tuple(
macroexpand_all(e, env, preprocess=preprocess, postprocess=postprocess)
for e in exp
)
return "lambda", _pexpand(exp[1], env), *(macroexpand_all(e, env) for e in exp[2:])
with macro_context(_resolve_env(env)):
exp = macroexpand(form, preprocess=preprocess)
if not is_node(exp) or exp[0] == "quote":
return postprocess(exp)
mx_a = partial(macroexpand_all, preprocess=preprocess, postprocess=postprocess)
if exp[0] != "lambda":
return postprocess((*map(mx_a, exp),))
return postprocess(("lambda", _pexpand(exp[1], mx_a), *map(mx_a, exp[2:])))


def _pexpand(params: Iterable, env: Env) -> Iterable:
def _pexpand(params: Iterable, mx_a: partial) -> Iterable:
if ":" not in params:
return params
singles, pairs = parse_params(params)
stars = {":*", ":**"}
if not pairs.keys() - stars:
if not pairs.keys() - (":*", ":**"):
return params
pairs = {k: v if k in stars else macroexpand_all(v, env) for k, v in pairs.items()}
pairs = {k: v if k in (":*", ":**") else mx_a(v) for k, v in pairs.items()}
return *singles, ":", *chain.from_iterable(pairs.items())
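
A minimal sketch of the expansion loop in the new `macroexpand`, for readers who want to run it in isolation. It assumes a plain dict of macro functions in place of Hissp's `Env` and `macro_context`; the `macros` parameter and the toy `unless` and `if-not` macros are hypothetical and not part of the commit.

```python
def macroexpand1(form, macros):
    """Expand the outermost form once; return the same object if it is not a macro form."""
    if type(form) is tuple and form and type(form[0]) is str and form[0] in macros:
        return macros[form[0]](*form[1:])
    return form


def macroexpand(form, macros, *, preprocess=lambda x: x):
    """Repeatedly expand the outermost form until it is no longer a macro form.

    ``preprocess`` is called on the form before each expansion step,
    including the last step, which finds nothing left to expand.
    """
    while True:
        form = preprocess(form)
        expanded = macroexpand1(form, macros)
        if expanded is form:  # identity check: nothing was expanded
            return form
        form = expanded


# Hypothetical toy macros: ``unless`` expands to ``if-not``, which expands to ``if``.
MACROS = {
    "unless": lambda test, *body: ("if-not", test, *body),
    "if-not": lambda test, *body: ("if", ("not", test), *body),
}

seen = []


def record(form):
    """Example preprocess hook that logs each form it is given."""
    seen.append(form)
    return form


result = macroexpand(("unless", "x", "y"), MACROS, preprocess=record)
```

The loop stops on an identity check (`expanded is form`), so a one-step expander has to return the very same object when it has nothing to do, as the toy `macroexpand1` does here.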


1,164 changes: 983 additions & 181 deletions src/hissp/macros.lissp

Large diffs are not rendered by default.

2 changes: 1 addition & 1 deletion src/hissp/munger.py
@@ -100,7 +100,7 @@ def _munge_part(part):
"+": "PLUS",
# COMMA is fine.
"-": "H", # Hyphen-minus
# Full stop reserved for imports and attributes.
".": "DOT", # Doesn't munge by default.
"/": "SOL",
# Digits only munge if first character.
# COLON is fine.
61 changes: 33 additions & 28 deletions src/hissp/reader.py
@@ -24,10 +24,19 @@
from keyword import iskeyword as _iskeyword
from pathlib import Path, PurePath
from pprint import pformat
from typing import Any, Callable as Fn, Literal, NewType, NoReturn, cast
from typing import (
Any,
Callable as Fn,
Literal,
NamedTuple,
NewType,
NoReturn,
TypeGuard,
cast,
)

import hissp.compiler as C
from hissp.compiler import Compiler, Env, readerless
from hissp.compiler import Env
from hissp.munger import force_qz_encode, munge

GENSYM_BYTES = 5
@@ -187,18 +196,14 @@ def __repr__(self) -> str:
return f"Comment({self.token!r})"


class Kwarg:
class Kwarg(NamedTuple):
"""Contains a read-time keyword argument for a `tag`.
Normally made with a `kwarg token`, but can be constructed directly.
"""

def __init__(self, k: str, v):
self.k = k
self.v = v

def __repr__(self) -> str:
return f"Kwarg({self.k!r}, {self.v!r})"
k: str
v: Any
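
A small sketch of what the switch to `NamedTuple` allows, using a standalone copy of the class. The `collect` helper below is a hypothetical stand-in for `_collect`: the diff hunk cuts off after the `"**"` test, so the `"**"` branch, the plain-keyword branch, and the non-`Kwarg` branch are assumptions about the elided code.

```python
from typing import Any, NamedTuple


class Kwarg(NamedTuple):
    """Standalone copy of the new class, for illustration only."""

    k: str
    v: Any


def collect(args: list, kwargs: dict, x) -> None:
    """Hypothetical stand-in for ``_collect``; branches after ``"*"`` are assumed."""
    if type(x) is Kwarg:
        k, v = x  # tuple unpacking replaces ``x.k, x.v``
        if k == "*":
            args.extend(v)
        elif k == "**":
            kwargs.update(v)  # assumption: unpack a mapping
        else:
            kwargs[k] = v  # assumption: ordinary keyword argument
    else:
        args.append(x)  # assumption: positional argument


args, kwargs = [], {}
for item in [1, Kwarg("*", [2, 3]), Kwarg("sep", ":"), Kwarg("**", {"end": ""})]:
    collect(args, kwargs, item)
```

One visible side effect of the change is the generated `repr`, which now includes field names (`Kwarg(k='sep', v=':')`) instead of the positional form the removed `__repr__` produced.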


class Lissp:
@@ -216,7 +221,7 @@ def __init__(
):
self._template_count = 0
self.qualname = qualname
self.compiler = Compiler(self.qualname, env, evaluate)
self.compiler = C.Compiler(self.qualname, env, evaluate)
self.filename = filename

def template_count(self):
@@ -377,7 +382,7 @@ def unquote_context(self):
def _inject(self, v: str):
with C.macro_context(self.lissp.env):
return eval(
readerless(self._pull(v, self._pos), self.lissp.env), self.lissp.env
C.readerless(self._pull(v, self._pos), self.lissp.env), self.lissp.env
)

def _pull(self, v: str, p: int | None = None):
@@ -397,27 +402,27 @@ def _template(self, v: str):
def _template_form(self, form):
"""Process form as template."""
case = type(form)
if is_lissp_string(form):
if is_lissp_unicode(form):
return "quote", form
if case is tuple and form:
return ("",":",*chain(*self._template_element(form)),":?","") # fmt: skip
if case is str and not form.startswith(":"):
if C.is_node(form):
return ("",":", *chain(*self._template_forms(form)), ":?", "") # fmt: skip
if C.is_str(form) and not form.startswith(":"):
return "quote", self.qualify(form)
if case is _Unquote:
if form.target == ":?":
return form.value
raise SyntaxError("splice not in tuple", self.position())
return form

def _template_element(self, forms: Iterable) -> Iterable[tuple[str, Any]]:
def _template_forms(self, forms: Iterable) -> Iterable[tuple[str, Any]]:
invocation = True
for form in forms:
case = type(form)
if case is str and not form.startswith(":"):
if C.is_str(form) and not form.startswith(":"):
yield ":?", ("quote", self.qualify(form, invocation))
elif case is _Unquote:
yield form
elif case is tuple:
elif C.is_node(form):
yield ":?", self._template_form(form)
else:
yield ":?", form
@@ -486,7 +491,7 @@ def _label(cls, arity: int, tag: str) -> str:
@classmethod
def _collect(cls, args: list, kwargs: dict, x) -> None:
if type(x) is Kwarg:
k, v = x.k, x.v
k, v = x
if k == "*":
args.extend(v)
elif k == "**":
@@ -509,6 +514,7 @@ def _fully_qualified(tag: str):
return cast(Fn, reduce(getattr, function.split("."), import_module(module)))

def _local(self, tag: str):
tag = tag.replace(".", force_qz_encode("."))
try:
return getattr(self.lissp.env[C.MACROS], tag + munge("#"))
except (AttributeError, KeyError):
@@ -551,7 +557,7 @@ def _check_depth(self) -> None:
raise SoftSyntaxError("form missing a `)`", self.position(self.depth.pop()))


def is_hissp_string(form: object) -> bool:
def is_hissp_string(form: object) -> TypeGuard[str | tuple[Literal["quote"], str]]:
"""Determines if form would directly represent a string in Hissp.
(A `Hissp string`.)
@@ -563,31 +569,30 @@ def is_hissp_string(form: object) -> bool:
`repr` on a string object.
"""
match form:
case ["quote", x] if type(form) is tuple and type(x) is str:
case ["quote", x] if C.is_node(form) and C.is_str(x):
return True
return bool(is_string_literal(form))


def is_lissp_string(form) -> bool:
def is_lissp_unicode(form: object) -> TypeGuard[str]:
"""
Determines if form could have been read from a Lissp string literal.
Determines if form could have been read from a Lissp `Unicode token`.
It's not enough to check if the form has a string type.
Several token types such as a `control token`, `symbol token`, or
`fragment token`, read in as a `str atom`. Macros may need to
distinguish these cases.
"""
return type(form) is str and form.startswith("(") and bool(is_string_literal(form))
return C.is_str(form) and form.startswith("(") and bool(is_string_literal(form))


def is_string_literal(form) -> bool | None:
def is_string_literal(form: object) -> TypeGuard[str]:
"""Determines if `ast.literal_eval` on form produces a string.
(A `string literal fragment`.)
``False`` if it produces something else or if it raises.
"""
with suppress(Exception):
return type(ast.literal_eval(form)) is str
return C.is_str(ast.literal_eval(form))
return False
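
A self-contained sketch of how the two predicates above relate. It assumes `C.is_str` behaves like the `type(x) is str` check it replaces in this diff; the local `is_str` is a stand-in, not the compiler's actual definition.

```python
import ast
from contextlib import suppress


def is_str(x) -> bool:
    """Stand-in for ``C.is_str`` (assumed equivalent to the old ``type(x) is str``)."""
    return type(x) is str


def is_string_literal(form) -> bool:
    """True if ``ast.literal_eval`` on form produces a string, else False."""
    with suppress(Exception):
        return is_str(ast.literal_eval(form))
    return False  # reached only when literal_eval raised


def is_lissp_unicode(form) -> bool:
    """True if form looks like it was read from a Lissp Unicode token."""
    return is_str(form) and form.startswith("(") and is_string_literal(form)


checks = {
    "('foo')": is_lissp_unicode("('foo')"),  # parenthesized string literal
    "'foo'": is_lissp_unicode("'foo'"),      # string literal, but no leading "("
    "(1+1)": is_lissp_unicode("(1+1)"),      # starts with "(", but not a string literal
}
```

The `"(1+1)"` case mirrors `test_is_string_code` in `tests/test_reader.py`: the leading parenthesis alone is not enough, because the fragment must also evaluate to a string.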


def is_qualifiable(symbol: str) -> bool:
59 changes: 58 additions & 1 deletion tests/test_cmd.py
@@ -1,6 +1,7 @@
# Copyright 2020, 2021, 2022, 2024 Matthew Egan Odendahl
# SPDX-License-Identifier: Apache-2.0

import os
import pathlib
import subprocess as sp
from sys import executable as python
from textwrap import dedent
@@ -385,3 +386,59 @@ def test_subrepl():
"> back in __main__\n",
"> #> ",
) # fmt: skip


def test_refresh():
try:
pathlib.Path("__refresh.lissp").write_text("")
pathlib.Path("__refresh.py").write_text("foo=1\n")
import __refresh

call_response(
"> #> ", "< hissp..subrepl#__refresh.\n",
"! >>> (lambda module=__import__('__refresh'):\n",
"! ... # hissp.._macro_.unless\n",
"! ... (lambda b, a: ()if b else a())(\n",
"! ... __name__==module.__name__,\n",
"! ... (lambda :\n",
"! ... (print(\n",
"! ... 'Entering',\n",
"! ... module.__name__),\n",
"! ... __import__('hissp').interact(\n",
"! ... __import__('builtins').vars(\n",
"! ... module)),\n",
"! ... print(\n",
"! ... 'back in',\n",
"! ... __name__)) [-1]\n",
"! ... ))\n",
"! ... )()\n",
"> Entering __refresh\n",
f"! {BANNER}",
"> #> ", "< foo\n",
"! >>> foo\n",
"> 1\n",
"> #> ", """< (.write_text (pathlib..Path '__refresh.lissp) "|foo=2|")\n""",
"! >>> __import__('pathlib').Path(\n",
"! ... '__refresh.lissp').write_text(\n",
"! ... ('|foo=2|'))\n",
"> 7\n",
"> #> ", "< hissp..refresh#:\n",
"! >>> (lambda name=__name__:\n",
"! ... (__import__('hissp.reader',fromlist='*').transpile(\n",
'! ... *name.rpartition(".")[::2]),\n',
"! ... __import__('importlib').reload(\n",
"! ... __import__('importlib').import_module(\n",
"! ... name))) [-1]\n",
"! ... )()\n",
f"> {__refresh!r}\n",
"> #> ", "< foo\n",
"! >>> foo\n",
"> 2\n",
"> #> ",
f"! {EXIT_MSG}",
"> back in __main__\n",
"> #> ",
) # fmt: skip
finally:
os.remove("__refresh.py")
os.remove("__refresh.lissp")
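
The expected `refresh#` expansion in the test above passes `*name.rpartition(".")[::2]` to `transpile`. A short sketch of what that slice yields; the helper name `split_qualname` is hypothetical and used only for illustration.

```python
def split_qualname(name: str) -> tuple[str, str]:
    """Split a dotted module name into (package, module).

    Equivalent to ``tuple(name.rpartition(".")[::2])``: rpartition returns
    (head, separator, tail), and the ``[::2]`` slice drops the separator.
    """
    package, _dot, module = name.rpartition(".")
    return package, module


nested = split_qualname("pkg.sub.foo")
top_level = split_qualname("__refresh")
```

For a top-level module such as `__refresh`, the package part is the empty string, which matches the `lisspt "" foo bar` usage shown in the command line reference.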
2 changes: 1 addition & 1 deletion tests/test_compiler.py
@@ -15,7 +15,7 @@
| st.integers()
| st.floats(allow_nan=False)
| st.complex_numbers(allow_nan=False)
| st.binary(max_size=13) # bytes
| st.binary(max_size=3) # bytes
)
literals = st.recursive(
# strings, bytes, numbers, tuples, lists, dicts, sets, booleans, and None
27 changes: 23 additions & 4 deletions tests/test_reader.py
@@ -9,6 +9,7 @@
from unittest.mock import ANY

import hypothesis.strategies as st
from hissp.reader import SoftSyntaxError
from hypothesis import given

from hissp import reader
@@ -156,7 +157,7 @@ def test_reader_missing(self):
next(self.reader.reads("(x#)"))

def test_reader_initial_dot(self):
msg = r"unknown tag 'QzFULLxSTOP_foo'"
msg = r"unknown tag 'QzDOT_foo'"
with self.assertRaisesRegex(SyntaxError, msg):
next(self.reader.reads(".foo# 0"))

@@ -167,7 +168,7 @@ def test_template(self):
) # fmt: skip

def test_is_string_code(self):
self.assertFalse(reader.is_lissp_string("(1+1)"))
self.assertFalse(reader.is_lissp_unicode("(1+1)"))

def test_gensym_equal(self):
self.assertEqual(*next(self.reader.reads(".#`($#G $#G)")))
@@ -183,6 +184,24 @@ def test_gensym_name(self):
self.assertNotEqual(main, name)
self.assertRegex(main + name, r"(?:_Qz[a-z0-7]+__G){2}")

def test_unwrapped_splice(self):
with self.assertRaisesRegex(SyntaxError, "splice not in tuple"):
next(self.reader.reads("`,@()"))

def test_bad_fragment(self):
with self.assertRaisesRegex(SyntaxError, "unpaired |"):
next(self.reader.reads("|foo"))

def test_trivial_template(self):
self.assertEqual(":foo", next(self.reader.reads("`:foo")))

def test_tag_under_arity(self):
msg = "reader tag 'foo##' missing argument"
with self.assertRaisesRegex(SoftSyntaxError, msg):
next(self.reader.reads("foo##2\n"))
with self.assertRaisesRegex(SyntaxError, msg):
next(self.reader.reads("(foo##2)"))


EXPECTED = {
# Numeric
@@ -270,7 +289,7 @@ def test_gensym_name(self):
("quote",
"QzTILDE_QzBANG_QzAT_QzHASH_QzDOLR_QzPCENT_QzHAT_QzET_QzSTAR_QzLPAR_QzRPAR__"
"QzPLUS_QzLCUB_QzRCUB_QzVERT_QzCOLON_QzQUOT_QzLT_QzGT_QzQUERY_QzGRAVE_QzH_QzEQ_"
"QzLSQB_QzRSQB_QzBSOL_QzSEMI_QzAPOS_QzCOMMA_QzFULLxSTOP_QzSOL_",)
"QzLSQB_QzRSQB_QzBSOL_QzSEMI_QzAPOS_QzCOMMA_QzDOT_QzSOL_",)
],

R"""\1 \12 \[] \(\) \{} \[] \: \; \# \` \, \' \" \\ \\. \. \ """: [
@@ -289,7 +308,7 @@ def test_gensym_name(self):
"QzQUOT_",
"QzBSOL_",
'QzBSOL_.',
'QzFULLxSTOP_',
'QzDOT_',
"QzSPACE_",
],
