Lissp gotchas
- Lambda forms don't have docstrings (because Python lambdas don't either). However, you can attach one afterwards as the `.__doc__` attribute. This includes lambdas used as methods in a `deftype` or in top-level function definitions.
- `deftype` has no special casing for docstrings either. Set the `__doc__` attribute if you want one, preferably first.
- Module-level docstrings work the same as Python: if the first top-level statement is a string literal, it's the module's docstring. Anything that Hissp would compile to that works. Setting the `__doc__` global also works and may be more useful for metaprogramming because it doesn't have to come first in the compiled output, although it's best if the docstring is easy to find in the source code as well.
Hebigo's class and function macros are more advanced. They work more like Python's statements and may include docstrings.
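The docstring points above can be seen at the Python level. This is a minimal sketch in plain Python rather than Lissp, assuming that a `deftype` boils down to a dynamic `type()` call; the names `square`, `Point`, and `norm` are made up for illustration.

```python
# Python lambdas have no docstring syntax, so __doc__ starts out as None.
square = lambda x: x * x
assert square.__doc__ is None

# Attach the docstring afterwards, as an ordinary attribute.
square.__doc__ = "Return x squared."

# Assumption: a class built dynamically with type(), standing in for
# what a deftype form would produce.
Point = type(
    "Point",
    (),
    {
        "__doc__": "A 2D point.",  # set first, so it's easy to find
        "norm": lambda self: 0,    # a lambda used as a method (stub)
    },
)

# Lambdas used as methods can get a docstring the same way.
Point.norm.__doc__ = "Return the distance from the origin (stubbed as 0)."
```

After this, `help(square)` and `help(Point)` show the attached text, just as they would for a `def` or `class` statement with a docstring.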
Targets are not evaluated in the `case` macro, similar to a `quote` form. If you need a `str` target, you can use a symbol if it's an identifier. If it's not, you can use a `||` token or inject (`.#`) any read-time string expression.
It's idiomatic to use a symbol in place of a lambda's parameter tuple when only single-character positional parameters are desired. This works because symbols happen to compile to `str`s which are also iterables of (character) `str`s. But if anything munges, you'll get more parameters than you asked for. Using `||` tokens instead wouldn't help, since without munging, special characters are not valid identifiers in Python.
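The reason a symbol can stand in for a parameter tuple, and why munging breaks it, comes down to how Python iterates over strings. A small sketch in plain Python; the munged spelling `QzPLUS_` for `+` is the one this page uses further down, and the variable names are only illustrative.

```python
# A str is an iterable of one-character strs, so a two-letter symbol
# behaves like a tuple of two single-character parameter names.
params = tuple("ab")
assert params == ("a", "b")

# A munged name is a longer str, so it iterates into one "parameter"
# per character rather than the single name that was intended.
munged = "QzPLUS_"
too_many = tuple(munged)
assert len(too_many) == 7
```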
Most types in Hissp represent themselves, but the `tuple` and `str` atom types have special evaluation rules. You can't just use them literally, but they can be quoted as data. Tuples are invocation forms and `str`s are raw Python code (with some preprocessing).
There are no separate string/symbol/keyword types like in other Lisps, because Python only uses strings for these concepts. Although Lissp has a notation for each of these, Hissp proper only has `str` atoms and all three token types get read as those, though with differing rules. The distinctions between control-word, symbol, and `""` tokens are reader-level concepts. They exist in Lissp, but not Hissp proper. A quoted Lissp string literal (like `'"foo"`) gives you a `str` atom containing the Python code for a Python string literal (like `"('foo')"`).
This is surprising if you expect Lissp `""` tokens to represent themselves like the other atoms. Remember, literal fragments are spelled with `||` in Lissp, not `""`, which is (approximately) a shorthand for `|('')|`. You have to use inject (`.#`) on a Lissp `""` token to make it behave like a `||` token. If it happens to be a valid identifier, you could use a symbol token, otherwise it would munge. This also applies to recursively quoted tuples containing Lissp string literals. E.g. `'("spam" "ham" "eggs")` compiles to (the equivalent of) `("('spam')", "('ham')", "('eggs')")`, not `('spam', 'ham', 'eggs')` as one might expect. But `` `(,"spam" .#"ham" |eggs|) `` or `'(spam \h\a\m |eggs|)` would.
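The difference between a `str` atom that holds Python code for a string literal and the string itself can be checked in plain Python. This sketch only uses the example values quoted above; `ast.literal_eval` stands in here for "evaluating the code", which is an illustration and not a claim about how the compiler works internally.

```python
import ast

# The str atom that a quoted Lissp "" token reads as: Python *code*
# for a string literal, parentheses and quotes included.
atom = "('foo')"
assert atom != "foo"

# Evaluating that code is what finally yields the plain string.
assert ast.literal_eval(atom) == "foo"

# The same applies to each element of a quoted tuple.
quoted = ("('spam')", "('ham')", "('eggs')")
evaluated = tuple(ast.literal_eval(a) for a in quoted)
assert evaluated == ("spam", "ham", "eggs")
```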
Control word tokens read as `str` atoms that happen to begin with `:`, which can be meaningful at the Hissp level in certain contexts. If you're used to keywords in Common Lisp, you might expect something like `:|foo bar|` to work, since the `||` just `\`-escape characters in a range. But in Lissp, `||` are delimiters, like `""`, so the whole fragment must begin and end with them. Therefore, correct equivalent spellings in Lissp would be `|:foo bar|` or `:foo\ bar`, which would each read as a `':foo bar'` `str` atom.
The `[#` reader macro is optimized for numeric indexing and slicing. It works well. Although it's possible to write any Python code that begins with that prefix, it has to demunge a single `str` atom to do it. This makes it a poor choice for expressions containing Python string literals or munged symbols.
E.g., `[#'foo']` seems natural, but tokenizes as the macro `[#`, the macro `'`, and an atom `foo']`. Correct spellings include `[#\'foo']` or `[#|'foo']|`, which tokenize properly. While these are acceptable style, `get#"foo"` may be more natural in this case.
Similarly, most Python identifiers work, but something like `[#+]` would result in invalid Python, even if `+` is defined in scope. You'd instead have to use the munged name, like `[#QzPLUS_]`, but even that doesn't work because there's a demunging step, which takes us back to where we started. Instead, a correct spelling double-munges, like `[#QzPLUSQzLOWxLINE_]`. The `QzPLUS` isn't recognized as a munged name, the `QzLOWxLINE_` demunges to `_`, and demunging is not recursive, so this results in the desired `QzPLUS_` identifier.
Don't do this. You almost never have to write munged names yourself, and certainly writing a double-munged name is a bad sign. They're meant to be human-readable, but aren't intended to be human-written. Use something like `get#+` instead. You can also use the `slice` builtin and `->` if you need chained lookups.
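Since `get#` is shorthand for an `operator..itemgetter` call, the suggested alternatives can be pictured in plain Python. This is a sketch of the runtime behaviour only, not compiler output; the data (`namespace`, `letters`, `nested`) is made up for illustration.

```python
from operator import itemgetter

# Looking up an awkward key such as "+" needs no munging at all
# when it is passed as data to itemgetter.
namespace = {"+": "plus"}
assert itemgetter("+")(namespace) == "plus"

# The slice builtin covers what slicing syntax would otherwise do.
letters = ["a", "b", "c", "d"]
assert itemgetter(slice(1, 3))(letters) == ["b", "c"]

# Chained lookups are just nested calls, which is the shape a
# threading macro would build: nested["rows"][1][0]
nested = {"rows": [[1, 2], [3, 4]]}
value = itemgetter(0)(itemgetter(1)(itemgetter("rows")(nested)))
assert value == 3
```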
A `->` is smart enough to wrap non-tuple arguments in a tuple. However, a common need is something like `get#foo`. While this doesn't look like a tuple, at macro expansion time, it is, because it's reader shorthand for `(operator..itemgetter foo)`, which is what the `->` macro sees. You need to double wrap it: `(-> spam (get#foo))`, or it will expand to `(operator..itemgetter spam foo)` instead, which means something else.
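What "means something else" looks like can be shown with the Python equivalents of the two expansions. A sketch under assumptions: `spam` is a made-up dict, and the string key `"foo"` stands in for whatever `foo` would evaluate to.

```python
from operator import itemgetter

spam = {"foo": 42}

# Double-wrapped form: build the getter first, then call it on spam.
intended = itemgetter("foo")(spam)
assert intended == 42

# Single-wrapped form: spam is threaded into the itemgetter call itself,
# which builds a getter for two keys and never looks anything up.
mistaken = itemgetter(spam, "foo")
assert callable(mistaken)
assert mistaken != 42
```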
Using types literally that have no literal representation in Python will result in a pickle expression. Unpicklable types will fail to compile at all. This doesn't usually happen accidentally. Types with no representation are usually compiled to the code to construct them, rather than the objects themselves. However, they can appear in the Hissp directly in two ways, either as the result of an injection, or as the result of a macroexpansion.
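The distinction between picklable and unpicklable objects is Python's own, and can be tried directly with the standard `pickle` module. This sketch does not reproduce the compiler's output format; `Fraction` and the generator are just convenient examples, and `can_pickle` is a hypothetical helper.

```python
import pickle
from fractions import Fraction


def can_pickle(obj):
    """Return True if pickle can serialize obj, False otherwise."""
    try:
        pickle.dumps(obj)
    except (TypeError, pickle.PicklingError):
        return False
    return True


# A Fraction has no literal syntax in Python, but it round-trips
# through pickle, so an equivalent object can be rebuilt later.
value = Fraction(1, 3)
restored = pickle.loads(pickle.dumps(value))
assert restored == value

# A generator object cannot be pickled at all.
assert can_pickle(value)
assert not can_pickle(n for n in range(3))
```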
If your use case allows adding dependencies and you need Hissp to be able to pickle more types, consider adding Dill. You can use `(setattr hissp.compiler. 'pickle dill.)` or (in Python) `import dill, hissp.compiler; hissp.compiler.pickle = dill` to enable Dill in the compiler.
Comments are not simply ignored by the reader. They're parsed objects of a type that gets dropped by the reader by default. But they can still be arguments for reader macros before that happens. If you put a reader tag on its own line, take care not to accidentally add a line comment after it.
The prelude will replace your `_macro_` object if you had one.
`` `foo `` expands to something like `__main__..foo`. This usually does what you want when using template quotes to make code. An undesired qualification is a sign that it's a local identifier and you should use a gensym or that it's an intentional anaphor and you should explicitly suppress the qualification.
When using template quotes to make data not intended as code, you don't want qualification at all, and you should just unquote each thing. `en#tuple` or `@` are often viable alternatives.
It may seem natural to try something like `(.update (globals) : _macro_ hissp.._macro_)`. You probably don't want to do this.
You'll lose the standalone property, meaning Hissp must be installed to run the compiled output, and worse, if you do this in multiple modules, their `defmacro`s will leak through the shared mutable `_macro_` object, so you'll also lose modularity. It's like assigning attributes to `builtins`; while there are legitimate uses, you probably don't need to do it, and it can get you into trouble, especially if everyone is doing it.
Use `prelude` or `alias` instead. They don't have these problems. If you take care to copy the contents of `_macro_`, rather than the object itself, you keep modularity. The prelude does this. If you avoid crashing when `hissp` isn't installed, then you keep the standalone property. The prelude also does this. So a proper full `_macro_` import might look something like:
(hissp.._macro_.when (importlib.util..find_spec 'hissp)
(.update (globals) : _macro_ (types..SimpleNamespace : :** (vars hissp.._macro_))))
The above is only for illustration. You wouldn't write it this way yourself (although you might write a custom prelude that expands to something like it). A more idiomatic approach would be to inject, which simplifies things.
.#(.update (globals) : _macro_ (types..SimpleNamespace : :** (vars hissp.._macro_)))
Beware of side effects in injections, but this would at least be idempotent. `.update` always returns `None`, so nothing gets compiled in and nothing happens at run time, but the macros do get added as a side effect of the form being read, making them available during compilation.
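Why copying the contents of `_macro_` preserves modularity while sharing the object does not can be sketched in plain Python. Everything here is a stand-in: `shared` plays the role of the bundled macro namespace, and `module_a`, `module_b`, `when`, and `my_macro` are hypothetical names.

```python
import types

# Stand-in for a shared macro namespace (hypothetical contents).
shared = types.SimpleNamespace(when=lambda *args: "expansion")

# Module A reuses the object itself; module B copies its contents.
module_a_macros = shared
module_b_macros = types.SimpleNamespace(**vars(shared))

# A macro defined "in module A" lands on the shared object...
module_a_macros.my_macro = lambda: "from module A"
assert hasattr(shared, "my_macro")  # leaked to every sharer

# ...but module B's copy was never touched.
assert not hasattr(module_b_macros, "my_macro")
assert hasattr(module_b_macros, "when")

# dict.update returns None, which is why an injected .update call
# leaves nothing behind for run time.
assert {}.update(a=1) is None
```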
For sharing macros among your own modules in a larger project, you probably just want `alias`, but it is possible to use the fully-qualified names to pick out individual macro functions and attach them to your module's `_macro_` object.
E.g. `_#` discards in Lissp, but `#_` discards in EDN and Clojure.
Typically, in other Lisps, reader macros use a leading dispatch character (e.g. `'`), with `#` used to expand the repertoire by dispatching on the next character rather than spending a new character on each macro, so `#` ends up being used in the majority of cases. Lissp is tokenized by a backtracking regex and instead has the reader tag tokens end in a `#`. This is a bit more compatible with generic Lisp editors, which often don't know what to do with reader macros or do the wrong thing. They assume the Lissp reader tags are normal symbols, which is pretty much the correct behavior, although this can still confuse structural editing or automatic indents when in the function position. Lissp reader tags process the next parsed object (like EDN tags), not the raw character stream like Common Lisp reader macros. They can also often be applied to their argument without separating whitespace.
These are unlikely to cause problems but should be noted somewhere.
The `_macro_` namespace should only contain the callable objects used in macroexpansion. If you set an attribute of `_macro_` to `None` and try to expand it as a macro, the compiler will act as if it's not there and compile it like a function call instead. Setting the attribute to any other non-callable will crash when the compiler tries to call it.