diff --git a/peps/pep-0750.rst b/peps/pep-0750.rst index fc7d8467557..3291da787ea 100644 --- a/peps/pep-0750.rst +++ b/peps/pep-0750.rst @@ -1,51 +1,46 @@ PEP: 750 -Title: Tag Strings For Writing Domain-Specific Languages -Author: Jim Baker , Guido van Rossum , Paul Everitt -Sponsor: Lysandros Nikolaou +Title: Template Strings +Author: Jim Baker , + Guido van Rossum , + Paul Everitt , + Koudai Aono , + Lysandros Nikolaou , + Dave Peck Discussions-To: https://discuss.python.org/t/pep-750-tag-strings-for-writing-domain-specific-languages/60408 Status: Draft Type: Standards Track Created: 08-Jul-2024 Python-Version: 3.14 +Post-History: `09-Aug-2024 `__, + `17-Oct-2024 `__, + `21-Oct-2024 `__ + Abstract ======== -This PEP introduces tag strings for custom, repeatable string processing. Tag strings -are an extension to f-strings, with a custom function -- the "tag" -- in place of the -``f`` prefix. This function can then provide rich features such as safety checks, lazy -evaluation, domain-specific languages (DSLs) for web templating, and more. +This PEP introduces template strings for custom string processing. -Tag strings are similar to `JavaScript tagged template literals `_ -and related ideas in other languages. The following tag string usage shows how similar it is to an ``f`` string, albeit -with the ability to process the literal string and embedded values: +Template strings are a generalization of f-strings, using a ``t`` in place of +the ``f`` prefix. Instead of evaluating to ``str``, t-strings evaluate to a new +type, ``Template``: .. code-block:: python - name = "World" - greeting = greet"hello {name}" - assert greeting == "Hello WORLD!" - - -Tag functions accept prepared arguments and return a string: + template: Template = t"Hello {name}" -.. 
code-block:: python - - def greet(*args): - """Tag function to return a greeting with an upper-case recipient.""" - salutation, recipient, *_ = args - getvalue, *_ = recipient - return f"{salutation.title().strip()} {getvalue().upper()}!" +Templates provide developers with access to the string and its interpolated +values *before* they are combined. This brings native flexible string +processing to the Python language and enables safety checks, web templating, +domain-specific languages, and more. -Below you can find richer examples. As a note, an implementation based on CPython 3.14 -exists, as discussed in this document. Relationship With Other PEPs ============================ Python introduced f-strings in Python 3.6 with :pep:`498`. The grammar was then formalized in :pep:`701` which also lifted some restrictions. This PEP -is based off of PEP 701. +is based on PEP 701. At nearly the same time PEP 498 arrived, :pep:`501` was written to provide "i-strings" -- that is, "interpolation template strings". The PEP was @@ -53,753 +48,995 @@ deferred pending further experience with f-strings. Work on this PEP was resumed by a different author in March 2023, introducing "t-strings" as template literal strings, and built atop PEP 701. -The authors of this PEP consider tag strings as a generalization of the -updated work in PEP 501. +The authors of this PEP consider it to be a generalization and simplification +of the updated work in PEP 501. (That PEP has also recently been updated to +reflect the new ideas in this PEP.) + Motivation ========== -Python f-strings became very popular, very fast. The syntax was simple, convenient, and -interpolated expressions had access to regular scoping rules. However, f-strings have -two main limitations - expressions are eagerly evaluated, and interpolated values -cannot be intercepted. The former means that f-strings cannot be re-used like templates, -and the latter means that how values are interpolated cannot be customized. 
- -Templating in Python is currently achieved using packages like Jinja2 which bring their -own templating languages for generating dynamic content. In addition to being one more -thing to learn, these languages are not nearly as expressive as Python itself. This -means that business logic, which cannot be expressed in the templating language, must be -written in Python instead, spreading the logic across different languages and files. - -Likewise, the inability to intercept interpolated values means that they cannot be -sanitized or otherwise transformed before being integrated into the final string. Here, -the convenience of f-strings could be considered a liability. For example, a user -executing a query with `sqlite3 `__ -may be tempted to use an f-string to embed values into their SQL expression instead of -using the ``?`` placeholder and passing the values as a tuple to avoid an -`SQL injection attack `__. - -Tag strings address both these problems by extending the f-string syntax to provide -developers access to the string and its interpolated values before they are combined. In -doing so, tag strings may be interpreted in many different ways, opening up the -possibility for DSLs and other custom string processing. - -Proposal -======== - -This PEP proposes customizable prefixes for f-strings. These f-strings then -become a "tag string": an f-string with a "tag function." The tag function is -a callable which is given a sequence of arguments for the parsed tokens in -the string. - -Here's a very simple example. Imagine we want a certain kind of string with -some custom business policies: uppercase the value and add an exclamation point. +Python f-strings are easy to use and very popular. Over time, however, developers +have encountered limitations that make them +`unsuitable for certain use cases `__. +In particular, f-strings provide no way to intercept and transform interpolated +values before they are combined into a final string. 
-Let's start with a tag string which simply returns a static greeting: - -.. code-block:: python +As a result, incautious use of f-strings can lead to security vulnerabilities. +For example, a user executing a SQL query with :mod:`python:sqlite3` +may be tempted to use an f-string to embed values into their SQL expression, +which could lead to a `SQL injection attack `__. +Or, a developer building HTML may include unescaped user input in the string, +leading to a `cross-site scripting (XSS) `__ +vulnerability. - def greet(*args): - """Give a static greeting.""" - return "Hello!" +More broadly, the inability to transform interpolated values before they are +combined into a final string limits the utility of f-strings in more complex +string processing tasks. - assert greet"Hi" == "Hello!" # Use the custom "tag" on the string +Template strings address these problems by providing +developers with access to the string and its interpolated values. -As you can see, ``greet`` is just a callable, in the place that the ``f`` -prefix would go. Let's look at the args: +For example, imagine we want to generate some HTML. Using template strings, +we can define an ``html()`` function that allows us to automatically sanitize +content: .. code-block:: python - def greet(*args): - """Uppercase and add exclamation.""" - salutation = args[0].upper() - return f"{salutation}!" + evil = "" + template = t"

{evil}

" + assert html(template) == "

&lt;script&gt;alert('evil')&lt;/script&gt;

" - greeting = greet"Hello" # Use the custom "tag" on the string - assert greeting == "HELLO!" - -The tag function is passed a sequence of arguments. Since our tag string is simply -``"Hello"``, the ``args`` sequence only contains a string-like value of ``'Hello'``. - -With this in place, let's introduce an *interpolation*. That is, a place where -a value should be inserted: +Likewise, our hypothetical ``html()`` function can make it easy for developers +to add attributes to HTML elements using a dictionary: .. code-block:: python - def greet(*args): - """Handle an interpolation.""" - # The first arg is the string-like value "Hello " with a space - salutation = args[0].strip() - # The second arg is an "interpolation" - interpolation = args[1] - # Interpolations are tuples, the first item is a lambda - getvalue = interpolation[0] - # It gets called in the scope where it was defined, so - # the interpolation returns "World" - result = getvalue() - recipient = result.upper() - return f"{salutation} {recipient}!" + attributes = {"src": "shrubbery.jpg", "alt": "looks nice"} + template = t"" + assert html(template) == 'looks nice' + +Neither of these examples is possible with f-strings. By providing a +mechanism to intercept and transform interpolated values, template strings +enable a wide range of string processing use cases. - name = "World" - greeting = greet"Hello {name}" - assert greeting == "Hello WORLD!" -The f-string interpolation of ``{name}`` leads to the new machinery in tag -strings: +Specification +============= -- ``args[0]`` is still the string-like ``'Hello '``, this time with a trailing space -- ``args[1]`` is an expression -- the ``{name}`` part -- Tag strings represent this part as an *interpolation* object as discussed below +Template String Literals +------------------------ -The ``*args`` list is a sequence of ``Decoded`` and ``Interpolation`` values. A "decoded" object -is a string-like object with extra powers, as described below. 
An "interpolation" object is a -tuple-like value representing how Python processed the interpolation into a form useful for your -tag function. Both are fully described below in `Specification`_. +This PEP introduces a new string prefix, ``t``, to define template string literals. +These literals resolve to a new type, ``Template``, found in a new top-level +standard library module, ``templatelib``. -Here is a more generalized version using structural pattern matching and type hints: +The following code creates a ``Template`` instance: .. code-block:: python - from typing import Decoded, Interpolation # Get the new protocols + from templatelib import Template + template = t"This is a template string." + assert isinstance(template, Template) - def greet(*args: Decoded | Interpolation) -> str: - """Handle arbitrary args using structural pattern matching.""" - result = [] - for arg in args: - match arg: - case Decoded() as decoded: - result.append(decoded) - case Interpolation() as interpolation: - value = interpolation.getvalue() - result.append(value.upper()) +Template string literals support the full syntax of :pep:`701`. This includes +the ability to nest template strings within interpolations, as well as the ability +to use all valid quote marks (``'``, ``"``, ``'''``, and ``"""``). Like other string +prefixes, the ``t`` prefix must immediately precede the quote. Like f-strings, +both lowercase ``t`` and uppercase ``T`` prefixes are supported. Like +f-strings, t-strings may not be combined with the ``b`` or ``u`` prefixes. +Additionally, f-strings and t-strings cannot be combined, so the ``ft`` +prefix is invalid as well. t-strings *may* be combined with the ``r`` prefix; +see the `Raw Template Strings`_ section below for more information. - return f"{''.join(result)}!" - name = "World" - greeting = greet"Hello {name} nice to meet you" - assert greeting == "Hello WORLD nice to meet you!" 
+The ``Template`` Type +--------------------- -Tag strings extract more than just a callable from the ``Interpolation``. They also -provide Python string formatting info, as well as the original text: +Template strings evaluate to an instance of a new type, ``templatelib.Template``: .. code-block:: python - def greet(*args: Decoded | Interpolation) -> str: - """Interpolations can have string formatting specs and conversions.""" - result = [] - for arg in args: - match arg: - case Decoded() as decoded: - result.append(decoded) - case getvalue, raw, conversion, format_spec: # Unpack - gv = f"gv: {getvalue()}" - r = f"r: {raw}" - c = f"c: {conversion}" - f = f"f: {format_spec}" - result.append(", ".join([gv, r, c, f])) + class Template: + args: Sequence[str | Interpolation] + + def __init__(self, *args: str | Interpolation): + ... + +The ``args`` attribute provides access to the string parts and +any interpolations in the literal: - return f"{''.join(result)}!" +.. code-block:: python name = "World" - assert greet"Hello {name!r:s}" == "Hello gv: World, r: name, c: r, f: s!" + template = t"Hello {name}" + assert isinstance(template.args[0], str) + assert isinstance(template.args[1], Interpolation) + assert template.args[0] == "Hello " + assert template.args[1].value == "World" -You can see each of the ``Interpolation`` parts getting extracted: +See `Interleaving of Template.args`_ below for more information on how the +``args`` attribute is structured. -- The lambda expression to call and get the value in the scope it was defined -- The raw string of the interpolation (``name``) -- The Python "conversion" field (``r``) -- Any `format specification `_ - (``s``) +The ``Template`` type is immutable. ``Template.args`` cannot be reassigned +or mutated. -Specification -============= -In the rest of this specification, ``my_tag`` will be used for an arbitrary tag. 
-For example: +The ``Interpolation`` Type +-------------------------- + +The ``Interpolation`` type represents an expression inside a template string. +Like ``Template``, it is a new concrete type found in the ``templatelib`` module: .. code-block:: python - def mytag(*args): - return args + class Interpolation: + value: object + expr: str + conv: Literal["a", "r", "s"] | None + format_spec: str - trade = 'shrubberies' - mytag'Did you say "{trade}"?' + __match_args__ = ("value", "expr", "conv", "format_spec") + + def __init__( + self, + value: object, + expr: str, + conv: Literal["a", "r", "s"] | None = None, + format_spec: str = "", + ): + ... + +Like ``Template``, ``Interpolation`` is shallow immutable. Its attributes +cannot be reassigned. -Valid Tag Names ---------------- +The ``value`` attribute is the evaluated result of the interpolation: -The tag name can be any undotted name that isn't already an existing valid string or -bytes prefix, as seen in the `lexical analysis specification -`_. -Therefore these prefixes can't be used as a tag: +.. code-block:: python -.. code-block:: text + name = "World" + template = t"Hello {name}" + assert template.args[1].value == "World" - stringprefix: "r" | "u" | "R" | "U" | "f" | "F" - : | "fr" | "Fr" | "fR" | "FR" | "rf" | "rF" | "Rf" | "RF" +The ``expr`` attribute is the *original text* of the interpolation: - bytesprefix: "b" | "B" | "br" | "Br" | "bR" | "BR" | "rb" | "rB" | "Rb" | "RB" +.. code-block:: python -Python `restricts certain keywords `_ from being -used as identifiers. This restriction also applies to tag names. Usage of keywords should -trigger a helpful error, as done in recent CPython releases. + name = "World" + template = t"Hello {name}" + assert template.args[1].expr == "name" -Tags Must Immediately Precede the Quote Mark --------------------------------------------- +We expect that the ``expr`` attribute will not be used in most template processing +code. 
It is provided for completeness and for use in debugging and introspection. +See both the `Common Patterns Seen in Processing Templates`_ section and the +`Examples`_ section for more information on how to process template strings. -As with other string literal prefixes, no whitespace can be between the tag and the -quote mark. +The ``conv`` attribute is the :ref:`optional conversion ` +to be used, one of ``r``, ``s``, and ``a``, corresponding to ``repr()``, +``str()``, and ``ascii()`` conversions. As with f-strings, no other conversions +are supported: -PEP 701 -------- +.. code-block:: python -Tag strings support the full syntax of :pep:`701` in that any string literal, -with any quote mark, can be nested in the interpolation. This nesting includes -of course tag strings. + name = "World" + template = t"Hello {name!r}" + assert template.args[1].conv == "r" -Evaluating Tag Strings ----------------------- +If no conversion is provided, ``conv`` is ``None``. -When the tag string is evaluated, the tag must have a binding, or a ``NameError`` -is raised; and it must be a callable, or a ``TypeError`` is raised. The callable -must accept a sequence of positional arguments. This behavior follows from the -de-sugaring of: +The ``format_spec`` attribute is the :ref:`format specification `. +As with f-strings, this is an arbitrary string that defines how to present the value: .. code-block:: python - trade = 'shrubberies' - mytag'Did you say "{trade}"?' + value = 42 + template = t"Value: {value:.2f}" + assert template.args[1].format_spec == ".2f" -to: +Format specifications in f-strings can themselves contain interpolations. This +is permitted in template strings as well; ``format_spec`` is set to the eagerly +evaluated result: .. 
code-block:: python - mytag(DecodedConcrete(r'Did you say "'), InterpolationConcrete(lambda: trade, 'trade', None, None), DecodedConcrete(r'"?')) + value = 42 + precision = 2 + template = t"Value: {value:.{precision}f}" + assert template.args[1].format_spec == ".2f" -.. note:: +If no format specification is provided, ``format_spec`` defaults to an empty +string (``""``). This matches the ``format_spec`` parameter of Python's +:func:`python:format` built-in. - `DecodedConcrete` and `InterpolationConcrete` are just example implementations. If approved, - tag strings will have concrete types in `builtins`. +Unlike f-strings, it is up to code that processes the template to determine how to +interpret the ``conv`` and ``format_spec`` attributes. +Such code is not required to use these attributes, but when present they should +be respected, and to the extent possible match the behavior of f-strings. +It would be surprising if, for example, a template string that uses ``{value:.2f}`` +did not round the value to two decimal places when processed. -Decoded Strings ---------------- -In the ``mytag'Did you say "{trade}"?'`` example, there are two strings: ``r'Did you say "'`` -and ``r'"?'``. +Processing Template Strings +--------------------------- -Strings are internally stored as objects with a ``Decoded`` structure, meaning: conforming to -a protocol ``Decoded``: +Developers can write arbitrary code to process template strings. For example, +the following function renders static parts of the template in lowercase and +interpolations in uppercase: .. code-block:: python - @runtime_checkable - class Decoded(Protocol): - def __str__(self) -> str: - ... 
+ from templatelib import Template, Interpolation + + def lower_upper(template: Template) -> str: + """Render static parts lowercased and interpolations uppercased.""" + parts: list[str] = [] + for arg in template.args: + if isinstance(arg, Interpolation): + parts.append(str(arg.value).upper()) + else: + parts.append(arg.lower()) + return "".join(parts) - raw: str + name = "world" + assert lower_upper(t"HELLO {name}") == "hello WORLD" +There is no requirement that template strings are processed in any particular +way. Code that processes templates has no obligation to return a string. +Template strings are a flexible, general-purpose feature. -These ``Decoded`` objects have access to raw strings. Raw strings are used because tag strings -are meant to target a variety of DSLs, such as the shell and regexes. Such DSLs have their -own specific treatment of metacharacters, namely the backslash. +See the `Common Patterns Seen in Processing Templates`_ section for more +information on how to process template strings. See the `Examples`_ section +for detailed working examples. -However, often the "cooked" string is what is needed, by decoding the string as -if it were a standard Python string. In the proposed implementation, the decoded object's -``__new__`` will *store* the raw string and *store and return* the "cooked" string. -The protocol is marked as ``@runtime_checkable`` to allow structural pattern matching to -test against the protocol instead of a type. This can incur a small performance penalty. -Since the ``case`` tests are in user-code tag functions, authors can choose to optimize by -testing for the implementation type discussed next. +Template String Concatenation +----------------------------- -The ``Decoded`` protocol will be available from ``typing``. In CPython, ``Decoded`` -will be implemented in C, but for discussion of this PEP, the following is a compatible -implementation: +Template strings support explicit concatenation using ``+``. 
Concatenation is +supported for two ``Template`` instances as well as for a ``Template`` instance +and a ``str``: .. code-block:: python - class DecodedConcrete(str): - _raw: str + name = "World" + template1 = t"Hello " + template2 = t"{name}" + assert template1 + template2 == t"Hello {name}" + assert template1 + "!" == t"Hello !" + assert "Hello " + template2 == t"Hello {name}" - def __new__(cls, raw: str): - decoded = raw.encode("utf-8").decode("unicode-escape") - if decoded == raw: - decoded = raw - chunk = super().__new__(cls, decoded) - chunk._raw = raw - return chunk +Concatenation of templates is "viral": the concatenation of a ``Template`` and +a ``str`` always results in a ``Template`` instance. - @property - def raw(self): - return self._raw +Python's implicit concatenation syntax is also supported. The following code +will work as expected: -Interpolation -------------- +.. code-block:: python -An ``Interpolation`` is the data structure representing an expression inside the tag -string. Interpolations enable a delayed evaluation model, where the interpolation -expression is computed, transformed, memoized, or processed in any way. + name = "World" + template = t"Hello " "World" + assert template == t"Hello World" + template2 = t"Hello " t"World" + assert template2 == t"Hello World" -In addition, the original text of the interpolation expression is made available to the -tag function. This can be useful for debugging or metaprogramming. -``Interpolation`` is a ``Protocol`` which will be made available from ``typing``. It -has the following definition: +The ``Template`` type implements the ``__add__()`` and ``__radd__()`` methods +roughly as follows: .. code-block:: python - @runtime_checkable - class Interpolation(Protocol): - def __len__(self): - ... 
+ class Template: + def __add__(self, other: object) -> Template: + if isinstance(other, str): + return Template(*self.args[:-1], self.args[-1] + other) + if not isinstance(other, Template): + return NotImplemented + return Template(*self.args[:-1], self.args[-1] + other.args[0], *other.args[1:]) - def __getitem__(self, index: int): - ... + def __radd__(self, other: object) -> Template: + if not isinstance(other, str): + return NotImplemented + return Template(other + self.args[0], *self.args[1:]) - def getvalue(self) -> Callable[[], Any]: - ... +Special care is taken to ensure that the interleaving of ``str`` and ``Interpolation`` +instances is maintained when concatenating. (See the +`Interleaving of Template.args`_ section for more information.) - expr: str - conv: Literal["a", "r", "s"] | None - format_spec: str | None -Given this example interpolation: +Template and Interpolation Equality +----------------------------------- + +Two instances of ``Template`` are defined to be equal if their ``args`` attributes +contain the same strings and interpolations in the same order: .. code-block:: python - mytag'{trade!r:some-formatspec}' + assert t"I love {stilton}" == t"I love {stilton}" + assert t"I love {stilton}" != t"I love {roquefort}" + assert t"I " + t"love {stilton}" == t"I love {stilton}" -these attributes are as follows: +The implementation of ``Template.__eq__()`` is roughly as follows: -* ``getvalue`` is a zero argument closure for the interpolation. In this case, ``lambda: trade``. +.. code-block:: python -* ``expr`` is the *expression text* of the interpolation. Example: ``'trade'``. + class Template: + def __eq__(self, other: object) -> bool: + if not isinstance(other, Template): + return NotImplemented + return self.args == other.args -* ``conv`` is the - `optional conversion `_ - to be used by the tag function, one of ``r``, ``s``, and ``a``, corresponding to repr, str, - and ascii conversions. 
Note that as with f-strings, no other conversions are supported. - Example: ``'r'``. +Two instances of ``Interpolation`` are defined to be equal if their ``value``, +``expr``, ``conv``, and ``format_spec`` attributes are equal: -* ``format_spec`` is the optional `format_spec string `_. - A ``format_spec`` is eagerly evaluated if it contains any expressions before being passed to the tag - function. Example: ``'some-formatspec'``. +.. code-block:: python -In all cases, the tag function determines what to do with valid ``Interpolation`` -attributes. + class Interpolation: + def __eq__(self, other: object) -> bool: + if not isinstance(other, Interpolation): + return NotImplemented + return ( + self.value == other.value + and self.expr == other.expr + and self.conv == other.conv + and self.format_spec == other.format_spec + ) -In the CPython reference implementation, implementing ``Interpolation`` in C would -use the equivalent `Struct Sequence Objects -`_ (see -such code as `os.stat_result -`_). For purposes of this -PEP, here is an example of a pure Python implementation: -.. code-block:: python +No Support for Ordering +----------------------- - class InterpolationConcrete(NamedTuple): - getvalue: Callable[[], Any] - expr: str - conv: Literal['a', 'r', 's'] | None = None - format_spec: str | None = None +The ``Template`` and ``Interpolation`` types do not support ordering. This is +unlike all other string literal types in Python, which support lexicographic +ordering. Because interpolations can contain arbitrary values, there is no +natural ordering for them. As a result, neither the ``Template`` nor the +``Interpolation`` type implements the standard comparison methods. 
-Interpolation Expression Evaluation ------------------------------------ -Expression evaluation for interpolations is the same as in :pep:`498#expression-evaluation`, -except that all expressions are always implicitly wrapped with a ``lambda``: +Support for the debug specifier (``=``) +--------------------------------------- - The expressions that are extracted from the string are evaluated in the context - where the tag string appeared. This means the expression has full access to its - lexical scope, including local and global variables. Any valid Python expression - can be used, including function and method calls. +The debug specifier, ``=``, is supported in template strings and behaves similarly +to how it behaves in f-strings, though due to limitations of the implementation +there is a slight difference. -However, there's one additional nuance to consider, `function scope -`_ -versus `annotation scope -`_. -Consider this somewhat contrived example to configure captions: +In particular, ``t'{expr=}'`` is treated as ``t'expr={expr}'``: .. code-block:: python - class CaptionConfig: - tag = 'b' - figure = f'<{tag}>Figure' + name = "World" + template = t"Hello {name=}" + assert template.args[0] == "Hello name=" + assert template.args[1].value == "World" + + +Raw Template Strings +-------------------- -Let's now attempt to rewrite the above example to use tag strings: +Raw template strings are supported using the ``rt`` (or ``tr``) prefix: .. code-block:: python - class CaptionConfig: - tag = 'b' - figure = html'<{tag}>Figure' + trade = 'shrubberies' + t = rt'Did you say "{trade}"?\n' + assert t.args[0] == r'Did you say "' + assert t.args[2] == r'"?\n' + +In this example, the ``\n`` is treated as two separate characters +(a backslash followed by 'n') rather than a newline character. This is +consistent with Python's raw string behavior. 
-Unfortunately, this rewrite doesn't work if using the usual lambda wrapping to -implement interpolations, namely ``lambda: tag``. When the interpolations are -evaluated by the tag function, it will result in ``NameError: name 'tag' is not -defined``. The root cause of this name error is that ``lambda: tag`` uses function scope, -and it's therefore not able to use the class definition where ``tag`` is -defined. +As with regular template strings, interpolations in raw template strings are +processed normally, allowing for the combination of raw string behavior and +dynamic content. -Desugaring how the tag string could be evaluated will result in the same -``NameError`` even using f-strings; the lambda wrapping here also uses function -scoping: -.. code-block:: python +Interpolation Expression Evaluation +----------------------------------- - class CaptionConfig: - tag = 'b' - figure = f'<{(lambda: tag)()}>Figure' +Expression evaluation for interpolations is the same as in :pep:`498#expression-evaluation`: -For tag strings, getting such a ``NameError`` would be surprising. It would also -be a rough edge in using tag strings in this specific case of working with class -variables. After all, tag strings are supposed to support a superset of the -capabilities of f-strings. + The expressions that are extracted from the string are evaluated in the context + where the template string appeared. This means the expression has full access to its + lexical scope, including local and global variables. Any valid Python expression + can be used, including function and method calls. -The solution is to use annotation scope for tag string interpolations. While the -name "annotation scope" suggests it's only about annotations, it solves this -problem by lexically resolving names in the class definition, such as ``tag``, -unlike function scope. +Template strings are evaluated eagerly from left to right, just like f-strings. 
This means that +interpolations are evaluated immediately when the template string is processed, not deferred +or wrapped in lambdas. -.. note:: - The use of annotation scope means it's not possible to fully desugar - interpolations into Python code. Instead it's as if one is writing - ``interpolation_lambda: tag``, not ``lambda: tag``, where a hypothetical - ``interpolation_lambda`` keyword variant uses annotation scope instead of - the standard function scope. +Exceptions +---------- - This is more or less how the reference implementation implements this - concept (but without creating a new keyword of course). +Exceptions raised in t-string literals are the same as those raised in f-string +literals. -This PEP and its reference implementation therefore use the support for -annotation scope. Note that this usage is a separable part from the -implementation of :pep:`649` and :pep:`695` which provides a somewhat similar -deferred execution model for annotations. Instead it's up to the tag function to -evaluate any interpolations. -With annotation scope in place, lambda-wrapped expressions in interpolations -then provide the usual lexical scoping seen with f-strings. So there's no need -to use ``locals()``, ``globals()``, or frame introspection with -``sys._getframe`` to evaluate the interpolation. In addition, the code of each -expression is available and does not have to be looked up with -``inspect.getsource`` or some other means. +Interleaving of ``Template.args`` +--------------------------------- -Format Specification --------------------- +In the ``Template`` type, the ``args`` attribute is a sequence that will always +alternate between string literals and ``Interpolation`` instances. Specifically: -The ``format_spec`` is by default ``None`` if it is not specified in the tag string's -corresponding interpolation. +- Even-indexed elements (0, 2, 4, ...) are always of type ``str``, representing + the literal parts of the template. 
+- Odd-indexed elements (1, 3, 5, ...) are always ``Interpolation`` instances, + representing the interpolated expressions. -Because the tag function is completely responsible for processing ``Decoded`` -and ``Interpolation`` values, there is no required interpretation for the format -spec and conversion in an interpolation. For example, this is a valid usage: +For example, the following assertions hold: .. code-block:: python - html'
{content:HTML|str}
' + name = "World" + template = t"Hello {name}" + assert len(template.args) == 3 + assert template.args[0] == "Hello " + assert template.args[1].value == "World" + assert template.args[2] == "" -In this case the ``format_spec`` for the second interpolation is the string -``'HTML|str'``; it is up to the ``html`` tag to do something with the -"format spec" here, if anything. +These rules imply that the ``args`` attribute will always have an odd length. +As a consequence, empty strings are added to the sequence when the template +begins or ends with an interpolation, or when two interpolations are adjacent: -f-string-style ``=`` Evaluation -------------------------------- +.. code-block:: python -``mytag'{expr=}'`` is parsed to being the same as ``mytag'expr={expr}``', as -implemented in the issue `Add = to f-strings for -easier debugging `_. + a, b = "a", "b" + template = t"{a}{b}" + assert len(template.args) == 5 + assert template.args[0] == "" + assert template.args[1].value == "a" + assert template.args[2] == "" + assert template.args[3].value == "b" + assert template.args[4] == "" + +Most template processing code will not care about this detail and will use +either structural pattern matching or ``isinstance()`` checks to distinguish +between the two types of elements in the sequence. + +The detail exists because it allows for performance optimizations in template +processing code. For example, a template processor could cache the static parts +of the template and only reprocess the dynamic parts when the template is +evaluated with different values. Access to the static parts can be done with +``template.args[::2]``. + +Interleaving is an invariant maintained by the ``Template`` class. Developers can +take advantage of it but they are not required to themselves maintain it. 
+Specifically, ``Template.__init__()`` can be called with ``str`` and +``Interpolation`` instances in *any* order; the constructor will "interleave" them +as necessary before assigning them to ``args``. + + +Examples +======== -Tag Function Arguments ----------------------- +All examples in this section of the PEP have fully tested reference implementations +available in the public `pep750-examples `_ +git repository. -The tag function has the following signature: -.. code-block:: python +Example: Implementing f-strings with t-strings +---------------------------------------------- - def mytag(*args: Decoded | Interpolation) -> Any: - ... +It is easy to "implement" f-strings using t-strings. That is, we can +write a function ``f(template: Template) -> str`` that processes a ``Template`` +in much the same way as an f-string literal, returning the same result: -This corresponds to the following protocol: .. code-block:: python - class TagFunction(Protocol): - def __call__(self, *args: Decoded | Interpolation) -> Any: - ... + name = "World" + value = 42 + templated = t"Hello {name!r}, value: {value:.2f}" + formatted = f"Hello {name!r}, value: {value:.2f}" + assert f(templated) == formatted -Because of subclassing, the signature for ``mytag`` can of course be widened to -the following, at the cost of losing some type specificity: +The ``f()`` function supports both conversion specifiers like ``!r`` and format +specifiers like ``:.2f``. The full code is fairly simple: .. code-block:: python - def mytag(*args: str | tuple) -> Any: - ... + from templatelib import Template, Interpolation + from typing import Literal -A user might write a tag string as follows: + def convert(value: object, conv: Literal["a", "r", "s"] | None) -> object: + if conv == "a": + return ascii(value) + elif conv == "r": + return repr(value) + elif conv == "s": + return str(value) + return value -..
code-block:: python - def tag(*args): - return args + def f(template: Template) -> str: + parts = [] + for arg in template.args: + match arg: + case str() as s: + parts.append(s) + case Interpolation(value, _, conv, format_spec): + value = convert(value, conv) + value = format(value, format_spec) + parts.append(value) + return "".join(parts) - tag"\N{{GRINNING FACE}}" -Tag strings will represent this as exactly one ``Decoded`` argument. In this case, ``Decoded.raw`` would be -``'\\N{GRINNING FACE}'``. The "cooked" representation via encode and decode would be: +.. note:: Example code -.. code-block:: python + See `fstring.py`__ and `test_fstring.py`__. - '\\N{GRINNING FACE}'.encode('utf-8').decode('unicode-escape') - '😀' + __ https://github.com/davepeck/pep750-examples/blob/main/pep/fstring.py + __ https://github.com/davepeck/pep750-examples/blob/main/pep/test_fstring.py -Named unicode characters immediately followed by more text will still produce -just one ``Decoded`` argument: -.. code-block:: python +Example: Structured Logging +--------------------------- - def tag(*args): - return args +Structured logging allows developers to log data in both a human-readable format +*and* a structured format (like JSON) using only a single logging call. This is +useful for log aggregation systems that process the structured format while +still allowing developers to easily read their logs. - assert tag"\N{{GRINNING FACE}}sometext" == (DecodedConcrete("😀sometext"),) +We present two different approaches to implementing structured logging with +template strings. +Approach 1: Custom Log Messages +''''''''''''''''''''''''''''''' -Return Value ------------- +The :ref:`Python Logging Cookbook ` +has a short section on `how to implement structured logging `_. -Tag functions can return any type. Often they will return a string, but -richer systems can be built by returning richer objects. See below for -a motivating example. 
+The logging cookbook suggests creating a new "message" class, ``StructuredMessage``, +that is constructed with a simple text message and a separate dictionary of values: -Function Application --------------------- +.. code-block:: python -Tag strings desugar as follows: + message = StructuredMessage("user action", { + "action": "traded", + "amount": 42, + "item": "shrubs" + }) + logging.info(message) -.. code-block:: python + # Outputs: + # user action >>> {"action": "traded", "amount": 42, "item": "shrubs"} - mytag'Hi, {name!s:format_spec}!' +The ``StructuredMessage.__str__()`` method formats both the human-readable +message *and* the values, combining them into a final string. (See the +`logging cookbook `_ +for its full example.) -This is equivalent to: +We can implement an improved version of ``StructuredMessage`` using template strings: .. code-block:: python - mytag(DecodedConcrete(r'Hi, '), InterpolationConcrete(lambda: name, 'name', - 's', 'format_spec'), DecodedConcrete(r'!')) - -.. note:: + import json + from templatelib import Interpolation, Template + from typing import Mapping - To keep it simple, this and subsequent desugaring omits an important scoping - aspect in how names in interpolation expressions are resolved, specifically - when defining classes. See `Interpolation Expression Evaluation`_. + class TemplateMessage: + def __init__(self, template: Template) -> None: + self.template = template -No Empty Decoded String ------------------------ + @property + def message(self) -> str: + # Use the f() function from the previous example + return f(self.template) -Alternation between decodeds and interpolations is commonly seen, but it depends -on the tag string. Decoded strings will never have a value that is the empty string: + @property + def values(self) -> Mapping[str, object]: + return { + arg.expr: arg.value + for arg in self.template.args + if isinstance(arg, Interpolation) + } -.. 
code-block:: python + def __str__(self) -> str: + return f"{self.message} >>> {json.dumps(self.values)}" - mytag'{a}{b}{c}' + _ = TemplateMessage # optional, to improve readability + action, amount, item = "traded", 42, "shrubs" + logging.info(_(t"User {action}: {amount:.2f} {item}")) -...which results in this desugaring: + # Outputs: + # User traded: 42.00 shrubs >>> {"action": "traded", "amount": 42, "item": "shrubs"} -.. code-block:: python +Template strings give us a more elegant way to define the custom message +class. With template strings it is no longer necessary for developers to make +sure that their format string and values dictionary are kept in sync; a single +template string literal is all that is needed. The ``TemplateMessage`` +implementation can automatically extract structured keys and values from +the ``Interpolation.expr`` and ``Interpolation.value`` attributes, respectively. - mytag(InterpolationConcrete(lambda: a, 'a', None, None), InterpolationConcrete(lambda: b, 'b', None, None), InterpolationConcrete(lambda: c, 'c', None, None)) -Likewise: +Approach 2: Custom Formatters +''''''''''''''''''''''''''''' -.. code-block:: python +Custom messages are a reasonable approach to structured logging but can be a +little awkward. To use them, developers must wrap every log message they write +in a custom class. This can be easy to forget. - mytag'' +An alternative approach is to define custom ``logging.Formatter`` classes. This +approach is more flexible and allows for more control over the final output. In +particular, it's possible to take a single template string and output it in +multiple formats (human-readable and JSON) to separate log streams. -...results in this desugaring: +We define two simple formatters, a ``MessageFormatter`` for human-readable output +and a ``ValuesFormatter`` for JSON output: .. 
code-block:: python - mytag() + import json + from logging import Formatter, LogRecord + from templatelib import Interpolation, Template + from typing import Any, Mapping + -HTML Example of Rich Return Types -================================= + class MessageFormatter(Formatter): + def message(self, template: Template) -> str: + # Use the f() function from the previous example + return f(template) -Tag functions can be a powerful part of larger processing chains by returning richer objects. -JavaScript tagged template literals, for example, are not constrained by a requirement to -return a string. As an example, let's look at an HTML generation system, with a usage and -"subcomponent": + def format(self, record: LogRecord) -> str: + msg = record.msg + if not isinstance(msg, Template): + return super().format(record) + return self.message(msg) -.. code-block:: - def Menu(*, logo: str, class_: str) -> HTML: - return html'Site Logo' + class ValuesFormatter(Formatter): + def values(self, template: Template) -> Mapping[str, Any]: + return { + arg.expr: arg.value + for arg in template.args + if isinstance(arg, Interpolation) + } - icon = 'acme.png' - result = html'
<{Menu} logo={icon} class="my-menu"/>
' - img = result.children[0] - assert img.tag == "img" - assert img.attrs == {"src": "acme.png", "class": "my-menu", "alt": "Site Logo"} - # We can also treat the return type as a string of specially-serialized HTML - assert str(result) == '
' # etc. + def format(self, record: LogRecord) -> str: + msg = record.msg + if not isinstance(msg, Template): + return super().format(record) + return json.dumps(self.values(msg)) -This ``html`` tag function might have the following signature: + +We can then use these formatters when configuring our logger: .. code-block:: python - def html(*args: Decoded | Interpolation) -> HTML: - ... + import logging + import sys -The ``HTML`` return class might have the following shape as a ``Protocol``: + logger = logging.getLogger(__name__) + message_handler = logging.StreamHandler(sys.stdout) + message_handler.setFormatter(MessageFormatter()) + logger.addHandler(message_handler) -.. code-block:: python + values_handler = logging.StreamHandler(sys.stderr) + values_handler.setFormatter(ValuesFormatter()) + logger.addHandler(values_handler) - @runtime_checkable - class HTML(Protocol): - tag: str - attrs: dict[str, Any] - children: Sequence[str | HTML] + action, amount, item = "traded", 42, "shrubs" + logger.info(t"User {action}: {amount:.2f} {item}") -In summary, the returned instance can be used as: + # Outputs to sys.stdout: + # User traded: 42.00 shrubs -- A string, for serializing to the final output -- An iterable, for working with WSGI/ASGI for output streamed and evaluated - interpolations *in the order* they are written out -- A DOM (data) structure of nested Python data + # At the same time, outputs to sys.stderr: + # {"action": "traded", "amount": 42, "item": "shrubs"} -In each case, the result can be lazily and recursively composed in a safe fashion, because -the return value isn't required to be a string. Recommended practice is that -return values are "passive" objects. -What benefits might come from returning rich objects instead of strings? A DSL for -a domain such as HTML templating can provide a toolchain of post-processing, as -`Babel `_ does for JavaScript -`with AST-based transformation plugins `_. 
-Similarly, systems that provide middleware processing can operate on richer, -standard objects with more capabilities. Tag string results can be tested as -nested Python objects, rather than string manipulation. Finally, the intermediate -results can be cached/persisted in useful ways. +This approach has a couple of advantages over the custom message approach to structured +logging: -Tool Support -============ +- Developers can log a t-string directly without wrapping it in a custom class. +- Human-readable and structured output can be sent to separate log streams. This + is useful for log aggregation systems that process structured data independently + from human-readable data. -Python Semantics in Tag Strings -------------------------------- -Python template languages and other DSLs have semantics quite apart from Python. -Different scope rules, different calling semantics e.g. for macros, their own -grammar for loops, and the like. +.. note:: Example code + + See `logging.py`__ and `test_logging.py`__. + + __ https://github.com/davepeck/pep750-examples/blob/main/pep/logging.py + __ https://github.com/davepeck/pep750-examples/blob/main/pep/test_logging.py -This means all tools need to write special support for each language. Even then, -it is usually difficult to find all the possible scopes, for example to autocomplete -values. -However, f-strings do not have this issue. An f-string is considered part of Python. -Expressions in curly braces behave as expected and values should resolve based on -regular scoping rules. Tools such as mypy can see inside f-string expressions, -but will likely never look inside a Jinja2 template. +Example: HTML Templating +------------------------- + +This PEP contains several short HTML templating examples. It turns out that the +"hypothetical" ``html()`` function mentioned in the `Motivation`_ section +(and a few other places in this PEP) exists and is available in the +`pep750-examples repository `_.
+If you're thinking about parsing a complex grammar with template strings, we +hope you'll find it useful. -DSLs written with tag strings will inherit much of this value. While we can't expect -standard tooling to understand the "domain" in the DSL, they can still inspect -anything expressible in an f-string. Backwards Compatibility ======================= -Like f-strings, use of tag strings will be a syntactic backwards incompatibility +Like f-strings, use of template strings will be a syntactic backwards incompatibility with previous versions. + Security Implications ===================== -The security implications of working with interpolations, with respect to +The security implications of working with template strings, with respect to interpolations, are as follows: 1. Scope lookup is the same as f-strings (lexical scope). This model has been shown to work well in practice. -2. Tag functions can ensure that any interpolations are done in a safe fashion, - including respecting the context in the target DSL. +2. Code that processes ``Template`` instances can ensure that any interpolations + are processed in a safe fashion, including respecting the context in which + they appear. + How To Teach This ================= -Tag strings have several audiences: consumers of tag functions, authors of tag -functions, and framework authors who provide interesting machinery for tag -functions. - -All three groups can start from an important framing: +Template strings have several audiences: -- Existing solutions (such as template engines) can do parts of tag strings -- But tag strings move logic closer to "normal Python" +- Developers using template strings and processing functions +- Authors of template processing code +- Framework authors who build interesting machinery with template strings -Consumers can look at tag strings as starting from f-strings: +We hope that teaching developers will be straightforward. At a glance, +template strings look just like f-strings. 
Their syntax is familiar and the +scoping rules remain the same. -- They look familiar -- Scoping and syntax rules are the same +The first thing developers must learn is that template string literals don't +evaluate to strings; instead, they evaluate to a new type, ``Template``. This +is a simple type intended to be used by template processing code. It's not until +developers call a processing function that they get the result they want: +typically, a string, although processing code can of course return any arbitrary +type. -They first thing they need to absorb: unlike f-strings, the string isn't -immediately evaluated "in-place". Something else (the tag function) happens. -That's the second thing to teach: the tag functions do something particular. -Thus the concept of "domain specific languages" (DSLs). What's extra to -teach: you need to import the tag function before tagging a string. +Because developers will learn that t-strings are nearly always used in tandem +with processing functions, they don't necessarily need to understand the details +of the ``Template`` type. As with descriptors and decorators, we expect many more +developers will use t-strings than write t-string processing functions. -Tag function authors think in terms of making a DSL. They have -business policies they want to provide in a Python-familiar way. With tag -functions, Python is going to do much of the pre-processing. This lowers -the bar for making a DSL. +Over time, a small number of more advanced developers *will* wish to author their +own template processing code. Writing processing code often requires thinking +in terms of formal grammars. Developers will need to learn how to parse the +``args`` attribute of a ``Template`` instance and how to process interpolations +in a context-sensitive fashion. More sophisticated grammars will likely require +parsing to intermediate representations like an AST. 
Great template processing +code will handle format specifiers and conversions when appropriate. Writing +production-grade template processing code -- for instance, to support HTML +templates -- can be a large undertaking. -Tag authors can begin with simple use cases. After authors gain experience, tag strings can be used to add larger -patterns: lazy evaluation, intermediate representations, registries, and more. +We expect that template strings will provide framework authors with a powerful +new tool in their toolbox. While the functionality of template strings overlaps +with existing tools like template engines, t-strings move that logic into +the language itself. Bringing the full power and generality of Python to bear on +string processing tasks opens new possibilities for framework authors. -Each of these points also match the teaching of decorators. In that case, -a learner consumes something which applies to the code just after it. They -don't need to know too much about decorator theory to take advantage of the -utility. -Common Patterns Seen In Writing Tag Functions -============================================= +Common Patterns Seen in Processing Templates +============================================ Structural Pattern Matching --------------------------- -Iterating over the arguments with structural pattern matching is the expected -best practice for many tag function implementations: +Iterating over the ``Template.args`` with structural pattern matching is the expected +best practice for many template function implementations: .. code-block:: python - def tag(*args: Decoded | Interpolation) -> Any: - for arg in args: + from templatelib import Template, Interpolation + + def process(template: Template) -> Any: + for arg in template.args: match arg: - case Decoded() as decoded: - ... # handle each decoded string + case str() as s: + ... # handle each string part case Interpolation() as interpolation: ... 
# handle each interpolation -Lazy Evaluation ---------------- -The example tag functions above each call the interpolation's ``getvalue`` lambda -immediately. Python developers have frequently wished that f-strings could be -deferred, or lazily evaluated. It would be straightforward to write a wrapper that, -for example, defers calling the lambda until an ``__str__`` was invoked. +Processing code may also commonly sub-match on attributes of the ``Interpolation`` type: + +.. code-block:: python + + match arg: + case Interpolation(int()): + ... # handle interpolations with integer values + case Interpolation(value=str() as s): + ... # handle interpolations with string values + # etc. + Memoizing --------- -Tag function authors have control of processing the static string parts and -the dynamic interpolation parts. For higher performance, they can deploy approaches -for memoizing processing, for example by generating keys. +Template functions can efficiently process both static and dynamic parts of templates. +The structure of ``Template`` objects allows for effective memoization: + +.. code-block:: python + + source = template.args[::2] # Static string parts + values = [i.value for i in template.args[1::2]] # Dynamic interpolated values + +This separation enables caching of processed static parts, while dynamic parts can be +inserted as needed. Authors of template processing code can use the static +``source`` as cache keys, leading to significant performance improvements when +similar templates are used repeatedly. + + +Parsing to Intermediate Representations +--------------------------------------- -Order of Evaluation -------------------- +Code that processes templates can parse the template string into intermediate +representations, like an AST. We expect that many template processing libraries +will use this approach. -Imagine a tag that generates a number of sections in HTML. The tag needs inputs for each -section. 
But what if the last input argument takes a while? You can't return the HTML for -the first section until all the arguments are available. +For instance, rather than returning a ``str``, our theoretical ``html()`` function +(see the `Motivation`_ section) could return an HTML ``Element`` defined in the +same package: + +.. code-block:: python + + @dataclass(frozen=True) + class Element: + tag: str + attributes: Mapping[str, str | bool] + children: Sequence[str | Element] + + def __str__(self) -> str: + ... + + + def html(template: Template) -> Element: + ... + +Calling ``str(element)`` would then render the HTML but, in the meantime, the +``Element`` could be manipulated in a variety of ways. + + +Context-sensitive Processing of Interpolations +---------------------------------------------- + +Continuing with our hypothetical ``html()`` function, it could be made +context-sensitive. Interpolations could be processed differently depending +on where they appear in the template. + +For example, our ``html()`` function could support multiple kinds of +interpolations: + +.. code-block:: python + + attributes = {"id": "main"} + attribute_value = "shrubbery" + content = "hello" + template = t"
{content}
" + element = html(template) + assert str(element) == '
hello
' + +Because the ``{attributes}`` interpolation occurs in the context of an HTML tag, +and because there is no corresponding attribute name, it is treated as a dictionary +of attributes. The ``{attribute_value}`` interpolation is treated as a simple +string value and is quoted before inclusion in the final string. The +``{content}`` interpolation is treated as potentially unsafe content and is +escaped before inclusion in the final string. + + +Nested Template Strings +----------------------- + +Going a step further with our ``html()`` function, we could support nested +template strings. This would allow for more complex HTML structures to be +built up from simpler templates: + +.. code-block:: python + + name = "World" + content = html(t"

Hello {name}

") + template = t"
{content}
" + element = html(template) + assert str(element) == '

Hello World

' + +Because the ``{content}`` interpolation is an ``Element`` instance, it does +not need to be escaped before inclusion in the final string. + +One could imagine a nice simplification: if the ``html()`` function is passed +a ``Template`` instance, it could automatically convert it to an ``Element`` +by recursively calling itself on the nested template. + +We expect that nesting and composition of templates will be a common pattern +in template processing code and, where appropriate, used in preference to +simple string concatenation. + + +Approaches to Lazy Evaluation +----------------------------- + +Like f-strings, interpolations in t-string literals are eagerly evaluated. However, +there are cases where lazy evaluation may be desirable. + +If a single interpolation is expensive to evaluate, it can be explicitly wrapped +in a ``lambda`` in the template string literal: + +.. code-block:: python + + name = "World" + template = t"Hello {(lambda: name)}" + assert callable(template.args[1].value) + assert template.args[1].value() == "World" + +This assumes, of course, that template processing code anticipates and handles +callable interpolation values. (One could imagine also supporting iterators, +awaitables, etc.) This is not a requirement of the PEP, but it is a common +pattern in template processing code. + +In general, we hope that the community will develop best practices for lazy +evaluation of interpolations in template strings and that, when it makes sense, +common libraries will provide support for callable or awaitable values in +their template processing code. + + +Approaches to Asynchronous Evaluation +------------------------------------- + +Closely related to lazy evaluation is asynchronous evaluation. + +As with f-strings, the ``await`` keyword is allowed in interpolations: + +.. 
code-block:: python + + async def example(): + async def get_name() -> str: + await asyncio.sleep(1) + return "Sleepy" + + template = t"Hello {await get_name()}" + # Use the f() function from the f-string example, above + assert f(template) == "Hello Sleepy" + +More sophisticated template processing code can take advantage of this to +perform asynchronous operations in interpolations. For example, a "smart" +processing function could anticipate that an interpolation is an awaitable +and await it before processing the template string: + +.. code-block:: python + + async def example(): + async def get_name() -> str: + await asyncio.sleep(1) + return "Sleepy" + + template = t"Hello {get_name}" + assert await aformat(template) == "Hello Sleepy" + +This assumes that the template processing code in ``aformat()`` is asynchronous +and is able to ``await`` an interpolation's value. + +.. note:: Example code + + See `aformat.py`__ and `test_aformat.py`__. + + __ https://github.com/davepeck/pep750-examples/blob/main/pep/aformat.py + __ https://github.com/davepeck/pep750-examples/blob/main/pep/test_aformat.py + + +Approaches to Template Reuse +---------------------------- + +If developers wish to reuse template strings multiple times with different +values, they can write a function to return a ``Template`` instance: + +.. code-block:: python + + def reusable(name: str, question: str) -> Template: + return t"Hello {name}, {question}?" + + template = reusable("friend", "how are you") + template = reusable("King Arthur", "what is your quest") + +This is, of course, no different from how f-strings can be reused. -You'd prefer to emit markup as the inputs are available. Some templating tools support -this approach, as does tag strings. Reference Implementation ======================== @@ -807,65 +1044,168 @@ Reference Implementation At the time of this PEP's announcement, a fully-working implementation is `available `_. 
-This implementation is not final, as the PEP discussion will likely provide changes. +There is also a public repository of `examples and tests `_ +built around the reference implementation. If you're interested in playing with +template strings, this repository is a great place to start. + Rejected Ideas ============== +This PEP has been through several significant revisions. In addition, quite a few interesting +ideas were considered both in revisions of :pep:`501` and in the `Discourse discussion `_. + +We attempt to document the most significant ideas that were considered and rejected. + + +Arbitrary String Literal Prefixes +--------------------------------- -Enable Exact Round-Tripping of ``conv`` and ``format_spec`` ------------------------------------------------------------ +Inspired by `JavaScript tagged template literals `_, +an earlier version of this PEP allowed for arbitrary "tag" prefixes in front +of literal strings: -There are two limitations with respect to exactly round-tripping to the original -source text. +.. code-block:: python + + my_tag'Hello {name}' -First, the ``format_spec`` can be arbitrarily nested: +The prefix was a special callable called a "tag function". Tag functions +received the parts of the template string in an argument list. They could then +process the string and return an arbitrary value: .. code-block:: python - mytag'{x:{a{b{c}}}}' + def my_tag(*args: str | Interpolation) -> Any: + ... + +This approach was rejected for several reasons: -In this PEP and corresponding reference implementation, the format_spec -is eagerly evaluated to set the ``format_spec`` in the interpolation, thereby losing the -original expressions. +- It was deemed too complex to build in full generality. JavaScript allows for + arbitrary expressions to precede a template string, which is a significant + challenge to implement in Python. +- It precluded future introduction of new string prefixes. +- It seemed to needlessly pollute the namespace. 
-While it would be feasible to preserve round-tripping in every usage, this would -require an extra flag ``equals`` to support, for example, ``{x=}``, and a -recursive ``Interpolation`` definition for ``format_spec``. The following is roughly the -pure Python equivalent of this type, including preserving the sequence -unpacking (as used in case statements): +Use of a single ``t`` prefix was chosen as a simpler, more Pythonic approach and +more in keeping with template strings' role as a generalization of f-strings. + + +Delayed Evaluation of Interpolations +------------------------------------ + +An early version of this PEP proposed that interpolations should be lazily +evaluated. All interpolations were "wrapped" in implicit lambdas. Instead of +having an eagerly evaluated ``value`` attribute, interpolations had a +``getvalue()`` method that would resolve the value of the interpolation: .. code-block:: python - class InterpolationConcrete(NamedTuple): - getvalue: Callable[[], Any] - raw: str - conv: str | None = None - format_spec: str | None | tuple[Decoded | Interpolation, ...] = None - equals: bool = False + class Interpolation: + ... + _value: Callable[[], object] + + def getvalue(self) -> object: + return self._value() + +This was rejected for several reasons: + +- The overwhelming majority of use cases for template strings naturally call + for immediate evaluation. +- Delayed evaluation would be a significant departure from the behavior of + f-strings. +- Implicit lambda wrapping leads to difficulties with type hints and + static analysis. + +Most importantly, there are viable (if imperfect) alternatives to implicit +lambda wrapping when lazy evaluation is desired. See the section on +`Approaches to Lazy Evaluation`_, above, for more information. 
- def __len__(self): - return 4 - def __iter__(self): - return iter((self.getvalue, self.raw, self.conv, self.format_spec)) +Making ``Template`` and ``Interpolation`` Into Protocols +-------------------------------------------------------- + +An early version of this PEP proposed that the ``Template`` and ``Interpolation`` +types be runtime checkable protocols rather than concrete types. + +In the end, we felt that using concrete types was more straightforward. + + +An Additional ``Decoded`` Type +------------------------------ + +An early version of this PEP proposed an additional type, ``Decoded``, to represent +the "static string" parts of a template string. This type derived from ``str`` and +had a single extra ``raw`` attribute that provided the original text of the string. +We rejected this in favor of the simpler approach of using plain ``str`` and +allowing combination of ``r`` and ``t`` prefixes. + + +Other Homes for ``Template`` and ``Interpolation`` +-------------------------------------------------- + +Previous versions of this PEP proposed that the ``Template`` and ``Interpolation`` +types be placed in the ``types`` module. This was rejected in favor of creating +a new top-level standard library module, ``templatelib``. This was done to avoid +polluting the ``types`` module with seemingly unrelated types. + + +Enable Full Reconstruction of Original Template Literal +------------------------------------------------------- + +Earlier versions of this PEP attempted to make it possible to fully reconstruct +the text of the original template string from a ``Template`` instance. This was +rejected as being overly complex. + +There are several limitations with respect to round-tripping to the original +source text: + +- ``Interpolation.format_spec`` defaults to ``""`` if not provided. It is therefore + impossible to distinguish ``t"{expr}"`` from ``t"{expr:}"``. +- The debug specifier, ``=``, is treated as a special case. 
It is therefore not + possible to distinguish ``t"{expr=}"`` from ``t"expr={expr}"``. +- Finally, format specifiers in f-strings allow arbitrary nesting. In this PEP + and in the reference implementation, the specifier is eagerly evaluated + to set the ``format_spec`` in the ``Interpolation``, thereby losing + the original expressions. For example: + +.. code-block:: python + + value = 42 + precision = 2 + template = t"Value: {value:.{precision}f}" + assert template.args[1].format_spec == ".2f" -However, the additional complexity to support exact round-tripping seems -unnecessary and is thus rejected. +We do not anticipate that these limitations will be a significant issue in practice. +Developers who need to obtain the original template string literal can always +use ``inspect.getsource()`` or similar tools. -No Implicit String Concatenation + +Disallowing String Concatenation -------------------------------- -Implicit tag string concatenation isn't supported, which is `unlike other string literals -`_. +Earlier versions of this PEP proposed that template strings should not support +concatenation. This was rejected in favor of allowing concatenation. + +There are reasonable arguments in favor of rejecting one or all forms of +concatenation: namely, that it cuts off a class of potential bugs, particularly +when one takes the view that template strings will often contain complex grammars +for which concatenation doesn't always have the same meaning (or any meaning). + +Moreover, the earliest versions of this PEP proposed a syntax closer to +JavaScript's tagged template literals, where an arbitrary callable could be used +as a prefix to a string literal. There was no guarantee that the callable would +return a type that supported concatenation. + +In the end, we decided that the surprise to developers of a new string type +*not* supporting concatenation was likely to be greater than the theoretical +harm caused by supporting it. 
(Developers concatenate f-strings all the time, +after all, and while we are sure there are cases where this introduces bugs, +it's not clear that those bugs outweigh the benefits of supporting concatenation.) -The expectation is that triple quoting is sufficient. If implicit string -concatenation is supported, results from tag evaluations would need to -support the ``+`` operator with ``__add__`` and ``__radd__``. +While concatenation is supported, we expect that code that uses template strings +will more commonly build up larger templates through nesting and composition +rather than concatenation. -Because tag strings target embedded DSLs, this complexity introduces other -issues, such as determining appropriate separators. This seems unnecessarily -complicated and is thus rejected. Arbitrary Conversion Values --------------------------- @@ -873,17 +1213,106 @@ Arbitrary Conversion Values Python allows only ``r``, ``s``, or ``a`` as possible conversion type values. Trying to assign a different value results in ``SyntaxError``. -In theory, tag functions could choose to handle other conversion types. But this +In theory, template functions could choose to handle other conversion types. But this PEP adheres closely to :pep:`701`. Any changes to allowed values should be in a separate PEP. + +Removing ``conv`` From ``Interpolation`` +---------------------------------------- + +During the authoring of this PEP, we considered removing the ``conv`` attribute +from ``Interpolation`` and specifying that the conversion should be performed +eagerly, before ``Interpolation.value`` is set. + +The motivation was to simplify the work of writing template processing code. The +``conv`` attribute is of limited extensibility (it is typed as +``Literal["r", "s", "a"] | None``). It is not clear that it adds significant +value or flexibility to template strings that couldn't better be achieved with +custom format specifiers.
Unlike with format specifiers, there is no +equivalent to Python's :func:`python:format` built-in. (Instead, we include a +sample implementation of ``convert()`` in the `Examples`_ section.) + +Ultimately, we decided to keep the ``conv`` attribute in the ``Interpolation`` type +to maintain compatibility with f-strings and to allow for future extensibility. + + +Alternate Interpolation Symbols +------------------------------- + +In the early stages of this PEP, we considered allowing alternate symbols for +interpolations in template strings. For example, we considered allowing +``${name}`` as an alternative to ``{name}`` with the idea that it might be useful +for i18n or other purposes. See the +`Discourse thread `_ +for more information. + +This was rejected in favor of keeping t-string syntax as close to f-string syntax +as possible. + + +A Lazy Conversion Specifier +--------------------------- + +We considered adding a new conversion specifier, ``!()``, that would explicitly +wrap the interpolation expression in a lambda. + +This was rejected in favor of the simpler approach of using explicit lambdas +when lazy evaluation is desired. + + +Alternate Layouts for ``Template.args`` +--------------------------------------- + +During the development of this PEP, we considered several alternate layouts for +the ``args`` attribute of the ``Template`` type. These included: + +- Instead of ``args``, ``Template`` contains a ``strings`` attribute of type + ``Sequence[str]`` and an ``interpolations`` attribute of type + ``Sequence[Interpolation]``. There are zero or more interpolations and + there is always one more string than there are interpolations. Utility code + could build an interleaved sequence of strings and interpolations from these + separate attributes. This was rejected as being overly complex. + +- ``args`` is typed as a ``Sequence[tuple[str, Interpolation | None]]``. Each + static string is paired with its neighboring interpolation.
The final + string part has no corresponding interpolation. This was rejected as being + overly complex. + +- ``args`` remains a ``Sequence[str | Interpolation]`` but does not support + interleaving. As a result, empty strings are not added to the sequence. It is + no longer possible to obtain static strings with ``args[::2]``; instead, + instance checks or structural pattern matching must be used to distinguish + between strings and interpolations. We believe this approach is easier to + explain and, at first glance, more intuitive. However, it was rejected as + offering less future opportunity for performance optimization. We also believe + that ``args[::2]`` may prove to be a useful shortcut in template processing + code. + + +Mechanism to Describe the "Kind" of Template +-------------------------------------------- + +If t-strings prove popular, it may be useful to have a way to describe the +"kind" of content found in a template string: "sql", "html", "css", etc. +This could enable powerful new features in tools such as linters, formatters, +type checkers, and IDEs. (Imagine, for example, ``black`` formatting HTML in +t-strings, or ``mypy`` checking whether a given attribute is valid for an HTML +tag.) While this is an exciting possibility, this PEP does not propose any specific mechanism. It is +our hope that, over time, the community will develop conventions for this purpose. + + Acknowledgements ================ Thanks to Ryan Morshead for contributions during development of the ideas leading -to tag strings. Thanks also to Koudai Aono for infrastructure work on contributing -materials. Special mention also to Dropbox's `pyxl `_ -as tackling similar ideas years ago. +to template strings. Special mention also to Dropbox's +`pyxl `_ for tackling similar ideas years ago. +Finally, thanks to Joachim Viide for his pioneering work on the `tagged library +`_. Tagged was not just the precursor to +template strings, but the place where the whole effort started via a GitHub issue +comment!
+ Copyright =========