FR: Documentation tests #34

Open
tingerrr opened this issue Jul 21, 2024 · 9 comments
Labels
A-runner Area: test runner A-tests Area: anything actually test related C-feature Category: feature request P-medium Priority: medium

@tingerrr
Owner

tingerrr commented Jul 21, 2024

At some point, possibly after 0.1.0, tytanic should support doc tests: search doc comments for code blocks marked either example or test, run them, and, if desired, compare them against references.

@tingerrr tingerrr added C-feature Category: feature request A-api Area: rust crate API A-tests Area: anything actually test related P-medium Priority: medium labels Jul 21, 2024
@tingerrr tingerrr added this to the 0.2.0 milestone Jan 16, 2025
@tingerrr tingerrr changed the title Doc tests FR: Documentation tests Jan 19, 2025
@tingerrr tingerrr added A-runner Area: test runner and removed A-api Area: rust crate API labels Feb 2, 2025
@tingerrr tingerrr modified the milestones: 0.2.0, 0.3.0+ Feb 2, 2025
@tingerrr
Owner Author

tingerrr commented Feb 2, 2025

Considering that most of the ecosystem uses tidy-like doc comments in some form or fashion, parsing such tests should be relatively easy. It is probably enough to use typst-syntax, check whether it allows accessing trivia, and then collect everything between the opening and closing triple backticks in leading comments that start with triple forward slashes.

/// Frobnicates the input
///
/// ```example
/// // Verify frobnication!
/// #assert(frobnicate(0) != 0)
/// ```
///
/// -> int
#let frobnicate(input) = input + 42
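
The collection step described above could be sketched as follows. This is only an illustration under stated assumptions: it scans lines by hand instead of going through typst-syntax, and `DocBlock` and `extract_doc_blocks` are hypothetical names, not part of tytanic.

```rust
/// A fenced block found inside a run of `///` doc comments.
#[derive(Debug, PartialEq)]
struct DocBlock {
    /// The info string after the opening backticks, e.g. `example`.
    tag: String,
    /// The block body with the `///` prefix stripped from every line.
    body: String,
}

/// Collects fenced blocks from `///` doc comments by plain line scanning.
/// Hypothetical helper: a real implementation would likely use typst-syntax
/// instead of matching prefixes by hand.
fn extract_doc_blocks(source: &str) -> Vec<DocBlock> {
    let mut blocks = Vec::new();
    // Tag and accumulated lines of the block currently being read, if any.
    let mut open: Option<(String, Vec<String>)> = None;

    for line in source.lines() {
        // Only doc comment lines take part; anything else discards an open block.
        let Some(rest) = line.trim_start().strip_prefix("///") else {
            open = None;
            continue;
        };
        // Drop the single space that conventionally follows `///`.
        let text = rest.strip_prefix(' ').unwrap_or(rest);

        if let Some(tag) = text.trim().strip_prefix("```") {
            match open.take() {
                // A fence while a block is open closes that block.
                Some((tag, lines)) => blocks.push(DocBlock { tag, body: lines.join("\n") }),
                // Otherwise it opens a new block with the given tag.
                None => open = Some((tag.trim().to_string(), Vec::new())),
            }
        } else if let Some((_, lines)) = open.as_mut() {
            lines.push(text.to_string());
        }
    }
    blocks
}
```

Applied to the frobnicate snippet above, this yields a single block tagged example whose body is the two lines between the fences. Unterminated blocks are silently dropped in this sketch; a real runner would probably want to report them.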

There are some considerations.

Identifying and Querying doc tests

At the very least, a doc() test set should be added which contains all doc tests. But how should a user identify doc tests? Since they are tied to the documentation of an item, it may be beneficial to also add test sets which refer to doc tests on specific items like func(), value() and module(), but this might be overkill.

Regardless of this, every test needs a unique and somewhat stable identifier.

Identifying the item itself should be relatively easy: it should be the file path and the top-level item name. This means that we exclude inner functions, renames, re-exports, and other non-standard items for now.

We need some kind of separator for doc tests, perhaps : to indicate the end of a path and the start of the item identifier. For example, if frobnicate was placed inside /src/util.typ, then its doc tests would be identified by src/util:frobnicate. We could even remove common prefixes such as src/, with an optional escape hatch to disable that behavior. Typst allows shadowing, which means frobnicate could be declared multiple times. I'm unsure how doc generators handle this, but I'm tempted to simply add a discriminator to the identifier if this happens, e.g. src/util:frobnicate:1, src/util:frobnicate:2, etc. This would also be properly highlighted in the CLI UI.
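
A minimal sketch of this identifier scheme, assuming the items of all files are available as (file path, item name) pairs in declaration order. `doc_test_ids` and its `strip_prefix` option are hypothetical names used only for illustration.

```rust
use std::collections::HashMap;

/// Builds doc test identifiers such as `src/util:frobnicate` from
/// `(file path, item name)` pairs given in declaration order.
/// `strip_prefix` is a hypothetical option for dropping a common prefix like `src/`.
fn doc_test_ids(items: &[(&str, &str)], strip_prefix: Option<&str>) -> Vec<String> {
    // Normalize each pair into its base identifier `path:item`.
    let bases: Vec<String> = items
        .iter()
        .map(|(path, item)| {
            let path = path.trim_start_matches('/');
            let path = path.strip_suffix(".typ").unwrap_or(path);
            let path = strip_prefix
                .and_then(|prefix| path.strip_prefix(prefix))
                .unwrap_or(path);
            format!("{path}:{item}")
        })
        .collect();

    // Count occurrences so that only shadowed items receive a discriminator.
    let mut totals: HashMap<&str, usize> = HashMap::new();
    for base in &bases {
        *totals.entry(base.as_str()).or_insert(0) += 1;
    }

    // Number repeated identifiers in declaration order, starting at 1.
    let mut seen: HashMap<&str, usize> = HashMap::new();
    bases
        .iter()
        .map(|base| {
            if totals[base.as_str()] == 1 {
                return base.clone();
            }
            let n = seen.entry(base.as_str()).or_insert(0);
            *n += 1;
            format!("{base}:{n}")
        })
        .collect()
}
```

Note that a positional discriminator like this is only stable as long as declarations are not reordered, which is the same instability that applies to individual examples.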

However, if we want to allow referring to individual examples, then we likely need a label mechanism. Otherwise, adding, removing, or reordering examples will affect other examples' identifiers, making them unstable.

Referring to individual examples will likely be left out of the MVP and added later, if at all, but it should be considered for the syntax of discriminators.

Binding availability

Most doc generators allow adding a prelude which is not shown in the doc comment to make examples concise. We can't easily know where these are defined because they usually get put into the manual.

An example would be mantys, which adds itself for documentation. I'm unsure if it handles examples, but it uses tidy under the hood, so I assume it passes itself to the examples tidy validates.

We need to let the user define these in the config. We can't know the defaults every doc generator defines: we don't know all doc generators (some may be internal), and maintaining such a list would not be feasible even if we did. This also doesn't account for any other bindings a user may add in their manual config.

References

Should we store references for doc tests and have the user update them?

We could treat this like regression tests with a different structure, create the references somewhere else and have them be regression tests, but this means we need another mechanism to mark if a test should be compiled or compared.


cc: @jneug, could you tell me how mantys handles examples currently? Does it add anything to the prelude there, or does it simply render them as is and nothing else? Are you aware of any other doc generators which do it differently?

cc: @Mc-Zen, does tidy do anything special here, other than adding the function itself?

How do you guys handle name conflicts, shadowing and module paths, i.e. using a function from /src/foo.typ inside /src/foo/bar.typ for example? Is this up to the user to include the appropriate module/function?


I'm going to move this to 0.3.0 for now, it's not urgent.

@Mc-Zen

Mc-Zen commented Feb 2, 2025

Hi, this is indeed a really tricky matter. Personally, I'm not a huge fan of doc tests and just added doc test functionality to tidy upon a request. Imho, tests and docs should really be separated. On the other hand, I understand the need to ensure that the doc examples pass, and for that use case I'm all in!

Scope

With tidy, the scope of the evaluated code examples is extended by the scope argument that the user can pass to tidy.parse-module(). Usually, you will want to pass scope: dictionary(my-module) to bring your module into scope for all examples.

Preamble

Currently, tidy supports a preamble that can be set and that will be prepended to all code examples. BUT I might drop this support entirely some day since I find these preambles (or preludes as you call them) very hacky and in general unrecommendable and inflexible.

Execute-only code

Instead, tidy has a special syntax (an idea inherited from the Typst source) where you can write additional lines starting with >>> that are only executed but not shown in the previewed source code of the example.

>>> import draw: * // executed but not displayed
#cetz.canvas({
  line(..)
})

This provides way more flexibility, leads to more predictable code and presumably better integration with tooling.
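
To illustrate how simple this convention is to handle in tooling, here is a minimal sketch that splits an example into the source shown to readers and the source that is compiled. `split_example` is a hypothetical helper, and the exact rules tidy applies (for instance around whitespace after the marker) are an assumption here.

```rust
/// Splits an example into the source shown to readers and the source that is
/// actually compiled. Lines starting with `>>>` are compiled but not shown.
fn split_example(example: &str) -> (String, String) {
    let mut shown = Vec::new();
    let mut executed = Vec::new();

    for line in example.lines() {
        match line.trim_start().strip_prefix(">>>") {
            // Hidden line: execute it without the marker, never display it.
            Some(hidden) => executed.push(hidden.strip_prefix(' ').unwrap_or(hidden)),
            // Regular line: both displayed and executed.
            None => {
                shown.push(line);
                executed.push(line);
            }
        }
    }
    (shown.join("\n"), executed.join("\n"))
}
```

For the snippet above, the displayed source contains only the cetz.canvas call, while the executed source additionally starts with the import line.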

Another important aspect to consider here (especially for testing) is layout. Most Typst output depends on the size of the parent container. Usually, a code example in a documentation should just take as much space as it needs, i.e., it should be placed inside a container with automatic dimensions. But in certain cases you'd want a fixed container, for example to demonstrate the usage of repeat. Execute-only code as described above can help you out of this: just wrap everything in a block with fixed height and/or width but don't display the creation of that block in the example code.

@Mc-Zen

Mc-Zen commented Feb 2, 2025

Do you already have a specific design in mind? I would maybe suggest, that doc tests need their own test suite that needs to be added manually, like

tests/
  | − basic-tests/ 
  | − ...
  | − docs/

where docs/ contains some test.typ that somehow activates these tests (by calling a special function or having a special filename or directory). Here would also be a good place for any Typst-side configuration.

@tingerrr
Owner Author

tingerrr commented Feb 3, 2025

> With tidy, the scope of the evaluated code examples is extended by the scope argument that the user can pass to tidy.parse-module(). Usually, you will want to pass scope: dictionary(my-module) to bring your module into scope for all examples.

Does this receive any defaults? Is the item which is documented included by default, for example? Otherwise, the example I wrote would already fail by tidy standards.

> >>> import draw: * // executed but not displayed
> #cetz.canvas({
>   line(..)
> })

Ok, that sounds good. I would prefer this too, especially for more elaborate examples which may refer back to values from previous ones.

> Another important aspect to consider here (especially for testing) is layout. Most Typst output depends on the size of the parent container. Usually, a code example in a documentation should just take as much space as it needs, i.e., it should be placed inside a container with automatic dimensions. But in certain cases you'd want a fixed container, for example to demonstrate the usage of repeat. Execute-only code as described above can help you out of this: just wrap everything in a block with fixed height and/or width but don't display the creation of that block in the example code.

Presumably so, I haven't thought about this yet, but if references are to be supported, then my first instinct would be to let the user decide what the default page size is for their doc tests. Perhaps in the manifest config.

> Do you already have a specific design in mind? I would maybe suggest, that doc tests need their own test suite that needs to be added manually, like
>
> tests/
>   | − basic-tests/
>   | − ...
>   | − docs/
>
> where docs/ contains some test.typ that somehow activates these tests (by calling a special function or having a special filename or directory). Here would also be a good place for any Typst-side configuration.

I don't want to add extra directories unless I have to, but if I want to support references, I'll likely have to add this extra indirection. I've already thought about a structure like this for template tests. But I don't want a magic file which activates doc tests. If people want to disable doc tests, they can simply wrap their expressions in (...) ~ docs(), possibly with some kind of config for doing this automatically.


With all that said, does tidy only run example blocks? Are others ignored by default?

@Mc-Zen

Mc-Zen commented Feb 3, 2025

> Does this receive any defaults? Is the item which is documented included by default, for example? Otherwise, the example I wrote would already fail by tidy standards.

No, it doesn't. You explicitly need to provide it. The reason is that depending on the number of definitions, some packages prefer to use the import "..": * style while others - especially ones with large namespaces - suggest not to import everything but maybe import the package with some proposed alias short-hand, like tidy.parse-module().

> I don't want to add extra directories unless I have to, but if I want to support references, I'll likely have to add this extra indirection. I've already thought about a structure like this for template tests. But I don't want a magic file which activates doc tests. If people want to disable doc tests, they can simply wrap their expressions in (...) ~ docs(), possibly with some kind of config for doing this automatically.

Personally, I feel like I'd prefer doc-example testing to be opt-in.

> With all that said, does tidy only run example blocks? Are others ignored by default?

Yes, example (and examplec = example in code mode) are the only ones that are intended to be run. On the contrary, typ and typc are only displayed, not run. I think this distinction is important to make since there are use cases, where one would want to show Typst code but not have it evaluated.
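
For a test runner, this distinction boils down to a small classification over the fence tag. The following is a sketch, not tidy's implementation: `BlockKind`, `classify`, and `is_doc_test` are hypothetical names, and treating plain example as markup mode is inferred from examplec being described as its code-mode variant.

```rust
/// How a fenced block in a doc comment is treated.
#[derive(Debug, PartialEq, Clone, Copy)]
enum BlockKind {
    /// Evaluated, assumed to be in markup mode (`example`).
    RunMarkup,
    /// Evaluated in code mode (`examplec`).
    RunCode,
    /// Shown but never evaluated (`typ`, `typc`, and any other tag).
    DisplayOnly,
}

/// Maps a fence tag to its treatment.
fn classify(tag: &str) -> BlockKind {
    match tag.trim() {
        "example" => BlockKind::RunMarkup,
        "examplec" => BlockKind::RunCode,
        _ => BlockKind::DisplayOnly,
    }
}

/// Only evaluated blocks would be candidates for doc tests.
fn is_doc_test(tag: &str) -> bool {
    classify(tag) != BlockKind::DisplayOnly
}
```

Defaulting unknown tags to display-only keeps code in other languages, or Typst code that is deliberately not evaluated, out of the test set.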

The >>> syntax could be rethought of course. Laurenz's comment on this was that it was really just ad hoc for the Typst documentation. But I think it's not bad:

  • it's super easy and straight-forward to parse
  • easy to "parse" for humans that read the doc comment in the source code
  • and >>> at the start of the line is not common in document content

@jneug

jneug commented Feb 3, 2025

> could you tell me how mantys handles examples currently? Does it add anything to the prelude there, or does it simply render them as is and nothing else? Are you aware of any other doc generators which do it differently?

Mantys was very much inspired by Tidy in that regard. Nothing is added by default, but users can provide an examples-scope to Mantys, that is a dictionary with a scope and an imports key.

The scope is extended by the scope passed to the #example function and then added to all examples. The imports key is used to generate a preamble for example code. For example:

#show: mantys(
	// ...
	
	examples-scope: (
		// Scope for all examples
		scope: (
			my-module: mymodule
		),
		// Preamble for all examples
		imports: (
			// #import my-module: *
			my-module: "*",
			// #import my-other-module: func-a, func-b
			my-other-module: ("func-a", "func-b")
		)
	)
)

Users may also set use-examples-scope to false when calling #example to not use the default scope.
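
A tool that wants to reproduce such a preamble outside of Mantys could mirror the mapping described by the comments in the configuration above. The following is a rough sketch under that assumption; `Imports` and `render_preamble` are hypothetical names and do not reflect Mantys' actual implementation.

```rust
/// What to import from a module, mirroring the two shapes in the config above.
enum Imports<'a> {
    /// `"*"`: import everything.
    All,
    /// A list of names to import.
    Names(&'a [&'a str]),
}

/// Renders one `#import` line per entry, in the given order.
fn render_preamble(imports: &[(&str, Imports<'_>)]) -> String {
    imports
        .iter()
        .map(|(module, what)| match what {
            Imports::All => format!("#import {module}: *"),
            Imports::Names(names) => format!("#import {module}: {}", names.join(", ")),
        })
        .collect::<Vec<_>>()
        .join("\n")
}
```

For the configuration above, this produces the two import lines given in its comments, one per module.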

> Instead, tidy has a special syntax (an idea inherited from the Typst source) where you can write additional lines starting with >>> that are only executed but not shown in the previewed source code of the example.

I like this. Maybe this would warrant a separate package like self-example that standardizes the compilation of examples.

> Personally, I'm not a huge fan of doc tests and just added doc test functionality to tidy upon a request.

That was my request back in the day when there was no good testing solution like tytanic. I'm all in favor of separating concerns. Back then, I was developing codetastic and had all these small functions that generated checksums for qr-codes and stuff. For these cases, the doctests were pretty useful, since they had no output and pretty much were a bunch of asserts.

Since tytanic can now do compile-only and panic tests, the need for doctests is superfluous. Maybe their use should not be encouraged by supporting them in tytanic and rather provide a tutorial on how to move from doctests to tytanic tests?

@tingerrr
Owner Author

tingerrr commented Feb 3, 2025

So from what I can gather at this moment, by default the scope is quite literally empty and the example I wrote in my first comment would at the very least need to import frobnicate to compile?

> I like this. Maybe this would warrant a separate package like self-example that standardizes the compilation of examples.

Maybe, on the other hand, it's more of a convention for packages to use than a useful feature on its own. self-example could add it as a feature at some point, for example.

> Since tytanic can now do compile-only and panic tests, the need for doctests is superfluous. Maybe their use should not be encouraged by supporting them in tytanic and rather provide a tutorial on how to move from doctests to tytanic tests?

I don't think that makes them superfluous. Examples are often copy-pasted verbatim by users, so ensuring they compile is important. Moving them requires either removing them from the docs or syncing changes between docs and tests. But because they are just part of a comment, the author may not notice that an example in file a/b/c/d.typ stopped compiling due to a change in e/f/g/h.typ.


Thank you two for your swift answers.

@jneug

jneug commented Feb 3, 2025

> I don't think that makes them superfluous. Examples are often copy-pasted verbatim by users, so ensuring they compile is important. Moving them requires either removing them from the docs or syncing changes between docs and tests. But because they are just part of a comment, the author may not notice that an example in file a/b/c/d.typ stopped compiling due to a change in e/f/g/h.typ.

Hm, maybe I misunderstood. I was talking about Tidy doc-tests like:

/// #test(
///   `num.my-square(2) == 4`,
///   `num.my-square(4) == 16`,
/// )
#let my-square(n) = n * n

@tingerrr
Owner Author

tingerrr commented Feb 3, 2025

Ah, yes indeed, those would likely be moved to standalone tytanic tests. I'm just thinking about example code blocks like in the first comment.


3 participants