Add the package std module #2104

jneem · 2024-11-22T03:08:30Z

Depends on #2110

Adds a package module to std, with contracts related to package manifests.

As I was making this PR, it occurred to me that some of the contracts (e.g. the semver ones) could parse their input into a structured representation instead of just validating it. What do you think?

github-actions · 2024-11-22T03:20:58Z

Bencher Report

Branch	std-package
Testbed	ubuntu-latest

Benchmark	Latency (µs)
fibonacci 10	493.50
foldl arrays 50	1,767.90
foldl arrays 500	6,875.90
foldr strings 50	7,204.50
foldr strings 500	63,841.00
generate normal 250	46,226.00
generate normal 50	2,101.60
generate normal unchecked 1000	3,439.90
generate normal unchecked 200	785.72
pidigits 100	3,214.10
pipe normal 20	1,534.90
pipe normal 200	9,991.70
product 30	858.09
scalar 10	1,556.90
sum 30	852.35

yannham

Looks reasonable so far.

With respect to semantic versioning, we could indeed parse it along the way (since we're running the regex anyway), which I guess would spare re-parsing it on the Rust side. It would also allow people (if we write the right contract) to use either a structured or a textual representation as input. The final representation would always be structured (in fact even that is not required, we could keep a union, but I don't think that has much of an advantage).

Regarding optionals, I don't have a strong opinion myself but I guess we could look at the manifest format of some prominent languages that seem to have done things right and try to extract a common base of attributes that are required everywhere.

yannham · 2024-11-25T10:56:02Z

core/stdlib/std.ncl

+          "%,
+
+        keywords
+          | Array String


Should these be enum tags instead? That is less nice to express as a static type, though: we can use a contract like std.enum.EnumTag (I don't remember exactly what it's called), but for a static type we would need a proper existential like exists a. [| ; a |].

jneem · 2024-11-26

Mostly I picked String because the set of reasonable values is open and possibly large.

yannham · 2024-11-25T10:58:11Z

core/stdlib/std.ncl

+          | {
+            _ : [|
+              'Path String,
+              'Git {


I was going to say that we should put each payload into its own type, but then it won't be usable in statically typed code anymore, as we don't have let types yet. Is that why you inlined all the structure here?

jneem · 2024-11-26

Honestly, it was just because the first version had fewer fields, and then I didn't split it out as the number of fields grew. I'm not sure that static typing is a big concern here, as we don't have functions that consume and produce Manifests. (At least, not yet. But by the time we do, maybe we'll also have let types...)

yannham · 2024-11-26

Well, if we put it in the stdlib, people might want to do that, so we could be tempted to leave it as it is and then later split it in a nicer way once we have let types, without disturbing user code.

On the other hand, if we split it into contracts now, this reduces what users can do with manifests, but we can still introduce let types later, and that would be a backward-compatible improvement, which has been our strategy for many things up to now. So indeed, if the stdlib doesn't really need the statically typed side, I guess we can go either way.

dpulls · 2024-12-24T13:46:27Z

🎉 All dependencies have been resolved !

jneem · 2025-01-13T05:32:21Z

Ok, I think this is ready for another look.

I made the semver contracts parse their input to records. I think it's potentially useful to have the bare record contracts accessible (if nothing else, it provides a convenient way to document the structure of the records), so I adopted a pattern where std.package.structured.Semver is a record contract, while std.package.Semver is the parsing contract that accepts either a record or a string.
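The two-tier pattern described above can be sketched outside Nickel. The following is a minimal Python illustration under stated assumptions, not the code from this PR: `check_structured` plays the role of a bare record contract such as std.package.structured.Semver, and `normalize_semver` plays the role of a parsing contract such as std.package.Semver. The function names, the field names (`major`, `minor`, `patch`, `pre`), and the simplified regex are all hypothetical choices made for the example.

```python
import re

# Simplified semver pattern (an assumption for this sketch): MAJOR.MINOR.PATCH
# with an optional "-prerelease" suffix. Build metadata is deliberately ignored.
SEMVER_RE = re.compile(
    r"(0|[1-9]\d*)\.(0|[1-9]\d*)\.(0|[1-9]\d*)(?:-([0-9A-Za-z.-]+))?"
)


def check_structured(value):
    """Validate an already-structured version; return it unchanged or raise."""
    if not isinstance(value, dict):
        raise TypeError("expected a record")
    for field in ("major", "minor", "patch"):
        part = value.get(field)
        # bool is a subclass of int in Python, so exclude it explicitly.
        if not isinstance(part, int) or isinstance(part, bool) or part < 0:
            raise ValueError(f"field {field!r} must be a non-negative integer")
    if not isinstance(value.get("pre", ""), str):
        raise ValueError("field 'pre' must be a string")
    return value


def normalize_semver(value):
    """Accept a string or a record; always return the structured form."""
    if isinstance(value, str):
        match = SEMVER_RE.fullmatch(value)
        if match is None:
            raise ValueError(f"invalid semantic version: {value!r}")
        major, minor, patch, pre = match.groups()
        return {
            "major": int(major),
            "minor": int(minor),
            "patch": int(patch),
            "pre": pre or "",
        }
    return check_structured(value)
```

In this sketch, applying `normalize_semver` to its own output returns the record unchanged, so downstream code only ever has to handle the structured form, whichever form the manifest author wrote.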

I had a look at npm, cargo, and opam for inspiration on optionality of the fields, and there wasn't a lot of consensus. They all require a name, and both npm and opam require a version (although opam can try to infer the name and version from the directory structure). Opam requires a maintainer, but the others don't. Also, opam is the only one to require a tooling version ("opam-version"). I decided to go with mandatory name, version, authors, and minimal_nickel_version. I think the first two are uncontroversial. Authors is less necessary, but also very easy to fulfill (it accepts an empty array if you intentionally don't want to add it). Requiring minimal_nickel_version is a little opinionated, but (a) it has a precedent in opam and (b) I think it's important if we're going to add new language features.
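The choice of mandatory fields can be illustrated with a small check. This is a Python sketch under assumptions, not the Nickel contract from this PR: `missing_fields` is a hypothetical helper, the manifest is modelled as a plain dict, and the values in the sample manifest are placeholders. Only the four field names come from the comment above.

```python
# Fields the comment above proposes as mandatory.
REQUIRED_FIELDS = ("name", "version", "authors", "minimal_nickel_version")


def missing_fields(manifest):
    """Return the mandatory fields that are absent from a manifest dict."""
    return [field for field in REQUIRED_FIELDS if field not in manifest]


# Placeholder values; an empty authors list is present, so it counts as given.
manifest = {
    "name": "example",
    "version": "0.1.0",
    "authors": [],
    "minimal_nickel_version": "1.0.0",
}

print(missing_fields(manifest))
print(missing_fields({"name": "example"}))
```

The check is on presence rather than on non-emptiness, which mirrors the point that authors can be satisfied with an empty array.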

yannham

The overall structure is fine. I'm a bit worried about the normalizing contracts, which can be very surprising if you don't expect them. There is a precedent in the stdlib already (TagOrString), and it's useful, but I wonder if we should make it clearer in the documentation of each and every normalizing contract introduced here, with a caution at the beginning or something (for some it's not mentioned).

jneem · 2025-01-21T14:38:46Z

Yeah, I'm also not completely sure about the normalizing contracts. They came about because:

1. I think things like version numbers should be specifiable as strings in the manifest. This string representation is familiar to everyone, and less verbose than the alternative.
2. It was a little awkward to have string-validating contracts in Nickel, and then parse the strings again in Rust.
3. I thought maybe it would be occasionally useful to have access to the structured representation in Nickel code. Maybe.

These are basically in decreasing order of importance; I think the string-validating contracts are also an ok solution. Maybe we can just put a big "unstable" warning on the whole package module for now?

yannham · 2025-01-21T15:57:24Z

Yeah, it's not the first time that this validation vs parsing question comes into play, and there are good arguments for both. I tend to be ok with parsing: it makes it much easier to handle the resulting package in a standard way without having to worry about the 3 possible representations of each subfield. I just wonder how surprising it can be for a random user that a contract normalizes its input.

I think I like the unstable warning for now; we still put a bit of thought into this, but at least it buys a bit of time to flesh out the normalization question in general.

jneem requested a review from yannham November 22, 2024 03:08

yannham reviewed Nov 25, 2024

jneem force-pushed the std-package branch from 3801ac3 to aa90380 December 24, 2024 09:02

jneem added 4 commits January 13, 2025 10:07

Add the package std module

c75486b

Indent docs

38e7716

WIP on structured semver stuff

ee0bb57

Structured semver parsing

bb28514

jneem force-pushed the std-package branch from aa90380 to bb28514 January 13, 2025 03:07

jneem requested a review from yannham January 13, 2025 03:08

yannham approved these changes Jan 20, 2025

jneem added 4 commits January 21, 2025 21:08

Use contract.check instead of contract application

927260e

Doc indentation

f1bbcb7

Better docs, and mention normalization more

a082369

More std.contract.check

9499804

Use match instead of if/then

e79a10f

Add a stability warning to the package module

d926c50

jneem enabled auto-merge January 22, 2025 01:14

jneem added this pull request to the merge queue Jan 22, 2025

Merged via the queue into master with commit b2c8349 Jan 22, 2025
5 checks passed

jneem deleted the std-package branch January 22, 2025 01:29
