Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix and clarify cache file handling #4325

Open
wants to merge 10 commits into
base: main
Choose a base branch
from

Conversation

sbergen
Copy link
Contributor

@sbergen sbergen commented Mar 9, 2025

Functional changes:

Refactoring:

  • Remove very repetitive use of replace("/", "@"). Add two versions of it: One for erlang module names, and one for cache file names. My understanding is that they just happen to do the same thing, so I duplicated it. If there's some reason the cache file names must match the erlang module names, then we should probably change it further.
  • For the use of replace("/", "@") related to cache files, group these operations together as CacheFiles, as there was further duplication of the same logic in different parts of the code base.
  • Group logic related to discovery, validation, and module name derivation from gleam source files under GleamFile

I was originally going to refactor first separately, but it was hard to keep doing the wrong thing, and since the actual fixes are quite small, I thought it might be better if I just outlined them in the PR.

Comment on lines +50 to +54
pub fn load(&self, file: GleamFile) -> Result<Input> {
let name = file.module_name.clone();
let source_mtime = self.io.modification_time(&file.path)?;

let read_source = |name| self.read_source(path, name, source_mtime);
let read_source = |name| self.read_source(file.path.clone(), name, source_mtime);
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I didn't spend too much time trying to get rid of these two clones, but there might be something we could still optimize. I thought I'd get overall feedback first before going too detailed.

if (!inputs.contains_key(&module)) {
tracing::debug!(%module, "module_removed");
CacheFiles::new(&self.artefact_directory, &module).delete(&self.io);
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This line is the fix for the temporarily deleted files not being always recompiled.

.with_extension("cache_meta");

let _ = self.io.delete_file(&meta_path);
CacheFiles::new(&self.artefact_directory, &cached.name).delete(&self.io);
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here we delete more files than before.

Comment on lines +1723 to +1740
fn module_name(path: &Utf8Path, dir: &Utf8Path) -> EcoString {
// /path/to/project/_build/default/lib/the_package/src/my/module.gleam

// my/module.gleam
let mut module_path = path
.strip_prefix(dir)
.expect("Stripping package prefix from module path")
.to_path_buf();

// my/module
let _ = module_path.set_extension("");

// Stringify
let name = module_path.to_string();

// normalise windows paths
name.replace("\\", "/").into()
}
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This was moved, and made private so that it wouldn't be used with cache files (which caused some extra recompilation because we thought the modules were removed).

Comment on lines +1742 to +1760
fn is_gleam_path(path: &Utf8Path, dir: &Utf8Path) -> bool {
use regex::Regex;
use std::cell::OnceCell;
const RE: OnceCell<Regex> = OnceCell::new();

RE.get_or_init(|| {
Regex::new(&format!(
"^({module}{slash})*{module}\\.gleam$",
module = "[a-z][_a-z0-9]*",
slash = "(/|\\\\)",
))
.expect("is_gleam_path() RE regex")
})
.is_match(
path.strip_prefix(dir)
.expect("is_gleam_path(): strip_prefix")
.as_str(),
)
}
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This was just moved, as it wasn't used anywhere else.

Comment on lines +1809 to +1822
fn module_name(dir: &Utf8Path, path: &Utf8Path) -> EcoString {
// /path/to/artefact/dir/[email protected]_meta

// [email protected]_meta
let mut module_path = path
.strip_prefix(dir)
.expect("Stripping package prefix from module path")
.to_path_buf();

// my@module
let _ = module_path.set_extension("");

// my/module
module_path.to_string().replace("@", "/").into()
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is the corrected and modified version of getting the module name from any cache file.

assert!(response.result.is_ok());
assert_eq!(response.result, Ok(()));
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I got this test failing as a result of a typo, and the output wasn't too helpful. This makes it a lot better, so I thought I'd commit it.

@@ -1715,3 +1682,143 @@ impl<'a> Inputs<'a> {
Ok(())
}
}

/// A Gleam source file (`.gleam`) and the module name deduced from it
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For everything downwards from here, I'm not sure if this is the best module, or the best location in the source file to put these.

@sbergen sbergen marked this pull request as ready for review March 9, 2025 21:38
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Temporarily removed file does not always get recompiled
1 participant