Multiple copies of IDE distribution in Gradle cache + reindexing problem #1601
I'm aware of this problem and investigating the root cause. The expected behavior is that Gradle reuses the already-transformed artifact even if the build configuration changes. |
I encountered the same problem. I ended up with 197 GiB worth of JetBrains IDEs in my Gradle cache. PS: #1639 looks like a duplicate of this issue.
|
@hsz any updates on this (I've read #1639 (comment))? |
Here is the Kotlin script I've created for purging extracted IntelliJ Platform archives from the Gradle transforms cache: https://gist.github.com/hsz/0fc45e1a6fc9ef73d4e4f5960058bded |
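As a companion to the purge script linked above, here is a rough plain-Kotlin sketch (not part of the gist; the function names and cache layout are assumptions) that reports how much space each transforms-cache entry occupies. It assumes the Gradle 8.x layout `<gradleUserHome>/caches/<version>/transforms/<entry>`:

```kotlin
import java.io.File

// Total size in bytes of all regular files under a directory.
fun dirSize(dir: File): Long =
    dir.walkBottomUp().filter { it.isFile }.sumOf { it.length() }

// Map each transforms-cache entry (one per transformed artifact) to its size.
// Assumes the Gradle 8.x layout: <gradleUserHome>/caches/<version>/transforms/<entry>
fun transformCacheUsage(gradleUserHome: File): Map<String, Long> =
    File(gradleUserHome, "caches")
        .walk().maxDepth(2)
        .filter { it.isDirectory && it.name == "transforms" }
        .flatMap { it.listFiles()?.toList() ?: emptyList() }
        .associate { entry -> entry.name to dirSize(entry) }
```

Summing the returned values shows the total disk usage, which makes growth like the 197 GiB above easy to spot.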
Unfortunately, it is impossible to extract the resolved IntelliJ Platform artifact (whether it is an installer or a ZIP archive from the IntelliJ Maven repository) to a custom directory and reuse it later with the Gradle dependencies mechanism without breaking some of Gradle's foundations. I consider implementing the fix in the Gradle build system to be the only possible solution. |
Maybe use the hash of the file (ZIP, DMG, tar.gz) instead of the one Gradle calculates. @hsz |
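The suggestion above, keying on the archive's own hash rather than Gradle's computed input hash, could be sketched in plain Kotlin with the JDK's `MessageDigest` (the function names are illustrative):

```kotlin
import java.io.File
import java.security.MessageDigest

// SHA-256 of raw bytes, rendered as lowercase hex.
fun sha256(bytes: ByteArray): String =
    MessageDigest.getInstance("SHA-256")
        .digest(bytes)
        .joinToString("") { "%02x".format(it) }

// Hash of the downloaded archive (ZIP, DMG, tar.gz) itself, independent of
// the build-script classpath that Gradle normally mixes into the transform cache key.
fun archiveHash(archive: File): String = sha256(archive.readBytes())
```

Such a hash would stay stable across build-script changes, which is exactly what the transform cache key currently does not do.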
I have no impact on what is taken as an input: Gradle considers the whole implementation classpath of the build script. The idea is to limit the input classpath and isolate it by specifying the dependencies that affect the transformer output, such as:

```kotlin
dependencies {
    registerTransform(MyTransform::class.java) {
        from...
        to...
        isolated {
            classpath("transform:dependency-1:1.0")
            classpath("transform:dependency-2:1.0")
        }
    }
}
```

Having that implemented, I could pass the IntelliJ Platform dependency as input, and Gradle would calculate the hash of the archive you mentioned. |
Not sure if this is the right place to discuss this, but I have a few ideas.
|
Thank you for your input, @JojOatXGME!
|
Yes, I understood that. My thought was that, technically, Gradle could hash the classes recursively referenced by the transform instead of hashing the whole classpath. If someone uses
That is left to Gradle to define. (My ideas were about how Gradle could refine its transform feature, similar to how you mentioned the introduction of the |
@hsz Just out of curiosity, and because I would like to receive updates on this topic: is there an issue for this on the Gradle side? |
@JojOatXGME There's no input on that from the Gradle side. I'll start working on this story in September. All the progress will be communicated in this thread. |
How to fix the issue with the transforms cache

Solution 1

The IDE should be split into parts and published as separate artifacts into a real repository, so that we can declare conventional Gradle dependencies on them without having to download a 1–2 GB archive and gut it during the build into separate sub-artifacts. There already seem to be quite a few artifacts being published: https://mvnrepository.com/artifact/com.jetbrains.intellij.platform

Solution 2

Gradle faces the same dilemma while downloading JDKs, and it does not use transforms for that. So maybe we should not either, and the problem won't exist.

Solution 3

The problem seems to originate from this constraint:

It may not work, but the above does not say that we cannot read outside of the input artifact path. When we extract the IDEs, we can create an additional file in the directory with a hash of the original ZIP file. Later, when a new transform runs, we can search other transforms for an already extracted version and either create a symlink to it and in the code we can do

If symlinks for some reason do not work, we can just write some special marker file, which would let anyone trying to use this directory as a platform path know that it should be looked for at another location. And even if that does not work, we can still skip the extraction if we find an already extracted location and create a dependency on this empty directory created by the transform. Then we can rewrite the artifact path using a ComponentMetadataRule, just like I did in this PR:

Solution 4

The reason why this transform runs so often is this: whenever the build classpath changes, Gradle does not want to reuse the old transform cache. To work around this, we can create a sub-plugin just for downloading & unzipping the IDE and almost never change it. But it may not help if changes in the build classpath of the "IntelliJ plugin project" also cause it to run. In that case, a sub-plugin could be created with the sole purpose of downloading a ZIP, splitting it into artifacts, and publishing to Maven Local. Then the main plugin would declare dependencies on that.

Somewhat related

There is also a somewhat related issue that we have a fake Ivy repo in

How to get rid of Ivy XMLs in ".intellijPlatform/localPlatformArtifacts"

Solution 5

bundledPlugins & modules are not really separate from the IDE; they are a part of it, but depending on which are required, we may need to create different "variants" of the IDE. So instead of creating dependencies like "bundledPlugin:Tomcat:242-EAP-SNAPSHOT", we will register a transformer parameterized with the list of requested plugins & modules, which will select the proper bundled plugins & libs, like CollectorTransformer (line 4 in b9b6699).

Nothing changes in the current API for declaring dependencies, except that bundledPlugins won't be creating any real dependencies, but just communicating parameters to the CollectorTransformer registered on the fly.

Solution 6

I have also explored whether we could use Gradle's capabilities; in theory, they seem to apply well here, because the IDE is a platform with many capabilities, like modules & plugins, which we want to request separately from the IDE. In theory, with capabilities, we could declare the IDE dependency like in the example below. This can be done either internally (when we process the added bundled plugins & modules) in this Gradle plugin, or exposed to developers, so that they write something like the below in their build scripts.

```kotlin
dependencies {
    // https://plugins.jetbrains.com/docs/intellij/tools-intellij-platform-gradle-plugin-dependencies-extension.html
    intellijPlatform {
        create(
            // To avoid hard-coding these names, this plugin can create a catalog on the fly
            // https://docs.gradle.org/current/userguide/platforms.html#sec:importing-published-catalog
            // so that we use it just like the catalog from libs.versions.toml,
            // but it may be a bit too much magic.
            properties("intellijPlatform.type"),
            properties("intellijPlatform.version"),
            properties("intellijPlatform.useInstaller").map { it.toBoolean() }
        ) {
            capabilities {
                // These too could be referenced using the generated catalog
                requireCapability("Tomcat")
                requireCapability("Java")
            }
        }
    }
}
```

Capabilities here are not "real" capabilities per se, meaning that we use the capabilities API only as a means to communicate to the Gradle plugin which sub-artifacts we want included. Then the plugin, depending on which capabilities were requested, dynamically registers a variant with exactly the same capabilities (so that Gradle does not complain) and the corresponding jars included. It is somewhat the other way around compared to how Gradle wants it to be, i.e. an artifact declares capabilities and consumers request them, but in our case we can only generate them the other way around, or we would have to register an infinite number of variants. It is very similar to solution 5, but here we just use the capabilities API instead of custom

The only question now is what other corner cases I do not know about. In solutions 5 and 6, in Gradle's dependencies tree we will have just

This may improve IDE performance because we will not be duplicating transitive jars many times in the dependencies tree. Another positive thing: we can control the order of all jars. |
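The marker-file part of solution 3 above could be sketched in plain Kotlin. Everything here is illustrative: the marker-file name and helper functions are invented, and a real implementation would live inside the transform and deal with Gradle's actual cache layout:

```kotlin
import java.io.File

// Hypothetical marker-file name; a real implementation would pick something
// unlikely to clash with the extracted IDE's own contents.
val MARKER_NAME = ".intellij-platform-archive-hash"

// After extracting an archive, record the hash of the original ZIP next to
// the extracted files so later transform runs can recognize this extraction.
fun markExtraction(extractedDir: File, archiveHash: String) {
    File(extractedDir, MARKER_NAME).writeText(archiveHash)
}

// Before extracting again, scan other transform output directories for one
// whose marker matches the archive hash; if found, extraction can be skipped
// (and a symlink or a ComponentMetadataRule path rewrite can point at it).
fun findExistingExtraction(cacheRoots: List<File>, archiveHash: String): File? =
    cacheRoots.asSequence()
        .flatMap { root -> root.listFiles()?.asSequence() ?: emptySequence() }
        .filter { it.isDirectory }
        .firstOrNull { dir ->
            val marker = File(dir, MARKER_NAME)
            marker.isFile && marker.readText().trim() == archiveHash
        }
```

The lookup is keyed on the archive hash alone, so it survives build-classpath changes that currently invalidate the transform cache.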
I also made this suggestion as part of #1696. I can imagine that making this work for all cases could take some time. For example, multiple teams at JetBrains might be affected as build pipelines of the different IDEs might need to be adjusted. Anyway, I think this would be the right direction for the long term. |
Yeah, but technically anyone can probably use this plugin to generate separate artifacts from an IDE and e.g. publish them to Maven Local, then depend on them. But this plugin provides a few other features related to running the IDE, testing, etc., which we are still going to need. |
Interesting, this actually suggests another solution, similar to 4: a sub-plugin could be created with the sole purpose of downloading a ZIP, splitting it into artifacts, and publishing to Maven Local. Then the main plugin would declare dependencies on that. Problem solved. |
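The sub-plugin idea could look roughly like the following build-script fragment. This is a sketch of the concept, not the plugin's actual implementation; the coordinates and file paths are invented for illustration:

```kotlin
// build.gradle.kts of a hypothetical helper build whose only job is to
// download the IDE archive, split it, and publish the parts to Maven Local.
plugins {
    `maven-publish`
}

publishing {
    publications {
        create<MavenPublication>("ideCore") {
            groupId = "com.example.idea.split"   // hypothetical coordinates
            artifactId = "idea-core"
            version = "2024.1"
            // Produced by a (not shown) task that unpacks the IDE archive
            artifact(layout.buildDirectory.file("split/idea-core.jar"))
        }
    }
    repositories {
        mavenLocal()
    }
}
```

The main build would then declare an ordinary dependency on `com.example.idea.split:idea-core:2024.1`, resolved from `mavenLocal()`, so its own classpath changes never re-trigger the download and extraction.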
FYI, I just created an issue at the Gradle project, since all the straightforward solutions would require changes on their side. |
Good idea. Solutions 1-3 could be done without changes in Gradle. 4 is possible too, but probably only if that other plugin is used separately to prepare the environment for the build. |
What happened?
I migrated a project to IntelliJ Platform Gradle Plugin 2.0 and specified
intellijIdeaUltimate("2024.1")
in build.gradle.kts and didn't change it. However, after a few hours of work, I found that the Gradle cache contained six copies of the IDEA Ultimate distribution. Since each distribution occupies 3.3 GB, it's quite a lot of space.
Steps to reproduce
Not sure when exactly the new copies are created.
Gradle IntelliJ Plugin version
2.0.0-SNAPSHOT
Gradle version
8.5
Operating System
Linux