use gzip to compress self profile json #1966
```diff
@@ -1,5 +1,9 @@
 [package]
-authors = ["Mark-Simulacrum <[email protected]>", "Nicholas Cameron <[email protected]>", "The rustc-perf contributors"]
+authors = [
+    "Mark-Simulacrum <[email protected]>",
+    "Nicholas Cameron <[email protected]>",
+    "The rustc-perf contributors",
+]
 name = "site"
 version = "0.1.0"
 edition = "2021"
```
```diff
@@ -42,10 +46,14 @@ mime = "0.3"
 prometheus = { version = "0.13", default-features = false }
 uuid = { version = "1.3.0", features = ["v4"] }
 tera = { version = "1.19", default-features = false }
-rust-embed = { version = "6.6.0", features = ["include-exclude", "interpolate-folder-path"] }
+rust-embed = { version = "6.6.0", features = [
+    "include-exclude",
+    "interpolate-folder-path",
+] }
 humansize = "2"
 lru = "0.12.0"
 ruzstd = "0.7.0"
+flate2 = { workspace = true }

 [target.'cfg(unix)'.dependencies]
 jemallocator = "0.5"
```
```diff
@@ -9,7 +9,10 @@ use analyzeme::ProfilingData;
 use anyhow::Context;
 use bytes::Buf;
 use database::ArtifactId;
+use flate2::write::GzEncoder;
+use flate2::Compression;
 use lru::LruCache;
+use std::io::Write;
 use std::num::NonZeroUsize;
 use std::time::Duration;
 use std::{collections::HashMap, io::Read, time::Instant};
```
```diff
@@ -37,9 +40,15 @@ pub fn generate(
         ProcessorType::Crox => {
             let opt = serde_json::from_str(&serde_json::to_string(&params).unwrap())
                 .context("crox opts")?;
+            let data = crox::generate(self_profile_data, opt).context("crox")?;
+
+            let mut encoder = GzEncoder::new(Vec::new(), Compression::default());
+            encoder.write_all(&data)?;
+            let gzip_compressed_data = encoder.finish()?;
+
             Ok(Output {
-                filename: "chrome_profiler.json",
-                data: crox::generate(self_profile_data, opt).context("crox")?,
+                filename: "chrome_profiler.json.gz",
+                data: gzip_compressed_data,
                 is_download: true,
             })
         }
```
Reviewer: I don't think that this is the correct way to do this. If we return gzipped data from a web server, we should return the normal file extension (in this case `.json`) and use the MIME type corresponding to it. Then, separately, we should set `Content-Encoding: gzip` to specify that the file is compressed. See https://stackoverflow.com/a/59999751/1107768.
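The reviewer's suggestion amounts to keeping the `.json` name and MIME type and signalling compression only via a response header. A hedged, std-only sketch of what such a response head could look like (`gzip_json_headers` is a hypothetical helper, not the actual site code):

```rust
/// Build the head of an HTTP response that serves gzip-compressed JSON
/// the way the reviewer suggests: normal MIME type, plus
/// `Content-Encoding: gzip` so the browser transparently decompresses.
fn gzip_json_headers(body_len: usize) -> String {
    format!(
        "HTTP/1.1 200 OK\r\n\
         Content-Type: application/json\r\n\
         Content-Encoding: gzip\r\n\
         Content-Length: {body_len}\r\n\r\n"
    )
}

fn main() {
    let head = gzip_json_headers(1024);
    assert!(head.contains("Content-Encoding: gzip"));
    print!("{head}");
}
```

Note that `Content-Length` refers to the compressed byte count on the wire, not the decompressed size.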
s7tya: I thought there was a difference between these two approaches. Using `Content-Encoding` lets browsers automatically decompress the content, whereas if you use `Content-Type: gzip`, the responsibility falls on the Perfetto side. By using `Content-Encoding`, we could potentially use Brotli instead of gzip, since almost all modern browsers support it. If we didn't have to worry about WebKit, we could even use zstd, though that might be a bit overkill.
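If the server picks the codec per request, it can prefer Brotli whenever the client's `Accept-Encoding` advertises it and fall back to gzip otherwise. A hypothetical std-only sketch of that negotiation (not rustc-perf code; q-values are ignored for simplicity):

```rust
/// Pick the best compression codec the client advertises, preferring
/// Brotli ("br") over gzip. Returns `None` if neither is offered.
fn pick_encoding(accept_encoding: &str) -> Option<&'static str> {
    // "gzip;q=1.0, br" -> ["gzip", "br"]; q-values are deliberately ignored.
    let offered: Vec<&str> = accept_encoding
        .split(',')
        .map(|tok| tok.trim().split(';').next().unwrap_or("").trim())
        .collect();
    for preferred in ["br", "gzip"] {
        if offered.iter().any(|&e| e == preferred) {
            return Some(preferred);
        }
    }
    None
}

fn main() {
    assert_eq!(pick_encoding("gzip, deflate, br"), Some("br"));
    assert_eq!(pick_encoding("gzip;q=1.0, identity"), Some("gzip"));
    assert_eq!(pick_encoding("identity"), None);
}
```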
s7tya: Currently we download a blob and then move to Perfetto's website, so the two approaches work like this:

- `Content-Encoding: gzip`: compress (server) → fetch (client) → decompress (client) → PAGE TRANSITION → load (Perfetto)
- `.gz` download: compress (server) → fetch (client) → PAGE TRANSITION → decompress & load (Perfetto)

I thought it would be nice if we could reduce the time we wait on the RLO page after clicking the "query trace" button, since there's no animation or anything that tells you it's working.
Reviewer: Yeah, there is indeed a difference! We load the data that we pass to Perfetto using "normal browser logic", i.e. the `fetch` call, which should do the decompression for us in the background (hopefully using optimized C/C++ code). I get your idea, and it seems interesting. So the question is whether it is better to go to Perfetto faster and let it un-gzip the data using its own decompression (presumably implemented in some C++ → JS way, hopefully in WebAssembly?), or whether it is faster to let the browser decompress the data using its native code (which I expect to be faster than Perfetto's), and only then go to Perfetto. With the second approach we could even use e.g. Brotli, as you said (we already have support for it in the website).

Without measuring this, I think the browser-decompression approach could be faster, especially if we switch to Brotli. If that is the case, we could resolve the slightly longer wait until Perfetto opens by adding a loading indicator; that shouldn't be too difficult.

Could you please do a small benchmark to evaluate the two approaches? Ideally with as large a file as possible 😆 You could download the 600 MiB cargo trace locally and simply load it from disk (then compress and return it) on a local endpoint, to avoid overloading our live server. The goal of the benchmark would be to evaluate:

1. Serving the `.gz` file directly and letting Perfetto decompress it.
2. Serving with `Content-Encoding: gzip`, letting the browser decompress it, then passing the data to Perfetto.

Ideally, for both cases, it would be great to know both the duration of the actual file download (which with approach 2 also includes decompression time) and the end-to-end duration between clicking and the trace being fully loaded in Perfetto (you can use a stopwatch xD).
s7tya: Alright, I've created a new PR, #1968, because the approaches are quite different; I'll mark the related conversations as resolved. Also, Brotli is so much faster than the current gzip path: it only takes 0.5 s to compress.