pageserver: add page_trace API for debugging #10293
base: main
Conversation
LGTM.
    .await?;

let (page_trace, mut trace_rx) = PageTrace::new(event_limit);
timeline.page_trace.store(Arc::new(Some(page_trace)));
Should this error if there's already a trace in progress?
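One possible guard, as a minimal sketch: it assumes timeline.page_trace is an ArcSwap<Option<PageTrace>> (as the store() calls above suggest) and that ApiError has a Conflict-style variant; neither is confirmed by this diff.

if timeline.page_trace.load().is_some() {
    // A trace is already running; reject rather than silently replacing it.
    // (Load-then-store leaves a small race window, likely acceptable for a debug API.)
    return Err(ApiError::Conflict("page trace already in progress".into()));
}
let (page_trace, mut trace_rx) = PageTrace::new(event_limit);
timeline.page_trace.store(Arc::new(Some(page_trace)));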
// Above code is infallible, so we guarantee to switch the trace off when done
timeline.page_trace.store(Arc::new(None));
nit: we could also stream to the client, and cancel if the client goes away.
pub(crate) fn new(
    size_limit: u64,
) -> (Self, tokio::sync::mpsc::UnboundedReceiver<PageTraceEvent>) {
    let (trace_tx, trace_rx) = tokio::sync::mpsc::unbounded_channel();
nit: we could also use a buffered channel with the max size here, to avoid the size accounting.
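A rough sketch of that alternative; deriving the capacity from a precomputed event_size is my assumption, not something in the diff:

// A bounded channel sized in events rather than bytes: once it is full, try_send()
// on the producer side fails and the event is dropped, so the reader no longer
// needs to do its own byte accounting on the serialized buffer.
let max_events = (size_limit / event_size).max(1) as usize;
let (trace_tx, trace_rx) = tokio::sync::mpsc::channel::<PageTraceEvent>(max_events);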
7227 tests run: 6875 passed, 0 failed, 352 skipped (full report)
Code coverage* (full report)
* collected from Rust tests only
The comment gets automatically updated with the latest test results.
6bf00a6 at 2025-01-07T15:04:52.449Z
Neat!
I think this is safe to deploy, barring the check_permission problem. Nits can be addressed in a follow-up.
async fn timeline_page_trace_handler(
    request: Request<Body>,
    _cancel: CancellationToken,
) -> Result<Response<Body>, ApiError> {
check_permission is missing
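A sketch of the missing check, assuming this handler follows the same pattern as the other timeline handlers (the exact parse_request_param/check_permission call shape is an assumption, not copied from the diff):

let tenant_shard_id: TenantShardId = parse_request_param(&request, "tenant_shard_id")?;
// Reject tokens that are not scoped to this tenant.
check_permission(&request, Some(tenant_shard_id.tenant_id))?;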
let size_limit =
    parse_query_param::<_, u64>(&request, "size_limit_bytes")?.unwrap_or(1024 * 1024);
let time_limit_secs = parse_query_param::<_, u64>(&request, "time_limit_secs")?.unwrap_or(5);
nit: Why not parse a humantime::Duration?
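For illustration, a minimal sketch of parsing such a value with humantime (whether parse_query_param's trait bounds would accept humantime::Duration directly is an assumption):

fn parse_time_limit(raw: &str) -> Result<std::time::Duration, humantime::DurationError> {
    // Accepts human-friendly values like "5s", "500ms", or "2m30s".
    Ok(raw.parse::<humantime::Duration>()?.into())
}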
loop {
    let timeout = deadline.saturating_duration_since(Instant::now());
    tokio::select! {
        event = trace_rx.recv() => {
            buffer.extend(bincode::serialize(&event).unwrap());

            if buffer.len() >= size_limit as usize {
                // Size threshold reached
                break;
            }
        }
        _ = tokio::time::sleep(timeout) => {
            // Time threshold reached
            break;
        }
    }
}
nit: instead of doing a repeated select!(), I think it's better style to declare one async block that does the loop { trace_rx.recv().await; }, then poll that block inside a timeout. Roughly like so:

tokio::time::timeout(time_limit_secs, async {
    loop {
        let event = trace_rx.recv().await;
        ...
    }
}).await;
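A slightly fuller sketch of that shape; buffer, trace_rx, size_limit and time_limit_secs are assumed to be in scope from the surrounding handler:

let _ = tokio::time::timeout(std::time::Duration::from_secs(time_limit_secs), async {
    // recv() returning None means the sender was dropped; stop collecting either way.
    while let Some(event) = trace_rx.recv().await {
        buffer.extend(bincode::serialize(&event).unwrap());
        if buffer.len() >= size_limit as usize {
            break; // Size threshold reached
        }
    }
})
.await; // An Err(Elapsed) here simply means the time limit was hit first.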
event = trace_rx.recv() => {
    buffer.extend(bincode::serialize(&event).unwrap());
I first thought event is always Some(..), but it isn't if this handler is called concurrently on the same timeline.
We should
- only write the Some(..) value to the buffer, and
- bail out of the loop as soon as recv() returns None
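A sketch of what that could look like inside the existing select! arm (structure assumed from the diff above):

event = trace_rx.recv() => {
    match event {
        Some(event) => {
            // Only serialize the event itself, not the Option wrapper.
            buffer.extend(bincode::serialize(&event).unwrap());
            if buffer.len() >= size_limit as usize {
                break; // Size threshold reached
            }
        }
        // The sender was dropped (e.g. a concurrent trace replaced this one); stop collecting.
        None => break,
    }
}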
This is going to busyloop if the timeline is dropped, but seems fine to deploy temporarily for now.
let event_size = bincode::serialized_size(&PageTraceEvent {
    key: (0 as i128).into(),
    effective_lsn: Lsn(0),
    time: SystemTime::now(),
})?;
You're deserializing PageTraceEvent here, but we need to deserialize Option<PageTraceEvent> with the current impl.
I think we're getting away with it because of the event_size + 1 below. But yeah, we have to decode the actual bytes as Option for now to get the proper values.
        return Err(e.into());
    }
}
let event = bincode::deserialize::<PageTraceEvent>(&event_bytes)?;
Same here.
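As an interim workaround on the decode side (a sketch only, inside the existing read loop; the follow-up may instead change what the writer serializes), the buffer could be decoded as the Option it currently contains:

let Some(event) = bincode::deserialize::<Option<PageTraceEvent>>(&event_bytes)? else {
    // A None entry means the sender was dropped mid-trace; nothing more to read.
    break;
};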
Problem
When a pageserver is receiving high rates of requests, we don't have a good way to efficiently discover what the client's access pattern is.
Closes: #10275
Summary of changes
Adds a /v1/tenant/x/timeline/y/page_trace?size_limit_bytes=...&time_limit_secs=... API, which returns a binary buffer. A tool to decode and report on the output will follow separately.