Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

otel: fix segfaults when otel not configured #1531

Merged
merged 2 commits into from
Jan 8, 2025

Conversation

avahahn
Copy link
Contributor

@avahahn avahahn commented Jan 7, 2025

Fixes #1523 and #1526 by adding null checks to cover unconfigured OpenTelemetry cases.

Both tests and traffic were tested on both x86_64 Linux and arm64 Linux.

@avahahn avahahn requested review from callahad and ac000 January 7, 2025 18:34
Copy link
Member

@ac000 ac000 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

otel: fix segfaults when otel not configured

This commit adds null checks for the request->otel object that
were missed in the Traceparent and Tracestate routines. This
fix solves issues 1526 and 1523

Signed-off-by: Ava Hahn <[email protected]>

s/null/NULL/

This could also do with some closes and a fixes tag i.e.

...

Closes: https://github.com/nginx/unit/issues/1523
Closes: https://github.com/nginx/unit/issues/1526
Fixes: 9d3dcb800 ("otel: add build tooling to include otel code")
Signed-off-by: Ava Hahn <[email protected]>

and then remove the The fix solves... line.

Did you audit the other functions that assume r->otel is not NULL? E.g.

  • nxt_otel_request_error_path()
  • nxt_otel_parse_traceparent()
  • nxt_otel_parse_tracestate()

Also taking a closer look at nxt_otel_request_error_path()

void                                                                            
nxt_otel_request_error_path(nxt_task_t *task, nxt_http_request_t *r)            
{                                                                               
    if (r->otel->trace == NULL) {                                               
        return;                                                                 
    }                                                                           
                                                                                
    // response headers have been cleared                                       
    nxt_otel_propagate_header(task, r);                                         
                                                                                
    // collect span immediately                                                 
    if (r->otel) {                                                              
        nxt_otel_state_transition(r->otel, NXT_OTEL_COLLECT_STATE);             
    }                                                                           
    nxt_otel_test_and_call_state(task, r);                                      
} 

The if (r->otel) check (aside from the fact it should do an explicit NULL check), looks superfluous.

We've already checked r->otel->trace == NULL, so assuming we didn't crash, then we know r->otel isn't NULL. But maybe we should be checking that at the top of the function?

For the avoidance of any doubt. If it's superfluous it should be removed in a separate commit, OTOH if we should be checking r->otel == NULL at the top of the function then just do it in this commit...

While we're at it and whatever the fix here is it should be done as a separate commit (and then we can decide if it warrants going into 1.34.1), in nxt_otel_propagate_header() we have

if (r->otel->trace_id != NULL) {
...
} else if (r->otel->trace_id == NULL) {
...
} else {
...
}

->trace_id can only be NULL or !NULL, you see where this is going...

src/nxt_otel.c Outdated Show resolved Hide resolved
@avahahn
Copy link
Contributor Author

avahahn commented Jan 7, 2025

Did you audit the other functions that assume r->otel is not NULL? E.g.

nxt_otel_request_error_path()
nxt_otel_parse_traceparent()
nxt_otel_parse_tracestate()

...

OTOH if we should be checking r->otel == NULL at the top of the function then just do it in this commit...

I am not sure what commit you are looking at but that is what this commit does. Those three specific functions and your OTOH comment are all directly addressed in the patch that github shows. Did you end up with something wildly different somehow?

This commit adds NULL checks for the request->otel object that
were missed in the Traceparent and Tracestate routines.

Closes: nginx#1523
Closes: nginx#1526
Fixes: 9d3dcb8 ("otel: add build tooling to include otel code")
Signed-off-by: Ava Hahn <[email protected]>
The superfluous else condition in nxt_otel_propagate_header was dead code.
This commit removes it.

Signed-off-by: Ava Hahn <[email protected]>
@ac000
Copy link
Member

ac000 commented Jan 8, 2025

No, you're right (in my mind I'd applied the patch locally)

Looks good then...

@avahahn avahahn merged commit 753e298 into nginx:master Jan 8, 2025
25 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Segfault when receiving the traceparent header and otel is not configured
2 participants