Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[dotnet] Serialize command parameters directly to UTF-8 #15151

Closed

Conversation

RenderMichael
Copy link
Contributor

@RenderMichael RenderMichael commented Jan 24, 2025

User description

Thanks for contributing to Selenium!
A PR well described will help maintainers to quickly review and merge it

Before submitting your PR, please check our contributing guidelines.
Avoid large PRs, help reviewers by making them as simple and short as possible.

Description

Serialize command parameters directly into UTF-8 bytes, and pass the array around.

Motivation and Context

System.Text.Json operates natively in UTF-8, and it is much more efficient to serialize to a byte[] rather than materializing a string.

The HTTP request contents are in the form of a byte[] anyway, so we are avoiding a lot of back and forth between formats here.

Old Flow: Dictionary<string, object> -> UTF-8 inside STJ -> string returned by STJ -> Encoded to byte[]
New Flow: Dictionary<string, object> -> UTF-8 returned by STJ

Types of changes

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to change)

Checklist

  • I have read the contributing document.
  • My change requires a change to the documentation.
  • I have updated the documentation accordingly.
  • I have added tests to cover my changes.
  • All new and existing tests passed.

PR Type

Enhancement, Tests


Description

  • Introduced UTF-8 byte serialization for command parameters.

  • Updated HTTP request handling to use byte arrays directly.

  • Enhanced event arguments to support UTF-8 request bodies.

  • Added unit tests to validate new serialization logic.


Changes walkthrough 📝

Relevant files
Enhancement
Command.cs
Add UTF-8 serialization for command parameters                     

dotnet/src/webdriver/Command.cs

  • Added GetParametersAsUtf8Bytes method for UTF-8 serialization.
  • Updated ToString method for improved parameter representation.
  • +17/-1   
    HttpCommandExecutor.cs
    Use byte arrays for HTTP request bodies                                   

    dotnet/src/webdriver/Remote/HttpCommandExecutor.cs

  • Replaced string-based request body with byte array.
  • Updated HTTP request content handling to use byte arrays.
  • +3/-3     
    SendingRemoteHttpRequestEventArgs.cs
    Support UTF-8 byte arrays in event arguments                         

    dotnet/src/webdriver/Remote/SendingRemoteHttpRequestEventArgs.cs

  • Added support for UTF-8 byte arrays in request body.
  • Enhanced RequestBody property to handle string and byte arrays.
  • +30/-2   
    Tests
    CommandTests.cs
    Add tests for UTF-8 serialization of parameters                   

    dotnet/test/common/CommandTests.cs

  • Added tests for null and empty parameter serialization.
  • Updated existing tests to validate UTF-8 byte serialization.
  • +17/-1   

    Need help?
  • Type /help how to ... in the comments thread for any questions about Qodo Merge usage.
  • Check out the documentation for more information.
  • Copy link
    Contributor

    PR Reviewer Guide 🔍

    Here are some key observations to aid the review process:

    ⏱️ Estimated effort to review: 2 🔵🔵⚪⚪⚪
    🧪 PR contains tests
    🔒 No security concerns identified
    ⚡ Recommended focus areas for review

    Memory Allocation

    The empty JSON object "{}" is allocated as a new array on each call to GetParametersAsUtf8Bytes() for null/empty parameters. Consider caching this as a static readonly field to avoid repeated allocations.

    return "{}"u8.ToArray();
    Potential Race Condition

    The lazy initialization of _requestBody field is not thread-safe. Multiple threads accessing the RequestBody property could result in the string being decoded multiple times.

        if (_requestBody is not null)
        {
            return _requestBody;
        }
    
        return _requestBody = _utf8RequestBody is null ? null : Encoding.UTF8.GetString(_utf8RequestBody);
    }

    Copy link
    Contributor

    qodo-merge-pro bot commented Jan 24, 2025

    PR Code Suggestions ✨

    Explore these optional code suggestions:

    CategorySuggestion                                                                                                                                    Score
    Possible issue
    Prevent null reference exception

    Add null check for SessionId in ToString() method to prevent potential
    NullReferenceException when SessionId is null.

    dotnet/src/webdriver/Command.cs [126]

    -return $"[{this.SessionId}]: {this.Name} Parameters: {this.ParametersAsJsonString}";
    +return $"[{this.SessionId?.ToString() ?? "null"}]: {this.Name} Parameters: {this.ParametersAsJsonString}";
    • Apply this suggestion
    Suggestion importance[1-10]: 8

    Why: The suggestion addresses a potential NullReferenceException that could occur in production when SessionId is null, making the code more robust and preventing runtime crashes.

    8
    General
    Improve performance through caching

    Cache the UTF-8 bytes result in GetParametersAsUtf8Bytes() to avoid repeated
    serialization of the same parameters.

    dotnet/src/webdriver/Command.cs [108-118]

    +private byte[]? _cachedUtf8Bytes;
     public byte[] GetParametersAsUtf8Bytes()
     {
    -    if (this.Parameters != null && this.Parameters.Count > 0)
    +    if (_cachedUtf8Bytes == null)
         {
    -        return JsonSerializer.SerializeToUtf8Bytes(this.Parameters, s_jsonSerializerOptions);
    +        _cachedUtf8Bytes = this.Parameters != null && this.Parameters.Count > 0
    +            ? JsonSerializer.SerializeToUtf8Bytes(this.Parameters, s_jsonSerializerOptions)
    +            : "{}"u8.ToArray();
         }
    -    else
    -    {
    -        return "{}"u8.ToArray();
    -    }
    +    return _cachedUtf8Bytes;
     }
    • Apply this suggestion
    Suggestion importance[1-10]: 6

    Why: The suggestion introduces caching for serialized UTF-8 bytes, which can improve performance by avoiding repeated serialization operations, especially useful when the same command is executed multiple times.

    6

    @@ -42,7 +45,21 @@ public SendingRemoteHttpRequestEventArgs(string method, string fullUrl, string?
    {
    this.Method = method ?? throw new ArgumentNullException(nameof(method));
    this.FullUrl = fullUrl ?? throw new ArgumentNullException(nameof(fullUrl));
    this.RequestBody = requestBody;
    _utf8RequestBody = requestBody is null ? null : Encoding.UTF8.GetBytes(requestBody);
    Copy link
    Contributor Author

    Choose a reason for hiding this comment

    The reason will be displayed to describe this comment to others. Learn more.

    Old constructor kept around for backwards compatibility, but unused by our code.

    @@ -58,7 +75,18 @@ public SendingRemoteHttpRequestEventArgs(string method, string fullUrl, string?
    /// <summary>
    /// Gets the body of the HTTP request as a string.
    /// </summary>
    public string? RequestBody { get; }
    public string? RequestBody
    Copy link
    Contributor Author

    Choose a reason for hiding this comment

    The reason will be displayed to describe this comment to others. Learn more.

    This event args seems to be an obscure mechanism to add headers to requests. I don't know how many people are accessing the SendingRemoteHttpRequestEventArgs.RequestBody property in the world.

    For this reason, I feel that the added complexity is warranted: We will not decode our data into a string unless the user requests it (and we cache it here so users do not have to worry about it).

    /// <summary>
    /// Returns a string of the Command object
    /// </summary>
    /// <returns>A string representation of the Command Object</returns>
    public override string ToString()
    {
    return string.Concat("[", this.SessionId, "]: ", this.Name, " ", this.ParametersAsJsonString);
    return $"[{this.SessionId}]: {this.Name} {this.ParametersAsJsonString}";
    Copy link
    Contributor Author

    Choose a reason for hiding this comment

    The reason will be displayed to describe this comment to others. Learn more.

    Sidenote: ParametersAsJsonString does JSON serialization, here inside the ToString() method. I missed this as part of #14741. Do we want to change this? (Follow-up PR either way)

    @nvborisenko
    Copy link
    Member

    This is interesting. I propose to create simple benchmark to measure how much memory is allocated before/after.

    Let's say user wants to get WebElement.Displayed property - most memory consuming command.

    /// Serializes the parameters of the comand to JSON as UTF-8 bytes.
    /// </summary>
    /// <returns></returns>
    public byte[] GetParametersAsUtf8Bytes()
    Copy link
    Member

    Choose a reason for hiding this comment

    The reason will be displayed to describe this comment to others. Learn more.

    Command is an abstraction. Command to bytes... we should mention json in method name. Like GetParametersAsJsoonUtf8Bytes()

    @RenderMichael
    Copy link
    Contributor Author

    Per offline discussion, this change is not worth the added complexity and maintenance burden. Closing this PR.

    @RenderMichael RenderMichael deleted the command-serialize branch January 25, 2025 05:37
    Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
    Projects
    None yet
    Development

    Successfully merging this pull request may close these issues.

    2 participants