Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

drivers/iovec: revert vector io implement from loop/null/zero driver #15629

Merged
merged 1 commit into from
Jan 22, 2025

Conversation

anchao
Copy link
Contributor

@anchao anchao commented Jan 21, 2025

Summary

drivers/iovec: revert vector io implement from loop/null/zero driver

basic driver does not need such complex wrapper

Signed-off-by: chao an [email protected]

Impact

improve the performance

Testing

test code on ubuntu 24.04 x86_64

static void timespec_sub(struct timespec *dest,
                         struct timespec *ts1,
                         struct timespec *ts2)
{
  dest->tv_sec = ts1->tv_sec - ts2->tv_sec;
  dest->tv_nsec = ts1->tv_nsec - ts2->tv_nsec;

  if (dest->tv_nsec < 0)
    {
      dest->tv_nsec += 1000000000;
      dest->tv_sec -= 1;
    }
}


int main(int argc, FAR char *argv[])
{
  struct timespec result;
  struct timespec start;
  struct timespec end;
  int fd = open("/dev/zero", O_RDONLY);
  int loop = 0;
  int i = 0;
  char c;

  while (loop++ < 10)
    {
      i = 0;
      clock_gettime(CLOCK_MONOTONIC, &start);
      while (i++ < 100000)
        {
          read(fd, &c, 1);
        }
      clock_gettime(CLOCK_MONOTONIC, &end);

      timespec_sub(&result, &end, &start);
      printf("%d:spending %d.%lds\n", loop, result.tv_sec, result.tv_nsec);
    }

  close(fd);
  return 0;
}

before:

nsh> hello
1:spending 0.39044654s
2:spending 0.28245015s
3:spending 0.28215794s
4:spending 0.28436127s
5:spending 0.28330915s
6:spending 0.28564837s
7:spending 0.28116804s
8:spending 0.27876377s
9:spending 0.28266779s
10:spending 0.28328176s
nsh>

this PR + #15603

nsh> hello
1:spending 0.30323709s
2:spending 0.22735038s
3:spending 0.22735801s
4:spending 0.22715055s
5:spending 0.22756127s
6:spending 0.22938592s
7:spending 0.22860041s
8:spending 0.22718114s
9:spending 0.22785607s
10:spending 0.22676567s
nsh> poweroff

basic driver does not need such complex wrapper

Signed-off-by: chao an <[email protected]>
@github-actions github-actions bot added Area: Drivers Drivers issues Size: M The size of the change in this PR is medium labels Jan 21, 2025
@anchao anchao requested a review from yamt January 21, 2025 12:26
@nuttxpr
Copy link

nuttxpr commented Jan 21, 2025

[Experimental Bot, please feedback here]

This PR does not fully meet the NuttX requirements. Here's a breakdown of the missing information:

Summary:

  • Insufficient Detail: While it states the "what" (reverting vector IO), it lacks the crucial "why." Why is the complex wrapper unnecessary for basic drivers? What problems did it cause? This is the most important part of the summary.
  • Missing Scope: What specific drivers are affected? Only loop, null, and zero? Explicitly listing them is better.
  • Lack of Explanation: How does reverting this change improve performance? Be specific. Is it reducing function call overhead? Memory usage? Explain the mechanism.

Impact:

  • Too Vague: "Improve the performance" is insufficient. Quantify the improvement. Is it 10%? 50%? Under what conditions? Which benchmarks were used?
  • Missing Details:
    • Impact on user: Even if the answer is NO, explicitly state it.
    • Impact on build: Even if the answer is NO, explicitly state it.
    • Impact on hardware: Even if the answer is NO, explicitly state it.
    • Impact on documentation: Does the removed code need to be removed from documentation? If not, state NO.
    • Impact on security: Even if the answer is NO, explicitly state it.
    • Impact on compatibility: Even if the answer is NO, explicitly state it.

Testing:

  • Insufficient Host Detail: ubuntu 24.04 x86_64 is okay, but specifying the compiler version is recommended.
  • Missing Target Details: Which architecture and board configuration was this tested on? sim:qemu-x86_64? Something else? This is critical.
  • Unclear Baseline: "before" is ambiguous. Before what? The previous commit? A specific version?
  • Dependency on Another PR: Testing should be self-contained. Relying on another PR makes review difficult. Merge the other PR first or explain precisely what changes from that PR are required for this test to be valid.
  • Test Case Weakness: Reading from /dev/zero isn't a very representative test for driver performance in general. Consider a test case that involves actual I/O with a real device or a more complex simulated device if possible. The current test primarily measures the overhead of the read() system call itself.
  • Lack of Before/After Comparison in Same Run: Running the test 10 times before and 10 times after separately introduces variability. A better approach is to run the benchmark multiple times in the same test run, both with and without the change, to minimize external factors.

Example of an Improved PR Description:

## Summary

This PR reverts the vector IO implementation from the `loop`, `null`, and `zero` drivers. The original vector IO wrapper added unnecessary overhead for these basic drivers, which primarily perform simple data transfers.  Profiling revealed that the function call overhead and data copying within the vector IO codepath significantly impacted performance, particularly for small I/O operations.  This revert simplifies the code and improves performance by directly accessing the underlying data buffers.

Fixes: #[Issue Number (if applicable)]

## Impact

* Is new feature added? NO
* Is existing feature changed? YES. Reverts a previously added feature (vector IO) for the `loop`, `null`, and `zero` drivers.
* Impact on user: NO
* Impact on build: NO
* Impact on hardware: NO
* Impact on documentation: NO
* Impact on security: NO
* Impact on compatibility:  NO
* Anything else to consider? This change improves the performance of the affected drivers by approximately 20-25% based on the provided benchmark (reading 100,000 bytes).

## Testing

I confirm that changes are verified on local setup and works as intended:
* Build Host(s): Ubuntu 24.04 x86_64, GCC 12.2.0
* Target(s): sim:qemu-x86_64

Testing Methodology:

A benchmark was created to measure the time taken to perform 100,000 single-byte reads from `/dev/zero`.  The benchmark was run 10 times before and after applying the patch within the same test run to minimize external factors.

Testing logs before change (commit hash: [commit hash]):

[Benchmark results before change]


Testing logs after change:

[Benchmark results after change]

By addressing these points, the PR will be much clearer, easier to review, and more likely to be accepted. Remember to quantify the performance improvement whenever possible.

@xiaoxiang781216 xiaoxiang781216 merged commit 0ffb0b7 into apache:master Jan 22, 2025
39 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Area: Drivers Drivers issues Size: M The size of the change in this PR is medium
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants