Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feedback #93

Open
zivkan opened this issue Jan 14, 2025 · 0 comments
Open

feedback #93

zivkan opened this issue Jan 14, 2025 · 0 comments

Comments

@zivkan
Copy link
Member

zivkan commented Jan 14, 2025

I thought I'd give some feedback from the point of view of a package consumer

lack of non-async APIs

Chet from the .NET SDK team created a PR on the NuGet.Client repo to use this library for dotnet list package and dotnet nuget why

But since MSBuild's API was not async, and this package only provides an async API, it required numerous modifications to make the top-to-bottom stack async. But async doesn't help this use-case. It's a console app, not a gui app, so there is no UI thread to unblock for responsiveness. And it's not doing a lot of concurrent work, so preventing the OS from needing to context switch a bunch of blocked threads isn't necessary either.

Anyway, the PR on the NuGet.Client repo is currently failing a test in a command that wasn't even modified by the PR, and the failure sounds like it might be related to making additional methods async. In other words, migrating NuGet's dotnet CLI commands to support slnx is being slowed down by needing to make code async because of this package's API design choices.

Also, I had a look through the code, and it appears that files are being opened via the new StreamReader constructor, which doesn't allow you to pass it FileOptions. Ben Watson's book titled "Writing High Performance .NET Code" says that using FileOptions.Asyncronous is not necessary, but if you don't specify it, while the .NET code will be async, the underlying I/O will not use I/O completion ports and will therefore complete synchronously. So, there might be an opportunity for improvement (at least on .NET Framework, I'm not sure how much it matters on .NET)

public API design

Maybe I'm just an old man, but it surprised me that as a package consumer, it's necessary to have 3 using statements to read a sln file, which I think is the most basic usage of this library.

performance

I wrote a little BenchmarkDotNet app to compare reading NuGet.Client's NuGet.sln with both this package, and MSBuild's SolutionFile API

code
using BenchmarkDotNet.Attributes;
using BenchmarkDotNet.Running;
using Microsoft.Build.Construction;
using Microsoft.VisualStudio.SolutionPersistence;
using Microsoft.VisualStudio.SolutionPersistence.Model;
using Microsoft.VisualStudio.SolutionPersistence.Serializer;

[MemoryDiagnoser]
public class Program
{
    private static void Main(string[] args)
    {
        BenchmarkRunner.Run<Program>();
    }

    private string _path;

    [GlobalSetup]
    public void GlobalSetup()
    {
        _path = Environment.GetEnvironmentVariable("SLN_FILE").Trim();
        if (string.IsNullOrEmpty(_path))
        {
            throw new Exception("SLN_FILE environment variable is not set.");
        }

        if (!File.Exists(_path))
        {
            throw new Exception($"File {_path} does not exist.");
        }
    }

    [Benchmark]
    public async Task<int> Package()
    {
        ISolutionSerializer? serializer = SolutionSerializers.GetSerializerByMoniker(_path);
        if (serializer is null)
        {
            throw new Exception($"Serializer for file {_path} is not found.");
        }

        SolutionModel solution = await serializer.OpenAsync(_path, CancellationToken.None);
        return solution.SolutionProjects.Count;
    }

    [Benchmark(Baseline = true)]
    public int MSBuild()
    {
        var solution = SolutionFile.Parse(_path);
        return solution.ProjectsInOrder.Count;
    }
}

The results on .NET Framework on my machine are:

Method Mean Error StdDev Ratio RatioSD Gen0 Gen1 Allocated Alloc Ratio
Package 2.040 ms 0.0403 ms 0.0509 ms 1.59 0.06 162.1094 1.9531 1001.16 KB 0.79
MSBuild 1.286 ms 0.0256 ms 0.0390 ms 1.00 0.04 205.0781 1.9531 1264.19 KB 1.00

So, memory allocations are down about 30% (well done!), but wall clock time is about 60% worse.

I had a hypothesis that maybe it's related to needing to schedule IO completion tasks and/or GC due to additional async state machine allocations, and the reduced memory usage makes the GC guess irrelevant. So, I cloned the repo and deleted every async and await keyword, and made whatever other changes needed to make it build. However, the perf didn't change in any meaningful way.

I haven't (yet?) run this repo's code under a perf profiler, but since the MSBuild API is 60% faster, I think there's significant potential for perf improvements.

On one hand 2ms isn't long, I certainly won't notice it with my human reflexes being 800us slower than MSBuild's API. But in relative terms that is a significant perf drop, and learning what specifically caused the current implementation to be much slower than MSBuild's API will help anyone who learns the root cause to write better code in the future (please keep me in the loop if you investigate, I'd love to learn what caused it)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

1 participant