Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

It may be better to review the benchmark data #28

Open
goccy opened this issue Apr 7, 2021 · 1 comment
Open

It may be better to review the benchmark data #28

goccy opened this issue Apr 7, 2021 · 1 comment

Comments

@goccy
Copy link

goccy commented Apr 7, 2021

When I read the benchmark code of json-iterator/go in the benchmark of BenchmarkGetRepoValues,
I noticed that it uses the feature that can stop the decoding process in the middle.
( https://github.com/WillAbides/rjson/blob/main/benchmarks/jsoniter.go#L100-L102 )

Of course, it is correct to use the specific feature to improve performance, but this optimization is highly dependent on the input data, so for example, if the key name of full_name is the end of the input data, benchmark results so different.

Therefore, when measuring using the best case, it is better to consider the worst case as well, or to use a benchmark that does not depend on the input data.

BEST

If archived , full_name , forks are in the top of input data .

$ go test -bench BenchmarkGetRepoValues/jsoniter
goos: darwin
goarch: amd64
pkg: github.com/willabides/rjson/benchmarks
BenchmarkGetRepoValues/jsoniter-16               5982798               192 ns/op              48 B/op          4 allocs/op
PASS
ok      github.com/willabides/rjson/benchmarks  1.655s

WORST

If full_name is in the end of input data .

$ go test -bench BenchmarkGetRepoValues/jsoniter
goos: darwin
goarch: amd64
pkg: github.com/willabides/rjson/benchmarks
BenchmarkGetRepoValues/jsoniter-16                117582              9576 ns/op            1680 B/op        123 allocs/op
PASS
ok      github.com/willabides/rjson/benchmarks  2.846s
@WillAbides
Copy link
Owner

I know what you mean, and it's not just jsoniter. I exited early in all the parsers where that was available, so rjson and jsonparser are also exiting early.

I did the test that way because that is a feature I use frequently for work. We need to get just a few items from each document we parse, so we only read far enough into the document to find what we need then discard it.

It is a bit arbitrary that the last element is where it is. Maybe we should add a second version of this benchmark that also gets "watchers_count" at the end of the document.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants