It may be better to review the benchmark data #28

goccy · 2021-04-07T13:36:29Z

While reading the json-iterator/go code used in BenchmarkGetRepoValues,
I noticed that it relies on the ability to stop decoding partway through the input.
( https://github.com/WillAbides/rjson/blob/main/benchmarks/jsoniter.go#L100-L102 )

Of course, it is legitimate to use a library-specific feature to improve performance, but this optimization depends heavily on the input data. For example, if the full_name key were at the end of the input data, the benchmark results would be very different.

Therefore, when measuring the best case, it would be better to measure the worst case as well, or to use a benchmark that does not depend on the input data.

BEST

When archived, full_name, and forks are at the top of the input data:

$ go test -bench BenchmarkGetRepoValues/jsoniter
goos: darwin
goarch: amd64
pkg: github.com/willabides/rjson/benchmarks
BenchmarkGetRepoValues/jsoniter-16               5982798               192 ns/op              48 B/op          4 allocs/op
PASS
ok      github.com/willabides/rjson/benchmarks  1.655s

WORST

When full_name is at the end of the input data:

$ go test -bench BenchmarkGetRepoValues/jsoniter
goos: darwin
goarch: amd64
pkg: github.com/willabides/rjson/benchmarks
BenchmarkGetRepoValues/jsoniter-16                117582              9576 ns/op            1680 B/op        123 allocs/op
PASS
ok      github.com/willabides/rjson/benchmarks  2.846s
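To illustrate why the position of a key matters so much with early exit, here is a minimal sketch that uses the standard library's encoding/json streaming decoder as a stand-in for jsoniter. The function name findString and the two sample documents are hypothetical and are not taken from the rjson benchmarks. The function stops as soon as it finds the requested key and reports how many keys it had to examine.

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
)

// findString (hypothetical helper) scans the top-level object in doc for key
// and stops as soon as it is found. It returns the string value and the
// number of top-level keys examined before returning.
func findString(doc []byte, key string) (string, int, error) {
	dec := json.NewDecoder(bytes.NewReader(doc))
	if _, err := dec.Token(); err != nil { // consume the opening '{'
		return "", 0, err
	}
	seen := 0
	for dec.More() {
		tok, err := dec.Token() // next object key
		if err != nil {
			return "", seen, err
		}
		seen++
		if name, ok := tok.(string); ok && name == key {
			var v string
			if err := dec.Decode(&v); err != nil {
				return "", seen, err
			}
			return v, seen, nil // early exit: the rest of doc is never decoded
		}
		// Not the key we want: decode and discard its value.
		var skip json.RawMessage
		if err := dec.Decode(&skip); err != nil {
			return "", seen, err
		}
	}
	return "", seen, fmt.Errorf("key %q not found", key)
}

func main() {
	// Same fields, different order (made-up sample data).
	best := []byte(`{"full_name":"a/b","id":1,"forks":2,"archived":false}`)
	worst := []byte(`{"id":1,"forks":2,"archived":false,"full_name":"a/b"}`)

	for _, doc := range [][]byte{best, worst} {
		v, seen, err := findString(doc, "full_name")
		fmt.Printf("value=%q keys examined=%d err=%v\n", v, seen, err)
	}
}
```

With the key first, only one key is examined; with the key last, every preceding value has to be decoded and discarded, which is the same effect that separates the BEST and WORST runs.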

WillAbides · 2021-04-07T15:50:18Z

I know what you mean, and it's not just jsoniter. I exited early in all the parsers where that was available, so rjson and jsonparser are also exiting early.

I did the test that way because that is a feature I use frequently for work. We need to get just a few items from each document we parse, so we only read far enough into the document to find what we need then discard it.

It is a bit arbitrary that the last element is where it is. Maybe we should add a second version of this benchmark that also gets "watchers_count" at the end of the document.
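A rough sketch of what such a paired benchmark might look like, using only the standard library rather than the actual rjson benchmark harness. The names getValues and sampleDoc, the filler fields, and the case labels are all assumptions made for illustration. The second case adds "watchers_count", placed last in the document, so the early exit cannot happen before the whole input has been read.

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
	"testing"
)

// getValues (hypothetical helper) reads the top-level object in doc and
// returns the raw values of the wanted keys, stopping once all are found.
func getValues(doc []byte, wanted ...string) (map[string]json.RawMessage, error) {
	need := make(map[string]bool, len(wanted))
	for _, k := range wanted {
		need[k] = true
	}
	out := make(map[string]json.RawMessage, len(need))
	dec := json.NewDecoder(bytes.NewReader(doc))
	if _, err := dec.Token(); err != nil { // consume the opening '{'
		return nil, err
	}
	for dec.More() && len(out) < len(need) {
		tok, err := dec.Token() // next object key
		if err != nil {
			return nil, err
		}
		var raw json.RawMessage
		if err := dec.Decode(&raw); err != nil {
			return nil, err
		}
		if name, ok := tok.(string); ok && need[name] {
			out[name] = raw
		}
	}
	return out, nil
}

// sampleDoc builds a made-up document: three wanted keys at the top,
// filler fields in the middle, and "watchers_count" at the very end.
func sampleDoc() []byte {
	var b bytes.Buffer
	b.WriteString(`{"archived":false,"full_name":"a/b","forks":3`)
	for i := 0; i < 50; i++ {
		fmt.Fprintf(&b, `,"filler_%d":"value"`, i)
	}
	b.WriteString(`,"watchers_count":7}`)
	return b.Bytes()
}

func main() {
	testing.Init() // register testing flags when running outside "go test"
	doc := sampleDoc()
	cases := []struct {
		name string
		keys []string
	}{
		{"early_keys", []string{"archived", "full_name", "forks"}},
		{"with_last_key", []string{"archived", "full_name", "forks", "watchers_count"}},
	}
	for _, c := range cases {
		res := testing.Benchmark(func(b *testing.B) {
			b.ReportAllocs()
			for i := 0; i < b.N; i++ {
				if _, err := getValues(doc, c.keys...); err != nil {
					b.Fatal(err)
				}
			}
		})
		fmt.Printf("%-14s %s %s\n", c.name, res.String(), res.MemString())
	}
}
```

Reporting both cases side by side keeps the early-exit use case visible while also showing what each parser costs when the whole document has to be read.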
