Make run output the raw numbers #2
Well, I tried... main...mindplay-dk:iso-bench:collect-test-results

I tested this in the context of the benchmark, and it did work - but it takes way longer than the specified 2 seconds to actually measure 2 seconds' worth of runs. So this is somehow adding a lot of runtime overhead. I'm not sure why - but I honestly don't fully understand this code. 🤔
@mindplay-dk that's because you changed the `minTime`. The `minTime` applies to each individual sample, not to the benchmark as a whole. What you want is to run multiple small samples and get the average, so if you want 2 seconds, just set `maxTime: 2`. In short:

```ts
/**
 * The maximum time a benchmark is allowed to run before finishing (secs).
 *
 * Note: Cycle delays aren't counted toward the maximum time.
 *
 * @default 5
 */
maxTime: number
/**
 * The time needed to reduce the percent uncertainty of measurement to 1% (secs).
 *
 * @default 0
 */
minTime: number
```

benny will increase the cycles of each sample until they reach the `minTime`.

PS: I'm gonna clean up the code. It was made yesterday afternoon in a couple of hours, just to see that everything works as expected.
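To illustrate the distinction, here is a minimal sketch using benny-style options - `suite`, `add`, `cycle` and `complete` are from benny's documented API, but double-check which options your version actually accepts:

```ts
import { add, complete, cycle, suite } from "benny";

suite(
  "concat",

  // maxTime caps the *total* sampling time at ~2 seconds; leaving minTime
  // at its default (0) lets the library keep individual samples short.
  add("plus operator", () => "foo" + "bar", { maxTime: 2, minTime: 0 }),

  cycle(),
  complete()
);
```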
I see. Shouldn't `minTime` just have a sensible default, then? When is it useful to have control of this value?
On that note, I was confused by the "samples" value output in the console as well - why would you need to know how many cycles the framework used internally to collect the results?

I've been trying some different math on the collected values - not sure if it makes sense to include any or some of this in the library itself? You can see an example here:

For a benchmark like the one I'm trying to fix, probably the most reliable and meaningful value is the `min`.

Of course, the point of exposing the individual values was so you didn't need to own this code, but it might be nice if something a bit more immediately useful came out of it.

What do you think?
It had a 100ms default value, which worked well for all my tests. The best way to use IsoBench is to run it with the default values. Anyway, I published a new update with different options, with defaults that return more consistent results.
I don't know. benchmark.js shows it, so I did the same xD
I'm not sure what those results mean :-(
I'm not an algorithms guy, so I went with the mean, which is the one I can implement easily, and I think it's the one that benny and benchmark.js use. The thing is that they clean the results depending on the percentage difference between samples, and IsoBench still doesn't. That's something I will add in the future, because I think the logic they explain in the post is the one that can give the most stable results (a rough sketch of the idea follows this comment). The new release also returns the test array.

EDIT: Cleaned the code and added a Processor system, so benchmark output is easier to manage. This is in preparation to add events during the benchmarking process, and also to be able to output charts, tables, HTML and so on.
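For illustration, the sample cleaning described above might look roughly like this - an editor's sketch, not IsoBench's or benchmark.js's actual algorithm; `cleanedMean` and its threshold are made up:

```ts
// Hypothetical helper: discard samples whose deviation from the mean
// exceeds maxDeviationPct percent, then average what remains.
function cleanedMean(values: number[], maxDeviationPct = 10): number {
  const mean = values.reduce((sum, v) => sum + v, 0) / values.length;
  const kept = values.filter(
    (v) => (Math.abs(v - mean) / mean) * 100 <= maxDeviationPct
  );
  // Fall back to the raw mean if every sample was filtered out.
  return kept.length > 0
    ? kept.reduce((sum, v) => sum + v, 0) / kept.length
    : mean;
}
```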
ha, no, of course not, I didn't actually post the code! sorry 😅

well, the changes I had locally were only working with my branch that exposes the raw numbers, and your current version still doesn't do that. but I can dump my functions here, in case any of it is useful to you.

here's what I was printing out:

```ts
for (const { name, values } of tests) {
  console.log({
    name,
    min: minOf(values),
    mean: meanOf(values),
    typical: weightedMean(values, 10)
  })
}
```

and here's all the math behind that:

```ts
function weightedMean(values: number[], factor = 10) {
  const buckets = sortIntoBuckets(values, 100)
    .filter((bucket) => bucket.values.length > 0)
    .map((bucket) => ({
      ...bucket,
      weightedAverage: distanceWeightedMean(bucket.values, factor)
    }))
  let totalWeight = 0
  let weightedSum = 0
  for (const bucket of buckets) {
    const weight = Math.pow(bucket.values.length, factor)
    totalWeight += weight
    weightedSum += weight * bucket.weightedAverage
  }
  return weightedSum / totalWeight
}

function meanOf(values: number[]): number {
  let sum = 0
  for (const value of values) {
    sum += value
  }
  return sum / values.length
}

function minOf(values: number[]): number {
  let min = Number.POSITIVE_INFINITY
  for (const value of values) {
    if (value < min) {
      min = value
    }
  }
  return min
}

function maxOf(values: number[]): number {
  // start at -Infinity so this also works if values could be negative
  let max = Number.NEGATIVE_INFINITY
  for (const value of values) {
    if (value > max) {
      max = value
    }
  }
  return max
}

type Bucket = { min: number; max: number; values: number[] }

function sortIntoBuckets(values: number[], numBuckets: number): Bucket[] {
  const min = minOf(values)
  const max = maxOf(values)
  const range = max - min
  const buckets: Bucket[] = []
  for (let i = 0; i < numBuckets; i++) {
    buckets.push({
      min: min + (range * i) / numBuckets,
      max: min + (range * (i + 1)) / numBuckets,
      values: []
    })
  }
  if (range === 0) {
    // all values identical: avoid dividing by zero below
    buckets[0].values = [...values]
    return buckets
  }
  for (const value of values) {
    const index = Math.min(
      Math.floor(((value - min) / range) * numBuckets),
      numBuckets - 1 // note: max of bucket range is inclusive *only* for the last bucket
    )
    buckets[index].values.push(value)
  }
  return buckets
}

function distanceWeightedMean(values: number[], factor = 2): number {
  if (values.length === 0) {
    throw new Error('cannot compute weighted mean of an empty list')
  }
  if (values.length === 1) {
    return values[0]
  }
  if (values.length === 2) {
    return (values[0] + values[1]) * 0.5
  }
  let sum = 0
  for (const value of values) {
    sum += value
  }
  const avg = sum / values.length
  const error = (value: number) => Math.abs(avg - value)
  let deviation = 0
  for (const value of values) {
    deviation = Math.max(error(value), deviation)
  }
  if (deviation === 0) {
    return values[0]
  }
  let totalWeight = 0
  let weightedSum = 0
  for (const value of values) {
    const weight = Math.pow(1 - error(value) / deviation, factor)
    totalWeight += weight
    weightedSum += weight * value
  }
  return weightedSum / totalWeight
}
```

as said, the `min` is probably the most reliable value - but I guess, technically, that could depend on what you're benchmarking... which is why I was trying a bunch of different things: the `mean`, and my `weightedMean`. I have no idea if this is better or worse than what benchmark.js is doing - it looks like hard core statistical stuff, so I'm out of my league... personally, I would favor something non-scientists have a shot at understanding... or if I had to use whatever that statistical algorithm is, I would find a third party package with a well tested implementation. I wouldn't suggest copy/pasting stuff from benchmark.js unless you really understand what you're copying - even if it works well, because you're going to own it then. 😅 but you do you of course. 👍

I'm having a bit of trouble keeping up with all the changes, so I will probably set this project on the back burner until you think you're finished. 🙂
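To make the difference between those statistics concrete, here is a tiny smoke test of the helpers above - an editor's sketch; the sample timings are invented, including one artificial outlier:

```ts
// Invented per-run timings in ms: a tight cluster plus one outlier (e.g. a GC pause).
const samples = [1.02, 1.01, 1.03, 1.02, 5.4, 1.01, 1.02]

console.log(minOf(samples))        // 1.01 - best case, ignores the outlier entirely
console.log(meanOf(samples))       // ~1.64 - pulled up noticeably by the single outlier
console.log(weightedMean(samples)) // ~1.02 - dominated by the densely populated bucket
```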
oh, one little thing - since you can't merge my changes now, I do think you need to make a small change here:

Lines 155 to 161 in 1194a7a

Something like this instead:

```ts
private _runTest(test: Test, cycles: number): number[] {
    const values: number[] = [];
    while (cycles-- > 0) {
        const startTS = process.hrtime.bigint();
        test.callback();
        values.push(Number(process.hrtime.bigint() - startTS) / 1000000);
    }
    return values;
}
```

You're measuring the time it takes for the loop itself to execute - while this may seem insignificant, the faster/smaller the function you're measuring, the more it'll skew the results.
Yeah sorry 😅 it is now getting closer to what I want to achieve. Just pushed a new update with all the new Processor logic, so it will be easier to implement new Processors or custom ones. Thank you for all the code. When I finish with the Processor thingy I'm going to analyze all these algorithms, and maybe even add an option to calculate different ones (min, mean, "typical"), because what each person is trying to achieve in each case is different.
Good catch, but this is intentional, for exactly that reason. For fast methods (for example a simple regex test like the one in the script below), reading the clock on every call adds significant overhead. You can test it by running this simple script:

```js
const ITERATIONS = 10_000_000;
const values = [];

function expensiveFunction() {
    /asdasd/.test("dfghsdfghdfgjasdasddftghsdfhg");
}

// reads the clock twice per iteration
function testEachRun() {
    let cycles = ITERATIONS;
    while (cycles-- > 0) {
        const startTS = process.hrtime.bigint();
        expensiveFunction();
        values.push(Number(process.hrtime.bigint() - startTS) / 1000000);
    }
    console.log("testEachRun", values.reduce((total, value) => total + value, 0));
}

// reads the clock only once for the whole loop
function testAllRuns() {
    let cycles = ITERATIONS;
    const startTS = process.hrtime.bigint();
    while (cycles-- > 0) {
        expensiveFunction();
    }
    console.log("testAllRuns", Number(process.hrtime.bigint() - startTS) / 1000000);
}

testAllRuns();
testEachRun();
```

As you can see, if you measure each run, the total time is almost tripled for the very same number of iterations, and that's because of all the extra time it takes to call `process.hrtime.bigint()` on every iteration.

By the way, the skew is not important as long as all the tests have the very same skew. These benchmark libraries are created to compare different pieces of code and see which one is faster, not to calculate the exact CPU cycles an operation takes - there's always some skew, because at some point you need to access a clock, a counter or whatever to calculate the differences.
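A common compromise between the two approaches above, for what it's worth - an editor's sketch, not IsoBench's implementation; `BATCH_SIZE` and `fastFunction` are made up - is to time fixed-size batches, so the clock is read once per batch instead of once per call while still producing many samples:

```ts
const BATCH_SIZE = 1_000;
const BATCHES = 10_000; // 10_000 * 1_000 = 10M total calls, as in the script above

function fastFunction() {
    /asdasd/.test("dfghsdfghdfgjasdasddftghsdfhg");
}

function testBatchedRuns() {
    const samples: number[] = [];
    for (let b = 0; b < BATCHES; b++) {
        const startTS = process.hrtime.bigint();
        for (let i = 0; i < BATCH_SIZE; i++) {
            fastFunction();
        }
        // record the mean per-call time for this batch, in ms
        samples.push(Number(process.hrtime.bigint() - startTS) / 1000000 / BATCH_SIZE);
    }
    const totalMs = samples.reduce((total, perCall) => total + perCall * BATCH_SIZE, 0);
    console.log("testBatchedRuns", totalMs);
}

testBatchedRuns();
```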
Hmm, have you tried using the platform-standard `performance.now()`? https://developer.mozilla.org/en-US/docs/Web/API/Performance/now
In Node.js that's just an alias for `process.hrtime`. What I don't remember is why I decided to go with bigint only to convert it to a Number later. I think that's because internally Node.js works with bigint when using `process.hrtime.bigint`.
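For the curious, a quick way to compare the overhead of the two timers - an editor's sketch; absolute numbers will vary by Node.js version and hardware:

```ts
import { performance } from "node:perf_hooks";

const N = 1_000_000;

// time N calls to performance.now()
let start = process.hrtime.bigint();
for (let i = 0; i < N; i++) performance.now();
console.log("performance.now:      ", Number(process.hrtime.bigint() - start) / 1e6, "ms");

// time N calls to process.hrtime.bigint()
start = process.hrtime.bigint();
for (let i = 0; i < N; i++) process.hrtime.bigint();
console.log("process.hrtime.bigint:", Number(process.hrtime.bigint() - start) / 1e6, "ms");
```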
This is nuts. How did they manage to create a function just for accessing a micro timer that is this slow? 😐
It's not that. It's just that simply calling a native function has its overhead. JavaScript drawbacks. It was not designed for performance, although the V8 guys are doing whatever they can 😅
Calling any native function causes this much overhead? I mean, they do say "javascript is slow", but... yikes. 😅 |
Run will output the raw numbers, and with built-in processors they can be shown in the console, a table, a file, or you can even write your own processor.
Idea: just add a processor list to the IsoBench constructor (or via methods); the processors will handle events when one test sample ends, when one test ends completely, and when all tests end completely - to show logs while the test is running, and so on. Each processor will follow an interface, which is easily typed with TypeScript (see the sketch below).
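For illustration, the interface described above might look something like this - a hypothetical sketch, not IsoBench's actual API; all names here are guesses:

```ts
// Hypothetical types - IsoBench's real definitions may differ.
interface TestResult {
    name: string;
    samples: number[]; // raw per-sample timings, in ms
}

interface Processor {
    // called after each individual sample completes
    onSample?(test: TestResult, sample: number): void;
    // called when a single test finishes all of its samples
    onTestEnd?(test: TestResult): void;
    // called once every test in the bench has finished
    onEnd?(tests: TestResult[]): void;
}

// Minimal example: log each test's result as it completes.
class ConsoleProcessor implements Processor {
    onTestEnd(test: TestResult): void {
        const mean = test.samples.reduce((sum, v) => sum + v, 0) / test.samples.length;
        console.log(`${test.name}: ${test.samples.length} samples, mean ${mean.toFixed(4)} ms`);
    }
}
```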