perf: Optimize Header Validation #3994
Comments
Regular expressions start winning after about 20 characters, so you can pick the implementation based on the length of the value.

export function isValidHeaderValue(value: string) {
  for (let i = 0, len = value.length; i < len; i++) {
    if (!isHeaderValueToken(value.charCodeAt(i))) return false;
  }
  return true;
}
export function isValidHeaderValueOptimized(value: string) {
  return value.length >= 20 ? isValidHeaderValue2(value) : isValidHeaderValue(value);
}
// undici
const headerCharRegex = /[^\t\x20-\x7e\x80-\xff]/;
export function isValidHeaderValue2(characters: string): boolean {
  return !headerCharRegex.test(characters);
}

import { bench, compact, lineplot, run } from 'mitata';
import * as utils from './constants.ts';
async function benchmark() {
  lineplot(() => {
    compact(() => {
      bench('isValidHeaderValue($size)', function* (state) {
        const size = state.get('size');
        yield () => utils.isValidHeaderValue('a'.repeat(size));
      }).range('size', 1, 512, 2);
      bench('isValidHeaderValue2($size)', function* (state) {
        const size = state.get('size');
        yield () => utils.isValidHeaderValue2('a'.repeat(size));
      }).range('size', 1, 512, 2).highlight('green');
      bench('isValidHeaderValueOp($size)', function* (state) {
        const size = state.get('size');
        yield () => utils.isValidHeaderValueOptimized('a'.repeat(size));
      }).range('size', 1, 512, 2).highlight('blue');
    });
  });
  // Await the run so failures surface through the promise returned by benchmark().
  await run();
}
benchmark();
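
For readers who want to check that the two code paths accept exactly the same values before choosing between them by length, here is a minimal sketch. It is not code from undici: the thread does not show `isHeaderValueToken`, so the predicate below is an assumption written to mirror the quoted regex, and `REGEX_THRESHOLD`, `isValidByLoop`, `isValidByRegex`, `isValidHeaderValueByLength`, and `pathsAgree` are hypothetical names.

```typescript
// Assumption: this predicate mirrors the character class of the regex quoted
// above; the real isHeaderValueToken is not shown in the thread.
function isHeaderValueToken(code: number): boolean {
  return code === 0x09 || (code >= 0x20 && code <= 0x7e) || (code >= 0x80 && code <= 0xff);
}

// Loop-based check, one char code at a time.
function isValidByLoop(value: string): boolean {
  for (let i = 0, len = value.length; i < len; i++) {
    if (!isHeaderValueToken(value.charCodeAt(i))) return false;
  }
  return true;
}

// Regex-based check: the value is valid when no forbidden character matches.
const headerCharRegex = /[^\t\x20-\x7e\x80-\xff]/;
function isValidByRegex(value: string): boolean {
  return !headerCharRegex.test(value);
}

// Hypothetical cut-over point, taken from the "about 20 characters" estimate.
const REGEX_THRESHOLD = 20;

function isValidHeaderValueByLength(value: string): boolean {
  return value.length >= REGEX_THRESHOLD ? isValidByRegex(value) : isValidByLoop(value);
}

// Compare both paths for every UTF-16 code unit, in a short and a long string.
function pathsAgree(): boolean {
  for (let code = 0; code <= 0xffff; code++) {
    const ch = String.fromCharCode(code);
    const shortValue = 'a' + ch;
    const longValue = 'a'.repeat(REGEX_THRESHOLD) + ch;
    if (isValidByLoop(shortValue) !== isValidByRegex(shortValue)) return false;
    if (isValidByLoop(longValue) !== isValidByRegex(longValue)) return false;
  }
  return true;
}
```

If `pathsAgree()` holds for the real predicate as well, the length cut-over only affects speed, not which values are accepted, and the threshold can be tuned from benchmark results alone.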
Wow, this is a comprehensive analysis. Would you like to send a PR? We'd indeed need to use both approaches, one for short headers and one for long ones. How did you come by this improvement? Did you find that header validation was a bottleneck for a specific case in Undici? Does this improvement speed up our benchmarks?
I would prefer to just use regular expressions. Seems like a good balance between performance and maintainability.
@mcollina, in applications that make a large number of network requests, it is important that those requests are as efficient as possible, both without headers and with them. I measured requests without headers and with headers in profiling mode. Such services also often send the same requests repeatedly, so it would be good to have something like a PreparedRequest that builds a structure requiring no further header validation or request body transformation. Here are the functions that need to be reviewed for optimization:

// @ts-nocheck
import { Agent, request } from 'undici';
const dispatcher = new Agent({ connections: 100 });
const header = new Headers();
header.set(`x-authorization-user-id-1`, `a`.repeat(10 * 20 * 25));
const headers = new Headers();
for (let index = 0; index < 20; index++) {
  headers.set(`x-authorization-user-id-${index}`, `a`.repeat(10));
}
const optionsWithHeaders = {
  method: 'GET',
  dispatcher,
  headers: headers,
};
const optionsWithOneHeader = {
  method: 'GET',
  dispatcher,
  headers: header,
};
const optionsWithoutHeaders = {
  method: 'GET',
  dispatcher,
};
let rpc = 0;
const rpcInterval = setInterval(() => {
  const curr = rpc;
  rpc = 0;
  curr > 0 && console.log(`rps:`, curr);
}, 1000);
async function worker(options) {
  const resp = await request('http://localhost:8000', options);
  await resp.body.dump();
  rpc += 1;
}
async function main() {
  {
    const start = performance.now();
    const tasks = Array.from({ length: 1e6 }, () => worker(optionsWithOneHeader));
    await Promise.all(tasks);
    const end = performance.now();
    console.log(`with one header:`, (end - start).toFixed(2)); // with one header: 23557.95
  }
  {
    const start = performance.now();
    const tasks = Array.from({ length: 1e6 }, () => worker(optionsWithHeaders));
    await Promise.all(tasks);
    const end = performance.now();
    console.log(`with Headers:`, (end - start).toFixed(2)); // with Headers: 20957.07
  }
  {
    const start = performance.now();
    const tasks = Array.from({ length: 1e6 }, () => worker(optionsWithoutHeaders));
    await Promise.all(tasks);
    const end = performance.now();
    console.log(`without Headers:`, (end - start).toFixed(2)); // without Headers: 17486.71
  }
}
main().then(() => clearInterval(rpcInterval));

one header (5000 len)
20 headers with 10 len
without headers
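
The PreparedRequest idea mentioned in this comment could look roughly like the sketch below: validate the headers once, store the result in a frozen structure, and reuse it for every identical request. This is purely illustrative and not an existing undici API. `PreparedHeaders`, `prepareHeaders`, and the pre-serialized `block` field are hypothetical, and the header-name pattern is an assumption based on the RFC 9110 `token` grammar.

```typescript
// Value check: the undici regex quoted earlier in the thread.
const headerCharRegex = /[^\t\x20-\x7e\x80-\xff]/;
// Name check (assumption): the RFC 9110 "token" character set.
const tokenRegex = /^[!#$%&'*+\-.^_`|~0-9A-Za-z]+$/;

// Hypothetical structure holding headers that have already been validated.
interface PreparedHeaders {
  readonly pairs: readonly (readonly [string, string])[];
  readonly block: string; // pre-serialized "name: value\r\n" lines
}

// Validate every header once; later requests reuse the frozen result.
function prepareHeaders(headers: Record<string, string>): PreparedHeaders {
  const pairs: [string, string][] = [];
  let block = '';
  for (const [name, value] of Object.entries(headers)) {
    if (!tokenRegex.test(name)) {
      throw new TypeError(`invalid header name: ${JSON.stringify(name)}`);
    }
    if (headerCharRegex.test(value)) {
      throw new TypeError(`invalid value for header ${name}`);
    }
    const lower = name.toLowerCase();
    pairs.push([lower, value]);
    block += `${lower}: ${value}\r\n`;
  }
  return Object.freeze({ pairs: Object.freeze(pairs), block });
}

// Prepared once, outside the hot path.
const prepared = prepareHeaders({ 'X-Authorization-User-Id-1': 'a'.repeat(10) });
```

Whether a client could safely skip validation for such a structure depends on it being immutable and created only through the validating function, which is why the sketch freezes both the object and the pair list.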
|
The current implementation of header validation in the codebase uses Uint8Array to store character validity maps for HTTP tokens, URIs, and header values. While this approach works, it can be further optimized using bitmasking to reduce memory usage and improve performance.
undici/lib/core/util.js
Line 733 in e1496cf
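
To make the bitmasking idea concrete, the sketch below builds the same validity map twice: once as a byte-per-character `Uint8Array` (256 bytes) and once as a bit-per-character `Uint32Array` (8 words, 32 bytes). It is an illustration under assumptions, not the code at the referenced line: `isHeaderValueCode` is a stand-in predicate modeled on the header-value character class, and `byteTable`, `bitTable`, and `isValidByBitmask` are hypothetical names. Whether the bitmask lookup is actually faster would have to be measured.

```typescript
// Assumed predicate for illustration: tab, visible ASCII plus space, and 0x80-0xff.
function isHeaderValueCode(code: number): boolean {
  return code === 0x09 || (code >= 0x20 && code <= 0x7e) || (code >= 0x80 && code <= 0xff);
}

// Byte-per-character map: one byte for each of the 256 codes.
const byteTable = new Uint8Array(256);
// Bit-per-character map: 8 words x 32 bits = 256 bits.
const bitTable = new Uint32Array(8);

for (let code = 0; code < 256; code++) {
  if (isHeaderValueCode(code)) {
    byteTable[code] = 1;
    // Word index is code / 32, bit index is code % 32.
    bitTable[code >>> 5] |= 1 << (code & 31);
  }
}

// Validate a string using only the bitmask table.
function isValidByBitmask(value: string): boolean {
  for (let i = 0, len = value.length; i < len; i++) {
    const code = value.charCodeAt(i);
    // Codes above 0xff are outside the table and always invalid.
    if (code > 0xff) return false;
    if ((bitTable[code >>> 5] & (1 << (code & 31))) === 0) return false;
  }
  return true;
}
```

The memory saving is a factor of eight per table; the trade-off is an extra shift and mask on every lookup compared with a direct byte read.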
Benchmarks
Results