feat: new string format "dirty" #760

Open
wants to merge 3 commits into base: master

Contributor

@nigrosimone nigrosimone commented Jan 18, 2025

Adds a "dirty" string format for strings known to contain non-printable characters or surrogate pairs.

For very long strings that are known for certain to contain characters that need escaping (e.g. a long product description with newlines), calling asString is a waste of time, because it always runs the STR_ESCAPE regex first.
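The idea can be sketched roughly as follows. This is a simplified illustration and not the library's actual code: the regex is an assumed stand-in for STR_ESCAPE, and the names asStringChecked and asStringDirty are hypothetical.

```javascript
// Assumed stand-in for the library's STR_ESCAPE regex: control characters,
// double quote, backslash, and surrogate code units. The real pattern may differ.
const STR_ESCAPE = /[\u0000-\u001f\u0022\u005c\ud800-\udfff]/

// Hypothetical "checked" path: scan the string first, then fall back to
// JSON.stringify only when something needs escaping.
function asStringChecked (str) {
  if (STR_ESCAPE.test(str) === false) {
    return '"' + str + '"'
  }
  return JSON.stringify(str)
}

// Hypothetical "dirty" path: the caller already knows escaping is needed,
// so the regex scan is skipped entirely.
function asStringDirty (str) {
  return JSON.stringify(str)
}

const description = 'Line one\nLine two with "quotes"'
console.log(asStringChecked(description) === asStringDirty(description))
```

Both paths produce the same JSON; the "dirty" path only avoids a scan whose answer is already known.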

Checklist

Benchmark

yarn bench:cmp
yarn run v1.22.21
$ node ./benchmark/bench-cmp-branch.js
Select the branch you want to compare (feature branch):
format-dirty
Select the branch you want to compare with (main branch):
master
Checking out "format-dirty"
Execute "npm run bench"

> [email protected] bench
> node ./benchmark/bench.js

short string............................................. x 27,026,062 ops/sec ±2.09% (177 runs sampled)
unsafe short string...................................... x 130,512,211 ops/sec ±3.62% (166 runs sampled)
dirty short string....................................... x 8,763,958 ops/sec ±4.34% (159 runs sampled)
short string with double quote........................... x 8,033,044 ops/sec ±3.94% (156 runs sampled)
long string without double quotes........................ x 7,091 ops/sec ±3.82% (168 runs sampled)
unsafe long string without double quotes................. x 96,393,035 ops/sec ±7.53% (152 runs sampled)
long string.............................................. x 7,488 ops/sec ±3.29% (168 runs sampled)
unsafe long string....................................... x 109,393,568 ops/sec ±5.16% (158 runs sampled)
dirty long string........................................ x 7,453 ops/sec ±3.66% (168 runs sampled)
number................................................... x 101,599,969 ops/sec ±4.69% (158 runs sampled)
integer.................................................. x 82,205,298 ops/sec ±3.76% (165 runs sampled)
formatted date-time...................................... x 785,873 ops/sec ±3.60% (171 runs sampled)
formatted date........................................... x 633,353 ops/sec ±3.75% (171 runs sampled)
formatted time........................................... x 626,472 ops/sec ±3.27% (170 runs sampled)
short array of numbers................................... x 53,906 ops/sec ±3.70% (168 runs sampled)
short array of integers.................................. x 50,111 ops/sec ±4.44% (162 runs sampled)
short array of short strings............................. x 12,758 ops/sec ±3.93% (170 runs sampled)
short array of long strings.............................. x 12,911 ops/sec ±3.99% (163 runs sampled)
short array of objects with properties of different types x 5,770 ops/sec ±5.88% (165 runs sampled)
object with number property.............................. x 96,703,326 ops/sec ±5.06% (147 runs sampled)
object with integer property............................. x 76,339,796 ops/sec ±4.44% (166 runs sampled)
object with short string property........................ x 18,824,168 ops/sec ±4.01% (162 runs sampled)
object with long string property......................... x 7,370 ops/sec ±3.59% (169 runs sampled)
object with properties of different types................ x 1,410,933 ops/sec ±3.41% (170 runs sampled)
simple object............................................ x 6,736,949 ops/sec ±3.99% (161 runs sampled)
simple object with required fields....................... x 7,927,630 ops/sec ±4.13% (167 runs sampled)
object with const string property........................ x 106,811,745 ops/sec ±4.26% (160 runs sampled)
object with const number property........................ x 103,253,355 ops/sec ±5.37% (161 runs sampled)
object with const bool property.......................... x 101,990,040 ops/sec ±4.94% (156 runs sampled)
object with const object property........................ x 112,596,974 ops/sec ±5.07% (171 runs sampled)
object with const null property.......................... x 126,780,151 ops/sec ±4.24% (164 runs sampled)

Checking out "master"
Execute "npm run bench"

> [email protected] bench
> node ./benchmark/bench.js

Execute "npm run bench"

short string............................................. x 24,526,364 ops/sec ±4.26% (171 runs sampled)
unsafe short string...................................... x 120,260,088 ops/sec ±4.36% (165 runs sampled)
short string with double quote........................... x 11,477,523 ops/sec ±3.31% (173 runs sampled)
long string without double quotes........................ x 7,326 ops/sec ±3.58% (165 runs sampled)
unsafe long string without double quotes................. x 108,178,762 ops/sec ±4.63% (161 runs sampled)
long string.............................................. x 7,296 ops/sec ±3.53% (167 runs sampled)
unsafe long string....................................... x 113,185,107 ops/sec ±4.03% (163 runs sampled)
number................................................... x 111,471,208 ops/sec ±4.83% (160 runs sampled)
integer.................................................. x 86,990,622 ops/sec ±3.73% (173 runs sampled)
formatted date-time...................................... x 836,821 ops/sec ±3.88% (174 runs sampled)
formatted date........................................... x 619,073 ops/sec ±3.59% (169 runs sampled)
formatted time........................................... x 695,776 ops/sec ±4.32% (182 runs sampled)
short array of numbers................................... x 73,147 ops/sec ±1.99% (185 runs sampled)
short array of integers.................................. x 70,433 ops/sec ±1.61% (184 runs sampled)
short array of short strings............................. x 18,219 ops/sec ±1.32% (189 runs sampled)
short array of long strings.............................. x 17,044 ops/sec ±2.67% (175 runs sampled)
short array of objects with properties of different types x 7,507 ops/sec ±4.40% (169 runs sampled)
object with number property.............................. x 105,780,275 ops/sec ±4.80% (163 runs sampled)
object with integer property............................. x 88,649,536 ops/sec ±4.00% (172 runs sampled)
object with short string property........................ x 20,180,984 ops/sec ±3.61% (166 runs sampled)
object with long string property......................... x 7,201 ops/sec ±3.80% (162 runs sampled)
object with properties of different types................ x 1,373,796 ops/sec ±3.80% (168 runs sampled)
simple object............................................ x 7,099,119 ops/sec ±4.60% (172 runs sampled)
simple object with required fields....................... x 8,013,101 ops/sec ±4.61% (175 runs sampled)
object with const string property........................ x 125,143,506 ops/sec ±3.50% (163 runs sampled)
object with const number property........................ x 128,166,008 ops/sec ±3.48% (164 runs sampled)
object with const bool property.......................... x 127,128,952 ops/sec ±4.18% (169 runs sampled)
object with const object property........................ x 128,349,042 ops/sec ±3.67% (167 runs sampled)
object with const null property.......................... x 127,034,327 ops/sec ±3.77% (169 runs sampled)

short string.............................................+10.19%
unsafe short string.......................................+8.52%
short string with double quote...........................-30.01%
long string without double quotes.........................-3.21%
unsafe long string without double quotes.................-10.89%
long string...............................................+2.63%
unsafe long string........................................-3.35%
number....................................................-8.86%
integer....................................................-5.5%
formatted date-time.......................................-6.09%
formatted date............................................+2.31%
formatted time............................................-9.96%
short array of numbers....................................-26.3%
short array of integers..................................-28.85%
short array of short strings.............................-29.97%
short array of long strings..............................-24.25%
short array of objects with properties of different types-23.14%
object with number property...............................-8.58%
object with integer property.............................-13.89%
object with short string property.........................-6.72%
object with long string property..........................+2.35%
object with properties of different types..................+2.7%
simple object..............................................-5.1%
simple object with required fields........................-1.07%
object with const string property........................-14.65%
object with const number property........................-19.44%
object with const bool property..........................-19.77%
object with const object property........................-12.27%
object with const null property............................-0.2%
Back to format-dirty 5262ee0
Done in 804.46s.
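
The percentages in the comparison above appear consistent with the relative change (feature - main) / main computed on the ops/sec figures. A minimal sketch of that arithmetic, where percentChange is a hypothetical helper and not part of the benchmark script:

```javascript
// Relative change of the feature branch versus the main branch, in percent.
// Positive means the feature branch ran more ops/sec.
function percentChange (featureOpsPerSec, mainOpsPerSec) {
  return ((featureOpsPerSec - mainOpsPerSec) / mainOpsPerSec) * 100
}

// "short string": 27,026,062 ops/sec on format-dirty vs 24,526,364 on master.
console.log(percentChange(27026062, 24526364).toFixed(2)) // prints "10.19"

// "short string with double quote": 8,033,044 vs 11,477,523.
console.log(percentChange(8033044, 11477523).toFixed(2)) // prints "-30.01"
```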

Member

@Eomm Eomm left a comment

I wonder if setting the threshold at the line below (via fastJson(schema, { stringOptimizations: {...} })):

https://github.com/fastify/fast-json-stringify/blob/7a2a31add85286c7019634daaf78f53096b403d6/lib/serializer.js#L120C30-L120C40

would do the job without introducing a non-standard format, which may break other side features such as a Swagger viewer.
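
A configurable threshold along these lines could look roughly like the sketch below. Everything here is an assumption for illustration: buildAsString and escapeCheckMaxLength are hypothetical names, the regex is a stand-in for the library's STR_ESCAPE, and stringOptimizations is only an option proposed in this thread.

```javascript
// Hypothetical factory: builds a string serializer whose regex pre-check
// is only attempted for strings up to a configurable length.
function buildAsString ({ escapeCheckMaxLength = Infinity } = {}) {
  // Assumed stand-in for STR_ESCAPE; the real pattern may differ.
  const STR_ESCAPE = /[\u0000-\u001f\u0022\u005c\ud800-\udfff]/

  return function asString (str) {
    // Only pay for the scan when the string is short enough to be worth it.
    if (str.length <= escapeCheckMaxLength && STR_ESCAPE.test(str) === false) {
      return '"' + str + '"'
    }
    // Longer strings (or strings that need escaping) go straight to JSON.stringify.
    return JSON.stringify(str)
  }
}

// Default: always run the pre-check, as in the checked path.
const asStringDefault = buildAsString()
// Threshold of 0: behaves like the proposed "dirty" format for any non-empty string.
const asStringNoCheck = buildAsString({ escapeCheckMaxLength: 0 })

const sample = 'long product description\nwith new lines'
console.log(asStringDefault(sample) === asStringNoCheck(sample))
```

With this shape the schema keeps the standard string type, and the trade-off is tuned through an option instead of a custom format value.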

@nigrosimone
Contributor Author

> I wonder if setting the threshold at (via the fastJson(schema, {stringOptimizations: {...} }):
>
> https://github.com/fastify/fast-json-stringify/blob/7a2a31add85286c7019634daaf78f53096b403d6/lib/serializer.js#L120C30-L120C40
>
> would do the job without introducing a non-standard format. It may break other side-features such as a swagger viewer.

I have also introduced the "unsafe" string format into FJS. But stringOptimizations seems good to me, for both unsafe and dirty.

2 participants