feat: thread autoscaling #1266

Open
wants to merge 196 commits into base: main
Conversation

Alliballibaba2
Collaborator

I originally just wanted to create a PR that allows adding threads via the admin API, but after letting threads scale automatically, that PR didn't really make sense on its own anymore.

So here is what this PR does:

It adds 4 Caddy admin endpoints:

POST     /frankenphp/workers/restart   # restarts workers (this can also be put into a smaller PR if necessary)
GET      /frankenphp/threads           # prints the current state of all threads (for debugging/caddytests)
PUT      /frankenphp/threads           # adds a thread at runtime; accepts 'worker' and 'count' query parameters
DELETE   /frankenphp/threads           # removes a thread at runtime; accepts 'worker' and 'count' query parameters
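
For context, these endpoints live on Caddy's admin API (http://localhost:2019 by default). A minimal Go sketch of how a client might call two of them, assuming the default admin address and no custom auth (the paths come from the list above; everything else is illustrative):

```go
package main

import (
	"fmt"
	"io"
	"net/http"
)

func main() {
	admin := "http://localhost:2019" // Caddy's default admin address

	// Inspect the current state of all threads (GET /frankenphp/threads).
	resp, err := http.Get(admin + "/frankenphp/threads")
	if err != nil {
		panic(err)
	}
	body, _ := io.ReadAll(resp.Body)
	resp.Body.Close()
	fmt.Println(string(body))

	// Restart the workers (POST /frankenphp/workers/restart).
	resp, err = http.Post(admin+"/frankenphp/workers/restart", "application/json", nil)
	if err != nil {
		panic(err)
	}
	resp.Body.Close()
	fmt.Println("workers restarted:", resp.Status)
}
```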

Additionally, the PR introduces a new directive in the config: max_threads.

frankenphp {
    max_threads 200
    num_threads 40
}

If max_threads is bigger than num_threads, worker and regular threads will attempt to autoscale after a request when a few conditions are met (a rough sketch follows the list):

  • no thread was available to immediately handle the request
  • the request was stalled for more than a few milliseconds (currently 15 ms)
  • no other scaling is happening at that time
  • a CPU probe (50 ms) determines that PHP threads are consuming less than a predefined amount of CPU (currently 80%)
  • max_threads has not been reached yet
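
Roughly, the decision could look like the following Go sketch. All names and thresholds are illustrative placeholders mirroring the conditions above, not the PR's actual code:

```go
package main

import (
	"sync/atomic"
	"time"
)

// Illustrative thresholds taken from the PR description.
const (
	stallThreshold = 15 * time.Millisecond // request waited longer than this
	probeDuration  = 50 * time.Millisecond // CPU probe window
	cpuLimit       = 0.8                   // don't scale above 80% CPU
)

type autoscaler struct {
	scaling    atomic.Bool // true while a scaling operation is in flight
	numThreads atomic.Int64
	maxThreads int64
}

// maybeScale is called after a request finished; it spawns one extra thread
// only when all of the listed conditions hold.
func (a *autoscaler) maybeScale(handledImmediately bool, stalledFor time.Duration) {
	if handledImmediately {
		return // a thread was free right away, no pressure
	}
	if stalledFor < stallThreshold {
		return // the wait was too short to justify a new thread
	}
	if !a.scaling.CompareAndSwap(false, true) {
		return // another scaling operation is already running
	}
	defer a.scaling.Store(false)

	if a.numThreads.Load() >= a.maxThreads {
		return // hard ceiling from the max_threads directive
	}
	if probeCPU(probeDuration) >= cpuLimit {
		return // PHP threads are already busy enough
	}
	a.numThreads.Add(1)
	spawnThread() // hand off to the real thread-spawning code
}

// probeCPU and spawnThread stand in for the real implementations.
func probeCPU(d time.Duration) float64 { time.Sleep(d); return 0.5 }
func spawnThread()                     {}

func main() {
	a := &autoscaler{maxThreads: 200}
	a.numThreads.Store(40)
	a.maybeScale(false, 20*time.Millisecond)
}
```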

This is all still a WIP. I'm not yet sure if max_threads is the best way to configure autoscaling or if it's even necessary to have the PUT/DELETE endpoints. Maybe it would also make sense to determine max_threads based on available memory.
I'll conduct some benchmarks showing that this approach performs better than default settings in a lot of different scenarios (and makes people worry less about thread configuration).

Regarding recent issues, spawning and destroying threads would also make the server more stable when we're experiencing timeouts (I'm not yet sure how to safely destroy running threads).

@AlliBalliBaba
Collaborator

I removed the POST and DELETE admin endpoints for now; we can add them back in if needed.

@dunglas
Owner

dunglas commented Jan 27, 2025

Is this one ready for prime time?

@AlliBalliBaba
Collaborator

Yes, I think this one should be good to go for an initial implementation via the max_threads configuration.
I set the channel used for scaling to nil when scaling is inactive, so there shouldn't be any difference in behavior if max_threads is not set.
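
For reference, this relies on the Go property that a nil channel is never ready in a select, so the scaling branch is simply skipped when autoscaling is disabled. A minimal illustration with made-up names (not the PR's actual code):

```go
package main

import "fmt"

func main() {
	var scaleRequests chan struct{} // nil: autoscaling disabled (max_threads not set)
	if maxThreadsConfigured := false; maxThreadsConfigured {
		scaleRequests = make(chan struct{}, 1)
	}

	done := make(chan struct{}, 1)
	done <- struct{}{}

	// A nil channel blocks forever, so its case can never be selected;
	// the select behaves exactly as if the scaling branch did not exist.
	select {
	case <-scaleRequests:
		fmt.Println("scale up")
	case <-done:
		fmt.Println("request handled without autoscaling")
	}
}
```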

I might create a separate project with integration tests and some extensions/framework at some point so there's more confidence before releases.

@withinboredom
Collaborator

[giphy GIF reaction]

@AlliBalliBaba
Collaborator

I created a small testing repo with Laravel, since it's the PHP Framework I'm most familiar with. Right now it just tests a few of the most popular extensions in zts-bookworm, but I'll probably add more if there are any problematic ones.

The only issue I found so far is that calling opcache_reset can lead to a crash under high load (also on main). I'm not sure if this only affects ZTS, but similar issues exist for FPM.

On worker restarts, all threads are stopped first, which prevents the crash. It's only problematic if opcache_reset is called directly, so documenting this might be enough for now.

@AlliBalliBaba
Collaborator

Alright, I think this is ready to merge. I'm not sure why the watcher installation is currently failing in the asan and msan builds.
