runtime: CGo thread stacks stay in physical memory #71150

Open
kelanyll opened this issue Jan 7, 2025 · 4 comments
Labels
BugReport: Issues describing a possible bug in the Go implementation.
compiler/runtime: Issues related to the Go compiler and/or runtime.
NeedsDecision: Feedback is required from experts, contributors, and/or the community before a change can be made.

Comments

@kelanyll

kelanyll commented Jan 7, 2025

Go version

go1.23.4 linux/amd64

Output of go env in your module/workspace:

GO111MODULE=''
GOARCH='amd64'
GOBIN=''
GOCACHE='/home/ykelani/.cache/go-build'
GOENV='/home/ykelani/.config/go/env'
GOEXE=''
GOEXPERIMENT=''
GOFLAGS=''
GOHOSTARCH='amd64'
GOHOSTOS='linux'
GOINSECURE=''
GOMODCACHE='/home/ykelani/go/pkg/mod'
GONOPROXY=''
GONOSUMDB=''
GOOS='linux'
GOPATH='/home/ykelani/go'
GOPRIVATE=''
GOPROXY='https://proxy.golang.org,direct'
GOROOT='/home/ykelani/sdk/go1.23.4'
GOSUMDB='sum.golang.org'
GOTMPDIR=''
GOTOOLCHAIN='auto'
GOTOOLDIR='/home/ykelani/sdk/go1.23.4/pkg/tool/linux_amd64'
GOVCS=''
GOVERSION='go1.23.4'
GODEBUG=''
GOTELEMETRY='local'
GOTELEMETRYDIR='/home/ykelani/.config/go/telemetry'
GCCGO='gccgo'
GOAMD64='v1'
AR='ar'
CC='gcc'
CXX='g++'
CGO_ENABLED='1'
GOMOD='/home/ykelani/TestCGoScaling/go.mod'
GOWORK=''
CGO_CFLAGS='-O2 -g'
CGO_CPPFLAGS=''
CGO_CXXFLAGS='-O2 -g'
CGO_FFLAGS='-O2 -g'
CGO_LDFLAGS='-O2 -g'
PKG_CONFIG='pkg-config'
GOGCCFLAGS='-fPIC -m64 -pthread -Wl,--no-gc-sections -fmessage-length=0 -fdebug-prefix-map=/tmp/go-build4071249400=/tmp/go-build -gno-record-gcc-switches'

What did you do?

package main

/*
#cgo CFLAGS: -O0
#include <unistd.h>

void a(int depth) {
	if (depth > 0) a(depth - 1);
}
*/
import "C"
import (
	"fmt"
	"runtime/debug"
	"sync"
)

func main() {
	fmt.Println("Running multiple CGo calls")

	var wg sync.WaitGroup

	for i := 0; i < 1000000; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			C.a(100000)
		}()
	}

	wg.Wait()

	debug.FreeOSMemory()

	select {}
}
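
To reproduce (a minimal sketch of the build and run steps; the binary name cgo-scaling matches the pmap output below, the rest of the invocation is an assumption):

go build -o cgo-scaling .
./cgo-scaling &
ps -o nlwp $(pgrep cgo-scaling)
pmap -x $(pgrep cgo-scaling)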

What did you see happen?

After the program blocks on the final select, RSS is 497 MB across 150 OS threads. The pmap output shows many resident segments of about 3132 KB; my (uninformed) guess is that these are thread stacks, but it's unclear why they remain in physical memory (see the note after the pmap output below).

dev-dsk-ykelani-1a-f1f9d672 % ps -o nlwp 15514
NLWP
 150
dev-dsk-ykelani-1a-f1f9d672 % pmap -x 15514
15514:   ./build/bin/cgo-scaling
Address           Kbytes     RSS   Dirty Mode  Mapping
0000000000400000    1352    1352       0 r-x-- cgo-scaling
0000000000751000       4       4       4 r---- cgo-scaling
0000000000752000      44      44      20 rw--- cgo-scaling
000000000075d000     144      72      72 rw---   [ anon ]
00000000120cf000     132       8       8 rw---   [ anon ]
000000c000000000   65536   28900   28900 rw---   [ anon ]
00007fbeb51f5000       4       0       0 -----   [ anon ]
00007fbeb51f6000   10240    3132    3132 rw---   [ anon ]
00007fbeb5bf6000       4       0       0 -----   [ anon ]
00007fbeb5bf7000   10240    3132    3132 rw---   [ anon ]
00007fbeb65f7000       4       0       0 -----   [ anon ]
00007fbeb65f8000   10240    3132    3132 rw---   [ anon ]
00007fbeb6ff8000       4       0       0 -----   [ anon ]
00007fbeb6ff9000   10240    3132    3132 rw---   [ anon ]
00007fbeb79f9000       4       0       0 -----   [ anon ]
00007fbeb79fa000   10240    3132    3132 rw---   [ anon ]
00007fbeb83fa000       4       0       0 -----   [ anon ]
00007fbeb83fb000   10240    3132    3132 rw---   [ anon ]
00007fbeb8dfb000       4       0       0 -----   [ anon ]
00007fbeb8dfc000   10240    3132    3132 rw---   [ anon ]
00007fbeb97fc000       4       0       0 -----   [ anon ]
00007fbeb97fd000   10240    3132    3132 rw---   [ anon ]
00007fbeba1fd000       4       0       0 -----   [ anon ]
00007fbeba1fe000   10240    3132    3132 rw---   [ anon ]
00007fbebabfe000       4       0       0 -----   [ anon ]
00007fbebabff000   10240    3132    3132 rw---   [ anon ]
00007fbebb5ff000       4       0       0 -----   [ anon ]
00007fbebb600000   10240    3132    3132 rw---   [ anon ]
00007fbebc000000     132       4       4 rw---   [ anon ]
00007fbebc021000   65404       0       0 -----   [ anon ]
00007fbec0000000     132       4       4 rw---   [ anon ]
00007fbec0021000   65404       0       0 -----   [ anon ]
00007fbec4000000     132       4       4 rw---   [ anon ]
00007fbec4021000   65404       0       0 -----   [ anon ]
00007fbec83fa000       4       0       0 -----   [ anon ]
...
00007fc1117e5000   10496    3296    3296 rw---   [ anon ]
00007fc112225000       4       0       0 -----   [ anon ]
00007fc112226000   10240    3132    3132 rw---   [ anon ]
00007fc112c26000       4       0       0 -----   [ anon ]
00007fc112c27000   10240    3132    3132 rw---   [ anon ]
00007fc113627000       4       0       0 -----   [ anon ]
00007fc113628000   10240       8       8 rw---   [ anon ]
00007fc114028000   33792       8       8 rw---   [ anon ]
00007fc116128000  263680       0       0 -----   [ anon ]
00007fc1262a8000       4       4       4 rw---   [ anon ]
00007fc1262a9000  524284       0       0 -----   [ anon ]
00007fc1462a8000       4       4       4 rw---   [ anon ]
00007fc1462a9000  293564       0       0 -----   [ anon ]
00007fc158158000       4       4       4 rw---   [ anon ]
00007fc158159000   36692       0       0 -----   [ anon ]
00007fc15a52e000       4       4       4 rw---   [ anon ]
00007fc15a52f000    4068       0       0 -----   [ anon ]
00007fc15a928000    1680    1184       0 r-x-- libc-2.26.so
00007fc15aacc000    2044       0       0 ----- libc-2.26.so
00007fc15accb000      16      16      16 r---- libc-2.26.so
00007fc15accf000       8       8       8 rw--- libc-2.26.so
00007fc15acd1000      16      12      12 rw---   [ anon ]
00007fc15acd5000      96      96       0 r-x-- libpthread-2.26.so
00007fc15aced000    2048       0       0 ----- libpthread-2.26.so
00007fc15aeed000       4       4       4 r---- libpthread-2.26.so
00007fc15aeee000       4       4       4 rw--- libpthread-2.26.so
00007fc15aeef000      16       4       4 rw---   [ anon ]
00007fc15aef3000     144     144       0 r-x-- ld-2.26.so
00007fc15af22000     516     332     332 rw---   [ anon ]
00007fc15afa3000     512       0       0 -----   [ anon ]
00007fc15b023000       4       4       4 rw---   [ anon ]
00007fc15b024000     508       0       0 -----   [ anon ]
00007fc15b0a3000     404      76      76 rw---   [ anon ]
00007fc15b116000       4       4       4 r---- ld-2.26.so
00007fc15b117000       4       4       4 rw--- ld-2.26.so
00007fc15b118000       4       4       4 rw---   [ anon ]
00007ffc4edb4000    3136    3136    3136 rw---   [ stack ]
00007ffc4f17d000      16       0       0 r----   [ anon ]
00007ffc4f181000       8       4       0 r-x--   [ anon ]
ffffffffff600000       4       0       0 r-x--   [ anon ]
---------------- ------- ------- ------- 
total kB         11088092  497548  494744
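
(One way to sanity-check the thread-stack guess, offered as an assumption rather than something verified on this host: glibc sizes new pthread stacks from the RLIMIT_STACK soft limit when the program doesn't set an explicit stack size, so if ulimit -s reports 10240, the repeated 10240 KB anonymous mappings above correspond to one stack per thread.)

ulimit -s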

What did you expect to see?

RSS to fall much lower for the process, low enough to support just the Go runtime. This is a toy example, but I'm seeing the same behaviour slowly "leak" memory in a real long-lived Go process that uses CGo.

@gopherbot gopherbot added the compiler/runtime Issues related to the Go compiler and/or runtime. label Jan 7, 2025
@seankhliao
Member

What if you run it with GODEBUG=madvdontneed=0?
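
(For concreteness, an assumed invocation: GODEBUG=madvdontneed=0 ./cgo-scaling. With that setting, the runtime on Linux returns memory with MADV_FREE instead of MADV_DONTNEED, so the kernel reclaims pages lazily and reported RSS tends to stay higher until there is memory pressure.)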

@gabyhelp gabyhelp added the BugReport Issues describing a possible bug in the Go implementation. label Jan 7, 2025
@kelanyll
Author

kelanyll commented Jan 7, 2025

@seankhliao It uses more memory (527 MB) with GODEBUG=madvdontneed=0, which I guess makes sense, as the OS is then less aggressive about reclaiming memory.

@prattmic prattmic added the NeedsDecision Feedback is required from experts, contributors, and/or the community before a change can be made. label Jan 8, 2025
@prattmic
Member

prattmic commented Jan 8, 2025

While goroutine stacks can grow and shrink, cgo calls do not run on the goroutine stack [1]. Instead, cgo calls run on what we call the "system stack", i.e., the pthread-allocated stack of the thread the goroutine is currently executing on.

This pthread stack never shrinks (or grows), which is why you see the resident memory stick around.

I suppose that we could theoretically have the GC / debug.FreeOSMemory go around and MADV_DONTNEED the currently-unused pages of the system stack. Neither attempts to do so today. It would be a bit tricky to determine which pages to keep and which to drop, as we can't measure the high-water mark that cgo calls reach.
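
A minimal sketch of the mechanism (not runtime code; the use of golang.org/x/sys/unix and the 8 MiB size are assumptions for illustration): MADV_DONTNEED lets the kernel drop the physical pages behind a range while the virtual mapping stays valid, which is roughly what FreeOSMemory would have to do to the unused portion of each system stack.

package main

import (
	"fmt"

	"golang.org/x/sys/unix"
)

func main() {
	const size = 8 << 20 // stand-in for an 8 MiB pthread stack
	// Anonymous private mapping, like a thread stack.
	mem, err := unix.Mmap(-1, 0, size, unix.PROT_READ|unix.PROT_WRITE,
		unix.MAP_PRIVATE|unix.MAP_ANON)
	if err != nil {
		panic(err)
	}
	defer unix.Munmap(mem)

	// Touch every page so it becomes resident, as a deep C call chain would.
	for i := 0; i < size; i += 4096 {
		mem[i] = 1
	}

	// Drop the physical pages again; the mapping stays valid and faults
	// back in zero-filled, but RSS for this range returns to roughly zero.
	if err := unix.Madvise(mem, unix.MADV_DONTNEED); err != nil {
		panic(err)
	}
	fmt.Println("madvised", len(mem), "bytes")
}

The hard part, as noted above, is that the runtime would have to know the bounds and high-water mark of each real pthread stack; the sketch sidesteps that by using its own mapping.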

This is somewhat related to #14592, as exiting threads would free their stack.

[1] Because it might be too small, and we can't grow the stack while running C code like we can while running Go code.

cc @golang/runtime
