Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[lld][WebAssembly] Skip BSS when combining data segments #127735

Merged
merged 1 commit into from
Feb 20, 2025

Conversation

sbc100
Copy link
Collaborator

@sbc100 sbc100 commented Feb 19, 2025

In most circumstances BSS segments are not required in the output binary but combineOutputSegments was erroneously including them. This meant that PIC binaries were including the BSS data as zero in the binary.

Fixes: emscripten-core/emscripten#23683

@llvmbot
Copy link
Member

llvmbot commented Feb 19, 2025

@llvm/pr-subscribers-lld-wasm

@llvm/pr-subscribers-lld

Author: Sam Clegg (sbc100)

Changes

In most circumstances BSS segments are not required in the output binary but combineOutputSegments was erroneously including them. This meant that PIC binaries were including the BSS data as zero in the binary.

Fixes: emscripten-core/emscripten#23683


Full diff: https://github.com/llvm/llvm-project/pull/127735.diff

3 Files Affected:

  • (modified) lld/test/wasm/data-segments.ll (+35-4)
  • (modified) lld/wasm/OutputSections.cpp (+2-1)
  • (modified) lld/wasm/Writer.cpp (+6-1)
diff --git a/lld/test/wasm/data-segments.ll b/lld/test/wasm/data-segments.ll
index 79f1d384919d9..e36f0a7a95332 100644
--- a/lld/test/wasm/data-segments.ll
+++ b/lld/test/wasm/data-segments.ll
@@ -17,6 +17,11 @@
 ; RUN: wasm-ld -mwasm64 -no-gc-sections --no-entry %t.bulk-mem64.o -o %t.bulk-mem64.wasm
 ; RUN: obj2yaml %t.bulk-mem64.wasm | FileCheck %s --check-prefixes ACTIVE,ACTIVE64
 
+;; In -pie mode segments are combined into one active segment.
+; RUN: wasm-ld --experimental-pic --import-memory -pie -no-gc-sections --no-entry %t.atomics.bulk-mem.pic.o -o %t.pic.wasm
+; RUN: obj2yaml %t.pic.wasm | FileCheck %s --check-prefixes ACTIVE-PIC
+; RUN: llvm-objdump --disassemble-symbols=__wasm_call_ctors,__wasm_init_memory --no-show-raw-insn --no-leading-addr %t.pic.wasm | FileCheck %s --check-prefixes PIC-NON-SHARED-DIS
+
 ;; atomics, bulk memory, shared memory => passive segments
 ; RUN: wasm-ld -no-gc-sections --no-entry --shared-memory --max-memory=131072 %t.atomics.bulk-mem.o -o %t.atomics.bulk-mem.wasm
 ; RUN: obj2yaml %t.atomics.bulk-mem.wasm | FileCheck %s --check-prefix PASSIVE
@@ -28,9 +33,9 @@
 ; RUN: llvm-objdump --disassemble-symbols=__wasm_call_ctors,__wasm_init_memory --no-show-raw-insn --no-leading-addr %t.atomics.bulk-mem64.wasm | FileCheck %s --check-prefixes DIS,NOPIC-DIS -DPTR=i64
 
 ;; Also test in combination with PIC/pie
-; RUN: wasm-ld --experimental-pic -pie -no-gc-sections --no-entry --shared-memory --max-memory=131072 %t.atomics.bulk-mem.pic.o -o %t.pic.wasm
-; RUN: obj2yaml %t.pic.wasm | FileCheck %s --check-prefixes PASSIVE-PIC,PASSIVE32-PIC
-; RUN: llvm-objdump --disassemble-symbols=__wasm_call_ctors,__wasm_init_memory --no-show-raw-insn --no-leading-addr %t.pic.wasm | FileCheck %s --check-prefixes DIS,PIC-DIS -DPTR=i32
+; RUN: wasm-ld --experimental-pic -pie -no-gc-sections --no-entry --shared-memory --max-memory=131072 %t.atomics.bulk-mem.pic.o -o %t.shared.pic.wasm
+; RUN: obj2yaml %t.shared.pic.wasm | FileCheck %s --check-prefixes PASSIVE-PIC,PASSIVE32-PIC
+; RUN: llvm-objdump --disassemble-symbols=__wasm_call_ctors,__wasm_init_memory --no-show-raw-insn --no-leading-addr %t.shared.pic.wasm | FileCheck %s --check-prefixes DIS,PIC-DIS -DPTR=i32
 
 ;; Also test in combination with PIC/pie + wasm64
 ; RUN: wasm-ld -mwasm64 --experimental-pic -pie -no-gc-sections --no-entry --shared-memory --max-memory=131072 %t.atomics.bulk-mem.pic-mem64.o -o %t.pic-mem64.wasm
@@ -47,7 +52,7 @@
 
 @g = thread_local global i32 99, align 4
 
-; ERROR: 'bulk-memory' feature must be used in order to use shared memory
+;; ERROR: 'bulk-memory' feature must be used in order to use shared memory
 
 ; ACTIVE-LABEL: - Type:            CODE
 ; ACTIVE-NEXT:    Functions:
@@ -76,6 +81,20 @@
 ; ACTIVE-NEXT:      - Index:           0
 ; ACTIVE-NEXT:        Name:            __wasm_call_ctors
 
+;; In ACTIVE-PIC mode the memory is imported which means all data segments
+;; (except BSS) are combined in the single one.
+;; BSS is not included here, and instead initialized using `memory.init` in
+;; `__wasm_init_memory`
+
+; ACTIVE-PIC:  - Type:            DATA
+; ACTIVE-PIC-NEXT:    Segments:
+; ACTIVE-PIC-NEXT:      - SectionOffset:   6
+; ACTIVE-PIC-NEXT:        InitFlags:       0
+; ACTIVE-PIC-NEXT:        Offset:
+; ACTIVE-PIC-NEXT:          Opcode:          GLOBAL_GET
+; ACTIVE-PIC-NEXT:          Index:           1
+; ACTIVE-PIC-NEXT:        Content:         63000000636F6E7374616E74000000002B00000068656C6C6F00676F6F646279650000002A000000
+
 ; PASSIVE-LABEL: - Type:            START
 ; PASSIVE-NEXT:    StartFunction:   2
 ; PASSIVE-LABEL: - Type:            DATACOUNT
@@ -151,6 +170,18 @@
 ; PASSIVE-PIC-NEXT:      - Index:           2
 ; PASSIVE-PIC-NEXT:        Name:            __wasm_init_memory
 
+;; For the non-shared PIC case the __wasm_init_memory only deals with BSS since
+;; all other segments are active
+; PIC-NON-SHARED-DIS:       <__wasm_init_memory>:
+; PIC-NON-SHARED-DIS-EMPTY:
+; PIC-NON-SHARED-DIS-NEXT:  i32.const	40
+; PIC-NON-SHARED-DIS-NEXT:  global.get	1
+; PIC-NON-SHARED-DIS-NEXT:  i32.add 
+; PIC-NON-SHARED-DIS-NEXT:  i32.const	0
+; PIC-NON-SHARED-DIS-NEXT:  i32.const	10000
+; PIC-NON-SHARED-DIS-NEXT:  memory.fill	0
+; PIC-NON-SHARED-DIS-NEXT:  end
+
 ;; no data relocations.
 ; DIS-LABEL:       <__wasm_call_ctors>:
 ; DIS-EMPTY:
diff --git a/lld/wasm/OutputSections.cpp b/lld/wasm/OutputSections.cpp
index 95f7ecc29de6b..d679d1e676479 100644
--- a/lld/wasm/OutputSections.cpp
+++ b/lld/wasm/OutputSections.cpp
@@ -101,7 +101,8 @@ void DataSection::finalizeContents() {
   });
 #ifndef NDEBUG
   unsigned activeCount = llvm::count_if(segments, [](OutputSegment *segment) {
-    return (segment->initFlags & WASM_DATA_SEGMENT_IS_PASSIVE) == 0;
+    return segment->requiredInBinary() &&
+           (segment->initFlags & WASM_DATA_SEGMENT_IS_PASSIVE) == 0;
   });
 #endif
 
diff --git a/lld/wasm/Writer.cpp b/lld/wasm/Writer.cpp
index 76e38f548157c..7770bdcfc1f16 100644
--- a/lld/wasm/Writer.cpp
+++ b/lld/wasm/Writer.cpp
@@ -1081,7 +1081,12 @@ void Writer::combineOutputSegments() {
     return;
   OutputSegment *combined = make<OutputSegment>(".data");
   combined->startVA = segments[0]->startVA;
+  std::vector<OutputSegment *> newSegments = {combined};
   for (OutputSegment *s : segments) {
+    if (!s->requiredInBinary()) {
+      newSegments.push_back(s);
+      continue;
+    }
     bool first = true;
     for (InputChunk *inSeg : s->inputSegments) {
       if (first)
@@ -1100,7 +1105,7 @@ void Writer::combineOutputSegments() {
     }
   }
 
-  segments = {combined};
+  segments = newSegments;
 }
 
 static void createFunction(DefinedFunction *func, StringRef bodyContent) {

@sbc100 sbc100 changed the title [lld][WebAssembly] Honor requiredInBinary when combining data segments [lld][WebAssembly] Skip BSS when combining data segments Feb 19, 2025
@sbc100 sbc100 requested a review from brendandahl February 19, 2025 20:01
In most circumstances BSS segments are not required in the output binary
but combineOutputSegments was erroneously including them.  This meant
that PIC binaries were including the BSS data as zero in the binary.

Fixes: emscripten-core/emscripten#23683
@sbc100 sbc100 merged commit 74c6111 into llvm:main Feb 20, 2025
8 of 10 checks passed
@sbc100 sbc100 deleted the dylink_bss branch February 20, 2025 22:19
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Huge WASM binary size with -sMAIN_MODULE=1 or 2
3 participants