Stage 2 failure building generator #3

Open
simonjwright opened this issue May 28, 2024 · 31 comments

@simonjwright
Contributor

A colleague is building gcc-14.1-darwin-r0 in a Github runner (macos-14 runs on an M1-based machine) and has problems with the gen_il-main built in stage 2.

The stage1 compiler is aarch64-apple-darwin 13.2.0, which successfully builds and runs:

2024-05-27T16:01:53.6688970Z mkdir -p ada/gen_il
2024-05-27T16:01:53.6836290Z cd ada/gen_il; gnatmake -q -g -gnata -gnat2012 -gnatw.g -gnatyg -gnatU -I/Users/runner/work/GNAT-FSF-builds/GNAT-FSF-builds/sbx/aarch64-darwin/gcc/src/gcc/ada gen_il-main
[...]
2024-05-27T16:01:57.9128640Z cd ada/gen_il; ./gen_il-main
2024-05-27T16:01:58.1111820Z install.texi:261: warning: @anchor should not appear on @item line

The stage2 compiler builds the same:

2024-05-27T16:14:20.4760580Z mkdir -p ada/gen_il
2024-05-27T16:14:20.4823240Z cd ada/gen_il; gnatmake -q -g -gnata -gnat2012 -gnatw.g -gnatyg -gnatU -I/Users/runner/work/GNAT-FSF-builds/GNAT-FSF-builds/sbx/aarch64-darwin/gcc/src/gcc/ada gen_il-main

but when it’s run, this happens:

2024-05-27T16:14:24.8200660Z cd ada/gen_il; ./gen_il-main
2024-05-27T16:14:24.8266370Z dyld[74097]: Symbol not found: ___builtin_nested_func_ptr_created
2024-05-27T16:14:24.8326660Z   Referenced from: <A3324F46-453C-314E-B26B-51C847B1E704> /Users/runner/work/GNAT-FSF-builds/GNAT-FSF-builds/sbx/aarch64-darwin/gcc/build/gcc/ada/gen_il/gen_il-main
2024-05-27T16:14:24.8329370Z   Expected in:     <B3E386AD-E6E3-3A23-B4B3-C06CA85CFE57> /Users/runner/work/GNAT-FSF-builds/GNAT-FSF-builds/sbx/aarch64-darwin/gcc/build/prev-gcc/libgcc_s.1.1.dylib
2024-05-27T16:14:24.8337720Z /bin/sh: line 1: 74097 Abort trap: 6           ./gen_il-main
2024-05-27T16:14:24.8340570Z make[3]: [ada/stamp-gen_il] Error 134 (ignored)

Could the Xcode/CLT version make a difference? The runner is running macOS 14.4.1, and its default Xcode is 15.0.1.
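
A quick way to confirm what the dyld message says is to check whether a given copy of libgcc_s.1.1.dylib exports the symbol at all. The following is a minimal sketch: the helper name `exports_symbol` is made up for illustration, and it assumes the usual `nm` output format in which the symbol name is the last field on each line.

```shell
# Reads `nm -gU <dylib>` output on stdin and succeeds if the named symbol
# is among the defined external symbols.
# $1: symbol name as the linker sees it (including leading underscores)
exports_symbol() {
    awk -v sym="$1" '$NF == sym { found = 1 } END { exit (found ? 0 : 1) }'
}

# Usage on macOS (the dylib path is a placeholder):
#   nm -gU prev-gcc/libgcc_s.1.1.dylib \
#       | exports_symbol ___builtin_nested_func_ptr_created \
#       && echo "exported" || echo "missing"
```

Running it against both the host compiler's library and the one in prev-gcc would show which of the two lacks the symbol.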

@iains
Owner

iains commented May 28, 2024

I recall earlier Xcode 15 having quite a few problems (mostly with the new linker) that we worked around by configuring to use "ld-classic" .. If possible, update to Xcode 15.3 which is known to work (and has quite a few wrinkles ironed out). If that is not possible, then perhaps we can find a configure recipe that will use the 'classic' linker.

@simonjwright
Contributor Author

Our problem with 15.3 (well, CLT 15.3) is that it doesn’t provide m4 (it does provide gm4, though, so you can configure GMP with M4=gm4). I just ran the job locally with XC 15.4 (no CLT yet, though) and it succeeded.
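
The M4=gm4 workaround can be made independent of which CLT is installed by picking whichever tool is present. A minimal sketch, assuming only that one of the two names is on PATH; the helper name `first_available` is hypothetical.

```shell
# Print the first candidate that exists on PATH; fail if none does.
first_available() {
    for tool in "$@"; do
        if command -v "$tool" >/dev/null 2>&1; then
            printf '%s\n' "$tool"
            return 0
        fi
    done
    return 1
}

# Usage before configuring GMP (the configure call itself is only indicative):
#   M4=$(first_available m4 gm4) || { echo "no m4 found" >&2; exit 1; }
#   export M4
#   ./configure ...
```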

Since this compiler is being provided to users, would it make sense to go for as early an XC version as actually builds the compiler? I’m trying XC 15.1 on Github as I write ...

@iains
Owner

iains commented May 28, 2024

If you are trying to have one build that runs using several different XC installs, we are going to need to be careful - the compiler configuration determines the capabilities of the support toolchain (as, ld, dsymutil) and adjusts accordingly. Similarly, the SDK in use is relevant (since some of the SDKs require special handling).

If you are building the compiler, and providing it as a built item - then perhaps building m4 is not the end of the world.

For the record, I build GMP et al. as "in-tree" sources - they are bootstrapped along with the compiler and I have not run into trouble with XC15.1b. As noted CLT 15.0 did have (dyld-linker) problems that are show-stoppers .. so .. you need to stop and take pause about what you want to offer and how to communicate that to your users.

edit: note that generally speaking toolchains are considered to consist of all the items in use - compiler, linker, assembler, debug linker. Xcode does not try to marry an arbitrary clang version with any arbitrary linker .. so perhaps you can constrain what you offer without being seen as providing a poor solution?

@simonjwright
Contributor Author

We’re building the compiler, and providing it to our users as a built item.

I build gmp etc in-tree, my colleagues out-of-tree, for reasons. Anyway, I ran export M4=gm4 and all worked for their build (of gmp, since we still have compiler issues).

I ran on Github with XC 15.1 and 15.3 - no change.

I noticed that using their successful x86_64 build of GCC 14.1 to compile this trampoline-using program

with Ada.Text_IO;
procedure Trampo is
   type T is access procedure;
   procedure P (The_T : T) is
   begin
      The_T.all;
   end P;
   procedure A is
   begin
      Ada.Text_IO.Put_Line ("foo.");
   end A;
begin
   P (A'Access);
end Trampo;

generated a reference to libgcc_s.1.1.dylib

$ otool -L trampo
trampo:
	@rpath/libgcc_s.1.1.dylib (compatibility version 1.0.0, current version 1.1.0)
	/usr/lib/libSystem.B.dylib (compatibility version 1.0.0, current version 1345.100.2)

in spite of the fact that it didn’t need to, because of linking with -lheapt_w.

My build of GCC 14.1.0 (using the FSF release) didn’t.

I think this might be down to a specs difference: theirs has

*libgcc:
%{static-libgcc|static:						      %:version-compare(!> 10.6 mmacosx-version-min= -lgcc_eh);		     shared-libgcc|fexceptions|fobjc-exceptions|fgnu-runtime:		     -lgcc_s.1.1						      %:version-compare(!> 10.3.9 mmacosx-version-min= -lgcc_eh)		      %:version-compare(>< 10.3.9 10.5 mmacosx-version-min= -lgcc_s.10.4)       %:version-compare(>< 10.5 10.6 mmacosx-version-min= -lgcc_s.10.5)	    } -lgcc

and mine has

*libgcc:
%{static-libgcc|static:						      %:version-compare(!> 10.6 mmacosx-version-min= -lgcc_eh);		     shared-libgcc|fexceptions|fobjc-exceptions|fgnu-runtime:		     %:version-compare(!> 10.11 mmacosx-version-min= -lgcc_s.1.1)						      %:version-compare(!> 10.3.9 mmacosx-version-min= -lgcc_eh)		      %:version-compare(>< 10.3.9 10.5 mmacosx-version-min= -lgcc_s.10.4)       %:version-compare(>< 10.5 10.6 mmacosx-version-min= -lgcc_s.10.5)	    } -lgcc

(the difference is that theirs has shared-libgcc|fexceptions|fobjc-exceptions|fgnu-runtime: -lgcc_s.1.1 whereas mine has shared-libgcc|fexceptions|fobjc-exceptions|fgnu-runtime: %:version-compare(!> 10.11 mmacosx-version-min= -lgcc_s.1.1))

Why that should result in their failure to build the aarch64 compiler I don’t know.

I found some interaction between gcc/configure and gcc/config/darwin.h to do with DARWIN_AT_RPATH, which looks as though it's related to the two different specs entries above.
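
To compare the two `*libgcc:` entries without reading the whole specs dump, the entry can be pulled out of `gcc -dumpspecs` output. This is a minimal sketch; `libgcc_spec` is a made-up helper name, and it assumes the entry body is the single line following the `*libgcc:` header, as in the excerpts above.

```shell
# Reads `gcc -dumpspecs` output on stdin and prints the body of the
# "*libgcc:" entry (the line that follows the header).
libgcc_spec() {
    awk '/^\*libgcc:/ { getline; print }'
}

# Usage (compiler paths are placeholders):
#   /path/to/their/bin/gcc -dumpspecs | libgcc_spec > theirs.spec
#   /path/to/mine/bin/gcc  -dumpspecs | libgcc_spec > mine.spec
#   diff theirs.spec mine.spec
```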

@iains
Owner

iains commented May 30, 2024

getting the specs right for the various permutations of Darwin's support for DYLD_LIBRARY_PATH has been somewhat of a labour.

Since (at least what I infer from your report) it seems that you are using different versions/branches/patches (?) it's very hard for me to figure out what might be wrong.

Really, the recommendation (for people who are not using homebrew, macports etc.) would be to use the branch(es) here - and if something does not work properly (or how you need it to) report issues so we can fix (or work around) them .. once we diverge significantly it's outside of what my meagre resources can handle :) and specs are a powerful tool ..

@simonjwright
Contributor Author

Both I and my colleagues are using gcc-14.1-darwin-r0, unpatched. The build script we are both using

  • downloads a built 13.2.0 aarch64 compiler
  • downloads gmp etc sources
  • downloads gcc-14.1-darwin-r0
  • builds gmp etc
  • builds the 14.1 compiler

When run on Github, the compiler build crashes with the failure to run the gen_il-main built in stage 2, because of the problem I quoted at the top of this issue.

When run locally, the compiler build succeeds.

I’ve tried XC 15.1 locally and on Github, no change.

What I’m going to do next (after family commitments) is to compare the build logs.

After that, I’ll try to find why my own set of build scripts (again, using unpatched gcc-14.1-darwin-r0) doesn’t build gen_il-main using libgcc_s.1.1.dylib.

@iains
Owner

iains commented May 31, 2024

Both I and my colleagues are using gcc-14.1-darwin-r0, unpatched.

That the specs are different is, in that case, pretty odd.

Having said that the configuration does adjust to the capabilities of the host system - including to whatever is detected in terms of linker capabilities***

The build script we are both using

  • downloads a built 13.2.0 aarch64 compiler
  • downloads gmp etc sources
  • downloads gcc-14.1-darwin-r0
  • builds gmp etc

On the GH/colleague's version this is a separate step - where you (and I) usually just build them in-tree? (not that I expect that to make the difference).

  • builds the 14.1 compiler

When run on Github, the compiler build crashes with the failure to run the gen_il-main built in stage 2, because of the problem I quoted at the top of this issue.

Can we get the output of uname -a on the GH instance?
we might also need to get the versions of some key installed utilities - see *** below.

When run locally, the compiler build succeeds.

I’ve tried XC 15.1 locally and on Github, no change.

no change == builds locally, fails on GH?

What I’m going to do next (after family commitments) is to compare the build logs.

  • configure logs would, perhaps be more revealing.

  • also the difference between gcc/auto-host.h on the two systems.

After that, I’ll try to find why my own set of build scripts (again, using unpatched gcc-14.1-darwin-r0) doesn’t build gen_il-main using libgcc_s.1.1.dylib.

That could be quite an involved project - now I know you are starting from the same point - the thing is to figure out why the configure / build thinks that there's a difference.

===

*** some configure functionality depends on the installed utilities (e.g. gawk, objdump, otool, etc.); further to that, the linker used by the build compiler can be relevant.

Those are areas where there could be deviation between the environments.

I am understanding (hopefully) that the failing gen_il-main is built by the stage #1 compiler (and not the host/bootstrap one)?
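
For the gcc/auto-host.h comparison suggested earlier in this comment, it helps to strip comments and ordering noise first. A minimal sketch; `config_macros` is a hypothetical helper, and the file locations in the usage lines are placeholders.

```shell
# Print the #define / #undef lines of a config header, sorted, so that two
# builds can be compared without noise from comments and ordering.
# $1: path to the header
config_macros() {
    grep -E '^#[[:space:]]*(define|undef)[[:space:]]' "$1" | sort
}

# Usage:
#   config_macros gh-build/gcc/auto-host.h    > gh.macros
#   config_macros local-build/gcc/auto-host.h > local.macros
#   diff gh.macros local.macros
```

Macros that configure left undefined appear in the header as commented-out `#undef` lines, so they are skipped here and show up in the diff only as a missing `#define`.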

@iains
Owner

iains commented May 31, 2024

if it is possible, please could you post the output of:

otool -lv gcc/ada/gen_il/gen_il-main |grep -A3 LC_RPA

from the failing case ..

.. IIUC, this is a $build tool (i.e. intended to run on the $build system which can be different from the $host one) and should be being built with the bootstrap/build-system compiler.

edit: there are several Ada build-time tools.

@iains
Owner

iains commented May 31, 2024

please also post your configure lines (for both cases). I just took a look at my gcc-14.1 build and gen_il-main seems to be correctly built with the bootstrap/build-system compiler at each stage.

@simonjwright
Contributor Author

I’m going to be away for a few days, but in the mean time

  • for their build on Github,
../src/configure
--prefix=/Users/runner/work/GNAT-FSF-builds/GNAT-FSF-builds/sbx/aarch64-darwin/gcc/install
--enable-languages=c,ada,c++
--enable-libstdcxx
--enable-libstdcxx-threads
--enable-libada
--disable-nls
--without-libiconv-prefix
--disable-libstdcxx-pch
--enable-lto
--disable-multilib
--disable-libcilkrts
--without-build-config
--with-build-sysroot=/Applications/Xcode.app/Contents/Developer/Platforms/MacOSX.platform/Developer/SDKs/MacOSX.sdk
'--with-specs=%{!sysroot=*:--sysroot=%:if-exists-else(/Applications/Xcode.app/Contents/Developer/Platforms/MacOSX.platform/Developer/SDKs/MacOSX.sdk /Library/Developer/CommandLineTools/SDKs/MacOSX.sdk)}'
--with-mpfr=/Users/runner/work/GNAT-FSF-builds/GNAT-FSF-builds/sbx/aarch64-darwin/mpfr/install
--with-gmp=/Users/runner/work/GNAT-FSF-builds/GNAT-FSF-builds/sbx/aarch64-darwin/gmp/install
--with-mpc=/Users/runner/work/GNAT-FSF-builds/GNAT-FSF-builds/sbx/aarch64-darwin/mpc/install
--build=aarch64-apple-darwin23.5.0
  • for my local build using my scripts,
/Volumes/Miscellaneous3/src/gcc-14-branch/configure
--prefix=/Volumes/Miscellaneous3/aarch64/gcc-14.1.0-aarch64
--without-libiconv-prefix
--disable-libmudflap
--disable-libstdcxx-pch
--disable-libsanitizer
--disable-libcc1
--disable-libcilkrts
--disable-multilib
--disable-nls
--enable-languages=c,c++,ada
--host=aarch64-apple-darwin21
--target=aarch64-apple-darwin21
--build=aarch64-apple-darwin21
--without-isl
--with-build-sysroot=/Library/Developer/CommandLineTools/SDKs/MacOSX.sdk
--with-sysroot=
--with-specs='%{!sysroot=*:--sysroot=%:if-exists-else(/Applications/Xcode.app/Contents/Developer/Platforms/MacOSX.platform/Developer/SDKs/MacOSX.sdk /Library/Developer/CommandLineTools/SDKs/MacOSX.sdk)}'
--with-bugurl=https://github.com/simonjwright/building-gcc-macos-native
--enable-bootstrap
--enable-host-pie
CFLAGS=-Wno-deprecated-declarations
CXXFLAGS=-Wno-deprecated-declarations
  • for the local build using their scripts,
$ otool -lv gcc/ada/gen_il/gen_il-main | grep -A3 LC_RPA
          cmd LC_RPATH
      cmdsize 32
         path @loader_path (offset 12)
Load command 17
          cmd LC_RPATH
      cmdsize 136
         path /Users/simon/Developer/GNAT-FSF-builds/sbx/aarch64-darwin/base_gcc/install/lib/gcc/aarch64-apple-darwin23.2.0/13.2.0 (offset 12)
Load command 18
          cmd LC_RPATH
      cmdsize 96
         path /Users/simon/Developer/GNAT-FSF-builds/sbx/aarch64-darwin/base_gcc/install/lib/gcc (offset 12)
Load command 19
          cmd LC_RPATH
      cmdsize 32
         path /opt/homebrew/lib (offset 12)
Load command 20
          cmd LC_RPATH
      cmdsize 96
         path /Users/simon/Developer/GNAT-FSF-builds/sbx/aarch64-darwin/base_gcc/install/lib (offset 12)
Load command 21
$

(the LC_RPATH refers to the host compiler).

  • for my local build using my scripts,
$ otool -lv gcc/ada/gen_il/gen_il-main | grep -A3 LC_RPA
$

(i.o.w no LC_RPATH entries at all).
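
The `otool -l` output above can be reduced to just the rpath entries, which makes the two builds easier to compare side by side. A minimal sketch; `rpaths` is a made-up helper name, and it assumes paths contain no spaces.

```shell
# Reads `otool -l <binary>` output on stdin and prints one LC_RPATH path
# per line, in load-command order.
rpaths() {
    awk '$1 == "cmd" && $2 == "LC_RPATH" { in_rpath = 1; next }
         $1 == "cmd"                      { in_rpath = 0 }
         in_rpath && $1 == "path"         { print $2; in_rpath = 0 }'
}

# Usage on macOS:
#   otool -l gcc/ada/gen_il/gen_il-main | rpaths
```

An empty result corresponds to the "no LC_RPATH entries at all" case.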

@simonjwright
Contributor Author

I am understanding (hopefully) that the failing gen_il-main is built by the stage #1 compiler (and not the host/bootstrap one)?

Yes, it’s the one built in stage 2 by the compiler that was built by the host compiler in stage 1

@iains
Owner

iains commented May 31, 2024

I am understanding (hopefully) that the failing gen_il-main is built by the stage #1 compiler (and not the host/bootstrap one)?

Yes, it’s the one built in stage 2 by the compiler that was built by the host compiler in stage 1

but I don't think that is right .. it is an exe to run on the build system - it should be built with XXX_FOR_BUILD (which is effectively XXX_FOR_HOST when bootstrapping). I will have to see from your configure lines what is confusing the ada build into using the stageN compiler to build for $build (it does not happen for my configure lines, that exe is built with the bootstrap/build-system compiler in each stage)

@iains
Owner

iains commented May 31, 2024

BTW : https://gcc.gnu.org/bugzilla/show_bug.cgi?id=79885 is the reason that I avoid --with-build-sysroot= ...

I see both configure lines have a bunch of (what I believe are) unnecessary configure flags - it is important to be very clear about why any non-standard flag is added - the configure script is supposed to get it right for the platform; if it is failing we should fix it and then remove any work-around.

@iains
Owner

iains commented May 31, 2024

I would also suggest :
--build=aarch64-apple-darwin2? (edited)
for both cases

  • $build will be copied to $host and then to $target if neither is provided.
  • building for aarch64-apple-darwin23.5.0 is in conflict with the idea that exes should be ABI-compatible with any darwin23 version. That might be upsetting something.
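
One reading of the suggestion above is to pass a triple that carries only the major Darwin version. A minimal sketch of deriving it from a full triple; `darwin_major_triple` is a hypothetical helper, not something provided by the GCC tree.

```shell
# Reduce a Darwin triple such as aarch64-apple-darwin23.5.0 to its major
# kernel version only (aarch64-apple-darwin23). Other input is unchanged.
darwin_major_triple() {
    printf '%s\n' "$1" | sed -E 's/(darwin[0-9]+)(\.[0-9]+)*$/\1/'
}

# Usage (config.guess ships in the top level of the GCC source tree):
#   BUILD=$(darwin_major_triple "$(../src/config.guess)")
#   ../src/configure --build="$BUILD" ...
```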

I like your creative specs for the sysroot; but what would be better is an upstream BZ that says what the problem is so that, maybe, we can find a more efficient solution than checking for each invocation.

You are on different OS versions, it seems .. although TBH, I'd hope that does not affect things too much.

Are you using the single additional fix posted to the gcc-14-1-darwin branch?
75ff8c3

that could, potentially fix issues with mishandling the SDK (although I'd expect a different kind of fail from it .. so maybe not relevant).

edit2: note also that gnatmake et al. do not accept a --sysroot option .. so that has to be done at a lower level (hopefully, it is working OK).

@simonjwright
Contributor Author

I was wrong about the compiler used to build gen_il-main - it’s the host compiler (13.2.0 in these builds).

I tried the 14.1.1 patch, no change.

I’ve done my local builds in a new account running zsh without any .zshrc etc.

uname -a -

GH, Darwin Mac-1717606620560.local 23.5.0 Darwin Kernel Version 23.5.0: Wed May 1 20:12:39 PDT 2024; root:xnu-10063.121.3~5/RELEASE_ARM64_VMAPPLE arm64

Mine, Darwin ramoth.local 23.5.0 Darwin Kernel Version 23.5.0: Wed May 1 20:16:51 PDT 2024; root:xnu-10063.121.3~5/RELEASE_ARM64_T8103 arm64

The reason for the different specs is almost certainly that I export MACOSX_DEPLOYMENT_TARGET=12, which has the same effect as --disable-darwin-at-rpath -- a BZ issue? Anyway, not relevant here, so I’ll just discuss building on GH vs building locally.

In both GH and local, the second build of gen_il-main gives (modulo the leading path)

otool -L

otool -L ada/gen_il/gen_il-main
ada/gen_il/gen_il-main:
	@rpath/libgcc_s.1.1.dylib (compatibility version 1.0.0, current version 1.1.0)
	/usr/lib/libSystem.B.dylib (compatibility version 1.0.0, current version 1336.0.0)

otool -l gen_il-main

otool -l ada/gen_il/gen_il-main | grep -A3 LC_RPATH
          cmd LC_RPATH
      cmdsize 32
         path @loader_path (offset 12)
Load command 17
          cmd LC_RPATH
      cmdsize 144
         path /Users/runner/work/GNAT-FSF-builds/GNAT-FSF-builds/sbx/aarch64-darwin/base_gcc/install/lib/gcc/aarch64-apple-darwin23.2.0/13.2.0 (offset 12)
Load command 18
          cmd LC_RPATH
      cmdsize 112
         path /Users/runner/work/GNAT-FSF-builds/GNAT-FSF-builds/sbx/aarch64-darwin/base_gcc/install/lib/gcc (offset 12)
Load command 19
          cmd LC_RPATH
      cmdsize 104
         path /Users/runner/work/GNAT-FSF-builds/GNAT-FSF-builds/sbx/aarch64-darwin/base_gcc/install/lib (offset 12)
Load command 20

.. is the first @loader_path OK? aside from that, looks fine. The last of those should have picked up the libgcc_s.1.1.dylib we need from the host compiler, which presumably it does in stage 1, but in stage 2:

cd ada/gen_il; ./gen_il-main
dyld[75493]: Symbol not found: ___builtin_nested_func_ptr_created
  Referenced from: <A3324F46-453C-314E-B26B-51C847B1E704> /Users/runner/work/GNAT-FSF-builds/GNAT-FSF-builds/sbx/aarch64-darwin/gcc/build/gcc/ada/gen_il/gen_il-main
  Expected in:     <B3E386AD-E6E3-3A23-B4B3-C06CA85CFE57> /Users/runner/work/GNAT-FSF-builds/GNAT-FSF-builds/sbx/aarch64-darwin/gcc/build/prev-gcc/libgcc_s.1.1.dylib
/bin/sh: line 1: 75493 Abort trap: 6           ./gen_il-main

... where it’s trying to pick it up from gcc/build/prev-gcc, i.e. the new build.
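
To see which copy of libgcc_s.1.1.dylib dyld actually loads, the tool can be run with DYLD_PRINT_LIBRARIES=1 (a dyld debugging variable described in the dyld man page) and the output filtered. A minimal sketch; `loaded_libgcc_s` is a made-up helper name, and it assumes each loaded image is reported on its own line with the path as the last field.

```shell
# Reads dyld's library-load trace on stdin and prints the path of every
# loaded image whose name mentions libgcc_s.
loaded_libgcc_s() {
    awk '/libgcc_s/ { print $NF }'
}

# Usage on macOS, from the build's gcc directory:
#   (cd ada/gen_il && DYLD_PRINT_LIBRARIES=1 ./gen_il-main 2>&1) | loaded_libgcc_s
```

If the printed path is under prev-gcc rather than the host compiler's install tree, something in the build environment is taking precedence over the LC_RPATH entries.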

@iains
Owner

iains commented Jun 5, 2024

I was wrong about the compiler used to build gen_il-main - it’s the host compiler (13.2.0 in these builds).

I tried the 14.1.1 patch, no change.

The reason for the different specs is almost certainly that I export MACOSX_DEPLOYMENT_TARGET=12, which has the same effect as --disable-darwin-at-rpath -- a BZ issue?

That is not correct, indeed, @rpath is needed for correct operation on any OS >= 10.11 (which includes 12) .. that is a bug - please could you try with MACOSX_DEPLOYMENT_TARGET=12.0 to see if that helps narrow down the issue.

I'll look at the rest of the report later / tomorrow.

@iains
Owner

iains commented Jun 6, 2024

I was wrong about the compiler used to build gen_il-main - it’s the host compiler (13.2.0 in these builds).

I tried the 14.1.1 patch, no change.

I’ve done my local builds in a new account running zsh without any .zshrc etc.

uname -a -

GH, Darwin Mac-1717606620560.local 23.5.0 Darwin Kernel Version 23.5.0: Wed May 1 20:12:39 PDT 2024; root:xnu-10063.121.3~5/RELEASE_ARM64_VMAPPLE arm64

Mine, Darwin ramoth.local 23.5.0 Darwin Kernel Version 23.5.0: Wed May 1 20:16:51 PDT 2024; root:xnu-10063.121.3~5/RELEASE_ARM64_T8103 arm64

The reason for the different specs is almost certainly that I export MACOSX_DEPLOYMENT_TARGET=12, which has the same effect as --disable-darwin-at-rpath -- a BZ issue? Anyway, not relevant here, so I’ll just discuss building on GH vs building locally.

In both GH and local, the second build of gen_il-main gives (modulo the leading path)

otool -L

otool -L ada/gen_il/gen_il-main
ada/gen_il/gen_il-main:
	@rpath/libgcc_s.1.1.dylib (compatibility version 1.0.0, current version 1.1.0)
	/usr/lib/libSystem.B.dylib (compatibility version 1.0.0, current version 1336.0.0)

otool -l gen_il-main

otool -l ada/gen_il/gen_il-main | grep -A3 LC_RPATH
          cmd LC_RPATH
      cmdsize 32
         path @loader_path (offset 12)
Load command 17
          cmd LC_RPATH
      cmdsize 144
         path /Users/runner/work/GNAT-FSF-builds/GNAT-FSF-builds/sbx/aarch64-darwin/base_gcc/install/lib/gcc/aarch64-apple-darwin23.2.0/13.2.0 (offset 12)
Load command 18
          cmd LC_RPATH
      cmdsize 112
         path /Users/runner/work/GNAT-FSF-builds/GNAT-FSF-builds/sbx/aarch64-darwin/base_gcc/install/lib/gcc (offset 12)
Load command 19
          cmd LC_RPATH
      cmdsize 104
         path /Users/runner/work/GNAT-FSF-builds/GNAT-FSF-builds/sbx/aarch64-darwin/base_gcc/install/lib (offset 12)
Load command 20

.. is the first @loader_path OK? aside from that, looks fine.

yes that allows libraries to find dependents that are co-installed (actually not necessary for libgcc_s.1.1.dylib since it's a leaf - but also should be harmless).

The last of those should have picked up the libgcc_s.1.1.dylib we need from the host compiler, which presumably it does in stage 1, but in stage 2:

this looks right and, for my builds (the use of GCC-11.4 c.f. 13.2 should be irrelevant):

$ otool -lv gcc/ada/gen_il/gen_il-main |grep -A3 LC_RP
          cmd LC_RPATH
      cmdsize 32
         path @loader_path (offset 12)
Load command 17
          cmd LC_RPATH
      cmdsize 96
         path /opt/iains/x86_64-apple-darwin23/gcc-11-4Dr1/lib/gcc/x86_64-apple-darwin23/11.4.0 (offset 12)
Load command 18
          cmd LC_RPATH
      cmdsize 64
         path /opt/iains/x86_64-apple-darwin23/gcc-11-4Dr1/lib (offset 12)
Load command 19

$ otool -lv prev-gcc/ada/gen_il/gen_il-main |grep -A3 LC_RP
          cmd LC_RPATH
      cmdsize 32
         path @loader_path (offset 12)
Load command 17
          cmd LC_RPATH
      cmdsize 96
         path /opt/iains/x86_64-apple-darwin23/gcc-11-4Dr1/lib/gcc/x86_64-apple-darwin23/11.4.0 (offset 12)
Load command 18
          cmd LC_RPATH
      cmdsize 64
         path /opt/iains/x86_64-apple-darwin23/gcc-11-4Dr1/lib (offset 12)
Load command 19

$ otool -lv stage1-gcc/ada/gen_il/gen_il-main |grep -A3 LC_RP
          cmd LC_RPATH
      cmdsize 32
         path @loader_path (offset 12)
Load command 17
          cmd LC_RPATH
      cmdsize 96
         path /opt/iains/x86_64-apple-darwin23/gcc-11-4Dr1/lib/gcc/x86_64-apple-darwin23/11.4.0 (offset 12)
Load command 18
          cmd LC_RPATH
      cmdsize 64
         path /opt/iains/x86_64-apple-darwin23/gcc-11-4Dr1/lib (offset 12)
Load command 19

So .. as you can see, the $build system compiler is used in each case and correctly adds its rpaths.

cd ada/gen_il; ./gen_il-main
dyld[75493]: Symbol not found: ___builtin_nested_func_ptr_created
  Referenced from: <A3324F46-453C-314E-B26B-51C847B1E704> /Users/runner/work/GNAT-FSF-builds/GNAT-FSF-builds/sbx/aarch64-darwin/gcc/build/gcc/ada/gen_il/gen_il-main
  Expected in:     <B3E386AD-E6E3-3A23-B4B3-C06CA85CFE57> /Users/runner/work/GNAT-FSF-builds/GNAT-FSF-builds/sbx/aarch64-darwin/gcc/build/prev-gcc/libgcc_s.1.1.dylib
/bin/sh: line 1: 75493 Abort trap: 6           ./gen_il-main

Certainly, this is wrong - and, I think, is the result of disabling darwin-at-rpath (although how that is affecting what the $build system compiler is doing is 'interesting').

We need to figure out why but, in the short-term how about confirming this with --enable-darwin-at-rpath in the configure options?

MACOSX_DEPLOYMENT_TARGET is, unfortunately, somewhat "grape shot" in that, if you export it from the top level build - it affects every compiler in use. IFF you are trying to bootstrap a compiler targeting macOS N-M on macOS N, really there are several steps that should be taken, and we should discuss that outside this issue.
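
The "grape shot" effect comes from exporting the variable at the top level. As a small illustration only (it does not cover the multi-step cross-version bootstrap mentioned above), the variable can be scoped to a single command instead. The helper name `with_deployment_target` is hypothetical.

```shell
# Run one command with MACOSX_DEPLOYMENT_TARGET set, without exporting it
# to the rest of the calling script.
# $1: deployment target; remaining arguments: the command to run
with_deployment_target() {
    _mdt=$1
    shift
    env MACOSX_DEPLOYMENT_TARGET="$_mdt" "$@"
}

# Usage (the compile command is a placeholder):
#   with_deployment_target 12.0 gcc -c foo.c
```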

@iains
Owner

iains commented Jun 6, 2024

hmmm I also wonder if this is an odd interaction between gnatmake and the external compiler.

supposing that the gnatmake used to build gen_il is not the $build one, but happens to be the "just built" one .. then perhaps it is passing inappropriate flags to the called GCC version (which is $build) ... it has been the case in the past that unguarded "gnatmake" has been used in Ada build recipes - I have fixed some instances (to make sure that they use GNATMAKE_FOR_BUILD/HOST etc) .. but maybe missed some still .....

@iains
Owner

iains commented Jun 6, 2024

one other - possibly tangential - point.

It seems that GH runners have several versions of Xcode installed - but default to the earliest (and broken for GCC) version. Apparently one can use sudo xcode-select ... to pick a known good one.
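
A minimal sketch of selecting a specific Xcode on a runner. The /Applications/Xcode_<version>.app naming is an assumption based on how GitHub's macOS runner images are usually laid out, and `xcode_app_path` is a made-up helper; check what is actually installed before relying on it.

```shell
# Build the (assumed) install path of a given Xcode version on a runner.
# $1: Xcode version, e.g. 15.3
xcode_app_path() {
    printf '/Applications/Xcode_%s.app\n' "$1"
}

# Usage in a workflow step:
#   ls -d /Applications/Xcode*                       # see what is installed
#   sudo xcode-select -s "$(xcode_app_path 15.3)"    # switch the active Xcode
#   xcode-select -p                                  # print the active developer dir
```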

simonjwright added a commit to simonjwright/building-gcc-macos-native that referenced this issue Jun 7, 2024
This build is to explore iains/gcc-14-branch#3;
it's a minimally-modified GCC (OK, the MACOSX_DEPLOYMENT_TARGET update:)
to be used for building the problematic 14.1.0.

  * common.sh (MACOSX_DEPLOYMENT_TARGET): 12.0.
  * gcc.sh (GCC_SRC): $SRC_PATH/gcc-13-branch.
@simonjwright
Contributor Author

For info - may be irrelevant - the host compiler (13.2.0) included a shim to link using ld-classic if found. This was largely because of the XC 15.0 issue, but also there was the exception handling issue fixed in GCC 13.3.0 (at least in Iains's version) and 14.1.0.

I've just

  • built GCC 13.3.0 without the above mentioned linker shim, regrettably with MACOSX_DEPLOYMENT_TARGET=12.0,
  • run the 14.1.0 build (with the 14.1.1 patch) with Xcode 15.3 and that 13.3.0 as the host compiler, regrettably still with host/build/target aarch64-apple-darwin23.5.0.

Things are greatly improved, because I now get to this:

2024-06-08T12:05:10.1363350Z Comparing stages 2 and 3
2024-06-08T12:05:17.4905580Z Bootstrap comparison failure!
2024-06-08T12:05:17.4918740Z aarch64-apple-darwin23.5.0/libstdc++-v3/src/.libs/libstdc++.6.dylib-master.o differs

I found this comment, so I'll have another go without --without-build-config.

@iains
Owner

iains commented Jun 8, 2024

I found this comment, so I'll have another go without --without-build-config.

Indeed my reaction was the same in this case too "what problem is that solving?"
... (I will keep repeating the mantra ... "do not add configure options unless you know why and what problem they are solving". We see more than a few cases where some workaround has been cargo-culted forward many times beyond when it was necessary... usually with unintended consequences).

@simonjwright
Contributor Author

The note in their script says this was because of BZ 100340 -- which is RESOLVED FIXED.

And removing it has resulted in a successful build!!

Would you expect a compiler built against XC 15.n to run correctly under XC 15.{m < n}? There doesn’t seem to be a problem with XC 15.{m > n}.

Is it worth trying to find which change actually fixed this problem? (it would be quite tedious)

@iains
Owner

iains commented Jun 8, 2024

The note in their script says this was because of [BZ 100340]
And removing it has resulted in a successful build!!

great!

Would you expect a compiler built against XC 15.n to run correctly under XC 15.{m < n}? There doesn’t seem to be a problem with XC 15.{m > n}.

This is not a reasonable scheme - the compiler is configured to use the facilities of 15.3 (which includes a working linker) - if you put it on a system with 15.0 installed, you are making it use a broken linker .. this will end badly :)

(even apart from that) We (the volunteer devs) are an extremely limited resource.
Even CPU-cycles-wise testing the permutations is infeasible.
AFAIU, it's possible to configure a GH runner to use 15.3 ..

.. until we qualify a 15.4 CLT (which does not even appear to be released yet) - I think we have to say that the requirement is 15.3 (there are known bugs with earlier CLT versions).

FWIW - I would love to have time to validate and include our own "binutils" so that these issues go away (or at least become ones we can solve) .. but that's yet another thing limited by resources

Homebrew has similar policies - so I do not think your users should be too surprised - as previously noted even Xcode does not support this kind of mix and match - a toolchain is an incredibly complex entity with many moving parts .. it needs to be tested as a whole :)

@simonjwright
Contributor Author

(I will keep repeating the mantra ... "do not add configure options unless you know why and what problem they are solving". We see more than a few cases where some workaround has been cargo-culted forward many times beyond when it was necessary... usually with unintended consequences)

My previous build script was

$GCC_SRC/configure                                                       \
    --prefix=$PREFIX                                                     \
    --without-libiconv-prefix                                            \
    --disable-libmudflap                                                 \
    --disable-libstdcxx-pch                                              \
    --disable-libsanitizer                                               \
    --disable-libcc1                                                     \
    --disable-libcilkrts                                                 \
    --disable-multilib                                                   \
    --disable-nls                                                        \
    --enable-languages=c,c++,ada                                         \
    --host=$BUILD                                                        \
    --target=$BUILD                                                      \
    --build=$BUILD                                                       \
    --without-isl                                                        \
    --with-build-sysroot=$SDKROOT                                        \
    --with-sysroot=                                                      \
    --with-specs="%{!sysroot=*:--sysroot=%:if-exists-else($XCODE $CLT)}" \
    --with-bugurl=$BUGURL                                                \
    --$BOOTSTRAP-bootstrap                                               \
    --enable-host-pie                                                    \
    CFLAGS=-Wno-deprecated-declarations                                  \
    CXXFLAGS=-Wno-deprecated-declarations

I just tried

$GCC_SRC/configure                                                       \
    --prefix=$PREFIX                                                     \
    --enable-languages=c,c++,ada                                         \
    --build=$BUILD                                                       \
    --with-build-sysroot=$SDKROOT                                        \
    --with-sysroot=                                                      \
    --with-specs="%{!sysroot=*:--sysroot=%:if-exists-else($XCODE $CLT)}" \
    --with-bugurl=$BUGURL                                                \
    --$BOOTSTRAP-bootstrap

with 14.1.0 (r1) : complete success!

One surprising thing: libgcc was generated as libgcc.so, a "Mach-O 64-bit bundle arm64".

@simonjwright
Contributor Author

I think we're done here. Thanks for the help!

@iains
Owner

iains commented Jun 10, 2024

--with-build-sysroot=$SDKROOT is still broken (per the BZ I referenced earlier) - I am not sure why you are using it .. it does not solve any problems, only adds to them... [agreed it would be nice to fix it ..] but...

IFF you use SDKROOT [environment] the compiler should honour that.
IFF you use --with-sysroot=/path/to/sdk [configure] then

  • the built compiler will use that by default .. but...
  • SDKROOT [environment] will override it (i.e. behave the same as clang) ... and...
  • --sysroot= will override that if the user puts it on the command line

I think using --with-build-sysroot= is defeating the fixincludes process and we cannot, unfortunately, omit fixincludes for any so-far released SDK ..
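For illustration only, the intended precedence in the list above can be modelled in a few lines of shell. This is a sketch of the described behaviour, not the GCC driver's code; the function name `resolve_sysroot` and its argument order are hypothetical.

```shell
#!/bin/sh
# Hypothetical model of the intended sysroot precedence (NOT GCC driver code).
#   $1 = value given with --sysroot= on the command line (may be empty)
#   $2 = value of the SDKROOT environment variable (may be empty)
#   $3 = default configured with --with-sysroot= (may be empty)
resolve_sysroot() {
    if [ -n "$1" ]; then
        printf '%s\n' "$1"      # an explicit --sysroot= wins
    elif [ -n "$2" ]; then
        printf '%s\n' "$2"      # otherwise SDKROOT overrides the default
    else
        printf '%s\n' "$3"      # otherwise the configured default is used
    fi
}

# Example: nothing on the command line, SDKROOT unset, so the default applies.
resolve_sysroot "" "" "/path/to/sdk"
```

The three arguments are ordered from highest to lowest priority, so the first non-empty one is the result.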

One surprising thing: libgcc was generated as libgcc.so, a "Mach-O 64-bit bundle arm64".

That is surprising .. and I expect it is also broken in use... I would like to repeat this, if possible - I fear that libtool is getting confused somehow - into thinking this is not a Darwin system...

--$BOOTSTRAP-bootstrap: what values does $BOOTSTRAP take?

What other environment variables are set (e.g. SDKROOT, MACOSX_DEPLOYMENT_TARGET)?

@iains iains reopened this Jun 10, 2024
@simonjwright
Contributor Author

--with-build-sysroot=$SDKROOT is still broken (per the BZ I referenced earlier) - I am not sure why you are using it .. it does not solve any problems, only adds to them... [agreed it would be nice to fix it ..] but...

I can't find a BZ about this?

Current working setup:

$GCC_SRC/configure                                                       \
    --prefix=$PREFIX                                                     \
    --enable-languages=c,c++,ada                                         \
    --build=$BUILD                                                       \
    --with-sysroot=$SDKROOT                                              \
    --with-specs="%{!sysroot=*:--sysroot=%:if-exists-else($XCODE $CLT)}" \
    --with-bugurl=$BUGURL                                                \
    --$BOOTSTRAP-bootstrap

with
BUILD = aarch64-apple-darwin21
SDKROOT = /Library/Developer/CommandLineTools/SDKs/MacOSX.sdk (15.3)
XCODE = /Applications/Xcode.app/Contents/Developer/Platforms/MacOSX.platform/Developer/SDKs/MacOSX.sdk
CLT = /Library/Developer/CommandLineTools/SDKs/MacOSX.sdk
BOOTSTRAP = enable (default, could be disable)
MACOSX_DEPLOYMENT_TARGET = 12 (I set it to 12.0 in a different branch & forgot to copy here)
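As a small sketch of how the `--$BOOTSTRAP-bootstrap` argument expands for the two values listed above: the helper name `bootstrap_flag` is hypothetical and is not part of the build script in this thread. It only produces `--enable-bootstrap` or `--disable-bootstrap` and rejects anything else.

```shell
#!/bin/sh
# Hypothetical helper: map BOOTSTRAP (enable|disable) to the configure flag.
# Defaults to "enable" when no value is given, matching the default noted above.
bootstrap_flag() {
    value="${1:-enable}"
    case "$value" in
        enable|disable)
            printf '%s\n' "--${value}-bootstrap"
            ;;
        *)
            echo "BOOTSTRAP must be 'enable' or 'disable', got '$value'" >&2
            return 1
            ;;
    esac
}

bootstrap_flag enable
bootstrap_flag disable
```

Validating the value up front avoids passing an unrecognised option such as `--yes-bootstrap` to configure.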

IFF you use SDKROOT [environment] the compiler should honour that.
IFF you use --with-sysroot=/path/to/sdk [configure] then

  • the built compiler will use that by default .. but...
  • SDKROOT [environment] will override it (i.e. behave the same as clang) ... and...
  • --sysroot= will override that if the user puts it on the command line

I think using --with-build-sysroot= is defeating the fixincludes process and we cannot, unfortunately, omit fixincludes for any so-far released SDK ..

Having built as above,

  • The setting of SDKROOT when using the built compiler doesn't alter matters (that is, when I renamed the XC and CLT directories, a C compilation failed because of looking for includes in /usr/include, and setting SDKROOT to the new directory made no difference).
  • The --with-specs setting worked as expected.
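For readers unfamiliar with the spec used in `--with-specs`, here is a rough shell model of what `%:if-exists-else($XCODE $CLT)` selects: the first path if it exists, otherwise the second. The function `if_exists_else` below is a hypothetical stand-in written for illustration; the real selection happens inside the GCC driver, and only when no `--sysroot=` option was given.

```shell
#!/bin/sh
# Hypothetical shell model of the spec function %:if-exists-else(A B):
# print A if that path exists, otherwise print B.
if_exists_else() {
    if [ -e "$1" ]; then
        printf '%s\n' "$1"
    else
        printf '%s\n' "$2"
    fi
}

# Paths as given in the setup above.
XCODE=/Applications/Xcode.app/Contents/Developer/Platforms/MacOSX.platform/Developer/SDKs/MacOSX.sdk
CLT=/Library/Developer/CommandLineTools/SDKs/MacOSX.sdk

# The option the spec would add when the user gave no --sysroot= of their own.
echo "--sysroot=$(if_exists_else "$XCODE" "$CLT")"
```

On a machine without Xcode installed this prints the Command Line Tools path, including when that path does not exist either, since the second argument is returned without being checked.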

One surprising thing: libgcc was generated as libgcc.so, a "Mach-O 64-bit bundle arm64".

So sorry, it was actually libcc1.so. libgcc_s.1.1.dylib was "Mach-O universal binary with 1 architecture: [arm64:Mach-O 64-bit dynamically linked shared library arm64]".

libcc1.so is the one that's "Mach-O 64-bit bundle arm64"

@iains
Owner

iains commented Jun 11, 2024

--with-build-sysroot=$SDKROOT is still broken (per the BZ I referenced earlier) - I am not sure why you are using it .. it does not solve any problems, only adds to them... [agreed it would be nice to fix it ..] but...

I can't find a BZ about this?

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=79885

So sorry, it was actually libcc1.so. libgcc_s.1.1.dylib was "Mach-O universal binary with 1 architecture: [arm64:Mach-O 64-bit dynamically linked shared library arm64]".

libcc1.so is the one that's "Mach-O 64-bit bundle arm64"

That's expected, it's a plugin.

  • Actually, macOS does not really care about the suffix (although whatever loads the plugin will care about the filename); it's more a question of convention. The user of this one is GDB, and I don't think many folks are using GDB on macOS (I do have a build for x86_64, but there's no arm64 port yet).

@iains
Owner

iains commented Jul 20, 2024

I have now seen this problem reported another time, but with no luck at pinning down which configure options or environment settings were triggering it.

What is your bootstrap gnat version? (i.e. the one on $build, which I guess is the same as $host).

iains pushed a commit that referenced this issue Jul 21, 2024
Here during overload resolution we have two strictly viable ambiguous
candidates #1 and #2, and two non-strictly viable candidates #3 and #4
which we hold on to ever since r14-6522.  These latter candidates have
an empty second arg conversion since the first arg conversion was deemed
bad, and this trips up joust when called on #3 and #4 which assumes all
arg conversions are there.

We can fix this by making joust robust to empty arg conversions, but in
this situation we shouldn't need to compare #3 and #4 at all given that
we have a strictly viable candidate.  To that end, this patch makes
tourney shortcut considering non-strictly viable candidates upon
encountering ambiguity between two strictly viable candidates (taking
advantage of the fact that the candidates list is sorted according to
viability via splice_viable).

	PR c++/115239

gcc/cp/ChangeLog:

	* call.cc (tourney): Don't consider a non-strictly viable
	candidate as the champ if there was ambiguity between two
	strictly viable candidates.

gcc/testsuite/ChangeLog:

	* g++.dg/overload/error7.C: New test.

Reviewed-by: Jason Merrill <[email protected]>
(cherry picked from commit 7fed7e9bbc57d502e141e079a6be2706bdbd4560)
@iains
Owner

iains commented Jul 21, 2024

@iains
Owner

iains commented Jul 21, 2024

see also : https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116021 (which is about unpatched trunk) - so this is almost certainly not something to do with additional patches on the branch.
