update AMPI and ROMIO to be compatible gcc14 for spack #3866

Open
wants to merge 2 commits into
base: main


ericjbohm
Contributor

Fix a GCC 14 "bug" in the ROMIO configure test by giving the main function an explicit return type of int.

Reconcile incompatible function prototypes with the ones in the MPI standard.

@ericjbohm ericjbohm added Bug Something isn't working AMPI Issues affecting AMPI Build & test automation CMake The CMake build system CI Continuous Integration labels Jan 23, 2025
@ericjbohm ericjbohm self-assigned this Jan 23, 2025
@ericjbohm
Contributor Author

I managed to resolve a few of the GCC 14 issues, but I'm not sure what to do about the multiple definitions of MPI_Info.

@stwhite91
Collaborator

stwhite91 commented Jan 24, 2025

Can you provide more info on the multiple definitions of MPI_Info issue?

Edit: I found it:

==> Error: ProcessError: Command exited with status 2:
    './build' 'LIBS' 'netlrts-linux-x86_64' 'gcc-14' 'gfortran-14' '-j4' '--destination=/home/runner/work/charm/spack/opt/spack/linux-ubuntu24.04-zen2/gcc-14.2.0/charmpp-main-7bjnyo7iw57gjkcmiw6laaq6gm2p43wj' 'smp' '--build-shared' '--with-production' '-g'

7 errors found in build log:
     955       CC       mpi-io/mpich_fileutil.lo
     956     [ 51%] Building CXX object src/ck-core/CMakeFiles/ck.dir/__/ck-ldb
             /BaseLB.C.o
     957     In file included from /home/runner/work/charm/spack/opt/spack/linu
             x-ubuntu24.04-zen2/gcc-14.2.0/charmpp-main-7bjnyo7iw57gjkcmiw6laaq
             6gm2p43wj/include/ampi/mpi.h:1239,
     958                      from ./adio/include/adio.h:72,
     959                      from mpi-io/mpioimpl.h:15,
     960                      from mpi-io/mpich_fileutil.c:6:
  >> 961     /home/runner/work/charm/spack/opt/spack/linux-ubuntu24.04-zen2/gcc
             -14.2.0/charmpp-main-7bjnyo7iw57gjkcmiw6laaq6gm2p43wj/include/mpio
             .h:58:29: error: conflicting types for 'MPI_Info'; have 'struct MP
             IR_Info *'
     962        58 |   typedef struct MPIR_Info *MPI_Info;
     963           |                             ^~~~~~~~
     964     /home/runner/work/charm/spack/opt/spack/linux-ubuntu24.04-zen2/gcc
             -14.2.0/charmpp-main-7bjnyo7iw57gjkcmiw6laaq6gm2p43wj/include/ampi
             /mpi.h:339:13: note: previous declaration of 'MPI_Info' with type 
             'MPI_Info' {aka 'int'}
     965       339 | typedef int MPI_Info;
     966           |             ^~~~~~~~
     967     /home/runner/work/charm/spack/opt/spack/linux-ubuntu24.04-zen2/gcc
             -14.2.0/charmpp-main-7bjnyo7iw57gjkcmiw6laaq6gm2p43wj/include/mpio
             .h:59:10: warning: "MPI_INFO_NULL" redefined
     968        59 | # define MPI_INFO_NULL         ((MPI_Info) 0)
     969           |          ^~~~~~~~~~~~~
     970     /home/runner/work/charm/spack/opt/spack/linux-ubuntu24.04-zen2/gcc
             -14.2.0/charmpp-main-7bjnyo7iw57gjkcmiw6laaq6gm2p43wj/include/ampi
             /mpi.h:258:9: note: this is the location of the previous definition
     971       258 | #define MPI_INFO_NULL      (-1)
     972           |         ^~~~~~~~~~~~~
  >> 973     Fatal Error by charmc in directory /home/runner/work/charm/spack/o
             pt/spack/linux-ubuntu24.04-zen2/gcc-14.2.0/charmpp-main-7bjnyo7iw5
             7gjkcmiw6laaq6gm2p43wj/src/libs/ck-libs/ampi/romio-prefix/src/romi
             o

@stwhite91
Collaborator

It looks like the offending code in ROMIO's header is guarded by the macro HAVE_MPI_INFO, which should be defined to 1 since AMPI defines all of the MPI_Info interface. So my guess is that ROMIO's configure script is not testing/setting HAVE_MPI_INFO correctly.

@ericjbohm
Contributor Author

> It looks like the offending code in ROMIO's header is protected by a macro HAVE_MPI_INFO which should be defined to 1 since AMPI defines all of the MPI_Info interface. So my guess is that ROMIO's configure script is not testing/setting HAVE_MPI_INFO correctly

That is part of the problem, but it is just the tip of the iceberg. Once you get past it, there is a cascade of errors, all stemming from multiple types (MPI_Info, MPI_Request, MPIO_Request, MPI_Status) being essentially aliases of int, along with some slightly sketchy pointer-to-int conversions. Prior versions of GCC didn't care if you assigned differently named types back and forth without explicit casts. GCC 14 cares a lot, arguably too much, as they're all int under the hood.

I tried liberally sprinkling explicit casts where necessary and got past a lot of it, but it feels like a bottomless well of debugging that should be solvable by a better type definition scheme.

@stwhite91
Collaborator

I see, yeah this is a problem because ROMIO is tailored to MPICH, and MPICH's MPI types are pointers. OpenMPI's MPI types are ints akin to AMPI's. It would take some effort, but if there is a desire to improve AMPI's IO functionality I think looking into OpenMPI's ompio library or its version of ROMIO v3.4.1 might be a good starting point: open-mpi/ompi@bb73962

3 participants