Finding external dependencies in non-standard places #439

Open
awvwgk opened this issue Apr 12, 2021 · 31 comments
Comments

@awvwgk
Member

awvwgk commented Apr 12, 2021

This is a whole can of worms. The documentation regarding the dependency() function in meson gives a nice impression of all the things one can / has to consider when dealing with dependencies.

How does fpm find dependencies now?
Currently, we assume the compiler can find external libraries and include files all by itself.

This works fine for system libraries, but packages are often installed under other prefixes such as /opt or in the user's home directory; these could also be used, but are not searched by default.

Using a dependency tool
The somewhat standard ways to handle non-system dependencies are pkg-config or cmake package files; some platforms might provide their own mechanisms, like OSX frameworks. Also, some packages are known to provide incorrect package files.

Using environment variables
A third-party library usually provides a way to make itself known to the system, using environment variables like CPATH to add include directories, LIBRARY_PATH to add to the library search path and LD_LIBRARY_PATH to add to the runtime load path. This mechanism allows fpm to automatically pick up third-party dependencies by relying on the compiler.

Setup scripts / environment modules
Setting those variables in the first place is tricky. Usually third-party libraries provide scripts which can be sourced by the shell, or an environment module that can be loaded. Sometimes those scripts and modules do not work correctly either. This is a common issue with environment modules that miss certain crucial environment variables (a missing CMAKE_PREFIX_PATH is a classic).


As a developer there is nothing more painful than working in a broken environment, be it because some overzealous setup script in /etc/profile.d messes with all other packages, an environment module exports the wrong paths, or a pc file with a typo isn't recognized.

Can we / Should we do something about this?

@LKedward
Member

Thanks for opening this, Sebastian @awvwgk, and for the detailed summary of the issues; this is indeed a complicated one.

So it seems the environment variables / environment modules approach should already work with fpm since they rely on the compiler?

It seems that a possible next step would be to add automatic support for pkg-config and then cmake like meson, perhaps with a syntax like:

[dependencies]
zlib = { external = "zlib" }

which, like meson, attempts to find the package automatically using the supported methods.
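
For illustration, a minimal Python sketch of the "try each supported method in turn" behaviour proposed here. The function names (find_with_pkg_config, resolve_external) and the returned dictionary layout are assumptions made for this example and are not part of fpm; only the pkg-config --cflags and --libs invocations are real. A cmake-based lookup would simply be another entry in the list of finders.

```python
import shlex
import shutil
import subprocess


def find_with_pkg_config(name):
    """Ask pkg-config for flags; return None if the tool or package is missing."""
    exe = shutil.which("pkg-config")
    if exe is None:
        return None
    flags = {}
    for key, option in (("cflags", "--cflags"), ("libs", "--libs")):
        result = subprocess.run([exe, option, name], capture_output=True, text=True)
        if result.returncode != 0:
            return None
        flags[key] = shlex.split(result.stdout)
    return flags


def resolve_external(name, finders):
    """Return (method, flags) from the first finder that succeeds, else None."""
    for method, finder in finders:
        flags = finder(name)
        if flags is not None:
            return method, flags
    return None
```

A call such as resolve_external("zlib", [("pkg-config", find_with_pkg_config)]) returns None when no method can locate the package, which leaves room for an explicit user-provided path as a last resort.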


Regarding erroneous module files or pc files, I don't think there is very much that we can do in fpm to alleviate this, is there?

@ivan-pi
Member

ivan-pi commented Apr 13, 2021

It seems that a possible next step would be to add automatic support for pkg-config and then cmake like meson

Should this be pursued directly under the roof of fpm, or deferred to the build script mechanism? In Cargo, support for pkg-config-rs, cmake, and other build tools (for C and C++ code) is provided through specific crates. These are then placed in the build-dependencies section:

# Cargo.toml
[build-dependencies]
cmake = "0.1"

Packages which rely on these tools are required to provide a build.rs script that will typically inspect the system and figure out all the necessary flags, or even download and install missing dependencies. The idea is of course that one has access to a full programming language and not just restrictive manifest entries.

I can see pros and cons to both approaches (pursuing something like pkg-config support directly within fpm vs deferring it to the build script mechanism).

@LKedward
Member

I can see pros and cons to both approaches (pursuing something like pkg-config support directly within fpm vs deferring it to the build script mechanism).

This is an interesting point; having thought about it, IMO I'd prefer to have direct support within fpm to simplify the user experience as much as possible.

@awvwgk
Member Author

awvwgk commented Apr 16, 2021

We could probably start a separate fpm project to implement a pkg-config wrapper and use it in fpm to separate the two efforts.

@ivan-pi
Member

ivan-pi commented Apr 16, 2021

Personally, I would support having pkg-config integrated directly (I mean within the manifest; the pkg-config wrapper can be in a separate git project). I have the feeling this could help cover many external dependencies already. But I don't think it is feasible to integrate support for all possible build system files (CMake, meson, scons, etc.) within fpm (at least not in the near future). Also note that pkg-config is not part of the standard MinGW installation, so Fortran users on Windows could still find themselves with poor support.

@milancurcic, do you have any special reasons why pkg-config was not an option for you with NetCDF support in #438?

@ivan-pi
Member

ivan-pi commented Apr 16, 2021

We could probably start a separate fpm project to implement a pkg-config wrapper and use it in fpm to separate the two efforts.

Maybe this gives birth to a Fortran implementation of pkg-config 👀. I wonder how much effort that would be (see https://gitlab.freedesktop.org/pkg-config/pkg-config, or https://github.com/pkgconf/pkgconf for C versions).

Addendum: I've scrapped this idea already. An interface to libpkgconf sounds more reasonable. They already support Python bindings this way: https://github.com/pkgconf/pkgconf-py. A feature comparison between the original pkg-config and pkgconf is located at http://pkgconf.org/features.html

Addendum 2: there is also an MIT-licensed Python interface to the command line pkg-config tool. Maybe it is worth looking into for a rough Fortran design.

@milancurcic
Member

@milancurcic, do you have any special reasons why pkg-config was not an option for you with NetCDF support in #438?

It seems to be a system misconfig. On my Ubuntu 18.10:

$ pkg-config netcdf --cflags
-I/usr/include/hdf5/serial
milan@hyperion:~$ ls /usr/include/hdf5/serial/netcdf.mod
ls: cannot access '/usr/include/hdf5/serial/netcdf.mod': No such file or directory
milan@hyperion:~$ locate netcdf.mod
/home/milan/opt/netcdf-4.6.2_intel19/include/netcdf.mod
/home/milan/opt/netcdf-fortran-4.4.4_intel19/include/netcdf.mod
/usr/include/netcdf.mod

So, pkg-config thinks netcdf.mod is somewhere where it's not. I could do some research about pkg-config and fix the pc file, but now I'm learning another tool. I don't want that.

Setting paths myself works just fine for me. I don't mind people adding pkg-config support into fpm, but it's not for everybody.
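
The mismatch shown in this comment (the reported include directory does not contain netcdf.mod) can be checked mechanically. The following Python sketch parses a pkg-config style flag string and reports which of the listed directories, if any, holds a given module file. The helper names include_dirs and locate_module are hypothetical and chosen only for this example.

```python
import os
import shlex


def include_dirs(cflags):
    """Extract directories from -I options in a pkg-config style flag string."""
    dirs = []
    tokens = shlex.split(cflags)
    i = 0
    while i < len(tokens):
        token = tokens[i]
        if token == "-I" and i + 1 < len(tokens):
            # Form "-I dir": the directory is the next token.
            dirs.append(tokens[i + 1])
            i += 1
        elif token.startswith("-I") and len(token) > 2:
            # Form "-Idir": the directory is attached to the option.
            dirs.append(token[2:])
        i += 1
    return dirs


def locate_module(module_file, cflags):
    """Return the first reported include directory containing module_file, or None."""
    for directory in include_dirs(cflags):
        if os.path.isfile(os.path.join(directory, module_file)):
            return directory
    return None
```

On the system described above, locate_module("netcdf.mod", "-I/usr/include/hdf5/serial") would return None, which a tool could use to warn the user instead of failing later at compile time.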

@milancurcic
Member

milancurcic commented Apr 16, 2021

Back to the original question, there is another approach not mentioned: fpm could search the common system paths itself, such as /usr/include for modules, /usr/lib*/* for libraries etc (Edit: on second thought this may not be needed for linking). I don't think this is a good idea, but putting it out there.

@arjenmarkus
Member

arjenmarkus commented Apr 16, 2021 via email

@awvwgk
Member Author

awvwgk commented Apr 16, 2021

So, pkg-config thinks netcdf.mod is somewhere where it's not. I could do some research about pkg-config and fix the pc file, but now I'm learning another tool. I don't want that.

My guess is that neither /home/milan/opt/netcdf-4.6.2_intel19/lib*/pkgconfig nor /home/milan/opt/netcdf-fortran-4.4.4_intel19/lib*/pkgconfig is actually present in the PKG_CONFIG_PATH environment variable, but you have a system installation of netcdf which is always found if no custom search path is provided.

This kinda goes back to the original problem: you have an installation, but no means to tell your system that you want to use it. Installing outside of a package or environment manager leaves this job to the user, which is plain annoying. My personal solution to this problem is a local environment module setup to manage Fortran libraries for all the different compilers I want to use; writing module files is a bit tedious, but it has served my needs perfectly so far.

Projects like spack and EasyBuild seem to be the next logical step to get a handle on this issue, but those always seemed like overkill for my local development machine.

@milancurcic
Member

but you have a system installation of netcdf which is always found if no custom search path is provided.

/usr/include/netcdf.mod is not found unless I specify the path to the compiler. I don't think my custom builds under opt/ are relevant here.

@awvwgk
Member Author

awvwgk commented Apr 16, 2021

I think this issue might be related to #441. It is really disappointing to find that GFortran indeed doesn't use any include path by default to search for module files (just checked locally). Additionally, pkg-config is smart enough to drop all include paths and library paths which are already part of the system paths (including values in CPATH and LIBRARY_PATH), unless it is told otherwise with --keep-system-cflags.

That said, it would probably be easier to use a non-system netcdf with GFortran than using a system installation in this setup. Sounds truly broken to me.
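
To make the interaction concrete, here is a small Python sketch that mimics the filtering of system include paths described in this comment. It is not pkg-config's actual implementation; the function name strip_system_includes and the default list of system directories are assumptions for illustration.

```python
def strip_system_includes(flags, system_dirs=("/usr/include",)):
    """Drop -I flags that point at directories the compiler is assumed to search."""
    normalized = {d.rstrip("/") for d in system_dirs}
    return [
        flag
        for flag in flags
        if not (flag.startswith("-I") and flag[2:].rstrip("/") in normalized)
    ]
```

Applied to ["-I/usr/include", "-I/usr/include/hdf5/serial"], the function keeps only the HDF5 directory. That is harmless for a C compiler, which searches /usr/include anyway, but it drops exactly the path GFortran would need in order to find a module file installed there.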

@awvwgk
Member Author

awvwgk commented Apr 16, 2021

Let's call it a long-standing bug in GFortran (since 4.3.0): https://gcc.gnu.org/bugzilla/show_bug.cgi?id=35707

@milancurcic
Member

@awvwgk yes, exactly, and there is a pkg-config misconfig on my system (off the shelf from apt, I didn't touch it). Pkg-config shouldn't think by default that netcdf is in a directory it is not.

@awvwgk
Member Author

awvwgk commented Apr 17, 2021

there is a pkg-config misconfig on my system

I don't think it is misconfigured; it just requires HDF5, and since only HDF5 is in a non-system path, only that path is shown in the cflags. Try pkg-config netcdf --keep-system-cflags --cflags (only works for pkgconf, unfortunately). It is just that the default behaviour of pkg-config doesn't work together with the bug in GFortran.

@awvwgk
Member Author

awvwgk commented Apr 17, 2021

How does fpm find dependencies now?
Currently, we assume the compiler can find external libraries and include files all by itself.

Turns out that my initial premise was actually wrong, but if we have to drop this assumption, how can we even build a sane model for dealing with external dependencies?

@LKedward
Member

If I understand correctly, the only issue is with trying to use third-party .mod files. It is worth noting that the linked gfortran bug was initially marked as WONTFIX because the developers didn't want to encourage library developers to rely on distributing .mod files.

... how can we even build a sane model for dealing with external dependencies?

My opinion on this is that in the short term we should provide an environment variable that allows specifying include directories to fpm for existing pre-built .mod files, and in the long term we encourage/help developers to provide proper module interfaces for their libraries that can be distributed as fpm packages.
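
A minimal Python sketch of the short-term idea, assuming a PATH-style environment variable. The variable name FPM_INCLUDE_DIRS is hypothetical and used only for this example; it is not an option that fpm is known to provide.

```python
import os


def include_flags_from_env(variable="FPM_INCLUDE_DIRS", environ=None):
    """Turn a PATH-style list of directories into -I flags, skipping empty entries."""
    environ = os.environ if environ is None else environ
    raw = environ.get(variable, "")
    return ["-I" + entry for entry in raw.split(os.pathsep) if entry]
```

The resulting flags would be appended to every compile command, so pre-built .mod files in those directories become visible without the user having to edit the package manifest.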

@milancurcic
Member

It is just that the default behaviour of pkg-config doesn't work together with the bug in GFortran.

Which takes us back to the bottom line and my original point. pkg-config does not generally and universally work for finding modules.

@milancurcic
Member

Same behavior with pkg-config on Ubuntu 20.04:

$ pkg-config netcdf --cflags
-I/usr/include/hdf5/serial

nc-config gives the correct answer (and it always has, in my experience):

$ nc-config --cflags
-I/usr/include -I/usr/include/hdf5/serial

@awvwgk
Member Author

awvwgk commented Apr 17, 2021

How do we move forward from here? nc-config is a unique feature of netcdf; to use it with fpm we would require a custom build script, unless we hardwire this as a special case in fpm.

@milancurcic
Member

Right, I mentioned nc-config only as an easy way for users to find their NetCDF paths. I don't recommend even considering shoe-horning it into fpm.

To be clear, I'm not opposed to using pkg-config in fpm. I'm not experienced with it, and I trust you and others that it may be a good solution. But it would be important to let users specify paths explicitly if needed, in case fpm is not finding a module that exists, or it's finding the wrong one. I think this is in line with your wonderful comment elsewhere that went something like, "nothing more frustrating than a smart feature that doesn't work and can't be overridden".

For now, without a better answer, I recommend that we:

  1. Do nothing special about this. Simply let users know in the docs that they are responsible to provide the paths.
  2. Make it as easy as possible for users to provide the paths.

For point 2, I think #444 is a great step forward. In addition, we can consider automatically adding system paths like /usr/include and /usr/local/include by default, but there should be a way for the user to disable this (e.g. --no-system-cflags or similar).

Then we wait and listen to the users. If there are many reports that say "we want fpm to automatically find external modules", then we consider a smarter solution.
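
The recommendation above (explicit user paths, plus system defaults that can be switched off) could look roughly like the following Python sketch. The function name, the use_system_dirs switch, and the default directory list are assumptions for illustration; the sketch also assumes the compiler searches -I directories in the order given.

```python
DEFAULT_SYSTEM_INCLUDES = ("/usr/include", "/usr/local/include")


def module_search_flags(user_dirs, use_system_dirs=True):
    """Build -I flags with user directories first, then optional system defaults."""
    ordered = list(user_dirs)
    if use_system_dirs:
        ordered.extend(DEFAULT_SYSTEM_INCLUDES)
    seen = set()
    flags = []
    for directory in ordered:
        if directory not in seen:  # keep the first occurrence so user paths win
            seen.add(directory)
            flags.append("-I" + directory)
    return flags
```

Passing use_system_dirs=False plays the role of the suggested opt-out switch: only the directories the user named end up on the command line.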

@ghost

ghost commented Apr 20, 2021

Back to the original question, there is another approach not mentioned: fpm could search the common system paths itself, such as /usr/include for modules, /usr/lib*/* for libraries etc (Edit: on second thought this may not be needed for linking). I don't think this is a good idea, but putting it out there.

I think this is a good idea. We can combine all the ideas in here into a subcommand, fpm-find module(s) [compiler] [arch]. It will do whatever is necessary to find the correct modules and report back to fpm. fpm can save the results to cache.toml.

@ivan-pi
Member

ivan-pi commented Apr 20, 2021

Thanks Carlos for your perspective.

It would be great to have a few more opinions on this whole topic.

cc @nncarlson @vmagnin @scivision @WardF @marshallward @everythingfunctional

@scivision
Member

I have experience with this as a maintainer of Meson and a regular contributor to CMake. In general some custom logic is needed. However, pkg-config often works. There are packages that distribute broken CMake config files and broken pkgconfig files. There are even packages with their own special config scripts like HDF5 that are broken on certain platforms. I have found no one universal solution, and that's why you'll see I've created custom logic in Meson and CMake for packages like HDF5, NetCDF, and MPI, that are the most common offenders for needing custom find logic in the build system.

In short I would be in support of using pkgconfig. But there will need to be custom fpm internal logic for some popular packages as noted above, paradoxically.

@ivan-pi
Member

ivan-pi commented Apr 20, 2021

Back to the original question, there is another approach not mentioned: fpm could search the common system paths itself, such as /usr/include for modules, /usr/lib*/* for libraries etc (Edit: on second thought this may not be needed for linking). I don't think this is a good idea, but putting it out there.

I think this is a good idea. We can combine all the ideas in here into a subcommand, fpm-find module(s) [compiler] [arch]. It will do whatever is necessary to find the correct modules and report back to fpm. fpm can save the results to cache.toml.

In a way this is similar to the suggestion from Brad in #444 (comment):

Also, given the (relatively) small number of external packages that may need to be supported this way, should they be supported as "features" that are on for a package that needs them, and users have a config file (maybe $HOME/.config/fpm.toml) that specifies where the installation resides for a given compiler? That seems like the easiest system for users to maintain.
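
As a rough illustration of the per-compiler configuration idea quoted above, the following Python sketch looks up an installation directory by package and compiler. The dictionary stands in for an already parsed user config file; its layout (an "external" table keyed by package, then compiler) is an assumption, and pairing the intel19 directory from earlier in the thread with ifort is likewise only a guess.

```python
def module_dir_for(config, package, compiler):
    """Look up where a package's module files live for one compiler, or None."""
    return config.get("external", {}).get(package, {}).get(compiler)


# Hypothetical contents of a user config file, already parsed into a dict.
example_config = {
    "external": {
        "netcdf": {
            "gfortran": "/usr/include",
            "ifort": "/home/milan/opt/netcdf-fortran-4.4.4_intel19/include",
        }
    }
}
```

Because .mod files are compiler-specific, keying the lookup on the compiler keeps a module built for one compiler from being picked up by another.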

@rouson
Contributor

rouson commented May 17, 2021

Any updates on a plan for resolving this issue? I like the idea of a subcommand fpm-find module(s) [compiler] [arch].

I'm converting an application from cmake to fpm. The application contains a use netcdf statement so I used homebrew to install netcdf on macOS, which installed /usr/local/Cellar/netcdf/4.8.0_1/include/netcdf.mod. How can I tell fpm where to search for netcdf.mod?

@rouson
Contributor

rouson commented May 17, 2021

I tried

 ± fpm build --flag -J/usr/local/Cellar/netcdf/4.8.0_1/include/  
 + mkdir -p build/gfortran_2B85EFB7FAF749F9/icar
 + gfortran -c ./src/constants/icar_constants.f90 -J/usr/local/Cellar/netcdf/4.8.0_1/include/ -J build/gfortran_2B85EFB7FAF749F9/icar -I build/gfortran_2B85EFB7FAF749F9/icar  -o build/gfortran_2B85EFB7FAF749F9/icar/src_constants_icar_constants.f90.o
f951: Fatal Error: gfortran: Only one '-J' option allowed
compilation terminated.
 Command failed
ERROR STOP 

@awvwgk
Member Author

awvwgk commented May 17, 2021

I tried

 ± fpm build --flag -J/usr/local/Cellar/netcdf/4.8.0_1/include/  
 + mkdir -p build/gfortran_2B85EFB7FAF749F9/icar
 + gfortran -c ./src/constants/icar_constants.f90 -J/usr/local/Cellar/netcdf/4.8.0_1/include/ -J build/gfortran_2B85EFB7FAF749F9/icar -I build/gfortran_2B85EFB7FAF749F9/icar  -o build/gfortran_2B85EFB7FAF749F9/icar/src_constants_icar_constants.f90.o
f951: Fatal Error: gfortran: Only one '-J' option allowed
compilation terminated.
 Command failed
ERROR STOP 

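Try -I instead of -J: -J sets the directory where gfortran writes module files and may be given only once, while -I adds a directory to the module search path.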

@LKedward
Member

@rouson, for netcdf you may want to try my netcdf-interfaces fpm package for a compiler-independent solution. I have not tested it on macOS, but as long as you can link against the NetCDF library, the package should work as an alternative to needing to find the corresponding .mod file.

@AtilaSaraiva

AtilaSaraiva commented Jun 13, 2022

I'd like to chip in to the discussion and say that I'd also like pkg-config detection support, like meson has.

meson doesn't have great Fortran support, but at least this pkg-config feature is especially useful. I'd enjoy having this in fpm. Can someone fill me in on the steps necessary to get this working?

@ivan-pi
Member

ivan-pi commented Aug 11, 2022

Using environment variables
A third-party library usually provides a way to make itself known to the system, using environment variables like CPATH to add include directories, LIBRARY_PATH to add to the library search path and LD_LIBRARY_PATH to add to the runtime load path. This mechanism allows fpm to automatically pick up third-party dependencies by relying on the compiler.

Just wanted to note that Spack appears to be moving away from such paths: spack/spack#28354 (comment)
