Skip to content

Latest commit

 

History

History
307 lines (244 loc) · 15.7 KB

README.md

File metadata and controls

307 lines (244 loc) · 15.7 KB

MFI

Modern Fortran interfaces to BLAS and LAPACK

This project aims to be a collection of modern fortran interfaces to commonly used procedure, for now BLAS and LAPACK. The main goal is to reduce the pain of using such libraries, providing a generic interface to the intrinsic supported types and identifying the optional or reconstructible arguments of a given procedure. The code uses fypp, to generate the interfaces automatically to all supported types and kinds.

Example $C = AB$

program main
use mfi_blas, only: mfi_gemm
use f77_blas, only: f77_gemm
use iso_fortran_env
implicit none
! ... variables and other boilerplate code here ...
! Original interface: type and precision dependent
call cgemm('N','N', N, N, N, alpha, A, N, B, N, beta, C, N)
! Improved F77 interface: still a lot of arguments
call f77_gemm('N','N', N, N, N, alpha, A, N, B, N, beta, C, N)
! Modern fortran interface: less arguments and more readable 
call mfi_gemm(A,B,C)
end program

Getting Started

FPM

This project supports the Fortran Package Manager. Follow the directions on that page to install FPM if you haven't already.

Using as a dependency in FPM

Add a entry in the "dependencies" section of your project's fpm.toml

# fpm.toml
[ dependencies ]
mfi = { git="https://github.com/14NGiestas/mfi.git", branch="mfi-fpm" }

Manual building

First get the code, by cloning the repo:

git clone https://github.com/14NGiestas/mfi.git
cd mfi/

Dependencies

Install the fypp using the command:

sudo pip install fypp

Install lapack and blas (use the static versions). This can be tricky, if you run into any problem, please open an issue.

Arch Linux

Ubuntu

Usually you can do the following:

make
fpm test

By default, the lapack-dev package (which provides the reference blas) do not provide the i?amin implementation (among other extensions) in such cases you can use blas extensions with:

make FYPPFLAGS=-DMFI_EXTENSIONS
fpm test

Or if you have support to such extensions in your blas provider you can:

make FYPPFLAGS="-DMFI_EXTENSIONS -DMFI_LINK_EXTERNAL"
fpm test

which will generate the code linking extensions to the external library

Support

Please note that this project is experimental, errors and mistakes are to be expected.

There four levels of interfaces that can be used:

  1. original f77: explicit declared original interface.
call cgemm('N','N', N, N, N, alpha, A, N, B, N, beta, C, N)
  1. improved f77: original argument convention without need of a prefix.
call f77_gemm('N','N', N, N, N, alpha, A, N, B, N, beta, C, N)
  1. modern interface with prefix:
call mfi_sgemm(A,B,C)
  1. modern interface:
call mfi_gemm(A,B,C)

If you are searching for a specific interface check the API reference

BLAS

Level 1

Most of BLAS level 1 routines can be replaced by intrinsincs and other features in modern fortran.

done name description modern alternative
👍 asum Sum of vector magnitudes sum
👍 axpy Scalar-vector product a*x + b
👍 copy Copy vector x = b
👍 dot Dot product dot_product
👍 dotc Dot product conjugated
👍 dotu Dot product unconjugated
og77 sdsdot Compute the inner product of two vectors with extended precision accumulation.
og77 dsdot Compute the inner product of two vectors with extended precision accumulation and result.
👍 nrm2 Vector 2-norm (Euclidean norm) norm2
👍 rot Plane rotation of points
👍 rotg Generate Givens rotation of points
👍 rotm Modified Givens plane rotation of points
👍 rotmg Generate modified Givens plane rotation of points
👍 scal Vector-scalar product a*x + b
👍 swap Vector-vector swap

Level 1 - Utils / Extensions

done? name description modern alternatives obs
👍 iamax Index of the maximum absolute value element of a vector maxval, maxloc
👍 iamin Index of the minimum absolute value element of a vector minval, minloc
👍 lamch Determines precision machine parameters. huge, tiny, epsilon Obs: had to add a parameter so fortran can distinguish between the single and double precision with the same interface. For values of cmach see: lamch

Level 2

done? name description
👍 gbmv Matrix-vector product using a general band matrix
👍 gemv Matrix-vector product using a general matrix
👍 ger Rank-1 update of a general matrix
👍 gerc Rank-1 update of a conjugated general matrix
👍 geru Rank-1 update of a general matrix, unconjugated
👍 hbmv Matrix-vector product using a Hermitian band matrix
👍 hemv Matrix-vector product using a Hermitian matrix
👍 her Rank-1 update of a Hermitian matrix
👍 her2 Rank-2 update of a Hermitian matrix
👍 hpmv Matrix-vector product using a Hermitian packed matrix
👍 hpr Rank-1 update of a Hermitian packed matrix
👍 hpr2 Rank-2 update of a Hermitian packed matrix
👍 sbmv Matrix-vector product using symmetric band matrix
👍 spmv Matrix-vector product using a symmetric packed matrix
👍 spr Rank-1 update of a symmetric packed matrix
👍 spr2 Rank-2 update of a symmetric packed matrix
👍 symv Matrix-vector product using a symmetric matrix
👍 syr Rank-1 update of a symmetric matrix
👍 syr2 Rank-2 update of a symmetric matrix
👍 tbmv Matrix-vector product using a triangular band matrix
👍 tbsv Solution of a linear system of equations with a triangular band matrix
👍 tpmv Matrix-vector product using a triangular packed matrix
👍 tpsv Solution of a linear system of equations with a triangular packed matrix
👍 trmv Matrix-vector product using a triangular matrix
👍 trsv Solution of a linear system of equations with a triangular matrix

Level 3

done? name description
👍 gemm Computes a matrix-matrix product with general matrices.
👍 hemm Computes a matrix-matrix product where one input matrix is Hermitian and one is general.
👍 herk Performs a Hermitian rank-k update.
👍 her2k Performs a Hermitian rank-2k update.
👍 symm Computes a matrix-matrix product where one input matrix is symmetric and one matrix is general.
👍 syrk Performs a symmetric rank-k update.
👍 syr2k Performs a symmetric rank-2k update.
👍 trmm Computes a matrix-matrix product where one input matrix is triangular and one input matrix is general.
👍 trsm Solves a triangular matrix equation (forward or backward solve).

LAPACK ⚠️

  • Lapack is really huge, so I'm going to focus on getting the improved f77 interfaces ready first. Anything I end up using I'm going to implement.

Linear solve, $AX = B$

Cholesky: computational routines (factor, cond, etc.)
done? name description
👍 pocon condition number estimate
Orthogonal/unitary factors (QR, CS, etc.)
done? name description
👍 geqrf Computes the QR factorization of a general m-by-n matrix.
👍 gerqf Computes the RQ factorization of a general m-by-n matrix.
👍 getrf Computes the LU factorization of a general m-by-n matrix.
👍 getri Computes the inverse of an LU-factored general matrix.
👍 getrs Solves a system of linear equations with an LU-factored square coefficient matrix, with multiple right-hand sides.
👍 hetrf Computes the Bunch-Kaufman factorization of a complex Hermitian matrix.
👍 potrf Computes the Cholesky factorization of a symmetric (Hermitian) positive-definite matrix.
👍 potri Computes the inverse of a Cholesky-factored symmetric (Hermitian) positive-definite matrix.
👍 potrs Solves a system of linear equations with a Cholesky-factored symmetric (Hermitian) positive-definite coefficient matrix, with multiple right-hand sides.
f77 orgqr Generates the real orthogonal matrix Q of the QR factorization formed by geqrf.
f77 orgrq Generates the real orthogonal matrix Q of the RQ factorization formed by gerqf.
f77 ormqr Multiplies a real matrix by the orthogonal matrix Q of the QR factorization formed by geqrf.
f77 ormrq Multiplies a real matrix by the orthogonal matrix Q of the RQ factorization formed by gerqf.
sytrf Computes the Bunch-Kaufman factorization of a symmetric matrix.
trtrs Solves a system of linear equations with a triangular coefficient matrix, with multiple right-hand sides.
f77 ungqr Generates the complex unitary matrix Q of the QR factorization formed by geqrf.
f77 ungrq Generates the complex unitary matrix Q of the RQ factorization formed by gerqf.
f77 unmqr Multiplies a complex matrix by the unitary matrix Q of the QR factorization formed by geqrf.
f77 unmrq Multiplies a complex matrix by the unitary matrix Q of the RQ factorization formed by gerqf.
f77 org2r
f77 orm2r
f77 ung2r
f77 unm2r
f77 orgr2
f77 ormr2
f77 ungr2
f77 unmr2

Singular Value and Eigenvalue Problem Routines

done? name description
👍 gesvd Computes the singular value decomposition of a general rectangular matrix.
👍 heevd Computes all eigenvalues and, optionally, all eigenvectors of a complex Hermitian matrix using divide and conquer algorithm.
👍 hegvd Computes all eigenvalues and, optionally, all eigenvectors of a complex generalized Hermitian definite eigenproblem using divide and conquer algorithm.
f77 heevr Computes the eigenvalues and, optionally, the left and/or right eigenvectors for HE matrices.
f77 heevx Computes the eigenvalues and, optionally, the left and/or right eigenvectors for HE matrices.
gebrd Reduces a general matrix to bidiagonal form.
hetrd Reduces a complex Hermitian matrix to tridiagonal form.
orgbr Generates the real orthogonal matrix Q or PT determined by gebrd.
orgtr Generates the real orthogonal matrix Q determined by sytrd.
ormtr Multiplies a real matrix by the orthogonal matrix Q determined by sytrd.
syevd Computes all eigenvalues and, optionally, all eigenvectors of a real symmetric matrix using divide and conquer algorithm.
sygvd Computes all eigenvalues and, optionally, all eigenvectors of a real generalized symmetric definite eigenproblem using divide and conquer algorithm.
sytrd Reduces a real symmetric matrix to tridiagonal form.
ungbr Generates the complex unitary matrix Q or PT determined by gebrd.
ungtr Generates the complex unitary matrix Q determined by hetrd.
unmtr Multiplies a complex matrix by the unitary matrix Q determined by hetrd.
Least squares
done name description
f77 gels least squares using QR/LQ
f77 gelst least squares using QR/LQ with T matrix
f77 gelss least squares using SVD, QR iteration
f77 gelsd least squares using SVD, divide and conquer
f77 gelsy least squares using complete orthogonal factor
f77 getsls least squares using tall-skinny QR/LQ
f77 gglse equality-constrained least squares
f77 ggglm Gauss-Markov linear model

Other Auxiliary Routines

There are some other auxiliary lapack routines around, that may apear here:

name Data Types Description
mfi_lartg s, d, c, z Generates a plane rotation with real cosine and real/complex sine.