Geospace preprint archives

While arXiv is among the earliest and best-known preprint archives, more focused archives with a good reputation can provide easier access to a targeted audience. Here are a few I’ve come across that are relevant to geosciences:

Switch CMake Generator Ninja

The default CMake build generator is operating system dependent. Many projects benefit from Ninja’s faster builds, and especially its faster rebuilds. To switch the default build generator on any platform, set the environment variable CMAKE_GENERATOR (honored by CMake ≥ 3.15):

CMAKE_GENERATOR=Ninja

Ninja can be downloaded directly, or installed via pip:

pip install ninja
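
A minimal sketch of setting the variable for the current session, assuming a POSIX shell such as bash (on Windows, set the variable through the system environment settings instead). The `command -v` check is only an illustrative guard and is not something CMake requires.

```shell
# Make Ninja the default CMake generator for this shell session.
# Add the export line to ~/.bashrc (or similar) to make it persistent.
export CMAKE_GENERATOR=Ninja

# Illustrative guard: warn if the ninja executable is not on PATH.
if command -v ninja >/dev/null 2>&1; then
  echo "ninja found: $(command -v ninja)"
else
  echo "ninja not found on PATH; install it, e.g. pip install ninja"
fi
```

With the variable set, a plain `cmake -B build` should pick Ninja without needing `-G Ninja` on each invocation.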

Conan

If using the Conan package manager, tell Conan to use Ninja by setting the environment variable:

CONAN_CMAKE_GENERATOR=Ninja

Ninja job pools for low memory CMake builds

An increasing number of systems have multiple CPU cores, say four, six or eight, but only modest RAM of 1 or 2 GB. An example of this is the Raspberry Pi. Ninja job pools set a limit on the number of parallel processes used to build a CMake target. That is, unlike GNU Make, where one limit must be chosen for the entire project, Ninja allows limits on a per-target basis. That’s one important benefit of Ninja for speeding up builds of medium to large projects, and part of why Ninja sees increasing adoption in prominent projects including Google Chrome. This is another reason we strongly encourage using Ninja with CMake.

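Specifically, CMake + Ninja builds can limit the process count via target properties:

  • JOB_POOL_COMPILE: the pool used when compiling the target’s sources
  • JOB_POOL_LINK: the pool used when linking the target

The global JOB_POOLS property defines the pools available to the targets.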

Upon experiencing build issues such as SIGKILL due to excessive memory usage, inspect the failed build step to see if it was a compile or link operation, to determine which to limit on a per-target basis.

Example

Suppose that 500 MB of RAM are needed to compile a target and we decide to ensure at least 1 GB of RAM is available to give some margin. Thus we constrain the number of CPU processes for that target based on CMake-detected available physical memory. The appropriate parameters for your project are determined by trial and error. If this method still is not reliable even with a single CPU process, then a possible solution is to cross-compile, that is to build the executable on a more capable system for this modest system.

CMakeLists.txt includes:

set_property(GLOBAL PROPERTY JOB_POOLS one_jobs=1 two_jobs=2)

cmake_host_system_information(RESULT _memfree QUERY AVAILABLE_PHYSICAL_MEMORY)

add_library(big big1.c big2.f90)
if(_memfree LESS 1000)
  set_target_properties(big PROPERTIES JOB_POOL_COMPILE one_jobs)
endif()
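
Where editing CMakeLists.txt is not an option, a coarser alternative is to cap the whole build rather than a single target. Note that this is a different technique from job pools. The sketch below assumes a Linux system exposing MemAvailable in /proc/meminfo; the helper name `jobs_for_memory` is hypothetical, and the 500 MB per-job figure is taken from the example above.

```shell
# jobs_for_memory AVAIL_MB PER_JOB_MB
# Hypothetical helper: print how many parallel jobs fit in AVAIL_MB,
# never fewer than 1.
jobs_for_memory() {
  n=$(( $1 / $2 ))
  if [ "$n" -lt 1 ]; then
    n=1
  fi
  echo "$n"
}

# Assumption: Linux, where /proc/meminfo reports MemAvailable in kB.
if [ -r /proc/meminfo ]; then
  avail_mb=$(awk '/^MemAvailable:/ {print int($2 / 1024)}' /proc/meminfo)
  avail_mb=${avail_mb:-0}   # fall back to 0 if the field is missing
  echo "suggested: cmake --build build --parallel $(jobs_for_memory "$avail_mb" 500)"
fi
```

Because this limits every target equally, it gives up the per-target control that makes job pools attractive. It is mainly useful as a quick check of whether memory is the cause of a failed build.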

Visual Studio update Ninja build

The Ninja build executable for Visual Studio is located at a path like:

C:/Program Files (x86)/Microsoft Visual Studio/2019/Community/Common7/IDE/CommonExtensions/Microsoft/CMake/Ninja/ninja.exe

which can be determined from:

where ninja

The Ninja version shipped with Visual Studio may be too old for CMake + Ninja + Fortran projects, which require Ninja ≥ 1.10. If so, generating a CMake project from Visual Studio fails with errors stating this.
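
A quick way to check whether the Ninja found on PATH meets the 1.10 requirement, sketched for a POSIX-style shell such as Git Bash or MSYS2 (an assumption, since the other commands in this section are for the Windows Command Prompt). `version_ge` is a hypothetical helper that relies on `sort -V` from GNU coreutils.

```shell
# version_ge A B: succeed when version A >= version B.
# Relies on "sort -V" (version sort), a GNU coreutils extension.
version_ge() {
  [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# Only query ninja if it is actually installed.
if command -v ninja >/dev/null 2>&1; then
  v=$(ninja --version)
  if version_ge "$v" 1.10; then
    echo "Ninja $v is new enough for CMake Fortran projects"
  else
    echo "Ninja $v is older than 1.10; replace it as described below"
  fi
fi
```

A plain string or decimal comparison would wrongly rank 1.9 above 1.10, which is why the sketch uses a version sort.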

Fix

Replace the Visual Studio Ninja executable with the latest Ninja version, for example via a soft link to the desired ninja.exe.

  1. Move the old Ninja executable:

    move "C:/Program Files (x86)/Microsoft Visual Studio/2019/Community/Common7/IDE/CommonExtensions/Microsoft/CMake/Ninja/ninja.exe" ninja.old
    
  2. Create a soft link to the new Ninja like:

    mklink "C:/Program Files (x86)/Microsoft Visual Studio/2019/Community/Common7/IDE/CommonExtensions/Microsoft/CMake/Ninja/ninja.exe" C:/ninja/ninja.exe
    

One-step build/install CMake

CMake ≥ 3.15 is strongly recommended in general for its more robust and easier syntax.

Compile/Install CMake

This will get you the latest release of CMake. For Linux and Mac, admin/sudo is NOT required.

pip

There is an unofficial PyPI CMake package:

pip install cmake

CMake major versions

  • 3.17: Ninja Multi-Config generator, --debug-find to see what find_package() is trying to do
  • 3.16: Precompiled headers, unity builds, many advanced project features
  • 3.15: CMAKE_GENERATOR environment variable works like -G option
  • 3.14: check_fortran_source_runs(), better FetchContent
  • 3.13: ctest --progress, better Matlab compiler support, lots of new linking options, fixes to Fortran submodule bugs, cmake -B build incantation, target_sources() with absolute path
  • 3.12: transitive library specification (out of same directory), full Fortran Submodule support
  • 3.11: specify targets initially w/o sources
  • 3.10: added Fortran Flang (LLVM) compiler, extensive MPI features added
  • 3.9: further C# and CUDA support, which was originally added in CMake 3.8
  • 3.8: Initial CUDA support
  • 3.7: if() comparisons ≤ ≥ (LESS_EQUAL, GREATER_EQUAL), initial Fortran submodule support
  • 3.6: better OpenBLAS support
  • 3.5: Enhanced FindBoost target with auto Boost prereqs
  • 3.4: Limit CPU usage when using ctest -j parallel tests
  • 3.3: List operations such as IN_LIST

NetCDF4 segfault on file open

The NetCDF4 Fortran library may compile successfully and run for simple programs, but segfault in programs where HDF5 is linked directly as well as NetCDF4. One reason to link both HDF5 and NetCDF directly is a program that needs to read / write files in HDF5 as well as NetCDF format. The symptom observed thus far is that the program segfaults on nf90_open(). An example of a Linux system broken in this way is Ubuntu 18.04.

Fix

The fix is to compile HDF5 and NetCDF yourself.

CMake if environment variable conditions

CMake’s distinct syntax and behavior mean that variables can behave differently than in other languages. Environment variables in if() stanzas are one such surprising case. It’s possible to check directly whether an environment variable is defined, but checking the value of the environment variable must be done indirectly via a CMake internal variable. An example is using the environment variable “CI” to detect whether CMake is running on a CI system.

if(DEFINED ENV{CI})
  message(STATUS "CI value: $ENV{CI}")
endif()

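The syntax above is valid. However, if() does not implicitly dereference an environment variable the way it does a normal variable, so to test the value of an environment variable in a CMake if() statement, copy it into an internal variable first: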

set(CI $ENV{CI})

if(CI)
  message(STATUS "Skipping test since CI: ${CI}")
endif()

You can verify the behavior by creating a file “foo.cmake” and running it with cmake -P foo.cmake.
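
The verification can be sketched as a short shell session, assuming a POSIX shell with `mktemp` available. The scratch directory and the two runs (CI set, CI unset) are illustrative choices; the CMake code is the same as in the snippets above.

```shell
# Write the two if() forms from this section into a scratch foo.cmake.
workdir=$(mktemp -d)
cat > "$workdir/foo.cmake" <<'EOF'
if(DEFINED ENV{CI})
  message(STATUS "CI value: $ENV{CI}")
endif()

set(CI $ENV{CI})

if(CI)
  message(STATUS "Skipping test since CI: ${CI}")
endif()
EOF

# If cmake is installed, run the script with CI set and with CI unset.
if command -v cmake >/dev/null 2>&1; then
  CI=true cmake -P "$workdir/foo.cmake"
  (unset CI; cmake -P "$workdir/foo.cmake")
fi
```

Comparing the two runs shows which of the if() branches fire for a set and an unset variable.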

Gfortran 9.3 duplicated use error fix

Gfortran 9.3.0 has a new bug (or perhaps feature): it is sensitive to overlapping / duplicated use statements in a module - submodule hierarchy. The fix: if a procedure is used in multiple places in the module - submodule hierarchy, use it only once, at the highest necessary level of the hierarchy.

This is perhaps best shown by example:

module foo
implicit none
contains
subroutine bar()
end subroutine bar
end module foo

module parent
use foo, only : bar
implicit none
interface
module subroutine baz
end subroutine baz
end interface
end module parent

submodule (parent) child
use foo, only : bar  !< this is unnecessary and triggers the Gfortran 9.3.0 error
implicit none
contains
module procedure baz
end procedure baz
end submodule child

The error message from Gfortran 9.3.0 is like:

$ gfortran -c .\dupe.f90

dupe.f90:17:17:

   17 | submodule (parent) child
      |                 1
   18 | use foo, only : bar
      |                   2

Fortran MPI with MSYS2 MinGW on Windows

OpenMPI is not currently available for native Windows. While Cygwin and WSL do have working OpenMPI, it is also possible to use the MPICH-derived Microsoft MS-MPI, as described in this article.

Setup MPI on Windows MSYS2

This procedure gives MPI via MSYS2 GCC / Gfortran compilers on Windows.

  1. Download and install Microsoft MS-MPI

  2. To make mpiexec.exe available, add to user PATH: “C:/Program Files/Microsoft MPI/Bin” – this is needed even when using MSYS2.

  3. Install MSYS2 MS-MPI library, from the MSYS2 Terminal

    pacman -S mingw-w64-x86_64-msmpi
    
  4. Compile the MPI module with Gfortran from PowerShell or Windows Terminal. This also creates “mpi.mod” for Fortran use.

    cd C:/msys64/mingw64/
    
    gfortran include/mpi.f90 -c -fno-range-check
    
    ar cr lib/libmpi.a mpi.o
    

Notes

If you don’t compile the mpi.f90 as above, errors may result like:

use mpi

Fatal Error: Cannot open module file 'mpi.mod' for reading at (1): No such file or directory

MS-MPI without MSYS2

If not using MSYS2, it’s still possible to use MS-MPI. We will put the resulting artifacts under c:/lib/mpi:

mkdir c:/lib/mpi/include

Copy-Item -Path "C:/Program Files (x86)/Microsoft SDKs/MPI/Include" -Destination "c:/lib/mpi/include" -Recurse

cd c:/lib/mpi

You may need to edit “include/mpi.f90” so that it has INCLUDE 'mpifptr.h'. Then compile:

gfortran include/mpi.f90 -c -fno-range-check

ar cr lib/libmpi.a mpi.o

Fortran dependency graph with CMake

CMake can generate GraphViz dependency graphs for languages including Fortran. Fortran submodules are not shown in the graph, but executables and modules are shown in the directed dependency graph.

Requirements

  • The “dot” GraphViz program converts the .dot files to PNG, SVG, etc. It is typically available on Linux, Homebrew, Windows MSYS2, Cygwin, WSL, etc.
  • Since CMake needs to configure and generate, the compiler and generator needed by the CMake project must be working. However the project does not need to be compiled before generating the dependency graph.

Example

The dependency graph for h5fortran, an object-oriented Fortran HDF5 interface, is below. SVG can be a useful format since it’s a vector format and can be zoomed arbitrarily large in a web browser, while PNG is viewable by almost anything.

cmake -B build --graphviz=foo.dot

dot -Tpng -o foo.png foo.dot

dot -Tsvg -o foo.svg foo.dot

h5fortran dependency graph

The “dependers” files show only the nodes depending on a node.