Viewing FITS image stack

The FITS format is an old but frequently used format in astronomical imagery. Two of the easiest ways to view FITS file image stacks with standalone programs are HDFView and NASA FV.

While FV has utilities and a UI more oriented to astronomical uses, HDFView is a generally useful tool to view HDF5, HDF4 and FITS files.

Ninja build on CentOS 7

The binary executables for Ninja 1.9.0 do not work on CentOS 7. The error looks like:

$ ninja

ninja: /lib64/libstdc++.so.6: version `GLIBCXX_3.4.21' not found (required by ninja)
ninja: /lib64/libstdc++.so.6: version `GLIBCXX_3.4.20' not found (required by ninja)

This is a known issue with the Ninja release artifact build process, which is planned to be fixed for Ninja 1.10.0.

Workaround

Until Ninja 1.10.0 is released, use Ninja 1.8.2 on CentOS. This may work for other “older” Linux distros with similar errors.

MacOS hardware requirements vs. version

Those who need to keep MacOS-capable hardware around for development or maintenance purposes, or simply as a form of conserving resources may be interested to know how Apple is increasing MacOS requirements over time. In general, if the Apple computer is too old for a currently supported MacOS version, consider installing Linux, which normally works well on Apple computer hardware. Apple maintains a list of no longer supported Apple hardware.


MacOS requirements by version.

The red dotted line depicts the oldest supported version of MacOS.

Data source is Wikipedia.

Using CMake on Windows

CMake on Windows is installed via the cmake*-win64-x64.msi graphical installer. Do not use sudo or Run As Administrator for cmake in general.

Compile programs using CMake

  1. Navigate to the directory containing the file CMakeLists.txt using the Windows Terminal / Command Prompt.
  2. Configure the build. This is normally run only once unless making major project changes.

    cmake -G "MinGW Makefiles" -DCMAKE_SH="CMAKE_SH-NOTFOUND" -B build .
  3. Compile the program. This is the command run each time you make a change to the project code.

    cmake --build build
  4. Optionally, install the program with

    cmake --build build --target install

Generator selection

On Windows, CMake defaults to Visual Studio and Nmake. The cmake options above direct the build to MinGW. If you wish to make this change permanent for CMake ≥ 3.15, set the Windows environment variable CMAKE_GENERATOR to “MinGW Makefiles”. This can still be overridden if needed like

cmake -G "Visual Studio 16 2019"

sh.exe error with cmake

With the MinGW Makefiles generator, cmake may stop with a nuisance error about sh.exe being on the Path. This error also happens with CMake Windows builds on Azure Pipelines and GitHub Actions.

sh.exe was found in your PATH, here: C:/Program Files/Git/usr/bin/sh.exe For MinGW make to work correctly sh.exe must NOT be in your path. Run cmake from a shell that does not have sh.exe in your PATH. If you want to use a UNIX shell, then use MSYS Makefiles.

Eliminate this message by adding -DCMAKE_SH="CMAKE_SH-NOTFOUND" to the cmake command, like:

cmake -G "MinGW Makefiles" -DCMAKE_SH="CMAKE_SH-NOTFOUND" ..

Notes

Use ** instead of pow in Python

In Python, x**y is much faster than pow(x, y), math.pow(x, y) or numpy.power(x, y) for scalar values.

Julia is more than 5 times faster than Python at scalar exponentiation, while Go was in-between Python and Julia in performance.

Python

Benchmarking was the same for integer or float base or exponent.

Python testing done with:

  • Python 3.7.4
  • IPython 7.8.0
  • Numpy 1.16.5

** operator

The ** operator in Python also has the advantage of returning int if inputs are int and arithmetic result is integer.

10**(-3)
8.22 ns ± 0.0182 ns per loop (mean ± std. dev. of 7 runs, 100000000 loops each)

pow(10, -3)
227 ns ± 0.313 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)

math.pow(10, -3)
252 ns ± 1.56 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)

numpy.power(10., -3)
1.5 µs ± 2.91 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)

Numpy is known in general to be slower at scalar operations than Python-native operators and Python built-in math. But of course Numpy is generally a lot faster and easier for N-dimensional array operations.
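A minimal sketch of how a comparison like the one above might be reproduced with only the standard-library timeit module. The names base, exponent and time_ns are illustrative and not part of the original benchmark. Variables are used instead of literals because CPython may fold a literal expression such as 10**(-3) into a constant at compile time, which would hide the cost of the operator. Timings depend on the machine, so no expected numbers are shown.

```python
import math
import timeit

base, exponent = 10, -3  # illustrative values, kept in variables to avoid constant folding


def time_ns(stmt: str, number: int = 100_000) -> float:
    """Return the best mean time per evaluation in nanoseconds over 5 repeats."""
    best = min(timeit.repeat(stmt, globals=globals(), number=number, repeat=5))
    return best / number * 1e9


results = {
    stmt: time_ns(stmt)
    for stmt in ("base ** exponent", "pow(base, exponent)", "math.pow(base, exponent)")
}

for stmt, ns in results.items():
    print(f"{stmt:28s} {ns:8.1f} ns")

# ** keeps int when both inputs are int and the result is an integer,
# while math.pow always returns float
print(type(2 ** 3), type(math.pow(2, 3)))
```

Taking the minimum over several repeats is a common way to reduce noise from other processes on the machine.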

Julia

Julia 1.2.0 was likewise benchmarked for reference on the same computer.

First we installed Julia BenchmarkTools:

import Pkg
Pkg.add("BenchmarkTools")

As with Python, the Julia wall-clock time for exponentiation was the same for float and int:

3.399 nanoseconds

Go

Go 1.13.1 was benchmarked:

go test -bench=Power
BenchmarkPower-12       33883672                31.8 ns/op

go benchmark reference

Python flatten list of lists into list

For scenarios where a function outputs a list, and that function is in a for loop or asyncio event loop, the final output will be a list of lists, like:

x = [[1,2,3], [4, 5, 6]]

This may be inconvenient for applications where a flattened list is required. The simplest and fastest way to flatten a list of lists is like:

import itertools

x = [[1,2,3], [4, 5, 6]]

x_flat = list(itertools.chain(*x))

which results in

[1, 2, 3, 4, 5, 6]
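The loop scenario described above can be sketched as follows. The function name measure is hypothetical and stands in for any function that returns a list. This sketch uses itertools.chain.from_iterable, a standard-library alternative that flattens one level without unpacking the outer list into arguments.

```python
import itertools


def measure(start: int) -> list:
    """Hypothetical function that returns a list on each call."""
    return [start, start + 1, start + 2]


# calling the function in a loop yields a list of lists
nested = [measure(i) for i in (1, 4)]

# chain.from_iterable flattens one level without unpacking the outer list
flat = list(itertools.chain.from_iterable(nested))

print(nested)
print(flat)
```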

GNSS data abbreviations

Some of the most fundamental GNSS measurements retrieved from GNSS receivers include:

  • CNR: carrier to noise ratio or C/N0. The estimation technique varies between receivers. Typical values in the range 30..50 [dB Hz]
  • PSR: Pseudorange [meters]
  • ADR: accumulated Doppler range–carrier phase measurements [cycles]

Using networks of GNSS receivers along with appropriate post-processing techniques, estimated maps of vertical TEC (integrated electron density) can be derived:


Haystack TEC map https://www.haystack.mit.edu/eclipse.html

Reference

GitHub / GitLab Pages to Netlify

While both GitHub Pages and GitLab Pages are adequate for most personal, group and project pages, when website size and / or traffic have grown beyond what is feasible for these solutions, a more comprehensive hosting provider like Netlify may be considered. Netlify provides its own CDN, so those that had been using Cloudflare for DNS and CDN can configure Cloudflare to provide only DNS, if they so choose. Netlify is free for single users, allowing a private GitLab, GitHub or Bitbucket repo (or other suitable source) to deploy to a public custom domain HTTPS website. SSL certificates can be user-provided or can be created through Netlify for your custom domain (e.g. https://www.example.com).

Why transfer site to Netlify

Netlify provides a comparison of GitHub Pages and Netlify. GitLab Pages allows user choice of static site generator (Hugo, Jekyll, etc.), while GitHub Pages allows only Jekyll. GitLab Pages private repos have a runtime limit of 2000 minutes/month. Netlify allows 3 builds / minute and 100 GB / month on the free tier, with 300 build minutes/month. For sites that are becoming very popular, GitHub Pages will simply want you to move elsewhere, while Netlify will have a paid plan to offer. This process may be too burdensome for those with limited IT or bandwidth resources, or simply a lack of time to learn how to do this.

Netlify uses webhooks to detect a git push to the website GitLab repo, and then builds the site. Netlify has a CDN and DDoS protection built-in. Even if the other features aren’t needed, a key feature is the ability to have the website code in a private repo with unlimited public website deployments and traffic.

Build minute limits (such as on GitLab and Netlify) can legitimately be worked around by building the site locally on your laptop and pushing the publish-ready HTML.

Transfer site to Netlify

Note: This process may take down your site for a day or two if things go wrong. Even under normal conditions, all site visitors may need to allow an HTTPS exception due to SSL certificate error since Netlify requires all DNS servers to update before generating the domain certificate.

  1. If not already on GitLab, copy your website repo to GitLab (any repo name is fine).
  2. Disable Auto DevOps and ensure no file named .gitlab-ci.yml exists.
  3. Log in to Netlify using GitLab, which will ask for your website repo.
  4. Pick a custom Netlify subdomain like mycompany.netlify.com. Ensure this site is totally working before proceeding.
  5. Set Cloudflare or whatever your DNS provider is to point CNAME or A to mycompany.netlify.com (THIS IS THE PART THAT CAN TAKE YOUR MAIN WEBSITE DOWN!)
  6. Under Netlify Domain Management → HTTPS → Verify DNS config, ensure the verification completes. Until the DNS change propagates worldwide, visitors to your main HTTPS domain may get SSL verification errors. They can use http://mycompany.com instead of https://mycompany.com temporarily. Do this at a low-traffic time! If you were using Cloudflare CDN, the old records may point to DigitalOcean while the new records point to *.netlify.com.
  7. Optionally, consider the options under Netlify → Build & deploy → Post Processing → Asset Optimization to improve website speed.

Numpy / OpenCV image BGR to RGB

Conversion between any/all of BGR, RGB, and GBR may be necessary when working with

  • Matplotlib pyplot.imshow(): M x N x 3 image, where last dimension is RGB.
  • OpenCV cv2.imshow(): M x N x 3 image, where last dimension is BGR
  • Scientific cameras: some output an M x N x 3 image, where last dimension is GBR

Note: as in any programming language, operations on memory-contiguous arrays are most efficient. In particular, OpenCV in-place operations require a contiguous array from Python to avoid unexpected results. The safest approach is to always make a copy of the array as in the examples below.

Examples

Use .copy() to avoid unexpected results if using OpenCV. If just using Matplotlib, .copy() is not necessary, but performance (speed) may benefit from .copy().

BGR to RGB

OpenCV image to Matplotlib

rgb = bgr[...,::-1].copy()

RGB to BGR

Matplotlib image to OpenCV

bgr = rgb[...,::-1].copy()

RGB to GBR

gbr = rgb[...,[1,2,0]].copy()
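A small sketch of the channel reordering on a tiny array, assuming NumPy is installed. The pixel values are made up for illustration. Reversing the last axis swaps RGB and BGR, and GBR is obtained by selecting channel indices 1, 2, 0 (G, B, R) from an RGB image. The sketch also shows that the reversed view is not memory-contiguous until it is copied.

```python
import numpy as np

# hypothetical 1 x 2 pixel image, channel order R, G, B
rgb = np.array([[[255, 0, 0], [10, 20, 30]]], dtype=np.uint8)

bgr = rgb[..., ::-1].copy()       # reverse the last axis: RGB -> BGR
gbr = rgb[..., [1, 2, 0]].copy()  # pick G, B, R in that order: RGB -> GBR

# the reversed view is not C-contiguous until it is copied
print(rgb[..., ::-1].flags["C_CONTIGUOUS"], bgr.flags["C_CONTIGUOUS"])
print(bgr[0, 1], gbr[0, 1])
```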

Axis order for Python images

  • Color: M x N x 3 (rows x columns x color), where the last axis is color (e.g. RGB)
  • Color with alpha: M x N x 4, where the fourth element of the last axis is typically an alpha channel (e.g. RGBA)

Further examples

Majority of new Python work is Python 3

Consider requiring Python ≥ 3.6 for your Python projects.

There is considerable additional effort required to support Python ≤ 3.5 in general while using concurrent and other performant Python programming with the latest common modules like Numpy, Xarray, Matplotlib, etc.


Python 3 is used by a large and growing majority of new and active Python developers in science, engineering, medical research and education. Python 3 was released in December 2008. While initially there was pushback over backward incompatibilities, the performance, efficiencies and features of Python 3 have won out over Python 2. Python 2 is obsolete, so it is unwise to continue to put effort into new Python 2 code in general.

Python 2 deprecation

Major Python data analysis packages used in STEM industry, research and education have already abandoned Python 2, including IPython, AstroPy, SunPy, Django and virtually every worthwhile maintained Python module.

The most popular Python packages have supported Python 3 for some time now, including Amazon AWS and Google Cloud Platform. Already in 2016, Python 3 was starting to take over in PyCharm IDE, and by May 2017, the majority of PyCharm users were on Python 3.

More than 95% of our projects require at least Python 3.6. PyPy supports Python 3.6 syntax. In the few exceptions, due to the shortness of the program and simplicity of code, there wasn’t any reason to lock Python 2 users out. One example is PyMap3D, a key utility in 3-D coordinate transformation that parallels conversions available in Matlab, for which I have allowed Python 2 compatibility.

From 2018 onward, Python projects should generally be written for Python ≥ 3.6, due to the significant number of features requiring Python 3.6.

The main holdouts in Python 2 code are of the same nature as those that hang on to COBOL systems: those with static, unchanging requirements in large proprietary codebases that few people are familiar with. Some programmers thrive and make a decent living servicing those legacy COBOL and Python 2 environments. The majority of STEM coders, analysts and educators have been writing Python 3 code. The old Python 2-only and Python 2-first objections were mostly written before 2016 and almost all were before 2017. Some of their complaints were addressed in Python 3.6 (released December 23, 2016).

The main outstanding complaint I see over Python 3 is over the separation between bytes and strings. Our work in applications with IoT and embedded systems distinguishes between bytes and strings, so I appreciate the separation of bytes and strings in Python 3. For the global environment I write for, I appreciate that strings are Unicode in Python 3.

Efficiency

Python 3 efficiencies in terms of programmer time come in more efficient syntax. The Python 3 core itself is as much as 10% faster in several aspects, with some standard modules like re processing regular expressions as much as 20x faster. The modernize tool and six and __future__ modules smooth over most of these changes to make Python 2 / 3 compatible code. Some of the most enticing changes from Python ≥ 3.6 are not easily backportable to Python 2.7, however. These features simplify syntax and make the code more self-documenting. Python 2 has many years of cruft bolted in, and that’s a big part of why Python 3 introduced backward incompatibilities, and they are growing with time.

The brief list below covers just a few Python 3 features that I take advantage of. There are many more that may apply for you.

Asynchronous execution

asyncio has been rapidly maturing since its genesis in Python 3.4. It brings to core Python features that used to require Tornado, twisted, etc. Asynchronous execution is required for applications that need to scale massively. This is not just for Facebook: IoT applications where remote sensors report in are a perfect use case for asyncio. asyncio is a convenient, relatively safe way to run concurrent tasks, increasingly baked right into Python.

Since asyncio continues to develop at speed, you should use the newest version of Python suitable (at least Python 3.6) and ensure the documentation you’re reading is for the version of Python you’re using.
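As a hedged illustration of the sensor use case, the sketch below waits on three simulated sensors concurrently. The function read_sensor, the sensor names and the delays are all hypothetical; asyncio.sleep stands in for network latency. It uses asyncio.run, which requires Python ≥ 3.7.

```python
import asyncio


async def read_sensor(name: str, delay: float) -> str:
    """Hypothetical remote sensor: the sleep stands in for network latency."""
    await asyncio.sleep(delay)
    return f"{name}: ok"


async def main() -> list:
    # the three waits overlap, so total time is about the longest delay
    return await asyncio.gather(
        read_sensor("sensor-a", 0.03),
        read_sensor("sensor-b", 0.01),
        read_sensor("sensor-c", 0.02),
    )


reports = asyncio.run(main())  # asyncio.run requires Python >= 3.7
print(reports)
```

asyncio.gather returns results in the order the awaitables were passed in, regardless of which one finishes first.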

LRU caching

Least-recently used caching is enabled by a decorator to functions. For example, you have a computationally-expensive function where you sometimes use the same arguments, but don’t want to bother tracking the output by saving in a variable. LRU caching is as simple as:

from functools import lru_cache

@lru_cache(maxsize=None)
def fib(n):
    if n < 2:
        return n
    return fib(n-1) + fib(n-2)

print([fib(n) for n in range(16)])

print(fib.cache_info())

[0, 1, 1, 2, 3, 5, 8, 13, 21, 34, 55, 89, 144, 233, 377, 610]

CacheInfo(hits=28, misses=16, maxsize=None, currsize=16)

Factory installed pip

Factory installed pip (bundled with Python itself) encourages proper use of setup.py and setuptools instead of one-off, OS-dependent setup scripts. Since 2018, pip ≥ 10 can use the simpler pyproject.toml instead of or along with setup.py for Python projects.

Type hinting

Python 3.5 added type hinting, which some parsers use to give code warnings, while not actually enforcing strict typing (unless you want to).

import math

# a Python identifier cannot start with a digit, so the function is named two_sin_xy
def two_sin_xy(x: float, y: float) -> float:
    return 2*math.sin(x*y)

This function would not crash if you fed int in on most interpreters, but PyCharm IDE and other tools can emit warnings when conflicting variable types are passed in/out.
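A short sketch showing that annotations are stored on the function but not enforced at run time. The function name two_sin_xy and the use of fractions.Fraction as a deliberately mismatched argument type are choices made for this illustration only.

```python
import math
from fractions import Fraction


def two_sin_xy(x: float, y: float) -> float:
    """Illustrative function; the name is an assumption for this sketch."""
    return 2 * math.sin(x * y)


# annotations are stored on the function but not checked at run time
print(two_sin_xy.__annotations__)

# a Fraction is not a float, yet the call runs because math.sin accepts
# anything convertible to float; a static checker may flag this call
value = two_sin_xy(Fraction(1, 2), 2)
print(value)
```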

Argument unpacking

Python 3.5 generalized argument unpacking (PEP 448). Iterables can be fed into functions requiring multiple arguments, expanded with *iterable, and since Python 3.5 multiple iterables can be unpacked into a single function call.
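A minimal sketch of unpacking two iterables into one call. The function box_volume and the variable names are hypothetical.

```python
def box_volume(length: float, width: float, height: float) -> float:
    """Hypothetical function that needs three positional arguments."""
    return length * width * height


base = (2, 3)   # length, width
heights = [4]   # height

# two iterables unpacked into one call (allowed since Python 3.5)
volume = box_volume(*base, *heights)

# the same generalization works inside list displays
merged = [*base, *heights]

print(volume, merged)
```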

OO IPv4/v6 addressing

ipaddress is a useful standard library feature for manipulating IP addresses, used in the findssh program to scan for active servers in IP address ranges.
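A brief sketch of iterating over the usable hosts in a network with ipaddress. The address range is a documentation-only placeholder, and this is not the findssh implementation.

```python
import ipaddress

# 192.0.2.0/30 is a documentation-only range used here as a placeholder
net = ipaddress.ip_network("192.0.2.0/30")

# hosts() skips the network and broadcast addresses
hosts = [str(h) for h in net.hosts()]
print(hosts)

addr = ipaddress.ip_address("192.0.2.1")
print(addr in net, addr.version)
```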

OO path/filename

pathlib is in the standard library. It replaces most of the awkward os.path functions.
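A small sketch of common pathlib operations that would otherwise need os.path.join, os.path.basename and os.path.splitext. The file location is hypothetical and no file is read or written.

```python
from pathlib import Path

# hypothetical file location, used only to show path manipulation
p = Path("data") / "2019" / "image.fits"

print(p.name)    # file name with extension
print(p.stem)    # file name without extension
print(p.suffix)  # extension including the dot
print(p.parent.as_posix())

renamed = p.with_suffix(".h5")
print(renamed.as_posix())
```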

F-strings (no ‘{}’.format)

f-strings from Python 3.6 allow f'This is {weight} kg for {price} dollars.' instead of 'This is {} kg for {} dollars.'.format(weight, price)
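A quick sketch comparing the two styles. The values of weight and price are illustrative.

```python
weight = 2.5   # illustrative values
price = 10

new_style = f"This is {weight} kg for {price} dollars."
old_style = "This is {} kg for {} dollars.".format(weight, price)

print(new_style)
print(new_style == old_style)

# format specifications work inline as well
per_kg = f"{price / weight:.2f} dollars per kg"
print(per_kg)
```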

Python 3.7

Python 3.7 adds several more compelling reasons to upgrade.

Large Companies

Patreon transitioned from PHP → Python 3 in 2015. Key appeals for Patreon to switch to Python 3 included:

  • future-proofing
  • appeal to developer hiring, working with latest language
  • lower risk than porting to static typed language like Scala


Instagram runs fully on Python 3, as noted at the 2017 PyCon Keynote at the 13 minute mark.


Facebook is proud of its ongoing conversion to Python 3 as well, and uses this fact to help recruit top talent.

Linux Python 2 demotion

Starting in 2010, Arch Linux defaulted to Python 3. In 2017, Ubuntu 17.10 defaulted to Python 3, and Python 2 had to be manually installed. In 2018, Ubuntu 18.04 began cleaning out Python 2 from all programs in the main Ubuntu repository and defaults to Python 3.6. The goal is to demote Python 2 from the main repository, requiring an extra step to install Python 2 from the “universe” repository.

Executable Python scripts should continue to have the first line

#!/usr/bin/env python

so that users can configure their desired Python version. Many users install a third party Python distribution such as Anaconda Python, PyCharm, Intel Python, etc. that have high performance math libraries such as PyCUDA.

Further Reading

  • Very detailed notes from Python Software Foundation Fellow Nick Coghlan on why, when, what, how, where of Python 3 transition with fascinating historical notes.
  • ActiveState has seen a majority of downloads being Python 3 since January 2017.