Check HDF5 files for corruption

HDF5 files have no error recovery mechanism and are not journaled. An optional per-variable (per-dataset) Fletcher32 checksum can detect data corruption.
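
As a minimal sketch of turning the checksum on, assuming the h5py package is installed: h5py exposes the Fletcher32 filter through the fletcher32 keyword of create_dataset. The function names and the dataset name "temperature" are made up for illustration. chunks=True is passed explicitly because HDF5 filters apply to chunked storage.

```python
import h5py


def write_with_checksum(path):
    """Write a small dataset with the Fletcher32 checksum filter enabled."""
    with h5py.File(path, "w") as f:
        f.create_dataset(
            "temperature",              # illustrative dataset name
            data=[20.5, 21.0, 19.8, 22.1],
            chunks=True,                # filters need a chunked layout
            fletcher32=True,            # store a checksum with each chunk
        )


def has_checksum(path, name):
    """Report whether the named dataset was written with Fletcher32."""
    with h5py.File(path, "r") as f:
        return bool(f[name].fletcher32)
```

Calling write_with_checksum("demo.h5") and then has_checksum("demo.h5", "temperature") is one way to confirm the filter is active on a given variable. The checksum has to be requested when the dataset is created.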

  • Checking/comparing file size alone is not an adequate check for HDF5 corruption.
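
To illustrate the point about file size, here is a small sketch using only the Python standard library. It flips one bit in a copy of a file: the size stays the same, but a SHA-256 digest changes. The helper names are illustrative. Comparing against a digest recorded while the file was known to be good is a general technique, not something specific to HDF5.

```python
import hashlib
from pathlib import Path


def sha256sum(path, blocksize=1 << 20):
    """Return the SHA-256 hex digest of a file, read in blocks."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for block in iter(lambda: f.read(blocksize), b""):
            h.update(block)
    return h.hexdigest()


def flip_bit(path, offset):
    """Flip the lowest bit of the byte at `offset` in place (simulated corruption)."""
    with open(path, "r+b") as f:
        f.seek(offset)
        byte = f.read(1)
        if not byte:
            raise ValueError("offset is past the end of the file")
        f.seek(offset)
        f.write(bytes([byte[0] ^ 1]))


def same_size(a, b):
    """True if both files have the same size in bytes."""
    return Path(a).stat().st_size == Path(b).stat().st_size
```

After flip_bit is applied to a copy, same_size still reports a match while the two sha256sum digests differ.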

HDF5 testing

Here are a few easy techniques to check for corrupted HDF5 files.

Python

This Python-based HDF5 checking script checks HDF5 files for corruption and optionally finds the corrupted block(s) and variable(s).
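
The script itself is not shown here. As a rough sketch of the general idea, assuming h5py is installed, one can open the file and read every variable in full, recording any that raise an error. Reading forces HDF5 to run its filters, so a variable written with Fletcher32 should fail to read if its stored checksum no longer matches. The function name find_bad_datasets is hypothetical. This sketch cannot notice silent corruption in variables that were written without a checksum.

```python
import h5py


def find_bad_datasets(path):
    """Return a list of (name, error message) for everything that fails to read.

    An entry named "/" means the file itself could not be opened or traversed.
    """
    problems = []

    def visit(name, obj):
        if isinstance(obj, h5py.Dataset):
            try:
                obj[()]  # read the whole dataset (this loads it into memory)
            except Exception as err:
                problems.append((name, str(err)))
        return None  # returning None lets visititems continue

    try:
        with h5py.File(path, "r") as f:
            f.visititems(visit)
    except Exception as err:
        problems.append(("/", str(err)))
    return problems
```

An empty list means every variable could be read. For very large variables, reading chunk by chunk instead of all at once would keep memory use down.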

Shell

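Install the HDF5 command-line tools via:

  • Linux: apt install hdf5-tools
  • macOS: brew install hdf5
  • Windows: use MSYS2

Then run h5stat, which reads the file and prints statistics about its objects and metadata:

h5stat file.h5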

You can also print the data values in the file:

h5dump file.h5
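
To run these command-line checks over many files, a small Python wrapper can call the tool and look at its exit status. This is a sketch under assumptions: the HDF5 tools are on PATH, and a nonzero exit status is taken to mean the tool could not read the file. The helper name tool_ok is made up here.

```python
import shutil
import subprocess


def tool_ok(path, tool="h5stat"):
    """Run an HDF5 command-line tool on `path`.

    Returns True if the tool exits with status 0, False if it exits with a
    nonzero status, and None if the tool is not found on PATH.
    """
    exe = shutil.which(tool)
    if exe is None:
        return None
    result = subprocess.run(
        [exe, str(path)],
        stdout=subprocess.DEVNULL,  # output can be large, especially for h5dump
        stderr=subprocess.DEVNULL,
    )
    return result.returncode == 0
```

For example, tool_ok("file.h5") uses h5stat, and tool_ok("file.h5", tool="h5dump") reads the data values as well, which takes longer on large files.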

GUI HDF5 checker

HDFView appears to use the Fletcher32 checksum to show a red question mark when corruption is detected. Another curiosity is that the object reference of the corrupted variable is 2^32 - 1.

[Figure: HDFView showing a bad variable]

HDF5 GUIs to view and edit variables in .h5 files