radproc.raw.process_radolan_data

radproc.raw.process_radolan_data(inFolder, HDFFile, idArr=None, complevel=9)

Converts all RADOLAN binary data into an HDF5 file with monthly DataFrames for a given study area without generating a new ID raster.

All RADOLAN binary files in a directory tree are imported, clipped to the study area, converted into monthly pandas DataFrames and stored in an HDF5 file.

The names for the HDF5 datasets are derived from the names of the input folders (year and month). The directory tree containing the raw binary RADOLAN data is expected to have the following format:

<inFolder>/<year>/<month>/<binaries with RADOLAN data>

--> <inFolder>/YYYY/MM

--> C:/Data/RADOLAN/2008/5

In this example, the output dataset will have the path 2008/5 within the HDF5 file.
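The mapping from folder names to dataset paths can be sketched with the standard library alone. The helper below, list_dataset_keys, is hypothetical and not part of radproc; it assumes only the directory layout described above, with purely numeric year and month folder names, and ignores everything else in inFolder.

```python
from pathlib import Path


def list_dataset_keys(in_folder):
    """Return the dataset paths implied by an <inFolder>/YYYY/MM tree.

    Hypothetical helper for illustration only; not part of radproc.
    """
    def numeric_subdirs(folder):
        # Keep only directories with numeric names and sort them by value,
        # so that month 10 follows month 9 instead of month 1.
        dirs = [p for p in folder.iterdir() if p.is_dir() and p.name.isdigit()]
        return sorted(dirs, key=lambda p: int(p.name))

    keys = []
    for year_dir in numeric_subdirs(Path(in_folder)):
        for month_dir in numeric_subdirs(year_dir):
            # The dataset path is "<year>/<month>", taken from the folder names.
            keys.append(f"{year_dir.name}/{month_dir.name}")
    return keys
```

Called on a folder containing 2008/5, the function returns a list that includes the string 2008/5, matching the dataset path named above.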

Additionally, a text file listing all directories that could not be processed due to data format errors is created in the directory of the HDF5 file.

Parameters:
inFolder : string
Path to the directory containing the RADOLAN binary files, stored in a directory tree of the following structure: <inFolder>/YYYY/MM --> C:/Data/RADOLAN/2008/5
HDFFile : string
Path and name of the HDF5 file. If the specified HDF5 file already exists, the new dataset will be appended; if the HDF5 file doesn’t exist, it will be created.
idArr : one-dimensional numpy array (optional, default: None)
Array of ID values used to select the RADOLAN data of the cells located in the investigation area. If no idArr is specified, the ID array is generated automatically from RADOLAN metadata and the RADOLAN precipitation data are not clipped to any investigation area.
complevel : integer (optional, default: 9)
Defines the level of compression for the output HDF5 file. complevel may range from 0 to 9, where 9 is the highest compression possible. Using a high compression level reduces data size significantly, but writing data to HDF5 takes more time and data import from HDF5 is slightly slower.
Returns:
No return value. The function creates a dataset for every month in the HDF5 file specified by the parameter HDFFile. Additionally, a text file listing all directories that could not be processed due to data format errors is created in the directory of the HDF5 file.
Notes:

See File system description for further details on data processing.

This function can be used to process RADOLAN data without having ArcGIS installed.

Format description and examples:
 

After the import, individual DataFrames can be loaded into memory from the generated HDF5 file using radproc.core.load_month(). Several DataFrames within the same year can be loaded together using radproc.core.load_months_from_hdf5().

Every row of an output DataFrame holds the precipitation raster of the investigation area at one specific date and time. Every column holds the precipitation time series of one specific raster cell.

Data can be accessed and sliced with the following syntax:

df.loc[row_index, column_name]

with the row index given as a string in the date format ‘YYYY-MM-dd hh:mm’ and the column names given as integer values.

Examples:

>>> dataHDF5 = r'C:\Data\HDF5\RW.h5'
>>> df = radproc.load_month(HDFFile=dataHDF5, year=2008, month=5)
>>> df.loc['2008-05-01 00:50',414773] #--> returns single float value of specified date and cell
>>> df.loc['2008-05-01 00:50', :] #--> returns entire row (= raster) of specified date as one-dimensional Series
>>> df.loc['2008-05-01', :] #--> returns DataFrame with all rows of specified day (because time of day is omitted)
>>> df.loc[:, 414773] #--> returns time series of the specified cell as Series
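
The access patterns above can also be tried without any RADOLAN data. The following is a minimal sketch that builds a small synthetic DataFrame with the same layout (a DatetimeIndex as row index, integer cell IDs as column names). The cell IDs, the hourly time step and the values are made up for illustration and do not come from radproc.

```python
import numpy as np
import pandas as pd

# Synthetic stand-in for one monthly DataFrame: two days of hourly values
# (time stamps at minute 50) for three made-up cell IDs.
index = pd.date_range("2008-05-01 00:50", periods=48, freq=pd.Timedelta(hours=1))
cell_ids = [414772, 414773, 414774]
values = np.arange(48 * 3, dtype=float).reshape(48, 3)
df = pd.DataFrame(values, index=index, columns=cell_ids)

# Single value of one cell at one time step.
single_value = df.loc["2008-05-01 00:50", 414773]

# One row (= raster) at one time step: a Series with one entry per cell.
one_raster = df.loc["2008-05-01 00:50", :]

# All rows of one day (time of day omitted): a DataFrame.
one_day = df.loc["2008-05-01", :]

# Time series of one cell: a Series with one entry per time step.
time_series = df.loc[:, 414773]
```

Omitting the time of day selects a whole day because pandas treats a date string that is coarser than the index resolution as a slice, whereas a string at the same resolution as the index is treated as an exact label.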