radproc.raw.create_idraster_and_process_radolan_data

radproc.raw.create_idraster_and_process_radolan_data(inFolder, HDFFile, clipFeature=None, complevel=9)

Convert all RADOLAN binary data in a directory tree into an HDF5 file with monthly DataFrames for a given study area.

First, an ID raster is generated and, if you specified a Shapefile or Feature Class as clipFeature, clipped to your study area. The national ID raster (idras_ger) and the clipped one (idras) are saved in the directory of the HDF5 file.

Afterwards, all RADOLAN binary files in the directory tree are imported, clipped to the study area, converted into monthly pandas DataFrames and stored in the HDF5 file.

The names for the HDF5 datasets are derived from the names of the input folders (year and month). The directory tree containing the raw binary RADOLAN data is expected to have the following format:

<inFolder>/<year>/<month>/<binaries with RADOLAN data>

–> <inFolder>/YYYY/MM

–> <inFolder>/2008/5 or <inFolder>/2008/05

–> e.g. C:/Data/RW/2008/5

In this example, the output dataset will have the path 2008/5 within the HDF5 file. The necessary format is automatically generated by the functions radproc.raw.unzip_RW_binaries() and radproc.raw.unzip_YW_binaries().
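The expected folder layout can be checked before running the import. The following is only a minimal sketch using the Python standard library, not radproc's own implementation: the helper names `list_month_folders` and `dataset_key` are hypothetical, and normalising `05` to `5` in the key is an assumption made for illustration.

```python
import os
import tempfile


def list_month_folders(in_folder):
    """Return sorted (year, month, path) tuples for <in_folder>/YYYY/MM subfolders."""
    found = []
    for year in os.listdir(in_folder):
        year_path = os.path.join(in_folder, year)
        # Only four-digit folder names are treated as years.
        if not (os.path.isdir(year_path) and year.isdigit() and len(year) == 4):
            continue
        for month in os.listdir(year_path):
            month_path = os.path.join(year_path, month)
            # Accept both "5" and "05" as month folder names.
            if os.path.isdir(month_path) and month.isdigit() and 1 <= int(month) <= 12:
                found.append((int(year), int(month), month_path))
    return sorted(found)


def dataset_key(year, month):
    """Build a dataset path such as '2008/5' (assumed naming, see lead-in)."""
    return "{}/{}".format(year, month)


# Demonstration on a temporary directory tree.
root = tempfile.mkdtemp()
for sub in ("2008/5", "2008/06", "2007/12", "notes"):
    os.makedirs(os.path.join(root, *sub.split("/")))

keys = [dataset_key(y, m) for y, m, _ in list_month_folders(root)]
print(keys)
```

Folders that do not match the YYYY/MM pattern (such as `notes` above) are simply ignored by this sketch.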

If necessary, a text file listing all directories which could not be processed due to data format errors is created in the directory of the HDF5 file.

Parameters:
inFolder : string
Path to the directory tree containing RADOLAN binary files. The directory tree is expected to have the following structure: <inFolder>/YYYY/MM –> C:/Data/RADOLAN/2008/5
HDFFile : string
Path and name of the HDF5 file. If the specified HDF5 file already exists, the new dataset will be appended; if the HDF5 file doesn’t exist, it will be created.
clipFeature : string (optional, default: None)
Path to the clip feature defining the extent of the study area. File type may be Shapefile or Feature Class. The clip Feature does not need to be provided in the RADOLAN projection. See below for further details. Default: None (Data are not clipped to any study area)
complevel : integer (optional, default: 9)
Defines the level of compression for the output HDF5 file. complevel may range from 0 to 9, where 9 is the highest compression possible. Using a high compression level reduces data size significantly, but writing data to HDF5 takes more time and data import from HDF5 is slightly slower.
Returns:

No return value. The function creates a dataset for every month in the HDF5 file specified in parameter HDFFile. Additionally, two ID rasters are created in the directory of the HDF5 file.

In case any binary files could not be read in due to processing errors, these are skipped and the respective intervals are filled with NoData (NaN) values. A textfile with the names and error messages for the respective monthly input data folder is written and stored in inFolder for information. For example, issues due to obviously corrupt file formats are known for the RADOLAN RW dataset in July and August 2005 and May 2007.

See also

See File system description for further details on data processing. If you already have an ID Array available, use radproc.raw.process_radolan_data() instead.

Note

The RADOLAN data are provided in a custom stereographic projection defined by the DWD and both ID rasters will automatically be generated in this projection by this function. As there is no transformation method available yet, it is not possible to directly perform any geoprocessing tasks with RADOLAN and geodata with other spatial references. Nevertheless, ArcGIS is able to perform a correct on-the-fly transformation to display the data together. The clip function implemented in radproc uses this as a work-around solution to “push” the clip feature into the RADOLAN projection. Hence, the clipping works with geodata in different projections, but the locations of the cells might be slightly inaccurate.

Format description and examples:
 

After the import, individual DataFrames can be loaded into memory from the generated HDF5 file using radproc.core.load_month(). Several DataFrames within the same year can be loaded together using radproc.core.load_months_from_hdf5().

Every row of the output DataFrames corresponds to a precipitation raster of the investigation area at a specific date and time. Every column corresponds to the precipitation time series of a specific raster cell.

Data can be accessed and sliced with the following syntax:

df.loc[row_index, column_name]

where the row index is a string in the date format ‘YYYY-MM-dd hh:mm’ and the column names are integer values.

Examples:

>>> import radproc
>>> dataHDF5 = r'C:\Data\HDF5\RW.h5'
>>> df = radproc.load_month(HDFFile=dataHDF5, year=2008, month=5)
>>> df.loc['2008-05-01 00:50',414773] #--> returns single float value of specified date and cell
>>> df.loc['2008-05-01 00:50', :] #--> returns entire row (= raster) of specified date as one-dimensional pandas Series
>>> df.loc['2008-05-01', :] #--> returns DataFrame with all rows of specified day (because time of day is omitted)
>>> df.loc[:, 414773] #--> returns time series of the specified cell as Series
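
The same access patterns can be tried without an HDF5 file. The following is a minimal sketch on synthetic data, assuming only pandas and NumPy: the cell IDs, the values and the hourly timestamps at minute 50 are made up for illustration and do not come from a real RADOLAN import.

```python
import numpy as np
import pandas as pd

# Hypothetical cell IDs used as integer column names.
cell_ids = [414773, 414774, 414775]

# 48 hourly time steps starting at 2008-05-01 00:50 (synthetic index).
index = pd.date_range("2008-05-01 00:50", periods=48, freq=pd.Timedelta(hours=1))

# Synthetic values: one row per time step, one column per raster cell.
values = np.arange(48 * 3, dtype=float).reshape(48, 3)
df = pd.DataFrame(values, index=index, columns=cell_ids)

single_value = df.loc["2008-05-01 00:50", 414773]  # one value: one time step, one cell
one_raster = df.loc["2008-05-01 00:50", :]         # all cells at one time step
one_day = df.loc["2008-05-01", :]                  # all time steps of one day
cell_series = df.loc[:, 414773]                    # full time series of one cell

print(one_raster.shape, one_day.shape, len(cell_series))
```

Omitting the time of day in the row label selects every row of that day, which is why `one_day` keeps two dimensions while `one_raster` does not.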