Baseline ETo Data
The baseline ETo endpoint determines the baseline ETo for a location by reading a file generated using data from MOD16. The data is stored in a binary file that has 4 key differences from the GeoTIFF provided by MOD16:
- The bit depth is decreased from 16 bits to 8 bits to reduce the file size.
- Missing data is interpolated using the values of surrounding pixels. The MOD16 dataset does not contain data for locations that don't have vegetated land cover (such as urban environments), which can be problematic since many users may set their location to nearby cities.
- The data is stored in an uncompressed format so that geographic coordinates can be mapped to the offset of the corresponding pixel in the file. This means the file can be stored on disk instead of memory, and the pixel for a specified location can be quickly accessed by seeking to the calculated offset in the file.
- A metadata header that contains parameters about the data used to create the file (such as the image dimensions and instructions on how to map a pixel value to an annual ETo value) is added to the beginning of the file. This header enables the weather server to use datafiles generated from future versions of the MOD16 dataset (even if these versions modify some of these parameters).
The datafile must be stored at baselineEToData/Baseline_ETo_Data.bin.
The datafile is not included in the repo because it is very large (62 MB zipped, 710 MB uncompressed), but it can be downloaded separately.
This file was generated by making 20 passes over the data from 2000-2013 in the MOD16A3 dataset.
Alternatively, it can be generated by running the data preparer program yourself.
Preparing the Datafile
Since TIFF files do not support streaming, directly using the GeoTIFF images from MOD16 would require loading the entire image into memory.
To avoid this, the file must first be converted to a binary format so the pixels in the image can be read row-by-row.
Running ./prepareData.sh <PASSES> will download the required image files using wget, convert them to a binary format using ImageMagick, compile the program with gcc, and run it.
This process can be simplified by using the included Dockerfile that will perform all of these steps inside a container.
The Dockerfile can be used by running docker build -t baseline-eto-data-preparer . && docker run --rm -v $(pwd):/output baseline-eto-data-preparer <PASSES>.
The <PASSES> argument is used to control how much the program should attempt to fill in missing data.
Passes
The program fills in missing data by making several successive passes over the entire image, attempting to fill in each missing pixel on each pass. The value for each missing pixel is interpolated using the values of pixels in the surrounding 5x5 square, and missing pixels that don't have enough data available will be skipped. However, these pixels may be filled in on a later pass if future passes are able to fill in the surrounding pixels. Running the program with a higher number of passes will fill in more missing data, but the program will take longer to run and each subsequent pass becomes less accurate (since the interpolations will be based on interpolated data).
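The multi-pass behavior described above can be illustrated with a minimal Python sketch. It is not the data preparer's actual code: the plain average, the `min_neighbors` threshold, the use of 255 as the missing-data marker, and the function names are all assumptions made for illustration, and the water mask described in the notes is omitted.

```python
NO_DATA = 255  # assumed marker for a missing pixel in this sketch


def fill_pass(grid, min_neighbors=8):
    """Run one pass; return the new grid and the number of pixels filled."""
    height, width = len(grid), len(grid[0])
    # Write into a copy so pixels filled during this pass are only
    # used as inputs on the next pass.
    result = [row[:] for row in grid]
    filled = 0
    for y in range(height):
        for x in range(width):
            if grid[y][x] != NO_DATA:
                continue
            # Collect known values from the surrounding 5x5 square.
            neighbors = [
                grid[ny][nx]
                for ny in range(max(0, y - 2), min(height, y + 3))
                for nx in range(max(0, x - 2), min(width, x + 3))
                if grid[ny][nx] != NO_DATA
            ]
            # Skip pixels that do not have enough surrounding data yet.
            if len(neighbors) >= min_neighbors:
                result[y][x] = round(sum(neighbors) / len(neighbors))
                filled += 1
    return result, filled


def fill_missing(grid, passes, min_neighbors=8):
    """Run up to `passes` passes, stopping early once nothing changes."""
    for _ in range(passes):
        grid, filled = fill_pass(grid, min_neighbors)
        if filled == 0:
            break
    return grid
```

With a single row `[10, 10, 10, 255, 255, 255, 255]` and `min_neighbors=2`, one pass fills only the first missing pixel, and a second pass fills the next one because it can now draw on the value interpolated in the first pass.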
File Format
The data will be saved in a binary format beginning with a 32-byte big-endian header in the following format:
| Offset | Type | Description |
|---|---|---|
| 0 | uint8 | File format version |
| 1-4 | uint32 | Image width (in pixels) |
| 5-8 | uint32 | Image height (in pixels) |
| 9 | uint8 | Pixel bit depth (the only bit depth currently supported is 8) |
| 10-13 | float | Minimum ETo |
| 14-17 | float | Scaling factor |
| 18-31 | N/A | May be used in future versions |
The header is immediately followed by IMAGE_WIDTH * IMAGE_HEIGHT bytes of data corresponding to the pixels in the image in row-major order.
Each pixel is interpreted as an 8 bit unsigned integer, and the average annual potential ETo at that location is PIXEL * SCALING_FACTOR + MINIMUM_ETO inches/year.
A value of 255 is special and indicates that no data is available for that location.
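As a rough illustration of the format, the following Python sketch parses the header with the standard `struct` module and converts one pixel to an ETo value by seeking to its offset. The function names, the dictionary layout, and the synthetic 3x2 image (including its version number of 1) are assumptions made for this example, not part of the weather server.

```python
import io
import struct

# Big-endian: uint8 version, uint32 width, uint32 height, uint8 bit depth,
# float minimum ETo, float scaling factor, then 14 reserved bytes.
HEADER_FORMAT = ">BIIBff14x"
HEADER_SIZE = struct.calcsize(HEADER_FORMAT)  # 32 bytes
NO_DATA = 255


def read_header(f):
    """Read and unpack the header from the start of a binary file object."""
    f.seek(0)
    fields = struct.unpack(HEADER_FORMAT, f.read(HEADER_SIZE))
    version, width, height, bit_depth, minimum_eto, scaling_factor = fields
    if bit_depth != 8:
        raise ValueError("only a bit depth of 8 is supported")
    return {
        "version": version,
        "width": width,
        "height": height,
        "minimum_eto": minimum_eto,
        "scaling_factor": scaling_factor,
    }


def read_eto(f, header, x, y):
    """Return the ETo in inches/year at pixel (x, y), or None if missing."""
    # Pixels are one byte each, stored in row-major order after the header.
    f.seek(HEADER_SIZE + y * header["width"] + x)
    pixel = f.read(1)[0]
    if pixel == NO_DATA:
        return None
    return pixel * header["scaling_factor"] + header["minimum_eto"]


# Synthetic 3x2 image for demonstration only (not real MOD16 data).
demo = io.BytesIO(
    struct.pack(HEADER_FORMAT, 1, 3, 2, 8, 10.0, 0.5)
    + bytes([0, 4, 255, 8, 12, 16])
)
demo_header = read_header(demo)
print(read_eto(demo, demo_header, 1, 0))
```

For the real datafile, the same functions would be called on a file opened in binary mode, for example `open("baselineEToData/Baseline_ETo_Data.bin", "rb")`.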
Notes
- Although the MOD16 documentation states that several pixel values are used to indicate the land cover type for locations that are missing data, the image actually only uses the value 65535. The program handles this by using a mask image of water bodies so it can fill in pixels for urban environments without also filling in data for oceans.
- The map uses an equirectangular projection with the northernmost 10 degrees and southernmost 30 degrees cropped off.
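
Because the projection is equirectangular, a latitude and longitude can be mapped to a pixel with simple linear scaling. The Python sketch below assumes that row 0 is the northern edge (80°N after cropping), that column 0 is at 180°W, and that the image spans the full 360 degrees of longitude. These orientation details and the function names are assumptions and should be checked against the actual datafile.

```python
import math

NORTH_CROP_DEG = 10
SOUTH_CROP_DEG = 30
TOP_LAT = 90 - NORTH_CROP_DEG                     # 80 degrees north
LAT_SPAN = 180 - NORTH_CROP_DEG - SOUTH_CROP_DEG  # 140 degrees covered


def coordinates_to_pixel(lat, lon, width, height):
    """Map (lat, lon) in degrees to (x, y), or None if outside the image."""
    if not -180 <= lon <= 180:
        return None
    # Longitude wraps around, so 180 degrees east maps back to column 0.
    x = math.floor((lon + 180) * width / 360) % width
    y = math.floor((TOP_LAT - lat) * height / LAT_SPAN)
    if y < 0 or y >= height:
        return None  # latitude falls in a cropped region
    return x, y


def pixel_offset(x, y, width, header_size=32):
    """Byte offset of pixel (x, y) in the datafile (one byte per pixel)."""
    return header_size + y * width + x
```

For a hypothetical 360x140 image, the point at latitude 0 and longitude 0 maps to pixel (180, 80), which sits at byte offset 32 + 80 * 360 + 180 = 29012.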