This Confluence has been LDAP enabled, if you are an ASF Committer, please use your LDAP Credentials to login. Any problems file an INFRA jira ticket please.

Child pages
  • Processing MERG data for use with OCW
Skip to end of metadata
Go to start of metadata


The NCEP/CPC 4km Global (60N – 60S) IR Dataset (MERG dataset is an infrared brightness temperature (BT) satellite dataset created by the Climate Prediction Center (CPC), National Centers on Environmental Protection (NCEP) and the National Weather Service (NWS) from data collected by sensors sampling within the 10.2 and 12.5 μm spectral band on various geostationary satellite platforms. The data are representative of readings at the top of the atmosphere, and are already correct for errors such as the “zenith angle dependence”, but no intercalibration correction has been applied. In preparing the MERG dataset, the data have been regridded such that it exist on a rectangular latitude-longitude grid, and the time intervals are given at the hour and 30 minutes into the hour.

Here is a short tutorial of how to convert these files to a NETCDF format that can be ingested by OCW.The scripts provided here are very raw, but are hopefully documented sufficiently to accommodate your needs.


*Python 2.7.4
Module dependences: Nio
*GrADS (We use OpenGrADS grads2 Version 2.0.1.oga.1)


  • Install GrADS (and LATS4D) according to their instructions
  • Move all of the MERG raw files which were downloaded e.g. merg_2002080417_4km-pixel.Z or merg_2002080417_4km-pixel (if unzipped already) into one folder
  • Edit accordingly

 The scripts for processing MERG raw data

import fileinput
import glob
import os
import re
import string
import subprocess
import sys

#----------------------- GLOBAL VARIABLES --------------------------
# --------------------- User defined variables ---------------------
LATMIN = '-5.0'         #min latitude; -ve values in the SH e.g. 5S = -5
LATMAX = '20.0'            #max latitude; -ve values in the SH e.g. 5S = -5
LONMIN = '-30.0'        #min longitude; -ve values in the WH e.g. 59.8W = -59.8
LONMAX = '30.0'            #min longitude; -ve values in the WH e.g. 59.8W = -59.8
T_BB_MAX = '250'           #warmest temp to allow in K
T_BB_MIN = '180'                        #coolest temp to allow in K
#------------------- End user defined Variables -------------------

#------------------------ End GLOBAL VARS -------------------------

def preprocessingMERG(MERGdirname):
        Utility script for unzipping and converting the merg*.Z files from Mirador to
        NETCDF format. The files end up in a folder called mergNETCDF in the directory
        where the raw MERG data is

        Directory to the location of the raw MERG files, preferably zipped


       1 GrADS ( and lats4D (
         have been installed on the system and the user can access
       2 User can write files in location where script is being called
       3 the files havent been unzipped

    imgFilename = ''

    #Just incase the X11 server is giving problems'export DISPLAY=:0.0', shell=True)

    #for files in glob.glob("*-pixel"):
    for files in glob.glob("*.Z"):
        fname = os.path.splitext(files)[0]

        #unzip it
        bash_cmd = 'gunzip ' + files, shell=True)

        #determine the time from the filename
        ftime ='\_(.*)\_',fname).group(1)

        yy = ftime[0:4]
        mm = ftime[4:6]
        day = ftime[6:8]
        hr = ftime [8:10]

        #TODO: must be something more pythonic!

        if mm=='01':
            mth = 'Jan'
        if mm == '02':
            mth = 'Feb'
        if mm == '03':
            mth = 'Mar'
        if mm == '04':
            mth = 'Apr'
        if mm == '05':
            mth = 'May'
        if mm == '06':
            mth = 'Jun'
        if mm == '07':
            mth = 'Jul'
        if mm == '08':
            mth = 'Aug'
        if mm == '09':
            mth = 'Sep'
        if mm == '10':
            mth = 'Oct'
        if mm == '11':
            mth = 'Nov'
        if mm == '12':
            mth = 'Dec'
'rm merg.ctl', shell=True)'touch merg.ctl', shell=True)
        replaceExpDset = 'echo DSET ' + fname +' >> merg.ctl'
        replaceExpTdef = 'echo TDEF 99999 LINEAR '+hr+'z'+day+mth+yy +' 30mn' +' >> merg.ctl', shell=True)'echo "OPTIONS yrev little_endian template" >> merg.ctl', shell=True)'echo "UNDEF  330" >> merg.ctl', shell=True)'echo "TITLE  globally merged IR data" >> merg.ctl', shell=True)'echo "XDEF 9896 LINEAR   0.0182 0.036378335" >> merg.ctl', shell=True)'echo "YDEF 3298 LINEAR   -59.982 0.036383683" >> merg.ctl', shell=True)'echo "ZDEF   01 LEVELS 1" >> merg.ctl', shell=True), shell=True)'echo "VARS 1" >> merg.ctl', shell=True)'echo "ch4  1  -1,40,1,-1 IR BT  (add  "75" to this value)" >> merg.ctl', shell=True)'echo "ENDVARS" >> merg.ctl', shell=True)

        #generate the lats4D command for GrADS
        lats4D = 'lats4d -v -q -lat ' + LATMIN +' ' + LATMAX + ' ' + '-lon ' + LONMIN + ' ' + LONMAX + ' -time '+hr+'Z'+day+mth+yy + ' -func @+75 ' + '-i merg.ctl' + ' -o ' + fname

        gradscmd = 'grads -lbc ' + '\'' +lats4D + '\''
        #run grads and lats4d command, shell=True)

        #----- Generate GrADS images--------------
        imgFilename = hr+'Z'+day+mth+yy+'.gif'
        #-------End Generate GrADS images ---------
        #remove the raw data file to save some resources
        rmCmd= "rm " + fname, shell=True)

    #when all the files have been converted, mv the netcdf files'mkdir mergNETCDF', shell=True)'mv *.nc mergNETCDF', shell=True)
    #mv all images'mkdir mergImgs', shell=True)'mv *.gif mergImgs', shell=True)


def tempMaskedImages(imgFilename):
    Purpose:: To generate temperature-masked images for a first pass verification

        filename for the gif file

        None - gif images for each file of IR temperature < T_BB_MAX are generated in folder called mergImgs

       Same as for preprocessingMERG
       1 GrADS ( and lats4D (
         have been installed on the system and the user can access
       2 User can write files in location where script is being called
       3 the files havent been unzipped
    ''''rm', shell=True)'touch', shell=True)'echo "''\'open merg.ctl''\'" >>', shell=True)'echo "''\'set mpdset hires''\'" >>', shell=True)'echo "''\'set lat ' +LATMIN + ' ' +LATMAX +'\'" >>', shell=True)'echo "''\'set lon ' +LONMIN + ' ' +LONMAX +'\'" >>', shell=True)'echo "''\'set cint 10''\'" >>', shell=True)'echo "''\'set cmax '+T_BB_MAX +'\'" >>', shell=True)'echo "''\'set cmin '+T_BB_MIN +'\'" >>', shell=True)'echo "''\'d ch4+75''\'" >>', shell=True)'echo "''\'draw title Masked Temp @ '+imgFilename +'\'" >>', shell=True)'echo "''\'printim '+imgFilename +' x1000 y800''\'" >>', shell=True)'echo "''\'quit''\'" >>', shell=True)
    gradscmd = 'grads -blc ' + '\'run''\'', shell=True)

  • Provide the path to the data folder in
  • Run
       Program to run the preprocessing
    import preprocessing
    #ensure and this file are in the same folder
    def main():
    	#change path accordingly
    	print "running ..."
    	print "...outta here!"

The NETCDF files in the mergNETCDF file generated are now ready to be ingested into OCW.

  • To ingest these files into OCW format, use the readMergData to open the files in OCW format. This step assumes that OCW installed.
#Module to read MERG data in OCW format
import glob
import itertools
import Nio
import numpy as np
import as ma
import sys

#existing modules in services
import process

#----------------------- GLOBAL VARIABLES --------------------------
# --------------------- User defined variables ---------------------
LATMIN = '-5.0'         #min latitude; -ve values in the SH e.g. 5S = -5
LATMAX = '20.0'            #max latitude; -ve values in the SH e.g. 5S = -5
LONMIN = '-30.0'        #min longitude; -ve values in the WH e.g. 59.8W = -59.8
LONMAX = '30.0'            #min longitude; -ve values in the WH e.g. 59.8W = -59.8
T_BB_MAX = 250           #warmest temp to allow
#------------------- End user defined Variables -------------------

#------------------------ End GLOBAL VARS -------------------------

def readMergData(dirname):
        Read MERG data into RCMES format
        Directory to the MERG files in NETCDF format
        A 3D masked array (t,lat,lon) with only the variables which meet the minimum temperature 
        criteria for each frame
        **remove below**
        A dictionary of all the MERG data from the files in the directory given.
        The dictionary contains, a string representing the time (a datetime string), a 
        tuple (lat,lon,value) representing only the data that meets the temperature requirements
        i.e. T_BB_MAX

        The MERG data has been converted to NETCDF using LATS4D. See MERG utility function preprocessingMERG
        The data has the same lat/lon format
    global LAT
    global LON

    # these strings are specific to the MERG data
    mergVarName = 'ch4'
    mergTimeVarName = 'time'
    mergLatVarName = 'latitude'
    mergLonVarName = 'longitude'
    filelistInstructions = dirname + '/*'
    filelist = glob.glob(filelistInstructions)

    #sat_img is the array that will contain all the masked frames
    mergImgs = []
    #timelist of python time strings
    timelist = [] 
    time2store = None

    nfiles = len(filelist)

    # Crash nicely if there are no netcdf files
    if nfiles == 0:
        print 'Error: no files in this directory! Exiting elegantly'
        # Open the first file in the list to read in lats, lons and generate the  grid for comparison
        tmp = Nio.open_file(filelist[0], format='nc')
        #clip the lat/lon grid according to user input if needed like
        #latsraw = tmp.variables[mergLatVarName][mergLatVarName+"|"+LATMIN+":"+LATMAX].astype('f2')
        #lonsraw = tmp.variables[mergLonVarName][mergLonVarName+"|"+LONMIN+":"+LONMAX].astype('f2')
        latsraw = tmp.variables[mergLatVarName][:].astype('f2')
        lonsraw = tmp.variables[mergLonVarName][:].astype('f2')
        lonsraw[lonsraw > 180] = lonsraw[lonsraw > 180] - 360.  # convert to -180,180 if necessary
        LON, LAT = np.meshgrid(lonsraw, latsraw)
        #clean up
        latsraw =[]
        lonsraw = []
    for files in filelist:
        thisFile = Nio.open_file(files, format='nc') 
        #mask the data and fill with zeros for later 
        tempRaw = thisFile.variables[mergVarName][:].astype('int16')
        #clip the dataset according to user lat,long coordinates
#         tempRaw = thisFile.variables[mergVarName][mergLatVarName+"|"+LATMIN+":"+LATMAX \
#                                    +" " +mergLonVarName+"|"+LONMIN+":"+LONMAX ].astype('int16')

        tempMask = ma.masked_array(tempRaw, mask=(tempRaw > T_BB_MAX), fill_value=0) 
        #get the actual values that the mask returned
        tempMaskedValue = ma.zeros((tempRaw.shape)).astype('int16')
        for index, value in maenumerate(tempMask): 
            time_index, lat_index, lon_index = index            
            tempMaskedValue[time_index,lat_index, lon_index]=value    
        #timesRaw = thisFile.variables[mergTimeVarName]
        #convert this time to a python datastring
        time2store, _ = process.getModelTimes(files, mergTimeVarName)
        print time2store
        #extend instead of append because getModelTimes returns a list already and we don't 
        #want a list of list
        #to get the actual masked data only

    mergImgs = ma.array(mergImgs)

    return mergImgs, timelist


def maenumerate(mArray):
        Utility script for returning the actual values from the masked array
        Taken from:
        1 variable
        mArray - the masked array returned from the ma.array() command
        1 variable
        maskedValues - 3D (t,lat,lon), value of only masked values
    mask = ~mArray.mask.ravel()

    for index, maskedValue in itertools.izip(np.ndenumerate(mArray), mask):
        if maskedValue: 
            yield index

I hope this helps!