A THREDDS Tutorial

The Skinny

Installing, configuring and maintaining a THREDDS server may seem like an overwhelming task to someone approaching it for the first time. The real tasks involved are downloading and installing a few programs in the appropriate places, and then creating and modifying configuration files so your THREDDS server will serve what you want it to serve in the way you want it served. The installation is a one-time task that isn't very difficult, although you will probably need help from a system administrator to do it correctly. If you already have sysadmin experience, it will be trivial. The more difficult task is creating the configuration files - written in a dialect of XML - and getting the syntax correct so that the THREDDS magic you want to happen does indeed happen. The basic configuration is relatively easy, but things get a bit harder if you want to use the full power of NcML for rewriting, modifying and combining datasets. While the documentation is scattered and less than wholly coherent, there are plenty of examples available via web searching on which you can build. This document is intended as a step towards making the documentation less scattered and more coherent.

There is a related page containing various links pertinent to THREDDS and NcML.



Introduction to THREDDS Data Server (TDS)



Overview and Summary

The THREDDS (Thematic Realtime Environmental Distributed Data Services) Data Server (TDS) is a package designed to make it as easy as possible for those who create scientific datasets to make them available to those who use them. The goal is to make datasets in many different formats and located in many different geographic locations available to users in a way that hides the data format and location information and presents only the data essential to the datasets themselves. It is a web server providing metadata and data access for scientific datasets. It employs several widely-used and useful data access protocols including OPeNDAP, WMS, WCS and HTTP.

Primarily the TDS is used to manually or automatically create data catalogs that provide virtual directories of data and their associated metadata, and then make them available to users via various data interrogation and transfer protocols. For example, if you have a series of individual NetCDF or HDF files containing gridded velocities, temperatures, etc. at one day intervals over a year, TDS can be used to create a virtual, combined file that a user sees and can access as a single file containing the data for the entire year. All or chosen parts of this virtual file can be downloaded, viewed or processed by a user without the tedious need to deal with 365 individual files. The virtual file can then be accessed via most widely used data access protocols such as OPeNDAP, WMS, WCS and HTTP.

The TDS uses an internal library to read datasets in most of the formats used in the geosciences - NetCDF, OPeNDAP, HDF, GRIB, NEXRAD, etc. - and transform them into a common internal data format called the Common Data Model (CDM). This allows the use of datasets in disparate formats, with the CDM library doing all the hard work of converting them all into a common internal format for further processing.

The TDS uses the NetCDF Markup Language (NcML) to modify CDM datasets and to create virtual aggregations of them. NcML operates on the internal CDM versions of the original files, whatever their original formats, and turns them into virtual datasets by modifying and aggregating them.

NcML can be used to add, delete or change the metadata in the original files. The metadata is data about the data included as the initial or header portion of, for example, a NetCDF file. This metadata includes dimensions, variables and attributes. A short example of a typical NetCDF header is:

dimensions:
	eta_rho = 128 ;
	xi_rho = 256 ;
	wind_time = UNLIMITED ; // (224 currently)
variables:
	float lon_rho(eta_rho, xi_rho) ;
		lon_rho:long_name = "longitude of RHO-points" ;
		lon_rho:units = "degree_east" ;
	float lat_rho(eta_rho, xi_rho) ;
		lat_rho:long_name = "latitude of RHO-points" ;
		lat_rho:units = "degree_north" ;
	double wind_time(wind_time) ;
		wind_time:units = "days since 1970-01-01 0:00:00 0:00" ;
		wind_time:long_name = "time since initialization" ;
		wind_time:standard_name = "wind_time" ;
	float Uwind(wind_time, eta_rho, xi_rho) ;
		Uwind:units = "meter second-1" ;
		Uwind:long_name = "surface u-wind component" ;
		Uwind:time = "wind_time" ;
		Uwind:standard_units = "m s-1" ;
		Uwind:standard_name = "eastward_wind" ;
	float Vwind(wind_time, eta_rho, xi_rho) ;
		Vwind:units = "meter second-1" ;
		Vwind:long_name = "surface v-wind component" ;
		Vwind:time = "wind_time" ;
		Vwind:standard_units = "m s-1" ;
		Vwind:standard_name = "northward_wind" ;

In this header, eta_rho is a dimension, Uwind is a variable, and standard_name is an attribute. If a key or desired piece of any of these types of metadata is missing from the original data file, NcML can be used to add it to the metadata presented as part of the full virtual dataset. The original dataset will not be modified, but the virtual representation of it will include the modifications.

NcML is used to rename, add, delete and restructure variables from the original files. For instance, if you have a NetCDF file that contains a gridded temperature field on a known regular grid but doesn't contain the data and metadata about that grid, then you can add that information via NcML. As an example, if lon_rho and lat_rho were missing from the file represented by the header in the example above, they could easily be added.

NcML is used to combine or aggregate data from multiple CDM files.

After the original datasets are converted into a common format and then modified to suit whatever needs are desired, they are made available over the internet via the most useful data transfer protocols for the geosciences.

To summarize, the TDS reads in datasets in several different formats, converts them into a standard internal format, transforms and aggregates them into virtual datasets, and makes them available to users via several different data transfer methods.

Current Status

The current technical status of the TDS can always be found on the Technical Status Page, which contains news and announcements about the latest release version.



Installation

Related pages with additional information:

Installing THREDDS Prerequisites

The two key prerequisites for installing THREDDS are Java and Apache Tomcat. Many operating systems are shipped with a recent version of Java, while Tomcat usually has to be installed separately. Be sure to note which versions of Java and Tomcat your version of THREDDS requires. This information is supplied on the Getting Started page along with detailed instructions on how to install both packages.

Java

Java is a programming language designed to have minimal implementation dependencies, which enables developers to write a program once that will run on any device that includes a Java installation. To accomplish this goal, Java programs are compiled to run on so-called virtual machines, which are basically a software version of a computer's hardware CPU. Once a virtual machine is created for a specific hardware architecture, any Java program should run on that architecture. Java is especially useful for client-server web applications such as THREDDS.

Although it is unusual for a computing platform to not come with Java already installed, if the need arises the latest version of Java can be obtained at the download site at:

http://www.java.com/en/download/manual.jsp

which has virtual machine packages for Windows, Linux and Solaris machines for both 32- and 64-bit architectures. Apple supplies their own version of Java for the OS X operating system.

The most recent officially released version of THREDDS is 4.2, which requires Java 1.6 or above and recommends 1.6u24 or greater.

Once you have figured out where the Java installation is located, you need to specify it via a global environment variable. An example, for a Java installation located at /opt/java, would be:

export JAVA_HOME=/opt/java

The location is highly variable by platform, so you may have to consult your sysadmin about this.

Tomcat

Introduction

Apache Tomcat is an open source implementation of the Java Servlet and JavaServer Pages technologies. A Java servlet is a Java programming language class used to extend the capabilities of servers that host web applications accessed by a request-response model. Servlets provide component-based, platform-independent methods for building web-based applications such as THREDDS. JavaServer Pages (JSP) is a technology for creating dynamically generated web pages based on HTML and XML. It is similar to the PHP web programming language but built on top of Java. Basically, Tomcat is an HTTP server that leverages the platform-independence of Java all the way up to the server level, making web-based applications such as THREDDS portable.

It should be noted that implementations of Java Servlet and JavaServer Pages other than Tomcat are available and can also be used with THREDDS. We have chosen to document Tomcat for the simple reason that it works well with THREDDS and that most of the available documentation is for that combination.

Tomcat is usually not included in standard operating system distributions and must be obtained from the Tomcat home site at:

http://tomcat.apache.org/

Tomcat can be obtained in source code format and compiled for your specific platform, but it is recommended that you skip that chore and simply download a binary distribution appropriate to your platform.

The most recent officially released version of THREDDS is 4.2, which requires Tomcat 5.5 or above and recommends the latest version of Tomcat 6.x.

Documentation

The Tomcat download page for version 6.x is at:

http://tomcat.apache.org/download-60.cgi

and the documentation page for version 6.x at:

http://tomcat.apache.org/tomcat-6.0-doc/index.html

The installation documentation for 6.x is at:

http://tomcat.apache.org/tomcat-6.0-doc/setup.html

Installation

A basic installation procedure begins with downloading a compressed version of the distribution, in this case apache-tomcat-6.0.35.tar.gz. We will be installing Tomcat in the /opt directory, but that can vary depending on how your local installation is configured. After we have downloaded the distribution, we enter the following commands (with root privileges required for this particular location):

mv apache-tomcat-6.0.35.tar.gz /opt
cd /opt
tar xzvf apache-tomcat-6.0.35.tar.gz
mv apache-tomcat-6.0.35 tomcat

It is recommended that a setenv.sh file be created in /opt/tomcat/bin to set Tomcat startup options. To do this:

cd /opt/tomcat/bin
vi setenv.sh

and add the following lines (modifying the locations for JAVA_HOME and CATALINA_HOME if so required by your configuration):

#!/bin/sh
#
# ENVARS for Tomcat and TDS environment
#
CATALINA_HOME="/opt/tomcat"
export CATALINA_HOME

JAVA_HOME="/opt/java"
export JAVA_HOME

# JAVA_OPTS references CATALINA_HOME, so it is set after that variable is defined.
JAVA_OPTS="-Xmx4096m -Xms512m -server -Djava.awt.headless=true -Djava.util.prefs.systemRoot=$CATALINA_HOME/content/thredds/javaUtilPrefs"
export JAVA_OPTS

with -Xmx4096m replaced by -Xmx1500m for 32-bit Java installations.

See the Security Measures section below for additional steps recommended for installing a production installation of THREDDS.

Starting the Server

There are a couple of options for starting Tomcat, with extensive details available at:

http://www.mulesoft.com/tomcat-start

Basically, you can start Tomcat manually or automatically.

Starting Manually

A manual start - given the location into which we've installed Tomcat - would be performed via:

/opt/tomcat/bin/startup.sh

while a shutdown would be performed via:

/opt/tomcat/bin/shutdown.sh

Both of these require that the JAVA_HOME environment variable be set as shown above, so if an error message says that it can't be found then you must set it.

Starting Automatically

Tomcat can also be run as a UNIX daemon using a program called jsvc. This requires compiling a program included with the Tomcat distribution via the following steps (assuming that Tomcat has been installed as per the previous instructions):

cd /opt/tomcat/bin
tar xzvf commons-daemon-native.tar.gz
cd commons-daemon-1.0.x-native-src/unix
./configure
make
cp jsvc ../..

This procedure puts the jsvc binary in the /opt/tomcat/bin directory and allows you to run Tomcat as a daemon via the following command, which uses relative paths and so must be run from the /opt/tomcat directory:

/opt/tomcat/bin/jsvc -cp ./bin/bootstrap.jar -outfile ./logs/catalina.out -errfile ./logs/catalina.err org.apache.catalina.startup.Bootstrap

Additional information about this procedure including additional options for running the daemon can be found at:

http://tomcat.apache.org/tomcat-6.0-doc/setup.html

Checking for a Running Tomcat Server

Once you have started the Tomcat server via one of the procedures above, you can verify that it is running either via the command line with:

ps -ef | grep tomcat

or by opening a browser window or tab and going to:

http://localhost:8080/

Troubleshooting

Tomcat troubleshooting starts with checking the logs in the directory:

/opt/tomcat/logs

with the most useful messages usually appearing in the main log file, i.e.

/opt/tomcat/logs/catalina.out

Security Measures

The THREDDS site has a checklist for production installation available at:

http://www.unidata.ucar.edu/projects/THREDDS/tech/tds4.2/tutorial/Checklist.html

that details additional installation steps recommended for securing a production Tomcat installation.

Performance Tuning

If you are having performance issues with Tomcat, for instance if it is running very slowly, you have some options for tuning it. A good overview of these options can be found at:

http://www.mulesoft.com/tomcat-performance

Installing THREDDS

Obtaining THREDDS

The latest THREDDS release can always be found on the page:

http://www.unidata.ucar.edu/projects/THREDDS/tech/TDS.html

which also provides information about the required versions of Java and Tomcat.

The TDS can be downloaded as either a jar or a war file. A jar file is a Java archive file, which bundles Java binary classes into a single file for easy accessibility. A war file is a web application archive file, which bundles JSP/Servlet classes into a single file for the same reason. Since we've already installed the Tomcat web application server, we will download the war version of THREDDS, which is thredds.war.

Installing the Web Archive File

The TDS Java web archive file is installed in the webapps subdirectory of the main Tomcat installation, i.e.

/opt/tomcat/webapps

This is done by simply copying or moving the thredds.war into that directory, after which a running Tomcat server will automatically unpack and start the programs therein. You can tell that the process has at least started if you can see that a thredds subdirectory has been created in /opt/tomcat/webapps, i.e.

/opt/tomcat/webapps/thredds

Checking for a Running THREDDS Server

If the installation has proceeded correctly, you can point your browser at:

http://localhost:8080/thredds/

and if the dynamically generated THREDDS page appears you have been successful.

Configuring THREDDS

Overview

THREDDS catalogs collect, organize and describe accessible datasets. They provide a hierarchical structure for organizing the datasets, an access method (via URL), and a human readable name for each dataset. The TDS is configured using a language called XML.

The eXtensible Markup Language or XML is a markup language that defines a set of rules for encoding documents in a format that is both human- and machine-readable. An XML document is immediately recognizable by the number of angle brackets contained therein, and sort of resembles HTML markup although the two markup languages have different purposes. A good way of looking at this is that HTML is for form and XML is for content.

XML documents consist mainly of elements and attributes. An element is a component that begins with a start-tag and ends with a matching end-tag, with an example being:

<serviceName>odap</serviceName>

where serviceName is the element name. An attribute is a name/value pair within a start-tag, an example being:

<dataset name="TDS Tutorial"></dataset>

where within the element dataset the attribute name is name and the value is TDS Tutorial.
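
The two terms can be made concrete with a short sketch that parses well-formed versions of the two fragments. Python and its standard xml.etree.ElementTree module are used here purely for illustration; they are not part of THREDDS, and the variable names are arbitrary.

```python
import xml.etree.ElementTree as ET

# An element: a start-tag, some text content, and a matching end-tag.
service_name = ET.fromstring("<serviceName>odap</serviceName>")
print(service_name.tag)   # the element name
print(service_name.text)  # the text between the tags

# An attribute: a name/value pair inside the start-tag.
dataset = ET.fromstring('<dataset name="TDS Tutorial"></dataset>')
print(dataset.attrib)     # mapping of attribute names to values
```

If a start-tag is left unclosed, the parser raises an error instead of returning an element, which makes this a quick way to check a fragment for well-formedness.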

An XML schema or grammar is a description of a specific type of XML document, usually expressed in terms of constraints on the structure and content of documents of that type. Typically a schema constrains the set of elements that may be used in a document, which attributes can be applied to them, the order in which they may appear, and the allowable parent/child relationships. Basically, a schema allows you to specify what you really need in a document while excluding extraneous material. The XML schema for the THREDDS markup language can be found at:

http://www.unidata.ucar.edu/schemas/thredds/InvCatalog.1.0.2.xsd

although you don't really want to look at this until you've spun up a lot further on the basics of THREDDS configuration.

An XML namespace is used to provide uniquely named elements and attributes in an XML document, that is, a way to avoid element name conflicts. The elements in the THREDDS namespace are described in the Dataset Inventory Catalog Specification Version 1.0, which can be found at:

http://www.unidata.ucar.edu/projects/THREDDS/tech/catalog/InvCatalogSpec.html

although this is also best avoided until later.

Configuration Directory

The configuration directory is located at:

/opt/tomcat/content/thredds/conf

wherein the configuration files are located. This separation of software and configuration files allows the TDS software to be upgraded without disturbing the configuration files.

The configuration is performed within two files: threddsConfig.xml and catalog.xml. Most of the basic configuration is done via the latter, with more advanced features controlled via the former. We now take a look at how to configure them, starting with simple examples and moving towards more complex ones.

Basic Catalogs

Services and Datasets

We will proceed with the useful pedagogical technique of moving from the simple to the complex. The simple and complete THREDDS catalog that follows defines a single service (OPeNDAP) that serves a single dataset (110312006.nc).

Example 1 - Basic Catalog

1 <?xml version="1.0" ?>
2 <catalog xmlns="http://www.unidata.ucar.edu/namespaces/thredds/InvCatalog/v1.0" >
3   <service name="dodsServer" serviceType="OpenDAP"  base="/thredds/dodsC/" />
4   <dataset name="SAGE III Ozone Loss for Oct 31 2006" serviceName="dodsServer" urlPath="sage/110312006.nc"/>
5 </catalog>

The indentation of the sub-elements service and dataset is not required, but can be very useful for avoiding confusion when many layers of nesting are needed as we will see in future examples.

Basically, lines 1, 2 and 5 are boilerplate that can be ignored beyond acknowledging their mandatory presence for a complete THREDDS catalog, while lines 3 and 4 perform the tasks of specifying a dataset and a method for serving it over the internet.

The Service Element

If we look at the service element section of the THREDDS catalog specification, i.e.

http://www.unidata.ucar.edu/projects/THREDDS/tech/catalog/InvCatalogSpec.html#service

we discover that all three of the attributes shown are required. Not only are they required, but the TDS also requires that the service base URLs exactly match the values shown below for each available service type.

OPeNDAP
<service name="odap" serviceType="OPeNDAP" base="/thredds/dodsC/" />

NetCDF Subset Service
<service name="ncss" serviceType="NetcdfSubset" base="/thredds/ncss/grid/" />

WCS
<service name="wcs" serviceType="WCS" base="/thredds/wcs/" />

WMS
<service name="wms" serviceType="WMS" base="/thredds/wms/" />

HTTP
<service name="fileServer" serviceType="HTTPServer" base="/thredds/fileServer/" />

The Dataset Element

Looking at the dataset element section of the THREDDS catalog specification, i.e.

http://www.unidata.ucar.edu/projects/THREDDS/tech/catalog/InvCatalogSpec.html#dataset

we discover that only the name attribute is required, while the serviceName and urlPath attributes are optional since they can be specified by an access element as will be shown in upcoming sections. In this case the two optional attributes implicitly define an access element and are used for this simple case where the dataset has a single access via, in this example, the OpenDAP server.

It is important to make the dataset name both descriptive and succinct. This name is what the user will see and use to make choices from the web page that displays our THREDDS catalog. The user will have a much easier time finding specific datasets if they are named as in our example, i.e. SAGE III Ozone Loss for Oct 31 2006, rather than with something cryptic like Ozone Data. We must frequently remind ourselves that the point of THREDDS is to make things easier for users so they will use our datasets instead of throwing up their hands in frustration and finding something else to do.

Constructing the Access Path

The information found in the dataset and service elements is combined with the address of your THREDDS server to construct an access URL. In this case, it is constructed from the server base URL http://motherlode.ucar.edu:8080, the service base attribute value /thredds/dodsC/, and the dataset urlPath attribute value sage/110312006.nc. The absolute URL is:

http://motherlode.ucar.edu:8080/thredds/dodsC/sage/110312006.nc
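
The concatenation can be sketched in a few lines of Python. This is only an illustration of how the three pieces fit together, not the code THREDDS itself uses, and the function name access_url is hypothetical.

```python
def access_url(server: str, service_base: str, url_path: str) -> str:
    """Join the server address, service base and dataset urlPath into one URL."""
    # Strip slashes at the joins so exactly one "/" separates the parts.
    return (
        server.rstrip("/")
        + "/" + service_base.strip("/")
        + "/" + url_path.lstrip("/")
    )


url = access_url(
    "http://motherlode.ucar.edu:8080",  # server base URL
    "/thredds/dodsC/",                  # base attribute of the service element
    "sage/110312006.nc",                # urlPath attribute of the dataset element
)
print(url)
```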

Nesting Datasets

In nearly all cases where we want to serve datasets via THREDDS, we'll have more than one, and often many more. The simplest way to handle this is to declare a collection dataset, that is, a dataset element that serves as a container for a set of nested direct datasets that point directly to data. The dataset element in our first catalog example is a direct dataset. The following example illustrates a fairly common situation with geoscience data wherein we have a series of monthly average files we wish to make available.

Example 2 - Basic Catalog With Nesting

<?xml version="1.0" ?> 
 <catalog xmlns="http://www.unidata.ucar.edu/namespaces/thredds/InvCatalog/v1.0" >
   <service name="dodsServer" serviceType="OpenDAP"  base="/thredds/dodsC/" />

   <dataset name="SAGE III Ozone Loss Experiment" >
     <dataset name="January Averages" serviceName="dodsServer" urlPath="sage/avg/jan.nc"/>
     <dataset name="February Averages" serviceName="dodsServer" urlPath="sage/avg/feb.nc"/>
     <dataset name="March Averages" serviceName="dodsServer" urlPath="sage/avg/mar.nc"/>
   </dataset>

 </catalog>

This is identical to our first example except for the use of both collection and direct datasets. The collection dataset - which is simply a container for the direct datasets - has a short name attribute that's descriptive of and applicable to all the direct datasets within. The direct datasets each also have name attributes that serve the purpose of distinguishing them from each other.

The serviceName attribute for each of the direct datasets is identical to that in the previous example, but the urlPath differs for each one since it specifies the final part of the URL that will be used to access each of the different files. This particular catalog will result in a web page menu for the user that looks something like this:

SAGE III Ozone Loss Experiment
      January Averages
      February Averages
      March Averages

Note that the collection dataset container must be closed with a matching </dataset> end-tag, and that the start-tag of the collection dataset therefore ends with > instead of the /> used for the direct datasets.

Multiple Nesting Levels

We are not restricted to a single nested dataset nor to only two nesting levels. We can specify as many as we need to appropriately describe our datasets as seen in the following example wherein both monthly and daily average files from the same experiment are made available within the same collection dataset. We also require three nesting levels for the daily averages rather than the two we used for the monthly averages.

Example 3 - Basic Catalog with Multiple Nesting

<?xml version="1.0" ?>
 <catalog xmlns="http://www.unidata.ucar.edu/namespaces/thredds/InvCatalog/v1.0" >
   <service name="dodsServer" serviceType="OpenDAP"  base="/thredds/dodsC/" />

   <dataset name="SAGE III Ozone Loss Experiment" >

     <dataset name="Monthly Averages" >
       <dataset name="January Averages" serviceName="dodsServer" urlPath="sage/avg/jan.nc"/>
       <dataset name="February Averages" serviceName="dodsServer" urlPath="sage/avg/feb.nc"/>
       <dataset name="March Averages" serviceName="dodsServer" urlPath="sage/avg/mar.nc"/>
     </dataset>

     <dataset name="Daily Averages" >
       <dataset name="January" >
         <dataset name="Jan. 1" serviceName="dodsServer" urlPath="sage/daily/jan/20010101.nc"/>
         <dataset name="Jan. 2" serviceName="dodsServer" urlPath="sage/daily/jan/20010102.nc"/>
         <dataset name="Jan. 3" serviceName="dodsServer" urlPath="sage/daily/jan/20010103.nc"/>
       </dataset>   
     </dataset> 

   </dataset>

 </catalog>

Note the usefulness of indenting the various levels when it comes to keeping track of what's going on in the example catalog. This catalog will render something like this on our THREDDS page:

SAGE III Ozone Loss Experiment
    Monthly Averages
        January Averages
        February Averages
        March Averages
    Daily Averages
        January
            Jan. 1
            Jan. 2
            Jan. 3
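
The relationship between the nested dataset elements and the indented listing can be sketched with a short recursive walk over a shortened version of the catalog. This Python sketch only mimics the outline for illustration; it is not how the TDS renders its pages, and the function name outline is hypothetical.

```python
import xml.etree.ElementTree as ET

# Elements in a THREDDS catalog live in this namespace (taken from the xmlns attribute).
NS = "{http://www.unidata.ucar.edu/namespaces/thredds/InvCatalog/v1.0}"

CATALOG = """<?xml version="1.0" ?>
<catalog xmlns="http://www.unidata.ucar.edu/namespaces/thredds/InvCatalog/v1.0">
  <service name="dodsServer" serviceType="OpenDAP" base="/thredds/dodsC/" />
  <dataset name="SAGE III Ozone Loss Experiment">
    <dataset name="Monthly Averages">
      <dataset name="January Averages" serviceName="dodsServer" urlPath="sage/avg/jan.nc"/>
    </dataset>
    <dataset name="Daily Averages">
      <dataset name="January">
        <dataset name="Jan. 1" serviceName="dodsServer" urlPath="sage/daily/jan/20010101.nc"/>
      </dataset>
    </dataset>
  </dataset>
</catalog>
"""


def outline(element, depth=0):
    """Return one indented line per dataset, descending through every nesting level."""
    lines = []
    for child in element.findall(NS + "dataset"):  # direct children only
        lines.append("    " * depth + child.get("name"))
        lines.extend(outline(child, depth + 1))
    return lines


tree_lines = outline(ET.fromstring(CATALOG))
print("\n".join(tree_lines))
```

Because the function calls itself for each nested dataset, it handles any number of nesting levels without modification.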

Additional Dataset Attributes

The dataset attributes we've seen thus far suffice to uniquely identify and locate each dataset. However, much more information can be added to help human and machine searchers identify a dataset in contexts larger than the dataset itself. The dataset element documentation at:

http://www.unidata.ucar.edu/projects/THREDDS/tech/catalog/InvCatalogSpec.html#dataset

shows several more attributes we can employ to make our datasets easier to find and understand. The following example shows the use of the collectionType, authority, ID and dataType attributes, with a brief explanation of each following the example. Even more additional attributes are available for making THREDDS datasets more easily discoverable by digital libraries, and will be discussed in later sections.

Example 4 - Catalog with Additional Dataset Attributes

<dataset name="SAGE III Ozone Loss Experiment" collectionType="TimeSeries">
  <dataset name="January Averages" serviceName="aggServer" urlPath="sage/avg/jan.nc" authority="unidata.ucar.edu" ID="sage-20938483">
	 <dataType>Trajectory</dataType>
  </dataset>
</dataset>

The collectionType attribute is used to indicate a coherent collection dataset which has only one level of nested datasets. At this time, the only coherent collection values available for this attribute are TimeSeries and Stations, although more are promised for future versions.

The authority and ID attributes are used in combination to create a globally unique identifier for the dataset. In the example, we see that Unidata is the authority and their identification number for this specific dataset is sage-20938483. The broader context is that Unidata itself has thousands of different datasets, and that this particular one is one of many from the SAGE experiment - itself one of many experiments whose datasets they store - with the given ID number. There are many other geoscience institutes that also have their own collections of datasets from various experiments and investigations. The use of both an authority and an ID to identify a dataset better ensures that the identification will be unique. For example, while it's possible that Unidata and Scripps could each give one of their datasets the same ID number, say 42, the authority attribute still differentiates the two globally, since one has the authority unidata.ucar.edu and the other scripps.ucsd.edu.

The dataType attribute is useful for helping the user decide how to present the dataset being obtained. This is best and most simply explained via a list of the available values: Grid, Image, Station, Swath and Trajectory. For processing or presentation purposes, a grid of discrete and spatially separated values is handled much differently than an image or a single trajectory.

The Metadata Element

The catalog shown in Example 3 repeats an identical serviceName value many times. A method for avoiding this sort of repetition is available via the metadata element. Here is an example of its use:

Example 5 - Catalog Fragment Using Metadata Element

<dataset name="SAGE III Ozone Loss Experiment" >

   <metadata inherit="true">
     <serviceName>dodsServer</serviceName>
     <dataType>Trajectory</dataType>
     <dataFormatType>NetCDF</dataFormatType>
     <authority>unidata.ucar.edu</authority>
   </metadata>

   <dataset name="January Averages" urlPath="sage/avg/jan.nc" ID="sage-23487382"/>
   <dataset name="February Averages" urlPath="sage/avg/feb.nc" ID="sage-63656446"/>
   <dataset name="Global Averages" urlPath="sage/global.nc" ID="sage-7869700g" dataType="Grid"/>

</dataset>

The metadata element here is used within the collection dataset container to apply various properties to the direct datasets therein. The inherit attribute value true indicates that all the information inside the metadata element applies to the current dataset and all those nested within it. This inheritance can be overridden, though, by simply specifying a different value within an individual direct dataset. The example shows this in the dataset element with the name Global Averages, wherein the metadata-specified dataType value Trajectory is overridden with the value Grid.
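
The inherit-then-override rule can be sketched in Python using a shortened version of the fragment. This is a simplified model written for this tutorial, not the TDS implementation: the function name resolve is hypothetical, and only the serviceName and dataType attributes are treated as overridable.

```python
import xml.etree.ElementTree as ET

FRAGMENT = """<dataset name="SAGE III Ozone Loss Experiment">
  <metadata inherit="true">
    <serviceName>dodsServer</serviceName>
    <dataType>Trajectory</dataType>
  </metadata>
  <dataset name="January Averages" urlPath="sage/avg/jan.nc" ID="sage-23487382"/>
  <dataset name="Global Averages" urlPath="sage/global.nc" ID="sage-7869700g" dataType="Grid"/>
</dataset>
"""

# Attributes a dataset may set itself to override an inherited value
# (an assumption limited to what this illustration needs).
OVERRIDABLE = ("serviceName", "dataType")


def resolve(dataset, inherited=None):
    """Return {dataset name: effective properties} for every direct dataset."""
    props = dict(inherited or {})
    for metadata in dataset.findall("metadata"):
        if metadata.get("inherit") == "true":
            props.update({child.tag: child.text for child in metadata})
    # Values given on the dataset itself win over inherited ones.
    own = dict(props)
    own.update({k: dataset.get(k) for k in OVERRIDABLE if dataset.get(k) is not None})
    resolved = {}
    if dataset.get("urlPath") is not None:  # direct datasets carry a urlPath
        resolved[dataset.get("name")] = own
    for child in dataset.findall("dataset"):
        resolved.update(resolve(child, props))
    return resolved


effective = resolve(ET.fromstring(FRAGMENT))
for name, props in effective.items():
    print(name, props)
```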

Compound Service Elements

In all of the examples thus far, the datasets have been made available via a single access method, although five methods are available. Datasets can be made available via more than one access method by defining and referencing a compound service element, an example of which follows.

Example 6 - Compound Service Elements

<?xml version="1.0" ?>
<catalog xmlns="http://www.unidata.ucar.edu/namespaces/thredds/InvCatalog/v1.0" >
   <service name="all" serviceType="Compound" base="" >
      <service name="odap" serviceType="OpenDAP" base="/thredds/dodsC/" />
      <service name="wcs" serviceType="WCS" base="/thredds/wcs/" />
   </service>
   <dataset name="SAGE III Ozone Loss for Oct 31 2006" urlPath="sage/110312006.nc">
      <serviceName>all</serviceName>
   </dataset>
</catalog>
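
The effect of referencing a compound service can be sketched in Python: the dataset gains one access path per service nested inside the compound service. The sketch uses a well-formed version of the catalog in which the dataset element has a start-tag and an end-tag and refers only to the service named all. It illustrates the idea only, and the function name access_paths is hypothetical.

```python
import xml.etree.ElementTree as ET

NS = "{http://www.unidata.ucar.edu/namespaces/thredds/InvCatalog/v1.0}"

CATALOG = """<?xml version="1.0" ?>
<catalog xmlns="http://www.unidata.ucar.edu/namespaces/thredds/InvCatalog/v1.0">
  <service name="all" serviceType="Compound" base="">
    <service name="odap" serviceType="OpenDAP" base="/thredds/dodsC/" />
    <service name="wcs" serviceType="WCS" base="/thredds/wcs/" />
  </service>
  <dataset name="SAGE III Ozone Loss for Oct 31 2006" urlPath="sage/110312006.nc">
    <serviceName>all</serviceName>
  </dataset>
</catalog>
"""


def access_paths(catalog_xml):
    """Map each service type to the access path of the catalog's single dataset."""
    root = ET.fromstring(catalog_xml)
    dataset = root.find(NS + "dataset")
    wanted = dataset.findtext(NS + "serviceName")
    service = next(s for s in root.findall(NS + "service") if s.get("name") == wanted)
    # A compound service contributes one path per nested service;
    # a simple service contributes just itself.
    members = service.findall(NS + "service") or [service]
    return {m.get("serviceType"): m.get("base") + dataset.get("urlPath") for m in members}


paths = access_paths(CATALOG)
for service_type, path in paths.items():
    print(service_type, path)
```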

Configuration Catalogs

A Configuration Catalog is a basic THREDDS catalog with extensions that contain information about the datasets to be served and what services will be available for each dataset. The extensions are the datasetRoot and datasetScan elements. All the catalog examples given thus far contain no information about the directory in which the datasets reside. The examples show how a URL is constructed to access a dataset via one of the available services, but there is nothing thus far that relates a given URL to a specific directory on your computer.

The datasetRoot Element

The datasetRoot element maps a URL base path to a directory, allowing the constructed URLs to find the directory in which a dataset resides. Suppose you have several datasets in the directory with absolute path /data/ocean/gulf and wish to make them available via THREDDS. The files in the directory are:

salinity.nc
temp.nc
hdf/
  salinity.hdf
  temp.hdf

The following example shows how that location can be mapped onto the URLs used for accessing the datasets.

Example 7 - The datasetRoot Element

...
  <service name="odap" serviceType="OpenDAP" base="/thredds/dodsC/" />

  <datasetRoot path="ocean" location="/data/ocean/gulf/" />

  <dataset name="A Test Dataset" ID="testDataset" urlPath="ocean/salinity.nc" >
    <serviceName>odap</serviceName>
  </dataset>
  <dataset name="A Test Dataset 2" ID="testDataset2" urlPath="ocean/hdf/salinity.hdf" >
    <serviceName>odap</serviceName>
  </dataset>
...

The directory path /data/ocean/gulf is aliased to ocean, which is inserted into the URL string right after the part indicating the service type. The URLs for accessing these files via the OPeNDAP server will be:

http://hostname:8080/thredds/dodsC/ocean/salinity.nc
http://hostname:8080/thredds/dodsC/ocean/hdf/salinity.hdf

where http://hostname:8080 is the server name, thredds is the web application name, dodsC is the service name, ocean is the data root alias, and salinity.nc and hdf/salinity.hdf are the filenames relative to the directory being aliased which, in this example, is /data/ocean/gulf.
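
The aliasing described above can be imitated in a few lines of Python. This is only a sketch of the mapping logic, not the way the TDS itself is implemented; the names DATASET_ROOTS, resolve_location and opendap_url are hypothetical, and the alias table is copied from Example 7.

```python
import posixpath

# Hypothetical datasetRoot table: URL alias -> directory on disk (from Example 7).
DATASET_ROOTS = {"ocean": "/data/ocean/gulf/"}

def resolve_location(url_path, roots=DATASET_ROOTS):
    """Translate a catalog urlPath into a filesystem path via its datasetRoot alias."""
    alias, _, relative = url_path.partition("/")
    if alias not in roots:
        raise KeyError(f"no datasetRoot defined for alias {alias!r}")
    return posixpath.join(roots[alias], relative)

def opendap_url(url_path, server="http://hostname:8080", base="/thredds/dodsC/"):
    """Build the access URL: server name + service base + urlPath."""
    return server + base + url_path

print(resolve_location("ocean/hdf/salinity.hdf"))  # /data/ocean/gulf/hdf/salinity.hdf
print(opendap_url("ocean/salinity.nc"))            # http://hostname:8080/thredds/dodsC/ocean/salinity.nc
```

Note that the alias appears only in the URL and the real directory only on disk; a client never sees /data/ocean/gulf.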

Multiple datasetRoots can be defined in a catalog, as in the following example.

Example 8 - Multiple datasetRoot Elements

...
  <service name="odap" serviceType="OpenDAP" base="/thredds/dodsC/" />

  <datasetRoot path="ocean" location="/data/ocean/gulf/" />
  <datasetRoot path="atmos" location="/data/atmos/" />

  <dataset name="Ocean Data Test" ID="testDataset" urlPath="ocean/salinity.nc" >
    <serviceName>odap</serviceName>
  </dataset>
  <dataset name="Atmosphere Data Test" ID="testDataset2" urlPath="atmos/sfc_wind.nc" >
    <serviceName>odap</serviceName>
  </dataset>
...

The datasetScan Element

The datasetScan element specifies filesystem locations to be scanned for datasets when generating a catalog. Up until now our examples have entailed creating catalogs from individually specified datasets. While this is a fine way to create a catalog when you only have a few datasets, it can get very tedious very quickly if you have hundreds of datasets, for example, a series of daily files of gridded temperatures over several years. Like the datasetRoot element, this element defines a mapping between a URL base path and a directory. Unlike that element, the datasetScan element will automatically serve some or all of the datasets found in the scanned directory instead of working with individual dataset elements to define the datasets.

A simple example of a catalog employing the datasetScan element follows.

Example 9 - A Catalog with a datasetScan Element

<?xml version="1.0" encoding="UTF-8"?>
<catalog name="Ocean Data" version="1.0.1" 
    xmlns="http://www.unidata.ucar.edu/namespaces/thredds/InvCatalog/v1.0"
    xmlns:xlink="http://www.w3.org/1999/xlink">

  <service name="odap" serviceType="OpenDAP" base="/thredds/dodsC/" />
  <datasetScan name="Ocean Data" path="ocean" location="/data/ocean/" >
    <serviceName>odap</serviceName>
  </datasetScan >
</catalog>

The path attribute of the datasetScan element is the part of the URL that identifies this particular datasetScan and is used to map dataset URLs to a location. The location attribute gives the location of the dataset collection on the local filesystem.

In the catalog shown to a client requesting a specific datasetScan, the datasetScan element is shown as a catalog reference, that is, replaced by a catalogRef element. For example, the catalog in Example 9 would be transformed for a client request into:

Example 10 - A Catalog with a datasetScan Element Replaced by a catalogRef Element

<?xml version="1.0" encoding="UTF-8"?>
<catalog name="Ocean Data" version="1.0.1"
    xmlns="http://www.unidata.ucar.edu/namespaces/thredds/InvCatalog/v1.0"
    xmlns:xlink="http://www.w3.org/1999/xlink">

  <service name="odap" serviceType="OpenDAP" base="/thredds/dodsC/" />
  <catalogRef xlink:href="/thredds/catalog/ocean/catalog.xml" xlink:title="Ocean Data" name="" />
</catalog>

The datasetScan element of Example 9 is replaced by the catalogRef element in Example 10.

The catalog.xml of Example 10 will be generated dynamically for display by the server when requested by the client. It will do this by scanning the /data/ocean directory specified in the datasetScan element.

If this catalog were to scan a directory structure that looks like:

/data/ocean/
   atlantic/
      salinity/
         s_20050101.nc
         s_20050102.nc
           ...
         s_20051231.nc
      temperature/
         t_20050101.nc
           ...
         t_20051231.nc
   pacific/
      ...
   indian/
      ...

the result of a client request to http://server:8080/thredds/catalog/ocean/catalog.xml for the top-level catalog created by this datasetScan would look something like:

Example 11 - First Level Catalog Created by the datasetScan Element

<catalog ...>
  <service name="odap" serviceType="OpenDAP" base="/thredds/dodsC/" />
  <dataset name="Ocean Data">
    <metadata inherited="true">
      <serviceName>odap</serviceName>
    </metadata>
    <catalogRef xlink:title="atlantic" xlink:href="atlantic/catalog.xml" name="" />
    <catalogRef xlink:title="pacific" xlink:href="pacific/catalog.xml" name="" />
    <catalogRef xlink:title="indian" xlink:href="indian/catalog.xml" name="" />
  </dataset>
</catalog>

A request for the second-level catalog atlantic in Example 11 would be of the form http://server:8080/thredds/catalog/ocean/atlantic/catalog.xml and look something like:

Example 12 - Second Level Catalog Created by the datasetScan Element

<catalog ...>
  <service name="odap" serviceType="OpenDAP" base="/thredds/dodsC/" />
  <dataset name="ocean/atlantic">
    <metadata inherited="true">
      <serviceName>odap</serviceName>
    </metadata>
    <catalogRef xlink:title="salinity" xlink:href="atlantic/salinity/catalog.xml" name="" />
    <catalogRef xlink:title="temperature" xlink:href="atlantic/temperature/catalog.xml" name="" />
  </dataset>
</catalog>

A client request for the first subdirectory in Example 12 would have the URL http://server:8080/thredds/catalog/ocean/atlantic/salinity/catalog.xml and would create a catalog looking like:

Example 13 - A Data Catalog for the First Subdirectory in Example 12

<catalog ...>
  <service name="odap" serviceType="OpenDAP" base="/thredds/dodsC/" />
  <dataset name="ocean/atlantic/salinity">
    <metadata inherited="true">
      <serviceName>odap</serviceName>
    </metadata>
    <dataset name="s_20050101.nc"
             urlPath="ocean/atlantic/salinity/s_20050101.nc" />
    <dataset name="s_20050102.nc"
             urlPath="ocean/atlantic/salinity/s_20050102.nc" />
      ...
    <dataset name="s_20051231.nc"
             urlPath="ocean/atlantic/salinity/s_20051231.nc" />
  </dataset>
</catalog>

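In summary, when a datasetScan is performed, the datasetScan element in the configuration catalog is shown to clients as a catalogRef element, and the catalog it refers to is generated on request by scanning the corresponding directory. Subdirectories appear as further catalogRef elements, and files appear as dataset elements whose urlPath begins with the path of the datasetScan.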
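
The behavior shown in Examples 11 through 13 can be imitated with a short Python sketch. It is a simplified stand-in for what the server does, not its actual code: the function name scan_level is hypothetical, and a temporary directory plays the role of /data/ocean/.

```python
import os
import tempfile

def scan_level(location, path, relative=""):
    """Return (catalogRef hrefs, dataset urlPaths) for one directory level."""
    refs, datasets = [], []
    entries = sorted(os.scandir(os.path.join(location, relative)), key=lambda e: e.name)
    for entry in entries:
        if entry.is_dir():
            # Subdirectories become catalogRef elements, resolved on a later request.
            refs.append(f"{entry.name}/catalog.xml")
        else:
            # Files become dataset elements; urlPath = scan path + relative path + name.
            datasets.append("/".join(p for p in (path, relative, entry.name) if p))
    return refs, datasets

# Build a tiny stand-in for /data/ocean/ in a temporary directory.
with tempfile.TemporaryDirectory() as location:
    salinity = os.path.join(location, "atlantic", "salinity")
    os.makedirs(salinity)
    os.makedirs(os.path.join(location, "pacific"))
    for name in ("s_20050101.nc", "s_20050102.nc"):
        open(os.path.join(salinity, name), "w").close()

    top = scan_level(location, "ocean")
    leaf = scan_level(location, "ocean", "atlantic/salinity")

print(top)   # (['atlantic/catalog.xml', 'pacific/catalog.xml'], [])
print(leaf)  # ([], ['ocean/atlantic/salinity/s_20050101.nc', 'ocean/atlantic/salinity/s_20050102.nc'])
```

Each call looks at one directory level only, which mirrors the way a client descends the hierarchy one catalog.xml request at a time.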

Using datasetScan With Filters

A datasetScan element can specify a subset of the available files and directories to include in the generated catalog using the filter element. A simple example of a filter element is:

<filter>
  <include wildcard="*.grib1"/>
  <exclude wildcard="*_0000.grib1"/>
</filter>
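
The effect of this filter can be approximated in Python as follows. The sketch assumes that a file is kept when it matches at least one include pattern and no exclude pattern, and it uses the standard fnmatch module as a stand-in for the THREDDS wildcard matching; the function name passes_filter and the file names are hypothetical.

```python
from fnmatch import fnmatchcase

def passes_filter(filename, include=("*.grib1",), exclude=("*_0000.grib1",)):
    """Keep a file if it matches an include pattern and no exclude pattern."""
    included = any(fnmatchcase(filename, pattern) for pattern in include)
    excluded = any(fnmatchcase(filename, pattern) for pattern in exclude)
    return included and not excluded

files = ["eta_1200.grib1", "eta_0000.grib1", "notes.txt"]  # hypothetical names
print([f for f in files if passes_filter(f)])  # ['eta_1200.grib1']
```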



The Catalog Configuration File



The XML-based THREDDS catalog configuration file catalog.xml is constructed of nested elements. The top-level element is the Catalog element. An example is:

<catalog name="TGLO/TABS THREDDS Server"
        xmlns="http://www.unidata.ucar.edu/namespaces/thredds/InvCatalog/v1.0"
        xmlns:xlink="http://www.w3.org/1999/xlink">
  ...
</catalog>

The name attribute provides a name for this catalog, in case you wish to supply more than one named catalog. The xmlns attributes declare the XML namespaces being used, namely the THREDDS catalog namespace and XLink.

Inside the Catalog element are the Service and Dataset elements which provide, respectively, services by which the files are made available and lists of the files being made available.


Configuring The Services


The services presently (6/09) available under the Service element are specified as shown in the following example:

<catalog ...>
 ...
 <service name="all" serviceType="Compound" base="" >
     <service name="odap" serviceType="OpenDAP" base="/thredds/dodsC/" />
     <service name="wcs" serviceType="WCS" base="/thredds/wcs/" />
     <service name="subsetServer" serviceType="NetcdfSubset" base="/thredds/ncss/grid/" suffix="/dataset.html"/>
     <service name="HTTPServer" serviceType="HTTPServer" base="/thredds/fileServer/" />
  </service>
 ...
</catalog>

This will result in the following rendered text on the separate HTML page created for each available file:

Access:

   1. OPENDAP: http://csanady.tamu.edu:8080/thredds/dodsC/GNOME/GNAM-hind-reg-24/GNAM-hind-reg-09-05-24-00-24.nc
   2. WCS: http://csanady.tamu.edu:8080/thredds/wcs/GNOME/GNAM-hind-reg-24/GNAM-hind-reg-09-05-24-00-24.nc
   3. NetcdfSubset: http://csanady.tamu.edu:8080/thredds/ncss/grid/GNOME/GNAM-hind-reg-24/GNAM-hind-reg-09-05-24-00-24.nc/dataset.html
   4. HTTPServer: http://csanady.tamu.edu:8080/thredds/fileServer/GNOME/GNAM-hind-reg-24/GNAM-hind-reg-09-05-24-00-24.nc
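
Each access URL is simply the server name followed by the service base, the dataset urlPath and, where present, the service suffix. A minimal Python sketch of that assembly follows; the names SERVICES and access_urls are hypothetical, and the service table is copied from the compound service element above.

```python
# (name, serviceType, base, suffix) copied from the compound service element.
SERVICES = [
    ("odap", "OpenDAP", "/thredds/dodsC/", ""),
    ("wcs", "WCS", "/thredds/wcs/", ""),
    ("subsetServer", "NetcdfSubset", "/thredds/ncss/grid/", "/dataset.html"),
    ("HTTPServer", "HTTPServer", "/thredds/fileServer/", ""),
]

def access_urls(server, url_path, services=SERVICES):
    """Map each serviceType to server + base + urlPath + suffix."""
    return {stype: f"{server}{base}{url_path}{suffix}"
            for _name, stype, base, suffix in services}

urls = access_urls("http://csanady.tamu.edu:8080",
                   "GNOME/GNAM-hind-reg-24/GNAM-hind-reg-09-05-24-00-24.nc")
for service_type, url in urls.items():
    print(f"{service_type}: {url}")
```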

These services are explained in the following.


OPeNDAP Service

Upon clicking the string following OPENDAP in the rendered text above, you will obtain a separate page entitled OPeNDAP Dataset Access Form, i.e. the standard WWW Interface form created by the OPeNDAP server, the use of which is extensively documented in the OPeNDAP User Guide. A separate OPeNDAP Dataset Access Form HTML page is created for each available file.


WCS Service

Clicking on the string following WCS will provide (assuming your browser has the capability to render XML as HTML) an HTML-rendered version of the WCS capabilities document. This is provided by the THREDDS WCS Server, which implements the OGC Web Coverage Service (WCS) 1.0.0 specification. The capabilities document tells you what is available so you can take the next step of actually obtaining data rather than metadata from the THREDDS WCS server. The data can be served in either GeoTIFF or NetCDF format. This cannot be done directly via the THREDDS interface but must be performed with a WCS client program. Available clients include OWSLib and Gi-go.

The THREDDS WCS Configuration page describes the limitations of files that can be served via the WCS server; these are listed under WCS Service Limitations below.


WCS Service Configuration

This section details how to modify the TDS configuration files to enable and use the embedded WCS server. Methods for accessing the enabled WCS server can be found in the WCS Client Requests section.

The general steps to modify the configuration files to support WCS are as follows.

1. Modifying the threddsConfig.xml File

In the default threddsConfig.xml file, the allow value of the WCS element is set to false. The available parameters for the WCS element are explained on the threddsConfig.xml Tutorial Page. They are:

  <WCS>
    <allow>true</allow>
    <dir>/temp/ncache/</dir>
    <scour>15 min</scour>
    <maxAge>30 min</maxAge>
  </WCS>

This controls the Web Coverage Service (WCS), which allows WCS clients to specify a subset of a dataset and download GeoTIFF or netCDF files. By default, this service is off. The elements and their allowable values are:

  1. allow: to disallow, change this to false. If you don't use a WCS service in your catalogs, there will be no valid URLs for this service.
  2. dir: the working directory for files to be downloaded (choosing a cache directory)
  3. scour: how often to scour the working directory, to delete files that were not successfully downloaded.
  4. maxAge: how long to leave the files in the working directory while the download is occurring. The files are deleted after a successful download.

If WCS is allowed but the directory is not set, the TDS will use the ${tomcat_home}/content/thredds/wcs/wcache/ directory for temporary files.
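
When editing threddsConfig.xml by hand it can help to check what the WCS element actually says before restarting the server. The following Python sketch reads the four settings with the standard library; it is only a convenience check, the function name wcs_settings is hypothetical, and the embedded text is the example above wrapped in a threddsConfig root element.

```python
import xml.etree.ElementTree as ET

CONFIG = """<threddsConfig>
  <WCS>
    <allow>true</allow>
    <dir>/temp/ncache/</dir>
    <scour>15 min</scour>
    <maxAge>30 min</maxAge>
  </WCS>
</threddsConfig>"""

def wcs_settings(config_text):
    """Return the child elements of <WCS> as a dict of strings."""
    wcs = ET.fromstring(config_text).find("WCS")
    if wcs is None:
        # No WCS element: the service is off by default, as stated above.
        return {"allow": "false"}
    return {child.tag: (child.text or "").strip() for child in wcs}

print(wcs_settings(CONFIG))
```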

2. Configure Datasets in the catalog.xml File to Have a WCS Access Method

After WCS is enabled in the threddsConfig.xml file, the datasets need to be configured to have a WCS access method in the catalog.xml file. The service element's serviceType and base attribute values must be as follows:

<service name="wcs" serviceType="WCS" base="/thredds/wcs/" />

The dataset being served must reference this service (or a containing compound service) by the service name wcs:

<dataset ID="sample" name="Sample Data" urlPath="sample.nc">
  <serviceName>wcs</serviceName>
</dataset>

The dataset can be configured by datasetRoot or datasetScan as appropriate (as outlined in basic configuration). They are listed in the resulting THREDDS catalogs as are other datasets. WCS clients may not be able to directly use the THREDDS catalogs to find the WCS services but the catalogs are useful for users to browse and for separate search services (e.g., OGC catalog services).

WCS Service Limitations

The limitations of the present and future THREDDS WCS server are explained on the THREDDS WCS page. The current 1.0.0 version has the following restrictions:

  1. Interpolation is not available (i.e. interpolationMethod="none").
  2. CRS/SRS
    1. All CRS/SRS are listed as WGS84(DD) even though it may be unrelated to the CRS of the data.
    2. CRS is horizontal only, i.e. XY.
    3. The response coverage is in the native CRS of the data, as implied by the lack of interpolation.
    4. The netCDF-Java library understands a number of projections (i.e. a subset of the CF convention grid mapping options, most assuming a spherical earth) including a simple lat/lon grid [-180/180 and -90/90].
    5. All BBOX requests are assumed to be in the lat/lon of the native projection.
  3. Only one value can be specified as the temporal selection, i.e. no list or min/max/res.
  4. Range:
    1. Each coverage has only one range field.
    2. The range axis is Vertical only if the coordinate has a vertical component.
    3. Only one value can be specified for the range axis selection, i.e. no list or min/max/res.
  5. The supported GetCoverage response formats are:
    1. GeoTIFF - A grayscale 8-bit GeoTIFF file.
    2. GeoTIFFfloat - A floating point "Data Sample" GeoTIFF file.
    3. NetCDF3 - A NetCDF file following the CF-1.0 convention.

WCS Service Access

Methods for accessing the WCS server are explained in the WCS Client Requests section.


NetCDF Subset Service

As explained on the THREDDS NetCDF Subset Service page, the subset service can be used in two modes.

Note: As per a conversation with Rich Signell, the subsetting capabilities require not only a regular grid, but also that the lon/lat values are specified as 1-D rather than 2-D fields. This might prove to be painful.


HTTP File Server Service

This is a very simple service that allows you to download the entire file by clicking on the string following HTTPServer.


The Datasets


A basic dataset configuration structure is shown in the following example:

<catalog ...>
 ...
 <dataset name="TGLO/TABS Model Output Products">
    ...
     <metadata ...>
       ...
     </metadata>
    ...
     <datasetScan ...>
     ...
     </datasetScan>
    ...
 </dataset>
 ...
</catalog>

The DatasetScan element will be explained first, and then the Metadata element that provides metadata about the datasets.


The DatasetScan Element

The most basic way to serve files via the THREDDS Data Server (TDS) is by the explicit specification of individual file names. While this is sufficient if one is only serving a few files, the DatasetScan element, which automatically creates lists of available files, becomes practically mandatory when there are more than a few files and, even more so, when the list of files is updated regularly.

A very simple example of a DatasetScan element configuration is:

<datasetScan name="ROMS History Files" ID="TABSDatasetScan"
               path="HIS" location="/data1/TGLO/HIS" harvest="true">
</datasetScan>

This example scans the disk directory /data1/TGLO/HIS and creates a top-level THREDDS directory called ROMS History Files containing all of the files found there. Other elements placed within the DatasetScan element provide further convenience and information. Some useful ones are explained in the following.

Filter Element

If the directory the DatasetScan element is scanning contains files you want to keep there but don't want to serve, then the Filter element can be used to include or exclude specific files. For example, if we want to serve only those NetCDF files with the .nc suffix, then we would include the following Filter element within the DatasetScan element:

<filter>
    <include wildcard="*.nc"/>
</filter>

Sort Element

Our TGLO/TABS archive directories contain 6-hourly or daily files for months or even years, and we want to show them sorted in either oldest-to-youngest or youngest-to-oldest order. The former can be implemented with the Sort element as follows:

<sort>
    <lexigraphicByName increasing="true" />
</sort>

This obtains the order we want since the filenames contain the year, month and day in a YY-MM-DD format, which sorts from oldest to youngest when the increasing attribute is true.
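
The reason a plain name sort yields chronological order can be seen in a small Python sketch. It assumes the names are compared as ordinary strings and that every date field is zero-padded; the first file name comes from this document, while the other two are hypothetical.

```python
names = [
    "GNAM-hind-reg-09-05-24-00-24.nc",
    "GNAM-hind-reg-08-12-31-00-24.nc",  # hypothetical
    "GNAM-hind-reg-09-05-02-00-24.nc",  # hypothetical
]

oldest_first = sorted(names)                # corresponds to increasing="true"
newest_first = sorted(names, reverse=True)  # corresponds to increasing="false"

print(oldest_first[0])  # GNAM-hind-reg-08-12-31-00-24.nc
print(newest_first[0])  # GNAM-hind-reg-09-05-24-00-24.nc
```

Without the zero padding (5-2 instead of 05-02, say) the string order and the date order would no longer agree.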

Specifying the Most Recent File via the addLatest Element

For those interested in obtaining only the most recent file in a given directory, an addLatest element is provided. It takes the form:

<addLatest>
   <simpleLatest name="latest.xml" top="true" serviceName="latest" />
</addLatest>

This creates a first entry at the top of the column labelled latest which takes the user directly to the most recently acquired file.

Specifying the Size of the Available Files

The addDatasetSize element adds a column showing the size of each of the available files. It takes the form:

<addDatasetSize/>


The Metadata Element

The THREDDS catalog specification namespace allows all sorts of metadata to be added, and even further metadata can be supplied using separate XML namespaces.

The TDS Distribution Metadata Example

The standard TDS distribution comes with an enhanced data catalog example containing a set of metadata that is recommended as a basic starting level for all your datasets. The XML file that produced this example is reproduced below; the metadata portions are the contents of the two metadata elements.

Table: The Enhanced Data Catalog enhancedCatalog.xml
<?xml version="1.0" encoding="UTF-8"?>
<catalog xmlns="http://www.unidata.ucar.edu/namespaces/thredds/InvCatalog/v1.0" xmlns:xlink="http://www.w3.org/1999/xlink"
   name="Unidata THREDDS-IDD NetCDF-OpenDAP Server" version="1.0.1">

  <service name="latest" serviceType="Resolver" base="" />
  <service name="both" serviceType="Compound" base="">
    <service name="ncdods" serviceType="OPENDAP" base="/thredds/dodsC/" />
    <service name="HTTPServer" serviceType="HTTPServer" base="/thredds/fileServer/" />
  </service>

  <dataset name="NCEP Model Data">
    <metadata inherited="true">
      <serviceName>both</serviceName>
      <authority>edu.ucar.unidata</authority>
      <dataType>Grid</dataType>
      <dataFormat>NetCDF</dataFormat>
      <documentation type="rights">Freely available</documentation>
      <documentation xlink:href="http://www.emc.ncep.noaa.gov/modelinfo/index.html"
             xlink:title="NCEP Model documentation"></documentation>
      <creator>
        <name vocabulary="DIF">DOC/NOAA/NWS/NCEP</name>
        <contact url="http://www.ncep.noaa.gov/" email="http://www.ncep.noaa.gov/mail_liaison.shtml" />
      </creator>
      <publisher>
        <name vocabulary="DIF">UCAR/UNIDATA</name>
        <contact url="http://www.unidata.ucar.edu/" email="support@unidata.ucar.edu" />
      </publisher>
      <timeCoverage>
        <end>present</end>
        <duration>14 days</duration>
      </timeCoverage>
    </metadata>

    <datasetScan name="ETA Data" ID="testEnhanced"
                 path="testEnhanced" location="content/testdata/"
                 harvest="true">
      <metadata inherited="true">
        <documentation type="summary">NCEP North American Model : AWIPS 211 (Q) Regional - CONUS
             (Lambert Conformal). Model runs are made at 12Z and 00Z, with analysis and forecasts every 6
             hours out to 60 hours. Horizontal = 93 by 65 points, resolution 81.27 km, LambertConformal
             projection. Vertical = 1000 to 100 hPa pressure levels.</documentation>
        <geospatialCoverage>
          <northsouth>
            <start>26.92475</start>
            <size>15.9778</size>
            <units>degrees_north</units>
          </northsouth>
          <eastwest>
            <start>-135.33123</start>
            <size>103.78772</size>
            <units>degrees_east</units>
          </eastwest>
          <updown>
            <start>0.0</start>
            <size>0.0</size>
            <units>km</units>
          </updown>
        </geospatialCoverage>
        <variables vocabulary="GRIB-1" />
        <variables vocabulary="">
          <variable name="Z_sfc" vocabulary_name="" units="gp m">Geopotential height, gpm</variable>
        </variables>
      </metadata>

      <filter>
        <include wildcard="*eta_211.nc" />
      </filter>
      <addID/>
      <sort>
        <lexigraphicByName increasing="false"/>
      </sort>
      <addLatest/>
      <addDatasetSize/>
      <addTimeCoverage datasetNameMatchPattern="([0-9]{4})([0-9]{2})([0-9]{2})([0-9]{2})_eta_211.nc$"
                       startTimeSubstitutionPattern="$1-$2-$3T$4:00:00"
                       duration="60 hours" />
    </datasetScan>
  </dataset>
</catalog>


The TGLO/TABS catalog.xml File


The current (6/09) catalog.xml file for serving the TGLO/TABS output is reproduced below.

Table: The TGLO/TABS catalog.xml
<?xml version="1.0" encoding="UTF-8"?>
<catalog name="TGLO/TABS THREDDS Server"
        xmlns="http://www.unidata.ucar.edu/namespaces/thredds/InvCatalog/v1.0"
        xmlns:xlink="http://www.w3.org/1999/xlink">

  <service name="latest" serviceType="Resolver" base="" />
  <service name="all" serviceType="Compound" base="" >
     <service name="odap" serviceType="OpenDAP" base="/thredds/dodsC/" />
     <service name="wcs" serviceType="WCS" base="/thredds/wcs/" />
     <service name="subsetServer" serviceType="NetcdfSubset" base="/thredds/ncss/grid/" suffix="/dataset.html"/>
     <service name="HTTPServer" serviceType="HTTPServer" base="/thredds/fileServer/" />
  </service>

  <dataset name="TGLO/TABS Model Output Products">
    <metadata inherited="true">
      <serviceName>all</serviceName>
      <authority>edu.ucar.unidata</authority>
      <dataType>Grid</dataType>
      <dataFormat>NetCDF</dataFormat>
      <documentation type="rights">Freely available</documentation>
      <documentation xlink:href="http://www.emc.ncep.noaa.gov/modelinfo/index.html"
           xlink:title="TGLO/TABS Model documentation"></documentation>
      <creator>
        <name vocabulary="DIF">TAMU TGLO/TABS Project</name>
        <contact url="http://seawater.tamu.edu/tglo/rindex.html" email="baum@stommel.tamu.edu" />
      </creator>
      <publisher>
        <name vocabulary="DIF">TAMU TGLO/TABS Project</name>
        <contact url="http://seawater.tamu.edu/tglo/rindex.html" email="baum@stommel.tamu.edu" />
      </publisher>
    </metadata>

  <datasetScan name="ROMS History Files" ID="TABSDatasetScan"
               path="HIS" location="/data1/TGLO/HIS" harvest="true">

      <metadata inherited="true">
      <serviceName>all</serviceName>

      </metadata>

    <sort>
      <lexigraphicByName increasing="true" />
    </sort>

    <filter>
      <include wildcard="*.nc"/>
    </filter>

    <addLatest>
       <simpleLatest name="latest.xml" top="true" serviceName="latest" />
    </addLatest>

    <addDatasetSize/>

  </datasetScan>

  <datasetRoot path="test" location="content/testdata/"/>

  <datasetScan name="GNOME-ready Files" ID="GNOMEDatasetScan"
               path="GNOME" location="content/GNOME" harvest="true">

    <metadata inherited="true">
      <serviceName>all</serviceName>

        <documentation type="summary">TGLO/TABS Model</documentation>

        <geospatialCoverage>
          <northsouth>
            <start>31.0</start>
            <size>18.0</size>
            <units>degrees_north</units>
          </northsouth>
          <eastwest>
            <start>-98.0</start>
            <size>81.0</size>
            <units>degrees_east</units>
          </eastwest>
          <updown>
            <start>0.0</start>
            <size>0.0</size>
            <units>km</units>
          </updown>
        </geospatialCoverage>

    </metadata>

    <filter>
      <include wildcard="*.nc"/>
    </filter>

    <sort>
      <lexigraphicByName increasing="true" />
    </sort>

    <addLatest>
       <simpleLatest name="latest.xml" top="true" serviceName="latest" />
    </addLatest>

    <addDatasetSize/>

  </datasetScan>

  </dataset>

  <datasetScan name="Test Example" ID="testDatasetScan"
               path="testAll" location="content/testdata">
    <metadata inherited="true">
      <serviceName>all</serviceName>
    </metadata>
    <filter>
      <include wildcard="*.nc"/>
    </filter>
  </datasetScan>

  <catalogRef xlink:title="Enhanced Catalog Example" xlink:href="enhancedCatalog.xml" name=""/>

</catalog>



The THREDDS Server Configuration File



The threddsConfig.xml file contains parameters that control the THREDDS Data Server (TDS), as specified on the threddsConfig.xml Tutorial Page. This file differs from the main catalog catalog.xml in that it contains mostly default values of parameters that usually do not change, or at least aren't modified as often as in the latter.

Table: The TGLO/TABS threddsConfig.xml
<?xml version="1.0" encoding="UTF-8"?>
<threddsConfig>

  <!--
  The <catalogRoot> element:
  For catalogs you don't want visible from the /thredds/catalog.xml chain
  of catalogs, you can use catalogRoot elements. Each catalog root config
  catalog is crawled and used in configuring the TDS.
  -->
  <!-- E.g.,
  <catalogRoot>myExtraCatalog.xml</catalogRoot>
  <catalogRoot>myOtherExtraCatalog.xml</catalogRoot>
  -->

  <!--
   The <CatalogServices> element:
   -->
  <CatalogServices>
    <allowRemote>false</allowRemote>
  </CatalogServices>

  <!--
  The <NetcdfFileCache> element:
  -->
  <NetcdfFileCache>
    <minFiles>100</minFiles>
    <maxFiles>200</maxFiles>
    <scour>10 min</scour>
  </NetcdfFileCache>

  <!--
  The <NetcdfDatasetCache> element:
  -->
  <NetcdfDatasetCache>
    <minFiles>100</minFiles>
    <maxFiles>200</maxFiles>
    <scour>10 min</scour>
  </NetcdfDatasetCache>

  <!--
  The <HTTPFileCache> element:
  -->
  <HTTPFileCache>
    <minFiles>25</minFiles>
    <maxFiles>40</maxFiles>
    <scour>10 min</scour>
  </HTTPFileCache>

  <!--
  The <CdmValidatorService> element:
  -->
  <CdmValidatorService>
    <allow>false</allow>
    <dir>/data/tmp/thredds/cdmValidateCache/</dir>
    <maxFileUploadSize>1 Gb</maxFileUploadSize>
    <scour>24 hours</scour>
    <maxAge>30 days</maxAge>
  </CdmValidatorService>

  <!--
  The <NetcdfSubsetService> element:
  -->
  <NetcdfSubsetService>
    <allow>true</allow>
    <dir>/data/tmp/thredds/ncSubsetCache/</dir>
    <scour>15 min</scour>
    <maxAge>30 min</maxAge>
  </NetcdfSubsetService>

<!--
  The <WCS> element:
  -->
  <WCS>
    <allow>true</allow>
    <allowRemote>true</allowRemote>
    <dir>/data/tmp/thredds/wcsCache/</dir>
    <scour>15 min</scour>
    <maxAge>30 min</maxAge>
  </WCS>

  <!--
  The <Viewer> element:
  -->
  <!-- Viewer>my.package.MyViewer</Viewer -->

  <!--
  The <FmrcInventory> element:
  -->
  <FmrcInventory>
    <openType>XML_ONLY</openType>
  </FmrcInventory>

  <!--
  The <nj22Config> element:
  -->
  <!-- nj22Config
    <ioServiceProvider class="edu.univ.ny.stuff.FooFiles"/>
    <coordSysBuilder convention="foo" class="test.Foo"/>
    <coordTransBuilder name="atmos_ln_sigma_coordinates" type="vertical" class="my.stuff.atmosSigmaLog"/>
    <typedDatasetFactory datatype="Point" class="gov.noaa.obscure.file.Flabulate"/>
    <table type="GRIB1" filename="/home/rkambic/grib/tables/userlookup.lst"/>
    <table type="GRIB2" filename="/home/rkambic/grib/tables/grib2userparameters"/>
  </nj22Config -->

  <!--
  The <DiskCache> element:

  <DiskCache>
    <alwaysUse>false</alwaysUse>
    <dir>/data/tmp/thredds/cache/</dir>
    <scour>1 hour</scour>
    <maxSize>1 Gb</maxSize>
  </DiskCache>    -->

  <!--
  The <GribIndexing> element:
  -->
  <GribIndexing>
    <setExtendIndex>false</setExtendIndex>
    <alwaysUseCache>false</alwaysUseCache>
  </GribIndexing>

  <!--
  The <AggregationCache> element:

  <AggregationCache>
    <dir>/data/tmp/thredds/aggcache/</dir>
    <scour>24 hours</scour>
    <maxAge>90 days</maxAge>
  </AggregationCache>     -->

</threddsConfig>



WCS Client Requests



Introduction

Unlike with the OPeNDAP service, there is as yet no web GUI front-end for using the THREDDS WCS service. WCS service requests must be made using the syntax detailed in the WCS 1.0.0 specifications document. This syntax specifies a set of commands that must be serially appended to the dataset URL in order to obtain the geographically and/or temporally limited subset of the dataset being accessed. As the URLs can become quite lengthy, this can be a daunting task if done without the assistance of software designed to make access more user-friendly.

In this section we will demonstrate how to access the THREDDS WCS the slow and painful way as well as how to access it using the OWSLib and Gi-go clients mentioned previously.


The WCS Request Syntax

The THREDDS WCS server presently implements three request types. They are, in ascending order of complexity of request, GetCapabilities, DescribeCoverage and GetCoverage.

A WCS client must make a properly formatted request to the WCS server to obtain the desired data. With the THREDDS WCS server, all WCS requests start with, to use a trivial example:

http://servername:8080/thredds/wcs

to which we add a path indicating which file to use, in the case of this trivial example test.nc:

http://servername:8080/thredds/wcs/test.nc

In the substantial examples shown below, the analogous string to this will be:

http://motherlode.ucar.edu:8080/thredds/wcs/galeon/testdata/RUC.nc

wherein the file path is galeon/testdata/RUC.nc rather than the test.nc of the trivial example.

The WCS client then formats queries to the THREDDS WCS server for the chosen dataset. Some typical example queries are:


GetCapabilities Request

The GetCapabilities query obtains general information about the service, and summary information about the available data collections from which coverages may be requested.

http://motherlode.ucar.edu:8080/thredds/wcs/galeon/testdata/RUC.nc?request=GetCapabilities&version=1.0.0&service=WCS
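
Rather than typing such query strings by hand, they can be assembled with the Python standard library. The sketch below only builds the URL and sends nothing over the network; the function name wcs_request_url is hypothetical, and the coverage parameter shown in the second call is the one the WCS 1.0.0 specification defines for DescribeCoverage.

```python
from urllib.parse import urlencode

def wcs_request_url(dataset_url, request, **extra):
    """Append a WCS 1.0.0 key-value query string to a dataset URL."""
    params = {"request": request, "version": "1.0.0", "service": "WCS"}
    params.update(extra)  # request-specific parameters, e.g. coverage=...
    return f"{dataset_url}?{urlencode(params)}"

base = "http://motherlode.ucar.edu:8080/thredds/wcs/galeon/testdata/RUC.nc"
print(wcs_request_url(base, "GetCapabilities"))
print(wcs_request_url(base, "DescribeCoverage", coverage="Potential_temperature"))
```

The first call reproduces the GetCapabilities URL shown above.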

There will be two general sections in the XML document obtained via this request. The first part will be a list of the capabilities of the server, i.e.

<WCS_Capabilities version="1.0.0">
 ...
<Capability>
  <Request>
    <GetCapabilities>
      <DCPType>
        <HTTP>
          <Get>
            <OnlineResource xlink:href="http://motherlode.ucar.edu:8080/thredds/wcs/galeon/testdata/RUC.nc"/>
          </Get>
        </HTTP>
      </DCPType>
    </GetCapabilities>
    <DescribeCoverage>
      <DCPType>
        <HTTP>
          <Get>
            <OnlineResource xlink:href="http://motherlode.ucar.edu:8080/thredds/wcs/galeon/testdata/RUC.nc"/>
          </Get>
        </HTTP>
      </DCPType>
    </DescribeCoverage>
    <GetCoverage>
      <DCPType>
        <HTTP>
          <Get>
            <OnlineResource xlink:href="http://motherlode.ucar.edu:8080/thredds/wcs/galeon/testdata/RUC.nc"/>
          </Get>
        </HTTP>
      </DCPType>
    </GetCoverage>
  </Request>
</Capability>
  ...
</WCS_Capabilities>

This request has returned the information that the available capabilities of the THREDDS server are GetCapabilities, DescribeCoverage and GetCoverage.

The second part will list the metadata of all of the available content, e.g.

<WCS_Capabilities version="1.0.0">
  ...
<ContentMetadata>
  <ContentOfferingBrief>
    <description>
      Potential_temperature K       false     Potential temperature @ tropopause
    </description>
    <name>Potential_temperature</name>
    <label>Potential temperature @ tropopause</label>
    <lonLatEnvelope srsName="urn:ogc:def:crs:OGC:1.3:CRS84">
      <gml:pos>-153.57137910046424 11.747958797813363</gml:pos>
      <gml:pos>-48.67045857136577 57.47373334674363</gml:pos>
      <gml:timePosition>2002-12-02T22:00:00Z</gml:timePosition>
      <gml:timePosition>2002-12-03T01:00:00Z</gml:timePosition>
    </lonLatEnvelope>
  </ContentOfferingBrief>
  ...
</ContentMetadata>
</WCS_Capabilities>

This is just one of a couple dozen content offering descriptions within the returned document. It tells us that a variable with the name Potential_temperature is available within the given envelope of lon/lat coordinates and at the given time(s).
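
As an illustration of how a client might read these offerings, the sketch below parses a ContentOfferingBrief with the Python standard library. The sample is a trimmed copy of the excerpt above; the `xmlns` declarations are assumptions (the excerpt elides them), and `summarize_offerings` is a hypothetical helper name.

```python
import xml.etree.ElementTree as ET

# Namespace URIs assumed for WCS 1.0.0 and GML.
NS = {"wcs": "http://www.opengis.net/wcs",
      "gml": "http://www.opengis.net/gml"}

SAMPLE = """\
<WCS_Capabilities version="1.0.0" xmlns="http://www.opengis.net/wcs"
                  xmlns:gml="http://www.opengis.net/gml">
  <ContentMetadata>
    <ContentOfferingBrief>
      <name>Potential_temperature</name>
      <label>Potential temperature @ tropopause</label>
      <lonLatEnvelope srsName="urn:ogc:def:crs:OGC:1.3:CRS84">
        <gml:pos>-153.57137910046424 11.747958797813363</gml:pos>
        <gml:pos>-48.67045857136577 57.47373334674363</gml:pos>
        <gml:timePosition>2002-12-02T22:00:00Z</gml:timePosition>
        <gml:timePosition>2002-12-03T01:00:00Z</gml:timePosition>
      </lonLatEnvelope>
    </ContentOfferingBrief>
  </ContentMetadata>
</WCS_Capabilities>
"""


def summarize_offerings(xml_text):
    """Return name, label, lon/lat corners and times for each offering."""
    root = ET.fromstring(xml_text)
    summaries = []
    for brief in root.iterfind("wcs:ContentMetadata/wcs:ContentOfferingBrief", NS):
        envelope = brief.find("wcs:lonLatEnvelope", NS)
        # Each gml:pos holds "lon lat"; the first is lower left, the second upper right.
        corners = [tuple(float(v) for v in pos.text.split())
                   for pos in envelope.findall("gml:pos", NS)]
        times = [t.text for t in envelope.findall("gml:timePosition", NS)]
        summaries.append({
            "name": brief.findtext("wcs:name", namespaces=NS),
            "label": brief.findtext("wcs:label", namespaces=NS),
            "lower_left": corners[0],
            "upper_right": corners[1],
            "times": times,
        })
    return summaries


for offering in summarize_offerings(SAMPLE):
    print(offering)
```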


DescribeCoverage Request

A DescribeCoverage query obtains a full description of one or more of the available coverages.

http://motherlode.ucar.edu:8080/thredds/wcs/galeon/testdata/striped.nc?request=DescribeCoverage&version=1.0.0&service=WCS
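
A request like this can be assembled in Python. The sketch below is illustrative only: `describe_coverage_url` is a hypothetical helper, and the optional `coverage` parameter (a comma-separated list of coverage names) follows the WCS 1.0.0 key-value convention; whether a given server honors it for several names at once is an assumption to verify.

```python
from urllib.parse import urlencode


def describe_coverage_url(endpoint, coverages=()):
    """Build a WCS 1.0.0 DescribeCoverage URL for a dataset endpoint."""
    params = {"request": "DescribeCoverage", "version": "1.0.0", "service": "WCS"}
    if coverages:
        # Optional: restrict the description to the named coverages.
        params["coverage"] = ",".join(coverages)
    return endpoint + "?" + urlencode(params, safe=",")


endpoint = "http://motherlode.ucar.edu:8080/thredds/wcs/galeon/testdata/striped.nc"
print(describe_coverage_url(endpoint))
print(describe_coverage_url(endpoint, ["ta"]))
```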

The XML returned from the request will contain the following sections:


GetCoverage Request

The strings that must be appended to the dataset URL for this request include:

1. The version, service and request boilerplate specifications, i.e.

request=GetCoverage&version=1.0.0&service=WCS

2. The format specification gives the format in which the data will be returned. As of this writing, the format specification can be either NetCDF3 or GeoTIFF, i.e.

format=NetCDF3 or format=GeoTIFF

3. The coverage specification is found in the description section of the DescribeCoverage response. In the case of the datafile striped.nc, this is ta so the string will be:

coverage=ta

4. The time specification is found in the timePosition section of the DescribeCoverage response. For striped.nc this is 2005-05-10T00:00:00Z, so the string will be:

time=2005-05-10T00:00:00Z

5. The bbox specification is presently assumed to be in the lat/lon of the native projection of the dataset. The longitude range of the striped.nc dataset is (0.0,358.875) and the latitude range is (-89.4375,89.4375). This is found in the lonLatEnvelope section of the DescribeCoverage response, in this case:

<lonLatEnvelope srsName="urn:ogc:def:crs:OGC:1.3:CRS84">
  <gml:pos>0.0 -89.4375</gml:pos>
  <gml:pos>358.875 89.4375</gml:pos>
 ...
</lonLatEnvelope>

wherein the range is given as a box bounded by the lower left (0.0, -89.4375) and upper right (358.875, 89.4375) points of the domain. In case of confusion, one can download this NetCDF dataset from the THREDDS HTTP file server at:

http://motherlode.ucar.edu:8080/thredds/fileServer/galeon/testdata/striped.nc

and check the metadata via the ncdump command.

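6. The vertical specification selects the vertical level of the requested coverage. In the example request below this is 100.0, so the string will be:

vertical=100.0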

A GetCoverage query obtains a subset of the chosen variable, here in NetCDF3 format:

http://motherlode.ucar.edu:8080/thredds/wcs/galeon/testdata/striped.nc?request=GetCoverage&version=1.0.0&service=WCS&format=NetCDF3&coverage=ta&time=2005-05-10T00:00:00Z&vertical=100.0&bbox=-134,11,-47,57
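
The numbered pieces above can be joined into a request URL with a few lines of Python. This is a minimal sketch rather than an official client: `get_coverage_url` and its parameter names are hypothetical, and it simply reproduces the example request shown above.

```python
from urllib.parse import urlencode


def get_coverage_url(endpoint, coverage, fmt, time=None, vertical=None, bbox=None):
    """Assemble a WCS 1.0.0 GetCoverage URL from its individual pieces."""
    # 1. Boilerplate, 2. format, 3. coverage name.
    params = {"request": "GetCoverage", "version": "1.0.0", "service": "WCS",
              "format": fmt, "coverage": coverage}
    if time is not None:
        params["time"] = time                                  # 4. time position
    if vertical is not None:
        params["vertical"] = vertical                          # 6. vertical level
    if bbox is not None:
        params["bbox"] = ",".join(str(v) for v in bbox)        # 5. bounding box
    # Keep ":" and "," unescaped so the URL matches the hand-written form.
    return endpoint + "?" + urlencode(params, safe=":,")


url = get_coverage_url(
    "http://motherlode.ucar.edu:8080/thredds/wcs/galeon/testdata/striped.nc",
    coverage="ta",
    fmt="NetCDF3",
    time="2005-05-10T00:00:00Z",
    vertical=100.0,
    bbox=(-134, 11, -47, 57),
)
print(url)
```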


Using the OWSLib Client Library with TDS

The Python OWSLib OGC client library includes WCS capabilities. The OWSLib examples directory includes an example of how to use OWSLib with THREDDS WCS:

    >>> from owslib.wcs import WebCoverageService
    >>> wcs=WebCoverageService('http://motherlode.ucar.edu:8080/thredds/wcs/fmrc/NCEP/NAM/CONUS_40km/conduit/NCEP-NAM-CONUS_40km-conduit_best.ncd', version='1.0.0')
    >>> wcs.url
    'http://motherlode.ucar.edu:8080/thredds/wcs/fmrc/NCEP/NAM/CONUS_40km/conduit/NCEP-NAM-CONUS_40km-conduit_best.ncd'
    >>> wcs.version
    '1.0.0'
    >>> wcs.identification.service
    'NCEP-NAM-CONUS_40km-conduit_best.ncd'
    >>> wcs.identification.version
    '1.0.0'
    >>> wcs.identification.title
    'NCEP-NAM-CONUS_40km-conduit_best.ncd'
    >>> wcs.identification.abstract
    'Experimental THREDDS/WCS server for CDM gridded datasets'
    >>> wcs.identification.keywords
    [None]
    >>> wcs.identification.fees
    'NONE'
    >>> wcs.identification.accessConstraints
    'NONE'

    # Print the ids of all layers:
    >>> wcs.contents.keys()
    ['Total_Column-Integrated_Cloud_Ice', 'Categorical_Snow', ... , 'Pressure_grid_scale_cloud_bottom']

    # To further interrogate a single "coverage", get the coverageMetadata object.
    # You can either get it from the dictionary:
    >>> cvg = wcs.contents['Temperature']

    # or, even simpler:
    >>> cvg = wcs['Temperature']

    >>> cvg.boundingBoxWGS84
    (-153.21272373696925, 11.968943490602328, -49.029734977681301, 57.381662765023037)

    >>> cvg.timepositions[:3]
    ['2009-01-08T06:00:00Z', '2009-01-08T09:00:00Z', '2009-01-08T12:00:00Z']

    >>> cvg.supportedCRS
    ['EPSG:4326']

    >>> cvg.supportedFormats
    ['GeoTIFF', 'GeoTIFFfloat', 'NetCDF3']

    # Now we have enough information to build a getCoverage request.
    >>> covID = 'Temperature'
    # You should be able to select a range of times, but the server doesn't seem to like it:
    >>> timeRange = ['2009-01-08T06:00:00', '2009-01-08T12:00:00']
    # So for now just choose one timestep (from cvg.timepositions):
    >>> timeRange = ['2009-01-08T09:00:00Z']
    >>> bb = (-140, -15, 30, 55)  # chosen from cvg.boundingBoxWGS84
    >>> formatType = 'NetCDF3'  # chosen from cvg.supportedFormats

    # Make the actual getCoverage request.
    >>> output = wcs.getCoverage(identifier=covID, time=timeRange, bbox=bb, format=formatType)

    # Then write this to a NetCDF file.
    >>> filename = 'threddstest.nc'
    >>> f = open(filename, 'wb')
    >>> f.write(output.read())
    >>> f.close()
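
Since the session above settles on a single timestep taken from `cvg.timepositions`, a small helper can pick the advertised timestep closest to a time of interest. This is a sketch using only the standard library: `nearest_timestep` is a hypothetical name, and it assumes every time position uses the `YYYY-MM-DDTHH:MM:SSZ` form shown in the session.

```python
from datetime import datetime

TIME_FORMAT = "%Y-%m-%dT%H:%M:%SZ"  # form used by the time positions above


def nearest_timestep(timepositions, target):
    """Return the entry of timepositions closest in time to target."""
    wanted = datetime.strptime(target, TIME_FORMAT)
    return min(
        timepositions,
        key=lambda t: abs(datetime.strptime(t, TIME_FORMAT) - wanted),
    )


positions = ['2009-01-08T06:00:00Z', '2009-01-08T09:00:00Z', '2009-01-08T12:00:00Z']
print(nearest_timestep(positions, '2009-01-08T10:00:00Z'))
```

The returned string can be wrapped in a one-element list and passed as the `time` argument of `getCoverage`.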



Using NcML



If you add an NcML element to a THREDDS dataset, the Dataset element then refers to the NcML dataset.
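
To make the shape of such a dataset element concrete, the sketch below generates a catalog fragment in which a dataset element wraps an NcML `netcdf` element. It is illustrative only: the helper name `dataset_with_ncml`, the dataset name, ID, paths and title are all hypothetical, and a real catalog entry would normally sit inside a full catalog document and also name a service.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import quoteattr

NCML_NS = "http://www.unidata.ucar.edu/namespaces/netcdf/ncml-2.2"


def dataset_with_ncml(name, dataset_id, url_path, location, title):
    """Return a catalog dataset fragment whose content is modified by NcML.

    The NcML here only adds (or overrides) a global "title" attribute.
    """
    return (
        f"<dataset name={quoteattr(name)} ID={quoteattr(dataset_id)} "
        f"urlPath={quoteattr(url_path)}>\n"
        f"  <netcdf xmlns={quoteattr(NCML_NS)} location={quoteattr(location)}>\n"
        f'    <attribute name="title" value={quoteattr(title)}/>\n'
        f"  </netcdf>\n"
        f"</dataset>"
    )


# All values below are placeholders for illustration.
fragment = dataset_with_ncml(
    name="Example NcML dataset",
    dataset_id="example-ncml",
    url_path="example/test.nc",
    location="/data/example/test.nc",
    title="Test data with a corrected title",
)
print(fragment)

# Sanity check: the fragment is well-formed XML.
ET.fromstring(fragment)
```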

Description of NcML

The NetCDF Markup Language (NcML) is an XML dialect that allows you to create NetCDF datasets. An NcML document is an XML document that uses NcML to define a NetCDF dataset. Typically, this document refers to a NetCDF dataset called the referenced NetCDF dataset. The purpose and usefulness of NcML is to allow:

Creating NcML Files

This can be done either:

Use of NcML

The IOOS Model Data Interoperability Working Group is most useful for this endeavor.

This is introduced here.

Metadata

Variables

Aggregation

This is explained here.

Forecast Model Run Collection Aggregation

This is explained here.