Installing the Open Data Cube
Get up and running with the ODC
OVERVIEW
Find the ODC on GitHub: https://github.com/opendatacube
The Open Data Cube is a collection of software designed to:
- Catalogue large amounts of Earth Observation data
- Provide a Python-based API for high-performance querying and data access
- Give scientists and other users an easy way to perform exploratory data analysis
- Allow scalable, continent-scale processing of the stored data
- Track the provenance of all contained data to allow for quality control and updates
The ODC can be deployed on various computing platforms. Possible deployments include:
- Local deployment (e.g., high-end workstation)
- Cloud (e.g., Amazon Web Services)
- High Performance Computing infrastructure (e.g., NCI)
The Open Data Cube software is based around the datacube-core library.
datacube-core is an open source Python library, released under the Apache 2.0 license.
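As a rough illustration of the Python API, the sketch below assembles a query and passes it to the `load` method of a `datacube.Datacube` object from datacube-core. The product name, measurement names, coordinates, and dates are placeholders chosen for illustration; the real values depend on what has been indexed in your own cube.

```python
def build_query(product, lon_range, lat_range, time_range, measurements=None):
    """Assemble keyword arguments for Datacube.load()."""
    query = {
        "product": product,
        "x": tuple(lon_range),      # longitude bounds
        "y": tuple(lat_range),      # latitude bounds
        "time": tuple(time_range),  # start and end of the period of interest
    }
    if measurements:
        query["measurements"] = list(measurements)
    return query


def load_data(dc, query):
    """Run the query against an open Datacube connection."""
    return dc.load(**query)


# Hypothetical values, for illustration only.
example_query = build_query(
    product="ls8_example_product",
    lon_range=(149.0, 149.2),
    lat_range=(-35.4, -35.2),
    time_range=("2018-01-01", "2018-03-31"),
    measurements=["red", "nir"],
)

# With datacube-core installed and a database configured:
#   import datacube
#   dc = datacube.Datacube(app="getting-started")
#   dataset = load_data(dc, example_query)  # an xarray.Dataset
```

Keeping the query as a plain dictionary makes it easy to reuse the same area and time range across several products.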
INSTALLATION
These installation instructions build a Data Cube environment that can be used to ingest data (with config files from GitHub) and run analytics processes. The full Open Data Cube installation documentation can be found here.
The Data Cube is written in Python and has dependencies including:
- Python 3.5+ (3.6 recommended)
- GDAL
- PostgreSQL database
These dependencies, along with the target operating system environment, should be considered when deciding how to install the Data Cube to meet your system requirements.
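Before installing, it can help to see which of these dependencies are already present. The sketch below is one possible check using only the Python standard library. The module names it looks for (`osgeo` for the GDAL Python bindings, `psycopg2` for the PostgreSQL driver) and the `psql` client on the PATH are assumptions about a typical setup, not requirements stated on this page.

```python
import importlib.util
import shutil
import sys

MIN_PYTHON = (3, 5)  # minimum version listed above


def check_environment():
    """Report which Data Cube dependencies appear to be available."""
    return {
        "python_ok": sys.version_info[:2] >= MIN_PYTHON,
        # Assumed module names for a typical installation.
        "gdal_bindings": importlib.util.find_spec("osgeo") is not None,
        "psycopg2": importlib.util.find_spec("psycopg2") is not None,
        "datacube": importlib.util.find_spec("datacube") is not None,
        # PostgreSQL command-line client, if installed locally.
        "psql_on_path": shutil.which("psql") is not None,
    }


for name, ok in check_environment().items():
    print("{:<15} {}".format(name, "yes" if ok else "no"))
```

A "no" for `psql_on_path` is not necessarily a problem, since the database may run on another host.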
The recommended method for installing the Data Cube is to use a container and package manager. The instructions below are for Miniconda and PostgreSQL.
Other methods to build and install the Data Cube are maintained by the community and are available at https://github.com/opendatacube/documentation. These may include Docker recipes and operating-system-specific deployments.
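Whichever install method is used, datacube-core needs to know how to reach the PostgreSQL database. It reads connection settings from a configuration file with a `[datacube]` section. The sketch below generates such a file's contents; the database name, host, and user are placeholder values, so substitute your own.

```python
import configparser
import io


def make_datacube_config(database, hostname="localhost", username=None, password=None):
    """Return the text of a datacube configuration file."""
    # Interpolation is disabled so a '%' in a password is written literally.
    parser = configparser.ConfigParser(interpolation=None)
    parser["datacube"] = {"db_database": database, "db_hostname": hostname}
    if username:
        parser["datacube"]["db_username"] = username
    if password:
        parser["datacube"]["db_password"] = password
    buffer = io.StringIO()
    parser.write(buffer)
    return buffer.getvalue()


# Placeholder values, for illustration only.
config_text = make_datacube_config("datacube", username="odc_user")
print(config_text)
```

The resulting text is typically saved as `~/.datacube.conf`, after which `datacube system init` creates the database schema. Check the full installation documentation for the options supported by your version.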
Overview of ODC installation
ACCESSING THE ODC
ODC Reference install - Cube in a Box
A distributable, ready-to-run reference install is available as the "ODC Reference Install", or Cube in a Box (CIAB). Whereas the Sandbox provides an accessible, externally managed platform for trialling the features of the Open Data Cube, the Reference Install provides a ready-to-run, independent Open Data Cube on an organization's own resources. See our Cube in a Box page for more information.
The ODC Sandbox
A demonstration Data Cube Sandbox is available as an entry point for getting started with the Open Data Cube (a link will be added once it is public). The Sandbox is a JupyterHub Python notebook server with individual workspaces and the global Collection 1 Landsat 8 AWS PDS archive indexed. See our ODC Sandbox page for more information.
ODC Web UI Demo
The Web User Interface (UI) is a web application that allows developers to interactively showcase and visualize the output of algorithms. A demo of the ODC Web UI is hosted on Amazon Web Services.
SYSTEM REQUIREMENTS
Before you get started, make sure you have a suitable system. Given the flexibility of deployment environments and the breadth of applications and data sizes, there is no single set of requirements. However, all scalable Data Cube systems share some of the same properties:
- Shared storage
- Storage capacity for both original datasets and their ingested counterparts
- High memory capacity
- Large processing capacity
- High-availability, high-bandwidth internet connection
DATA CATALOG: GETTING DATA INTO THE CUBE
Once you have the Data Cube software installed and connected to a database, you can start to load in some data. Documentation describing the steps for adding data to the Open Data Cube is located here.
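As a rough outline of those steps, a product definition is registered first and then individual dataset documents are added against it, using the `datacube` command-line tool. The sketch below only builds the commands; the file names are hypothetical, and the exact workflow for your data should be taken from the documentation.

```python
import shlex


def indexing_commands(product_definition, dataset_documents):
    """Build the datacube CLI calls for adding a product and its datasets."""
    commands = [["datacube", "product", "add", product_definition]]
    for document in dataset_documents:
        commands.append(["datacube", "dataset", "add", document])
    return commands


# Hypothetical file names, for illustration only.
for argv in indexing_commands("product.yaml", ["scene_001/metadata.yaml"]):
    print(" ".join(shlex.quote(arg) for arg in argv))
```

Each command list can be passed to `subprocess.run(argv, check=True)` once the Data Cube is installed and the database has been initialised.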
ODC works with Analysis Ready Data (ARD)
Free and open Analysis Ready Data (ARD) is needed to support a diverse set of applications. These data include, but are not limited to, optical and radar data at various spatial resolutions (coarse, medium and fine). There is also a need to use multiple datasets together, either through interoperable methods, where the datasets remain separate but complement one another, or through merged products, where the datasets are combined to improve temporal sampling or for sensor fusion.
Integrating Non-Space Datasets
The Data Cube can also store non-space datasets, such as common raster data for precipitation or temperature. Coming soon: this page will detail how the ODC ingests and manages non-space datasets.
Data Catalog: Data Sets currently supported by the ODC community
Below is a list of space datasets supported by the ODC community. This list will continue to expand and will include links to documentation on how to acquire the data and pre-process it to ARD (if needed).
- Landsat 5 / 7 / 8: ARD (surface reflectance, USGS Collection-1, UTM projection, 30m)
- Landsat 5 / 7 / 8: ARD (surface reflectance, from LEDAPS and NBAR, Albers projection, 25m)
- Sentinel-1: ARD (gamma nought, 10m)
- Sentinel-1: ARD (gamma nought, Albers projection, 12.5m)
- Sentinel-2: Level-1C (MSI TOA reflectance, Albers projection, 10/20/60m)
- ALOS-1/2 PALSAR Annual Mosaics: ARD (gamma nought, WGS84, 25m)
- ASTER Digital Elevation Model (DEM): ARD (elevation)
- MODIS - MCD43: ARD (BRDF Albedo, 16-Day L3 Global 500m)
QUESTIONS?
If you are having issues, please feel free to reach us through our support channels:
- GitHub: https://github.com/opendatacube
- GIS Stack Exchange: https://gis.stackexchange.com/questions/tagged/open-data-cube
- Email: info@opendatacube.org