With the opening of global archives of Earth Observation data streams from satellites, we have arrived at an unprecedented richness of operationally available observations over the whole globe, starting with the Landsat series of satellites and continuing with the plethora of data now available through Copernicus and its series of Sentinel satellites. This has created huge opportunities for research and business, which can now exploit the temporal domain of those observations in a powerful manner, but it also poses challenges in terms of data management and processing capacity. As a consequence, a growing number of cloud services and customized solutions have been developed in various research centres, leading to processing workflows optimized for specific system architectures and back-end infrastructures. This limits the portability and reproducibility of workflows across back-ends, for both science and business applications.
openEO, a Horizon 2020 project (grant 776242), addresses this problem by defining an API between service provider back-ends and client applications. This API can be seen as a common language that defines interfaces for finding, accessing, processing and retrieving data; defining a workflow then only requires a driver specific to the back-end architecture and a corresponding client library.
The project, however, does not stop at defining an API; it also provides implementations for a number of different back-end solutions, ranging from a file-based storage system with processes running in individual containers, through data cubes exposed via OGC web services, to GRASS GIS. On the client side, libraries are developed in Python, R and JavaScript, leaving room to develop Earth Observation applications with very different requirements, from web visualization to extensive time series analysis in a research setting. In addition to a large set of simple processes, user-defined functions allow users to submit more complicated processes as Python or R code, e.g. for dedicated time series models, to be run on the imagery data.
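As an illustration of the kind of dedicated time series model a user-defined function might carry, the following minimal Python sketch fits a linear trend to each pixel's time series. It is not the openEO user-defined function interface: the function names `linear_trend` and `trend_per_pixel`, and the plain-list representation of the imagery, are assumptions made for this example only, and how a back-end passes data to user code is not shown.

```python
def linear_trend(times, values):
    """Ordinary least-squares fit of values against times.

    Returns (slope, intercept). Hypothetical helper, not part of any
    openEO library.
    """
    if len(times) != len(values):
        raise ValueError("times and values must have the same length")
    n = len(times)
    if n < 2:
        raise ValueError("at least two observations are required")
    mean_t = sum(times) / n
    mean_v = sum(values) / n
    # Sum of squared deviations of the time axis.
    sxx = sum((t - mean_t) ** 2 for t in times)
    if sxx == 0:
        raise ValueError("all time stamps are identical")
    # Sum of cross-deviations between time and value.
    sxy = sum((t - mean_t) * (v - mean_v) for t, v in zip(times, values))
    slope = sxy / sxx
    intercept = mean_v - slope * mean_t
    return slope, intercept


def trend_per_pixel(times, pixel_series):
    """Apply the trend model to every pixel; one slope per pixel."""
    return [linear_trend(times, series)[0] for series in pixel_series]
```

For example, `trend_per_pixel([0, 1, 2], [[1, 2, 3], [5, 5, 5]])` yields one slope per pixel, with the second pixel showing no trend.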
All drivers, on both the client and the back-end side, are developed as open source software and can be found in the project’s GitHub organization (https://github.com/Open-EO). This enables everybody to look into the available implementations and develop their own adaptations based on what is needed for specific back-end solutions. Nothing prevents anyone from defining new client libraries in other languages or developing plugins for common software such as QGIS or the Sentinel Toolboxes.
A proof of concept demonstrating successful interaction between different clients and back-ends, including processing, has been published (http://openeo.org/openeo/news/2018/03/17/poc.html).
The openEO API provides definitions for data discovery; for processing, and especially for chaining predefined processes together with user-defined functions; for data and result access via download; and, for commercial implementations, a framework for managing user content and billing.
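The idea of chaining processes can be sketched as a small graph in which each node names a process and refers to the output of another node. The Python example below is a simplified illustration, not the normative API: the node identifiers, the process names (`load_collection`, `ndvi`, `save_result`), the argument names, the collection identifier and the helper functions are all assumptions chosen for this sketch, and the current openEO specification should be consulted for the actual format.

```python
def build_process_graph(collection_id, bbox, time_range):
    """Chain three illustrative processes: load, compute an index, save."""
    return {
        "load": {
            "process_id": "load_collection",  # assumed process name
            "arguments": {
                "id": collection_id,
                "spatial_extent": bbox,
                "temporal_extent": time_range,
            },
        },
        "ndvi": {
            "process_id": "ndvi",  # assumed process name
            "arguments": {"data": {"from_node": "load"}},
        },
        "save": {
            "process_id": "save_result",  # assumed process name
            "arguments": {"data": {"from_node": "ndvi"}, "format": "GTiff"},
            "result": True,
        },
    }


def referenced_nodes(value):
    """Collect node ids referenced via {"from_node": ...} in nested arguments."""
    if isinstance(value, dict):
        if "from_node" in value:
            return [value["from_node"]]
        return [n for v in value.values() for n in referenced_nodes(v)]
    if isinstance(value, list):
        return [n for v in value for n in referenced_nodes(v)]
    return []


def execution_order(graph):
    """Order nodes so that every node comes after the nodes it depends on."""
    order, visiting = [], set()

    def visit(node_id):
        if node_id in order:
            return
        if node_id in visiting:
            raise ValueError("cycle detected at node " + node_id)
        visiting.add(node_id)
        for dependency in referenced_nodes(graph[node_id]["arguments"]):
            visit(dependency)
        visiting.discard(node_id)
        order.append(node_id)

    for node_id in graph:
        visit(node_id)
    return order
```

A client would serialize such a graph (for instance as JSON) and submit it to a back-end, which resolves the references to decide the order in which the processes run.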
Development of the openEO API and the definition of supported core processes are driven by the expertise of the project consortium members and by a set of use cases comprising typical analytical challenges for environmental monitoring based on Earth Observation data. A hackathon and a user workshop have provided additional feedback. Further such events are planned for later this year.
Version 0.4 has been released, providing a first attempt at defining core processes, and back-end developers are currently implementing this core functionality following the latest specification. Data discovery has been closely linked to other emerging standards such as the SpatioTemporal Asset Catalog, STAC (https://github.com/radiantearth/stac-spec), and the Web Feature Service in its latest development version 3.0 (WFS 3.0, https://github.com/opengeospatial/WFS_FES).