UncertML - Describing and communicating uncertainty within the (semantic) web

Mr Matt Williams
Computer Science, Aston University

Date: 14th October 2008 (Tuesday)
Time: 14:00 - 15:00
Venue: MB564

The Semantic Web relies on carefully structured, well-defined data to allow machines to communicate with and understand one another. In many domains (e.g. geospatial) the data being described contains some uncertainty, often due to bias, observation error or incomplete knowledge. Meaningful processing of this data requires these uncertainties to be carefully analysed and integrated into the processing chain. Currently there is no standard mechanism within the Semantic Web for the interoperable description and exchange of uncertain information. This makes automated processing of such information impractical, particularly where error must be captured as it propagates through a processing sequence. We adopt a Bayesian perspective and focus on the case where the inputs and outputs are naturally treated as random variables.

I will present a solution to this problem in the form of the Uncertainty Markup Language (UncertML). UncertML is a conceptual model, realised as an XML schema, that allows uncertainty to be quantified in three ways: as realisations, as statistics or as probability distributions.

The INTAMAP (INTeroperability and Automated MAPping) project provides a use case for UncertML. I will demonstrate how observation errors can be quantified using UncertML and wrapped within an Observations & Measurements (O&M) Observation. An interpolation Web Processing Service (WPS) uses the uncertainty information within these observations to improve its predictions. The output uncertainties from this WPS may also be encoded in a variety of UncertML types, e.g. a series of marginal Gaussian distributions, a set of statistics such as the first three marginal moments, or a set of realisations from a Monte Carlo treatment. Quantifying and propagating uncertainty in this way allows the interpolation results to be consumed by other services, for example as part of a risk management chain or a decision support system, and ultimately paves the way for complex data processing chains in the Semantic Web.
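
As a rough illustration of the three kinds of description mentioned above, the following Python sketch expresses the uncertainty of a single predicted value as a Gaussian distribution, as a set of Monte Carlo realisations, and as summary statistics. It is not UncertML and does not use the UncertML schema: the names Gaussian, realisations, first_three_moments and fit_gaussian are hypothetical, the example values are invented, and reading "the first three marginal moments" as mean, variance and skewness is an assumption.

```python
import random
import statistics
from dataclasses import dataclass


@dataclass
class Gaussian:
    """Hypothetical container for a marginal Gaussian (not an UncertML type)."""
    mean: float
    variance: float


def realisations(dist, n, seed=0):
    """Draw n Monte Carlo realisations from a Gaussian distribution."""
    rng = random.Random(seed)
    sd = dist.variance ** 0.5
    return [rng.gauss(dist.mean, sd) for _ in range(n)]


def first_three_moments(samples):
    """Summarise samples by mean, variance and skewness (assumed reading)."""
    mean = statistics.fmean(samples)
    var = statistics.pvariance(samples, mu=mean)
    third = statistics.fmean((x - mean) ** 3 for x in samples)
    skew = third / var ** 1.5 if var > 0 else 0.0
    return {"mean": mean, "variance": var, "skewness": skew}


def fit_gaussian(samples):
    """Moment-match a Gaussian to a set of realisations."""
    moments = first_three_moments(samples)
    return Gaussian(moments["mean"], moments["variance"])


# Invented example: a predicted value of 10.0 with variance 4.0.
prediction = Gaussian(mean=10.0, variance=4.0)   # distribution
samples = realisations(prediction, n=20000)      # realisations
summary = first_three_moments(samples)           # statistics
recovered = fit_gaussian(samples)                # back to a distribution
```

Moving between the three forms loses information in one direction only: realisations can always be summarised as statistics or moment-matched to a distribution, but statistics alone do not determine the distribution they came from.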
