Need for Expressive Stream Reasoning Benchmark
Various benchmarks exist to evaluate expressive reasoning over static data sources. Likewise, a plethora of benchmarks exists to evaluate RDF Stream Processing, but with no or only limited support for reasoning (RDFS reasoning at most). However, no benchmarks are available to evaluate expressive reasoning over continuous data streams. Therefore, we introduce OWL2Streams, an evaluation framework for expressive stream reasoning.
Why Expressive Reasoning?
Many domains, such as Pervasive Health or the Internet of Things (IoT), rely on rich background knowledge and require expressive reasoning to fully capture it. Furthermore, most ontologies labeled as IoT on Linked Open Vocabularies require expressive reasoning techniques, such as OWL2 DL, to be fully interpreted.
Pushing Optimizations
OWL2Streams aims at pushing optimizations for expressive reasoning over continuous data streams. It makes a distinction between static data, describing background information, and streaming data. OWL2Streams pushes optimizations for the following challenges:
- Size of the static data: the larger the size of the static data, the slower the reasoning. Optimizations can focus on only extracting relevant parts of the static data.
- Expressivity of the ontology: higher expressivity increases the reasoning time. Optimizations can focus on extracting only relevant statements from the ontology TBox.
- Frequency of the stream: the faster the updates in the stream, the faster the reasoning process needs to be able to provide results. Optimizations can focus on first filtering relevant facts from the data.
- Number of parallel streams: more streams imply more data, and additional challenges arise when data from various streams need to be correlated.
- Numeric data streams: in domains such as the IoT, many streams contain numeric data. However, data property reasoning is not supported by many OWL2 reasoners. Optimizations could focus on filtering out events with numeric values that are not useful.
- Data expiration: besides reasoning over the addition of data, reasoning over the removal of data is needed as well, and it is even more complex. Optimizations could investigate adapted incremental reasoning techniques for expressive reasoning (currently, none exist for OWL2 DL).
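To make the stream-side challenges above more concrete, here is a minimal Python sketch of two of them: pre-filtering irrelevant numeric events and expiring old data from a sliding window. It is not part of OWL2Streams. The `Event` and `SlidingWindow` names, the property names, and the window width are all hypothetical, and the sketch assumes events arrive in timestamp order.

```python
from collections import deque
from dataclasses import dataclass


@dataclass(frozen=True)
class Event:
    """A hypothetical numeric observation taken from a stream."""
    timestamp: float  # seconds (assumed time unit)
    subject: str      # e.g. a room or sensor identifier
    prop: str         # name of the observed property
    value: float


class SlidingWindow:
    """Keeps only relevant events from the last `width` seconds."""

    def __init__(self, width, relevant_props):
        self.width = width
        self.relevant_props = set(relevant_props)
        self.events = deque()

    def add(self, event):
        """Add an event and return the events that expired as a result."""
        # Pre-filter: skip events whose property is not considered relevant.
        if event.prop not in self.relevant_props:
            return []
        self.events.append(event)
        return self.expire(event.timestamp)

    def expire(self, now):
        """Remove and return all events older than the window width."""
        expired = []
        while self.events and self.events[0].timestamp <= now - self.width:
            expired.append(self.events.popleft())
        return expired


window = SlidingWindow(width=10.0, relevant_props={"co2"})
window.add(Event(0.0, "room1", "co2", 420.0))
window.add(Event(1.0, "room1", "humidity", 0.4))  # filtered out
window.add(Event(5.0, "room2", "co2", 900.0))
expired = window.add(Event(12.0, "room1", "co2", 450.0))
```

The events returned by `add` are the ones a reasoner would have to retract. Handling those retractions incrementally, instead of recomputing everything from the remaining window content, is the hard part that the data expiration challenge refers to.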
Supported Scenarios
At this point, OWL2Streams supports three different scenarios focussing on different optimization opportunities. These scenarios are:
- Smart City: an adaptation of CityBench containing more elaborate background knowledge.
- Smart Building: a COVID-19 scenario, consisting of sensors that measure the air quality in various rooms. The domain knowledge makes it possible to infer whether certain rooms have a higher probability of COVID-19 infections, based on the air quality and the activities in each room.
- University Management: an adaptation of the OWL2Bench benchmark. This scenario consists of a university where students register and enroll in a certain program. The data stream consists of the enrollment of the students.
Conclusion
OWL2Streams is an evaluation framework for pushing the expressivity of reasoning over continuous data streams. If you are interested, check out the GitHub repository or contact me for more information.