Bridging the gap between expressivity and efficiency in stream reasoning: a structural caching approach for IoT streams


Data streams are increasingly prevalent in today’s data landscape, mainly due to the rise of data-intensive domains such as the Internet of Things (IoT), Smart Industries, Pervasive Health, and Social Media. To extract meaningful insights from these streams, they must be processed in real time. This processing also involves an integration problem, because the streams need to be combined with more static data and with domain knowledge. Ontologies are well suited to modeling this domain knowledge, and they facilitate the integration of heterogeneous data within data-intensive domains such as the IoT. Expressive reasoning techniques, such as OWL2 DL reasoning, are needed to fully interpret the domain knowledge and to derive meaningful decisions. However, expressive reasoning techniques have mainly focused on static data environments, as they tend to become slow as datasets grow. There is thus a mismatch between expressive reasoning and the real-time requirements of data-intensive domains. In this paper, we take a first step towards bridging the gap between expressivity and efficiency when reasoning over high-velocity IoT data streams for the task of event enrichment. We present a structural caching technique that eliminates recurring reasoning steps by exploiting a characteristic of most IoT streams: they typically produce events that are similar in structure and size. Our caching technique speeds up reasoning by up to thousands of times for fully fledged OWL2 DL reasoners, and by tens to hundreds of times for less expressive OWL2 RL and OWL2 EL reasoners.
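
The idea of caching on event structure can be illustrated with a minimal sketch. It is not the implementation evaluated in the paper. It makes several assumptions: events are small lists of (subject, predicate, object) triples emitted in a consistent order, the inferred facts do not depend on concrete individual names or literal values, and the reasoner is a hypothetical stand-in function (`toy_reasoner`) rather than a real OWL2 reasoner. All names (`abstract_event`, `StructuralCache`, the `ex:` terms) are illustrative.

```python
from typing import Callable, Dict, List, Tuple

Triple = Tuple[str, str, str]
RDF_TYPE = "rdf:type"


def abstract_event(event: List[Triple]) -> Tuple[Tuple[Triple, ...], Dict[str, str]]:
    """Replace concrete names and values by placeholders, keeping predicates and classes.

    Assumption: triples arrive in a consistent order, so structurally
    identical events map to the same pattern.
    """
    mapping: Dict[str, str] = {}  # concrete term -> placeholder

    def placeholder(term: str) -> str:
        if term not in mapping:
            mapping[term] = f"?x{len(mapping)}"
        return mapping[term]

    pattern = []
    for s, p, o in event:
        s_abs = placeholder(s)
        # Class names (objects of rdf:type) are part of the structure and are kept.
        o_abs = o if p == RDF_TYPE else placeholder(o)
        pattern.append((s_abs, p, o_abs))
    return tuple(pattern), mapping


class StructuralCache:
    """Caches reasoning results per event structure (hypothetical helper)."""

    def __init__(self, reasoner: Callable[[List[Triple]], List[Triple]]):
        self.reasoner = reasoner
        self.cache: Dict[Tuple[Triple, ...], List[Triple]] = {}
        self.reasoner_calls = 0

    def enrich(self, event: List[Triple]) -> List[Triple]:
        pattern, mapping = abstract_event(event)
        if pattern not in self.cache:
            # Cache miss: run the (expensive) reasoner once on the abstract pattern.
            self.reasoner_calls += 1
            self.cache[pattern] = self.reasoner(list(pattern))
        # Cache hit or miss: map placeholders back to this event's concrete terms.
        inverse = {ph: term for term, ph in mapping.items()}
        return [(inverse.get(s, s), p, inverse.get(o, o))
                for s, p, o in self.cache[pattern]]


def toy_reasoner(triples: List[Triple]) -> List[Triple]:
    """Stand-in for an OWL2 reasoner: applies a single subclass rule."""
    return [(s, RDF_TYPE, "ex:Observation")
            for s, p, o in triples
            if p == RDF_TYPE and o == "ex:TemperatureObservation"]


cache = StructuralCache(toy_reasoner)
event_1 = [("ex:obs1", RDF_TYPE, "ex:TemperatureObservation"),
           ("ex:obs1", "ex:madeBy", "ex:sensorA"),
           ("ex:obs1", "ex:hasValue", "21.5")]
event_2 = [("ex:obs2", RDF_TYPE, "ex:TemperatureObservation"),
           ("ex:obs2", "ex:madeBy", "ex:sensorB"),
           ("ex:obs2", "ex:hasValue", "22.0")]
result_1 = cache.enrich(event_1)
result_2 = cache.enrich(event_2)
```

In this sketch both events share one structure, so the stand-in reasoner is invoked only for the first event, and the second is enriched from the cached pattern. A real setting would also need a canonical ordering of triples and a check that the reasoning outcome is truly independent of the abstracted values.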