Prof. Dr. Illia Horenko
Home University: Institute of Computational Science, Via Giuseppe Buffi 13, +41 58 666 4123, illia.horenko at usi.ch

A central scientific issue in our work programme, which recurs in several of the individual projects, is the unbiased characterization of observation, measurement, and simulation data. Over the past several years, Prof. Horenko has developed nonparametric, nonstationary, nonhomogeneous data analysis techniques which, in our view, are among the most advanced methodologies in this field. Besides introducing some fundamentally new techniques, he has also tested them against and applied them to real-life data from a range of application areas that are part of the CRC 1114, such as Meteorology and BioInformatics, with related publications in high-ranking journals. He has already accounted for the challenges that arise from the sheer amount of data that has to be processed in real-life applications to obtain robust and credible results, and his group has produced high-performance-ready implementations of these data analysis algorithms.
Prof. Horenko's FEM-BV family of time series analysis techniques allows for systematic time-dependent model identification when assumptions of stationarity or homogeneity of some underlying statistics are not justifiable. Finite Element Methods are employed in the numerical representation of indicator functions for the space-time domains of applicability of different models from a common model class. These indicators are regularized using a Bounded Variation constraint, hence the acronym FEM-BV. The choice of the model class from which to select the individual models in each of the regimes depends on the type of data considered.
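For orientation, a regime-indicator approach of this kind can be written as a constrained minimization. The following is a sketch under stated assumptions, not a formula taken from this page: the symbols $x_t$ (data at time $t$), $\gamma_k(t)$ (indicator of regime $k$), $\theta_k$ (parameters of the $k$-th local model), $g$ (model-data distance), $K$, $T$, and the bound $C$ are all notation introduced here, and only the time direction is shown.

```latex
\min_{\Gamma,\Theta}\; L(\Theta,\Gamma)
  = \sum_{t=1}^{T}\sum_{k=1}^{K} \gamma_k(t)\, g(x_t,\theta_k)
\quad\text{subject to}\quad
\sum_{k=1}^{K}\gamma_k(t) = 1,\qquad
\gamma_k(t) \ge 0,\qquad
\sum_{t=1}^{T-1} \bigl|\gamma_k(t+1)-\gamma_k(t)\bigr| \le C
\quad\text{for all } k .
```

In this reading, the last inequality plays the role of the Bounded Variation constraint: it limits how often each indicator may change, so that regimes persist over time. The choice of $g$ corresponds to the choice of model class, for example a squared distance to a centroid for K-means or a prediction residual for an autoregressive model.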
Implemented versions include Vector AutoRegressive models with eXternal influences (VARX) with finite memory depth, K-means for geometric clustering of continuous data, Empirical Orthogonal Function (EOF) decompositions for model reduction of continuous data in high-dimensional vector spaces, Markov/Bernoulli models for discrete (categorical) data, and Generalized Extreme-Value distributions (GEV) for regression analysis with emphasis on extreme event characteristics. The number of different spatio-temporal regimes, the model parameters to be chosen within these regimes, such as memory depth and number of EOFs, and the indicator functions signalling activation of the respective models are all determined simultaneously in a global optimization procedure. This yields a judicious compromise between low residuals in reproducing the data of a training set on the one hand, and the demand for the smallest possible overall number of free parameters of the complete model on the other. The optimization is based on a new nonparametric modified Akaike Information Criterion (mAIC) and may be interpreted as a constructive implementation of "Occam's Razor". By directly addressing a scalar model error functional that characterizes the model-data distance, the optimization problem remains solvable in high dimensions. Versions of this methodology have been applied successfully to a variety of data from different application areas.
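The trade-off between low residuals and few free parameters can be illustrated with a deliberately simplified Python sketch. It is not Prof. Horenko's implementation. It assumes a K-means model class, hard (0/1) regime indicators, a penalty on the number of regime switches in place of a hard Bounded Variation bound, and a standard AIC-style score standing in for the mAIC, whose definition is not given here. All function names and the parameter count (centroid entries plus number of switches) are hypothetical choices made for illustration.

```python
import numpy as np

def penalized_path(cost, switch_penalty):
    """Regime sequence minimizing sum_t cost[t, k_t] + switch_penalty * (#switches).

    Solved exactly by dynamic programming (assumption: hard indicators,
    penalty instead of a hard bounded-variation constraint)."""
    T, K = cost.shape
    total = cost[0].copy()                  # best cost of a path ending in each regime
    back = np.zeros((T, K), dtype=int)      # back-pointers for path recovery
    switch = switch_penalty * (1.0 - np.eye(K))
    for t in range(1, T):
        trans = total[:, None] + switch     # trans[i, j]: come from regime i, go to j
        back[t] = trans.argmin(axis=0)
        total = trans.min(axis=0) + cost[t]
    path = np.empty(T, dtype=int)
    path[-1] = total.argmin()
    for t in range(T - 1, 0, -1):
        path[t - 1] = back[t, path[t]]
    return path

def fit_regimes(x, n_regimes, switch_penalty, n_iter=50):
    """Alternate between regime assignment and centroid update (K-means model class)."""
    x = np.asarray(x, dtype=float)
    T = x.shape[0]
    # Deterministic start: centroids taken at evenly spaced time points.
    centers = x[np.linspace(0, T - 1, n_regimes).astype(int)].copy()
    labels = None
    for _ in range(n_iter):
        # Squared distance of every sample to every centroid, shape (T, K).
        cost = ((x[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        new_labels = penalized_path(cost, switch_penalty)
        if labels is not None and np.array_equal(new_labels, labels):
            break
        labels = new_labels
        for k in range(n_regimes):
            if np.any(labels == k):         # leave unused centroids unchanged
                centers[k] = x[labels == k].mean(axis=0)
    rss = float(((x - centers[labels]) ** 2).sum())
    return labels, centers, rss

def aic_like_score(x, labels, centers, rss):
    """AIC-style score under Gaussian residuals; a stand-in, not the mAIC."""
    n = x.size
    n_switches = int(np.count_nonzero(np.diff(labels)))
    n_params = centers.size + n_switches    # hypothetical parameter count
    return n * np.log(max(rss, 1e-12) / n) + 2 * n_params

def select_n_regimes(x, candidates, switch_penalty):
    """Fit each candidate number of regimes and keep the lowest score."""
    scores = {}
    for k in candidates:
        labels, centers, rss = fit_regimes(x, k, switch_penalty)
        scores[k] = aic_like_score(x, labels, centers, rss)
    return min(scores, key=scores.get), scores

# Synthetic series with one regime change halfway through.
rng = np.random.default_rng(0)
x = np.concatenate([rng.normal(0.0, 0.5, (100, 1)), rng.normal(5.0, 0.5, (100, 1))])
labels, centers, rss = fit_regimes(x, n_regimes=2, switch_penalty=5.0)
best_k, scores = select_n_regimes(x, candidates=[1, 2, 3], switch_penalty=5.0)
```

Unlike the simultaneous global optimization described above, this sketch searches over the number of regimes in an outer loop and alternates between assignment and model fitting in an inner loop, which only guarantees a local optimum. The switch penalty controls regime persistence: a larger value yields fewer, longer regimes.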
Project cooperations with the Mercator Fellow:
 Within Project A01, an appropriate version of these techniques will serve as an independent, data-based method for optimizing hierarchical multiscale stochastic precipitation models, and for the quantitative data-based evaluation of the project's hypotheses and theoretical derivations.
 In the most recent work, the model identification procedures have been generalized for successive incorporation of new data as they become available over time. This extension is based on Bayesian learning ideas and complements the framework of the Data Assimilation project A02. Here, Prof. Horenko's techniques could be used to represent and assimilate the effects of possible non-observed external influences, which materialize in the FEM-BV family of models as regime changes in the identified FEM-BV indicator functions.
 Project B01 will generate a wealth of three-dimensional displacement fields from laboratory "earthquakes". The FEM-BV-VARX techniques, in combination with model reduction in terms of spatial patterns, e.g., through EOF decompositions, will allow for a detailed characterization of these data that goes considerably beyond what is currently available in this laboratory setting. At the same time, this methodology will provide insights into connections between the measured three-dimensional fields and displacements measured at the surface. This is important because only surface displacements can be measured directly out in the field. The three-dimensional displacements under an observed surface are "non-observed degrees of freedom" in the sense of the FEM-BV technology, and their influence is reflected in potential model regime changes. In conjunction with the three-dimensional laboratory measurements, there is a unique opportunity to establish a direct, quantifiable connection between such three-dimensional processes and the surface displacements.
 Project B04 investigates the compact representation of complex data using tensor-product decompositions, considering direct numerical simulations of turbulent flows and experimental as well as simulation data from project B01 (see above). There are two promising routes of development in conjunction with Prof. Horenko's data analysis techniques in this project. The first route consists in a mutual benchmarking of the data compression capability of the tensor-product decompositions against what is achievable using Prof. Horenko's EOF-based multiple-regime representations. The second route involves extending the data analysis technology by incorporating tensor-product representations in the data representation ansatz. Tensor-product decompositions could replace the EOF-based decompositions in cases where the data reveal scale self-similarities.
 Prof. Horenko will also guide the data-based development of triggering mechanisms for unstable updrafts in Project C06.
Project publications (Mercator Fellow)
Gerber, S. and Olsson, S. and Noé, F. and Horenko, I. (2018) A scalable approach to the computation of invariant measures for high-dimensional Markovian systems. Sci. Rep., 8 (1796). ISSN 2045-2322
Pospisil, L. and Gagliardini, P. and Sawyer, W. and Horenko, I. (2017) On a scalable nonparametric denoising of time series signals. Communications in Applied Mathematics and Computational Science. pp. 1-28. (In Press)
Gerber, S. and Horenko, I. (2017) Toward a direct and scalable identification of reduced models for categorical processes. Proceedings of the National Academy of Sciences, 114 (19). pp. 4863-4868.
O'Kane, T.J. and Monselesan, D.P. and Risbey, J.S. and Horenko, I. and Franzke, Ch.L.E. (2017) On memory, dimension, and atmospheric teleconnection patterns. Math. Clim. Weather Forecast, 3 (1). pp. 1-27.
Horenko, I. and Gerber, S. and O'Kane, T.J. and Risbey, J.S. and Monselesan, D.P. (2017) On inference and validation of causality relations in climate teleconnections. In: Nonlinear and Stochastic Climate Dynamics. Cambridge University Press, pp. 184-208. ISBN 9781107118140
Horenko, I. and Gerber, S. (2015) Improving clustering by imposing network information. Science Advances, 1 (7). ISSN 2375-2548
O'Kane, T.J. and Risbey, J.S. and Monselesan, D.P. and Horenko, I. and Franzke, Ch.L.E. (2015) On the dynamics of persistent states and their secular trends in the waveguides of the Southern Hemisphere troposphere. Climate Dynamics. ISSN Print: 0930-7575, Online: 1432-0894