This dissertation is about analyzing and visualizing datasets using basis selection techniques for matrix approximation. A large portion of the previous work in basis selection and matrix approximation has been focused entirely on new algorithms to improve specific measures of quality and has been largely motivated by the goals of reducing runtime and minimizing the error introduced. We contribute to that body of knowledge, but we also enlarge the types of motivating problems and interesting applications available to basis selection techniques.
Specifically, in addition to contributing to well-studied problems, such as the computational aspects of kernel-based learning and general low-rank matrix approximations, we also introduce two real-world problems where basis selection aids in significant ways: subset-based visualization and proxy-construction for uncertainty quantification of resource-demanding simulations. We hope this dissertation will motivate others to study and extend ideas and techniques that are specifically motivated by these fascinating problems.
The full set of concepts discussed here can be categorized into two fundamental ideas: using appropriate basis selection to improve human interpretability of datasets, and using basis selection to reduce computational burden. We present these ideas in a collection of five papers. The first paper introduces a novel subset-based visualization motivated by an application to topology optimization design exploration, and emphasizes the ability of a subset matrix to visually summarize a dataset. The second and third papers address computational limitations in kernel-based learning, introducing a novel basis search technique for the Nyström approximation and a random-projection type approximation, respectively. The fourth paper introduces a novel algorithm and analysis related to general subset-based matrix approximation, touching on both computational and interpretation aspects of basis selection. The fifth paper considers a novel basis selection approach to proxy-function construction for faster uncertainty quantification of compute-intensive simulations.
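As a rough illustration of what a subset-based kernel approximation looks like in practice, the following is a minimal sketch of the standard Nyström approximation, in which a kernel matrix K is approximated as C W^+ C^T from a chosen subset of data points. It is not the basis search technique from the papers listed below: the basis here is drawn uniformly at random, and the RBF kernel, the `gamma` value, the synthetic data, and the function names are all assumptions made for the example.

```python
import numpy as np

def rbf_kernel(X, Y, gamma=1.0):
    # Pairwise squared Euclidean distances between rows of X and rows of Y.
    sq = (X**2).sum(axis=1)[:, None] + (Y**2).sum(axis=1)[None, :] - 2.0 * X @ Y.T
    # Clamp tiny negative values caused by floating-point round-off.
    return np.exp(-gamma * np.maximum(sq, 0.0))

def nystrom_approximation(X, basis_idx, gamma=1.0):
    """Approximate the full n x n kernel matrix from a subset (basis) of rows of X."""
    basis = X[basis_idx]
    C = rbf_kernel(X, basis, gamma)      # n x m: kernel between all points and the basis
    W = rbf_kernel(basis, basis, gamma)  # m x m: kernel among the basis points
    return C @ np.linalg.pinv(W) @ C.T   # Nystrom form K ~ C W^+ C^T

# Hypothetical data: 200 points in 3 dimensions, 40 of them used as the basis.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3))
basis_idx = rng.choice(len(X), size=40, replace=False)  # assumed: uniform sampling

K = rbf_kernel(X, X)                      # exact kernel matrix, for comparison only
K_hat = nystrom_approximation(X, basis_idx)
rel_err = np.linalg.norm(K - K_hat) / np.linalg.norm(K)
print(f"relative Frobenius error: {rel_err:.3f}")
```

Only the n x m block C and the m x m block W need to be computed and stored, which is where the computational savings come from; how the basis is chosen determines how small the error can be for a given m.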
Related Publications:
[1] Allocation strategies for high fidelity models in the multifidelity regime
D. Perry, R. Kirby, A. Narayan, and R. Whitaker
Subm. SIAM J. on Uncertainty Quantification, 2017.
[2] Visualization of topology optimization designs with representative subset selection
D. Perry, V. Keshavarzzadeh, S. Elhabian, R. Kirby, M. Gleicher, and R. Whitaker
Subm. Trans. Vis. Comput. Graphics, 2017.
[3] Nyström Sketches
D. Perry, B. Osting, and R. Whitaker
Proc. ECML-PKDD 2017, Springer Lecture Notes in Comput. Sci., to appear, 2017.
[4] Streaming kernel principal component analysis
M. Ghashami, D. Perry, and J. Phillips
Proc. AI Stats 2016, J. Mach. Learning Res., v. 51, pp. 1365-1374, 2016.
[5] Augmented leverage score sampling with bounds
D. Perry and R. Whitaker
Proc. ECML-PKDD 2016, Springer Lecture Notes in Comput. Sci., vol. 9852, pp. 543-558, 2016.
Posted by: Nathan Galli
Based on the paper "Optimal Transport for Diffeomorphic Registration" by Jean Feydy et al., I will describe their registration procedure, which uses optimal transport theory. They introduce a new type of similarity measure for shape registration that is sensitive to spatial displacement. Obtaining the optimal transport between shapes is usually computationally expensive, so they describe a discrete optimization procedure that is computationally more efficient. I will briefly introduce the basic ideas of optimal transport and shape registration, and explain how Feydy et al. use this theory to match shapes represented by measures.
Posted by: Nathan Galli