Designed especially for neurobiologists, FluoRender is an interactive tool for multi-channel fluorescence microscopy data visualization and analysis.
Deep brain stimulation
BrainStimulator is a set of networks that are used in SCIRun to perform simulations of brain stimulation such as transcranial direct current stimulation (tDCS) and transcranial magnetic stimulation (TMS).
Developing software tools for science has always been a central vision of the SCI Institute.

Scientific Computing

Numerical simulation of real-world phenomena provides fertile ground for building interdisciplinary relationships. The SCI Institute has a long tradition of building these relationships in a win-win fashion – a win for the theoretical and algorithmic development of numerical modeling and simulation techniques and a win for the discipline-specific science of interest. High-order and adaptive methods, uncertainty quantification, complexity analysis, and parallelization are just some of the topics being investigated by SCI faculty. These areas of computing are being applied to a wide variety of engineering applications ranging from fluid mechanics and solid mechanics to bioelectricity.


Martin Berzins

Parallel Computing

Mike Kirby

Finite Element Methods
Uncertainty Quantification

Valerio Pascucci

Scientific Data Management

Chris Johnson

Problem Solving Environments

Amir Arzani

Scientific machine learning
Data-driven fluid flow modeling

Funded Research Projects:

Publications in Scientific Computing:

Inherently interpretable machine learning solutions to differential equations
H. Oh, R. Amici, G. Bomarito, S. Zhe, R.M. Kirby, J. Hochhalter. In Engineering with Computers, 2023.

A machine learning method for the discovery of analytic solutions to differential equations is assessed. The method utilizes an inherently interpretable machine learning algorithm, genetic programming-based symbolic regression. An advantage of its interpretability is the output of symbolic expressions that can be used to assess error in algebraic terms, as opposed to purely numerical quantities. Therefore, models output by the developed method are verified by assessing its ability to recover known analytic solutions for two differential equations, as opposed to assessing numerical error. To demonstrate its improvement, the developed method is compared to a conventional, purely data-driven genetic programming-based symbolic regression algorithm. The reliability of successful evolution of the true solution, or an algebraic equivalent, is demonstrated.

AVA: Towards Autonomous Visualization Agents through Visual Perception-Driven Decision-Making
Subtitled “arXiv preprint arXiv:2312.04494,” S. Liu, H. Miao, Z. Li, M. Olson, V. Pascucci, P.T. Bremer. 2023.

With recent advances in multi-modal foundation models, the previously text-only large language models (LLMs) have evolved to incorporate visual input, opening up unprecedented opportunities for various applications in visualization. Our work explores the utilization of the visual perception ability of multi-modal LLMs to develop Autonomous Visualization Agents (AVAs) that can interpret and accomplish user-defined visualization objectives through natural language. We propose the first framework for the design of AVAs and present several usage scenarios intended to demonstrate the general applicability of the proposed paradigm. The addition of visual perception allows AVAs to act as a virtual visualization assistant for domain experts who may lack the knowledge or expertise in fine-tuning visualization outputs. Our preliminary exploration and proof-of-concept agents suggest that this approach can be widely applicable whenever the choices of appropriate visualization parameters require the interpretation of previous visual output. Feedback from unstructured interviews with experts in AI research, medical visualization, and radiology has been incorporated, highlighting the practicality and potential of AVAs. Our study indicates that AVAs represent a general paradigm for designing intelligent visualization systems that can achieve high-level visualization goals, paving the way for developing expert-level visualization agents in the future.

Modeling Coupled 1D PDEs of Cardiovascular Flow with Spatial Neural ODEs
H. Csala, A. Mohan, D. Livescu, A. Arzani. In Machine Learning and the Physical Sciences Workshop, NeurIPS 2023, 2023.

Tackling coupled sets of partial differential equations (PDEs) through scientific machine learning presents a complex challenge, but it is essential for developing data-driven physics-based models. We employ a novel approach to model the coupled PDEs that govern the blood flow in stenosed arteries with deformable walls, while incorporating realistic inlet flow waveforms. We propose a low-dimensional model based on neural ordinary differential equations (ODEs) inspired by 1D blood flow equations. Our unique approach formulates the problem as ODEs in space rather than time, effectively overcoming issues related to time-dependent boundary conditions and PDE coupling. This innovative framework accurately captures flow rate and area variations, even when extrapolating to unseen waveforms. The promising results from this approach offer a different perspective on deploying neural ODEs to model coupled PDEs with unsteady boundary conditions, which are prevalent in many engineering applications.

Event-Driven FaaS Workflows for Enabling IoT Data Processing at the Cloud Edge Continuum
C. Sicari, D. Balouek, M. Villari, M. Parashar. In CC 2023 - International Conference on Utility and Cloud Computing, 2023.

Continuum Computing encompasses the integration of diverse infrastructures, including cloud, edge, and fog, to facilitate seamless migration of applications based on their specific needs, ensuring optimal satisfaction of their requirements. The primary obstacles in this particular context mostly pertain to the incapacity to promptly respond to changes in the environment or the quality of service (QoS) constraints of the application, as well as the incapability to maintain an application in a stateless manner, hence impeding its relocation without the risk of data loss. The objective of this research is to tackle the aforementioned issues through the introduction of a framework based on Function-as-a-Service (FaaS) and event-driven architecture. This framework enables the decomposition, localization, and relocation of applications inside a Continuum infrastructure, facilitated by a rule engine that is both system- and data-aware.

Solving High Frequency and Multi-Scale PDEs with Gaussian Processes
Subtitled “arXiv:2311.04465,” S. Fang, M. Cooley, D. Long, S. Li, R. Kirby, S. Zhe. 2023.

Machine learning based solvers have garnered much attention in physical simulation and scientific computing, with physics-informed neural networks (PINNs) as a prominent example. However, PINNs often struggle to solve high-frequency and multi-scale PDEs, which can be due to the spectral bias during neural network training. To address this problem, we resort to the Gaussian process (GP) framework. To flexibly capture the dominant frequencies, we model the power spectrum of the PDE solution with a Student t mixture or Gaussian mixture. We then apply the inverse Fourier transform to obtain the covariance function (according to the Wiener-Khinchin theorem). The covariance derived from the Gaussian mixture spectrum corresponds to the known spectral mixture kernel. We are the first to discover its rationale and effectiveness for PDE solving. Next, we estimate the mixture weights in the log domain, which we show is equivalent to placing a Jeffreys prior. It automatically induces sparsity, prunes excessive frequencies, and adjusts the remaining ones toward the ground truth. Third, to enable efficient and scalable computation on massive collocation points, which are critical to capture high frequencies, we place the collocation points on a grid, and multiply our covariance function at each input dimension. We use the GP conditional mean to predict the solution and its derivatives so as to fit the boundary condition and the equation itself. As a result, we can derive a Kronecker product structure in the covariance matrix. We use Kronecker product properties and multilinear algebra to greatly improve computational efficiency and scalability, without any low-rank approximations. We show the advantage of our method in systematic experiments.
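As a rough illustration of the Wiener-Khinchin step described in this abstract, the sketch below takes a one-dimensional Gaussian mixture power spectrum, symmetrized about zero, and compares the closed-form spectral mixture kernel with a direct numerical inverse Fourier transform. The mixture weights, frequencies, and variances are made-up values, the function names are hypothetical, and none of this is the authors' implementation.

```python
import math

# Hypothetical mixture parameters (not taken from the paper): weights, mean
# frequencies, and variances of the Gaussian components of the spectrum.
WEIGHTS = [1.0, 0.5]
FREQS = [2.0, 7.0]
VARS = [0.3, 0.1]


def gaussian_pdf(s, mean, var):
    return math.exp(-0.5 * (s - mean) ** 2 / var) / math.sqrt(2.0 * math.pi * var)


def spectrum(s):
    """Symmetrized Gaussian mixture power spectrum, S(s) = S(-s)."""
    return sum(
        w * 0.5 * (gaussian_pdf(s, mu, v) + gaussian_pdf(s, -mu, v))
        for w, mu, v in zip(WEIGHTS, FREQS, VARS)
    )


def spectral_mixture_kernel(tau):
    """Closed-form inverse Fourier transform of the spectrum above."""
    return sum(
        w * math.exp(-2.0 * math.pi ** 2 * tau ** 2 * v)
        * math.cos(2.0 * math.pi * mu * tau)
        for w, mu, v in zip(WEIGHTS, FREQS, VARS)
    )


def kernel_by_quadrature(tau, s_max=20.0, n=8001):
    """k(tau) = integral of S(s) cos(2 pi s tau) ds, via a plain Riemann sum."""
    ds = 2.0 * s_max / (n - 1)
    total = 0.0
    for i in range(n):
        s = -s_max + i * ds
        total += spectrum(s) * math.cos(2.0 * math.pi * s * tau)
    return total * ds
```

Evaluating both functions at a few lags, for example `spectral_mixture_kernel(0.5)` and `kernel_by_quadrature(0.5)`, should give values that agree to within quadrature error.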

An NSF REU Site Based on Trust and Reproducibility of Intelligent Computation: Experience Report
M. Hall, G. Gopalakrishnan, E. Eide, J. Cohoon, J. Phillips, M. Zhang, S. Elhabian, A. Bhaskara, H. Dam, A. Yadrov, T. Kataria. In Proceedings of the SC '23 Workshops of The International Conference on High Performance Computing, Network, Storage, and Analysis, 2023.

This paper presents an overview of an NSF Research Experiences for Undergraduates (REU) Site on Trust and Reproducibility of Intelligent Computation, delivered by faculty and graduate students in the Kahlert School of Computing at the University of Utah. The chosen themes bring together several concerns for the future in producing computational results that can be trusted: secure, reproducible, based on sound algorithmic foundations, and developed in the context of ethical considerations. The research areas represented by student projects include machine learning, high-performance computing, algorithms and applications, computer security, data science, and human-centered computing. In the first four weeks of the program, the entire student cohort spent their mornings in lessons from experts in these crosscutting topics, and used one-of-a-kind research platforms operated by the University of Utah, namely NSF-funded CloudLab and POWDER facilities; reading assignments, quizzes, and hands-on exercises reinforced the lessons. In the subsequent five weeks, lectures were less frequent, as students branched into small groups to develop their research projects. The final week focused on a poster presentation and final report. Through describing our experiences, this program can serve as a model for preparing a future workforce to integrate machine learning into trustworthy and reproducible applications.

CLASSMix: Adaptive stain separation-based contrastive learning with pseudo labeling for histopathological image classification
Subtitled “arXiv:2312.06978v2,” B. Zhang, H. Manoochehri, M.M. Ho, F. Fooladgar, Y. Chong, B. Knudsen, D. Sirohi, T. Tasdizen. 2023.

Histopathological image classification is one of the critical aspects in medical image analysis. Due to the high expense associated with the labeled data in model training, semi-supervised learning methods have been proposed to alleviate the need for extensively labeled datasets. In this work, we propose a model for semi-supervised classification tasks on digital histopathological Hematoxylin and Eosin (H&E) images. We call the new model Contrastive Learning with Adaptive Stain Separation and MixUp (CLASS-M). Our model is formed by two main parts: contrastive learning between adaptively stain separated Hematoxylin images and Eosin images, and pseudo-labeling using MixUp. We compare our model with other state-of-the-art models on clear cell renal cell carcinoma (ccRCC) datasets from our institution and The Cancer Genome Atlas Program (TCGA). We demonstrate that our CLASS-M model has the best performance on both datasets. The contributions of different parts in our model are also analyzed.

Accelerating Data-Intensive Seismic Research Through Parallel Workflow Optimization and Federated Cyberinfrastructure
M. Adair, I. Rodero, M. Parashar, D. Melgar. In Proceedings of the SC '23 Workshops of The International Conference on High Performance Computing, Network, Storage, and Analysis, ACM, pp. 1970--1977. 2023.
DOI: 10.1145/3624062.3624276

Earthquake early warning systems use synthetic data from simulation frameworks like MudPy to train models for predicting the magnitudes of large earthquakes. MudPy, although powerful, has limitations: a lengthy simulation time to generate the required data, lack of user-friendliness, and no platform for discovering and sharing its data. We introduce FakeQuakes DAGMan Workflow (FDW), which utilizes Open Science Grid (OSG) for parallel computations to accelerate and streamline MudPy simulations. FDW significantly reduces runtime and increases throughput compared to a single-machine setup. Using FDW, we also explore partitioned parallel HTCondor DAGMan workflows to enhance OSG efficiency. Additionally, we investigate leveraging cyberinfrastructure, such as Virtual Data Collaboratory (VDC), for enhancing MudPy and OSG. Specifically, we simulate using Cloud bursting policies to enforce FDW job-offloading to VDC during OSG peak demand, addressing shared resource issues and user goals; we also discuss VDC’s value in facilitating a platform for broad access to MudPy products.

Deep neural operators as accurate surrogates for shape optimization
K. Shukla, V. Oommen, A. Peyvan, M. Penwarden, N. Plewacki, L. Bravo, A. Ghoshal, R.M. Kirby, G. Karniadakis. In Engineering Applications of Artificial Intelligence, Vol. 129, pp. 107615. 2023.
ISSN: 0952-1976

Deep neural operators, such as DeepONet, have changed the paradigm in high-dimensional nonlinear regression, paving the way for significant generalization and speed-up in computational engineering applications. Here, we investigate the use of DeepONet to infer flow fields around unseen airfoils with the aim of shape constrained optimization, an important design problem in aerodynamics that typically taxes computational resources heavily. We present results that display little to no degradation in prediction accuracy while reducing the online optimization cost by orders of magnitude. We consider NACA airfoils as a test case for our proposed approach, as the four-digit parameterization can easily define their shape. We successfully optimize the constrained NACA four-digit problem with respect to maximizing the lift-to-drag ratio and validate all results by comparing them to a high-order CFD solver. We find that DeepONets have a low generalization error, making them ideal for generating solutions of unseen shapes. Specifically, pressure, density, and velocity fields are accurately inferred at a fraction of a second, hence enabling the use of general objective functions beyond the maximization of the lift-to-drag ratio considered in the current work. Finally, we validate the ability of DeepONet to handle a complex 3D waverider geometry at hypersonic flight by inferring shear stress and heat flux distributions on its surface at unseen angles of attack. The main contribution of this paper is a modular integrated design framework that uses an over-parametrized neural operator as a surrogate model with good generalizability coupled seamlessly with multiple optimization solvers in a plug-and-play mode.

Streaming Factor Trajectory Learning for Temporal Tensor Decomposition
S. Fang, X. Yu, S. Li, Z. Wang, R. Kirby, S. Zhe. 2023.

Practical tensor data often comes with time information. Most existing temporal decomposition approaches estimate a set of fixed factors for the objects in each tensor mode, and hence cannot capture the temporal evolution of the objects' representation. More importantly, we lack an effective approach to capture such evolution from streaming data, which is common in real-world applications. To address these issues, we propose Streaming Factor Trajectory Learning (SFTL) for temporal tensor decomposition. We use Gaussian processes (GPs) to model the trajectory of factors so as to flexibly estimate their temporal evolution. To address the computational challenges in handling streaming data, we convert the GPs into a state-space prior by constructing an equivalent stochastic differential equation (SDE). We develop an efficient online filtering algorithm to estimate a decoupled running posterior of the involved factor states upon receiving new data. The decoupled estimation enables us to conduct standard Rauch-Tung-Striebel smoothing to compute the full posterior of all the trajectories in parallel, without the need for revisiting any previous data. We have shown the advantage of SFTL in both synthetic tasks and real-world applications.
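The conversion of a GP prior into a state-space prior can be illustrated with the simplest case, which is an assumption of this sketch and not necessarily the kernel used in the paper. A GP with the exponential (Matérn-1/2) kernel is the stationary solution of the Ornstein-Uhlenbeck SDE, so on a sorted time grid it reduces to a first-order Markov recursion. The code below builds the covariance implied by that recursion so it can be compared with the kernel; all names are hypothetical.

```python
import math


def exponential_kernel(t1, t2, variance, lengthscale):
    """GP covariance k(t1, t2) = variance * exp(-|t1 - t2| / lengthscale)."""
    return variance * math.exp(-abs(t1 - t2) / lengthscale)


def state_space_covariance(times, variance, lengthscale):
    """Covariance of x_0..x_n under the discretized Ornstein-Uhlenbeck model.

    x_0 ~ N(0, variance), and for increasing times
    x_{k+1} = a_k * x_k + w_k with a_k = exp(-(t_{k+1} - t_k) / lengthscale)
    and Var(w_k) = variance * (1 - a_k**2).
    """
    n = len(times)
    cov = [[0.0] * n for _ in range(n)]
    cov[0][0] = variance
    for k in range(n - 1):
        a = math.exp(-(times[k + 1] - times[k]) / lengthscale)
        q = variance * (1.0 - a * a)
        # Cross-covariances with earlier states are scaled by the transition.
        for j in range(k + 1):
            cov[k + 1][j] = a * cov[k][j]
            cov[j][k + 1] = cov[k + 1][j]
        # The new marginal variance combines the propagated and process noise.
        cov[k + 1][k + 1] = a * a * cov[k][k] + q
    return cov


# Irregularly spaced, sorted observation times (made-up values).
times = [0.0, 0.3, 1.0, 1.1, 2.5]
cov_ss = state_space_covariance(times, variance=2.0, lengthscale=0.7)
cov_gp = [[exponential_kernel(s, t, 2.0, 0.7) for t in times] for s in times]
```

Because each state depends only on its predecessor, new observations can be processed one at a time by a Kalman-style filter, which is the property that makes streaming inference practical.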

Instance-wise Linearization of Neural Network for Model Interpretation
Subtitled “arXiv:2310.16295v1,” Z. Li, S. Liu, K. Bhavya, T. Bremer, V. Pascucci. 2023.

Neural networks have achieved remarkable success in many scientific fields. However, the interpretability of neural network models remains a major bottleneck to deploying such techniques in daily life. The challenge stems from the non-linear behavior of the neural network, which raises a critical question: how does a model use input features to make a decision? The classical approach to this challenge is feature attribution, which assigns an importance score to each input feature and reveals its importance to the current prediction. However, current feature attribution approaches often indicate the importance of each input feature without detailing how the features are actually processed by the model internally. These attribution approaches therefore raise the concern of whether they highlight the correct features for a model prediction.

For a neural network model, the non-linear behavior is often caused by the model's non-linear activation units. However, the computation behind a single prediction is locally linear, because one prediction has only one activation pattern. Based on this observation, we propose an instance-wise linearization approach that reformulates the forward computation of a neural network prediction. This approach reformulates the different layers of convolutional neural networks as linear matrix multiplications. Aggregating the computation of all layers, the operations of a complex convolutional neural network for one prediction can be described as a linear matrix multiplication F(x)=Wx+b. This equation not only provides a feature attribution map that highlights the importance of the input features but also tells exactly how each input feature contributes to a prediction. Furthermore, we discuss the application of this technique in both supervised classification and unsupervised neural network learning (parametric t-SNE dimension reduction).
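The locally linear form F(x)=Wx+b can be made concrete with a minimal sketch. It assumes a small fully connected ReLU network with random parameters in place of the convolutional networks discussed in the abstract (a convolution is also a linear map, so the same bookkeeping applies). The function names are hypothetical and this is not the authors' code.

```python
import numpy as np


def relu_mlp(x, weights, biases):
    """Forward pass of a small fully connected ReLU network."""
    h = x
    for W, b in zip(weights[:-1], biases[:-1]):
        h = np.maximum(W @ h + b, 0.0)
    return weights[-1] @ h + biases[-1]


def linearize_at(x, weights, biases):
    """Return (W_eff, b_eff) such that relu_mlp(x) == W_eff @ x + b_eff.

    The pair is valid only for inputs that share the activation pattern of x.
    """
    W_eff = np.eye(x.shape[0])
    b_eff = np.zeros(x.shape[0])
    h = x
    for W, b in zip(weights[:-1], biases[:-1]):
        pre = W @ h + b
        mask = (pre > 0).astype(float)  # activation pattern of this layer
        # With the pattern fixed, ReLU is multiplication by diag(mask).
        W_eff = mask[:, None] * (W @ W_eff)
        b_eff = mask * (W @ b_eff + b)
        h = mask * pre
    return weights[-1] @ W_eff, weights[-1] @ b_eff + biases[-1]


# Hypothetical 3-5-4-2 network with random parameters.
rng = np.random.default_rng(0)
sizes = [3, 5, 4, 2]
weights = [rng.standard_normal((m, n)) for n, m in zip(sizes[:-1], sizes[1:])]
biases = [rng.standard_normal(m) for m in sizes[1:]]
x = rng.standard_normal(3)

W_eff, b_eff = linearize_at(x, weights, biases)
# Entry (i, j) is the exact contribution of input feature j to output i.
attribution = W_eff * x
```

Summing each row of `attribution` and adding `b_eff` reproduces the network output for this input, which is what makes the attribution exact for the instance and not an approximation.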

Strengthening and Democratizing Artificial Intelligence Research and Development
M. Parashar, T. deBlanc-Knowles, E. Gianchandani, L.E. Parker. In Computer, Vol. 56, No. 11, IEEE, pp. 85-90. 2023.
DOI: 10.1109/MC.2023.3284568

This article summarizes the vision, roadmap, and implementation plan for a National Artificial Intelligence Research Resource that aims to provide a widely accessible cyberinfrastructure for artificial intelligence R&D, with the overarching goal of bridging the resource–access divide.

Energy Stable and Structure-Preserving Schemes for the Stochastic Galerkin Shallow Water Equations
Subtitled “arXiv:2310.06229,” D. Dai, Y. Epshteyn, A. Narayan. 2023.

The shallow water flow model is widely used to describe water flows in rivers, lakes, and coastal areas. Accounting for uncertainty in the corresponding transport-dominated non-linear PDE models presents theoretical and numerical challenges that motivate the central advances of this paper. Starting with a spatially one-dimensional hyperbolicity-preserving, positivity-preserving stochastic Galerkin formulation of the parametric/uncertain shallow water equations, we derive an entropy-entropy flux pair for the system. We exploit this entropy-entropy flux pair to construct structure-preserving second-order energy conservative, and first- and second-order energy stable finite volume schemes for the stochastic Galerkin shallow water system. The performance of the methods is illustrated on several numerical experiments.

Equation Discovery with Bayesian Spike-and-Slab Priors and Efficient Kernels
Subtitled “arXiv:2310.05387v1,” D. Long, W.W. Xing, A.S. Krishnapriyan, R.M. Kirby, S. Zhe, M.W. Mahoney. 2023.

Discovering governing equations from data is important to many scientific and engineering applications. Despite promising successes, existing methods are still challenged by data sparsity as well as noise issues, both of which are ubiquitous in practice. Moreover, state-of-the-art methods lack uncertainty quantification and/or are costly in training. To overcome these limitations, we propose a novel equation discovery method based on Kernel learning and BAyesian Spike-and-Slab priors (KBASS). We use kernel regression to estimate the target function, which is flexible, expressive, and more robust to data sparsity and noise. We combine it with a Bayesian spike-and-slab prior, an ideal Bayesian sparse distribution, for effective operator selection and uncertainty quantification. We develop an expectation propagation expectation-maximization (EP-EM) algorithm for efficient posterior inference and function estimation. To overcome the computational challenge of kernel regression, we place the function values on a mesh and induce a Kronecker product construction, and we use tensor algebra methods to enable efficient computation and optimization. We show the significant advantages of KBASS on a list of benchmark ODE and PDE discovery tasks.
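To show why placing function values on a mesh helps, here is a minimal sketch under stated assumptions: a two-dimensional grid and a product of one-dimensional squared-exponential kernels, both chosen for illustration and not taken from the paper. The grid covariance is then a Kronecker product of two small matrices, and a matrix-vector product can be computed without ever forming the large matrix. The helper names are hypothetical.

```python
import numpy as np


def rbf_gram(points, lengthscale):
    """Gram matrix of a 1D squared-exponential kernel."""
    d = points[:, None] - points[None, :]
    return np.exp(-0.5 * (d / lengthscale) ** 2)


def kron_matvec(K1, K2, v):
    """Compute (K1 kron K2) @ v without forming the Kronecker product.

    v is reshaped row-major into a (K1 columns) x (K2 columns) matrix V,
    and the result is the row-major flattening of K1 @ V @ K2.T.
    """
    V = v.reshape(K1.shape[1], K2.shape[1])
    return (K1 @ V @ K2.T).ravel()


# A 6 x 4 mesh: the full covariance would be 24 x 24.
x1 = np.linspace(0.0, 1.0, 6)
x2 = np.linspace(0.0, 1.0, 4)
K1 = rbf_gram(x1, 0.3)
K2 = rbf_gram(x2, 0.5)

v = np.random.default_rng(0).standard_normal(6 * 4)
dense = np.kron(K1, K2) @ v      # reference: forms the full matrix
fast = kron_matvec(K1, K2, v)    # uses only the small factors
```

For an n1 by n2 mesh the dense product costs on the order of (n1 n2)^2 operations, while the factored form costs on the order of n1 n2 (n1 + n2), and the gap widens with each additional dimension.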

HiPPIS: A High-Order Positivity-Preserving Mapping Software for Structured Meshes
T. A. J. Ouermi, R. M. Kirby, M. Berzins. In ACM Trans. Math. Softw., ACM, Nov. 2023.
ISSN: 0098-3500
DOI: 10.1145/3632291

Polynomial interpolation is an important component of many computational problems. In several of these computational problems, failure to preserve positivity when using polynomials to approximate or map data values between meshes can lead to negative unphysical quantities. Currently, most polynomial-based methods for enforcing positivity are based on splines and polynomial rescaling. The spline-based approaches build interpolants that are positive over the intervals in which they are defined and may require solving a minimization problem and/or system of equations. The linear polynomial rescaling methods allow for high-degree polynomials but enforce positivity only at limited locations (e.g., quadrature nodes). This work introduces open-source software (HiPPIS) for high-order data-bounded interpolation (DBI) and positivity-preserving interpolation (PPI) that addresses the limitations of both the spline and polynomial rescaling methods. HiPPIS is suitable for approximating and mapping physical quantities such as mass, density, and concentration between meshes while preserving positivity. This work provides Fortran and Matlab implementations of the DBI and PPI methods, presents an analysis of the mapping error in the context of PDEs, and uses several 1D and 2D numerical examples to demonstrate the benefits and limitations of HiPPIS.
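The failure mode that motivates this work is easy to reproduce. The sketch below uses plain Lagrange interpolation, not the DBI or PPI methods in HiPPIS, on made-up positive data with a steep gradient. The interpolant passes through every data point yet dips below zero between two of them.

```python
def lagrange_interpolate(xs, ys, x):
    """Evaluate the Lagrange interpolating polynomial through (xs, ys) at x."""
    total = 0.0
    for i, (xi, yi) in enumerate(zip(xs, ys)):
        basis = 1.0
        for j, xj in enumerate(xs):
            if j != i:
                basis *= (x - xj) / (xi - xj)
        total += yi * basis
    return total


# Strictly positive samples of a quantity such as density (made-up values).
nodes = [0.0, 1.0, 2.0]
density = [1.0, 0.01, 0.01]

# Midway between the last two nodes the quadratic undershoots:
# -0.125 * 1.0 + 0.75 * 0.01 + 0.375 * 0.01 = -0.11375
value_mid = lagrange_interpolate(nodes, density, 1.5)
```

A negative density like this is the kind of unphysical value that positivity-preserving interpolation is designed to rule out when mapping data between meshes.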

Multi-Resolution Active Learning of Fourier Neural Operators
Subtitled “arXiv:2309.16971,” S. Li, X. Yu, W. Xing, R.M. Kirby, A. Narayan, S. Zhe. 2023.

Fourier Neural Operator (FNO) is a popular operator learning framework. It not only achieves state-of-the-art performance in many tasks but is also highly efficient in training and prediction. However, collecting training data for the FNO can be a costly bottleneck in practice, because it often demands expensive physical simulations. To overcome this problem, we propose Multi-Resolution Active learning of FNO (MRA-FNO), which can dynamically select the input functions and resolutions to lower the data cost as much as possible while optimizing the learning efficiency. Specifically, we propose a probabilistic multi-resolution FNO and use ensemble Monte-Carlo to develop an effective posterior inference algorithm. To conduct active learning, we maximize a utility-cost ratio as the acquisition function to acquire new examples and resolutions at each step. We use moment matching and the matrix determinant lemma to enable tractable, efficient utility computation. Furthermore, we develop a cost annealing framework to avoid over-penalizing high-resolution queries at the early stage. The over-penalization is severe when the cost difference between the resolutions is significant, which often leaves active learning stuck at low-resolution queries with inferior performance. Our method overcomes this problem and applies to general multi-fidelity active learning and optimization problems. We have shown the advantage of our method in several benchmark operator learning tasks.

Toward Democratizing Access to Science Data: Introducing the National Data Platform
M. Parashar, I. Altintas. In IEEE 19th International Conference on e-Science, IEEE, 2023.
DOI: 10.1109/e-Science58273.2023.10254930

Open and equitable access to scientific data is essential to addressing important scientific and societal grand challenges, and to the research enterprise more broadly. This paper discusses the importance and urgency of open and equitable data access, and explores the barriers and challenges to such access. It then introduces the vision and architecture of the National Data Platform, a recently launched project aimed at catalyzing an open, equitable and extensible data ecosystem.

Computer Science Abstractions To Help Reason About Decentralized Stablecoin Design
B. Charoenwong, R.M. Kirby, J. Reiter. In IEEE Access, IEEE, 2023.

Computer science as a discipline is known for its penchant for using abstractions as a tool for reasoning. It is no surprise that computer science might have something valuable to lend to the world of decentralized stablecoin design, as it is in fact a “computing” problem. In this paper, we examine the possibility of a decentralized and capital-efficient stablecoin using smart contracts that algorithmically trade to maintain stability and study the potential new functionality that smart contracts enable. By exploiting traditional abstractions from computer science, we show that a capital-efficient algorithmic stablecoin cannot be provably stable. Additionally, we provide a formal exposition of the workings of Central Bank Digital Currencies, connecting this to the space of possible stablecoin designs. We then discuss several outstanding conjectures from both academics and practitioners and finally highlight the regulatory similarities between money-market funds and working stablecoins. Our work builds upon the current and growing interplay between the realms of engineering and financial services, and it also demonstrates how ways of thinking as a computer scientist can aid practitioners. We believe this research is vital for understanding and developing the future of financial technology.

Dynamic Data-Driven Application Systems for Reservoir Simulation-Based Optimization: Lessons Learned and Future Trends
M. Parashar, T. Kurc, H. Klie, M.F. Wheeler, J.H. Saltz, M. Jammoul, R. Dong. In Handbook of Dynamic Data Driven Applications Systems: Volume 2, Springer International Publishing, pp. 287--330. 2023.
DOI: 10.1007/978-3-031-27986-7_11

Since its introduction in the early 2000s, the Dynamic Data-Driven Applications Systems (DDDAS) paradigm has served as a powerful concept for continuously improving the quality of both models and data embedded in complex dynamical systems. The DDDAS unifying concept enables capabilities to integrate multiple sources and scales of data, mathematical and statistical algorithms, advanced software infrastructures, and diverse applications into a dynamic feedback loop. DDDAS has not only motivated notable scientific and engineering advances on multiple fronts, but it has also been invigorated by the latest technological achievements in artificial intelligence, cloud computing, augmented reality, robotics, edge computing, Internet of Things (IoT), and Big Data. Capabilities to handle more data in a much faster and smarter fashion are paving the road for expanding automation capabilities. The purpose of this chapter is to review the fundamental components that have shaped reservoir-simulation-based optimization in the context of DDDAS. The foundations of each component will be systematically reviewed, followed by a discussion on current and future trends oriented to highlight the outstanding challenges and opportunities of reservoir management problems under the DDDAS paradigm. Moreover, this chapter should be viewed as providing pathways for establishing a synergy between the renewable energy and oil and gas industries with the advent of the DDDAS method.

Strengthening the US Department of Energy's Recruitment Pipeline: The DOE/NNSA Predictive Science Academic Alliance Program (PSAAP) Experience
J. K. Holmen, V. G. Vergara Larrea, E. W. Draeger, E. T. Phipps, P. J. Smith, M. Berzins, S. T. Smith, J. N. Thornock, S. Parete-Koon. In Practice and Experience in Advanced Research Computing, ACM, pp. 137--144. 2023.

The US Department of Energy (DOE) oversees a system of 17 national laboratories responsible for developing unique scientific capabilities beyond the scope of academic and industrial institutions. These labs strive to keep America at the forefront of discovery and are home to some of the Nation’s best minds and the world’s best scientific and research facilities. Collaborations between national laboratories and academic institutions are critical to develop and recruit talent for the DOE workforce. Academia’s cooperative education model poses challenges for DOE recruitment pipelines centered around traditional internships. This paper discusses a promising DOE recruitment pipeline, the National Nuclear Security Administration’s (NNSA) Predictive Science Academic Alliance Program (PSAAP) initiative. As a part of this, experiences capturing the successes and challenges faced by the University of Utah’s Carbon Capture Multidisciplinary Simulation Center (CCMSC) through their participation in the PSAAP-II initiative are shared. These experiences demonstrate the success of Utah’s PSAAP center as a recruitment pipeline with approximately 43% of CCMSC students going to a national laboratory after graduation. Potential opportunities to strengthen the DOE’s recruitment pipeline are also discussed.