
SCI Publications


M. Han, J. Li, S. Sane, S. Gupta, B. Wang, S. Petruzza, C.R. Johnson. “Interactive Visualization of Time-Varying Flow Fields Using Particle Tracing Neural Networks,” Subtitled “arXiv preprint arXiv:2312.14973,” 2024.


Lagrangian representations of flow fields have gained prominence for enabling fast, accurate analysis and exploration of time-varying flow behaviors. In this paper, we present a comprehensive evaluation to establish a robust and efficient framework for Lagrangian-based particle tracing using deep neural networks (DNNs). Han et al. (2021) first proposed a DNN-based approach to learn Lagrangian representations and demonstrated accurate particle tracing for an analytic 2D flow field. In this paper, we extend and build upon this prior work in significant ways. First, we evaluate the performance of DNN models to accurately trace particles in various settings, including 2D and 3D time-varying flow fields, flow fields from multiple applications, flow fields with varying complexity, as well as structured and unstructured input data. Second, we conduct an empirical study to inform best practices with respect to particle tracing model architectures, activation functions, and training data structures. Third, we conduct a comparative evaluation of prior techniques that employ flow maps as input for exploratory flow visualization. Specifically, we compare our extended model against its predecessor by Han et al. (2021), as well as the conventional approach that uses triangulation and Barycentric coordinate interpolation. Finally, we consider the integration and adaptation of our particle tracing model with different viewers. We provide an interactive web-based visualization interface by leveraging the efficiencies of our framework, and perform high-fidelity interactive visualization by integrating it with an OSPRay-based viewer. Overall, our experiments demonstrate that using a trained DNN model to predict new particle trajectories requires a low memory footprint and results in rapid inference. 
Following best practices for large 3D datasets, our deep learning approach using GPUs for inference is shown to require approximately 46 times less memory while being more than 400 times faster than the conventional methods.
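For context on the conventional baseline named above (triangulation with barycentric coordinate interpolation of flow maps), here is a minimal 2D sketch in Python; the function names and setup are illustrative assumptions, not the authors' implementation:

```python
def barycentric_weights(p, a, b, c):
    """Barycentric coordinates of point p within triangle (a, b, c)."""
    (px, py), (ax, ay), (bx, by), (cx, cy) = p, a, b, c
    det = (by - cy) * (ax - cx) + (cx - bx) * (ay - cy)
    w0 = ((by - cy) * (px - cx) + (cx - bx) * (py - cy)) / det
    w1 = ((cy - ay) * (px - cx) + (ax - cx) * (py - cy)) / det
    return w0, w1, 1.0 - w0 - w1

def interpolate_flow_map(p, tri_starts, tri_ends):
    """Interpolate a particle's end position from the flow-map end
    positions stored at the three vertices of the triangle containing p."""
    w = barycentric_weights(p, *tri_starts)
    return tuple(sum(wi * e[k] for wi, e in zip(w, tri_ends)) for k in (0, 1))
```

A trained DNN replaces this per-triangle lookup and interpolation with a single forward pass, which is where the memory and speed advantages reported above come from.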

M.M. Ho, S. Dubey, Y. Chong, B. Knudsen, T. Tasdizen. “F2FLDM: Latent Diffusion Models with Histopathology Pre-Trained Embeddings for Unpaired Frozen Section to FFPE Translation,” Subtitled “arXiv:2404.12650v1,” 2024.


The Frozen Section (FS) technique is a rapid and efficient method, taking only 15-30 minutes to prepare slides for pathologists' evaluation during surgery, enabling immediate decisions on further surgical interventions. However, the FS process often introduces artifacts and distortions like folds and ice-crystal effects. In contrast, these artifacts and distortions are absent in the higher-quality formalin-fixed paraffin-embedded (FFPE) slides, which require 2-3 days to prepare. While Generative Adversarial Network (GAN)-based methods have been used to translate FS to FFPE images (F2F), they may leave morphological inaccuracies with remaining FS artifacts or introduce new artifacts, reducing the quality of these translations for clinical assessments. In this study, we benchmark recent generative models, focusing on GANs and Latent Diffusion Models (LDMs), to overcome these limitations. We introduce a novel approach that combines LDMs with Histopathology Pre-Trained Embeddings to enhance the restoration of FS images. Our framework leverages LDMs conditioned by both text and pre-trained embeddings to learn meaningful features of FS and FFPE histopathology images. Through diffusion and denoising techniques, our approach not only preserves essential diagnostic attributes like color staining and tissue morphology but also proposes an embedding translation mechanism to better predict the targeted FFPE representation of input FS images. As a result, this work achieves a significant improvement in classification performance, with the Area Under the Curve rising from 81.99% to 94.64%, accompanied by an advantageous CaseFD. This work establishes a new benchmark for FS to FFPE image translation quality, promising enhanced reliability and accuracy in histopathology FS image analysis.

M.M. Ho, E. Ghelichkhan, Y. Chong, Y. Zhou, B.S. Knudsen, T. Tasdizen. “DISC: Latent Diffusion Models with Self-Distillation from Separated Conditions for Prostate Cancer Grading,” Subtitled “arXiv:2404.13097,” 2024.


Latent Diffusion Models (LDMs) can generate high-fidelity images from noise, offering a promising approach for augmenting histopathology images for training cancer grading models. While previous works successfully generated high-fidelity histopathology images using LDMs, the generation of image tiles to improve prostate cancer grading has not yet been explored. Additionally, LDMs face challenges in accurately generating admixtures of multiple cancer grades in a tile when conditioned by a tile mask. In this study, we train specific LDMs to generate synthetic tiles that contain multiple Gleason Grades (GGs) by leveraging pixel-wise annotations in input tiles. We introduce a novel framework named Self-Distillation from Separated Conditions (DISC) that generates GG patterns guided by GG masks. Finally, we deploy a training framework for pixel-level and slide-level prostate cancer grading, where synthetic tiles are effectively utilized to improve the cancer grading performance of existing models. As a result, this work surpasses previous works in two domains: 1) our LDMs enhanced with DISC produce more accurate tiles in terms of GG patterns, and 2) our training scheme, incorporating synthetic data, significantly improves the generalization of the baseline model for prostate cancer grading, particularly in challenging cases of rare GG5, demonstrating the potential of generative models to enhance cancer grading when data is limited.

T. Hoefler, M. Copik, P. Beckman, A. Jones, I. Foster, M. Parashar, D. Reed, M. Troyer, T. Schulthess, D. Ernst, J. Dongarra. “XaaS: Acceleration as a Service to Enable Productive High-Performance Cloud Computing,” Subtitled “arXiv:2401.04552v1,” 2024.


HPC and Cloud have evolved independently, specializing their innovations into performance or productivity. Acceleration as a Service (XaaS) is a recipe to empower both fields with a shared execution platform that provides transparent access to computing resources, regardless of the underlying cloud or HPC service provider. Bridging HPC and cloud advancements, XaaS presents a unified architecture built on performance-portable containers. Our converged model concentrates on low-overhead, high-performance communication and computing, targeting resource-intensive workloads from climate simulations to machine learning. XaaS lifts the restricted allocation model of Function-as-a-Service (FaaS), allowing users to benefit from the flexibility and efficient resource utilization of serverless while supporting long-running and performance-sensitive workloads from HPC.

J.K. Holmen, M. García, A. Bagusetty, V. Madananth, A. Sanderson, M. Berzins. “Making Uintah Performance Portable for Department of Energy Exascale Testbeds,” In Euro-Par 2023: Parallel Processing, pp. 1-12. 2024.


To help ease ports to forthcoming Department of Energy (DOE) exascale systems, testbeds have been made available to select users. These testbeds are helpful for preparing codes to run on the same hardware and similar software as in their respective exascale systems. This paper describes how the Uintah Computational Framework, an open-source asynchronous many-task (AMT) runtime system, has been modified to be performance portable across the DOE Crusher, DOE Polaris, and DOE Sunspot testbeds in preparation for portable simulations across the exascale DOE Frontier and DOE Aurora systems. The Crusher, Polaris, and Sunspot testbeds feature the AMD MI250X, NVIDIA A100, and Intel PVC GPUs, respectively. This performance portability has been made possible by extending Uintah’s intermediate portability layer [18] to additionally support the Kokkos::HIP, Kokkos::OpenMPTarget, and Kokkos::SYCL back-ends. This paper also describes notable updates to Uintah’s support for Kokkos, which were required to make this extension possible. Results are shown for a challenging radiative heat transfer calculation, central to the University of Utah’s predictive boiler simulations. These results demonstrate single-source portability across AMD-, NVIDIA-, and Intel-based GPUs using various Kokkos back-ends.

Q. Huang, J. Le, S. Joshi, J. Mendes, G. Adluru, E. DiBella. “Arterial Input Function (AIF) Correction Using AIF Plus Tissue Inputs with a Bi-LSTM Network,” In Tomography, Vol. 10, pp. 660-673. 2024.


Background: The arterial input function (AIF) is vital for myocardial blood flow quantification in cardiac MRI to indicate the input time–concentration curve of a contrast agent. Inaccurate AIFs can significantly affect perfusion quantification. Purpose: When only saturated and biased AIFs are measured, this work investigates multiple ways of leveraging tissue curve information, including using AIF + tissue curves as inputs and optimizing the loss function for deep neural network training. Methods: Simulated data were generated using a 12-parameter AIF mathematical model for the AIF. Tissue curves were created from true AIFs combined with compartment-model parameters from a random distribution. Using Bloch simulations, a dictionary was constructed for a saturation-recovery 3D radial stack-of-stars sequence, accounting for deviations such as flip angle, T2* effects, and residual longitudinal magnetization after the saturation. A preliminary simulation study established the optimal tissue curve number using a bidirectional long short-term memory (Bi-LSTM) network with just the AIF loss. Further optimization of the loss function involved comparing just the AIF loss, the AIF with compartment-model-based parameter loss, and the AIF with compartment-model tissue loss. The optimized network was examined with both simulation and hybrid data, which included in vivo 3D stack-of-stars datasets for testing. The AIF peak value accuracy and Ktrans results were assessed. Results: Increasing the number of tissue curves can be beneficial when the added tissue curves provide extra information. Using just the AIF loss outperforms the other two proposed losses, including adding either a compartment-model-based tissue loss or a compartment-model parameter loss to the AIF loss. With the simulated data, the Bi-LSTM network reduced the AIF peak error from −23.6 ± 24.4% of the AIF using the dictionary method to 0.2 ± 7.2% (AIF input only) and 0.3 ± 2.5% (AIF + ten tissue curve inputs) of the network AIF. The corresponding Ktrans error was reduced from −13.5 ± 8.8% to −0.6 ± 6.6% and 0.3 ± 2.1%. With the hybrid data (simulated data for training; in vivo data for testing), the AIF peak error was 15.0 ± 5.3% and the corresponding Ktrans error was 20.7 ± 11.6% for the AIF using the dictionary method. The hybrid data revealed that using the AIF + tissue inputs reduced errors, with a peak error of 1.3 ± 11.1% and a Ktrans error of −2.4 ± 6.7%. Conclusions: Integrating tissue curves with AIF curves into network inputs improves the precision of AI-driven AIF corrections. This result was seen both with simulated data and with applying the network trained only on simulated data to a limited in vivo test dataset.
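The tissue curves described in the Methods are generated from true AIFs via compartment-model parameters; a minimal discretized sketch of a one-compartment (Tofts-style) model is below. The specific model form and names are assumptions for illustration, not the paper's code:

```python
import math

def tissue_curve(aif, ktrans, kep, dt):
    """Tissue concentration from an AIF via a one-compartment model:
    Ct(t) = Ktrans * integral of Cp(tau) * exp(-kep * (t - tau)) dtau,
    discretized here with a simple Riemann sum of step dt."""
    ct = []
    for i in range(len(aif)):
        acc = sum(aif[j] * math.exp(-kep * (i - j) * dt) for j in range(i + 1))
        ct.append(ktrans * acc * dt)
    return ct
```

Sampling ktrans and kep from a random distribution, as the paper describes, yields a family of tissue curves that all share the same underlying AIF, which is what lets the network recover the true AIF from saturated measurements.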

X. Huang, H. Miao, A. Townsend, K. Champley, J. Tringe, V. Pascucci, P.T. Bremer. “Bimodal Visualization of Industrial X-Ray and Neutron Computed Tomography Data,” In IEEE Transactions on Visualization and Computer Graphics, IEEE, 2024.
DOI: 10.1109/TVCG.2024.3382607


Advanced manufacturing creates increasingly complex objects with material compositions that are often difficult to characterize by a single modality. Our collaborating domain scientists are going beyond traditional methods by employing both X-ray and neutron computed tomography to obtain complementary representations expected to better resolve material boundaries. However, the use of two modalities creates its own challenges for visualization, requiring either complex adjustments of bimodal transfer functions or the need for multiple views. Together with experts in nondestructive evaluation, we designed a novel interactive bimodal visualization approach to create a combined view of the co-registered X-ray and neutron acquisitions of industrial objects. Using an automatic topological segmentation of the bivariate histogram of X-ray and neutron values as a starting point, the system provides a simple yet effective interface to easily create, explore, and adjust a bimodal visualization. We propose a widget with simple brushing interactions that enables the user to quickly correct the segmented histogram results. Our semiautomated system enables domain experts to intuitively explore large bimodal datasets without the need for either advanced segmentation algorithms or knowledge of visualization techniques. We demonstrate our approach using synthetic examples, industrial phantom objects created to stress bimodal scanning techniques, and real-world objects, and we discuss expert feedback.
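The bivariate histogram of co-registered X-ray and neutron values, the starting point for the topological segmentation described above, can be sketched as follows; this is a simplified illustration over flattened voxel lists, not the paper's system:

```python
def bivariate_histogram(xray, neutron, bins, x_range, n_range):
    """2D joint histogram of co-registered X-ray and neutron voxel values.
    Each voxel contributes one count to the bin indexed by its value pair;
    values at the upper range edge are clamped into the last bin."""
    (x_lo, x_hi), (n_lo, n_hi) = x_range, n_range
    hist = [[0] * bins for _ in range(bins)]
    for xv, nv in zip(xray, neutron):
        i = min(int((xv - x_lo) / (x_hi - x_lo) * bins), bins - 1)
        j = min(int((nv - n_lo) / (n_hi - n_lo) * bins), bins - 1)
        hist[i][j] += 1
    return hist
```

Peaks in this 2D histogram correspond to materials that are distinct in at least one modality, which is why a topological segmentation of it yields a useful starting partition for the transfer function.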

K.E. Isaacs, H. Kaiser. “Halide Code Generation Framework in Phylanx,” In Euro-Par 2022: Parallel Processing Workshops, Springer, 2024.


Separating algorithms from their computation schedule has become a de facto solution to tackle the challenges of developing high performance code on modern heterogeneous architectures. Common approaches include domain-specific languages (DSLs), which provide familiar APIs to domain experts, code generation frameworks that automate the generation of fast and portable code, and runtime systems that manage threads for concurrency and parallelism. In this paper, we present the Halide code generation framework for the Phylanx distributed array processing platform. This extension enables compile-time optimization of Phylanx primitives for target architectures. To accomplish this, (1) we implemented new Phylanx primitives using Halide, (2) partially exported Halide’s thread pool API to carry out parallelism on HPX (Phylanx’s runtime) threads, and (3) showcased HPX performance analysis tools made available to Halide applications. The evaluation of the work has been done in two steps. First, we compare the performance of Halide applications running on their native runtime with that of the new HPX backend to verify there is no cost associated with using HPX threads. Next, we compare the performance of a number of original implementations of Phylanx primitives against the new Halide versions to verify the performance and portability benefits of Halide in the context of Phylanx.

K. Iyer, J. Adams, S.Y. Elhabian. “SCorP: Statistics-Informed Dense Correspondence Prediction Directly from Unsegmented Medical Images,” Subtitled “arXiv preprint arXiv:2404.17967,” 2024.


Statistical shape modeling (SSM) is a powerful computational framework for quantifying and analyzing the geometric variability of anatomical structures, facilitating advancements in medical research, diagnostics, and treatment planning. Traditional methods for shape modeling from imaging data demand significant manual and computational resources. Additionally, these methods necessitate repeating the entire modeling pipeline to derive shape descriptors (e.g., surface-based point correspondences) for new data. While deep learning approaches have shown promise in streamlining the construction of SSMs on new data, they still rely on traditional techniques to supervise the training of the deep networks. Moreover, the predominant linearity assumption of traditional approaches restricts their efficacy, a limitation also inherited by deep learning models trained using optimized/established correspondences. Consequently, representing complex anatomies becomes challenging. To address these limitations, we introduce SCorP, a novel framework capable of predicting surface-based correspondences directly from unsegmented images. By leveraging the shape prior learned directly from surface meshes in an unsupervised manner, the proposed model eliminates the need for an optimized shape model for training supervision. The strong shape prior acts as a teacher and regularizes the feature learning of the student network to guide it in learning image-based features that are predictive of surface correspondences. The proposed model streamlines the training and inference phases by removing the supervision for the correspondence prediction task while alleviating the linearity assumption. Experiments on the LGE MRI left atrium dataset and Abdomen CT-1K liver datasets demonstrate that the proposed technique enhances the accuracy and robustness of image-driven SSM, providing a compelling alternative to current fully supervised methods.

J. Johnson, L. McDonald, T. Tasdizen. “Improving uranium oxide pathway discernment and generalizability using contrastive self-supervised learning,” In Computational Materials Science, Vol. 223, Elsevier, 2024.


In the field of Nuclear Forensics, there exists a plethora of different tools to aid investigators when performing analysis of unknown nuclear materials. Many of these tools offer visual representations of the uranium ore concentrate (UOC) materials that include complementary and contrasting information. In this paper, we present a novel technique drawing from state-of-the-art machine learning methods that allows information from scanning electron microscopy (SEM) images to be combined to create digital encodings of the material that can be used to determine the material’s processing route. Our technique can classify UOC processing routes with greater than 96% accuracy in a fraction of a second and can be adapted to unseen samples at similarly high accuracy. The technique’s high accuracy and speed allow forensic investigators to quickly get preliminary results, while generalization allows the model to be adapted to new materials or processing routes quickly without the need for complete retraining of the model.
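Contrastive self-supervised objectives of the kind used to learn such encodings are typically InfoNCE-style losses; a minimal pure-Python sketch follows. The function names, temperature value, and toy vectors are illustrative assumptions, and the paper's exact loss and hyperparameters may differ:

```python
import math

def cosine_similarity(u, v):
    """Cosine similarity between two embedding vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def info_nce_loss(anchor, positive, negatives, temperature=0.1):
    """InfoNCE-style contrastive loss: the anchor embedding should be
    more similar to the positive (e.g., another view of the same material)
    than to any negative. Computed as cross-entropy over temperature-scaled
    similarities, with a log-sum-exp shift for numerical stability."""
    logits = [cosine_similarity(anchor, positive) / temperature]
    logits += [cosine_similarity(anchor, n) / temperature for n in negatives]
    m = max(logits)
    log_denom = m + math.log(sum(math.exp(l - m) for l in logits))
    return log_denom - logits[0]  # negative log-softmax of the positive pair
```

Minimizing this loss over augmented views of SEM images pulls images of the same processing route together in embedding space without requiring labels.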

O. Joshi, T. Skóra, A. Yarema, R.D. Rabbitt, T.C. Bidone. “Contributions of the individual domains of αIIbβ3 integrin to its extension: Insights from multiscale modeling,” In Cytoskeleton, 2024.


The platelet integrin αIIbβ3 undergoes long-range conformational transitions between bent and extended conformations to regulate platelet aggregation during hemostasis and thrombosis. However, how exactly αIIbβ3 transitions between conformations remains largely elusive. Here, we studied how transitions across bent and extended-closed conformations of αIIbβ3 integrin are regulated by effective interactions between its functional domains. We first carried out μs-long equilibrium molecular dynamics (MD) simulations of full-length αIIbβ3 integrins in bent and intermediate conformations, the latter characterized by an extended headpiece and closed legs. Then, we built heterogeneous elastic network models, perturbed inter-domain interactions, and evaluated their relative contributions to the energy barriers between conformations. Results showed that integrin extension emerges from: (i) changes in interfaces between functional domains; (ii) allosteric coupling of the head and upper leg domains with flexible lower leg domains. Collectively, these results provide new insights into integrin conformational activation based on short- and long-range interactions between its functional domains and highlight the importance of the lower legs in the regulation of integrin allostery.

T. Kataria, B. Knudsen, S.Y. Elhabian. “StainDiffuser: MultiTask Dual Diffusion Model for Virtual Staining,” Subtitled “arXiv preprint arXiv:2403.11340,” 2024.


Hematoxylin and Eosin (H&E) staining is the stain most commonly used for disease diagnosis and tumor recurrence tracking. Hematoxylin excels at highlighting nuclei, whereas eosin stains the cytoplasm. However, the H&E stain lacks details for differentiating different types of cells relevant to identifying the grade of the disease or response to specific treatment variations. Pathologists require special immunohistochemical (IHC) stains that highlight different cell types. These stains help in accurately identifying different regions of disease growth and their interactions with the cell’s microenvironment. The advent of deep learning models has made Image-to-Image (I2I) translation a key research area, reducing the need for expensive physical staining processes. Pix2Pix and CycleGAN are still the most commonly used methods for virtual staining applications. However, both suffer from hallucinations or staining irregularities when the H&E stain has less discriminative information about the underlying cells that IHC needs to highlight (e.g., CD3 lymphocytes). Diffusion models are currently the state-of-the-art models for image generation and conditional generation tasks. However, they require extensive and diverse datasets (millions of samples) to converge, which is less feasible for virtual staining applications. Inspired by the success of multitask deep learning models for limited dataset size, we propose StainDiffuser, a novel multitask dual diffusion architecture for virtual staining that converges under a limited training budget. StainDiffuser trains two diffusion processes simultaneously: (a) generation of cell-specific IHC stain from H&E and (b) H&E-based cell segmentation using coarse segmentation only during training. Our results show that StainDiffuser produces high-quality results for easier stains (CK8/18, an epithelial marker) and difficult stains (CD3, lymphocytes).

V. Koppelmans, M.F.L. Ruitenberg, S.Y. Schaefer, J.B. King, J.M. Jacobo, B.P. Silvester, A.F. Mejia, J. van der Geest, J.M. Hoffman, T. Tasdizen, K. Duff. “Classification of Mild Cognitive Impairment and Alzheimer's Disease Using Manual Motor Measures,” In Neurodegener Dis, 2024.
DOI: 10.1159/000539800
PubMed ID: 38865972


Introduction: Manual motor problems have been reported in mild cognitive impairment (MCI) and Alzheimer's disease (AD), but the specific aspects that are affected, their neuropathology, and their potential value for classification modeling are unknown. The current study examined whether multiple measures of motor strength, dexterity, and speed are affected in MCI and AD, related to AD biomarkers, and able to classify MCI or AD.

Methods: Fifty-three cognitively normal (CN), 33 amnestic MCI, and 28 AD subjects completed five manual motor measures: grip force, Trail Making Test A, spiral tracing, finger tapping, and a simulated feeding task. Analyses included: 1) group differences in manual performance; 2) associations between manual function and AD biomarkers (PET amyloid β, hippocampal volume, and APOE ε4 alleles); and 3) group classification accuracy of manual motor function using machine learning.

Results: Amnestic MCI and AD subjects exhibited slower psychomotor speed, and AD subjects had weaker dominant-hand grip strength than CN subjects. Performance on these measures was related to amyloid β deposition (both) and hippocampal volume (psychomotor speed only). Support vector classification discriminated well between control and AD subjects (areas under the curve of 0.73 and 0.77, respectively), but poorly discriminated MCI from controls or AD.

Conclusion: Grip strength and spiral tracing appear preserved, while psychomotor speed is affected in amnestic MCI and AD. The association of motor performance with amyloid β deposition and atrophy could indicate that this is due to amyloid deposition in- and atrophy of motor brain regions, which generally occurs later in the disease process. The promising discriminatory abilities of manual motor measures for AD emphasize their value alongside other cognitive and motor assessment outcomes in classification and prediction models, as well as potential enrichment of outcome variables in AD clinical trials.
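The area-under-the-curve values reported above can be computed directly from classifier scores via the Mann-Whitney formulation; a minimal sketch, illustrative only and not the study's actual pipeline:

```python
def auc(scores_neg, scores_pos):
    """Area under the ROC curve via the Mann-Whitney U statistic:
    the probability that a randomly chosen positive case (e.g., AD)
    scores higher than a randomly chosen control, ties counting half."""
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in scores_pos for n in scores_neg
    )
    return wins / (len(scores_pos) * len(scores_neg))
```

An AUC of 0.5 corresponds to chance-level discrimination, so values such as 0.73 and 0.77 indicate moderate but clearly better-than-chance separation.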

D. Lange, R. Judson-Torres, T.A. Zangle, A. Lex. “Aardvark: Composite Visualizations of Trees, Time-Series, and Images,” In IEEE Transactions on Visualization and Computer Graphics, IEEE, 2024.


How do cancer cells grow, divide, proliferate and die? How do drugs influence these processes? These are difficult questions that we can attempt to answer with a combination of time-series microscopy experiments, classification algorithms, and data visualization. However, collecting this type of data and applying algorithms to segment and track cells and construct lineages of proliferation is error-prone; and identifying the errors can be challenging since it often requires cross-checking multiple data types. Similarly, analyzing and communicating the results necessitates synthesizing different data types into a single narrative. State-of-the-art visualization methods for such data use independent line charts, tree diagrams, and images in separate views. However, this spatial separation requires the viewer of these charts to combine the relevant pieces of data in memory. To simplify this challenging task, we describe design principles for weaving cell images, time-series data, and tree data into a cohesive visualization. Our design principles are based on choosing a primary data type that drives the layout and integrates the other data types into that layout. We then introduce Aardvark, a system that uses these principles to implement novel visualization techniques. Based on Aardvark, we demonstrate the utility of each of these approaches for discovery, communication, and data debugging in a series of case studies.

M. Lisnic, Z. Cutler, M. Kogan, A. Lex. “Visualization Guardrails: Designing Interventions Against Cherry-Picking in Interactive Data Explorers,” Subtitled “Preprint,” 2024.


The growing popularity of interactive time series exploration platforms has made visualizing data of public interest more accessible to general audiences. At the same time, the democratized access to professional-looking explorers with preloaded data enables the creation of convincing visualizations with carefully cherry-picked items. Prior research shows that people use data explorers to create and share charts that support their potentially biased or misleading views on public health or economic policy and that such charts have, for example, contributed to the spread of COVID-19 misinformation. Interventions against misinformation have focused on post hoc approaches such as fact-checking or removing misleading content, which are known to be challenging to execute. In this work, we explore whether we can use visualization design to impede cherry-picking—one of the most common methods employed by deceptive charts created on data exploration platforms. We describe a design space of guardrails—interventions against cherry-picking in time series explorers. Using our design space, we create a prototype data explorer with four types of guardrails and conduct two crowd-sourced experiments. In the first experiment, we challenge participants to create cherry-picked charts. We then use these charts in a second experiment to evaluate the guardrails’ impact on the perception of cherry-picking. We find evidence that guardrails—particularly superimposing relevant primary data—are successful at encouraging skepticism in a subset of experimental conditions but come with limitations. Based on our findings, we propose recommendations for developing effective guardrails for visualizations.

Q.C. Nguyen, T. Tasdizen, M. Alirezaei, H. Mane, X. Yue, J.S. Merchant, W. Yu, L. Drew, D. Li, T.T. Nguyen. “Neighborhood built environment, obesity, and diabetes: A Utah siblings study,” In SSM - Population Health, Vol. 26, 2024.



Objective: This study utilizes innovative computer vision methods alongside Google Street View images to characterize neighborhood built environments across Utah.

Methods: Convolutional Neural Networks were used to create indicators of street greenness, crosswalks, and building type on 1.4 million Google Street View images. The demographic and medical profiles of Utah residents came from the Utah Population Database (UPDB). We implemented hierarchical linear models with individuals nested within zip codes to estimate associations between neighborhood built environment features and individual-level obesity and diabetes, controlling for individual- and zip code-level characteristics (n = 1,899,175 adults living in Utah in 2015). Sibling random effects models were implemented to account for shared family attributes among siblings (n = 972,150) and twins (n = 14,122).

Results: Consistent with prior neighborhood research, the variance partition coefficients (VPC) of our unadjusted models nesting individuals within zip codes were relatively small (0.5%–5.3%), except for HbA1c (VPC = 23%), suggesting a small percentage of the outcome variance is at the zip code level. However, the proportional change in variance (PCV) attributable to zip codes after the inclusion of neighborhood built environment variables and covariates ranged between 11% and 67%, suggesting that these characteristics account for a substantial portion of the zip code-level effects. Non-single-family homes (an indicator of mixed land use), sidewalks (an indicator of walkability), and green streets (an indicator of neighborhood aesthetics) were associated with reduced diabetes and obesity. Zip codes in the third tertile for non-single-family homes were associated with a 15% reduction (PR: 0.85; 95% CI: 0.79, 0.91) in obesity and a 20% reduction (PR: 0.80; 95% CI: 0.70, 0.91) in diabetes. This tertile was also associated with a BMI reduction of −0.68 kg/m2 (95% CI: −0.95, −0.40).

Conclusion: We observe associations between neighborhood characteristics and chronic diseases, accounting for biological, social, and cultural factors shared among siblings in this large population-based study.
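The VPC and PCV statistics used above reduce to simple ratios of variance components from the fitted multilevel models; a minimal sketch, with variable names chosen for illustration:

```python
def vpc(var_between, var_within):
    """Variance partition coefficient: the share of total outcome
    variance attributable to the grouping level (zip codes here)."""
    return var_between / (var_between + var_within)

def pcv(var_between_null, var_between_adjusted):
    """Proportional change in group-level variance after adding
    covariates: how much of the zip code-level variance the added
    built environment variables explain."""
    return (var_between_null - var_between_adjusted) / var_between_null
```

So a VPC of 5% with a PCV of 67% means zip codes explain little of the total variance, but the built environment measures explain most of what zip codes do contribute.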

Q.C. Nguyen, M. Alirezaei, X. Yue, H. Mane, D. Li, L. Zhao, T.T. Nguyen, R. Patel, W. Yu, M. Hu, D. Quistberg, T. Tasdizen. “Leveraging computer vision for predicting collision risks: a cross-sectional analysis of 2019–2021 fatal collisions in the USA,” In Injury Prevention, BMJ, 2024.


Objective The USA has higher rates of fatal motor vehicle collisions than most high-income countries. Previous studies examining the role of the built environment were generally limited to small geographic areas or single cities. This study aims to quantify associations between built environment characteristics and traffic collisions in the USA.

Methods Built environment characteristics were derived from Google Street View images and summarised at the census tract level. Fatal traffic collisions were obtained from the 2019–2021 Fatality Analysis Reporting System. Fatal and non-fatal traffic collisions in Washington DC were obtained from the District Department of Transportation. Adjusted Poisson regression models examined whether built environment characteristics are related to motor vehicle collisions in the USA, controlling for census tract sociodemographic characteristics.

Results Census tracts in the highest tertile of sidewalks, single-lane roads, streetlights and street greenness had 70%, 50%, 30% and 26% fewer fatal vehicle collisions compared with those in the lowest tertile. Street greenness and single-lane roads were associated with 37% and 38% fewer pedestrian-involved and cyclist-involved fatal collisions. Analyses with fatal and non-fatal collisions in Washington DC found streetlights and stop signs were associated with fewer pedestrian- and cyclist-involved vehicle collisions, while road construction had an adverse association.

Conclusion This study demonstrates the utility of using data algorithms that can automatically analyse street segments to create indicators of the built environment to enhance understanding of large-scale patterns and inform interventions to decrease road traffic injuries and fatalities.
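In an adjusted Poisson model of collision counts, a tertile indicator's coefficient translates into a percent change in expected collisions via the exponential link. A minimal sketch of that conversion (the coefficient value is hypothetical, chosen to mirror the 70% figure above):

```python
import math

def pct_fewer(beta):
    """Convert a Poisson regression coefficient for a binary indicator
    (e.g., highest vs. lowest tertile of sidewalks) into the percent
    reduction in the expected collision count, via the rate ratio exp(beta)."""
    rate_ratio = math.exp(beta)
    return (1.0 - rate_ratio) * 100.0

# A coefficient of ln(0.30) corresponds to a rate ratio of 0.30,
# i.e., 70% fewer expected fatal collisions in the highest tertile.
beta_sidewalks = math.log(0.30)  # hypothetical fitted coefficient
print(f"{pct_fewer(beta_sidewalks):.0f}% fewer collisions")
```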

R. Nihalaani, T. Kataria, J. Adams, S.Y. Elhabian. “Estimation and Analysis of Slice Propagation Uncertainty in 3D Anatomy Segmentation,” Subtitled “arXiv preprint arXiv:2403.12290,” 2024.


Supervised methods for 3D anatomy segmentation demonstrate superior performance but are often limited by the availability of annotated data. This limitation has led to a growing interest in self-supervised approaches in tandem with the abundance of available unannotated data. Slice propagation has emerged as a self-supervised approach that leverages slice registration as a pretext task to achieve full anatomy segmentation with minimal supervision. This approach significantly reduces the need for domain expertise, time, and the cost associated with building fully annotated datasets required for training segmentation networks. However, this shift toward reduced supervision via deterministic networks raises concerns about the trustworthiness and reliability of predictions, especially when compared with more accurate supervised approaches. To address this concern, we propose the integration of calibrated uncertainty quantification (UQ) into slice propagation methods, providing insights into the model’s predictive reliability and confidence levels. Incorporating uncertainty measures enhances user confidence in self-supervised approaches, thereby improving their practical applicability. We conducted experiments on three datasets for 3D abdominal segmentation using five UQ methods. The results illustrate that incorporating UQ improves not only model trustworthiness, but also segmentation accuracy. Furthermore, our analysis reveals various failure modes of slice propagation methods that might not be immediately apparent to end-users. This study opens up new research avenues to improve the accuracy and trustworthiness of slice propagation methods.

M. Parashar. “Enabling Responsible Artificial Intelligence Research and Development Through the Democratization of Advanced Cyberinfrastructure,” In Harvard Data Science Review, Special Issue 4: Democratizing Data, 2024.


Artificial intelligence (AI) is driving discovery, innovation, and economic growth, and has the potential to transform science and society. However, realizing the positive, transformative potential of AI requires that AI research and development (R&D) progress responsibly; that is, in a way that protects privacy, civil rights, and civil liberties, and promotes principles of fairness, accountability, transparency, and equity. This article explores the importance of democratizing AI R&D for achieving the goal of responsible AI and its potential impacts.

M. Parashar. “Everywhere & Nowhere: Envisioning a Computing Continuum for Science,” Subtitled “arXiv:2406.04480v1,” 2024.


Emerging data-driven scientific workflows are seeking to leverage distributed data sources to understand end-to-end phenomena, drive experimentation, and facilitate important decision-making. Despite the exponential growth of available digital data sources at the edge, and the ubiquity of non-trivial computational power for processing this data, realizing such science workflows remains challenging. This paper explores a computing continuum that is everywhere and nowhere – one spanning resources at the edges, in the core and in between, and providing abstractions that can be harnessed to support science. It also introduces recent research in programming abstractions that can express what data should be processed and when and where it should be processed, and autonomic middleware services that automate the discovery of resources and the orchestration of computations across these resources.