AI and HPC for Ecology

hpc
computational-ecology
deep-learning
infrastructure
teaching
A decade of HPC and AI/ML work for ecology — building and deploying deep learning models, developing and teaching technical courses, and consulting across USGS science programs.
Updated

June 8, 2026



San Diego Supercomputer Center

My first serious exposure to HPC came in 2000, when I attended the week-long NPACI Parallel Computing Institute at the San Diego Supercomputer Center (SDSC) as a graduate student at UC San Diego. NPACI — the National Partnership for Advanced Computational Infrastructure — was NSF’s predecessor to TeraGrid and eventually XSEDE. The institute introduced me to parallel programming concepts and HPC environments at a time when I was still doing field ecology.

Years later, James Sheppard and I partnered with SDSC researchers Bob Sinkovits and Glenn Lockwood to optimize and parallelize the 3D movement-based kernel density estimator (MKDE). The 3D MKDE method for estimating animal space use in three spatial dimensions is computationally intensive. Computing 3D MKDEs over a large dataset is embarrassingly parallel but slow enough to require distributed computing infrastructure. We implemented a parallelized version using OpenMP — parallelizing the core loops in C/C++ functions called from R — and ran it on Gordon, SDSC’s flash memory-based supercomputer, demonstrating it on eagle and condor telemetry data from coastal California. The result was a substantial speedup and a paper at XSEDE 2014. The mkde R package, which implements the estimator, is available on CRAN.
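The embarrassingly parallel structure described above can be illustrated with a minimal Python sketch. It is not the mkde implementation, which parallelizes C/C++ loops with OpenMP. It substitutes a plain Gaussian kernel for the movement-based kernel, and the names `density_slab` and `density_grid`, the toy data, and all parameter values are hypothetical. The point is only that each horizontal slab of the 3D grid can be computed independently, so any `map`-like function, serial or parallel, can distribute the work.

```python
import math
from concurrent.futures import ThreadPoolExecutor
from functools import partial


def density_slab(z, points, xs, ys, sigma):
    """Gaussian kernel density on one horizontal slab (fixed z) of a 3D grid.

    Stand-in kernel for illustration; the movement-based kernel in MKDE differs.
    """
    norm = len(points) * (sigma * math.sqrt(2.0 * math.pi)) ** 3
    slab = []
    for y in ys:
        row = []
        for x in xs:
            total = 0.0
            for px, py, pz in points:
                d2 = (x - px) ** 2 + (y - py) ** 2 + (z - pz) ** 2
                total += math.exp(-d2 / (2.0 * sigma ** 2))
            row.append(total / norm)
        slab.append(row)
    return slab


def density_grid(points, xs, ys, zs, sigma=1.0, map_fn=map):
    """Compute the full grid; slabs are independent, so map_fn may be parallel."""
    work = partial(density_slab, points=points, xs=xs, ys=ys, sigma=sigma)
    return list(map_fn(work, zs))


# Hypothetical toy data: three 3D locations and a coarse grid.
points = [(0.0, 0.0, 0.0), (1.0, 0.5, 0.2), (2.0, 1.0, 0.4)]
xs = [i * 0.5 for i in range(-4, 9)]
ys = [i * 0.5 for i in range(-4, 7)]
zs = [i * 0.5 for i in range(-4, 5)]

# Serial baseline.
serial = density_grid(points, xs, ys, zs)

# Same computation with the slabs handed to a pool of workers.
with ThreadPoolExecutor(max_workers=4) as pool:
    threaded = density_grid(points, xs, ys, zs, map_fn=pool.map)
```

Threads are used here only to keep the sketch short. In CPython, pure-Python loops like these do not speed up under threads because of the global interpreter lock; passing the `map` method of a `concurrent.futures.ProcessPoolExecutor` instead would give real parallelism, which is closer in spirit to what OpenMP does for compiled loops.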

Jeff Tracey (left) and James Sheppard (right) by the Trestles supercomputer, at the San Diego Supercomputer Center.
Robert Sinkovits (left) and James Sheppard (right) during a planning session at the San Diego Supercomputer Center.

Gordon supercomputer at SDSC. (Photo: Alan Decker / SDSC)

3D MKDE home ranges for a pair of California condors (Gymnogyps californianus), animated over a moving time window. There are four panels: the movement paths of both individuals (top left), both 3D MKDEs together (top right), and the separate 3D MKDE for each individual (bottom). *(Visualization: A. Chourasia and J. A. Tracey)*

USGS Advanced Research Computing

During my time at the USGS Western Ecological Research Center (WERC), I began working with Advanced Research Computing (ARC), a group dedicated to bringing high-performance computing (HPC) capability to USGS so that researchers can tackle larger problems. To accomplish this, ARC builds and maintains scientific computing resources, trains users, and provides user support.

ARC’s first supercomputer was Yeti — a system that set the foundation for what became a serious USGS HPC program. Yeti has since been decommissioned as it was used well beyond its service life.

Yeti was joined by Denali and Tallgrass, two Cray systems that brought substantially more compute power and enabled the kinds of large-scale AI/ML workflows that are now central to ecological research. I used Tallgrass extensively because it was designed for AI/ML workloads and had six nodes with NVIDIA Tesla V100 GPUs. More recently, ARC acquired the Hovenweep supercomputer as a replacement for Yeti.

Yeti — USGS ARC's first supercomputer. An IBM system that established the foundation for federal HPC in ecological research. Now decommissioned. (Photo: Jeff A. Tracey)
Denali (center, Denali mountain graphic) and Tallgrass (left, zebra-stripe panels) — USGS ARC Cray supercomputers. The USGS logo is visible on the Tallgrass cabinet at left. These systems supported large-scale ecological simulation, movement modeling, and deep learning workflows. (Photo: Jeff Falgout, USGS)

My work with ARC began while I was still at WERC, when I developed and taught an R for HPC workshop to introduce USGS researchers to parallel computing in R on ARC’s systems. As part of that teaching relationship, ARC supported my attendance at several annual Supercomputing (SC) conferences. At SC19, Natalya Rapstine and I presented a poster and extended abstract on deep learning methods for unsupervised clustering of telemetry data. The method used a deep recurrent autoencoder with LSTM units to encode movement sequences, followed by Deep Embedded Clustering (DEC) to classify move steps into behavioral modes without requiring labeled training data. We applied it to both simulated data and golden eagle GPS telemetry from San Diego County, running the experiments on Tallgrass’s NVIDIA V100 GPU nodes. DEC outperformed baseline K-means across all experiments.
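The clustering step can be sketched in a few lines of Python. This is not the code from the SC19 poster; it is a minimal illustration of the standard DEC formulation (Student's t soft assignments, a sharpened target distribution, and a KL-divergence loss), assuming the recurrent autoencoder has already mapped each move step to a low-dimensional embedding. The function names, the 2D embeddings, and the centroids are all hypothetical.

```python
import math


def soft_assignments(embeddings, centroids, alpha=1.0):
    """Student's t soft assignment q[i][j] of embedding i to cluster j."""
    q = []
    for z in embeddings:
        weights = []
        for mu in centroids:
            d2 = sum((a - b) ** 2 for a, b in zip(z, mu))
            weights.append((1.0 + d2 / alpha) ** (-(alpha + 1.0) / 2.0))
        total = sum(weights)
        q.append([w / total for w in weights])
    return q


def target_distribution(q):
    """Sharpened targets: p[i][j] proportional to q[i][j]**2 / soft cluster size."""
    n_clusters = len(q[0])
    freq = [sum(row[j] for row in q) for j in range(n_clusters)]
    p = []
    for row in q:
        weights = [row[j] ** 2 / freq[j] for j in range(n_clusters)]
        total = sum(weights)
        p.append([w / total for w in weights])
    return p


def kl_divergence(p, q):
    """KL(P || Q) summed over all steps and clusters (the DEC clustering loss)."""
    return sum(
        pij * math.log(pij / qij)
        for p_row, q_row in zip(p, q)
        for pij, qij in zip(p_row, q_row)
    )


# Hypothetical 2D embeddings of four move steps and two cluster centroids.
embeddings = [(0.1, 0.0), (0.3, 0.2), (2.9, 3.1), (3.2, 2.8)]
centroids = [(0.0, 0.0), (3.0, 3.0)]

q = soft_assignments(embeddings, centroids)
p = target_distribution(q)

# Hard label for each move step: the cluster with the largest soft assignment.
labels = [max(range(len(row)), key=row.__getitem__) for row in q]
loss = kl_divergence(p, q)
```

In a full DEC training loop, the loss would be minimized by gradient descent over both the encoder weights and the centroids, with the target distribution recomputed periodically. The centroids are typically initialized by running K-means on the embeddings, which is one reason K-means is a natural baseline for comparison.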

Presenting a poster with ARC colleagues at Supercomputing 19 (SC19). Natalya Rapstine is standing directly in front of the poster in conversation with a conference attendee. Janice G. and Jeff F. (right to left), the founding members of ARC, are at the right speaking with another attendee. (Photo: Jeff Falgout, USGS)

I formally joined ARC as a computational scientist during the pandemic. At ARC, I spent half a decade providing HPC and AI/ML support for ecological research: consulting, collaborating on projects, helping researchers get their code running on GPU-enabled supercomputers, and training scientists to build, train, and use deep learning models. The ARC team was a great group to work with, and I worked with some very dedicated, smart government scientists during my time there.

A significant part of the work was course development and teaching. The R for HPC workshop — covering scientific R and parallel computing on ARC systems — grew from a one-off offering into a course I taught quarterly. By FY23 it had reached 87 learners in a single fiscal year, with one session drawing 50 participants after wider promotion. After one of those sessions, a cybersecurity specialist told me the material covered as much content as a semester-long university course and that USGS was years ahead of what universities were teaching at the time.

I built the Introduction to Deep Learning and Image Classification workshop from scratch over two fiscal years. It was a three-day course covering neural network theory, TensorFlow and Keras workflows, GPU hardware, convolutional network architectures, regularization, image augmentation, hyperparameter optimization, and transfer learning from pretrained architectures — with more than five complete worked examples and a closing mini-competition. By FY24 I was teaching it six times a year; over FY23 alone it reached 101 learners. A restructured standalone introductory module was added in FY24 and the course moved onto DOI Talent.

I also developed an AI literacy talk for non-technical audiences — delivered under titles like “Deep Learning for Wildlife at USGS” and “Increase Your IQ on AI” — aimed at building institutional understanding of what these tools could and couldn’t do, not just how to use them. I delivered it seven times in FY24 to audiences ranging from working scientists to regional deputy directors and senior USGS leadership, including a standing-room session at the ITEM conference.

The last Supercomputing (SC) conference I attended in person as a member of ARC was SC23. During this meeting, ARC represented USGS at a booth in the exhibit hall, where we engaged with researchers, vendors, and other attendees.

USGS ARC booth at Supercomputing 23, Denver. Left to right: Phil O. and Jeff Tracey. (Photo: USGS)
USGS ARC booth at Supercomputing 23. Left to right: colleagues from Colorado School of Mines, Leah C., Janice G. (ARC team lead), Phil O. (Photo: USGS)

Selected ARC Projects

Consulting engagements at ARC were substantive, ongoing collaborations — not one-off questions. A few that I’m particularly proud of:

Pacific Walrus Camera Classifier — In collaboration with Anthony Fischbach and colleagues at USGS Alaska Science Center and USFWS, I built and trained convolutional neural network models to classify camera trap images from Pacific walrus haulout sites into four categories: no walrus, walrus only, carcass only, and carcass with walrus. The classifiers reached 95.4–95.6% accuracy. I also developed the code to index and manage large image datasets and ran systematic hyperparameter tuning across multiple architectures on Tallgrass. The model, training code, and training images are publicly available through USGS GitLab and ScienceBase.

Anuran Species Classifier (ARMI) — A multi-year collaboration with Jennifer Rowe, Meredith Driskin, Hardin Waddle, and the USGS Amphibian Research and Monitoring Initiative to build CNN-based classifiers for detecting anuran species from passive acoustic recordings. We met monthly and worked through multiple coding and debugging sessions over the course of the project. The bullfrog classifier reached 96.7% accuracy.

Wildlife Underpass Camera Trap Classification — I ran more than 4,500 training experiments on Tallgrass’s GPU nodes to develop a classifier for wildlife camera trap images from road underpass monitoring sites. The trained model was ultimately used to label over three million previously unlabeled images. This work continues — see the Wildlife Cameras project page.

Work in bioacoustics and wildlife camera systems is ongoing — see the Current Projects page.


Papers

Tracey JA, Sheppard JK, Lockwood GK, Chourasia A, Tatineni M, Fisher RN, Sinkovits RS (2014). Efficient 3D movement-based kernel density estimator and application to wildlife ecology. Proceedings of the 2014 Annual Conference on Extreme Science and Engineering Discovery Environment, pp. 1–8. https://doi.org/10.1145/2616498.2616522

Rapstine NI, Tracey JA, Gordon JM, Fisher RN (2019). Unsupervised Clustering of Telemetry Data. Proceedings of the International Conference for High Performance Computing, Networking, Storage and Analysis. IEEE Press. https://sc19.supercomputing.org/proceedings/tech_poster/tech_poster_pages/rpost221.html