% TODO: put in numbers and impact
% TODO: explain role in each
% TODO: check for grammar
% TODO: get checked by recruiters
\newcommand{\myExpOne}{
%\item Technical lead for HPC Tools and Runtime Systems R\&D for Sandia's Exascale Software.
\item Pathfinding and software engineering for tools for Kokkos integrated with (1) HPC performance monitoring and feedback via LDMS and (2) PMPI and adaptive runtime systems for MPI.
%\item Implementing and specifying tooling API features for Kokkos Tools for OpenMP, OpenACC, MPI, and C++ standards.
\item Developed AI-assisted HPC Tools through LLMs (coderosetta.com) and autotuning (TAU+APEX) for Kokkos applications run on NVIDIA GPUs, resulting in a poster presentation at GTC 2025.
\item Research and pathfinding on the use of AI chips, e.g., Cerebras WSE-3, for science simulations.
\item Submitted two proposals on correctness tools for HPC, each with \$1.5M in funding for 3 years.
}
\subsection*{Technical Leadership and Industry-grade Open-source HPC Software}
\textbf{Sandia National Laboratories}\\
{Principal Member of Technical Staff II} \hfill \textit{July 2024 - Present}
%\vspace{-0.02in}
\noindent
\begin{itemize}[itemsep=-0.1em]\onlyitems[include={1,2}]
\myExpOne
\end{itemize}
\newcommand{\myExpTwo}{
\item Developed and maintained Kokkos Tools, covering the CMake and Spack build systems, tooling overheads, CI/CD, auto-tuning, and nvtx/roctx/vtune integration, leading to 15 merged GitHub PRs.
% \item Contributed to autotuning features to the Kokkos 4.5 release.
\item Developed a debugging tool that detected 7 common Kokkos user bugs by analyzing LLVM IR of Kokkos programs via symbolic execution, leading to a paper at SC24's Correctness workshop.
% \item Implemented prototype LLVM OpenMP feature for index set splitting of an OpenMP loop, and to OpenMP 6.0's new split directive.
\item Implemented new features in LLVM OpenMP, leading to a 1.2x speedup for a Kokkos-OpenMP+CUDA benchmark, 3 OpenMP 6.0 features, and 19 feature proposals for OpenMP 6.1.
}
\noindent
{Senior Member of Technical Staff} \hfill \textit{August 2022 - June 2024}
%\vspace{-0.02in}
\begin{itemize}[itemsep=-0.1em]
\myExpTwo
\end{itemize}
\newcommand{\myExpThree}{
% \item Contributed to developing an LLVM OpenMP implementation, specifically the OpenMP implementation's compiler and its runtime, targeted for Department of Energy's upcoming Exascale Supercomputer platforms.
\item Implemented OpenMP user-defined multi-GPU scheduling for LLVM, offering 2.1x speedup over using MPI parallelization, leading to papers at IWOMP 2020 and BCB 2021.
\item Implemented performance optimizations in LLVM for OpenMP asynchronous GPU offloading that achieved a 1.2x speedup, leading to a paper at SC22's HiPar workshop.
\item Developed performance benchmarks that evaluated 5 major vendor OpenMP GPU implementations, leading to an ACM journal paper and an IWOMP 2021 workshop paper.
% \item Developed benchmarks and evaluating OpenMP implementations, e.g., LLVM's OpenMP, NVIDIA's OpenMP, on Exascale Supercomputers.
\item Demonstrated technical leadership as technical project manager for the ECP SOLLVE project, submitting 12 ECP milestone reports, organizing 7 GPU hackathons, and defining 3 project KPIs.
%and voting in 5 OpenMP Committee meetings.
}
\noindent
\textbf{Brookhaven National Laboratory}\hfill
{Assistant Computational Scientist} \hfill \textit{May 2019 - August 2022}
\begin{itemize}[itemsep=-0.1em]
\myExpThree
\end{itemize}
\subsection*{HPC Software Development and Performance Engineering}
\newcommand{\myExpFour}{
\item Implemented User-defined Loop Schedules (UDS) for OpenMP and RAJA via a prototype library for LLVM and GCC, leading to a paper at IWOMP 2018 and 3 GitHub PRs merged in Charm++.
\item Performance analysis and optimization of MPI+CUDA science simulation applications on NVIDIA GPUs via CUPTI and auto-tuning, leading to a 1.4x speedup of an application for chip design.
\item Developed novel and efficient multi-level loop schedulers in Charm++, leading to a 1.2x speedup on the particle-in-cell benchmark PRK code and a Best Poster Candidate at SC18.
%\item Added the UDS feature to RAJA and Charm++'s CkLoop, with 1 github PR merged in Charm++.
}
\noindent
\textbf{USC/ISI + Charmworks, Inc.}\hfill
{Software Engineer} \hfill \textit{Jan 2016 - May 2019}
\vspace{-0.1in}
\begin{itemize}[itemsep=-0.1em]
\myExpFour
\end{itemize}
\comments{
\newcommand{\myExpFive}{
\item Performance analysis and optimization of a 3-D image reconstruction application on NVIDIA GPUs via CUPTI and auto-tuning, leading to a performance-enhanced CUDA version of the application.
\item Developed tuning support for coordinated loop scheduling and load balancing in Charm++, leading to a 1.2x speedup on a particle-in-cell benchmark code and a Best Poster Candidate at SC18.
}
\noindent
\textbf{USC - Information Sciences Institute}\hfill
\textit{Computer Scientist} \hfill \textit{Dec 2016 - May 2018}
\begin{itemize}[itemsep=-0.1em]
\myExpFive
\end{itemize}
% accomplished x as measured by y, by doing z
\newcommand{\myExpSix}{
\item Extended Charm++ to offer a novel runtime system capability of coordinating inter-node load balancing and intra-node loop scheduling, leading to 2 GitHub PRs merged in Charm++.
}
\noindent
\textbf{Charmworks, Inc.}\hfill
\textit{Software Developer} \hfill \textit{Jan 2016 - Dec 2016}
%\vspace*{-0.02in}
\begin{itemize}
\myExpSix
\end{itemize}
\noindent
\textbf{University of Illinois}\hfill
\textit{Postdoctoral Associate} \hfill \textit{Jul 2015 - Dec 2015}
\begin{itemize}[itemsep=-0.1em]
\item Sped up a plasma-physics Fortran MPI+OpenACC code by 1.2x via a combination of GPU offload optimizations and loop transformations on an NVIDIA K80 GPU.
\end{itemize}
}
\noindent
\textbf{LLNL + UIUC}\hfill Researcher \hfill \textit{Jan 2010 - Dec 2015}
\vspace*{-0.1in}
\begin{itemize}[itemsep=-0.1em]
%\item Measured MPI communication delays for micro-benchmarks codes run on supercomputers and worked to find tools to measure dequeue overheads of OpenMP loop schedulers.
%\item Created a software system for automated performance optimization and application programmer usability of low-overhead hybrid scheduling
%strategies.
\item Implemented a ROSE-based compiler pass and PMPI-based runtime system for MPI+OpenMP applications to use loop scheduling techniques, leading to a 1.4x speedup on a multi-core cluster.
\item Developed a prototype for MPICH shared-memory extensions, leading to a paper with 370+ citations.
\item Implemented multi-core and GPU performance optimizations for domains of linear algebra, blood flow, fusion, and combustion, leading to 2 papers at IPDPS.
%\item Assessed further opportunities for performance improvement of low-overhead schedulers, including improvement of spatial locality
%of low-overhead schedulers.
\end{itemize}
\subsection*{General Software Engineering}
\noindent
\textbf{Proteus Technologies + Wolfram} \hfill {Software Developer} \hfill \textit{Aug 2007 - Aug 2008}
\vspace*{-0.1in}
\begin{itemize}[itemsep=-0.1em]
\item Developed and tested service-oriented software to monitor the health of a large-scale distributed system, leading to an internal white paper and software package.
\item Implemented functionality in Mathematica for users to send emails from within a kernel, via sendmail and TLS, leading to a software feature in Mathematica.
%\item Developed company standards for software development through system requirements specifications, Design Documentation.
%\item Designed and implemented algorithms for power management of clusters, leading to a white paper
\end{itemize}