\textbf{Brookhaven National Laboratory $\>$$\>$$\>$$\>$Computer Scientist$\>$$\>$$\>$$\>$May 2019 - present}
\begin{itemize}
%\item Broadly, developing novel performance tuning techniques for MPI+OpenMP applications for emerging supercomputers, with a focus on new and challenging computational simulations and next-generation computer architectures.
\item Contributing to the development of LLVM's OpenMP implementation, specifically its compiler support and runtime, targeted at the Department of Energy's upcoming exascale supercomputer platforms.
\item Designing and implementing OpenMP task-to-multi-GPU scheduling strategies to improve within-node load balancing of applications running on supercomputers with multiple GPUs per node.
\item Developing tunable locality-aware loop scheduling strategies, and more generally user-defined loop schedules, in LLVM's OpenMP implementation, in the context of MPI+OpenMP applications running on supercomputers with multicore processors and GPUs.
\item Contributing to the OpenMP Language Committee to support OpenMP parallelization across a node's multiple GPUs for C, C++, and Fortran, and to support user-defined schedules in OpenMP.
\item Developing benchmarks and evaluating OpenMP implementations, e.g., LLVM's OpenMP and NVIDIA's OpenMP, on exascale supercomputers.
\item Organizing, leading, and participating in hackathons on using OpenMP on the Department of Energy's exascale supercomputers.
\item Serving as Technical Project Manager for the DoE Exascale Computing Project's SOLLVE project and representing Brookhaven National Laboratory on the OpenMP Architecture Review Board.
\end{itemize}
%\dates{June 2018 - April 2019}
%\location{Champaign, Illinois}
%\title{Software Developer}
%\employer{Charmworks, Inc.}
\textbf{Charmworks, Inc. $\>$$\>$$\>$$\>$Software Developer$\>$$\>$$\>$$\>$June 2018 - April 2019}
%\begin{position}
\vspace{0.0in}
\begin{itemize}
\item Collaborated with Lawrence Livermore National Lab on a proposal for a synergistic loop scheduling and load balancing strategy.
\item Worked on making user-defined loop scheduling portable across different parallel programming libraries, in collaboration with Oak Ridge National Lab through the DoE Exascale Computing Project.
\item Added examples of loop scheduling in OpenMP to the Examples section of the OpenMP Specification.
\item Worked on an NSF SBIR startup proposal for loop scheduling on desktop computers.
\item Collaborated on developing a proposal, based on an OpenMPCon 2017 paper, to add user-defined schedules to the OpenMP specification; presented the proposal at the OpenMP F2F in Santa Clara and at the following F2F in Toronto.
\item Worked on papers on user-defined loop scheduling for publication.
\item Assisted with pitch and marketing slides for the Charm++ software, and provided feedback on Charm++ tutorials.
\item Integrated a shared memory library for sophisticated loop scheduling strategies, including some based on my dissertation, into the current version of Charm++.
%item Comparing performance of a loop scheduling strategy available in the integrated shared memory library with the perfor\
%mance of the corresponding loop scheduling strategy available in LLVM’s OpenMP library.
\end{itemize}
%\end{position}
\textbf{University of Southern California$\>$$\>$$\>$$\>$Computer Scientist$\>$$\>$$\>$$\>$Dec. 2016 - Jun. 2018}
\vspace*{-0.0in}
\begin{itemize}
\item Designed techniques that combine loop scheduling and load balancing to improve performance of scientific applications.
\item Worked with the OpenMP Language Committee to support user-defined loop schedules in OpenMP.
\item Translated an X-ray tomography code written in MATLAB to C and then parallelized it to run on a supercomputer whose nodes have GPGPUs.
\item Worked on modifications to the LLVM compiler to support new OpenMP loop schedules.
% \item Worked on ensuring external network infrastructure to support transfer of application code's input data files were adequate for an application code's efficient execution using the Globus Toolkit.
%\item \small Managing a git repository for a team working on
%performance optimizations of the application program.
\item Worked in a team to manage computational performance aspects of running an application program involving Fast Fourier Transform and image reconstruction algorithms.
%\item \small Doing optimizations for MPI+CUDA application code involving low-overhead loop scheduling and loop optimizations such as loop unrolling.
%\item \small Working on transformations in LLVM.
\end{itemize}
%TODO: adaptive VS hybrid VS ...
\textbf{Charmworks, Inc.$\>$$\>$$\>$$\>$Developer$\>$$\>$$\>$$\>$Jan. 2016 - Nov. 2016}
\vspace*{-0.0in}
\begin{itemize}
\item Implemented mixed static/dynamic loop scheduling
strategies within Charm++'s thread scheduling library.
%TODO: consider adding 'including in cloud environments' the end of
%the sentence.
%TODO: make paragraph
\item Helped improve the portability of Charm++ across a variety of platforms.
\item Assisted with business aspects of a high-tech startup.
\end{itemize}
\textbf{University of Illinois$\>$$\>$$\>$$\>$Postdoctoral Associate$\>$$\>$$\>$$\>$Jul. 2015 – Dec. 2015}
\vspace*{-0.0in}
\begin{itemize}
\item Developed a library that allows application programmers to use strategies from my dissertation.
\item Adapted a plasma physics application code to work on a GPGPU processor and an Intel Xeon Phi.
\item Incorporated over-decomposition and locality-aware scheduling into strategies from my dissertation.
\end{itemize}
\textbf{Lawrence Livermore Nat’l Lab$\>$$\>$$\>$$\>$Lawrence Scholar$\>$$\>$$\>$$\>$Feb. 2012 – Jun. 2014}
\vspace*{-0.0in}
\begin{itemize}
\item Measured MPI communication delays for micro-benchmark codes run on supercomputers and worked to find tools for measuring the dequeue overheads of OpenMP loop schedulers.
\item Created a software system that automates performance optimization of low-overhead hybrid scheduling strategies and improves their usability for application programmers.
\item Developed a ROSE-based custom compiler for automatically transforming MPI+OpenMP applications to use low-overhead scheduling
techniques and runtime.
\item Assessed further opportunities for performance improvement of low-overhead schedulers, including improving their spatial locality.
\end{itemize}