MemoryUsage

Vivek Kale edited this page Dec 4, 2024 · 10 revisions

Tool Description

The tool provides a timeline of memory usage for each individual Kokkos Memory Space.

The tool is located at: https://github.com/kokkos/kokkos-tools/tree/develop/profiling/memory-usage

Compilation

Via Makefile

Type make inside the source directory. When compiling for a specific platform, modify the Makefile to use the correct compiler and compiler flags.

Via CMake

Alternatively, use the CMake build system. Create a build directory and change into it. Run ccmake .. to check that kp_memory_usage is in the list of profilers. Then run cmake ..; make -j; sudo make install to build and install the profiler.

Usage

This is a standard tool that does not yet support tool chaining. In Bash, point KOKKOS_TOOLS_LIBS at the tool library and then run the application:

export KOKKOS_TOOLS_LIBS={PATH_TO_TOOL_DIRECTORY}/kp_memory_usage.so
./application COMMANDS

This tool stores 24 bytes per allocation and deallocation for the timeline.
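As a rough illustration of what the 24-byte figure means for long runs, the following sketch estimates the timeline storage from an event count. It assumes that each allocation and each deallocation is recorded as one 24-byte event; the constant and function names are hypothetical and are not part of the tool.

```cpp
#include <cstddef>

// 24 bytes per recorded event, as stated in the text above.
// Assumption: every allocation and every deallocation counts as one event.
constexpr std::size_t kBytesPerEvent = 24;

// Hypothetical helper: estimated timeline storage in bytes for a run with
// the given number of allocations and deallocations.
constexpr std::size_t timeline_overhead_bytes(std::size_t allocations,
                                              std::size_t deallocations) {
  return (allocations + deallocations) * kBytesPerEvent;
}
```

Under that assumption, a run with one million allocations that are all freed again would need about 48,000,000 bytes for the timeline.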

Output

The MemoryUsage tool generates one file with the utilization timeline for each active Kokkos Memory Space. The file names have the form HOSTNAME-PROCESSID-MEMSPACE.memspace_usage. The tool also generates output on the total memory transferred between all Memory Spaces.
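To locate the files from a script or a test harness, the name can be rebuilt from the pattern above. The sketch below is only an illustration: the helper names are hypothetical, and it assumes that HOSTNAME and PROCESSID correspond to the POSIX gethostname and getpid values, which the tool's source should be checked against.

```cpp
#include <string>
#include <unistd.h>  // gethostname, getpid (POSIX)

// Hypothetical helper: builds HOSTNAME-PROCESSID-MEMSPACE.memspace_usage
// from explicit components.
std::string memspace_usage_filename(const std::string& hostname, long pid,
                                    const std::string& memspace) {
  return hostname + "-" + std::to_string(pid) + "-" + memspace +
         ".memspace_usage";
}

// Hypothetical helper: same pattern, filled in for the calling process.
// Assumption: the tool uses the same host name and process id values.
std::string memspace_usage_filename_for_this_process(
    const std::string& memspace) {
  char host[256] = {0};
  if (gethostname(host, sizeof(host) - 1) != 0) {
    host[0] = '\0';  // fall back to an empty host name on failure
  }
  return memspace_usage_filename(host, static_cast<long>(getpid()), memspace);
}
```

For example, memspace_usage_filename("node01", 4242, "Cuda") yields node01-4242-Cuda.memspace_usage, where node01 and 4242 are made-up values.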

Example

Consider the following code:

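#include <Kokkos_Core.hpp>

// Views of int in three different CUDA memory spaces.
typedef Kokkos::View<int*, Kokkos::CudaSpace> a_type;
typedef Kokkos::View<int*, Kokkos::CudaUVMSpace> b_type;
typedef Kokkos::View<int*, Kokkos::CudaHostPinnedSpace> c_type;

int main() {
  Kokkos::initialize();
  {
    int N = 10000000;
    // Each iteration allocates A, B, and C and frees them at scope exit.
    for (int i = 0; i < 2; i++) {
      a_type a("A", N);
      {
        b_type b("B", N);
        c_type c("C", N);
        // B and C are host-accessible, so they are filled from the host.
        for (int j = 0; j < N; j++) {
          b(j) = 2 * j;
          c(j) = 3 * j;
        }
      }  // B and C are deallocated here.
    }  // A is deallocated at the end of each iteration.
  }
  Kokkos::finalize();
}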

This will produce the following output:

HOSTNAME-PROCESSID-Cuda.memspace_usage

# Space Cuda
# Time(s)  Size(MB)   HighWater(MB)   HighWater-Process(MB)
0.311913 0.0 0.0 81.8
0.312108 0.0 0.0 81.8
0.312667 38.2 38.2 81.8
0.379795 0.0 38.2 158.1
0.380185 38.2 38.2 158.1
0.444391 0.0 38.2 158.1

HOSTNAME-PROCESSID-CudaUVM.memspace_usage

# Space CudaUVM
# Time(s)  Size(MB)   HighWater(MB)   HighWater-Process(MB)
0.317260 38.1 38.1 81.8
0.377285 0.0 38.1 158.1
0.384785 38.1 38.1 158.1
0.441988 0.0 38.1 158.1

HOSTNAME-PROCESSID-CudaHostPinned.memspace_usage

# Space CudaHostPinned
# Time(s)  Size(MB)   HighWater(MB)   HighWater-Process(MB)
0.311749 0.0 0.0 81.8
0.335289 38.1 38.1 120.0
0.368485 0.0 38.1 158.1
0.400073 38.1 38.1 158.1
0.433218 0.0 38.1 158.1

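Each line of these files reports four values: Time, Size, HighWater, and HighWater-Process. Size is the total memory currently held by Kokkos allocations in that memory space. A high water mark is the largest amount of memory that has been in use up to that point; HighWater is the high water mark of Size for that memory space. HighWater-Process is the maximum resident set size (max RSS) of the process as defined by Linux. In the example, each view holds 10,000,000 values of type int at 4 bytes each, which is 40,000,000 bytes or about 38.1 MB if 1 MB is taken as 2^20 bytes; this is consistent with the Size values in the CudaUVM and CudaHostPinned files.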
