A fast GPU memory copy library based on NVIDIA GPUDirect RDMA technology
Thin, unified, C++-flavored wrappers for the CUDA APIs
Training neural networks in TensorFlow 2.0 with 5x less memory
A Toolkit for Training, Tracking, Saving Models and Syncing Results
A memory profiler for NVIDIA GPUs to explore memory inefficiencies in GPU-accelerated applications.
OpenCV & Spout C++ library providing shared GPU memory and processing.
Rust embedded things running on the seL4 microkernel for the Raspberry Pi 3
A tiny, useful command-line tool that shows per-user GPU usage and the PIDs running on each GPU, providing more detail than nvidia-smi/gpustat
A simple tool to find out GPU VRAM requirements for running LLMs
Demonstration of generating mini-batches in TensorFlow from GPU memory.
A fork of Kubernetes with support of schedulable resource of NVIDIA GPU memory
Dynamic GPU Layer Swapping: Train large models on consumer GPUs with intelligent memory management
A CLI tool for estimating GPU VRAM requirements for Hugging Face models, supporting various data types, parallelization strategies, and fine-tuning scenarios like LoRA.
📊 A command line monitoring tool (graph) for NVIDIA GPUs
Accurate VRAM calculator for Local LLMs (Llama 4, DeepSeek V3, Qwen 2.5). Calculates GGUF quantization, GQA context overhead, and offloading limits
GPU memory-efficient training for PyTorch - 90%+ memory savings through gradient compression
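Several of the tools above (the VRAM requirement tools and calculators) reduce to the same back-of-envelope arithmetic: model weights times bytes per parameter, plus a runtime overhead margin. A minimal sketch of that estimate, assuming a flat 20% overhead factor; the function name and constants are illustrative, not the API of any listed tool:

```python
def estimate_llm_vram_gb(num_params: float,
                         bytes_per_param: float,
                         overhead: float = 1.2) -> float:
    """Rough inference-time VRAM estimate in gigabytes (1 GB = 1e9 bytes).

    num_params:      total model parameters (e.g. 7e9 for a 7B model)
    bytes_per_param: dtype size (4 for fp32, 2 for fp16/bf16, 1 for int8)
    overhead:        assumed multiplier for KV cache, activations, buffers
    """
    weight_bytes = num_params * bytes_per_param
    return weight_bytes * overhead / 1e9

# A 7B-parameter model in fp16: 7e9 * 2 bytes * 1.2 = 16.8 GB
print(estimate_llm_vram_gb(7e9, 2))   # → 16.8
```

Real calculators refine this with quantization formats (e.g. GGUF), GQA-aware context overhead, and per-layer offloading, but the weights term dominates for most models.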