Published in Computing Systems and Hardware for emerging applications (especially Machine Learning, Deep Learning)
NVIDIA GPUs story
In this comprehensive post, I’ve meticulously compiled the captivating journey of NVIDIA GPUs, tracing their humble beginnings to their…
Mar 25
Published in Computing Systems and Hardware for emerging applications (especially Machine Learning, Deep Learning)
“A Study of Checkpointing in Large Scale Training of Deep Neural Networks” paper summary
Introduction
Oct 14, 2023
Learning about your system’s hardware setup within the Ubuntu terminal
In this post, I gathered the commands needed to learn about the system’s CPU, GPU, system memory, disk, and how CPUs and GPUs are…
Sep 21, 2023
Published in Computing Systems and Hardware for emerging applications (especially Machine Learning, Deep Learning)
“AntMan: Dynamic Scaling on GPU Clusters for Deep Learning” paper summary
Introduction
Aug 11, 2023
Published in Computing Systems and Hardware for emerging applications (especially Machine Learning, Deep Learning)
“Looking beyond GPUs for DNN Scheduling on Multi-Tenant Clusters” paper summary
Introduction
Aug 7, 2023
Metrics in Machine Learning
This post reviews machine learning and its different subbranches very concisely as an introduction, then delves into metrics. It is important to…
Jun 15, 2023
Profiling PyTorch model with NVIDIA Nsight Systems gives Error when MIG is enabled for any of the…
This post aims to spread the word about disabling MIG mode on all GPUs when using Nsight Systems for profiling a deep learning model…
May 3, 2023
Published in Computing Systems and Hardware for emerging applications (especially Machine Learning, Deep Learning)
Profiling a Training Task with PyTorch Profiler and viewing it on Tensorboard
This post briefly shows, with an example, how to profile a model's training task with the PyTorch profiler. Developers use…
May 3, 2023
Published in Computing Systems and Hardware for emerging applications (especially Machine Learning, Deep Learning)
The Extinction of Programming
When the terms “traditional programming” or “programming” are used, we usually mean using programming languages like Python, Java, C#, etc. to…
Feb 18, 2023
Published in Computing Systems and Hardware for emerging applications (especially Machine Learning, Deep Learning)
Setting up Conda, TensorFlow, and PyTorch for your Deep Learning Experimentation
This post reviews how to build an experimentation environment for deep learning with conda. I compile the commands to ease the purpose of…
Jan 27, 2023