
Hi Andressa, sorry I somehow missed responding to you. The models use a lot of GPU memory during inference, and that memory is organized as pages. NVIDIA's Unified Memory feature seamlessly swaps these pages between GPU and CPU memory, providing a larger effective memory space to work with. As a result, inference and tuning can continue without stopping or crashing due to a memory overflow. Hope that helps.
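To make the idea concrete, here is a minimal sketch of what a Unified Memory allocation looks like in CUDA. It assumes a CUDA toolchain and compatible GPU; the size and names are illustrative, not taken from any specific model runtime.

```cuda
#include <cuda_runtime.h>
#include <cstdio>

int main() {
    // Illustrative size: may exceed the RAM of a small GPU on purpose.
    const size_t n = 1ul << 28;  // ~268M floats, about 1 GiB
    float *pages = nullptr;

    // cudaMallocManaged allocates Unified Memory: a single pointer that
    // is valid on both the CPU and the GPU. The driver migrates pages
    // between host and device on demand, so an allocation larger than
    // GPU RAM spills to host memory instead of failing outright.
    if (cudaMallocManaged(&pages, n * sizeof(float)) != cudaSuccess) {
        fprintf(stderr, "allocation failed\n");
        return 1;
    }

    // Touching the buffer on the CPU populates pages in host memory;
    // a subsequent kernel launch would fault them over to the GPU
    // as it accesses them.
    for (size_t i = 0; i < n; ++i)
        pages[i] = 1.0f;

    cudaFree(pages);
    return 0;
}
```

The key point is that paging happens transparently: the inference code keeps using one pointer, and the driver decides which pages live on the GPU at any moment.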

--

Written by A B Vijay Kumar

IBM Fellow, Master Inventor, Mobile, RPi & Cloud Architect & Full-Stack Programmer