How do I check if PyTorch is using the GPU?

小开

Best answer

These functions should help:

>>> import torch


>>> torch.cuda.is_available()
True


>>> torch.cuda.device_count()
1


>>> torch.cuda.current_device()
0


>>> torch.cuda.device(0)
<torch.cuda.device at 0x7efce0b03be0>


>>> torch.cuda.get_device_name(0)
'GeForce GTX 950M'

This tells us:

CUDA is available and can be used by one device.
Device 0 refers to the GPU GeForce GTX 950M, and it is currently selected by PyTorch.
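If you have more than one GPU, the same calls can be wrapped in a loop. This is only a minimal sketch: the helper name `describe_cuda_devices` is my own, not part of PyTorch, and it only uses the functions shown above.

```python
import torch


def describe_cuda_devices():
    """Return one line of text per CUDA device that PyTorch can see."""
    if not torch.cuda.is_available():
        return ["CUDA is not available; PyTorch will run on the CPU."]
    current = torch.cuda.current_device()
    lines = []
    for index in range(torch.cuda.device_count()):
        # Mark the device that PyTorch currently has selected.
        marker = " (current)" if index == current else ""
        lines.append(f"cuda:{index}: {torch.cuda.get_device_name(index)}{marker}")
    return lines


for line in describe_cuda_devices():
    print(line)
```

On a CPU-only machine this prints a single line saying CUDA is not available instead of raising an error.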

小开

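After you start running the training loop, if you want to manually watch from the terminal whether your program is using the GPU resources and to what extent, you can simply use watch, as in: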

$ watch -n 2 nvidia-smi

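This will refresh the usage stats every 2 seconds until you press ctrl+c.

If you need more control over the GPU stats, you can use a more sophisticated version of nvidia-smi with --query-gpu=.... Below is a simple example: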

$ watch -n 3 nvidia-smi --query-gpu=index,gpu_name,memory.total,memory.used,memory.free,temperature.gpu,pstate,utilization.gpu,utilization.memory --format=csv

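This prints the requested statistics in CSV format, refreshed every 3 seconds.

Note: There should not be any space between the comma-separated query names in --query-gpu=.... Otherwise those values will be ignored and no stats are returned.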
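If you want to post-process that CSV output in Python, something along these lines might work. It is a sketch under assumptions: the `SAMPLE` text is made up for illustration (real headers and values depend on your GPU and driver), and `parse_nvidia_smi_csv` is a hypothetical helper name.

```python
import csv
import io

# Hypothetical sample of what `nvidia-smi --query-gpu=... --format=csv` prints;
# real values and exact column headers depend on your GPU and driver.
SAMPLE = """index, name, memory.total [MiB], memory.used [MiB], utilization.gpu [%]
0, Tesla K80, 11441 MiB, 312 MiB, 7 %
"""


def parse_nvidia_smi_csv(text):
    """Turn nvidia-smi CSV text into a list of dicts keyed by column header."""
    # skipinitialspace drops the blank that follows each comma in the output.
    reader = csv.DictReader(io.StringIO(text), skipinitialspace=True)
    return [{key.strip(): value.strip() for key, value in row.items()} for row in reader]


for gpu in parse_nvidia_smi_csv(SAMPLE):
    print(gpu["index"], gpu["name"], gpu["utilization.gpu [%]"])
```

To use it on live data you would feed it the text captured from the nvidia-smi command above (without watch), for example via `subprocess.run(..., capture_output=True, text=True).stdout`.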

Also, you can check whether your PyTorch installation detects your CUDA installation correctly by doing:

In [13]: import  torch


In [14]: torch.cuda.is_available()
Out[14]: True

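A True status means that PyTorch is configured correctly and is able to use the GPU, although you still have to move/place the tensors onto it with the necessary statements in your code.

If you want to do this inside Python code, then look into this module:

https://github.com/jonsafari/nvidia-ml-py or on PyPI here: https://pypi.python.org/pypi/nvidia-ml-py/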

小开

Create a tensor on the GPU as follows:

$ python
>>> import torch
>>> print(torch.rand(3,3).cuda())

Do not quit; open another terminal and check whether the python process is using the GPU:

$ nvidia-smi

小开

On the Get Started page of the official site, you can check whether the GPU is available to PyTorch like this:

import torch
torch.cuda.is_available()

Reference: PyTorch | Get Started

小开

As it hasn't been proposed here, I'm adding a method using torch.device, as this is quite handy, also when initializing tensors on the correct device.

import torch

# setting device on GPU if available, else CPU
device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
print('Using device:', device)
print()

# Additional info when using cuda
if device.type == 'cuda':
    print(torch.cuda.get_device_name(0))
    print('Memory Usage:')
    print('Allocated:', round(torch.cuda.memory_allocated(0)/1024**3,1), 'GB')
    print('Cached:   ', round(torch.cuda.memory_reserved(0)/1024**3,1), 'GB')

Edit: torch.cuda.memory_cached has been renamed to torch.cuda.memory_reserved. So use memory_cached for older versions.

Output:

Using device: cuda


Tesla K80
Memory Usage:
Allocated: 0.3 GB
Cached:    0.6 GB

As mentioned above, using device it is possible to:

Move tensors to the respective device:
```
torch.rand(10).to(device)
```
Create a tensor directly on the device:
```
torch.rand(10, device=device)
```

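This makes switching between CPU and GPU easy without changing the actual code.

Edit:

As there have been some questions and confusion about cached and allocated memory, I'm adding some additional information about it:

torch.cuda.max_memory_cached(device=None)

Returns the maximum GPU memory managed by the caching allocator, in bytes, for a given device.

torch.cuda.memory_allocated(device=None) Returns the current GPU memory usage by tensors, in bytes, for a given device.

You can either directly hand over a device as specified further above, or you can leave it as None and it will use current_device().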
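To make the memory functions a bit easier to use, here is a small sketch that collects them in one place. The helper name `memory_report` and the dictionary keys are my own choice; I use the newer `memory_reserved` / `max_memory_reserved` names, so on old PyTorch versions you would swap in the `*_cached` variants.

```python
import torch


def memory_report(device=None):
    """Return allocated/reserved GPU memory in GiB, or None without CUDA."""
    if not torch.cuda.is_available():
        return None
    gib = 1024 ** 3
    return {
        # Memory currently occupied by tensors.
        "allocated_gib": torch.cuda.memory_allocated(device) / gib,
        # Memory currently held by the caching allocator.
        "reserved_gib": torch.cuda.memory_reserved(device) / gib,
        # Peak memory held by the caching allocator so far.
        "max_reserved_gib": torch.cuda.max_memory_reserved(device) / gib,
    }


print(memory_report())
```

Passing `device=None` falls back to the current device, as described above.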

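Additional note: Old graphics cards with CUDA compute capability 3.0 or lower may be visible but cannot be used by PyTorch!
Thanks to hekimgil for pointing this out! "Found GPU0 GeForce GT 750M which is of cuda capability 3.0. PyTorch no longer supports this GPU because it is too old. The minimum cuda capability that we support is 3.5."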

小开

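Query whether a GPU is available:

torch.cuda.is_available()

If the above function returns False,

you either have no GPU,

or the Nvidia drivers have not been installed so the OS does not see the GPU,

or the GPU is being hidden by the environment variable CUDA_VISIBLE_DEVICES. When the value of CUDA_VISIBLE_DEVICES is -1, all your devices are hidden. You can check that value in code with os.environ['CUDA_VISIBLE_DEVICES'].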
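One caveat: indexing os.environ directly raises KeyError when the variable is not set at all, so `.get()` is safer. A minimal sketch (the helper name `visible_devices_status` is hypothetical, and treating an empty value as "all hidden" is my understanding of CUDA's behavior rather than something stated above):

```python
import os


def visible_devices_status(environ=os.environ):
    """Describe what CUDA_VISIBLE_DEVICES currently implies."""
    value = environ.get("CUDA_VISIBLE_DEVICES")  # None when the variable is unset
    if value is None:
        return "CUDA_VISIBLE_DEVICES is not set; all GPUs are visible."
    # -1 hides every device; an empty value is assumed to do the same.
    if value.strip() in ("", "-1"):
        return "CUDA_VISIBLE_DEVICES hides all GPUs."
    return f"Only these GPUs are visible: {value}"


print(visible_devices_status())
```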

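If the above function returns True, that does not necessarily mean that you are using the GPU. In PyTorch you can allocate tensors to devices when you create them. By default, tensors get allocated to the cpu. To check where your tensor is allocated, do:

# assuming that 'a' is a tensor created somewhere else
a.device  # returns the device where the tensor is allocated

Note that you cannot operate on tensors allocated on different devices. To see how to allocate a tensor to the GPU, see here: https://pytorch.org/docs/stable/notes/cuda.html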
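To illustrate the point about different devices, here is a rough sketch of bringing two tensors onto the same device before combining them. The variable names are just for illustration; on a machine without a GPU both tensors simply stay on the CPU.

```python
import torch

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

a = torch.ones(3)                  # created on the CPU by default
b = torch.ones(3, device=device)   # created directly on the chosen device

# Move `a` to wherever `b` lives before combining them; adding a CPU tensor
# to a CUDA tensor directly would raise a RuntimeError.
total = a.to(b.device) + b
print(total.device)
```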

小开

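If you are here because your PyTorch always returns False for torch.cuda.is_available(), that is probably because you installed a PyTorch version without GPU support. (E.g. you coded on a laptop and are now testing on a server.)

The solution is to uninstall PyTorch and install it again with the right command from the PyTorch downloads page. Also refer to this PyTorch issue.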
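Before reinstalling, you can check what kind of build you actually have. As far as I know, `torch.version.cuda` is None when the installed build was compiled without CUDA support. A small sketch (the helper name `build_info` is my own):

```python
import torch


def build_info():
    """Collect facts that show whether this PyTorch build can use CUDA."""
    return {
        "torch_version": torch.__version__,
        # None when the installed build was compiled without CUDA support.
        "built_with_cuda": torch.version.cuda,
        "cuda_available": torch.cuda.is_available(),
    }


for key, value in build_info().items():
    print(f"{key}: {value}")
```

If `built_with_cuda` is None, no driver fix will help and a reinstall with a CUDA-enabled build is the way to go.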

小开

From a practical standpoint, just one minor digression:

import torch
dev = torch.device("cuda") if torch.cuda.is_available() else torch.device("cpu")

This dev now knows whether it is cuda or cpu.

There is a difference in how you deal with models and with tensors when moving to cuda. It is a bit strange at first.

import torch
import torch.nn as nn
dev = torch.device("cuda") if torch.cuda.is_available() else torch.device("cpu")
t1 = torch.randn(1,2)
t2 = torch.randn(1,2).to(dev)
print(t1)  # tensor([[-0.2678,  1.9252]])
print(t2)  # tensor([[ 0.5117, -3.6247]], device='cuda:0')
t1.to(dev)  # returns a copy on dev; t1 itself is not changed
print(t1)  # tensor([[-0.2678,  1.9252]])
print(t1.is_cuda)  # False
t1 = t1.to(dev)
print(t1)  # tensor([[-0.2678,  1.9252]], device='cuda:0')
print(t1.is_cuda)  # True

class M(nn.Module):
    def __init__(self):
        super().__init__()
        self.l1 = nn.Linear(1,2)

    def forward(self, x):
        x = self.l1(x)
        return x

model = M()  # not on cuda
model.to(dev)  # is on cuda (all parameters)
print(next(model.parameters()).is_cuda)  # True

This is all a bit tricky, but once you understand it, it helps you work faster with less debugging.

小开

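Almost all answers here reference torch.cuda.is_available(). However, that is only one side of the coin. It tells you whether the GPU (actually CUDA) is available, not whether it is actually being used. In a typical setup, you would set your device with something like this: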

device = torch.device("cuda") if torch.cuda.is_available() else torch.device("cpu")

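But in larger environments (e.g. research) it is also common to give the user more options, so that based on input they can disable CUDA, specify CUDA IDs, and so on. In such a case, whether or not the GPU is used is not only based on whether it is available. After the device has been set to a torch device, you can get its type property to verify whether it is CUDA or not.

if device.type == 'cuda':
    # do something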
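As a rough sketch of what such user options could look like: the flag names `--no-cuda` and `--gpu-id` and the helper `pick_device` are made up for this example, not a PyTorch convention.

```python
import argparse

import torch


def pick_device(no_cuda=False, gpu_id=0):
    """Choose a device from user options, falling back to the CPU."""
    if no_cuda or not torch.cuda.is_available():
        return torch.device("cpu")
    return torch.device(f"cuda:{gpu_id}")


parser = argparse.ArgumentParser()
parser.add_argument("--no-cuda", action="store_true", help="force CPU even if a GPU exists")
parser.add_argument("--gpu-id", type=int, default=0, help="index of the CUDA device to use")
args = parser.parse_args([])  # call parse_args() without arguments in a real script

device = pick_device(args.no_cuda, args.gpu_id)
print("Using device:", device, "| is CUDA:", device.type == "cuda")
```

With this setup, `device.type` reflects the user's choice as well as availability, which is exactly why checking it is more reliable than checking torch.cuda.is_available() alone.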

小开

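Simply run the following from the command prompt or a Linux environment.

python -c 'import torch; print(torch.cuda.is_available())'

The above should print True.

python -c 'import torch; print(torch.rand(2,3).cuda())'

This one should print something like the following: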

tensor([[0.7997, 0.6170, 0.7042], [0.4174, 0.1494, 0.0516]], device='cuda:0')

小开

Query: Command

Does PyTorch see any GPUs? torch.cuda.is_available()

Are tensors stored on the GPU by default? torch.rand(10).device

Set the default tensor type to CUDA: torch.set_default_tensor_type(torch.cuda.FloatTensor)

Is this tensor a GPU tensor? my_tensor.is_cuda

Is this model stored on the GPU? all(p.is_cuda for p in my_model.parameters())
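One small gotcha with the last check: `all(...)` over an empty iterator is True, so a model without parameters would look like it is "on the GPU". A slightly more informative sketch is to collect the devices themselves (the helper name `module_devices` is hypothetical):

```python
import torch
import torch.nn as nn


def module_devices(module):
    """Return the set of device strings that hold the module's parameters and buffers."""
    tensors = list(module.parameters()) + list(module.buffers())
    return {str(t.device) for t in tensors}


model = nn.Linear(4, 2)
print(module_devices(model))
```

An empty set means the module has no parameters or buffers at all, and more than one entry means the model is spread over several devices.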

小开

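Using the code below

import torch
torch.cuda.is_available()

will only show whether a GPU is present and detected by PyTorch.

But in "Task Manager" -> "Performance" the GPU utilization will be very low.

This means you are actually running on the CPU.

To resolve this, check and change the following:

Graphics settings -> turn on hardware-accelerated GPU settings, then restart.

Open NVIDIA Control Panel -> Desktop -> Display GPU in the notification area [Note: if you have newly installed Windows, then you also have to agree to the terms and conditions in the NVIDIA Control Panel]

This should work!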

小开

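It is possible for

torch.cuda.is_available()

to return True, but to still get the following error when running

>>> torch.rand(10).to(device)

as suggested by MBT:

RuntimeError: CUDA error: no kernel image is available for execution on the device

This link explains that

... torch.cuda.is_available only checks whether your driver is compatible with the version of cuda we used in the binary. So it means that CUDA 10.1 is compatible with your driver. But when you do computation with CUDA, it couldn't find the code for your arch.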
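If you run into this, you can compare the compute capability of your GPU with the architectures your PyTorch binary was built for. This is only a heuristic sketch: the helper name `has_exact_arch_match` is mine, and a missing exact match is a hint rather than proof, since a binary can sometimes still run on a newer GPU of the same generation.

```python
import torch


def has_exact_arch_match(index=0):
    """Return True/False if the build lists this GPU's architecture, None without CUDA."""
    if not torch.cuda.is_available():
        return None
    major, minor = torch.cuda.get_device_capability(index)
    # get_arch_list() returns entries such as 'sm_70' for the compiled architectures.
    return f"sm_{major}{minor}" in torch.cuda.get_arch_list()


print("Exact architecture match:", has_exact_arch_match())
```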

小开

If you are using Linux, I suggest installing nvtop (https://github.com/Syllo/nvtop), an htop-like monitor that shows GPU usage live in the terminal.

小开

# Step 1: import the torch library

import torch

# Step 2: create a tensor

tensor = torch.tensor([5, 6])

# Step 3: find the device type

# Output 1: below we get the tensor, the device that holds it,
# its number of dimensions (tensor.ndim) and its size (tensor.shape)
tensor, tensor.device, tensor.ndim, tensor.shape
# (tensor([5, 6]), device(type='cpu'), 1, torch.Size([2]))

# or

# Output 2: below we get only the device type

tensor.device
# device(type='cpu')

# The CPU in my system is an "11th Gen Intel(R) Core(TM) i5-1135G7 @ 2.40GHz 2.42 GHz"

# Find out whether the tensor can be processed on a GPU

print(tensor, torch.cuda.is_available())  # the output will be: tensor([5, 6]) False

# The output above is False, hence it is not on the GPU

# Happy coding :)
