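Failed to get convolution algorithm. This is probably because cuDNN failed to initialize.

In Tensorflow/Keras, when running the code from https://github.com/pierluigiferrari/ssd_keras, using the estimator: ssd300_evaluation, I received this error.

Failed to get convolution algorithm. This is probably because cuDNN failed to initialize, so try looking to see if a warning log message was printed above.

This is very similar to the unsolved question: Google Colab Error: Failed to get convolution algorithm. This is probably because cuDNN failed to initialize

With the issue I'm running: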

Python: 3.6.4.

Tensorflow version: 1.12.0.

Keras version: 2.2.4.

CUDA: V10.0.

cuDNN: V7.4.1.5.

NVIDIA GeForce GTX 1080.

I also ran:

import tensorflow as tf
with tf.device('/gpu:0'):
    a = tf.constant([1.0, 2.0, 3.0, 4.0, 5.0, 6.0], shape=[2, 3], name='a')
    b = tf.constant([1.0, 2.0, 3.0, 4.0, 5.0, 6.0], shape=[3, 2], name='b')
    c = tf.matmul(a, b)
with tf.Session() as sess:
    print(sess.run(c))

With no errors or issues.

The minimalist example is:

from keras import backend as K
from keras.models import load_model
from keras.optimizers import Adam
from scipy.misc import imread
import numpy as np
from matplotlib import pyplot as plt


from models.keras_ssd300 import ssd_300
from keras_loss_function.keras_ssd_loss import SSDLoss
from keras_layers.keras_layer_AnchorBoxes import AnchorBoxes
from keras_layers.keras_layer_DecodeDetections import DecodeDetections
from keras_layers.keras_layer_DecodeDetectionsFast import DecodeDetectionsFast
from keras_layers.keras_layer_L2Normalization import L2Normalization
from data_generator.object_detection_2d_data_generator import DataGenerator
from eval_utils.average_precision_evaluator import Evaluator
import tensorflow as tf
%matplotlib inline
import keras
keras.__version__






# Set a few configuration parameters.
img_height = 300
img_width = 300
n_classes = 20
model_mode = 'inference'




K.clear_session() # Clear previous models from memory.


model = ssd_300(image_size=(img_height, img_width, 3),
                n_classes=n_classes,
                mode=model_mode,
                l2_regularization=0.0005,
                scales=[0.1, 0.2, 0.37, 0.54, 0.71, 0.88, 1.05], # The scales for MS COCO [0.07, 0.15, 0.33, 0.51, 0.69, 0.87, 1.05]
                aspect_ratios_per_layer=[[1.0, 2.0, 0.5],
                                         [1.0, 2.0, 0.5, 3.0, 1.0/3.0],
                                         [1.0, 2.0, 0.5, 3.0, 1.0/3.0],
                                         [1.0, 2.0, 0.5, 3.0, 1.0/3.0],
                                         [1.0, 2.0, 0.5],
                                         [1.0, 2.0, 0.5]],
                two_boxes_for_ar1=True,
                steps=[8, 16, 32, 64, 100, 300],
                offsets=[0.5, 0.5, 0.5, 0.5, 0.5, 0.5],
                clip_boxes=False,
                variances=[0.1, 0.1, 0.2, 0.2],
                normalize_coords=True,
                subtract_mean=[123, 117, 104],
                swap_channels=[2, 1, 0],
                confidence_thresh=0.01,
                iou_threshold=0.45,
                top_k=200,
                nms_max_output_size=400)


# 2: Load the trained weights into the model.


# TODO: Set the path of the trained weights.
weights_path = 'C:/Users/USAgData/TF SSD Keras/weights/VGG_VOC0712Plus_SSD_300x300_iter_240000.h5'


model.load_weights(weights_path, by_name=True)


# 3: Compile the model so that Keras won't complain the next time you load it.


adam = Adam(lr=0.001, beta_1=0.9, beta_2=0.999, epsilon=1e-08, decay=0.0)


ssd_loss = SSDLoss(neg_pos_ratio=3, alpha=1.0)


model.compile(optimizer=adam, loss=ssd_loss.compute_loss)




dataset = DataGenerator()


# TODO: Set the paths to the dataset here.
dir= "C:/Users/USAgData/TF SSD Keras/VOC/VOCtest_06-Nov-2007/VOCdevkit/VOC2007/"
Pascal_VOC_dataset_images_dir = dir+ 'JPEGImages'
Pascal_VOC_dataset_annotations_dir = dir + 'Annotations/'
Pascal_VOC_dataset_image_set_filename = dir+'ImageSets/Main/test.txt'


# The XML parser needs to know what object class names to look for and in which order to map them to integers.
classes = ['background',
           'aeroplane', 'bicycle', 'bird', 'boat',
           'bottle', 'bus', 'car', 'cat',
           'chair', 'cow', 'diningtable', 'dog',
           'horse', 'motorbike', 'person', 'pottedplant',
           'sheep', 'sofa', 'train', 'tvmonitor']


dataset.parse_xml(images_dirs=[Pascal_VOC_dataset_images_dir],
                  image_set_filenames=[Pascal_VOC_dataset_image_set_filename],
                  annotations_dirs=[Pascal_VOC_dataset_annotations_dir],
                  classes=classes,
                  include_classes='all',
                  exclude_truncated=False,
                  exclude_difficult=False,
                  ret=False)






evaluator = Evaluator(model=model,
                      n_classes=n_classes,
                      data_generator=dataset,
                      model_mode=model_mode)






results = evaluator(img_height=img_height,
                    img_width=img_width,
                    batch_size=8,
                    data_generator_mode='resize',
                    round_confidences=False,
                    matching_iou_threshold=0.5,
                    border_pixels='include',
                    sorting_algorithm='quicksort',
                    average_precision_mode='sample',
                    num_recall_points=11,
                    ignore_neutral_boxes=True,
                    return_precisions=True,
                    return_recalls=True,
                    return_average_precisions=True,
                    verbose=True)

I had this error and I fixed it by uninstalling all CUDA and cuDNN versions from my system. Then I installed CUDA Toolkit 9.0 (without any patches) and cuDNN v7.4.1 for CUDA 9.0.

The problem is an incompatibility between newer TensorFlow versions (1.10.x and later) and cuDNN 7.0.5 with CUDA 9.0. The easiest fix is to downgrade TensorFlow to 1.8.0:

pip install --upgrade tensorflow-gpu==1.8.0

I've seen this error message for three different reasons, with different solutions:

1. You have cache issues

I regularly work around this error by shutting down my python process, removing the ~/.nv directory (on linux, rm -rf ~/.nv), and restarting the Python process. I don't exactly know why this works. It's probably at least partly related to the second option:

2. You're out of memory

The error can also show up if you run out of graphics card RAM. With an nvidia GPU you can check graphics card memory usage with nvidia-smi. This will give you a readout of how much GPU RAM you have in use (something like 6025MiB / 6086MiB if you're almost at the limit) as well as a list of what processes are using GPU RAM.
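The nvidia-smi check described above can also be scripted. The following is a minimal sketch, not part of the original answer: the helper names parse_gpu_memory, query_gpu_memory and is_nearly_full, and the 95% threshold, are assumptions chosen for illustration. It relies on nvidia-smi's --query-gpu and --format=csv,noheader,nounits options, which report memory values in MiB.

```python
import subprocess


def parse_gpu_memory(csv_text):
    """Parse 'used, total' MiB pairs, one line per GPU."""
    readings = []
    for line in csv_text.strip().splitlines():
        used, total = (int(field) for field in line.split(","))
        readings.append((used, total))
    return readings


def query_gpu_memory():
    """Ask nvidia-smi for per-GPU memory usage in MiB (needs an NVIDIA driver)."""
    output = subprocess.check_output(
        [
            "nvidia-smi",
            "--query-gpu=memory.used,memory.total",
            "--format=csv,noheader,nounits",
        ],
        text=True,
    )
    return parse_gpu_memory(output)


def is_nearly_full(used_mib, total_mib, threshold=0.95):
    """True when the used share of GPU RAM is at or above the threshold.

    The 0.95 default is an arbitrary illustrative value.
    """
    return used_mib / total_mib >= threshold
```

With the readout quoted above, is_nearly_full(6025, 6086) is True, which suggests checking which processes hold the memory before starting a new run.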

If you've run out of RAM, you'll need to restart the process (which should free up the RAM) and then take a less memory-intensive approach. A few options are:

  • reducing your batch size
  • using a simpler model
  • using less data
  • limiting TensorFlow's GPU memory fraction. For example, the following will make sure TensorFlow uses <= 90% of your GPU RAM:
import keras
import tensorflow as tf


config = tf.ConfigProto()
config.gpu_options.per_process_gpu_memory_fraction = 0.9  # 0.6 sometimes works better for folks
keras.backend.tensorflow_backend.set_session(tf.Session(config=config))

This can slow down your model evaluation if not used together with the items above, presumably since the large data set will have to be swapped in and out to fit into the small amount of memory you've allocated.

A second option is to have TensorFlow start out using only a minimum amount of memory and then allocate more as needed (documented here):

import os

os.environ['TF_FORCE_GPU_ALLOW_GROWTH'] = 'true'  # set this before TensorFlow initializes the GPU

3. You have incompatible versions of CUDA, TensorFlow, NVIDIA drivers, etc.

If you've never had similar models working, you're not running out of VRAM and your cache is clean, I'd go back and set up CUDA + TensorFlow using the best available installation guide - I have had the most success with following the instructions at https://www.tensorflow.org/install/gpu rather than those on the NVIDIA / CUDA site. Lambda Stack is also a good way to go.
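A version mismatch can be checked mechanically before reinstalling anything. The sketch below is an illustration under stated assumptions: TESTED_CONFIGS is a small excerpt recalled from TensorFlow's "tested build configurations" table for Linux GPU builds and should be verified against the current page, and the names version_matches and check_stack are hypothetical helpers, not part of any library.

```python
# Assumed excerpt of TensorFlow's tested build configurations (Linux, GPU).
# Maps "major.minor" of TensorFlow to (CUDA version, cuDNN version).
# Verify these rows against the official table before relying on them.
TESTED_CONFIGS = {
    "1.8": ("9", "7"),
    "1.12": ("9", "7"),
    "1.13": ("10.0", "7.4"),
    "2.0": ("10.0", "7.4"),
    "2.2": ("10.1", "7.6"),
    "2.4": ("11.0", "8.0"),
}


def version_matches(installed, expected):
    """Compare only as many version components as the expected value lists."""
    want = expected.split(".")
    have = installed.split(".")
    return have[:len(want)] == want


def check_stack(tf_version, cuda_version, cudnn_version):
    """Return True/False for a known TensorFlow release, None if unlisted."""
    key = ".".join(tf_version.split(".")[:2])
    if key not in TESTED_CONFIGS:
        return None
    cuda, cudnn = TESTED_CONFIGS[key]
    return (version_matches(cuda_version, cuda)
            and version_matches(cudnn_version, cudnn))


# The setup from the question: TensorFlow 1.12.0 with CUDA 10.0.
print(check_stack("1.12.0", "10.0", "7.4.1.5"))
```

Under these assumed rows, the question's combination is reported as a mismatch, while TensorFlow 1.12.0 with CUDA 9.0 passes the check.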

I was struggling with this problem for a week. The reason was very silly: I used high-res photos for training.

Hopefully, this will save someone's time :)
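To see why high-resolution photos can exhaust GPU memory, here is a rough back-of-the-envelope sketch. It only estimates the size of the input batch itself, assuming float32 values (4 bytes each) and 3 colour channels; the function name batch_mebibytes is hypothetical. The intermediate activations of a convolutional network scale with the same height and width and are usually much larger than the input.

```python
def batch_mebibytes(batch_size, height, width, channels=3, bytes_per_value=4):
    """Size of one input batch in MiB, assuming dense float32 storage."""
    total_bytes = batch_size * height * width * channels * bytes_per_value
    return total_bytes / (1024 ** 2)


# A batch of 8 images at the 300x300 resolution used in the question.
small = batch_mebibytes(8, 300, 300)

# The same batch size with 3000x4000 photos (an illustrative resolution).
large = batch_mebibytes(8, 3000, 4000)

print(f"300x300:   {small:.1f} MiB")
print(f"3000x4000: {large:.1f} MiB")
```

The second batch is more than a hundred times larger than the first, so downscaling the images or reducing the batch size shrinks the memory footprint proportionally.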

The problem can also occur if there is an incompatible version of cuDNN, which could be the case if you installed Tensorflow with conda, as conda also installs CUDA and cuDNN while installing Tensorflow.

The solution is to install Tensorflow with pip, and install CUDA and cuDNN separately without conda. For example, if you have CUDA 10.0.130 and cuDNN 7.4.1 (tested configurations), then

pip install tensorflow-gpu==1.13.1

1) Close all other notebooks that use the GPU.

2) TF 2.0 needs the cuDNN SDK (>= 7.4.1).

Extract it and add the path to its 'bin' folder to "environment variables / system variables / path": "D:\Programs\x64\Nvidia\cudnn\bin"

I had this problem after upgrading to TF 2.0. The following line started raising the error:

   outputs = tf.nn.conv2d(images, filters, strides=1, padding="SAME")

I am using Ubuntu 16.04.6 LTS (Azure data science VM) and TensorFlow 2.0. I upgraded following the instructions on the TensorFlow GPU instructions page. This resolved the issue for me. By the way, it's a bunch of apt-get update/install commands, and I executed all of them.

Keras is included in TensorFlow 2.0 and above. So

  • remove import keras and
  • replace each from keras.module.module import class statement with from tensorflow.keras.module.module import class
  • Maybe your GPU memory is full, so set allow_growth = True in the GPU options. This approach is deprecated now, but adding the code snippet below after your imports may solve your problem.
import tensorflow as tf
from tensorflow.compat.v1.keras.backend import set_session
config = tf.compat.v1.ConfigProto()
config.gpu_options.allow_growth = True  # dynamically grow the memory used on the GPU
config.log_device_placement = True  # to log device placement (on which device the operation ran)
sess = tf.compat.v1.Session(config=config)
set_session(sess)
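The import replacement described in the first two bullets of this answer is mechanical, so it can be sketched as a small text rewrite. This is only an illustration: migrate_keras_imports is a hypothetical helper, it handles just the two import forms named above, and dropping import keras will break any code that still refers to the keras name directly (for example keras.__version__).

```python
import re


def migrate_keras_imports(source):
    """Rewrite standalone-Keras imports to their tensorflow.keras form."""
    migrated = []
    for line in source.splitlines():
        # Drop a bare "import keras" line entirely.
        if re.fullmatch(r"\s*import keras\s*", line):
            continue
        # "from keras.x import y" or "from keras import y" gains the
        # "tensorflow." prefix; names such as keras_layers are left alone.
        line = re.sub(r"^(\s*)from keras(\.|\s)",
                      r"\1from tensorflow.keras\2", line)
        migrated.append(line)
    return "\n".join(migrated)


old_code = "\n".join([
    "import keras",
    "from keras.models import load_model",
    "from keras import backend as K",
    "from keras_layers.keras_layer_AnchorBoxes import AnchorBoxes",
])
print(migrate_keras_imports(old_code))
```

Project-local modules whose names merely start with keras, such as keras_layers in the question's code, are deliberately not touched.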

As already observed by Anurag Bhalekar above, this can be fixed by a dirty workaround by setting up and running a model in your code before loading an old model with load_model() from keras. This correctly initializes cuDNN which can then be used for load_model(), it seems.

In my case, I am using Spyder IDE to run all my python scripts. Specifically, I set up, train and save a CNN in one script. After that, another script loads the saved model for visualization. If I open Spyder and directly run the visualization script to load an old, saved model, I get the same error as mentioned above. I was still able to load the model and to modify it, but when I tried to create a prediction, I got the error.

However, if I first run my training script in a Spyder instance and then run the visualization script in the same Spyder instance, it works fine without any errors:

#training a model correctly initializes cuDNN
model=Sequential()
model.add(Conv2D(32,...))
model.add(Dense(num_classes,...))
model.compile(...)
model.fit() #this all works fine

Then afterwards, the following code including load_model() works fine:

#this script relies on cuDNN already being initialized by the script above
from keras.models import Model, load_model
model = load_model(modelPath) #works
model = Model(inputs=model.inputs, outputs=model.layers[1].output) #works
feature_maps = model.predict(img) #produces the error only if the first piece of code is not run

I could not figure out why this is or how to solve the problem in a different way, but for me, training a small working keras model before using load_model() is a quick and dirty fix that does not require any reinstallation of cuDNN or otherwise.

I got the same error. It is caused by a mismatch between your CUDA/cuDNN version and your TensorFlow version. There are two ways to solve this:

  1. Either downgrade your TensorFlow version: pip install --upgrade tensorflow-gpu==1.8.0

  2. Or follow the steps here.

    Tip: choose your Ubuntu version and follow the steps. :-)

I had the same problem but with a simpler solution than the others posted here. I have both CUDA 10.0 and 10.2 installed but I only had cuDNN for 10.2 and this version [at the time of this post] is not compatible with TensorFlow GPU. I just installed the cuDNN for CUDA 10.0 and now everything runs fine!

Workaround: I did a fresh install of TF 2.0 and ran a simple MNIST tutorial, which was fine. I then opened another notebook, tried to run it, and encountered this issue. I exited all notebooks, restarted Jupyter, opened only one notebook, and it ran successfully. The issue seems to be either memory or running more than one notebook on the GPU.

Thanks

I struggled with this for a while working on an AWS Ubuntu instance.

Then, I found the solution, which was quite simple in this case.

Do not install tensorflow-gpu with pip (pip install tensorflow-gpu), but with conda (conda install tensorflow-gpu) so that it is in the conda environment and it installs the cudatoolkit and the cudnn in the right environment.

That worked for me, saved my day, and hope it helps somebody else.

See the original solution here from learnermaxRL: https://github.com/tensorflow/tensorflow/issues/24828#issuecomment-453727142

I had the same issue and solved it with:

import os
os.environ['TF_FORCE_GPU_ALLOW_GROWTH'] = 'true'

or

import tensorflow as tf
physical_devices = tf.config.experimental.list_physical_devices('GPU')
if len(physical_devices) > 0:
    tf.config.experimental.set_memory_growth(physical_devices[0], True)

I had the same problem with TensorFlow 1.13.1, CUDA 10.0 and cuDNN 7.6.4. Changing the cuDNN version to 7.4.2 luckily solved the problem.

Just add

from tensorflow.compat.v1 import ConfigProto
from tensorflow.compat.v1 import InteractiveSession


config = ConfigProto()
config.gpu_options.allow_growth = True
session = InteractiveSession(config=config)

This is a follow up to https://stackoverflow.com/a/56511889/2037998 point 2.

2. You're out of memory

I used the following code to limit the GPU RAM usage:

import tensorflow as tf


gpus = tf.config.experimental.list_physical_devices('GPU')
if gpus:
    # Restrict TensorFlow to only allocate 1*X GB of memory on the first GPU
    try:
        tf.config.experimental.set_virtual_device_configuration(
            gpus[0],
            [tf.config.experimental.VirtualDeviceConfiguration(memory_limit=(1024*4))])
        logical_gpus = tf.config.experimental.list_logical_devices('GPU')
        print(len(gpus), "Physical GPUs,", len(logical_gpus), "Logical GPUs")
    except RuntimeError as e:
        # Virtual devices must be set before GPUs have been initialized
        print(e)

This code sample comes from: TensorFlow: Use a GPU: Limiting GPU memory growth. Put this code before any other TF/Keras code you are using.

Note: The application might still use a bit more GPU RAM than the number above.

Note 2: If the system also runs other applications (like a UI) these programs can also consume some GPU RAM. (Xorg, Firefox,... sometimes up to 1GB of GPU RAM combined)
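The two notes above suggest leaving room for other programs when choosing memory_limit. A minimal sketch of that arithmetic follows; the function name pick_memory_limit_mib and the 256 MiB default headroom are assumptions for illustration, not values from TensorFlow's documentation. The result is in MiB, the unit used by memory_limit in the code above.

```python
def pick_memory_limit_mib(total_mib, used_by_others_mib, headroom_mib=256):
    """Suggest a memory_limit (MiB) that leaves room for other GPU users.

    headroom_mib is an arbitrary safety margin, since the application
    might still use a bit more GPU RAM than the configured limit.
    """
    limit = total_mib - used_by_others_mib - headroom_mib
    if limit <= 0:
        raise ValueError("no GPU memory left to assign to TensorFlow")
    return limit


# Example: a 6086 MiB card where Xorg and a browser hold about 1 GiB.
print(pick_memory_limit_mib(6086, 1024))
```

The returned number can then be passed as memory_limit in place of the hard-coded 1024*4.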

Enabling memory growth on GPU at the start of my code solved the problem:

import tensorflow as tf


physical_devices = tf.config.experimental.list_physical_devices('GPU')
print("Num GPUs Available: ", len(physical_devices))
tf.config.experimental.set_memory_growth(physical_devices[0], True)

Num GPUs Available: 1

Reference: https://deeplizard.com/learn/video/OO4HD-1wRN8

At the start of your notebook or code, add the lines below:

import tensorflow as tf


physical_devices = tf.config.experimental.list_physical_devices('GPU')


tf.config.experimental.set_memory_growth(physical_devices[0], True)
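The snippets above index physical_devices[0], which raises an IndexError on a machine where TensorFlow sees no GPU. A slightly more defensive sketch is shown below. It assumes TensorFlow 2.x and uses the same tf.config.experimental calls as the answers above; the wrapper name enable_memory_growth and the choice to return 0 when TensorFlow cannot be imported are assumptions for illustration.

```python
def enable_memory_growth():
    """Enable memory growth on every visible GPU; return how many were found."""
    try:
        import tensorflow as tf
    except ImportError:
        # Assumption for this sketch: treat a missing TensorFlow as "no GPUs".
        return 0

    gpus = tf.config.experimental.list_physical_devices('GPU')
    for gpu in gpus:
        try:
            tf.config.experimental.set_memory_growth(gpu, True)
        except RuntimeError as e:
            # Memory growth must be set before GPUs have been initialized.
            print(e)
    return len(gpus)


print("GPUs configured:", enable_memory_growth())
```

As with the original snippets, call this before any other TF/Keras code touches the GPU.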

I had a similar problem. Tensorflow complained that it expected a certain version of cuDNN, but that wasn't the one it found. So, I downloaded the version it expected from https://developer.nvidia.com/rdp/cudnn-archive and installed it. It now works.

I had this same issue with an RTX 2080. The following code worked for me.

from tensorflow.compat.v1 import ConfigProto
from tensorflow.compat.v1 import InteractiveSession


config = ConfigProto()
config.gpu_options.allow_growth = True
session = InteractiveSession(config=config)

I had the same problem. I am using a conda environment, so my packages are automatically managed by conda. I solved the problem by constraining the memory allocation of TensorFlow v2 (Python 3.x):

physical_devices = tf.config.experimental.list_physical_devices('GPU')
tf.config.experimental.set_memory_growth(physical_devices[0], True)

This solved my problem. However, this limits the memory very much. When I simultaneously ran

nvidia-smi

I saw that it was about 700 MB. For more options, one can inspect the code on TensorFlow's website:

gpus = tf.config.experimental.list_physical_devices('GPU')
if gpus:
    # Restrict TensorFlow to only allocate 1GB of memory on the first GPU
    try:
        tf.config.experimental.set_virtual_device_configuration(
            gpus[0],
            [tf.config.experimental.VirtualDeviceConfiguration(memory_limit=1024)])
        logical_gpus = tf.config.experimental.list_logical_devices('GPU')
        print(len(gpus), "Physical GPUs,", len(logical_gpus), "Logical GPUs")
    except RuntimeError as e:
        # Virtual devices must be set before GPUs have been initialized
        print(e)

In my case the code snippet above solved the problem perfectly.

Note: I didn't try installing tensorflow with pip; this worked with conda-installed tensorflow.

Ubuntu: 18.04

python: 3.8.5

tensorflow: 2.2.0

cudnn : 7.6.5

cudatoolkit : 10.1.243

I was facing the same issue. I think the GPU was not able to load all the data at once. I resolved it by reducing the batch size.

I also had the same issue with Tensorflow 2.4 and CUDA 11.0 with cuDNN v8.0.4. I wasted almost 2 to 3 days solving this issue. The problem was just a driver mismatch. I was installing CUDA 11.0 Update 1; I thought that, being an update, it might work well, but that was the culprit. I uninstalled CUDA 11.0 Update 1 and installed CUDA 11.0 without the update. Here is the list of drivers that worked for TensorFlow 2.4 on an RTX 2060 6GB GPU.

A list of the hardware and software requirements is given here

I also had to do this

import tensorflow as tf
physical_devices = tf.config.list_physical_devices('GPU')
tf.config.experimental.set_memory_growth(physical_devices[0], True)

to avoid this error

2020-12-23 21:54:14.971709: I tensorflow/stream_executor/stream.cc:1404] [stream=000001E69C1DA210,impl=000001E6A9F88E20] did not wait for [stream=000001E69C1DA180,impl=000001E6A9F88730]
2020-12-23 21:54:15.211338: F tensorflow/core/common_runtime/gpu/gpu_util.cc:340] CPU->GPU Memcpy failed
[I 21:54:16.071 NotebookApp] KernelRestarter: restarting kernel (1/5), keep random ports
kernel 8b907ea5-33f1-4b2a-96cc-4a7a4c885d74 restarted
kernel 8b907ea5-33f1-4b2a-96cc-4a7a4c885d74 restarted

These are some of the error samples which I was getting

Type 1

UnpicklingError: invalid load key, 'H'.


During handling of the above exception, another exception occurred:


ValueError                                Traceback (most recent call last)
<ipython-input-2-f049ceaad66a> in <module>

Type 2


InternalError: Blas GEMM launch failed : a.shape=(15, 768), b.shape=(768, 768), m=15, n=768, k=768 [Op:MatMul]


During handling of the above exception, another exception occurred:

Type 3

failed to create cublas handle: CUBLAS_STATUS_ALLOC_FAILED
2020-12-23 21:31:04.534375: E tensorflow/stream_executor/cuda/cuda_blas.cc:226] failed to create cublas handle: CUBLAS_STATUS_ALLOC_FAILED
2020-12-23 21:31:04.534683: E tensorflow/stream_executor/cuda/cuda_blas.cc:226] failed to create cublas handle: CUBLAS_STATUS_ALLOC_FAILED
2020-12-23 21:31:04.534923: E tensorflow/stream_executor/cuda/cuda_blas.cc:226] failed to create cublas handle: CUBLAS_STATUS_ALLOC_FAILED
2020-12-23 21:31:04.539327: E tensorflow/stream_executor/cuda/cuda_dnn.cc:336] Could not create cudnn handle: CUDNN_STATUS_ALLOC_FAILED
2020-12-23 21:31:04.539523: E tensorflow/stream_executor/cuda/cuda_dnn.cc:336] Could not create cudnn handle: CUDNN_STATUS_ALLOC_FAILED
2020-12-23 21:31:04.539665: W tensorflow/core/framework/op_kernel.cc:1763] OP_REQUIRES failed at conv_ops_fused_impl.h:697 : Unknown: Failed to get convolution algorithm. This is probably because cuDNN failed to initialize, so try looking to see if a warning log message was printed above.
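Since the error text itself says to look for earlier warning lines, scanning the log for the messages shown above can speed up diagnosis. The sketch below is a hedged illustration: the function name diagnose and the wording of the hints are assumptions, and the patterns cover only the messages quoted in this thread, not every cuDNN failure.

```python
import re

# Patterns taken from the log lines quoted above; the hints are informal guesses.
HINTS = [
    (re.compile(r"CUDNN_STATUS_ALLOC_FAILED|CUBLAS_STATUS_ALLOC_FAILED"),
     "allocation failed: GPU memory is probably exhausted"),
    (re.compile(r"Failed to get convolution algorithm"),
     "cuDNN did not initialize: check the earlier log lines"),
]


def diagnose(log_text):
    """Return the distinct hints whose pattern appears in the log, in order."""
    found = []
    for line in log_text.splitlines():
        for pattern, hint in HINTS:
            if pattern.search(line) and hint not in found:
                found.append(hint)
    return found


sample_log = "\n".join([
    "E tensorflow/stream_executor/cuda/cuda_blas.cc:226] failed to create cublas handle: CUBLAS_STATUS_ALLOC_FAILED",
    "E tensorflow/stream_executor/cuda/cuda_dnn.cc:336] Could not create cudnn handle: CUDNN_STATUS_ALLOC_FAILED",
    "W tensorflow/core/framework/op_kernel.cc:1763] OP_REQUIRES failed at conv_ops_fused_impl.h:697 : Unknown: Failed to get convolution algorithm.",
])
for hint in diagnose(sample_log):
    print(hint)
```

When the allocation hint shows up before the convolution error, the memory-growth setting shown above is a reasonable first thing to try.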


If you have installed Tensorflow-gpu using Conda, then uninstall the cudnn and cudatoolkit packages which were installed along with it and re-run the notebook.

NOTE: Trying to uninstall only these two packages in conda would force a chain of other packages to be uninstalled as well. So, use the following command to uninstall only these packages

(1) To remove the cuda

conda remove --force cudatoolkit

(2) To remove the cudnn

conda remove --force cudnn

Now run Tensorflow, it should work!

Without any rep I can't add this as a comment to the two existing answers above from Anurag and Obnebion, neither can I upvote the answers, so I make a new answer even though it seems to be breaking guidelines. Anyway, I originally had the problem that the other answers on this page address, and fixed it, but then re-encountered the same message later on when I started to use checkpoint callbacks. At this point, only the Anurag/Obnebion answer was relevant. It turns out I'd originally been saving the model as a .json and the weights separately as .h5, then using model_from_json along with a separate model.load_weights to get the weights back again. That worked (I have CUDA 10.2 and tensorflow 2.x). It's only when I tried to switch to this all-in-one save/load_model from the checkpoint callback that it broke. This is the small change I made to keras.callbacks.ModelCheckpoint in the _save_model method:

if self.save_weights_only:
    self.model.save_weights(filepath, overwrite=True)
else:
    model_json = self.model.to_json()
    with open(filepath+'.json','w') as fb:
        fb.write(model_json)
        fb.close()
    self.model.save_weights(filepath+'.h5', overwrite=True)
    with open(filepath+'-hist.pickle','wb') as fb:
        trainhistory = {"history": self.model.history.history,"params": self.model.history.params}
        pickle.dump(trainhistory,fb)
        fb.close()
    # self.model.save(filepath, overwrite=True)

The history pickle dump is just a kludge for yet another question on stack overflow, what happens to the history object when you exit early from a Checkpoint callback. Well you can see in the _save_model method there is a line which pulls the loss monitor array out of the logs dict... but never writes it to a file! So I just put in the kludge accordingly. Most people don't recommend using pickles like this. My code is just a hack so it doesn't matter.

I was having the same issue, but adding these lines of code at the start solved my problem:

physical_devices = tf.config.experimental.list_physical_devices('GPU')
tf.config.experimental.set_memory_growth(physical_devices[0], True)

works with tensorflow V2.

It seems like the libraries need some warm up. This isn't an effective solution for production but you can at least carry on with other bugs...

from keras.models import Sequential
import numpy as np
from keras.layers import Dense
from keras.datasets import mnist
(x_train, y_train), (x_test, y_test) = mnist.load_data()
model = Sequential()
model.add(Dense(1000,input_dim=(784),activation='relu') )  # input layer
model.add(Dense(222,activation='relu'))                     #hidden layer
model.add(Dense(100,activation='relu'))
model.add(Dense(50,activation='relu'))
model.add(Dense(10,activation='sigmoid'))
model.compile(optimizer="adam",loss='categorical_crossentropy',metrics=["accuracy"])
x_train = np.reshape(x_train,(60000,784))/255
x_test = np.reshape(x_test,(10000,784))/255
from keras.utils import np_utils
y_train = np_utils.to_categorical(y_train)
y_test = np_utils.to_categorical(y_test)
model.fit(x_train[:1000],y_train[:1000],epochs=1,batch_size=32)

If you are a Chinese user, please make sure that your working path does not include Chinese characters, and try making your batch_size smaller and smaller. Thanks!

Just install TensorFlow with GPU support using this command: pip install tensorflow. You don't need to install the GPU package separately. If you install it separately, there is a high chance that the versions will mismatch.

But for releases 1.15 and older, the CPU and GPU packages are separate.