NLTK download SSL: Certificate verify failed

I get the following error when trying to install Punkt for nltk:

nltk.download('punkt')
[nltk_data] Error loading Punkt: <urlopen error [SSL:
[nltk_data]     CERTIFICATE_VERIFY_FAILED] certificate verify failed
[nltk_data]     (_ssl.c:590)>
False

The downloader script is broken. As a temporary workaround, you can manually download the punkt tokenizer from here and then place the unzipped folder in the corresponding location. The default folders for each OS are:

  • Windows: C:\nltk_data\tokenizers
  • OSX: /usr/local/share/nltk_data/tokenizers
  • Unix: /usr/share/nltk_data/tokenizers
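As a rough sketch of the manual workaround, the unpacking step could be scripted like this. The function name `install_tokenizer_zip` is hypothetical, and the sketch assumes the downloaded archive contains a single top-level `punkt` folder and that you have write access to the target directory.

```python
import zipfile
from pathlib import Path


def install_tokenizer_zip(zip_path, nltk_data_dir):
    """Unpack a manually downloaded archive into <nltk_data_dir>/tokenizers."""
    target = Path(nltk_data_dir) / "tokenizers"
    # Create nltk_data/tokenizers if it does not exist yet.
    target.mkdir(parents=True, exist_ok=True)
    with zipfile.ZipFile(zip_path) as archive:
        archive.extractall(target)
    # Report what now sits in the tokenizers folder (assumed: a "punkt" folder).
    return sorted(entry.name for entry in target.iterdir())


# Example call (paths are illustrative, pick the folder for your OS):
#     install_tokenizer_zip("punkt.zip", "/usr/local/share/nltk_data")
```

Passing the data directory as an argument keeps the sketch independent of the operating system.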

This error usually means that Python cannot verify the server's HTTPS certificate, typically because the CA certificates that Python and its other runtime dependencies rely on are missing from the system.

If you are using Linux (Ubuntu), run:

~$ sudo apt-get install ca-certificates

This should solve the issue.

If you are using this in a script with a Dockerfile, make sure the Dockerfile installs the ca-certificates package.
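To check whether Python can actually see a CA bundle, on the host or inside a container, something like the following may help. It is only a diagnostic sketch: `describe_ca_paths` is a hypothetical helper, and it reports where the `ssl` module looks for certificates without proving that the bundle is complete.

```python
import os
import ssl


def describe_ca_paths():
    """Map each default CA location to (path, whether it exists on disk)."""
    paths = ssl.get_default_verify_paths()
    report = {}
    for name in ("cafile", "capath", "openssl_cafile", "openssl_capath"):
        location = getattr(paths, name)
        # location can be None when no file or directory was resolved.
        report[name] = (location, bool(location) and os.path.exists(location))
    return report


for name, (location, exists) in describe_ca_paths().items():
    print(f"{name}: {location} (exists: {exists})")
```

If none of the locations exist, a missing CA bundle is a plausible cause of the CERTIFICATE_VERIFY_FAILED error.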

Run the Python interpreter and type the commands:

import nltk
nltk.download()

from here: http://www.nltk.org/data.html

If you get an SSL/Certificate error, run the following command:

bash "/Applications/Python 3.6/Install Certificates.command"

from here: ssl.SSLError: [SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed (_ssl.c:749)

First, go to the path /Applications/Python 3.6/ and run Install Certificates.command.

You will need admin rights for this.

If you are unable to download it, then, as other answers suggest, you can download the packages directly and place them yourself. You need to place them in the following directory structure:

> nltk_data
    > corpora
        > brown
        > conll2000
        > movie_reviews
        > wordnet
    > taggers
        > averaged_perceptron_tagger
    > tokenizers
        > punkt
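After copying files by hand, a quick check along these lines can confirm the layout. This is a sketch: `EXPECTED_LAYOUT` simply mirrors the structure listed above, `missing_packages` is a hypothetical helper, and it assumes every package is an unzipped folder.

```python
from pathlib import Path

# Layout copied from the directory structure listed above.
EXPECTED_LAYOUT = {
    "corpora": ["brown", "conll2000", "movie_reviews", "wordnet"],
    "taggers": ["averaged_perceptron_tagger"],
    "tokenizers": ["punkt"],
}


def missing_packages(nltk_data_dir, layout=EXPECTED_LAYOUT):
    """Return 'category/package' entries that are not folders under the root."""
    root = Path(nltk_data_dir)
    return [
        f"{category}/{package}"
        for category, packages in layout.items()
        for package in packages
        if not (root / category / package).is_dir()
    ]


# Example call (path is illustrative):
#     print(missing_packages(Path.home() / "nltk_data"))
```

An empty list means every folder from the layout was found.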

TLDR: Here is a better solution: https://github.com/gunthercox/ChatterBot/issues/930#issuecomment-322111087

Note that when you run nltk.download(), a window will pop up and let you select which packages to download (the download does not start automatically).

To complement the accepted answer, the following is a complete list of directories that will be searched on Mac (not limited to the one mentioned in the accepted answer):

  • '/Users/YOUR_USERNAME/nltk_data'
  • '/usr/share/nltk_data'
  • '/usr/local/share/nltk_data'
  • '/usr/lib/nltk_data'
  • '/usr/local/lib/nltk_data'
  • '/Users/YOUR_USERNAME/YOUR_VIRTUAL_ENV_DIRECTORY/nltk_data'
  • '/Users/YOUR_USERNAME/YOUR_VIRTUAL_ENV_DIRECTORY/share/nltk_data'
  • '/Users/YOUR_USERNAME/YOUR_VIRTUAL_ENV_DIRECTORY/lib/nltk_data'
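To see which of these directories actually holds a given resource, a small lookup like the one below can be used. It is a sketch with a hypothetical helper name, `locate`; it only checks for the unzipped folder and does not reproduce everything NLTK's own lookup does.

```python
import os


def locate(search_path, relative="tokenizers/punkt"):
    """Return the first existing <directory>/<relative> path, or None."""
    for directory in search_path:
        candidate = os.path.join(directory, *relative.split("/"))
        if os.path.exists(candidate):
            return candidate
    return None


# With NLTK installed, the search path on your machine is the list
# nltk.data.path, so a typical call would be:
#     import nltk
#     print(locate(nltk.data.path))
```

Printing `nltk.data.path` itself shows the exact directories searched on your system, which may differ from the list above.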

In case the link above dies, here is the solution pasted in its entirety:

import nltk
import ssl


try:
    _create_unverified_https_context = ssl._create_unverified_context
except AttributeError:
    pass
else:
    ssl._create_default_https_context = _create_unverified_https_context


nltk.download()

Run the above code in your favourite Python IDE or via the command line.
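A variation on this workaround, offered only as a sketch, limits the unverified context to the download itself and restores the default afterwards. The name `unverified_https` is hypothetical, and the sketch assumes a Python version where `ssl._create_unverified_context` exists (the try/except in the code above covers older versions that lack it).

```python
import contextlib
import ssl


@contextlib.contextmanager
def unverified_https():
    """Temporarily make default HTTPS contexts skip certificate checks."""
    original = ssl._create_default_https_context
    ssl._create_default_https_context = ssl._create_unverified_context
    try:
        yield
    finally:
        # Restore verification even if the download raises.
        ssl._create_default_https_context = original


# Typical use with NLTK installed:
#     import nltk
#     with unverified_https():
#         nltk.download("punkt")
```

Certificate verification is switched off only inside the `with` block, so other HTTPS requests in the same process keep the default behaviour.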

My solution is:

  • Download punkt.zip from here and unzip
  • Create nltk_data/tokenizers folders under home folder
  • Put punkt folder under tokenizers folder

Here is what worked for me after nothing else did: I navigated via the GUI to the Python 3.7 folder and opened the 'Certificates.command' file in Terminal, and the SSL issue was immediately resolved.

This works by disabling the SSL check!

import nltk
import ssl


try:
    _create_unverified_https_context = ssl._create_unverified_context
except AttributeError:
    pass
else:
    ssl._create_default_https_context = _create_unverified_https_context


nltk.download()

There is a very simple way to fix all of this as written in the formal bug report for anyone else coming across this problem recently (e.g. 2019) and using MacOS. From the bug report at https://bugs.python.org/issue28150:

...there is a simple double-clickable or command-line-runnable script ("/Applications/Python 3.6/Install Certificates.command") that does two things: 1. uses pip to install certifi and 2. creates a symlink in the OpenSSL directory to certifi's installed bundle location.

Simply running the "Install Certificates.command" script worked for me on MacOS (10.15 beta as of this writing) and I was off and running.
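The symlink step described in the bug report can be pictured roughly as follows. This is not the actual script, just a sketch of the idea: `link_ca_bundle` is a hypothetical helper, and both paths are passed in explicitly so that nothing on the system is touched by accident.

```python
import os


def link_ca_bundle(bundle_path, openssl_cafile):
    """Point openssl_cafile at bundle_path via a symlink and return the target."""
    bundle_path = os.path.abspath(bundle_path)
    link_dir = os.path.dirname(os.path.abspath(openssl_cafile))
    os.makedirs(link_dir, exist_ok=True)
    # Replace any existing file or stale link at the destination.
    if os.path.lexists(openssl_cafile):
        os.remove(openssl_cafile)
    os.symlink(bundle_path, openssl_cafile)
    return os.path.realpath(openssl_cafile)


# Assumed real-world arguments (requires certifi to be installed):
#     import ssl, certifi
#     link_ca_bundle(certifi.where(),
#                    ssl.get_default_verify_paths().openssl_cafile)
```

In practice, running the bundled "Install Certificates.command" is the simpler route; the sketch is only meant to show what the symlink achieves.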

You just need to install the certificates with this simple step:

In the Python application folder, double-click on the file 'Certificates.command'.

This opens a terminal window that automatically installs the certificates for you. Close this window and try again.

Search for 'Install Certificates.command' in Finder and open it.

Then do the following steps in the terminal:

python3
import nltk
nltk.download()

A bit late to the party but I just entered Certificates.command into Spotlight which found it and ran it. All fixed in seconds.

I'm running macOS Catalina and using Python 3.7 installed by Homebrew.

This is how I solved it on macOS. Initially, after installing nltk, I was getting the SSL error.

Solution: go to the Python folder:

cd /Applications/Python\ 3.8

Run the command

./Install\ Certificates.command

Now if you try again, it should work!

Thanks a lot to this article!

For mac users, just copy paste the following in the terminal:

/Applications/Python\ 3.10/Install\ Certificates.command ; exit;

For me, the solution was much simpler: I was still connected to my corporate network/VPN which blocks certain types of downloads. Switching the network made the SSL error disappear.

Updating the python certificates worked for me.

At the top of your script, keep:

import nltk
nltk.download('punkt')

In a separate terminal run (Mac):

bash "/Applications/Python <version>/Install Certificates.command"
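If the nltk.download('punkt') call stays at the top of the script, it may be worth skipping it when the data is already installed. The sketch below shows one way to do that. `ensure_resource` is a hypothetical helper that takes the lookup and download functions as parameters, and it relies on `nltk.data.find` raising `LookupError` for missing resources.

```python
def ensure_resource(resource_path, package, find, download):
    """Download `package` only if `find(resource_path)` raises LookupError."""
    try:
        find(resource_path)
        return True  # already installed, nothing to download
    except LookupError:
        # The downloader returns a truthy value on success, False on failure.
        return bool(download(package))


# With NLTK installed, the call would look like:
#     import nltk
#     ensure_resource("tokenizers/punkt", "punkt",
#                     nltk.data.find, nltk.download)
```

A False return value means the download itself failed, for example with the SSL error discussed in this thread.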