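How do I install a library permanently in Colab?

In Google Colaboratory, I can install a new library with !pip install package-name. But when I open the notebook again the next day, I have to reinstall it every time.

Is there a way to install a library permanently, so that I don't have to spend time installing it before each use?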


Yes. You can install the library in Google Drive, then add that path to sys.path.

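import os, sys
from google.colab import drive

# Mount Google Drive so that files written there outlive the session
drive.mount('/content/drive')

# Symlink the Drive folder to a path without spaces
nb_path = '/content/notebooks'
if not os.path.lexists(nb_path):  # os.symlink raises FileExistsError on a re-run
    os.symlink('/content/drive/My Drive/Colab Notebooks', nb_path)
sys.path.insert(0, nb_path)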
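
To see why adding the directory to sys.path is enough, here is a minimal sketch that does not need Drive at all. It assumes a temporary directory as a stand-in for the Drive folder and a made-up module name, demo_persistent_lib; neither comes from the answer above.

```python
import importlib
import os
import sys
import tempfile

# Assumption: a temporary directory stands in for the Drive folder.
lib_dir = tempfile.mkdtemp()

# Hypothetical module that plays the role of an installed package.
with open(os.path.join(lib_dir, "demo_persistent_lib.py"), "w") as f:
    f.write("ANSWER = 42\n")

# Putting the directory at the front of sys.path makes Python search it first.
sys.path.insert(0, lib_dir)
importlib.invalidate_caches()

demo = importlib.import_module("demo_persistent_lib")
print(demo.ANSWER)
```

Anything pip writes into a directory on sys.path is found by import in the same way, which is why the Drive folder can act as a persistent install location.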

Then you can install a library, for example jdc, and specify the target directory.

!pip install --target=$nb_path jdc

Later, when you run the notebook again, you can skip the !pip install line. You can simply import jdc and use it. Here is an example notebook.

https://colab.research.google.com/drive/1kpmdi9cjimudrzxsytdaurjtbahzivjq
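
If you would rather not remember whether the install cell has already been run, the check can be automated. The following is only a sketch: ensure_installed is a hypothetical helper, not part of Colab or pip, and it assumes the import name is a top-level module (by default the same as the package name).

```python
import importlib
import importlib.util
import subprocess
import sys


def ensure_installed(package, target_dir, module_name=None):
    """Install `package` into `target_dir` only if it cannot be imported yet.

    Returns True if pip was run, False if the module was already importable.
    """
    module_name = module_name or package  # assumption: top-level import name
    if target_dir not in sys.path:
        sys.path.insert(0, target_dir)
    importlib.invalidate_caches()
    if importlib.util.find_spec(module_name) is not None:
        return False  # already importable, nothing to install
    subprocess.check_call(
        [sys.executable, "-m", "pip", "install", "--target", target_dir, package]
    )
    importlib.invalidate_caches()
    return True
```

In the notebook this would be called as ensure_installed("jdc", nb_path), followed by import jdc. On the first run it installs into the Drive folder; on later runs it returns without calling pip.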

By the way, I really like jdc's %%add_to. It makes working with a big class much easier.

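If you want a solution that needs no authorization step, you can mount a bucket with gcsfuse plus a service account key embedded in your notebook. Like this: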

%%capture
# First install gcsfuse (%%capture has to be the first line of the cell)
!echo "deb http://packages.cloud.google.com/apt gcsfuse-bionic main" > /etc/apt/sources.list.d/gcsfuse.list
!curl https://packages.cloud.google.com/apt/doc/apt-key.gpg | apt-key add -
!apt update
!apt install -y gcsfuse

Then get your service account credentials from the Google Cloud console and embed them in the notebook:

%%writefile /key.json
{
"type": "service_account",
"project_id": "kora-id",
"private_key_id": "xxxxxxx",
"private_key": "-----BEGIN PRIVATE KEY-----\nxxxxxxx==\n-----END PRIVATE KEY-----\n",
"client_email": "colab-7@kora-id.iam.gserviceaccount.com",
"client_id": "100380920993833371482",
"auth_uri": "https://accounts.google.com/o/oauth2/auth",
"token_uri": "https://oauth2.googleapis.com/token",
"auth_provider_x509_cert_url": "https://www.googleapis.com/oauth2/v1/certs",
"client_x509_cert_url": "https://www.googleapis.com/robot/v1/metadata/x509/colab-7%40kora-id.iam.gserviceaccount.com"
}

Then set the environment variable that points to this credentials file:

%env GOOGLE_APPLICATION_CREDENTIALS=/key.json
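
Before mounting, it can help to confirm that the key file was written correctly. This is a rough sketch only: check_service_account_key and the REQUIRED_FIELDS tuple are hypothetical names, and the field list is an assumption based on the sample key shown above, not an official schema.

```python
import json
import os

# Assumption: a subset of the fields in the sample key above is treated as required.
REQUIRED_FIELDS = ("type", "project_id", "private_key", "client_email", "token_uri")


def check_service_account_key(path=None):
    """Return a list of problems with the key file (an empty list means none found)."""
    path = path or os.environ.get("GOOGLE_APPLICATION_CREDENTIALS")
    if not path:
        return ["GOOGLE_APPLICATION_CREDENTIALS is not set"]
    if not os.path.isfile(path):
        return [f"key file not found: {path}"]
    try:
        with open(path) as f:
            key = json.load(f)
    except json.JSONDecodeError as exc:
        return [f"key file is not valid JSON: {exc}"]
    if not isinstance(key, dict):
        return ["key file does not contain a JSON object"]
    problems = [f"missing field: {name}" for name in REQUIRED_FIELDS if name not in key]
    if key.get("type") != "service_account":
        problems.append("type is not 'service_account'")
    return problems
```

Calling check_service_account_key() with no argument reads the path from GOOGLE_APPLICATION_CREDENTIALS, so it also confirms that the %env line took effect.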

Then you must create (or already have) a GCS bucket, and mount it on a directory of your choosing.

!mkdir /content/my-bucket
!gcsfuse my-bucket /content/my-bucket
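
If the mount does not succeed, /content/my-bucket is just an ordinary local directory, and anything installed there disappears with the session. A small guard can catch that case. This is a sketch under that assumption; require_mount is a hypothetical helper name.

```python
import os


def require_mount(path):
    """Return `path` if it is an active mount point, otherwise raise RuntimeError."""
    if not os.path.ismount(path):
        raise RuntimeError(f"{path} is not a mount point; did the gcsfuse mount succeed?")
    return path
```

For example, nb_path = require_mount('/content/my-bucket') fails early instead of letting pip install into local disk.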

Finally, install the library there, just as in my answer above.

import sys
nb_path = '/content/my-bucket'
sys.path.insert(0, nb_path)
# Do this just once
!pip install --target=$nb_path jdc

Next time, you can import jdc without running !pip install again.

If you need to install several libraries, here is a code snippet:

import os, sys


def install_library_to_drive(libraries_list):
    """ Install each library into its own directory on gdrive. Run this only once. """
    drive_path_root = 'path/to/mounted/drive/directory/where/you/will/install/libraries'
    for lib in libraries_list:
        # os.path.join adds the separator that plain string concatenation would drop
        drive_path_lib = os.path.join(drive_path_root, lib)
        !pip install -q $lib --target=$drive_path_lib
        sys.path.insert(0, drive_path_lib)


def load_library_from_drive(libraries_list):
    """ Technically, it just adds each install dir to sys.path """
    drive_path_root = 'path/to/mounted/drive/directory/where/you/will/install/libraries'
    for lib in libraries_list:
        drive_path_lib = os.path.join(drive_path_root, lib)
        sys.path.insert(0, drive_path_lib)


libraries_list = ["torch", "jsonlines", "transformers"]  # list your libraries
install_library_to_drive(libraries_list)  # Run this just once
load_library_from_drive(libraries_list)
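
To avoid commenting the install call in and out between sessions, one option is to work out which libraries still lack an install directory. This is a sketch only: libraries_missing_from_drive is a hypothetical helper, and it assumes each library is installed into its own subdirectory of the root, named after the library.

```python
import os


def libraries_missing_from_drive(drive_path_root, libraries_list):
    """Return the libraries that have no non-empty install directory under the root."""
    missing = []
    for lib in libraries_list:
        lib_dir = os.path.join(drive_path_root, lib)
        # A missing or empty directory is treated as "not installed yet".
        if not os.path.isdir(lib_dir) or not os.listdir(lib_dir):
            missing.append(lib)
    return missing
```

The result can be passed to install_library_to_drive so that only the missing libraries are installed, with load_library_from_drive then called on the full list as before.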