Download all files in a path on a Jupyter notebook server

As a user doing my assignments in Jupyter notebooks, I can access the assignments through the web interface. I assume the assignments are stored somewhere in my personal space on the server, so I should be able to download them. How can I download all the files in my personal user space? (e.g. with wget)

Here is the path structure:

https://urltoserver/user/username

There are several directories: assignments, data, etc.

https://urltoserver/user/username/assignments

https://urltoserver/user/username/data

...

I want to download all of the folders (recursively). Anything that lets me see locally what I can see on the web is fine. If some folders are off-limits, fine, skip those and download the rest.

Please spell out the exact command, because I can't figure it out myself (I've tried wget).


I don't think this is possible with wget, even with the -r option. You may have to download the files individually, using the Download option in the dashboard view (which is only available for single, non-directory, non-running notebook items), if that is available to you.

However, it is likely that you will not be able to download them: if your teacher is using grading software like nbgrader, then students having access to the notebooks themselves is undesirable, since the notebooks can contain information about the answers as well.

Try running this as a separate cell in one of your notebooks:

!tar chvfz notebook.tar.gz *

If you want to cover more folders up the tree, write ../ before the * for every step up the directory tree. The file notebook.tar.gz will be saved in the same folder as your notebook.
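For example, a cell that archives everything starting two levels above the notebook's folder would look like this (adjust the number of ../ to match your tree):

!tar chvfz notebook.tar.gz ../../*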

First, find your current directory with:

import os
os.getcwd()

Then use the snippet from How to create a zip archive of a directory. You can download the complete directory by zipping it. Good luck!
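A minimal sketch of that approach, assuming you want to zip the assignments folder from the question (swap in whatever os.getcwd() showed you):

import shutil

# Create assignments.zip from the assignments folder (both names assumed)
shutil.make_archive("assignments", "zip", "assignments")

The resulting assignments.zip appears next to your notebook and can then be downloaded from the dashboard like any other single file.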

You can create a new terminal from the "New" menu and call the command described on https://stackoverflow.com/a/47355754/8554972:

tar cvfz notebook.tar.gz *

The file notebook.tar.gz will be saved in the same folder as your notebook.

The easiest way is to archive all the content using tar, but there is also an API for downloading files.

GET /files/_FILE_PATH_

To list all the files in a folder, you can use:

GET /api/contents/work

Example:

curl https://server/api/contents?token=your_token
curl https://server/files/path/to/file.txt?token=your_token --output some.file

Source: Jupyter Docs
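A rough Python sketch that combines the two endpoints to mirror a user space recursively; the server URL and token are placeholders taken from the curl examples above:

import os
import requests

BASE = "https://server"           # placeholder, as in the curl examples
PARAMS = {"token": "your_token"}  # your API token

def download_tree(path=""):
    # List a folder with the contents API, then recurse into
    # subdirectories and fetch everything else via /files/
    listing = requests.get(f"{BASE}/api/contents/{path}", params=PARAMS).json()
    for item in listing["content"]:
        if item["type"] == "directory":
            download_tree(item["path"])
        else:
            r = requests.get(f"{BASE}/files/{item['path']}", params=PARAMS)
            os.makedirs(os.path.dirname(item["path"]) or ".", exist_ok=True)
            with open(item["path"], "wb") as f:
                f.write(r.content)

download_tree()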

I am taking Prof. Andrew Ng's Deeplearning.ai program via Coursera. The curriculum uses Jupyter Notebooks online. Along with the notebooks are folders with large files. Here's what I used to successfully download all assignments with the associated files and folders to my local Windows 10 PC.

Start with the following line of code as suggested in the post by Serzan Akhmetov above:

!tar cvfz allfiles.tar.gz *

This produces a tarball which, if small enough, can be downloaded from the Jupyter notebook itself and extracted using 7-Zip. However, this course has individual files hundreds of MB in size and folders with hundreds of sample images. The resulting tarball is too large to download via the browser.

So add one more line of code to split files into manageable chunk sizes as follows:

!split -b 50m allfiles.tar.gz allfiles.tar.gz.part.

This will split the archive into parts of 50 MB each (or your preferred size setting). Each part will have a suffix like allfiles.tar.gz.part.xx. Download each part as before.

The final task is to untar the multi-part archive. This is very simple with 7-Zip. Just select the first file in the series for extraction with 7-Zip. This is the file named allfiles.tar.gz.part.aa for the example used. It will pull all the necessary parts together as long as they are in the same folder.
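Alternatively, if you would rather not use 7-Zip, a short Python sketch can rejoin and extract the parts on any OS (file names assumed from the split command above):

import glob
import shutil
import tarfile

# Concatenate the parts in name order (.part.aa, .part.ab, ...)
with open("allfiles.tar.gz", "wb") as out:
    for part in sorted(glob.glob("allfiles.tar.gz.part.*")):
        with open(part, "rb") as src:
            shutil.copyfileobj(src, out)

# Extract the rejoined archive
with tarfile.open("allfiles.tar.gz", "r:gz") as tar:
    tar.extractall()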

Hope this helps add to Serzan's excellent answer above.

from google.colab import files

files.download("/content/data.txt")

These lines work if you are in Google Colab; the google.colab module is not available on a regular Jupyter server.

The first line imports the files module. The second downloads your file, e.g. "data.txt" (your file name), located inside the /content folder.
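To grab a whole folder rather than a single file in Colab, one option is to zip it first and download the archive; a sketch, with /content/data as an assumed folder name:

import shutil
from google.colab import files

# Zip the folder (path assumed), then download the single archive
shutil.make_archive("data_backup", "zip", "/content/data")
files.download("data_backup.zip")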

I've made a slight update based on @Sun Bee's solution; it lets you create multiple backups with a timestamp suffix.

!tar cvfz allfiles-`date +"%Y%m%d-%H%M"`.tar.gz *

You just need to run:

zip -r filename.zip folder_name
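If you are inside a notebook rather than a terminal, the same command can be run as a cell, assuming the zip utility is installed on the server:

!zip -r filename.zip folder_name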