How can I export GitHub issues to Excel?

How can I export all my issues from an Enterprise GitHub repository to an Excel file? I have tried searching many Stack Overflow answers but did not succeed. I also tried this solution (exporting Git issues to CSV) but got "ImportError: No module named requests" errors. Is there any tool or any easy way to export all the issues to Excel?

If this is a one-time task, you can use the GitHub REST API directly. It lets you export the issues in JSON format, which you can then convert to Excel (e.g. using an online converter).

Just open the following URL in a browser substituting the {owner} and {repo} with real values:

https://api.github.com/repos/{owner}/{repo}/issues?page=1&per_page=100

To export from a private repo using curl, you can run the following:

curl -i https://api.github.com/repos/<repo-owner>/<repo-name>/issues --header "Authorization: token <token>"

The token can be generated under Personal access tokens in your GitHub account settings.

Inspect the API description for all details.
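
If you would rather script the call than use a browser or curl, here is a minimal Python sketch of the same request using only the standard library. The function names, the API_ROOT constant and the state=all default are my own assumptions for illustration (an Enterprise server has its own API root, which you would substitute).

```python
import json
import urllib.request

# Assumption: github.com. An Enterprise installation uses its own API root.
API_ROOT = "https://api.github.com"


def build_issues_request(owner, repo, token=None, page=1, per_page=100, state="all"):
    """Return a urllib Request for one page of a repository's issues."""
    url = (f"{API_ROOT}/repos/{owner}/{repo}/issues"
           f"?state={state}&page={page}&per_page={per_page}")
    headers = {"Accept": "application/vnd.github+json"}
    if token:
        # Same Authorization header that the curl example sends.
        headers["Authorization"] = f"token {token}"
    return urllib.request.Request(url, headers=headers)


def fetch_issues_page(request):
    """Send the request and decode the JSON array of issues (needs network access)."""
    with urllib.request.urlopen(request) as response:
        return json.load(response)
```

Calling fetch_issues_page(build_issues_request("<repo-owner>", "<repo-name>", token="<token>")) should return a list of dictionaries, one per issue, which you can then write out in whatever format you need.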

Export Pull Requests can export issues to a CSV file, which can be opened with Excel. It also supports GitLab and Bitbucket.

From its documentation:

Export open PRs and issues in sshaw/git-link and sshaw/itunes_store_transporter:

epr sshaw/git-link sshaw/itunes_store_transporter > pr.csv

Export open pull requests not created by sshaw in padrino/padrino-framework:

epr -x pr -c '!sshaw' padrino/padrino-framework > pr.csv

It has several options for filtering what gets exported.

I tried the methods described in other comments regarding exporting issues in JSON format. It worked ok but the formatting was somehow screwed up. Then I found in Excel help that it is able to access APIs directly and load the data from the JSON response neatly into my Excel sheets.

The Google terms I used to find the help I needed were "excel power query web.content GET json". I found a How To Excel video which helped a lot.

URL that worked in the Excel query (same as from other posts):

https://api.github.com/repos/{owner}/{repo}/issues?page=1&per_page=100

Personally, I also add the parameter &state=open, otherwise I need to request hundreds of pages. At one point I reached GitHub's limit on unauthenticated API calls/hour for my IP address.
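
On the paging point: as far as I know, the API tells you where the next page is through a Link response header, so a script does not have to guess page numbers. Below is a small Python sketch that pulls the rel="next" URL out of such a header. The function name is hypothetical and the header values in the usage note are made up for illustration.

```python
import re


def next_page_url(link_header):
    """Return the URL marked rel="next" in a Link header, or None if there is none."""
    if not link_header:
        return None
    # Each entry looks like: <https://...?page=2>; rel="next"
    for url, rel in re.findall(r'<([^>]+)>\s*;\s*rel="([^"]+)"', link_header):
        if rel == "next":
            return url
    return None
```

You would keep requesting next_page_url(response_headers.get("Link")) until it returns None, which also keeps the number of calls (and the rate-limit usage) as low as possible.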

It is unfortunate that github.com does not make this easier.

In the meantime, if you have jq and curl, you can do this in two lines using something like the following example, which outputs the issue number, title and labels (tags) and works for private repos as well (if you don't want to filter by label, just remove the labels=$label& part of the URL). You'll need to substitute $owner, $repo, $label, and $username:

# with personal access token = $PAT
echo "number, title, labels" > issues.csv
curl "https://api.github.com/repos/$owner/$repo/issues?labels=$label&page=1&per_page=100" -u "$username:$PAT" \
| jq -r '.[] | [.number, .title, (.labels|map(.name)|join("/"))]|@csv' >> issues.csv


# without PAT (will be prompted for password)
echo "number, title, labels" > issues.csv
curl "https://api.github.com/repos/$owner/$repo/issues?labels=$label&page=1&per_page=100" -u "$username" \
| jq -r '.[] | [.number, .title, (.labels|map(.name)|join("/"))]|@csv' >> issues.csv

Note that each call returns at most one page (100 issues here), so if you have more than that you need additional calls with page=2, page=3, and so on.
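
If jq is not available, the same row transformation can be sketched in Python. This assumes you already have the decoded JSON (a list of issue dictionaries with number, title and labels keys, as the API returns them); the function name is my own.

```python
import csv
import io


def issues_to_csv(issues):
    """Render number, title and "/"-joined label names as CSV text."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["number", "title", "labels"])
    for issue in issues:
        # Same idea as the jq expression: (.labels | map(.name) | join("/"))
        labels = "/".join(label["name"] for label in issue.get("labels", []))
        writer.writerow([issue["number"], issue["title"], labels])
    return buffer.getvalue()
```

The csv module quotes titles that contain commas or quotes, so the result should open cleanly in Excel.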

The hub command-line wrapper for GitHub makes this pretty simple.

You can do something like this:

$ hub issue -f "%t,%l%n" > list.csv

which gives you something like this:

$ more list.csv


Issue 1 title, tag1 tag2
Issue 2 title, tag3 tag2
Issue 3 title, tag1

I have tinkered with this for quite some time and found that Power BI is a good way of keeping the data up to date in the spreadsheet. I had to look into Power BI a little to make this work, because getting the right info out of the structured JSON fields, and collapsing lists into concatenated strings, especially for labels, wasn't super intuitive. But this Power BI query works well for me by removing all the noise and getting relevant info into an easily digestible format that can be reviewed with stakeholders:

let
MyJsonRecord = Json.Document(Web.Contents("https://api.github.com/repos/<your org>/<your repo>/issues?&per_page=100&page=1&state=open&filter=all", [Headers=[Authorization="Basic <your auth token>", Accept="application/vnd.github.symmetra-preview+json"]])),
MyJsonTable = Table.FromRecords(MyJsonRecord),
#"Column selection" = Table.SelectColumns(MyJsonTable,{"number", "title", "user", "labels", "state", "assignee", "assignees", "comments", "created_at", "updated_at", "closed_at", "body"}),
#"Expanded labels" = Table.ExpandListColumn(#"Column selection", "labels"),
#"Expanded labels1" = Table.ExpandRecordColumn(#"Expanded labels", "labels", {"name"}, {"labels.name"}),
#"Grouped Rows" = Table.Group(#"Expanded labels1", {"number","title", "user", "state", "assignee", "assignees", "comments", "created_at", "updated_at", "closed_at", "body"}, {{"Label", each Text.Combine([labels.name],","), type text}}),
#"Removed Other Columns" = Table.SelectColumns(#"Grouped Rows",{"number", "title", "state", "assignee", "comments", "created_at", "updated_at", "closed_at", "body", "Label"}),
#"Expanded assignee" = Table.ExpandRecordColumn(#"Removed Other Columns", "assignee", {"login"}, {"assignee.login"})
in
#"Expanded assignee"

I added and then removed columns in this and did not clean this up - feel free to do that before you use it. Obviously, you also have to fill in your own organization name and repo name into the URL, and obtain the auth token. I have tested the URL with a Chrome REST plugin and got the token from entering the user and api key there. You can authenticate explicitly from Excel with the user and key if you don't want to deal with the token. I just find it simpler to go the anonymous route in the query setup and instead provide the readily formatted request header.
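
In case you do not have a REST plugin at hand: under standard HTTP Basic authentication, the value after "Basic" is just the Base64 encoding of user:key. Assuming that is what the plugin produced for me, a short Python sketch to compute the header value would be (the function name is hypothetical):

```python
import base64


def basic_auth_header(username, token):
    """Return the value for an HTTP Basic 'Authorization' header."""
    credentials = f"{username}:{token}".encode("utf-8")
    return "Basic " + base64.b64encode(credentials).decode("ascii")
```

The returned string would go in place of "Basic <your auth token>" in the query. Keep in mind that Base64 is an encoding, not encryption, so treat the workbook like a file that contains your key.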

Also, this works for repos with up to 100 open issues. If you have more, you need to duplicate the query (for page 2 etc) and combine the results.

Steps for using this query:

  • in a new sheet, on the "Data" tab, open the "Get Data" drop-down
  • select "Launch Power Query Editor"
  • in the editor, choose "New Query", "Other Sources", "Blank query"
  • now you click on "Advanced Editor" and paste the above query
  • click the "Done" button on the Advanced Editor, then "Close and Load" from the tool bar
  • the issues are loading in your spreadsheet and you are in business
  • no crappy third-party tool needed

You can also try https://github.com/remoteorigin/git-issues-downloader but be sure to use the develop branch. The npm version and the master branch are buggy.

Or you can use this patched version with

npm install -g https://github.com/mkobar/git-issues-downloader

and then run with (for public repo)

git-issues-downloader -n -p none -u none https://github.com/<user>/<repository>

or for a private repo:

git-issues-downloader -n -p <password or token> -u <user> https://github.com/<user>/<repository>

Works great.

Here is a tool that does it for you (uses the GitHub API): https://github.com/gavinr/github-csv-tools

GitHub's JSON API can be queried directly from Excel using Power Query. It does require some knowledge of how to convert JSON into Excel table format, but that's fairly Googlable.

Here's how to first get to the data:

  • In Excel, on the Ribbon, click Data > Get Data > From JSON. In the dialog box, enter the API URL in a format similar to this (add parameters as you wish): https://api.github.com/repos/{owner}/{repo}/issues

  • A dialog box labeled "Access Web content" will appear.

  • On the left-hand side, click the Basic tab.

  • In the User name textbox, enter your GitHub username.

  • In the Password textbox, enter a GitHub password/Personal Access token.

  • Click Connect.

  • Power Query Editor will be displayed with a list of items that say Record.

... now Google around for how to transform accordingly so that the appropriate issue data can be displayed as a single table.
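
To illustrate what that transformation amounts to, here is a rough Python sketch of turning a list of nested records into one flat table. It is not Power Query code, only the same idea: pick the columns you want and expand nested records such as user or assignee by a dotted path. The function names and the dotted-path convention are my own assumptions.

```python
def get_path(record, dotted_path):
    """Follow a dotted path such as "user.login"; return None if any step is missing."""
    value = record
    for key in dotted_path.split("."):
        if not isinstance(value, dict):
            return None  # e.g. "assignee" is null for unassigned issues
        value = value.get(key)
    return value


def records_to_table(records, columns):
    """Turn nested records into a header row plus one flat row per record."""
    rows = [list(columns)]
    for record in records:
        rows.append([get_path(record, column) for column in columns])
    return rows
```

For example, records_to_table(issues, ["number", "title", "user.login", "assignee.login"]) gives one row per issue with the nested logins pulled up into ordinary columns.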

With the official GitHub CLI you can easily export all issues into a CSV format.

brew install gh

Log in:

gh auth login

Change directory to a repository and run this command:

gh issue list --limit 1000 --state all | tr '\t' ',' > issues.csv

In some European locales, Excel expects a semicolon ';' as the .csv separator instead of a comma. Change the separator as needed.
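
One caveat with the plain tr approach is that a title containing a comma ends up split across columns. A small Python sketch that converts the tab-separated output with proper quoting, and lets you choose the separator, could look like this (the function name and the sample data in the usage note are assumptions of mine):

```python
import csv
import io


def tsv_to_csv(tsv_text, delimiter=","):
    """Convert tab-separated text to delimited text, quoting fields when needed."""
    # QUOTE_NONE: treat quote characters in the input as ordinary text.
    reader = csv.reader(io.StringIO(tsv_text), delimiter="\t", quoting=csv.QUOTE_NONE)
    buffer = io.StringIO()
    writer = csv.writer(buffer, delimiter=delimiter, lineterminator="\n")
    writer.writerows(reader)
    return buffer.getvalue()
```

You could save the gh output to a file first, read it in, and call tsv_to_csv(text, delimiter=";") if your Excel expects semicolons.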

As a one-time task, building on the 'hub'-based recommendation from @Chip... on a Windows system with Git Bash already installed:

  1. Download the latest hub executable (such as Windows 64 bit) https://github.com/github/hub/releases/ and extract it (hub.exe is in the .../bin directory).

  2. Create a GitHub personal access token at https://github.com/settings/tokens and copy the token text string to the clipboard.

  3. Create a text file (such as in Notepad) to use as the input file to hub.exe. The first line is your GitHub user name and the second line is the personal access token, followed by a newline (so that both lines will be processed when input to hub). Here I presume the file is infile.txt in the repository's base directory.

  4. Run Git Bash... and remember to cd (change directory) to the repository of interest! Then enter a line like:

    <path_to_hub_folder>/bin/hub.exe issue -s all -f "%U|%t|%S|%cI|%uI|%L%n" < infile.txt > outfile.csv

  5. Then open the file with '|' as the column delimiter (and consider deleting the personal access token on GitHub afterwards).

You can do it using the Python package PyGithub:

from github import Github

# Authenticate with a personal access token
gh = Github('personal token key here')
repo = gh.get_repo('repo-owner/repo-name')

# state='all' returns open and closed issues (pull requests are included too)
issues = repo.get_issues(state='all')
for issue in issues:
    print(issue.url)

Here I print the URL; you can get other content instead by changing the '.url' attribute (for example to '.title' or '.body'). Then just export the issue links or content to CSV.
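
For the CSV step, a minimal sketch could look like the following. It only assumes that each issue object has the number, title, state, html_url and pull_request attributes, which PyGithub issue objects provide as far as I know; the function name and the chosen columns are my own.

```python
import csv


def write_issue_rows(issues, out_file):
    """Write number, title, state and web URL for each issue; return the row count."""
    writer = csv.writer(out_file, lineterminator="\n")
    writer.writerow(["number", "title", "state", "url"])
    count = 0
    for issue in issues:
        # The issues endpoint also returns pull requests; skip them here.
        if getattr(issue, "pull_request", None) is not None:
            continue
        writer.writerow([issue.number, issue.title, issue.state, issue.html_url])
        count += 1
    return count
```

You would open the output with open("issues.csv", "w", newline="", encoding="utf-8") and pass that file together with the result of repo.get_issues(state='all').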

You can also check out the one-liner that I created (it involves GitHub CLI and jq)

gh issue list --limit 10000 --state all --json number,title,assignees,state,url | jq  -r '["number","title","assignees","state","url"], (.[] | [.number, .title, (.assignees | if .|length==0 then "Unassigned" elif .|length>1 then map(.login)|join(",") else .[].login end) , .state, .url]) | @tsv' > issues-$(date '+%Y-%m-%d').tsv

Gist with documentation

The GitHub CLI (gh) now integrates jq: the --jq <expression> option filters JSON output using a jq expression, as documented in the GitHub CLI manual at https://cli.github.com/manual/gh_issue_list.

TSV dump.

gh issue list --limit 10 --state all --json title,body --jq '["title","body"], (.[] | [.title,.body]) | @tsv' > issues-$(date '+%Y-%m-%d').tsv

CSV dump

Surprisingly, the carriage-return character (U+000D) needs to be filtered out with tr $'\x{0D}' ' '.

gh issue list --limit 10 --state all --json title,body --jq '["title","body"], (.[] | [.title,.body]) | @csv' | tr $'\x{0D}' ' ' > issues-$(date '+%Y-%m-%d').csv
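
If you post-process the data in Python instead, the same carriage-return cleanup can be sketched as below. Writing the file with the utf-8-sig encoding adds a byte-order mark, which in my experience helps Excel pick up UTF-8 correctly; the function names are hypothetical.

```python
import csv


def clean_cell(value):
    """Replace carriage returns with spaces, like the tr call does."""
    if not isinstance(value, str):
        return value
    return value.replace("\r", " ")


def write_excel_csv(path, header, rows):
    """Write rows as CSV with a UTF-8 byte-order mark so Excel detects the encoding."""
    with open(path, "w", newline="", encoding="utf-8-sig") as handle:
        writer = csv.writer(handle)
        writer.writerow(header)
        for row in rows:
            writer.writerow([clean_cell(cell) for cell in row])
```

Line feeds inside an issue body are left alone here; the csv module quotes such cells, so they stay within one field.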