有没有办法将 BigQuery 表的模式导出为 JSON?

BigQuery桌子具有模式,可以在 web UI、 更新中查看,也可以用 bq工具作为 JSON 文件用于 载入数据。但是,我无法找到将这个模式从现有表转储到 JSON 文件(最好从命令行转储)的方法。这可能吗?

108773 次浏览

a way to dump schema from an existing table to a JSON file (preferably from the command-line). Is that possible?

try below

bq show bigquery-public-data:samples.wikipedia

You can use –format flag to prettify output

--format: none|json|prettyjson|csv|sparse|pretty:

Format for command output. Options include:

none:       ...
pretty:     formatted table output
sparse:     simpler table output
prettyjson: easy-to-read JSON format
json:       maximally compact JSON
csv:        csv format with header

The first three are intended to be human-readable, and the latter three are for passing to another program. If no format is selected, one will be chosen based on the command run.

Realized I provided partial answer :o)

Below does what PO wanted

bq show --format=prettyjson bigquery-public-data:samples.wikipedia | jq '.schema.fields'

You can add the flag --schema[1] in order to avoid table data information.

bq show --schema --format=prettyjson [PROJECT_ID]:[DATASET].[TABLE] > [SCHEMA_FILE]


bq show --schema --format=prettyjson myprojectid:mydataset.mytable > /tmp/myschema.json

[1] https://cloud.google.com/bigquery/docs/managing-table-schemas

You can use REST API call to get BigQuery table schema as JSON. Documentation link: https://cloud.google.com/bigquery/docs/reference/rest/v2/tables/get

curl 'https://bigquery.googleapis.com/bigquery/v2/projects/project-name/datasets/dataset-name/tables/table-name' \
--header 'Authorization: Bearer [YOUR_ACCESS_TOKEN]' \
--header 'Accept: application/json' \
--compressed

Answer update

Since October 2020, you can also run a SQL query on INFORMATION_SCHEMA.COLUMNS which is kind of an introspective functionality.

SELECT *
FROM <YOUR_DATASET>.INFORMATION_SCHEMA.COLUMNS

and nest the data using an aggregation function such as

SELECT table_name, ARRAY_AGG(STRUCT(column_name, data_type)) as columns
FROM <YOUR_DATASET>.INFORMATION_SCHEMA.COLUMNS
GROUP BY table_name

The are also interesting metadata in INFORMATION_SCHEMA.VIEWS if you also need the source code from your views.

Then hit save results / JSON from the BigQuery interface, or wrap it into the bq query command line in your case.

Source: BigQuery release notes

As of 15th May 2022, this worked:

  1. In google cloud, Go to cloud shell
  2. Select the project from drop down (left) of cloud shell
  3. Use below command bq show --schema --format=prettyjson .
  1. select table on the bq UI.
  2. select columns you would want to export schema for.
  3. use the copy menu to copy schema as JSON.

table schema

The following bash script & sql always helped me solve the problem to extract all tables schema to JSON file from a dataset:

#!/bin/bash
#gen-default-schema.sh
input=$1
source_type=$2
result=tables_${source_type}.result


bq query --format=csv --use_legacy_sql=false --flagfile=$input | awk '{if(NR>1)print}' > $result


while IFS= read -r line
do
tbl_name=`echo "$line" | awk -F. '{print $NF}'`
schema_file=`echo "$tbl_name" | cut -d'_' -f2-`.schema
echo $schema_file
bq show --schema --format=prettyjson $line > ./temp/${source_type}/${schema_file}
echo "done"
done < "$result"

Input file example.sql ($1)

SELECT
table_catalog || ":" || table_schema || "." || table_name
FROM (
SELECT
table_catalog,
table_schema,
table_name
FROM
`project-id`.<dataset_id>.INFORMATION_SCHEMA.TABLES
ORDER BY
table_name ASC )

To run:

$bash gen-default-schema.sh example.sql example

This will place all the JSON schema under ./temp folder

IF you want to do this from google cloud-console then a short SQL query can achieve this.

It'll give you all the info from schema and you can change the STRUCT( ... ) with https://cloud.google.com/bigquery/docs/information-schema-column-field-paths#schema as you wish.

Alternatively use INFORMATION_SCHEMA.<something> with other views to get different meta info to JSON.

As @Michel Hua said in their answer, select Query results -> JSON in bigquery to get JSON after running the SQL query

SELECT table_name, ARRAY_AGG(STRUCT(column_name, data_type, description)) as columns
FROM `your-project-id`.your_dataset.INFORMATION_SCHEMA.COLUMN_FIELD_PATHS
WHERE table_name = 'your_table_name'
GROUP BY table_name