有没有办法将 BigQuery 表的模式导出为 JSON？

小开

最佳答案

a way to dump schema from an existing table to a JSON file (preferably from the command-line). Is that possible?

try below

bq show bigquery-public-data:samples.wikipedia

You can use –format flag to prettify output

Format for command output. Options include:

none:       ...
pretty:     formatted table output
sparse:     simpler table output
prettyjson: easy-to-read JSON format
json:       maximally compact JSON
csv:        csv format with header

The first three are intended to be human-readable, and the latter three are for passing to another program. If no format is selected, one will be chosen based on the command run.

Realized I provided partial answer :o)

Below does what PO wanted

bq show --format=prettyjson bigquery-public-data:samples.wikipedia | jq '.schema.fields'

小开

You can add the flag --schema[1] in order to avoid table data information.

bq show --schema --format=prettyjson [PROJECT_ID]:[DATASET].[TABLE] > [SCHEMA_FILE]


bq show --schema --format=prettyjson myprojectid:mydataset.mytable > /tmp/myschema.json

[1] https://cloud.google.com/bigquery/docs/managing-table-schemas

小开

You can use REST API call to get BigQuery table schema as JSON. Documentation link: https://cloud.google.com/bigquery/docs/reference/rest/v2/tables/get

curl 'https://bigquery.googleapis.com/bigquery/v2/projects/project-name/datasets/dataset-name/tables/table-name' \
--header 'Authorization: Bearer [YOUR_ACCESS_TOKEN]' \
--header 'Accept: application/json' \
--compressed

小开

Answer update

Since October 2020, you can also run a SQL query on INFORMATION_SCHEMA.COLUMNS which is kind of an introspective functionality.

SELECT *
FROM <YOUR_DATASET>.INFORMATION_SCHEMA.COLUMNS

and nest the data using an aggregation function such as

SELECT table_name, ARRAY_AGG(STRUCT(column_name, data_type)) as columns
FROM <YOUR_DATASET>.INFORMATION_SCHEMA.COLUMNS
GROUP BY table_name

The are also interesting metadata in INFORMATION_SCHEMA.VIEWS if you also need the source code from your views.

Then hit save results / JSON from the BigQuery interface, or wrap it into the bq query command line in your case.

Source: BigQuery release notes

小开

As of 15th May 2022, this worked:

In google cloud, Go to cloud shell
Select the project from drop down (left) of cloud shell
Use below command bq show --schema --format=prettyjson .

小开

select table on the bq UI.
select columns you would want to export schema for.
use the copy menu to copy schema as JSON.

小开

The following bash script & sql always helped me solve the problem to extract all tables schema to JSON file from a dataset:

#!/bin/bash
#gen-default-schema.sh
input=$1
source_type=$2
result=tables_${source_type}.result


bq query --format=csv --use_legacy_sql=false --flagfile=$input | awk '{if(NR>1)print}' > $result


while IFS= read -r line
do
tbl_name=`echo "$line" | awk -F. '{print $NF}'`
schema_file=`echo "$tbl_name" | cut -d'_' -f2-`.schema
echo $schema_file
bq show --schema --format=prettyjson $line > ./temp/${source_type}/${schema_file}
echo "done"
done < "$result"

Input file example.sql ($1)

SELECT
table_catalog || ":" || table_schema || "." || table_name
FROM (
SELECT
table_catalog,
table_schema,
table_name
FROM
`project-id`.<dataset_id>.INFORMATION_SCHEMA.TABLES
ORDER BY
table_name ASC )

To run:

$bash gen-default-schema.sh example.sql example

This will place all the JSON schema under ./temp folder

小开

IF you want to do this from google cloud-console then a short SQL query can achieve this.

It'll give you all the info from schema and you can change the STRUCT( ... ) with https://cloud.google.com/bigquery/docs/information-schema-column-field-paths#schema as you wish.

Alternatively use INFORMATION_SCHEMA.<something> with other views to get different meta info to JSON.

As @Michel Hua said in their answer, select Query results -> JSON in bigquery to get JSON after running the SQL query

SELECT table_name, ARRAY_AGG(STRUCT(column_name, data_type, description)) as columns
FROM `your-project-id`.your_dataset.INFORMATION_SCHEMA.COLUMN_FIELD_PATHS
WHERE table_name = 'your_table_name'
GROUP BY table_name