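Completely copying a postgres table with SQL

Disclaimer: this question is similar to the Stack Overflow question here, but none of those answers work for my problem, for reasons I explain below.

I'm trying to copy a large table (roughly 40M rows, 100+ columns) in postgres, where many of the columns are indexed. Currently I use this bit of SQL: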

CREATE TABLE <tablename>_copy (LIKE <tablename> INCLUDING ALL);
INSERT INTO <tablename>_copy SELECT * FROM <tablename>;

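There are two problems with this method:

  1. It adds the indexes before data ingest, so it takes much longer than creating the table without indexes and building them after all of the data has been copied.
  2. It does not copy "SERIAL" style columns properly. Instead of setting up a new "counter" on the new table, it sets the default value of the column in the new table to the counter of the original table, meaning it won't increment as rows are added.

The size of the table makes indexing a real time issue. It also makes it infeasible to dump to a file and then re-ingest. I also don't have the advantage of a command line. I need to do this in SQL.

What I'd like to do is either make an exact copy directly with some miracle command or, if that's not possible, copy the table with all of its constraints but without indexes, and make sure they are the constraints "in spirit" (that is, a new counter for a SERIAL column). Then copy all of the data with a SELECT *, and then copy over all of the indexes.

Sources

  1. Stack Overflow question about database copying: this isn't what I'm asking for, for three reasons

    • It uses the command line option pg_dump -t x2 | sed 's/x2/x3/g' | psql, and in this setting I don't have access to the command line
    • It creates the indexes before data ingest, which is slow
    • It doesn't correctly update the serial columns, as evidenced by default nextval('x1_id_seq'::regclass)
  2. Method for resetting the sequence value of a postgres table: this is great, but unfortunately it is very manual.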


Apparently you want to "rebuild" a table. If you only want to rebuild a table, not copy it, then you should use CLUSTER instead.

SELECT count(*) FROM tablename; -- do a seq scan to make sure the table is at least
                                -- decently cached
CLUSTER tablename USING someindex;

You get to choose the index, try to pick one that suits your queries. You can always use the primary key if no other index is suitable.

If your table is too large to be cached, CLUSTER can be slow though.

The closest "miracle command" is something like

pg_dump -t tablename | sed -r 's/\btablename\b/tablename_copy/' | psql -f -

In particular, this takes care of creating the indexes after loading the table data.

But that doesn't reset the sequences; you will have to script that yourself.
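
Resetting the sequence by hand can be sketched in plain SQL. The names below (tablename_copy, an id column, and tablename_copy_id_seq) are placeholders rather than anything taken from the question, and the sketch assumes a single integer SERIAL-style column:

```sql
-- Give the copy its own counter instead of sharing the original table's sequence.
CREATE SEQUENCE tablename_copy_id_seq;

-- Point the column default at the new sequence.
ALTER TABLE tablename_copy
    ALTER COLUMN id SET DEFAULT nextval('tablename_copy_id_seq');

-- Tie the sequence to the column so it is dropped together with the table.
ALTER SEQUENCE tablename_copy_id_seq OWNED BY tablename_copy.id;

-- Start counting just past the highest id that was copied (or at 1 if the table is empty).
SELECT setval('tablename_copy_id_seq',
              COALESCE((SELECT max(id) FROM tablename_copy), 0) + 1,
              false);
```

With the third argument set to false, the next nextval call returns exactly the value given, so new rows continue after the highest copied id.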

Well, you're gonna have to do some of this stuff by hand, unfortunately. But it can all be done from something like psql. The first command is simple enough:

select * into newtable from oldtable

This will create newtable with oldtable's data but not indexes. Then you've got to create the indexes and sequences etc on your own. You can get a list of all the indexes on a table with the command:

select indexdef from pg_indexes where tablename='oldtable';
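
As a rough sketch, the same catalog view can also generate candidate CREATE INDEX statements for the new table. This assumes plain lowercase, unquoted names (oldtable, newtable) and a hypothetical _copy suffix for the index names. The exact text of indexdef varies between PostgreSQL versions, so the output should be reviewed before it is run:

```sql
-- Generates statement text only; nothing is executed.
SELECT replace(
         -- Retarget the statement at the new table (optional schema prefix dropped).
         regexp_replace(indexdef,
                        ' ON ([a-z0-9_]+[.])?oldtable ',
                        ' ON newtable '),
         -- Index names must be unique within a schema, so add a suffix.
         ' INDEX ' || indexname || ' ',
         ' INDEX ' || indexname || '_copy ') || ';' AS create_stmt
FROM pg_indexes
WHERE tablename = 'oldtable';
```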

Then run psql -E to access your db and use \d to look at the old table. You can then mangle these two queries to get the info on the sequences:

SELECT c.oid,
       n.nspname,
       c.relname
FROM pg_catalog.pg_class c
     LEFT JOIN pg_catalog.pg_namespace n ON n.oid = c.relnamespace
WHERE c.relname ~ '^(oldtable)$'
  AND pg_catalog.pg_table_is_visible(c.oid)
ORDER BY 2, 3;


SELECT a.attname,
       pg_catalog.format_type(a.atttypid, a.atttypmod),
       (SELECT substring(pg_catalog.pg_get_expr(d.adbin, d.adrelid) for 128)
        FROM pg_catalog.pg_attrdef d
        WHERE d.adrelid = a.attrelid AND d.adnum = a.attnum AND a.atthasdef),
       a.attnotnull, a.attnum
FROM pg_catalog.pg_attribute a
WHERE a.attrelid = '74359' AND a.attnum > 0 AND NOT a.attisdropped
ORDER BY a.attnum;

Replace that 74359 above with the oid you get from the previous query.
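
The oid lookup can probably be skipped by casting the table name to regclass, and pg_get_serial_sequence reports which sequence, if any, is owned by each column. A sketch, with oldtable as the same placeholder name used above:

```sql
-- One row per live column: name, type, and the sequence owned by that column.
SELECT a.attname,
       pg_catalog.format_type(a.atttypid, a.atttypmod) AS data_type,
       pg_catalog.pg_get_serial_sequence('oldtable', a.attname) AS owned_sequence
FROM pg_catalog.pg_attribute a
WHERE a.attrelid = 'oldtable'::regclass  -- resolves the name to its oid
  AND a.attnum > 0                       -- skip system columns
  AND NOT a.attisdropped
ORDER BY a.attnum;
```

Columns that are not backed by an owned sequence show NULL in the last column.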

WARNING:

All the answers which use pg_dump and any sort of regular expression to replace the name of the source table are really dangerous. What if your data contains the substring that you are trying to replace? You will end up changing your data!

I propose a two-pass solution:

  1. eliminate data lines from the dump using some data-specific regexp
  2. perform search-and-replace on the remaining lines

Here's an example written in Ruby:

ruby -pe 'gsub(/(members?)/, "\\1_copy_20130320") unless $_ =~ /^\d+\t.*(?:t|f)$/' < members-production-20130320.sql > copy_members_table-20130320.sql

In the above I am trying to copy "members" table into "members_copy_20130320". My data-specific regexp is /^\d+\t.*(?:t|f)$/

The above type of solution works for me. Caveat emptor...

edit:

OK, here's another way in pseudo-shell syntax for the regexp-averse people:

  1. pg_dump -s -t mytable mydb > mytable_schema.sql
  2. search-and-replace table name in mytable_schema.sql > mytable_copy_schema.sql
  3. psql -f mytable_copy_schema.sql mydb

  4. pg_dump -a -t mytable mydb > mytable_data.sql

  5. replace "mytable" in the few SQL statements preceding the data section of mytable_data.sql
  6. psql -f mytable_data.sql mydb

The create table as feature in PostgreSQL may now be the answer the OP was looking for.

https://www.postgresql.org/docs/9.5/static/sql-createtableas.html

create table my_table_copy as
select * from my_table

This will create a new table with the same columns and data, but without the indexes, constraints, defaults, triggers etc. of the original.

Adding with no data will copy the schema without the data.

create table my_table_copy as
select * from my_table
with no data

This will create an empty table with the same columns. Like the version with data, it has no indexes, triggers etc.


create table my_table_copy (like my_table including all)

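The create table like syntax with including all will copy column defaults, check and not-null constraints, indexes (including primary key and unique constraints), etc., but not the data. Triggers and foreign key constraints are not copied.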

create table newTableName (like oldTableName including indexes);
insert into newTableName select * from oldTableName;

This worked for me on 9.3.
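
For a table as large as the one in the question, a variant that loads the data first and builds the indexes afterwards may be faster. The following is only a sketch: the id and some_column columns and the index name are hypothetical, and it assumes a version where INCLUDING ALL is available and a table without identity or generated columns:

```sql
BEGIN;

-- Copy columns, defaults and check constraints, but skip indexes for now.
-- Skipping indexes also skips PRIMARY KEY and UNIQUE constraints.
CREATE TABLE newTableName (LIKE oldTableName INCLUDING ALL EXCLUDING INDEXES);

-- Bulk load while the table has no indexes to maintain.
INSERT INTO newTableName SELECT * FROM oldTableName;

-- Recreate the primary key and secondary indexes after the load.
ALTER TABLE newTableName ADD PRIMARY KEY (id);
CREATE INDEX newtablename_some_column_idx ON newTableName (some_column);

COMMIT;
```

Column defaults copied this way still call nextval on the original table's sequence, so a SERIAL-style column needs its own sequence set up separately.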

To copy a table's columns and data (but not its indexes, constraints or defaults), you use the following statement:

CREATE TABLE new_table AS
TABLE existing_table;

To copy a table structure without data, you add the WITH NO DATA clause to the CREATE TABLE statement as follows:

CREATE TABLE new_table AS
TABLE existing_table
WITH NO DATA;

To copy a table with partial data from an existing table, you use the following statement:

CREATE TABLE new_table AS
SELECT *
FROM existing_table
WHERE condition;
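
Since CREATE TABLE AS copies only column definitions and rows, it can be worth checking the result before relying on it. A small sketch, reusing the existing_table and new_table placeholder names from above:

```sql
-- Row counts of source and copy side by side.
SELECT (SELECT count(*) FROM existing_table) AS source_rows,
       (SELECT count(*) FROM new_table)      AS copied_rows;

-- Indexes present on the copy; CREATE TABLE AS itself creates none.
SELECT indexname, indexdef
FROM pg_indexes
WHERE tablename = 'new_table';

-- Column defaults on the copy; CREATE TABLE AS does not carry these over,
-- so a SERIAL-style column loses its nextval(...) default.
SELECT column_name, column_default
FROM information_schema.columns
WHERE table_name = 'new_table';
```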