MySQL: 比较两个表之间的差异

除 mysql 外,与 甲骨文差异: 如何比较两个表?相同。

假设我有两个表 t1和 t2,它们在布局上是相同的,但是可能包含不同的数据。

区分这两张桌子的最好方法是什么?

更准确地说,我试图找出一个简单的 SQL 查询,它告诉我 t1中某一行的数据是否与 t2中相应行的数据不同

看来我不能使用互联系统

SELECT * FROM robot intersect SELECT * FROM tbd_robot

我得到一个错误代码:

[错误代码: 1064,SQL 状态: 42000]您的 SQL 中有一个错误 检查与 MySQL 服务器版本相对应的手册 在第1行“ SELECT * FROM tbd _ Robot”附近使用正确的语法

我在语法上做错了什么吗? 如果没有,是否有其他的查询我可以使用?

编辑: 另外,我正在通过一个免费版本的 DbVisualizer 进行查询。不确定这是否是一个因素。

144793 次浏览

可以使用 UNION 手动构造交集。如果在两个表中都有一些惟一的字段,这很容易,例如 ID:

SELECT * FROM T1
WHERE ID NOT IN (SELECT ID FROM T2)


UNION


SELECT * FROM T2
WHERE ID NOT IN (SELECT ID FROM T1)

如果您没有唯一的值,您仍然可以展开上面的代码来检查所有字段,而不仅仅是 ID,并使用 AND 来连接它们(例如 ID NOT IN (...)和 OTHER _ Field NOT IN (...)等)

INTERSECT需要在 MySQL中进行仿真:

SELECT  'robot' AS `set`, r.*
FROM    robot r
WHERE   ROW(r.col1, r.col2, …) NOT IN
(
SELECT  col1, col2, ...
FROM    tbd_robot
)
UNION ALL
SELECT  'tbd_robot' AS `set`, t.*
FROM    tbd_robot t
WHERE   ROW(t.col1, t.col2, …) NOT IN
(
SELECT  col1, col2, ...
FROM    robot
)

我在这个 链接里找到了另一种解决方案

SELECT MIN (tbl_name) AS tbl_name, PK, column_list
FROM
(
SELECT ' source_table ' as tbl_name, S.PK, S.column_list
FROM source_table AS S
UNION ALL
SELECT 'destination_table' as tbl_name, D.PK, D.column_list
FROM destination_table AS D
)  AS alias_table
GROUP BY PK, column_list
HAVING COUNT(*) = 1
ORDER BY PK

根据 Haim 的回答,我创建了一个 PHP 代码来测试和显示两个数据库之间的所有差异。 如果在源数据库或测试数据库中存在表,也会显示这一点。 您必须根据您的详细信息更改 < > 变量内容。

<?php


$User = "<DatabaseUser>";
$Pass = "<DatabasePassword>";
$SourceDB = "<SourceDatabase>";
$TestDB = "<DatabaseToTest>";


$link = new mysqli( "p:". "localhost", $User, $Pass, "" );


if ( mysqli_connect_error() ) {


die('Connect Error ('. mysqli_connect_errno() .') '. mysqli_connect_error());


}


mysqli_set_charset( $link, "utf8" );
mb_language( "uni" );
mb_internal_encoding( "UTF-8" );


$sQuery = 'SELECT TABLE_NAME FROM INFORMATION_SCHEMA.TABLES WHERE TABLE_SCHEMA="'. $SourceDB .'";';


$SourceDB_Content = query( $link, $sQuery );


if ( !is_array( $SourceDB_Content) ) {


echo "Table $SourceDB cannot be accessed";
exit(0);


}


$sQuery = 'SELECT TABLE_NAME FROM INFORMATION_SCHEMA.TABLES WHERE TABLE_SCHEMA="'. $TestDB .'";';


$TestDB_Content = query( $link, $sQuery );


if ( !is_array( $TestDB_Content) ) {


echo "Table $TestDB cannot be accessed";
exit(0);


}


$SourceDB_Tables = array();
foreach( $SourceDB_Content as $item ) {
$SourceDB_Tables[] = $item["TABLE_NAME"];
}


$TestDB_Tables = array();
foreach( $TestDB_Content as $item ) {
$TestDB_Tables[] = $item["TABLE_NAME"];
}
//var_dump( $SourceDB_Tables, $TestDB_Tables );
$LookupTables = array_merge( $SourceDB_Tables, $TestDB_Tables );
$NoOfDiscrepancies = 0;
echo "


<table border='1' width='100%'>
<tr>
<td>Table</td>
<td>Found in $SourceDB (". count( $SourceDB_Tables ) .")</td>
<td>Found in $TestDB (". count( $TestDB_Tables ) .")</td>
<td>Test result</td>
<tr>


";


foreach( $LookupTables as $table ) {


$FoundInSourceDB = in_array( $table, $SourceDB_Tables ) ? 1 : 0;
$FoundInTestDB = in_array( $table, $TestDB_Tables ) ? 1 : 0;
echo "


<tr>
<td>$table</td>
<td><input type='checkbox' ". ($FoundInSourceDB == 1 ? "checked" : "") ."></td>
<td><input type='checkbox' ". ($FoundInTestDB == 1 ? "checked" : "") ."></td>
<td>". compareTables( $SourceDB, $TestDB, $table ) ."</td>
</tr>
";


}


echo "


</table>
<br><br>
No of discrepancies found: $NoOfDiscrepancies
";




function query( $link, $q ) {


$result = mysqli_query( $link, $q );


$errors = mysqli_error($link);
if ( $errors > "" ) {


echo $errors;
exit(0);


}


if( $result == false ) return false;
else if ( $result === true ) return true;
else {


$rset = array();


while ( $row = mysqli_fetch_assoc( $result ) ) {


$rset[] = $row;


}


return $rset;


}


}


function compareTables( $source, $test, $table ) {


global $link;
global $NoOfDiscrepancies;


$sQuery = "


SELECT column_name,ordinal_position,data_type,column_type FROM
(
SELECT
column_name,ordinal_position,
data_type,column_type,COUNT(1) rowcount
FROM information_schema.columns
WHERE
(
(table_schema='$source' AND table_name='$table') OR
(table_schema='$test' AND table_name='$table')
)
AND table_name IN ('$table')
GROUP BY
column_name,ordinal_position,
data_type,column_type
HAVING COUNT(1)=1
) A;


";


$result = query( $link, $sQuery );


$data = "";
if( is_array( $result ) && count( $result ) > 0 ) {


$NoOfDiscrepancies++;
$data = "<table><tr><td>column_name</td><td>ordinal_position</td><td>data_type</td><td>column_type</td></tr>";


foreach( $result as $item ) {


$data .= "<tr><td>". $item["column_name"] ."</td><td>". $item["ordinal_position"] ."</td><td>". $item["data_type"] ."</td><td>". $item["column_type"] ."</td></tr>";


}


$data .= "</table>";


return $data;


}
else {


return "Checked but no discrepancies found!";


}


}


?>
 select t1.user_id,t2.user_id
from t1 left join t2 ON t1.user_id = t2.user_id
and t1.username=t2.username
and t1.first_name=t2.first_name
and t1.last_name=t2.last_name

尝试这个。这将比较您的表,并找到所有匹配对,如果有任何不匹配返回左侧为 NULL。

根据 Haim 的回答,这里有一个简化的例子,如果你想比较两个表中都存在的值,否则如果一个表中有一行,而另一个表中没有,它也会返回它... ..。

我花了几个小时才搞清楚

SELECT ID, col
FROM
(
SELECT
tbl_a.ID, tbl_a.col FROM tbl_a
UNION ALL
SELECT
tbl_b.ID, tbl_b.col FROM tbl_b
) t
WHERE ID IN (select ID from tbl_a) AND ID IN (select ID from tbl_b)
GROUP BY
ID, col
HAVING COUNT(*) = 1
ORDER BY ID

因此,你需要添加额外的“ where in”子句:

WHERE ID IN (从 tbl _ a 中选择 ID)和 ID IN (从 tbl _ b 中选择 ID)


另外:

为方便阅读,如果您想指明表名,可以使用以下内容:

SELECT tbl, ID, col
FROM
(
SELECT
tbl_a.ID, tbl_a.col, "name_to_display1" as "tbl" FROM tbl_a
UNION ALL
SELECT
tbl_b.ID, tbl_b.col, "name_to_display2" as "tbl" FROM tbl_b
) t
WHERE ID IN (select ID from tbl_a) AND ID IN (select ID from tbl_b)
GROUP BY
ID, col
HAVING COUNT(*) = 1
ORDER BY ID

下面的问题,是比较表前后我做大更新。

如果使用 Linux,可以使用以下命令:

在终点站,

mysqldump -hlocalhost -uroot -p schema_name_here table_name_here > /home/ubuntu/database_dumps/dump_table_before_running_update.sql


mysqldump -hlocalhost -uroot -p schema_name_here table_name_here > /home/ubuntu/database_dumps/dump_table_after_running_update.sql


diff -uP /home/ubuntu/database_dumps/dump_some_table_after_running_update.sql /home/ubuntu/database_dumps/dump_table_before_running_update.sql > /home/ubuntu/database_dumps/diff.txt

你需要在线工具 为了

  • 格式化从转储导出的 SQL,

不是我见过的最好的

  • 我们有 Diff.txt,你必须手动采取 +-显示内,这是1行的插入语句,其中的值。

  • 对于 Diff.txt中的2行-& + 做在线差异,在线差异工具中将它们过去

G https://www.diffchecker.com[您可以保存和共享它,并且对文件大小没有限制! ]

注意 : 如果它的敏感/生产数据格外小心!

diff preview

How diff.txt will look like

我尝试了上面的答案,但是发现如果一个表有空值,而第二个表在一个列中有值,那么上面的交叉代码不会报告这个事实。

select p.pcn,p.period,p.account_no,p.ytd_debit,a.ytd_debit
-- select count(*) -- 157,283
from Plex.account_period_balance p -- 157,283/202207,148,998
join Azure.account_period_balance a  -- 157,283/202207,148,998
on p.pcn = a.pcn
and p.period = a.period
and p.account_no = a.account_no -- 157,283
where p.period_display = a.period_display -- 157,283
and p.debit = a.debit  -- 157,283
-- and p.ytd_debit = a.ytd_debit -- 148,998
-- and p.ytd_debit != a.ytd_debit -- 0