如何比较两个数据表
有些时候,我们可能想要比较一下两个数据表,以找到其中不同的数据。比如,在进行数据移植的时候,或是在合并数据的时候,或是在比对验证数据的时候。当然比较两个表,需要这两个表结构是一样的。
我们先假设一下有如下表结构:
CREATE TABLE jajal
(
user_id integer NOT NULL,
first_name character varying(255),
last_name character varying(255),
grade character(1),
CONSTRAINT jajal_pkey PRIMARY KEY (user_id)
)
然后,我们有两张表——jajal和jajal_copy,其内容如下:
jajal¶
user_id | first_name | last_name | grade |
---|---|---|---|
1 | Some | Dude | A |
2 | Other | Guy | B |
3 | You are | Welcome | B |
4 | What | Other | A |
5 | INeed | You | C |
6 | Mixed | Nuts | Z |
7 | Kirk | Land | B |
8 | Bit | Shooter | A |
9 | Sun | Microsystem | C |
10 | Extra | Fancy | B |
jajal_copy¶
user_id | first_name | last_name | grade |
---|---|---|---|
1 | Some | Dude | A |
2 | Other | Guy | B |
3 | You are | Welcome | B |
4 | What | Other | A |
5 | INeed | You | C |
6 | Mixed | Nuts | C |
7 | Kirk | Land | B |
8 | Bit | Shooter | A |
9 | Sun | Microsystem | C |
10 | Extra | Fancy | B |
要比较这两张表的数据,找出不一样的数据行。我们可以使用outer join 技术。我给outer join做了一个链接,是Wikipedia的,如果你对这个技术不是很清楚,还请你行看看其技术细节。
下面是具体的SQL语句:
使用FULL OUTER JOIN¶
SELECT
*
FROM
jajal j
FULL OUTER JOIN jajal_copy jc ON jc.first_name = j.first_name
AND jc.last_name = j.last_name
AND jc.grade = j.grade
AND jc.user_id = j.user_id
WHERE
j.user_id IS NULL
OR jc.user_id IS NULL
运行结果如下:
user_id | first_name | last_name | grade | user_id | first_name | last_name | grade |
---|---|---|---|---|---|---|---|
[NULL] | [NULL] | [NULL] | [NULL] | 6 | Mixed | Nuts | C |
6 | Mixed | Nuts | Z | [NULL] | [NULL] | [NULL] | [NULL] |
使用NATURAL FULL OUTER JOIN¶
关于natural join,你可以看看Wikipedia是怎么说的。
SELECT
*
FROM
jajal j
NATURAL FULL OUTER JOIN jajal_copy jc
WHERE
j.user_id IS NULL
OR jc.user_id IS NULL
运行结果如下:
user_id | first_name | last_name | grade |
---|---|---|---|
6 | Mixed | Nuts | C |
6 | Mixed | Nuts | Z |
MySQL SQL 代码¶
MySQL 并不支持 FULL OUTER JOIN,但是我们可以使用LEFT JOIN 和 RIGHT JOIN 来实现这一功能。如下所示。
SELECT
*
FROM
jajal j
LEFT JOIN jajal_copy jc ON jc.first_name = j.first_name
AND jc.last_name = j.last_name
AND jc.grade = j.grade
AND jc.user_id = j.user_id
WHERE
jc.user_id IS NULL
UNION ALL
SELECT
*
FROM
jajal j
RIGHT JOIN jajal_copy jc ON jc.first_name = j.first_name
AND jc.last_name = j.last_name
AND jc.grade = j.grade
AND jc.user_id = j.user_id
WHERE
j.user_id IS NULL
或者你更喜欢NATURAL JOIN 版本
SELECT
*
FROM
jajal j
NATURAL LEFT JOIN jajal_copy jc
WHERE
jc.user_id IS NULL
UNION ALL
SELECT
*
FROM
jajal j
NATURAL RIGHT JOIN jajal_copy jc
WHERE
j.user_id IS NULL
当然,如果你需要一个MySQL的存储过程的话,下面是一个示例:
DELIMITER $$
CREATE PROCEDURE `db_schema`.`tablediff`
(schema_name VARCHAR(64), table1 VARCHAR(64), table2 VARCHAR(64))
BEGIN
DECLARE done INT DEFAULT 0;
DECLARE sql_statement TEXT DEFAULT '';
DECLARE sql_statement_where TEXT DEFAULT '';
DECLARE sql_statement_pk TEXT DEFAULT '';
DECLARE col_name VARCHAR(64);
DECLARE col_name_cur CURSOR FOR
SELECT
COLUMN_NAME
FROM
information_schema.COLUMNS
WHERE
TABLE_SCHEMA = schema_name
AND TABLE_NAME = table1
;
DECLARE CONTINUE HANDLER FOR NOT FOUND SET done = 1;
OPEN col_name_cur;
traverse_columns: LOOP
FETCH col_name_cur INTO col_name;
IF done THEN
CLOSE col_name_cur;
LEAVE traverse_columns;
END IF;
SET sql_statement_where = CONCAT(sql_statement_where,
' AND a.', col_name, ' = b.', col_name);
SET sql_statement_pk = CONCAT(sql_statement_pk,
'AND b.', col_name, ' IS NULL');
END LOOP;
SELECT
COLUMN_NAME INTO col_name
FROM
information_schema.KEY_COLUMN_USAGE
WHERE
CONSTRAINT_SCHEMA = schema_name
AND CONSTRAINT_NAME = 'PRIMARY'
AND TABLE_NAME = table1
LIMIT 1
;
IF col_name IS NOT NULL THEN
SET sql_statement_pk = CONCAT('AND b.', col_name, ' IS NULL');
END IF;
SET sql_statement = CONCAT('SELECT * FROM ', schema_name, '.', table1, ' a LEFT JOIN ', schema_name, '.', table2, ' b ON TRUE');
SET sql_statement = CONCAT(sql_statement, sql_statement_where, ' WHERE TRUE ', sql_statement_pk);
SET sql_statement = CONCAT(sql_statement, ' UNION ALL SELECT * FROM ', schema_name, '.', table1, ' b RIGHT JOIN ', schema_name, '.', table2, ' a ON TRUE');
SET sql_statement = CONCAT(sql_statement, sql_statement_where, ' WHERE TRUE ', sql_statement_pk);
SET @s = sql_statement;
PREPARE stmt1 FROM @s;
EXECUTE stmt1;
DEALLOCATE PREPARE stmt1;
END$$
DELIMITER ;
PostgreSQL 下的SQL语句¶
下面是PostgreSQL的一个存储过程:
CREATE OR REPLACE FUNCTION tablediff (
IN schema_name VARCHAR(64),
IN table1 VARCHAR(64),
IN table2 VARCHAR(64)
) RETURNS BIGINT AS
$BODY$
DECLARE
the_result BIGINT DEFAULT 0;
sql_statement TEXT DEFAULT '';
sql_statement_where TEXT DEFAULT '';
sql_statement_pk TEXT DEFAULT '';
col_name VARCHAR(64);
col_name_cur CURSOR FOR
SELECT
column_name
FROM
information_schema.columns
WHERE
table_catalog = schema_name
AND table_schema = 'public'
AND table_name = table1
;
BEGIN
OPEN col_name_cur;
LOOP
FETCH col_name_cur INTO col_name;
IF NOT FOUND THEN
EXIT;
END IF;
sql_statement_where := sql_statement_where || ' AND a.' || col_name || ' = b.' || col_name;
END LOOP;
SELECT
column_name INTO col_name
FROM
information_schema.table_constraints tc
JOIN information_schema.constraint_column_usage ccu ON
ccu.constraint_name = tc.constraint_name
WHERE
tc.table_catalog = schema_name
AND tc.table_schema = 'public'
AND tc.table_name = table1
LIMIT 1
;
IF col_name IS NOT NULL THEN
sql_statement_pk := ' a.' || col_name || ' IS NULL';
sql_statement_pk := sql_statement_pk || ' OR b.' || col_name || ' IS NULL';
END IF;
sql_statement := 'SELECT COUNT(*) FROM ' || schema_name || '.public.' || table1 || ' a FULL OUTER JOIN ' || schema_name || '.public.' || table2 || ' b ON TRUE';
sql_statement := sql_statement || sql_statement_where || ' WHERE ' || sql_statement_pk;
EXECUTE sql_statement INTO the_result;
RETURN the_result;
END;$BODY$
LANGUAGE 'plpgsql' STABLE;
文章:来源
(转载本站文章请注明作者和出处 酷 壳 – CoolShell ,请勿用于任何商业用途)
相关文章¶
- 性能调优攻略
- https://coolshell.cn/wp-content/plugins/wordpress-23-related-posts-plugin/static/thumbs/10.jpg6个有用的MySQL语句
- 程序员疫苗:代码注入
- NoSQL 数据建模技术
- 千万别惹程序员
- https://coolshell.cn/wp-content/plugins/wordpress-23-related-posts-plugin/static/thumbs/29.jpgQuora使用到的技术 The post 如何比较两个数据表 first appeared on 酷 壳 - CoolShell.