有些时候,我们可能想要比较一下两个数据表,以找到其中不同的数据。比如,在进行数据移植的时候,或是在合并数据的时候,或是在比对验证数据的时候。当然比较两个表,需要这两个表结构是一样的。
我们先假设一下有如下表结构:
1 |
CREATE TABLE jajal |
2 |
( |
3 |
user_id integer NOT NULL , |
4 |
first_name character varying (255), |
5 |
last_name character varying (255), |
6 |
grade character (1), |
7 |
CONSTRAINT jajal_pkey PRIMARY KEY (user_id) |
8 |
) |
然后,我们有两张表——jajal和jajal_copy,其内容如下:
jajal
user_id | first_name | last_name | grade |
---|---|---|---|
1 | Some | Dude | A |
2 | Other | Guy | B |
3 | You are | Welcome | B |
4 | What | Other | A |
5 | INeed | You | C |
6 | Mixed | Nuts | Z |
7 | Kirk | Land | B |
8 | Bit | Shooter | A |
9 | Sun | Microsystem | C |
10 | Extra | Fancy | B |
jajal_copy
user_id | first_name | last_name | grade |
---|---|---|---|
1 | Some | Dude | A |
2 | Other | Guy | B |
3 | You are | Welcome | B |
4 | What | Other | A |
5 | INeed | You | C |
6 | Mixed | Nuts | C |
7 | Kirk | Land | B |
8 | Bit | Shooter | A |
9 | Sun | Microsystem | C |
10 | Extra | Fancy | B |
要比较这两张表的数据,找出不一样的数据行。我们可以使用outer join 技术。我给outer join做了一个链接,是Wikipedia的,如果你对这个技术不是很清楚,还请你行看看其技术细节。
下面是具体的SQL语句:
使用FULL OUTER JOIN
01 |
SELECT |
02 |
* |
03 |
FROM |
04 |
jajal j |
05 |
FULL OUTER JOIN jajal_copy jc ON jc.first_name = j.first_name |
06 |
AND jc.last_name = j.last_name |
07 |
AND jc.grade = j.grade |
08 |
AND jc.user_id = j.user_id |
09 |
WHERE |
10 |
j.user_id IS NULL |
11 |
OR jc.user_id IS NULL |
运行结果如下:
user_id | first_name | last_name | grade | user_id | first_name | last_name | grade |
---|---|---|---|---|---|---|---|
[NULL] | [NULL] | [NULL] | [NULL] | 6 | Mixed | Nuts | C |
6 | Mixed | Nuts | Z | [NULL] | [NULL] | [NULL] | [NULL] |
使用NATURAL FULL OUTER JOIN
关于natural join,你可以看看Wikipedia是怎么说的。
1 |
SELECT |
2 |
* |
3 |
FROM |
4 |
jajal j |
5 |
NATURAL FULL OUTER JOIN jajal_copy jc |
6 |
WHERE |
7 |
j.user_id IS NULL |
8 |
OR jc.user_id IS NULL |
运行结果如下:
user_id | first_name | last_name | grade |
---|---|---|---|
6 | Mixed | Nuts | C |
6 | Mixed | Nuts | Z |
MySQL SQL 代码
MySQL 并不支持 FULL OUTER JOIN,但是我们可以使用LEFT JOIN 和 RIGHT JOIN 来实现这一功能。如下所示。
01 |
SELECT |
02 |
* |
03 |
FROM |
04 |
jajal j |
05 |
LEFT JOIN jajal_copy jc ON jc.first_name = j.first_name |
06 |
AND jc.last_name = j.last_name |
07 |
AND jc.grade = j.grade |
08 |
AND jc.user_id = j.user_id |
09 |
WHERE |
10 |
jc.user_id IS NULL |
11 |
UNION ALL |
12 |
SELECT |
13 |
* |
14 |
FROM |
15 |
jajal j |
16 |
RIGHT JOIN jajal_copy jc ON jc.first_name = j.first_name |
17 |
AND jc.last_name = j.last_name |
18 |
AND jc.grade = j.grade |
19 |
AND jc.user_id = j.user_id |
20 |
WHERE |
21 |
j.user_id IS NULL |
或者你更喜欢NATURAL JOIN 版本
01 |
SELECT |
02 |
* |
03 |
FROM |
04 |
jajal j |
05 |
NATURAL LEFT JOIN jajal_copy jc |
06 |
WHERE |
07 |
jc.user_id IS NULL |
08 |
UNION ALL |
09 |
SELECT |
10 |
* |
11 |
FROM |
12 |
jajal j |
13 |
NATURAL RIGHT JOIN jajal_copy jc |
14 |
WHERE |
15 |
j.user_id IS NULL |
当然,如果你需要一个MySQL的存储过程的话,下面是一个示例:
01 |
DELIMITER $$ |
02 |
|
03 |
CREATE PROCEDURE `db_schema`.`tablediff` |
04 |
(schema_name VARCHAR (64), table1 VARCHAR (64), table2 VARCHAR (64)) |
05 |
BEGIN |
06 |
DECLARE done INT DEFAULT 0; |
07 |
DECLARE sql_statement TEXT DEFAULT '' ; |
08 |
DECLARE sql_statement_where TEXT DEFAULT '' ; |
09 |
DECLARE sql_statement_pk TEXT DEFAULT '' ; |
10 |
DECLARE col_name VARCHAR (64); |
11 |
DECLARE col_name_cur CURSOR FOR |
12 |
SELECT |
13 |
COLUMN_NAME |
14 |
FROM |
15 |
information_schema.COLUMNS |
16 |
WHERE |
17 |
TABLE_SCHEMA = schema_name |
18 |
AND TABLE_NAME = table1 |
19 |
; |
20 |
DECLARE CONTINUE HANDLER FOR NOT FOUND SET done = 1; |
21 |
|
22 |
OPEN col_name_cur; |
23 |
traverse_columns: LOOP |
24 |
FETCH col_name_cur INTO col_name; |
25 |
|
26 |
IF done THEN |
27 |
CLOSE col_name_cur; |
28 |
LEAVE traverse_columns; |
29 |
END IF; |
30 |
|
31 |
SET sql_statement_where = CONCAT(sql_statement_where, |
32 |
' AND a.' , col_name, ' = b.' , col_name); |
33 |
SET sql_statement_pk = CONCAT(sql_statement_pk, |
34 |
'AND b.' , col_name, ' IS NULL' ); |
35 |
END LOOP; |
36 |
|
37 |
SELECT |
38 |
COLUMN_NAME INTO col_name |
39 |
FROM |
40 |
information_schema.KEY_COLUMN_USAGE |
41 |
WHERE |
42 |
CONSTRAINT_SCHEMA = schema_name |
43 |
AND CONSTRAINT_NAME = 'PRIMARY' |
44 |
AND TABLE_NAME = table1 |
45 |
LIMIT 1 |
46 |
; |
47 |
IF col_name IS NOT NULL THEN |
48 |
SET sql_statement_pk = CONCAT( 'AND b.' , col_name, ' IS NULL' ); |
49 |
END IF; |
50 |
|
51 |
SET sql_statement = CONCAT( 'SELECT * FROM ' , schema_name, '.' , table1, ' a LEFT JOIN ' , schema_name, '.' , table2, ' b ON TRUE' ); |
52 |
SET sql_statement = CONCAT(sql_statement, sql_statement_where, ' WHERE TRUE ' , sql_statement_pk); |
53 |
SET sql_statement = CONCAT(sql_statement, ' UNION ALL SELECT * FROM ' , schema_name, '.' , table1, ' b RIGHT JOIN ' , schema_name, '.' , table2, ' a ON TRUE' ); |
54 |
SET sql_statement = CONCAT(sql_statement, sql_statement_where, ' WHERE TRUE ' , sql_statement_pk); |
55 |
|
56 |
SET @s = sql_statement; |
57 |
PREPARE stmt1 FROM @s; |
58 |
EXECUTE stmt1; |
59 |
DEALLOCATE PREPARE stmt1; |
60 |
|
61 |
END $$ |
62 |
DELIMITER ; |
PostgreSQL 下的SQL语句
下面是PostgreSQL的一个存储过程:
01 |
CREATE OR REPLACE FUNCTION tablediff ( |
02 |
IN schema_name VARCHAR (64), |
03 |
IN table1 VARCHAR (64), |
04 |
IN table2 VARCHAR (64) |
05 |
) RETURNS BIGINT AS |
06 |
$BODY$ |
07 |
DECLARE |
08 |
the_result BIGINT DEFAULT 0; |
09 |
sql_statement TEXT DEFAULT '' ; |
10 |
sql_statement_where TEXT DEFAULT '' ; |
11 |
sql_statement_pk TEXT DEFAULT '' ; |
12 |
col_name VARCHAR (64); |
13 |
col_name_cur CURSOR FOR |
14 |
SELECT |
15 |
column_name |
16 |
FROM |
17 |
information_schema.columns |
18 |
WHERE |
19 |
table_catalog = schema_name |
20 |
AND table_schema = 'public' |
21 |
AND table_name = table1 |
22 |
; |
23 |
BEGIN |
24 |
OPEN col_name_cur; |
25 |
|
26 |
LOOP |
27 |
FETCH col_name_cur INTO col_name; |
28 |
IF NOT FOUND THEN |
29 |
EXIT; |
30 |
END IF; |
31 |
|
32 |
sql_statement_where := sql_statement_where || ' AND a.' || col_name || ' = b.' || col_name; |
33 |
END LOOP; |
34 |
|
35 |
SELECT |
36 |
column_name INTO col_name |
37 |
FROM |
38 |
information_schema.table_constraints tc |
39 |
JOIN information_schema.constraint_column_usage ccu ON |
40 |
ccu.constraint_name = tc.constraint_name |
41 |
WHERE |
42 |
tc.table_catalog = schema_name |
43 |
AND tc.table_schema = 'public' |
44 |
AND tc.table_name = table1 |
45 |
LIMIT 1 |
46 |
; |
47 |
|
48 |
IF col_name IS NOT NULL THEN |
49 |
sql_statement_pk := ' a.' || col_name || ' IS NULL' ; |
50 |
sql_statement_pk := sql_statement_pk || ' OR b.' || col_name || ' IS NULL' ; |
51 |
END IF; |
52 |
|
53 |
sql_statement := 'SELECT COUNT(*) FROM ' || schema_name || '.public.' || table1 || ' a FULL OUTER JOIN ' || schema_name || '.public.' || table2 || ' b ON TRUE' ; |
54 |
sql_statement := sql_statement || sql_statement_where || ' WHERE ' || sql_statement_pk; |
55 |
|
56 |
EXECUTE sql_statement INTO the_result; |
57 |
|
58 |
RETURN the_result; |
59 |
END ;$BODY$ |
60 |
LANGUAGE 'plpgsql' STABLE; |
文章:来源