分享

如何比较两个数据表 | 酷壳 - CoolShell.cn

 HDTV 2010-05-20

如何比较两个数据表

2009年5月27日
785 次点击 阅读评论 发表评论

有些时候,我们可能想要比较一下两个数据表,以找到其中不同的数据。比如,在进行数据移植的时候,或是在合并数据的时候,或是在比对验证数据的时候。当然比较两个表,需要这两个表结构是一样的。

我们先假设一下有如下表结构: 

1 CREATE TABLE jajal
2 (
3     user_id integer NOT NULL,
4     first_name character varying(255),
5     last_name character varying(255),
6     grade character(1),
7     CONSTRAINT jajal_pkey PRIMARY KEY (user_id)
8 )


然后,我们有两张表——jajal和jajal_copy,其内容如下:

 jajal

user_id first_name last_name grade
1 Some Dude A
2 Other Guy B
3 You are Welcome B
4 What Other A
5 INeed You C
6 Mixed Nuts Z
7 Kirk Land B
8 Bit Shooter A
9 Sun Microsystem C
10 Extra Fancy B

jajal_copy

user_id first_name last_name grade
1 Some Dude A
2 Other Guy B
3 You are Welcome B
4 What Other A
5 INeed You C
6 Mixed Nuts C
7 Kirk Land B
8 Bit Shooter A
9 Sun Microsystem C
10 Extra Fancy B

 

要比较这两张表的数据,找出不一样的数据行。我们可以使用outer join 技术。我给outer join做了一个链接,是Wikipedia的,如果你对这个技术不是很清楚,还请你行看看其技术细节。

下面是具体的SQL语句:

使用FULL OUTER JOIN 

01 SELECT
02      *
03 FROM
04      jajal j
05      FULL OUTER JOIN jajal_copy jc ON jc.first_name = j.first_name
06      AND jc.last_name = j.last_name
07      AND jc.grade = j.grade
08      AND jc.user_id = j.user_id
09 WHERE
10      j.user_id IS NULL
11      OR jc.user_id IS NULL

运行结果如下:

user_id first_name last_name grade user_id first_name last_name grade
[NULL] [NULL] [NULL] [NULL] 6 Mixed Nuts C
6 Mixed Nuts Z [NULL] [NULL] [NULL] [NULL]

 

使用NATURAL FULL OUTER JOIN

关于natural join,你可以看看Wikipedia是怎么说的。

1 SELECT
2        *
3 FROM
4        jajal j
5        NATURAL FULL OUTER JOIN jajal_copy jc
6 WHERE
7        j.user_id IS NULL
8        OR jc.user_id IS NULL

运行结果如下:

user_id first_name last_name grade
6 Mixed Nuts C
6 Mixed Nuts Z

 

MySQL SQL 代码

MySQL 并不支持 FULL OUTER JOIN,但是我们可以使用LEFT JOIN 和 RIGHT JOIN 来实现这一功能。如下所示。
01 SELECT
02 *
03 FROM
04 jajal j
05 LEFT JOIN jajal_copy jc ON jc.first_name = j.first_name
06 AND jc.last_name = j.last_name
07 AND jc.grade = j.grade
08 AND jc.user_id = j.user_id
09 WHERE
10 jc.user_id IS NULL
11 UNION ALL
12 SELECT
13 *
14 FROM
15 jajal j
16 RIGHT JOIN jajal_copy jc ON jc.first_name = j.first_name
17 AND jc.last_name = j.last_name
18 AND jc.grade = j.grade
19 AND jc.user_id = j.user_id
20 WHERE
21 j.user_id IS NULL

或者你更喜欢NATURAL JOIN 版本

01 SELECT
02 *
03 FROM
04 jajal j
05 NATURAL LEFT JOIN jajal_copy jc
06 WHERE
07 jc.user_id IS NULL
08 UNION ALL
09 SELECT
10 *
11 FROM
12 jajal j
13 NATURAL RIGHT JOIN jajal_copy jc
14 WHERE
15 j.user_id IS NULL

当然,如果你需要一个MySQL的存储过程的话,下面是一个示例:

01 DELIMITER $$
02   
03 CREATE PROCEDURE `db_schema`.`tablediff`
04     (schema_name VARCHAR(64), table1 VARCHAR(64), table2 VARCHAR(64))
05 BEGIN
06     DECLARE done INT DEFAULT 0;
07     DECLARE sql_statement TEXT DEFAULT '';
08     DECLARE sql_statement_where TEXT DEFAULT '';
09     DECLARE sql_statement_pk TEXT DEFAULT '';
10     DECLARE col_name VARCHAR(64);
11     DECLARE col_name_cur CURSOR FOR
12         SELECT
13             COLUMN_NAME
14         FROM
15             information_schema.COLUMNS
16         WHERE
17             TABLE_SCHEMA = schema_name
18             AND TABLE_NAME = table1
19     ;
20     DECLARE CONTINUE HANDLER FOR NOT FOUND SET done = 1;
21   
22     OPEN col_name_cur;
23     traverse_columns: LOOP
24         FETCH col_name_cur INTO col_name;
25   
26         IF done THEN
27             CLOSE col_name_cur;
28             LEAVE traverse_columns;
29         END IF;
30   
31         SET sql_statement_where = CONCAT(sql_statement_where,
32             ' AND a.', col_name, ' = b.', col_name);
33         SET sql_statement_pk = CONCAT(sql_statement_pk,
34             'AND b.', col_name, ' IS NULL');
35     END LOOP;
36   
37     SELECT
38         COLUMN_NAME INTO col_name
39     FROM
40         information_schema.KEY_COLUMN_USAGE
41     WHERE
42         CONSTRAINT_SCHEMA = schema_name
43         AND CONSTRAINT_NAME = 'PRIMARY'
44         AND TABLE_NAME = table1
45     LIMIT 1
46     ;
47     IF col_name IS NOT NULL THEN
48         SET sql_statement_pk = CONCAT('AND b.', col_name, ' IS NULL');
49     END IF;
50   
51     SET sql_statement = CONCAT('SELECT * FROM ', schema_name, '.', table1, ' a LEFT JOIN ', schema_name, '.', table2, ' b ON TRUE');
52     SET sql_statement = CONCAT(sql_statement, sql_statement_where, ' WHERE TRUE ', sql_statement_pk);
53     SET sql_statement = CONCAT(sql_statement, ' UNION ALL SELECT * FROM ', schema_name, '.', table1, ' b RIGHT JOIN ', schema_name, '.', table2, ' a ON TRUE');
54     SET sql_statement = CONCAT(sql_statement, sql_statement_where, ' WHERE TRUE ', sql_statement_pk);
55   
56     SET @s = sql_statement;
57     PREPARE stmt1 FROM @s;
58     EXECUTE stmt1;
59     DEALLOCATE PREPARE stmt1;
60   
61 END$$
62 DELIMITER ;

 

 

PostgreSQL 下的SQL语句

下面是PostgreSQL的一个存储过程:

01 CREATE OR REPLACE FUNCTION tablediff (
02     IN schema_name VARCHAR(64),
03     IN table1 VARCHAR(64),
04     IN table2 VARCHAR(64)
05 ) RETURNS BIGINT AS
06 $BODY$
07 DECLARE
08     the_result BIGINT DEFAULT 0;
09     sql_statement TEXT DEFAULT '';
10     sql_statement_where TEXT DEFAULT '';
11     sql_statement_pk TEXT DEFAULT '';
12     col_name VARCHAR(64);
13     col_name_cur CURSOR FOR
14         SELECT
15             column_name
16         FROM
17             information_schema.columns
18         WHERE
19             table_catalog = schema_name
20             AND table_schema = 'public'
21             AND table_name = table1
22     ;
23 BEGIN
24     OPEN col_name_cur;
25   
26     LOOP
27         FETCH col_name_cur INTO col_name;
28         IF NOT FOUND THEN
29             EXIT;
30         END IF;
31   
32         sql_statement_where := sql_statement_where || ' AND a.' || col_name || ' = b.' || col_name;
33     END LOOP;
34   
35     SELECT
36         column_name INTO col_name
37     FROM
38         information_schema.table_constraints tc
39         JOIN information_schema.constraint_column_usage ccu ON
40             ccu.constraint_name = tc.constraint_name
41     WHERE
42         tc.table_catalog = schema_name
43         AND tc.table_schema = 'public'
44         AND tc.table_name = table1
45     LIMIT 1
46     ;
47   
48     IF col_name IS NOT NULL THEN
49         sql_statement_pk := ' a.' || col_name || ' IS NULL';
50         sql_statement_pk := sql_statement_pk || ' OR b.' || col_name || ' IS NULL';
51     END IF;
52   
53     sql_statement := 'SELECT COUNT(*) FROM ' || schema_name || '.public.' || table1 || ' a FULL OUTER JOIN ' || schema_name || '.public.' || table2 || ' b ON TRUE';
54     sql_statement := sql_statement || sql_statement_where || ' WHERE ' || sql_statement_pk;
55   
56     EXECUTE sql_statement INTO the_result;
57   
58     RETURN the_result;
59 END;$BODY$
60 LANGUAGE 'plpgsql' STABLE;

 

文章:来源

好烂啊有点差凑合看看还不错很精彩 (3 人打了分,平均分: 5.00 )
Loading ... Loading ...

 

陈皓 数据库 , ,

    本站是提供个人知识管理的网络存储空间,所有内容均由用户发布,不代表本站观点。请注意甄别内容中的联系方式、诱导购买等信息,谨防诈骗。如发现有害或侵权内容,请点击一键举报。
    转藏 分享 献花(0

    0条评论

    发表

    请遵守用户 评论公约

    类似文章 更多