我有一个varchar列的表,我想找到在这个列中有重复值的所有记录。我可以使用什么查询来查找重复项?


当前回答

CREATE TABLE tbl_master
    (`id` int, `email` varchar(15));

INSERT INTO tbl_master
    (`id`, `email`) VALUES
    (1, 'test1@gmail.com'),
    (2, 'test2@gmail.com'),
    (3, 'test1@gmail.com'),
    (4, 'test2@gmail.com'),
    (5, 'test5@gmail.com');

QUERY : SELECT id, email FROM tbl_master
WHERE email IN (SELECT email FROM tbl_master GROUP BY email HAVING COUNT(id) > 1)

其他回答

SELECT 
    t.*,
    (SELECT COUNT(*) FROM city AS tt WHERE tt.name=t.name) AS count 
FROM `city` AS t 
WHERE 
    (SELECT count(*) FROM city AS tt WHERE tt.name=t.name) > 1 ORDER BY count DESC
CREATE TABLE tbl_master
    (`id` int, `email` varchar(15));

INSERT INTO tbl_master
    (`id`, `email`) VALUES
    (1, 'test1@gmail.com'),
    (2, 'test2@gmail.com'),
    (3, 'test1@gmail.com'),
    (4, 'test2@gmail.com'),
    (5, 'test5@gmail.com');

QUERY : SELECT id, email FROM tbl_master
WHERE email IN (SELECT email FROM tbl_master GROUP BY email HAVING COUNT(id) > 1)

对GROUP BY子句执行SELECT操作。假设name是你想要在其中找到重复项的列:

SELECT name, COUNT(*) c FROM table GROUP BY name HAVING c > 1;

这将返回一个在第一列中包含名称值的结果,以及该值在第二列中出现次数的计数。

为了获得所有包含复制的数据,我使用了以下方法:

SELECT * FROM TableName INNER JOIN(
  SELECT DupliactedData FROM TableName GROUP BY DupliactedData HAVING COUNT(DupliactedData) > 1 order by DupliactedData)
  temp ON TableName.DupliactedData = temp.DupliactedData;

TableName =您正在使用的表。

DupliactedData =您正在寻找的重复数据。

如果要删除具有多个字段的重复行,首先将它们取消为唯一不同的行指定的新唯一键,然后使用group by命令删除具有相同新唯一键的重复行:

Create TEMPORARY table tmp select concat(f1,f2) as cfs,t1.* from mytable as t1;
Create index x_tmp_cfs on tmp(cfs);
Create table unduptable select f1,f2,... from tmp group by cfs;