我需要从一个相当大的SQL Server表(即300,000+行)中删除重复的行。

当然,由于RowID标识字段的存在,这些行不会完全重复。

MyTable

RowID int not null identity(1,1) primary key,
Col1 varchar(20) not null,
Col2 varchar(2048) not null,
Col3 tinyint not null

我该怎么做呢?


当前回答

我有一个表,需要保存不重复的行。 我不确定速度和效率。

DELETE FROM myTable WHERE RowID IN (
  SELECT MIN(RowID) AS IDNo FROM myTable
  GROUP BY Col1, Col2, Col3
  HAVING COUNT(*) = 2 )

其他回答

这是删除重复记录最简单的方法

 DELETE FROM tblemp WHERE id IN 
 (
  SELECT MIN(id) FROM tblemp
   GROUP BY  title HAVING COUNT(id)>1
 )

使用这个

WITH tblTemp as
(
SELECT ROW_NUMBER() Over(PARTITION BY Name,Department ORDER BY Name)
   As RowNumber,* FROM <table_name>
)
DELETE FROM tblTemp where RowNumber >1
delete t1
from table t1, table t2
where t1.columnA = t2.columnA
and t1.rowid>t2.rowid

邮政:

delete
from table t1
using table t2
where t1.columnA = t2.columnA
and t1.rowid > t2.rowid

我有一个表,需要保存不重复的行。 我不确定速度和效率。

DELETE FROM myTable WHERE RowID IN (
  SELECT MIN(RowID) AS IDNo FROM myTable
  GROUP BY Col1, Col2, Col3
  HAVING COUNT(*) = 2 )

我想提一下这种方法,它可能是有帮助的,并且适用于所有SQL服务器: 通常只有一到两个副本,副本的id和数量是已知的。在这种情况下:

SET ROWCOUNT 1 -- or set to number of rows to be deleted
delete from myTable where RowId = DuplicatedID
SET ROWCOUNT 0