如何删除没有唯一行id存在的重复行?

我的座位是

col1  col2 col3 col4 col5 col6 col7
john  1    1    1    1    1    1 
john  1    1    1    1    1    1
sally 2    2    2    2    2    2
sally 2    2    2    2    2    2

我想留下以下重复删除后:

john  1    1    1    1    1    1
sally 2    2    2    2    2    2

我尝试了一些查询,但我认为他们取决于有一个行id,因为我没有得到想要的结果。例如:

DELETE
FROM table
WHERE col1 IN (
    SELECT id
    FROM table
    GROUP BY id
    HAVING (COUNT(col1) > 1)
)

当前回答

微软有一个关于如何删除重复文件的非常简洁的指南。查看http://support.microsoft.com/kb/139444

简而言之,当你只有几行要删除时,下面是删除重复项的最简单方法:

SET rowcount 1;
DELETE FROM t1 WHERE myprimarykey=1;

Myprimarykey是行标识符。

我将rowcount设置为1,因为我只有两行复制。如果我复制了3行,那么我就会将rowcount设置为2,这样它就会删除它看到的前两行,只在表t1中留下一行。

其他回答

DELETE p1 FROM Person p1,
    Person p2
WHERE
    p1.Email = p2.Email AND p1.Id > p2.Id

如果你有能力临时添加一个列到表中,这是一个适合我的解决方案:

ALTER TABLE dbo.DUPPEDTABLE ADD RowID INT NOT NULL IDENTITY(1,1)

然后使用MIN和GROUP BY的组合执行DELETE

DELETE b
FROM dbo.DUPPEDTABLE b
WHERE b.RowID NOT IN (
                     SELECT MIN(RowID) AS RowID
                     FROM dbo.DUPPEDTABLE a WITH (NOLOCK)
                     GROUP BY a.ITEM_NUMBER,
                              a.CHARACTERISTIC,
                              a.INTVALUE,
                              a.FLOATVALUE,
                              a.STRINGVALUE
                 );

验证DELETE执行正确:

SELECT a.ITEM_NUMBER,
    a.CHARACTERISTIC,
    a.INTVALUE,
    a.FLOATVALUE,
    a.STRINGVALUE, COUNT(*)--MIN(RowID) AS RowID
FROM dbo.DUPPEDTABLE a WITH (NOLOCK)
GROUP BY a.ITEM_NUMBER,
    a.CHARACTERISTIC,
    a.INTVALUE,
    a.FLOATVALUE,
    a.STRINGVALUE
ORDER BY COUNT(*) DESC 

结果中不应有计数大于1的行。最后,删除rowid列:

ALTER TABLE dbo.DUPPEDTABLE DROP COLUMN RowID;
DELETE from search
where id not in (
   select min(id) from search
   group by url
   having count(*)=1

   union

   SELECT min(id) FROM search
   group by url
   having count(*) > 1
)

另一种在不丢失信息的情况下删除重复行的方法如下:

delete from dublicated_table t1 (nolock)
join (
    select t2.dublicated_field
    , min(len(t2.field_kept)) as min_field_kept
    from dublicated_table t2 (nolock)
    group by t2.dublicated_field having COUNT(*)>1
) t3 
on t1.dublicated_field=t3.dublicated_field 
    and len(t1.field_kept)=t3.min_field_kept

在sql server中可以通过多种方式来实现 最简单的方法是: 将重复行表中的不同行插入到新的临时表中。然后从重复的行表中删除所有数据,然后从没有重复的临时表中插入所有数据,如下所示。

select distinct * into #tmp From table
   delete from table
   insert into table
   select * from #tmp drop table #tmp

   select * from table

使用公共表表达式(CTE)删除重复行

With CTE_Duplicates as 
(select id,name , row_number() 
over(partition by id,name order by id,name ) rownumber  from table  ) 
delete from CTE_Duplicates where rownumber!=1