如何删除没有唯一行id存在的重复行?

我的座位是

col1  col2 col3 col4 col5 col6 col7
john  1    1    1    1    1    1 
john  1    1    1    1    1    1
sally 2    2    2    2    2    2
sally 2    2    2    2    2    2

我想留下以下重复删除后:

john  1    1    1    1    1    1
sally 2    2    2    2    2    2

我尝试了一些查询,但我认为他们取决于有一个行id,因为我没有得到想要的结果。例如:

DELETE
FROM table
WHERE col1 IN (
    SELECT id
    FROM table
    GROUP BY id
    HAVING (COUNT(col1) > 1)
)

当前回答

在sql server中可以通过多种方式来实现 最简单的方法是: 将重复行表中的不同行插入到新的临时表中。然后从重复的行表中删除所有数据,然后从没有重复的临时表中插入所有数据,如下所示。

select distinct * into #tmp From table
   delete from table
   insert into table
   select * from #tmp drop table #tmp

   select * from table

使用公共表表达式(CTE)删除重复行

With CTE_Duplicates as 
(select id,name , row_number() 
over(partition by id,name order by id,name ) rownumber  from table  ) 
delete from CTE_Duplicates where rownumber!=1

其他回答

我更喜欢CTE从sql server表中删除重复的行

强烈推荐阅读本文::http://codaffection.com/sql-server-article/delete-duplicate-rows-in-sql-server/

保持原创性

WITH CTE AS
(
SELECT *,ROW_NUMBER() OVER (PARTITION BY col1,col2,col3 ORDER BY col1,col2,col3) AS RN
FROM MyTable
)

DELETE FROM CTE WHERE RN<>1

不保留原创

WITH CTE AS
(SELECT *,R=RANK() OVER (ORDER BY col1,col2,col3)
FROM MyTable)
 
DELETE CTE
WHERE R IN (SELECT R FROM CTE GROUP BY R HAVING COUNT(*)>1)

请参见下面的删除方式。

Declare @table table
(col1 varchar(10),col2 int,col3 int, col4 int, col5 int, col6 int, col7 int)
Insert into @table values 
('john',1,1,1,1,1,1),
('john',1,1,1,1,1,1),
('sally',2,2,2,2,2,2),
('sally',2,2,2,2,2,2)

创建一个名为@table的样例表,并用给定的数据加载它。

Delete  aliasName from (
Select  *,
        ROW_NUMBER() over (Partition by col1,col2,col3,col4,col5,col6,col7 order by col1) as rowNumber
From    @table) aliasName 
Where   rowNumber > 1

Select * from @table

注意:如果您按部分给出分区中的所有列,那么按顺序没有多大意义。

我知道,这个问题是三年前问的,我的答案是蒂姆发布的另一个版本,但发布只是为了对任何人有帮助。

参考https://support.microsoft.com/en-us/help/139444/how-to-remove-duplicate-rows-from-a-table-in-sql-server

删除重复的想法涉及

a)保护那些不重复的行 b)保留众多符合条件的重复行中的一行。

循序渐进的

1)首先确定满足重复定义的行 并将它们插入到临时表中,写入#tableAll。 2)选择不重复的(单行)或不同的行到临时表中 # tableUnique说。 3)从源表中删除连接#tableAll来删除 重复。 4)插入源表中所有来自#tableUnique的行。 5)删除#tableAll和#tableUnique

在尝试了上面建议的解决方案后,它适用于小型中型表。 我可以为非常大的表提出这个解决方案。因为它在迭代中运行。

Drop all dependency views of the LargeSourceTable you can find the dependecies by using sql managment studio, right click on the table and click "View Dependencies" Rename the table: sp_rename 'LargeSourceTable', 'LargeSourceTable_Temp'; GO Create the LargeSourceTable again, but now, add a primary key with all the columns that define the duplications add WITH (IGNORE_DUP_KEY = ON) For example: CREATE TABLE [dbo].[LargeSourceTable] ( ID int IDENTITY(1,1), [CreateDate] DATETIME CONSTRAINT [DF_LargeSourceTable_CreateDate] DEFAULT (getdate()) NOT NULL, [Column1] CHAR (36) NOT NULL, [Column2] NVARCHAR (100) NOT NULL, [Column3] CHAR (36) NOT NULL, PRIMARY KEY (Column1, Column2) WITH (IGNORE_DUP_KEY = ON) ); GO Create again the views that you dropped in the first place for the new created table Now, Run the following sql script, you will see the results in 1,000,000 rows per page, you can change the row number per page to see the results more often. Note, that I set the IDENTITY_INSERT on and off because one the columns contains auto incremental id, which I'm also copying

设置IDENTITY_INSERT LargeSourceTable ON 声明@PageNumber为INT, @RowspPage为INT 声明@TotalRows为INT 声明@dt varchar(19) SET @PageNumber = 0 SET @RowspPage = 1000000 select @TotalRows = count (*) from LargeSourceTable_TEMP

While ((@PageNumber - 1) * @RowspPage < @TotalRows )
Begin
    begin transaction tran_inner
        ; with cte as
        (
            SELECT * FROM LargeSourceTable_TEMP ORDER BY ID
            OFFSET ((@PageNumber) * @RowspPage) ROWS
            FETCH NEXT @RowspPage ROWS ONLY
        )

        INSERT INTO LargeSourceTable 
        (
             ID                     
            ,[CreateDate]       
            ,[Column1]   
            ,[Column2] 
            ,[Column3]       
        )       
        select 
             ID                     
            ,[CreateDate]       
            ,[Column1]   
            ,[Column2] 
            ,[Column3]       
        from cte

    commit transaction tran_inner

    PRINT 'Page: ' + convert(varchar(10), @PageNumber)
    PRINT 'Transfered: ' + convert(varchar(20), @PageNumber * @RowspPage)
    PRINT 'Of: ' + convert(varchar(20), @TotalRows)

    SELECT @dt = convert(varchar(19), getdate(), 121)
    RAISERROR('Inserted on: %s', 0, 1, @dt) WITH NOWAIT
    SET @PageNumber = @PageNumber + 1
End

SET IDENTITY_INSERT LargeSourceTable OFF

试着使用:

SELECT linkorder
    ,Row_Number() OVER (
        PARTITION BY linkorder ORDER BY linkorder DESC
        ) AS RowNum
FROM u_links