Blog

Delete duplicate records from table using CTE in SQL Server Database

By Anoop Kumar Sharma    2114  14-April-2022

In this blog, we will learn How to Delete duplicate records from a table using CTE in SQL Server.



In this blog, we will learn How to delete duplicate records from a table using CTE in SQL Server. For demonstration, I created a table named Students which contains duplicate records. Below is the Schema and data of the Students table which you can refer quickly for learning purposes.

CREATE TABLE dbo.Students
(
	Id   INT IDENTITY NOT NULL,
	Name VARCHAR (100) NULL,
	Age  INT NULL,
	City VARCHAR (100) NULL,
	CONSTRAINT PK_Students PRIMARY KEY (Id)
)

Go

INSERT INTO dbo.Students (Name, Age, City) VALUES ('Anoop Sharma', 30, 'New Delhi');
INSERT INTO dbo.Students (Name, Age, City) VALUES ('Anoop Sharma', 30, 'New Delhi');
INSERT INTO dbo.Students (Name, Age, City) VALUES ('Anoop Sharma', 30, 'New Delhi');
INSERT INTO dbo.Students (Name, Age, City) VALUES ('Rohan Singh', 28, 'Kanpur');
INSERT INTO dbo.Students (Name, Age, City) VALUES ('Rohan Singh', 28, 'Kanpur');
INSERT INTO dbo.Students (Name, Age, City) VALUES ('Ajay Kumar', 40, 'Agra');
INSERT INTO dbo.Students (Name, Age, City) VALUES ('Ajay Kumar', 40, 'Agra');
INSERT INTO dbo.Students (Name, Age, City) VALUES ('Ramesh', 24, 'Sonipat');

On selecting the data from the Students table, You will see duplicate records as shown in the below image.

To delete the duplicate records, first, we need to find the rows which are duplicated means having more than one entry in the table. We have to use Row_Number() function with Partition By clause. Partition By clause partition the rows of a table into groups and used inside the Over() clause. In this example, I used Partition By clause on Name, Age, and City Column as in case of duplicity, Name, Age, and City Column values must be matched with other rows. A Column with the alias “DuplicateCount” is created with the help of Partition By in the CTE by which we can detect the duplicate records. 

Records with DuplicateCount > 1 column are duplicate records so, we have to delete them using CTE.

with cteDuplicateRecords
as
(
Select Id, Name, Age, City, 
ROW_NUMBER() over(partition by Name, Age, City order by Name asc) as DuplicateCount
from Students
)

Delete from cteDuplicateRecords where DuplicateCount>1;

After executing the above query, you will see that duplicate records from the Students table are removed/deleted.