Sunday 16 July 2017

What is the Difference between ROW_NUMBER(), RANK() and DENSE_RANK()?

Introduction

All of these three functions are used to calculate RowID for the result set returned from a query but in a slightly different way.

Row_Number()

This function will assign a unique id to each row returned from the query.
Consider the following query:
DECLARE @Table TABLE (
      Col_Value varchar(2)
)

INSERT INTO @Table (Col_Value)
      VALUES ('A'),('A'),('A'),('B'),('B'),('C'),('C');

SELECT
      Col_Value,
      ROW_NUMBER() OVER (ORDER BY Col_Value) AS 'RowID'
FROM
      @Table;    
After executing it, we will get:
Col_ValueRowID
A1
A2
A3
B4
B5
C6
C7
As we notice, each and every row has a unique ID.

Rank()

This function will assign a unique number to each distinct row, but it leaves a gap between the groups. Let me explain with a query, we will use the same query we used above with Rank().
SELECT
      Col_Value,
      Rank() OVER (ORDER BY Col_Value) AS 'RowID'
FROM
      @Table;
This query will return:
Col_ValueRowID
A1
A1
A1
B4
B4
C6
C6
As we can see, rowid is unique for each distinct value, but with a gap. What is this gap?
This gap represents number of occurrence. For example: value ‘a’ is repeated thrice and has rank ‘1’, the next rank will be 1+3=4. Same with the next value 4+2=6 and so on.

Dense_Rank()

This function is similar to Rank with only difference, this will not leave gaps between groups.
So if we use the same query used for Rank, then:
SELECT
      Col_Value,
      DENSE_RANK() OVER (ORDER BY Col_Value) AS 'RowID'
FROM
      @Table;
We will get:
Col_ValueRowID
A1
A1
A1
B2
B2
C3
C3
So it is clear that it generates a unique id for each group and without repetition.
As we are clear now what these functions do, we can use them in different scenarios based on our need.
For example: Row_Number() can be used if we want to skip and fetch some records. Take a look at this query:
WITH AllRecords
AS (
      SELECT
            Col_Value,
            ROW_NUMBER() OVER (ORDER BY Col_Value) AS 'RowID'
      FROM
            @Table)

SELECT
      Col_Value,
      RowID
FROM
      AllRecords
WHERE
      RowID >= 4
      AND RowID <= 5;
This will return only those rows with RowID 4 and 5. This function is very useful in paging data in SQL instead of in code till SQL SERVER 2012. In SQL Server 2012, Microsoft introduces a new feature Offset Fetch similar to what we did in the above query.
In SQL Server 2012, the above query will be written as:
SELECT
      Col_Value
FROM
      @Table
ORDER BY
      Col_Value OFFSET 3 ROWS FETCH NEXT 2 ROWS ONLY;
This will skip first 3 rows and will fetch next 2.
Similarly, we can use Dense_Rank() when we need to calculate rowid with Select Distinct.
SELECT
DISTINCT
      Col_Value,
      DENSE_RANK() OVER (ORDER BY Col_Value) AS 'RowID'
FROM
      @Table;

Will return:-
Col_ValueRowID
A1
B2
C3
Or when we need a report where ranks clashes. For example: Two students scored same marks and share same division.
We can use Rank() if we don’t want consecutive numbers.
Hope this tip helped you.

No comments:

Post a Comment