BI skills丨Use Rank function to rank

In actual business scenarios, in addition to various condition calculations and year-on-year comparison calculations, ranking is also a relatively common problem. We often need to use the Rankx function to perform certain TopN calculations, etc.

Microsoft’s newly launched windowing function Rank can be said to be an optimization of the ranking problem and also solves some of the pitfalls of the previous Rankx function.

Basic syntax of Rank function

RANK ( [<ties>][, <relation>][, <orderBy>][, <blanks>][, <partitionBy>][, <matchBy>] )

Parameter introduction:

ties: optional, DENSE or SKIP, ranking processing with the same value.

relation: optional, the table or table expression based on which the ranking is based.

orderBy: optional, sorting basis, if omitted, the second parameter needs to be bound.

blanks: optional, used to handle empty ranking positions.

partitionBy: optional, partition definition, just refer to SQL window partitioning.

matchby: optional, defines a statement that matches data and identifies the column of the current row.

PS:

It seems that there are many parameters. In fact, we use the second parameter and the third parameter more frequently in daily life. The other parameters are generally used to deal with certain special scenarios.

Let’s take a look at the case data in this issue first:



The case data has a total of 3 tables, product table, branch table and fact table. Import them into PowerBI. The model relationship is as follows:

Add base measures:

Sales Amount:

001.SalesAmount =
SUMX ( 'Fact_Sales', 'Fact_Sales'[Quantity] * RELATED ( Dim_Product[Price] ) )

Cost of sales:

002.SalesCost =
SUMX ( 'Fact_Sales', 'Fact_Sales'[Quantity] * RELATED ( Dim_Product[Cost] ) )

sales profit:

003.SalesProfit =
[001.SalesAmount] - [002.SalesCost]

Sales unit price:

001.Price =
SUM ( 'Dim_Product'[Price] )

Sales unit cost:

002.Cost =
SUM ( 'Dim_Product'[Cost] )

At this point, our preparations are complete. Next, let’s take a look at the performance of the Rank function in different scenarios.

①.Floating point operations

Floating point arithmetic refers to the situation where rankings are disordered because the decimal digits of values are too long. For example, we add the following code:

004.ProfitRankx =
IF (
    HASONEFILTER (Dim_Store[City]),
    RANKX ( ALLSELECTED ( 'Dim_Store' ), [003.SalesProfit] )
)

The result is as shown below:

You can see that there are many dimensions with different values, but the rankings are the same. Our previous processing method was as follows:

005.ProfitRankxAmend =
IF (
    HASONEFILTER (Dim_Store[City]),
    RANKX ( ALLSELECTED ( 'Dim_Store' ), ROUND ( [003.SalesProfit], 2 ) )
)

The solution is to use the ROUND function to process the basis value into a fixed number of digits to avoid floating point calculation differences.

With the Rank function, we don’t need to consider this situation.

006.ProfitRank =
RANK ( ALLSELECTED ( 'Dim_Store' ), ORDERBY ( [003.SalesProfit], DESC ) )

The result is as follows:

②.Tie-ranking

In actual scenarios, tied rankings often occur. The general way to deal with this situation is to perform weighting processing if a tied ranking is found.

For example, we now sort products based on sales price.

003.RankxPrice =
RANKX ( ALLSELECTED ( 'Dim_Product' ), [001.Price],, ASC )

The result is as follows:

To resolve tie rankings, we chose to weight product unit costs.

004.RankxPriceCost =
RANKX (
    ALLSELECTED ( 'Dim_Product' ),
    [001.Price]
         + RANKX ( ALLSELECTED ( 'Dim_Product' ), [002.Cost],, ASC ) / 10000,
    ,
    ASC
)

The result is as follows:

The writing method has a loss on performance. With the Rank function, we can change the writing method.

005.RankPriceCost =
RANK (
    ALLSELECTED ( 'Dim_Product' ),
    ORDERBY ( [001.Price], ASC, [002.Cost], ASC )
)

The result is as follows:

Compared with Rankx’s writing method, it is undoubtedly much simpler.

PS:

Friends who are good at SQL can easily find that Microsoft transplanted the Rank function in SQL to DAX. There are differences in writing methods, but the core is basically the same.

③.Performance

Let’s compare the execution performance of the Rankx function and the Rank function in the two scenarios respectively.

It can be seen with the naked eye that there is not much difference between the two in dealing with floating point arithmetic problems.

In dealing with the problem of tied rankings, the performance of the Rank function is higher than that of the Rankx function.

In addition to the above three questions, there are also related absolute rankings, relative rankings, and within-group rankings. White Tea will not be detailed here. Interested friends can test it by themselves.