How to Select Top ‘n’ or Bottom ‘n’ records from a dataset?

In this post, you will learn how to create a query to extract Top ‘n’ or Bottom ‘n’ records from a particular dataset. There are  instances when you’ll need to extract top or bottom ‘n’ records within a particular group.  In order to do so, MS Access provides a nifty feature  ‘Return’ – Select Top 'n' or Bottom 'n' records from a dataset in MS Access that is used to extract the required Top or Bottom values. This feature is available under the Query setup in the Design Tab of the ribbon.

Select Top 'n' or Bottom 'n' records from a dataset in MS Access

So, lets get started with the actual scenarios……

Real Life Scenario

You have been assigned a new business unit as an inventory manager and you want to know the following:

  • The Top 5 products that are below the target inventory level
  • Similarly, the Bottom 5 products that are below the target inventory level.
  • For all the categories present in your data, find Top 3 products across each of the category.

Select Top ‘n’ or Bottom ‘n’ Products

As an inventory manager, you are trying to understand the current level of your company’ stock. As per your company’s inventory policy, you have assigned a target inventory level across each products. This is the minimum level of stock that each product needs to have at any point of time. If your inventory for a product falls below this level, then you need to place a purchase order with the respective vendor. Besides that, you also have access to the information on the current inventory levels across each products.

So, now you are interested in analysing the difference between the current inventory levels and the target level across each products i.e. you want to analyse the ‘quantity below target level’ for each the products –

‘Qty below target level’ = Target level – Current level.

To begin with, you want to know the top 5 products that have the maximum difference between the target levels and their current levels i.e. the top 5 products with maximum ‘Qty below target level’ 

There is a very easy way to achieve what you have discussed above.

The Northwind database comes with a query ‘Inventory’ on which you just have to create a simple SELECT query as shown in the screenshot below:

Select Top 'n' or Bottom 'n' records from a dataset in MS Access

On running this query, you get the required output:

Select Top 'n' or Bottom 'n' records from a dataset in MS Access

Now, let us see how the result changes when you change the sort order :

  • You change the sort to Ascending:

    Select Top 'n' or Bottom 'n' records from a dataset in MS Access

The above result is nothing but ‘Bottom 5’ products. 

  • You do not mention sort at all:

Select Top 'n' or Bottom 'n' records from a dataset in MS Access

As can be seen from above, the query simply returns the top 5 records in whatever order they are present in the source table.

Select Top ‘n’ Records across each Group

Now, instead of just knowing the Top ‘n’ products or Bottom ‘n’ products, you might want to analyse say Top 3 products across each category.

For example if you have 4 categories, then you will be expecting something like :

Category

Product

Quantity

Category1

Product1

100

 

Product2

20

 

Product3

5

Category2

Product1

150

 

Product2

110

 

Product3

90

Category3

Product1

790

 

Product2

100

 

Product3

10

Ideally speaking, you would want Access to list all the categories present in the data and for each category list the top 3 values.

This can be achieved by:

  1. Creating a Query that would select the Category Name.
  2. Then Creating another Query that would return the top 3 values for each row of the query above. To do so, you’ll have to embed or nest the 02nd query in the 01st Query. This type of query is called as a Sub-Query where the first query is referred to as an Outer Query while the second one is referred to as an Inner Query.

Normally, an inner query is executed first and the results of the inner query are then used as criteria for the outer query. But in this case, the inner query is expected to run for every new category in the outer query. In other words, the inner query will run repeatedly for every category that might be selected by the outer query. This type of sub-query is nothing but a ‘correlated sub-query’.

The query is constructed as follows:  

Select Top 'n' or Bottom 'n' records from a dataset in MS Access

As you can see from the screenshot above, the inner query selects the top 3 values of the column [Qty Below Target Level] for every category in the outer query . It does so every time the category in the inner query is equal to the category in the outer query.

Below is the SQL view of the query in the query grid above:

Select Top 'n' or Bottom 'n' records from a dataset in MS Access

Note that both the first query as well as the second query is referring to the same table ‘Inventory1’. However, in the outer query the table will be used to select the category name along with the quantity field whereas in the inner query the table will be used to select top 3 values for each category. As you can see, in both the queries the tables are going to have different role. Hence, you cannot use the same table name in both the queries. However, you can create an alias of the original table. Creating an alias in the second query allows you to refer the same table but with different name while the first query is still referring to the same table with the original name.

Let’s say for a category = ‘Beverages’ in the outer query, the inner category returns 3 top values of the column [Qty Below Target Level] for the same category. Let’s say those 3 values are 750, 690, 120.

As you might have noticed the inner query is enclosed within the In() operator. So, inner query on returning the 3 values would look like :

In (750,690,120)

This now becomes the criteria for the outer query under the column [Qty Below Target Level]. On evaluating the outer query, you get the top 3 values across the category ‘Beverages’.

Similarly, it repeatedly runs the inner and outer query for each of the categories.  On running the above query, you get the desired output as follows:

Select Top 'n' or Bottom 'n' records from a dataset in MS Access

In this way, you can select Top or Bottom ‘n’ records across each Category.

Other Points to Remember

  1. You need to apply sort on the column that contains the top or bottom values. The sort could be in Ascending or Descending order depending on whether you need top or bottom values.
  2. This feature could also be used during testing your queries. In real life,  you’ll normally encounter huge datasets while analysing the data. These datasets will have thousands of rows and multiple columns. So, every time you create a query and try to run it, the resulting dataset will also be huge. So, the whole activity of running the query on such a huge dataset and returning a resulting dataset takes a lot of time.
    Further, it is not as if you are going to write a query in one go. You’ll first write a part of a query, then run to validate the results of the query. Then again go back to add another criteria to the existing query or maybe add another calculated field, then again run this query to confirm whether or not you are getting the expected results. In order to do all these iterations, you’ll be running the query multiple times. Given the huge dataset, this activity is then going to consume a lot of time.  However, using the feature discussed in this post can save you a lot of time. Once you are satisfied that the query is returning the expected result, you can then remove the Top clause.
  3. There is an alternate way to enter the TOP ‘n’ values. As can be seen from the screenshot below, the top 5 or 3 values could be entered in the property sheet under the General Tab. The property panel is activated by right clicking and clicking on ‘Properties’ option.

Select Top 'n' or Bottom 'n' records from a dataset in MS Access

 

Homework

You can further practise this feature by extracting TOP % or Bottom % of the records. Everything remains the same – you just need to enter the % values instead of integer values.

So, how about you create a query to get the expected result for the following scenario:

  • Top products that accounted for 75% of total sales.
  • Bottom products that accounted for 20% of the total sales

Once you are done, you can share your observations with all the readers on this blog. Good luck.

Advertisements

One response to “How to Select Top ‘n’ or Bottom ‘n’ records from a dataset?

  1. Dear Sir,The tutorials are just wonderful,easy to understand unlike some that make things more difficult and even annoying that one has to abondon some lessons entirely.

    Take this tutorial Ceil Inn Hotel …..function X another very good tutorials that every beginner can understand.

    Does your team have similar ones like these tutorials?

    My kind regards,

    Benjamin Emadu.

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s