Extracting, manipulating and presenting data from a SQL database could be a complicated operation, expecially if we talk about business intelligence, reports and graphics for the final users.
If we couldn’t use a BI tool and we need to write extraction queries manually, we could have some problem to write complex queries. Writing a query that dinamically group a set of data based on a user selection is no very easy.
Suppose that the application have an interface with a multivalue field where the user can choose a max of three grouping from a list of ten, after that the filters will be stored in a dedicated table. Then the extraction query will be executed, will read the groupings from the table and will have to apply the grouping dinamically to the dataset.
The columns of the result dataset should are:
- Grouping 1
- Grouping 2 (optional)
- Grouping 3 (optional)
- Totals
In order to achieve this result I’ll use a SQL Function and table variables.
Function
The function will returns a table variable like this:
CREATE FUNCTION [dbo].[report_summary] () RETURNS @Aggregate TABLE ( FirstGrouping NVARCHAR(50) null ,SecondGrouping NVARCHAR(50) null ,ThirdGrouping NVARCHAR(50) null ,Total int ) AS BEGIN ..... END
The first step is retrieving the groupings from the filters table:
DECLARE @Groupings TABLE ( Grouping NVARCHAR(50) ) INSERT INTO @Groupings SELECT Grouping, ROW_NUMBER() OVER(ORDER BY DefaultOrder) AS RowNumber from dbo.Groupings WHERE ExtractionId = @ExtractionId ORDER BY DefaultOrder
So I’ve declared a table variable filled with a list of groupings retrieved from the Groupings table. The table has an identity column DefaultOrder as well that is useful to assign an order to the groupings and a RowNumber with OVER clause that assign the first, second and third position.
Based on a predefinded list of possible groupings, I set a bunch of variables that will be useful later:
DECLARE @BirthCityGrouping NVARCHAR(30) = 'Birth city'; DECLARE @AgeGrouping NVARCHAR(30) = 'Age'; DECLARE @ZipCodeGrouping NVARCHAR(30) = 'Zip code'; DECLARE @CountryGrouping NVARCHAR(30) = 'Country'; DECLARE @SubscriptionYearGrouping NVARCHAR(30) = 'Subscription year'; ..... DECLARE @IsBirthCityGrouping smallint set @IsBirthCityGrouping = (select count(Grouping) from @Groupings where Grouping = @BirthCityGrouping); DECLARE @IsAgeGrouping smallint set @IsAgeGrouping = (select count(Grouping) from @Groupings where Grouping = @AgeGrouping); DECLARE @IsZipCodeGrouping smallint set @IsZipCodeGrouping = (select count(Grouping) from @Groupings where Grouping = @ZipCodeGrouping); ..... DECLARE @FirstGrouping nvarchar(50) = (select Grouping from @Groupings where RowNumber = 1); DECLARE @SecondGrouping nvarchar(50) = (select Grouping from @Groupings where RowNumber = 2); DECLARE @ThirdGrouping nvarchar(50) = (select Grouping from @Groupings where RowNumber = 3);
I can use the first set of variables to build an intermediate table variable with the values to be grouped:
DECLARE @SelectedRecords TABLE ( Id UNIQUEIDENTIFIER ,BirthCity NVARCHAR(50) ,Age smallint ,ZipCode NVARCHAR(5) ,Country NVARCHAR(50) ,SubscriptionYear SMALLINT ) INSERT INTO @SelectedRecords select Id ,CASE WHEN (@IsBirthCityGrouping > 0) THEN BirthCity ELSE 'ALL' END ,CASE WHEN (@IsAgeGrouping > 0) THEN Age ELSE 'ALL' END ,CASE WHEN (@IsZipCodeGrouping > 0) THEN ZipCode ELSE 'ALL' END ..... from dbo.MyTable
With the variables declarated above I can check if the grouping have to be done or not; in the first case I select the effective value, otherwise I set a default value ‘ALL’ for all the rows, that will be not considered in the group by clause.
Now I’m ready to populate the Aggregate table:
INSERT INTO @Aggregate SELECT (CASE WHEN @FirstGrouping = @BirthCityGrouping THEN BirthCity WHEN @FirstGrouping = @AgeGrouping THEN Age WHEN @FirstGrouping = @ZipCodeGrouping THEN ZipCode ..... ELSE '' END) as FirstGrouping ,(CASE WHEN @SecondGrouping = @BirthCityGrouping THEN BirthCity WHEN @SecondGrouping = @AgeGrouping THEN Age WHEN @SecondGrouping = @ZipCodeGrouping THEN ZipCode ..... ELSE '' END) as SecondGrouping ,(CASE WHEN @ThirdGrouping = @BirthCityGrouping THEN BirthCity WHEN @ThirdGrouping = @AgeGrouping THEN Age WHEN @ThirdGrouping = @ZipCodeGrouping THEN ZipCode ..... ELSE '' END) as ThirdGrouping ,count(Id) FROM @SelectedRecords GROUP BY BirthCity,Age,ZipCode,Country,SubscriptionYear
In the first three colums I need to determine the groupings and show the correct values and in the last one I calculate the total. The group by clause will group for the enhanced columns and will ignore those that they have the ALL value. In this way I will be able to generate a table variable with three dynamic grouping.
Summary
The problem of dynamically grouping sql records can be solved with table variables and replacing some default values instead of the originals where we do not need to do groupings.
We can manage a default grouping order with an identity column in a table that guarantee the order which the rows are created.
Once the intermediate table variables are ready, we can count the subtotals by leveraging the default values defined, that not influence the count.
Leave a Reply