SQL Keywords

SQL Keywords

GROUP BY / HAVING

Feb 23, 2026

·

5

min read

Category: Data Query Language (DQL) - Aggregation

Platform Support:

✅ Snowflake | ✅ BigQuery | ✅ Databricks

Description

GROUP BY groups rows with the same values in specified columns into summary rows, typically used with aggregate functions (COUNT, SUM, AVG, MAX, MIN). HAVING filters grouped results based on aggregate conditions, similar to how WHERE filters individual rows.

Syntax

SELECT 
    column1,
    column2,
    aggregate_function(column3)
FROM table
WHERE condition           -- Filters before grouping
GROUP BY column1, column2
HAVING aggregate_condition -- Filters after grouping
ORDER BY

SELECT 
    column1,
    column2,
    aggregate_function(column3)
FROM table
WHERE condition           -- Filters before grouping
GROUP BY column1, column2
HAVING aggregate_condition -- Filters after grouping
ORDER BY

SELECT 
    column1,
    column2,
    aggregate_function(column3)
FROM table
WHERE condition           -- Filters before grouping
GROUP BY column1, column2
HAVING aggregate_condition -- Filters after grouping
ORDER BY

Platform-Specific Notes

Snowflake:

  • Supports GROUP BY ALL (groups by all non-aggregated columns)

  • Can use column positions: GROUP BY 1, 2

  • Supports GROUPING SETS, CUBE, ROLLUP

BigQuery:

  • Supports GROUP BY ALL

  • Can use column positions or names

  • Supports GROUPING SETS, CUBE, ROLLUP

  • Requires all non-aggregated columns in GROUP BY

Databricks:

  • Standard Spark SQL implementation

  • Supports GROUPING SETS, CUBE, ROLLUP

  • Can use column positions

Example 1: Basic GROUP BY with Aggregations

All Platforms:

SELECT 
    category,
    COUNT(*) as product_count,
    AVG(price) as avg_price,
    MIN(price) as min_price,
    MAX(price) as max_price,
    SUM(quantity_in_stock) as total_inventory
FROM products
GROUP BY category
ORDER BY product_count DESC

SELECT 
    category,
    COUNT(*) as product_count,
    AVG(price) as avg_price,
    MIN(price) as min_price,
    MAX(price) as max_price,
    SUM(quantity_in_stock) as total_inventory
FROM products
GROUP BY category
ORDER BY product_count DESC

SELECT 
    category,
    COUNT(*) as product_count,
    AVG(price) as avg_price,
    MIN(price) as min_price,
    MAX(price) as max_price,
    SUM(quantity_in_stock) as total_inventory
FROM products
GROUP BY category
ORDER BY product_count DESC

Sample Data (products table):

product_id

product_name

category

price

quantity_in_stock

1

Laptop

Electronics

999.99

50

2

Mouse

Electronics

29.99

200

3

Keyboard

Electronics

89.99

150

4

Desk

Furniture

299.99

30

5

Chair

Furniture

199.99

80

6

Monitor

Electronics

349.99

75

Result:

category

product_count

avg_price

min_price

max_price

total_inventory

Electronics

4

367.49

29.99

999.99

475

Furniture

2

249.99

199.99

299.99

110

Example 2: GROUP BY with HAVING Clause

All Platforms:

SELECT 
    customer_id,
    COUNT(*) as order_count,
    SUM(total) as total_spent,
    AVG(total) as avg_order_value,
    MAX(order_date) as last_order_date
FROM orders
WHERE order_date >= '2024-01-01'  -- Filter BEFORE grouping
GROUP BY customer_id
HAVING COUNT(*) >= 3              -- Filter AFTER grouping
   AND SUM(total) > 500
ORDER BY total_spent DESC

SELECT 
    customer_id,
    COUNT(*) as order_count,
    SUM(total) as total_spent,
    AVG(total) as avg_order_value,
    MAX(order_date) as last_order_date
FROM orders
WHERE order_date >= '2024-01-01'  -- Filter BEFORE grouping
GROUP BY customer_id
HAVING COUNT(*) >= 3              -- Filter AFTER grouping
   AND SUM(total) > 500
ORDER BY total_spent DESC

SELECT 
    customer_id,
    COUNT(*) as order_count,
    SUM(total) as total_spent,
    AVG(total) as avg_order_value,
    MAX(order_date) as last_order_date
FROM orders
WHERE order_date >= '2024-01-01'  -- Filter BEFORE grouping
GROUP BY customer_id
HAVING COUNT(*) >= 3              -- Filter AFTER grouping
   AND SUM(total) > 500
ORDER BY total_spent DESC

Sample Data (orders table):

order_id

customer_id

order_date

total

101

1

2024-01-15

250.00

102

2

2024-01-16

180.00

103

1

2024-01-20

320.00

104

3

2024-01-22

450.00

105

1

2024-01-25

180.00

106

2

2024-01-28

290.00

107

3

2024-02-01

220.00

108

3

2024-02-05

180.00

Result:

customer_id

order_count

total_spent

avg_order_value

last_order_date

1

3

750.00

250.00

2024-01-25

3

3

850.00

283.33

2024-02-05

Note: Customer 2 excluded (only 2 orders, needs >= 3)

Example 3: Multiple Grouping Columns

All Platforms:

SELECT 
    department,
    job_title,
    COUNT(*) as employee_count,
    AVG(salary) as avg_salary,
    MIN(salary) as min_salary,
    MAX(salary) as max_salary,
    SUM(salary) as total_payroll
FROM employees
GROUP BY department, job_title
HAVING COUNT(*) > 1  -- Only show roles with multiple employees
ORDER BY department, avg_salary DESC

SELECT 
    department,
    job_title,
    COUNT(*) as employee_count,
    AVG(salary) as avg_salary,
    MIN(salary) as min_salary,
    MAX(salary) as max_salary,
    SUM(salary) as total_payroll
FROM employees
GROUP BY department, job_title
HAVING COUNT(*) > 1  -- Only show roles with multiple employees
ORDER BY department, avg_salary DESC

SELECT 
    department,
    job_title,
    COUNT(*) as employee_count,
    AVG(salary) as avg_salary,
    MIN(salary) as min_salary,
    MAX(salary) as max_salary,
    SUM(salary) as total_payroll
FROM employees
GROUP BY department, job_title
HAVING COUNT(*) > 1  -- Only show roles with multiple employees
ORDER BY department, avg_salary DESC

Sample Data (employees table):

employee_id

name

department

job_title

salary

1

Alice

Sales

Sales Manager

85000

2

Bob

Sales

Sales Rep

55000

3

Carol

Sales

Sales Rep

58000

4

David

Engineering

Software Engineer

95000

5

Emma

Engineering

Software Engineer

92000

6

Frank

Engineering

Software Engineer

98000

7

Grace

Engineering

QA Engineer

72000

8

Henry

Engineering

QA Engineer

70000

Result:

department

job_title

employee_count

avg_salary

min_salary

max_salary

total_payroll

Engineering

Software Engineer

3

95000.00

92000

98000

285000

Engineering

QA Engineer

2

71000.00

70000

72000

142000

Sales

Sales Rep

2

56500.00

55000

58000

113000

Note: Sales Manager excluded (only 1 employee)

Example 4: WHERE vs HAVING

All Platforms:

SELECT 
    product_category,
    EXTRACT(YEAR FROM sale_date) as sale_year,
    COUNT(*) as transaction_count,
    SUM(quantity) as units_sold,
    SUM(revenue) as total_revenue,
    AVG(revenue) as avg_transaction_value
FROM sales
WHERE 
    sale_date >= '2023-01-01'           -- WHERE: Filter rows before grouping
    AND product_category != 'Discontinued'
GROUP BY 
    product_category, 
    EXTRACT(YEAR FROM sale_date)
HAVING 
    SUM(revenue) > 10000                -- HAVING: Filter groups after aggregation
    AND COUNT(*) >= 50
ORDER BY 
    sale_year DESC, 
    total_revenue DESC

SELECT 
    product_category,
    EXTRACT(YEAR FROM sale_date) as sale_year,
    COUNT(*) as transaction_count,
    SUM(quantity) as units_sold,
    SUM(revenue) as total_revenue,
    AVG(revenue) as avg_transaction_value
FROM sales
WHERE 
    sale_date >= '2023-01-01'           -- WHERE: Filter rows before grouping
    AND product_category != 'Discontinued'
GROUP BY 
    product_category, 
    EXTRACT(YEAR FROM sale_date)
HAVING 
    SUM(revenue) > 10000                -- HAVING: Filter groups after aggregation
    AND COUNT(*) >= 50
ORDER BY 
    sale_year DESC, 
    total_revenue DESC

SELECT 
    product_category,
    EXTRACT(YEAR FROM sale_date) as sale_year,
    COUNT(*) as transaction_count,
    SUM(quantity) as units_sold,
    SUM(revenue) as total_revenue,
    AVG(revenue) as avg_transaction_value
FROM sales
WHERE 
    sale_date >= '2023-01-01'           -- WHERE: Filter rows before grouping
    AND product_category != 'Discontinued'
GROUP BY 
    product_category, 
    EXTRACT(YEAR FROM sale_date)
HAVING 
    SUM(revenue) > 10000                -- HAVING: Filter groups after aggregation
    AND COUNT(*) >= 50
ORDER BY 
    sale_year DESC, 
    total_revenue DESC

Sample Result:

product_category

sale_year

transaction_count

units_sold

total_revenue

avg_transaction_value

Electronics

2024

1250

3500

125000.00

100.00

Furniture

2024

680

890

45000.00

66.18

Electronics

2023

950

2800

98000.00

103.16

Example 5: GROUPING SETS, CUBE, and ROLLUP

All Platforms:

-- GROUPING SETS: Specify exact grouping combinations
SELECT 
    region,
    product_category,
    SUM(revenue) as total_revenue
FROM sales
GROUP BY GROUPING SETS (
    (region, product_category),  -- By region and category
    (region),                     -- By region only
    (product_category),          -- By category only
    ()                           -- Grand total
)
ORDER BY region NULLS LAST, product_category NULLS LAST;

-- CUBE: All possible combinations
SELECT 
    region,
    product_category,
    SUM(revenue) as total_revenue
FROM sales
GROUP BY CUBE(region, product_category);

-- ROLLUP: Hierarchical aggregations
SELECT 
    year,
    quarter,
    month,
    SUM(revenue) as total_revenue
FROM sales_data
GROUP BY ROLLUP(year, quarter, month)
ORDER BY year, quarter, month

-- GROUPING SETS: Specify exact grouping combinations
SELECT 
    region,
    product_category,
    SUM(revenue) as total_revenue
FROM sales
GROUP BY GROUPING SETS (
    (region, product_category),  -- By region and category
    (region),                     -- By region only
    (product_category),          -- By category only
    ()                           -- Grand total
)
ORDER BY region NULLS LAST, product_category NULLS LAST;

-- CUBE: All possible combinations
SELECT 
    region,
    product_category,
    SUM(revenue) as total_revenue
FROM sales
GROUP BY CUBE(region, product_category);

-- ROLLUP: Hierarchical aggregations
SELECT 
    year,
    quarter,
    month,
    SUM(revenue) as total_revenue
FROM sales_data
GROUP BY ROLLUP(year, quarter, month)
ORDER BY year, quarter, month

-- GROUPING SETS: Specify exact grouping combinations
SELECT 
    region,
    product_category,
    SUM(revenue) as total_revenue
FROM sales
GROUP BY GROUPING SETS (
    (region, product_category),  -- By region and category
    (region),                     -- By region only
    (product_category),          -- By category only
    ()                           -- Grand total
)
ORDER BY region NULLS LAST, product_category NULLS LAST;

-- CUBE: All possible combinations
SELECT 
    region,
    product_category,
    SUM(revenue) as total_revenue
FROM sales
GROUP BY CUBE(region, product_category);

-- ROLLUP: Hierarchical aggregations
SELECT 
    year,
    quarter,
    month,
    SUM(revenue) as total_revenue
FROM sales_data
GROUP BY ROLLUP(year, quarter, month)
ORDER BY year, quarter, month

Sample Data (sales table):

region

product_category

revenue

East

Electronics

50000

East

Furniture

30000

West

Electronics

60000

West

Furniture

25000

Result (GROUPING SETS):

region

product_category

total_revenue

East

Electronics

50000

East

Furniture

30000

West

Electronics

60000

West

Furniture

25000

East

NULL

80000

West

NULL

85000

NULL

Electronics

110000

NULL

Furniture

55000

NULL

NULL

165000

Example 6: GROUP BY with CASE Statements

All Platforms:

SELECT 
    CASE 
        WHEN age < 25 THEN '18-24'
        WHEN age < 35 THEN '25-34'
        WHEN age < 50 THEN '35-49'
        ELSE '50+'
    END as age_group,
    COUNT(*) as customer_count,
    AVG(lifetime_value) as avg_lifetime_value,
    SUM(total_purchases) as total_purchases
FROM customers
GROUP BY 
    CASE 
        WHEN age < 25 THEN '18-24'
        WHEN age < 35 THEN '25-34'
        WHEN age < 50 THEN '35-49'
        ELSE '50+'
    END
HAVING COUNT(*) >= 10
ORDER BY

SELECT 
    CASE 
        WHEN age < 25 THEN '18-24'
        WHEN age < 35 THEN '25-34'
        WHEN age < 50 THEN '35-49'
        ELSE '50+'
    END as age_group,
    COUNT(*) as customer_count,
    AVG(lifetime_value) as avg_lifetime_value,
    SUM(total_purchases) as total_purchases
FROM customers
GROUP BY 
    CASE 
        WHEN age < 25 THEN '18-24'
        WHEN age < 35 THEN '25-34'
        WHEN age < 50 THEN '35-49'
        ELSE '50+'
    END
HAVING COUNT(*) >= 10
ORDER BY

SELECT 
    CASE 
        WHEN age < 25 THEN '18-24'
        WHEN age < 35 THEN '25-34'
        WHEN age < 50 THEN '35-49'
        ELSE '50+'
    END as age_group,
    COUNT(*) as customer_count,
    AVG(lifetime_value) as avg_lifetime_value,
    SUM(total_purchases) as total_purchases
FROM customers
GROUP BY 
    CASE 
        WHEN age < 25 THEN '18-24'
        WHEN age < 35 THEN '25-34'
        WHEN age < 50 THEN '35-49'
        ELSE '50+'
    END
HAVING COUNT(*) >= 10
ORDER BY

Sample Result:

age_group

customer_count

avg_lifetime_value

total_purchases

18-24

245

385.50

2450

25-34

580

720.25

6800

35-49

450

950.75

5200

50+

125

1150.00

1580

WHERE vs HAVING Comparison

Aspect

WHERE

HAVING

Filters

Individual rows

Grouped results

Applied

Before GROUP BY

After GROUP BY

Can use aggregates

No

Yes

Use with

Any query

Only with GROUP BY

Example

WHERE price > 100

HAVING SUM(price) > 1000

Common Aggregate Functions

Function

Description

Example

COUNT(*)

Count all rows

COUNT(*)

COUNT(column)

Count non-NULL values

COUNT(email)

SUM(column)

Sum of values

SUM(revenue)

AVG(column)

Average of values

AVG(price)

MIN(column)

Minimum value

MIN(order_date)

MAX(column)

Maximum value

MAX(salary)

COUNT(DISTINCT column)

Count unique values

COUNT(DISTINCT customer_id)

Common Use Cases

  1. Sales reports: Revenue by product, region, time period

  2. Customer analytics: Purchase behavior, segmentation

  3. Inventory management: Stock levels by category

  4. HR analytics: Headcount, salary statistics by department

  5. Web analytics: Page views, sessions by source

  6. Financial analysis: Expenses by category, budget vs actual

Performance Tips

  1. Index GROUP BY columns: Improves grouping performance

  2. Filter with WHERE first: Reduce rows before grouping

  3. Use HAVING sparingly: Only for aggregate conditions

  4. Limit grouping columns: Fewer columns = faster grouping

  5. Consider materialized views: For frequently grouped queries

  6. Use appropriate data types: Smaller types group faster

Best Practices

  1. Include all non-aggregated columns: In GROUP BY clause

  2. Use meaningful aliases: For aggregated columns

  3. Order logically: GROUP BY, HAVING, ORDER BY sequence

  4. Document complex groupings: Explain business logic

  5. Validate results: Check totals make sense

  6. Use HAVING for aggregates only: Use WHERE for column filters

  7. Consider GROUPING SETS: For multiple aggregation levels

Interested to Learn More?
Try Out the Free 14-Days Trial

More Articles

decorative icon

Experience Analytics for the AI-Era

Start your 14-day trial today - it's free and no credit card needed

decorative icon

Experience Analytics for the AI-Era

Start your 14-day trial today - it's free and no credit card needed

decorative icon

Experience Analytics for the AI-Era

Start your 14-day trial today - it's free and no credit card needed

Copyright © 2026 Paradime Labs, Inc.

Made with ❤️ in San Francisco ・ London

*dbt® and dbt Core® are federally registered trademarks of dbt Labs, Inc. in the United States and various jurisdictions around the world. Paradime is not a partner of dbt Labs. All rights therein are reserved to dbt Labs. Paradime is not a product or service of or endorsed by dbt Labs, Inc.

Copyright © 2026 Paradime Labs, Inc.

Made with ❤️ in San Francisco ・ London

*dbt® and dbt Core® are federally registered trademarks of dbt Labs, Inc. in the United States and various jurisdictions around the world. Paradime is not a partner of dbt Labs. All rights therein are reserved to dbt Labs. Paradime is not a product or service of or endorsed by dbt Labs, Inc.

Copyright © 2026 Paradime Labs, Inc.

Made with ❤️ in San Francisco ・ London

*dbt® and dbt Core® are federally registered trademarks of dbt Labs, Inc. in the United States and various jurisdictions around the world. Paradime is not a partner of dbt Labs. All rights therein are reserved to dbt Labs. Paradime is not a product or service of or endorsed by dbt Labs, Inc.