Fix: ‘Column is invalid in the select list’
The error “Column is invalid in the select list because it is not contained in either an aggregate function or the GROUP BY clause” occurs in SQL when a SELECT statement includes columns that aren’t in an aggregate function or the GROUP BY clause.

This error means that if you’re using GROUP BY to summarize data, any non-aggregated columns must be in the GROUP BY list. If they aren’t, SQL Server will return an error because it doesn’t know which value to show for those fields.
This usually happens when you select columns not part of the aggregation in a GROUP BY query. It can also occur due to incorrect grouping or by failing to include all non-aggregated fields in the GROUP BY clause.
Two key terms, “Group by” and “aggregate function,” are central to solving SQL errors. Here’s how to use them:
GROUP BY Clause:
The “GROUP BY” clause helps summarize or aggregate data, such as calculating total profits, sales, or counting students in a department. For instance, it can be used to sum daily sales figures.
GROUP BY Split-Apply-Combine Strategy:
“GROUP BY” employs a “split-apply-combine” strategy:
- Split: Divides data into groups.
- Apply: Uses aggregate functions (e.g., SUM, AVG) on each group.
- Combine: Merges these results into one output.

In the figure above, column C1 is divided into groups, aggregate functions are applied, and a single value is assigned to each group.
Here’s an example using a database named “appuals.”

Example:
Create a table called “employee” using this code:
USE [appuals] GO SET ANSI_NULLS ON GO SET QUOTED_IDENTIFIER ON GO SET ANSI_PADDING ON GO CREATE TABLE [dbo].[employee]( [e_id] [int] NOT NULL, [e_ename] [varchar](50) NULL, [dep_id] [int] NULL, [salary] [int] NULL, CONSTRAINT [PK_employee] PRIMARY KEY CLUSTERED ( [e_id] ASC ) ) ON [PRIMARY] GO SET ANSI_PADDING OFF GO

Insert data into the table:
Insert into employee(e_id, e_ename, dep_id, salary) values(101, 'Sadia', 1, 6000), (102, 'Saba', 1, 5000), (103, 'Sana', 2, 4000), (104, 'Hammad', 2, 3000), (105, 'Umer', 3, 4000), (106, 'Kanwal', 3, 2000)
Result:

Next, view all data:
select * from employee
Output:

Group by department:
select dep_id, salary from employee group by dep_id
Error: Column ’employee.salary’ is invalid in the select list because it is not contained in either an aggregate function or the GROUP BY clause.
This error occurs because the “employee.salary” column is neither in the GROUP BY clause nor in an aggregate function.

Solution:
Use GROUP BY and an aggregate function to calculate the average salary for each department:
select dep_id, avg(salary) as average_salary from employee group by dep_id

Depicted via split-apply-combine:

Aggregate Functions:
- Sum(): Total of each group.
- Count(): Number of rows in each group.
- Avg(): Mean of each group.
- Min(): Minimum value of each group.
- Max(): Maximum value of each group.
Using GROUP BY and Aggregate Functions:
Now, a practical example:
Create a table “people“:
USE [appuals] GO SET ANSI_NULLS ON GO SET QUOTED_IDENTIFIER ON GO CREATE TABLE [dbo].[people]( [id] [bigint] IDENTITY(1,1) NOT NULL, [name] [varchar](500) NULL, [city] [varchar](500) NULL, [state] [varchar](500) NULL, [age] [int] NULL ) ON [PRIMARY] GO

Insert data:
insert into people(name, city, state, age) values ('Meggs', 'MONTEREY','CA',20), ('Staton','HAYWARD', 'CA',22), ('Irons', 'IRVINE' ,'CA',25), ('Krank', 'PLEASANT', 'IA',23), ('Davidson' ,'WEST BURLINGTON', 'IA',40), ('Pepewachtel' ,'FAIRFIELD' ,'IA',35), ('Schmid', 'HILLSBORO', 'OR',23), ('Davidson', 'CLACKAMAS', 'OR',40), ('Condy','GRESHAM','OR',35)
Output:

To find the number of residents and their average age by state, use:
select state, avg(age) as Age, count(*) as no_of_residents from people group by state
Error: Column ‘people.age’ is invalid in the select list because it is not contained in either an aggregate function or the GROUP BY clause.
This error arises when the “people.age” column is not part of the GROUP BY clause nor included in an aggregate function.
Logical Solution:
To fix the query, include the specific aggregation for the “age” column, like the average age:
select state, avg(age) as Age, count(*) as no_of_residents from people group by state
This executes correctly and provides:

Consider these points when using GROUP BY to avoid errors:
- The GROUP BY clause comes after the WHERE clause and before the ORDER BY clause.
- Use the WHERE clause to filter rows before applying “GROUP BY”.
- If a column contains NULL values, they are treated as one group.
GROUP BY and NULL Values:
Insert a row with NULL in the “state” column:
insert into people(name, city, state, age) values ('Kanwal', 'GRESHAM', '', 35)

Now, execute:
select state, avg(age) as Age, count(*) as no_of_residents from people group by state
Output:

Add more NULLs:
insert into people(name, city, state, age) values ('Kanwal', 'IRVINE', 'NULL', 35), ('Krank', 'PLEASANT', 'NULL', 23)

Re-execute the query:

In summary, empty or NULL entries are treated as separate groups within the “GROUP BY” operation.