A plea to Microsoft

Malcolm Bacchus

Published Aug 15, 2025

A while back I posted a plea on the Microsoft website wanting them to update Excel to have a (better than) functionality to GoogleDocs with an SQL function. That didn't include an example of why it would be good to have but recently I was exploring the best way of emulating parts of a database in Access (yes, I know, horses for course, but there is a reasonably good reason behind this). So here are some examples of why we need =RUNSQL() in Excel.

The challenge was to take an input table which had the following fields (this is simplified a bit for the purpose of illustration, but essentially):

[Key] Unique index; [Name] text; [StartDate] date; [Order] number

where each row of data has a unique combination of [name]+[startdate]+[order] (referred to by the [key]) but individually neither [name], [startdate] nror [order] is unique.

The output is a table which gives, for each name, the latest date. Where there is more than one entry for the name with that date, the output must have the highest [order] number.

The SQL in the original database is pretty simple to follow:


SELECT T.Name, T.StartDate
FROM Mytable AS T
WHERE T.StartDate <= [SetDate] AND 
T.StartDate = (SELECT MAX(T3.StartDate)
    FROM Mytable AS T3
    WHERE T3.Name = T.Name
    AND T3.StartDate <= [SetDate]) AND 
T.[Order] = (SELECT MAX(T2.[Order])
    FROM Mytable AS T2
    WHERE T2.Name = T.Name
    AND T2.StartDate = T.StartDate
    AND T2.StartDate <= [SetDate])
ORDER BY T.Name;

I'm using BI/Get and Transform to do the manipulation of the data in Excel, so the corresponding M-formula code for Excel is:

    Source = Excel.CurrentWorkbook(){[Name="Mytable"]}[Content],
    ChangedType = Table.TransformColumnTypes(Source,
		{{"Key", Int64.Type}, {"Name", type text}, 
		{"StartDate", type date}, {"Order", Int64.Type}}),
    FilteredByDate = Table.SelectRows(ChangedType, 
		each [StartDate] <= SetDate),
    GroupedByName = Table.Group(FilteredByDate,{"Name"}, 
		{{"MaxStartDate", each List.Max([StartDate]), type date}, 
		{"AllRows", each _, type table}}),
    ExpandedRows = Table.ExpandTableColumn(GroupedByName,"AllRows",
		{"Key", "StartDate", "Order"},{"Key", "StartDate", 
		"Order"}),
    ChangedType1 = Table.TransformColumnTypes(ExpandedRows,
		{{"Key", Int64.Type},{"StartDate", type date}, 
		{"Order", Int64.Type}}),
    FilteredMaxDate = Table.SelectRows(ChangedType1,
		each [StartDate] = [MaxStartDate]),
    GroupedByNameAndDate = Table.Group(FilteredMaxDate,
		{"Name", "StartDate"},{{"MaxOrder", 
		each List.Max([Order]), Int64.Type},
		{"AllRows", each _, type table}}),
    ExpandedRows1 = Table.ExpandTableColumn
		(GroupedByNameAndDate,"AllRows",
		{"Key", "Order"},{"Key", "Order"}),
    FilteredByOrder = Table.SelectRows(ExpandedRows1,
		each [Order] = [MaxOrder]),
    RemoveColumns = Table.RemoveColumns(FilteredByOrder, 
		{"Key","Order","MaxOrder"}),
    ReorderedColumns = Table.ReorderColumns(RemoveColumns,
		{"Name", "StartDate"}),
    SortedRows = Table.Sort(ReorderedColumns,
		{{"Name", Order.Ascending}})
in SortedRows

It's would be easier to read if LinkedIn gave wider columns. And its a bit cumbersome - I'm sure somebody can do better, but it works. Of course, this isn't dynamic so when the input data changes, the query has to be refreshed, but that is fine and a couple of lines of VBA do that automatically.

Now of course you can do it as an Excel formula, which will recalculate dynamically, and return the results as a dynamic array rather than a table, but who on earth would spend their time writing this:

Recommended by LinkedIn

Crossjoin in Power Query. Wait, what?!

Felipe Muniz 11 months ago

Use the Chartmaker Directory to find new ways to…

Dave Paradi 8 years ago

🚀 Automating the Analyst's World: My Journey into VBA…

ATI BHARGAVI 10 months ago

=SORT(IFERROR(INDEX(Mytable[[Name]:[StartDate]], MATCH(LET(names,UNIQUE(Mytable[Name]),dates,MAXIFS(Mytable[StartDate],Mytable[Name],names,Mytable[StartDate],"<="&SetDate),XMATCH(names&"-"&dates&"-"&MAXIFS(Mytable[Order],Mytable[Name],names,Mytable[StartDate],dates),Mytable[Name]&"-"&Mytable[StartDate]&"-"&Mytable[Order],0)), Mytable[Key], 0), {1,2}), ""),1)

(Oh, I did, just to prove it could be done, but by tomorrow I will have forgotten what it does and nobody else would have a clue. So really not a good approach.)

Then there is putting the code into Python as a =PY() function which will recalculate dynamically but requires an internet connection to run:

=PY("
import pandas as pd
from datetime import datetime
dM = xl("Mytable")
dM.columns = ['Key', 'Name', 'StartDate', 'Order']
dM['StartDate'] = pd.to_datetime(dM['StartDate'], errors='coerce')
setdate = pd.to_datetime(xl("SetDate"), errors='coerce')
dM_filtered = dM[dM['StartDate'] <= setdate].copy()
max_startdate = dM_filtered.groupby('Name')['StartDate'].max().reset_index()
max_startdate = max_startdate.rename(columns={'StartDate': 'MaxStartDate'})
dM_merged = dM_filtered.merge(max_startdate, on='Name')
dM_merged = dM_merged[dM_merged['StartDate'] == dM_merged['MaxStartDate']]  
max_order = dM_merged.groupby(['Name', 'StartDate'])['Order'].max().reset_index()
max_order = max_order.rename(columns={'Order': 'MaxOrder'})
result = dM_merged.merge(max_order, on=['Name', 'StartDate'])
result = result[result['Order'] == result['MaxOrder']]
result = result[['Name', 'StartDate']].sort_values(by='Name').values
result
")

So, in summary:

SQL = 371 characters
M-formula = 1793 characters (plus a little VBA to make it dynamic)
Excel formula = 360 character (but totally incomprehensible)
Python = 900 characters (but requires an internet connection)

SQL would be the clear winner ... if only Microsoft had an SQL interpreter. GoogleSheets does. It has QUERY() which takes a SQL query and executes it. Excel doesn't. And Microsoft could make it better than Google docs by allowing it to execute multiline statements and interpret range names rather than just hard coded cell addresses. So come on, Microsoft, up your game. I'd miss the enjoyment of working out all this fun programming (yes, I do enjoy it) but we would all be more productive, wouldn't we?

Malcolm

P.S. It's even shorter in APL, but I'm not that unreasonable in my expectations.

latestByName ← {
    names ← ∪ data[;1] 
    ⊃{ 
        name ← ⍵
        rows ← data[data[;1] = name;] 
        maxDate ← ⌈/ rows[;2] 
        maxDateRows ← rows[rows[;2] = maxDate;]
        maxOrder ← ⌈/ maxDateRows[;3]
        maxRow ← maxDateRows[maxDateRows[;3] = maxOrder;]
        ⊃maxRow
    }¨ names
}

To view or add a comment, sign in

A plea to Microsoft

Malcolm Bacchus

Recommended by LinkedIn

More articles by Malcolm Bacchus

Others also viewed

The Query and the Pivot – When your Tables need more than a Sheet

What is a Power Query Template and why is it a big deal in the era of Fabric?

Toolbox with no hammer?

POST call in Power BI Dataflow

Intro to R in Power BI - Part 3

Page Numbering in SAS: A Double DOW'll Do Ya

Microsoft Data Platform News 2024 - Week 08

Microsoft Excel vs R

The case against Pandas Pivot Tables

Why Excel Still Deserves a Seat at the Analytics Table

Explore content categories

Recommended by LinkedIn

More articles by Malcolm Bacchus

Speech to the Institute of Chartered Accountants of India’s World Forum of Accountants (WOFA) 2025

Observations on Corporate Governance

Something for London businesses?

Protect yourself from Cyber Crime

Accountancy Firms and Solicitors Being Targeted by Fraudsters

Others also viewed

The Query and the Pivot – When your Tables need more than a Sheet

What is a Power Query Template and why is it a big deal in the era of Fabric?

Toolbox with no hammer?

POST call in Power BI Dataflow

Intro to R in Power BI - Part 3

Page Numbering in SAS: A Double DOW'll Do Ya

Microsoft Data Platform News 2024 - Week 08

Microsoft Excel vs R

The case against Pandas Pivot Tables

Why Excel Still Deserves a Seat at the Analytics Table

Explore content categories