Python in Excel — worth looking at

Python in Excel — worth looking at

Compared to using Python “straight” with a IDE aor inside Jupyter Notebook, PIE (Python In Excel) is quite unappealing. However there are good reasons to get it to know better. It's not a bad product and I suspect that it will get a lot better over the years. In the time I've been playing with it, I've learned some things:

PIE uploads the code and its data to a secure Microsoft server, where it gets executed and the results are returned. If you;re used to running everything on your machine it sounds like an abomination, and in some ways, it is, So what is good about it?

Anyone can run PIE code without installing anything. Your complex model can be sent to the VP of sales who can enter some parameters and get those beautiful Seaborn graphs, and they only need Excel on their machine. Clearly this is the big selling point.

As far as "I don't want to send my data to Microsoft" remember that in most MS Office shops your data is either stored on OneDrive or in Sharepoint (Teams) so Microsoft already has your data.

That approach comes with some limitations you will have to live with, and I encourage playing around with PIE t get used to them:

  • Limited packages. Of course Python comes "batteries included: so having "only" the standard library is not as limiting as it is for a C programmer. On top, PIE comes with Pandas, Numpy, Matplotlib, Seaborn, and StatsModels, so exactly a Spartan outfit either and will comfortably cover most business needs. But yes, pip install great-tables is not an option.
  • Because of its execution model you can't load data on your machine if it's not already "inside Excel." External files can be referenced but you need to set up a query in PowerQuery to make it happen. Furthermore, "for security reasons" (but I suspect server load concerns) you can't access HTTP or data over API's either.
  • Now the tricky part: PIE needs to know the data it needs "at compile time" when the code is sent to the server. As a consequence, the xl function used to retrieve Excel data must be s string literal and cannot be a variable or calculation. This will work:

df = xl("MyBigQuery")        

But this will not:

qry = "MyBigQuery"
df = xl(qry)        

  • Speaking of which, when creating a query in PowerQuery, its default is to create an Excel sheet with output in the same name. By default, it seems PIE will prefer the Excel (Data) Table over the query and that matters because your data table reference needs to include [#All] to include the headers. Even worse, if you never intended the Excel table (it just takes up space) and delete the sheet, your line of code will now read xl("#REF!") (remember that intricate compile-time-literals-only thing!), but it will run after updating it back to xl("my_data").
  • The server will cache data but only for a limited time. A cell will take initially a long time to run if you're loading a lot of data but then run quickly when executing it again, But walk away for coffee, and run it again and it will need to retrieve its data again
  • The Editor can be found in the Formulas ribbon. Think IDLE, not Pycharm, but it beats working in the formula bar
  • Cells (formulas) get executed in regular calculation order: column by column. So it would run your A1 cell first, then A2, A3 and then B1 (if you're organizing your code that way), and not A1-B1-A2-A3.

With that I listed what I think are most of the surprises you can run into. If you're able to install Python and VS Code (Pycharm) on your machine, I'd still do that, develop your code there and then copy it over to Excel. Debugging in PIE is not fun! But you can share your code once it's in PIE, and the importance of that cannot be understated.

Now get coding!

To view or add a comment, sign in

More articles by Bart Willems

  • Navigating the risks of generative AI

    AI bots like ChatGPT and Copilot can be a fantastic tool in getting the job done. But you must always remember that…

  • What's in a (Excel) name?

    With the advent of Excel Data Tables, “regular” named ranges might find less uses but that means they're far from…

    1 Comment
  • Multi-column XLookup

    In the previous article we saw how Boolean algebra can be applied to matrix formulas to combine multiple selection…

  • Boolean logic with Excel Matrix Operations

    In the previous article, Excel's matrix operations were discussed, and how nowadays we can perform matrix calculations…

  • Matrix operations in Excel

    Back in the day, when we still used an abacus to double check our results in Excel, there was a cool feature well…

  • The case against Pandas Pivot Tables

    When used to Excel, the pivot_table method in Pandas seems like a gift from the heavens. A simple, single command to…

Explore content categories