Extract Excel Data with AWS Lambda

Extract Excel Data with AWS Lambda

A requirement existed where I needed to extract data from Excel from within an AWS Lambda Function. Having read several articles on how to do this, I realised that a set of python libraries needed to be zipped and uploaded to AWS as a lambda layer; the lambda layer could then be used within the function by the python code.

However, none of the articles that I found worked successfully for me and I kept getting "Unable to import module" error message, even though the module in question was included in the zipped package.

The following steps describe how I solved my problem using AWS cloud9:

First, I created 2 top level folders named Virtual_Env and Working_Packages. From the first folder I opened a new terminal. From here I typed:

python3 -m venv venv_projectname/        

where projectname is the name of my project

This creates a sub folder containing folders bin, include, lib and lib64. The bin folder contains a script to activate this folder as a virtual environment. To do this I used the following line:

source venv_projectname/bin/activate        

To confirm that I was working in the new virtual environment I typed:

pip list        

which confirms that no packages currently exist

Using pip once more I installed all the python libraries that I needed:

pip install panda

pip install xlrd

pip install openpyxls        

and then confirmed their existance by using pip list once more. I then deactivated the virtual environment by:

deactivate        

The result is that all python libraries that I required now existed within folder lib64/python3.x/site-packages. It was at this point that I came unstuck in my previous attempts.

Within my second top level folder I created a sub folder named projectname where I copied the site-packages folder. Lambda expects all packaged libraries to be inside a folder named python so I renamed site-packages to python to get:

Working_Packages > projectname > python > packaged libraries listed here

I then opened another terminal, this time at the projectname level of the Working_Packages folder and removed all unwanted folders by:

rm -r dist-info*        

To zip everything up I used the following:

ZIP="projectname.zip

zip -r ${ZIP} *"        

and then downloaded the file to my local machine.

The final step was to upload the zip file as a lambda layer and add the layer to the lambda function. Importing the library xlrd (import xlrd) a dependancy for the pandas pd.read_excel() function within my lambda function worked.

Happy days




To view or add a comment, sign in

More articles by Gary Juleff

  • AWS Step Function Parameters

    To solve a particular business problem I needed to find a solution to pass input parameters into a step function and…

    1 Comment
  • An Introduction to PowerBI - #10 of 10

    This final article follows ipon from slicers and briefly touches on DAX Expressions DAX Expressions DAX Expressions…

  • An Introduction to PowerBI - #9 of 10

    This article follows on from filtering Slicers Slicers allow users interacting with the Power BI report the ability to…

  • An Introduction to PowerBI - #8 of 10

    This article follows on from maps. Filtering can be applied at various levels within a Power BI report as follows: 1.

  • An Introduction to PowerBI - #7 of 10

    Previously I demonstrated how to create bar charts using the Total Sales measure that we created in an earlier article.…

  • An Introduction to PowerBI - #6 of 10

    After looking at measures in the previous article we now take a look at charts. Charts With charting we can gain…

  • An Introduction to PowerBI - #5 of 10

    In the last article we looked at calculated columns. This time around it's the turn of measures.

  • An Introduction to PowerBI - #4 of 10

    In the last article we discussed Shaping and Cleansing. Today, we move onto Calculated Columns.

  • An Introduction to PowerBI - #3 of 10

    In the second article of this series I discussed merging tables and relationships. In this article I will touch on…

  • An Introduction to PowerBI - #2 of 10

    In the first article I discussed data preparation and load. In this article I will briefly touch on Relationships and…

Explore content categories