War on Talent: Identifying talent from your talent pool with Python

In my experience, organizations rarely make good use of their talent pools. They amass collections of CVs from hopeful candidates, yet once the advertised position is filled, those CVs are very often ignored. These same dormant CVs may belong to candidates with the very skill set you are now searching for.

Admittedly, reading through each CV manually is time-consuming. It is often cheaper and easier to pay for advertising and then read only the CVs submitted for that specific role. But what if we could read through our whole collection of CVs in mere moments, mining them for keywords that relate to the role we are currently hiring for? While there are commercial products that do this, we can simply use Python, an open-source language that is one of the most commonly used in data science.

Not everyone is familiar with Python, but if you are in a medium-sized organization, chances are very good that one of your HR analysts, or certainly some of your IT crew, will know it. The completed code is below, but what I really want to highlight is that you can mine that stack of CVs your organization has collected today and find candidates who match your skill requirements. One of the advantages of Python is that it is readable by humans. Just over halfway down the code you will see an object called Skills_sought: a list containing the skills that relate to the vacancy you are looking to fill. Simply insert the skills you are looking for into this list. The code reads through a CV (identified in the line pdfFileObj = open('brendan resume example.pdf', 'rb')), looks for those keywords, and then gives you the following output:

['machine learning', 'reporting', 'analysis', 'Excel', 'organisational psychology', 'workforce strategy', 'HR', 'Python']

Match to skills sought 88.88 percent match   
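To make the matching and the percentage concrete, here is a minimal sketch that runs the same idea on a short piece of text instead of a PDF. The sample sentence, the shortened skills list and the helper name find_skills are hypothetical and only there for illustration.

```python
# Hypothetical sample text standing in for the text extracted from a CV.
sample_cv = "Experienced HR analyst skilled in reporting, Python and workforce strategy."

# A shortened, illustrative list of skills for the vacancy.
skills_sought = ["machine learning", "reporting", "HR", "Python", "workforce strategy"]


def find_skills(cv_text, skills):
    """Return the skills that appear in cv_text, in the order they were listed.

    A skill counts as found if it appears in the lower-cased text (so
    lower-case keywords match regardless of case) or exactly as written in
    the original text (so "HR" and "Python" match only with those capitals).
    """
    lowered = cv_text.lower()
    return [skill for skill in skills if skill in lowered or skill in cv_text]


found = find_skills(sample_cv, skills_sought)

# Share of the sought skills that were found, as a percentage.
match_percent = len(found) / len(skills_sought) * 100

print(found)
print(f"Match to skills sought {match_percent:.2f} percent match")
```

Here four of the five skills are found, so the match is 4 / 5 × 100 = 80 percent. By the same arithmetic, eight skills found out of nine sought gives roughly 88.89 percent.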

The full code is below. I built it to read through a single CV in PDF format. Useful additions would include iterating through a list of CVs and writing the results to an Excel spreadsheet. Your friendly HR analyst or IT person will be able to do this for you; my goal was to share the core of the code so that interested parties can read through it and implement it quickly.

import PyPDF2

# Open the CV as a binary file; replace the filename with the path to your own PDF.
pdfFileObj = open('brendan resume example.pdf', 'rb')

# These names are from recent PyPDF2 releases (2.0 and later). Older releases
# used PdfFileReader, getNumPages(), getPage() and extractText() instead.
pdfReader = PyPDF2.PdfReader(pdfFileObj)

# The number of pages in the document sets the limit for the page counter below.
pages = len(pdfReader.pages)

# PyPDF2 starts its page numbering at zero, hence count also starts at zero.
count = 0
text = ""

# The following while statement iterates through the PDF, reading each page.
while count < pages:
    pageObj = pdfReader.pages[count]
    count += 1
    text += pageObj.extract_text()

# Replace line breaks with spaces so that a phrase split across two lines
# (such as "machine learning") can still be matched.
text = text.replace("\n", " ")

pdfFileObj.close()

# Having read the text with the above code, we can now search it for the skills we want.

keyword_found = []  # an empty list where we will store the keywords found in the CV

Skills_sought = ["machine learning", "reporting", "analysis", "HR", "organisational psychology", "Python", "Excel", "workforce strategy", "employment relations"]

CVlower = text.lower()
CV = text

# Lower-case keywords are matched against the lower-cased CV, so their case is ignored.
# Keywords containing capitals, such as "HR", only match the original text exactly.
for skill in Skills_sought:
    if skill in CVlower or skill in CV:
        keyword_found.append(skill)

print(keyword_found)

print("Match to skills sought", len(keyword_found) / len(Skills_sought)*100,"percent match")
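The two additions mentioned earlier, iterating through several CVs and saving the results for a spreadsheet, could be sketched roughly as follows. This is an illustration under stated assumptions rather than a finished tool: the function names, the column names and the folder layout are all hypothetical, the PDF reader assumes PyPDF2 2.0 or later, and the results are written as a CSV file (which Excel can open) rather than a native Excel workbook.

```python
import csv
from pathlib import Path


def read_pdf_text(path):
    """Extract the text of one PDF. Assumes PyPDF2 2.0 or later is installed."""
    import PyPDF2  # imported here so the rest of the sketch runs without it

    with open(path, "rb") as pdf_file:
        reader = PyPDF2.PdfReader(pdf_file)
        # extract_text() can return an empty result for image-only pages.
        pages = [page.extract_text() or "" for page in reader.pages]
    return " ".join(pages).replace("\n", " ")


def find_skills(cv_text, skills):
    """Same matching rule as the main code: lower-cased text or exact text."""
    lowered = cv_text.lower()
    return [skill for skill in skills if skill in lowered or skill in cv_text]


def score_cvs(folder, skills, read_text=read_pdf_text):
    """Score every .pdf file in folder and return rows sorted by best match."""
    rows = []
    for pdf_path in sorted(Path(folder).glob("*.pdf")):
        found = find_skills(read_text(pdf_path), skills)
        rows.append({
            "cv": pdf_path.name,
            "skills_found": ", ".join(found),
            "match_percent": round(len(found) / len(skills) * 100, 2),
        })
    rows.sort(key=lambda row: row["match_percent"], reverse=True)
    return rows


def write_results(rows, out_path):
    """Write the scored rows to a CSV file that Excel can open."""
    with open(out_path, "w", newline="", encoding="utf-8") as out_file:
        writer = csv.DictWriter(
            out_file, fieldnames=["cv", "skills_found", "match_percent"]
        )
        writer.writeheader()
        writer.writerows(rows)
```

Assuming your CVs sit in a folder called cv_folder (a made-up name), calling write_results(score_cvs("cv_folder", Skills_sought), "cv_matches.csv") would produce one row per CV, with the strongest matches at the top. Keeping the PDF reading in its own function means it can be swapped out, for example for a reader that handles Word documents.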
