Data science for the 99%
Image by Tim Mossholder on Unsplash (https://unsplash.com/photos/qvWnGmoTbik)

I lived for a while in New York City, a place that I love dearly. I am, however, originally from Los Angeles, California, and, for the last 25 years, I have lived in Salt Lake City, Utah. What’s curious is that many of the people I met in NYC knew that I was from “someplace on the other side of the country” and seemed to have a mental map that resembled this picture:

New Yorker Magazine cover showing distorted perception of world outside of Manhattan
"View of the World from 9th Avenue" by Saul Steinberg (https://en.wikipedia.org/wiki/View_of_the_World_from_9th_Avenue)

We all have our own forms of myopia, where we see one small part of the world and its possibilities in great detail, but the rest is just a vague haze. This can happen even when the part outside our awareness is much, much larger than our familiar, comfortable area. One peculiar place that can happen is in data science careers.

As an example, Springboard published a list of the "22 Best Data Science Companies Hiring in 2023." Not surprisingly, the top of the list is dominated by giant tech companies. Here are the top five companies on their list, along with their employee counts:

  1. Microsoft (220,000 employees)
  2. Amazon (1.5 million employees)
  3. EY (365,000 employees)
  4. Google (Alphabet; 150,000 employees)
  5. VMware (38,300 employees)

It almost makes VMware look small by comparison, but, at this exact moment, they have a market capitalization of over $60 billion, so they're definitely big.

Another way to look at data science is with the list "Highest Paying Data Science Jobs in 2023" by Simplilearn. They list familiar job titles like data scientist, machine learning engineer, data architect, and so on. While all of these are important jobs and they pay well, they also represent a narrow view of the career possibilities for people interested in working with data. (And they're definitely not the only well-paying jobs that are potentially available to people with training in data work.) Really, it starts to feel like a gold-plated hall of mirrors. You see a lot of very shiny things, but you don’t necessarily see very far.

Hall of Mirrors at the Palace of Versailles
The Hall of Mirrors at the Palace of Versailles in France (http://www.historylines.net/img/versailles/La_Galerie_des_Glaces.jpg)

So, it may be time to update your map of the data science career landscape, and see what else is out there for you. To help with this, I'll share five recommendations.

1. See the 99%

Public viewfinder looking over field and sunset
Image by Matt Noble on Unsplash (https://unsplash.com/photos/BpTMNN9JSmQ)

This is an exercise that involves seeing what's right in front of you. According to data from the US Small Business Administration (and additional data from the US Census, the US Chamber of Commerce, and Forbes), there are over 33 million businesses in the United States, but fewer than 21,000 have 500 or more employees, which is the cutoff in the US for "small business." That's just 0.06% of all businesses – it's not even visible in the chart below. On the other hand, small businesses, or those that have fewer than 500 employees, account for over six million business establishments in the US, or 18%. And, finally, solo businesses with no employees beyond the owner are by far the most common, accounting for over 27 million businesses, or nearly 82%.

Bar chart of number of businesses in US by business size
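
For readers who want to check the arithmetic, here is a minimal sketch in Python that reproduces the percentages quoted above. It treats the rounded figures from the text ("over 33 million," "fewer than 21,000," and so on) as if they were exact, so the results are approximations; the variable and function names are just illustrative choices.

```python
# Rounded counts quoted in the text; the "over" and "fewer than" figures
# are treated as exact here, which is an assumption made for illustration.
TOTAL_BUSINESSES = 33_000_000
COUNTS = {
    "large (500+ employees)": 21_000,
    "small (fewer than 500 employees)": 6_000_000,
    "solo (owner only)": 27_000_000,
}


def share_percent(count, total):
    """Return count as a percentage of total."""
    return 100 * count / total


shares = {label: share_percent(n, TOTAL_BUSINESSES) for label, n in COUNTS.items()}

for label, pct in shares.items():
    print(f"{label}: {pct:.2f}%")
```

The same calculation is a single division per row in a spreadsheet, which is very much in the spirit of what follows.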

Then again, large businesses employ more people per firm, so it's also helpful to look at the total number of employees in each category. Large businesses collectively have 30 million employees, which is a little more than the solo businesses (19% and 17%, respectively). But the small businesses employ more than both of those categories put together: nearly 100 million people in America work for small businesses, or almost two-thirds of the total. This is a massive group, and a place where an aspiring data scientist can make a meaningful contribution.

[It should also be clear by this point that I wasn't quite accurate when I called this newsletter "Data Science for the 99%." Really, it should be "Data Science for the 99.94%."]

2. Learn about the goals of the 99%

Image by Maranda Vandergriff on Unsplash (https://unsplash.com/photos/fZBwUGlKbO8)

The work that makes headlines in data science – the extraordinary advances in machine learning and artificial intelligence, for example – is associated with tech giants like Google, Amazon, and Microsoft. (Paradoxically, the company that produced the world-changing ChatGPT, OpenAI, is apparently still a small business; Wikipedia and other sources report that OpenAI has 375 employees.) But data work is still critical to the six million small businesses and 27 million solo businesses in the United States. Think of some of the small businesses around you:

  • Local bakeries, breweries, cafés, and restaurants
  • Doctors, lawyers, and accountants with their own practices
  • Regional architectural firms, construction companies, and subcontractors
  • Social media marketing and event planning specialists
  • Professional dance companies, music ensembles, and theater companies
  • Landscapers, plumbers, electricians, and HVAC technicians

(For a more complete list see "United States Small Business Economic Profile" by the US Small Business Administration, which ranks various categories by number of firms and employment, including the percent of firms in a category that are small businesses. For example, 86% of firms in "Agriculture, Forestry, Fishing and Hunting" are small businesses.)

These businesses are designed to provide a sustainable living for their owners and employees. That goal may sound painfully obvious, as though there were no other options, but it stands in stark contrast to the number of companies that are oriented towards rapid growth and splashy IPOs for their investors. (Curiously, many of the "growth-oriented" companies that are in the news have never been profitable, but rely on continued rounds of investor funding.) Both kinds of businesses need to track their progress, although they will use different metrics. They both need to calculate their ROI, or return on investment, but growth-oriented businesses are likely to focus on investor-facing actions, while small businesses will focus more on customers. In particular, small businesses need to know how well they are serving their current customers and clients, as well as how they can reach out to new ones in a sustainable way, which means without the dramatic expansions in investment or headcount that growth-oriented startups rely on.

In my own work with small businesses and nonprofits, their major concerns included questions like:

  • What days and times should they be open?
  • How can they simplify record-keeping for their employees and volunteers?
  • How did they connect with their best donors and how can they find more?
  • What products and services were their clients most interested in?
  • How well did they help their audience meet a wide range of challenges in their lives?

These are not necessarily complicated questions in need of high-powered machine learning. They are fundamentally simple, but they are also the most important questions that these organizations had. The questions all had a direct impact on how they operated and, by extension, how well they could sustain their work. They are data-driven questions, but not everyone knows how to work with data to answer them. Learning how to address these goals, as opposed to, say, developing a new data product, can make your work crucial to the 99% of businesses around you.
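
To give a sense of how simple these questions can be, here is a minimal sketch in Python that addresses the first question in the list above: which days and times are busiest. The transaction timestamps are invented for illustration, and the function name is hypothetical; a pivot table in a spreadsheet would give the same answer.

```python
from collections import Counter
from datetime import datetime

# Hypothetical sales log, one timestamp per transaction (invented data).
TRANSACTIONS = [
    "2023-05-01 09:15",
    "2023-05-05 12:30",
    "2023-05-06 10:05",
    "2023-05-06 11:40",
    "2023-05-06 12:10",
    "2023-05-07 12:45",
    "2023-05-12 12:20",
]

DAY_NAMES = ["Monday", "Tuesday", "Wednesday", "Thursday",
             "Friday", "Saturday", "Sunday"]


def count_by_day_and_hour(transactions):
    """Tally transactions by weekday name and by hour of the day."""
    days = Counter()
    hours = Counter()
    for stamp in transactions:
        when = datetime.strptime(stamp, "%Y-%m-%d %H:%M")
        days[DAY_NAMES[when.weekday()]] += 1
        hours[when.hour] += 1
    return days, hours


days, hours = count_by_day_and_hour(TRANSACTIONS)
print("Busiest day:", days.most_common(1)[0])
print("Busiest hour:", hours.most_common(1)[0])
```

With a real log, the counts for the quiet days and hours are often just as useful as the busiest ones, since they suggest when it may not be worth staying open.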

3. Adapt your methods to the 99%

Image by Markus Winkler on Unsplash (https://unsplash.com/photos/IrRbSND5EUc)

In 1954, Kitty Kallen sang "little things mean a lot." In working with data for small businesses, this same principle is true. The datasets from small businesses nearly always fit neatly into a single spreadsheet file. In fact, I can only think of one client I worked with who used anything else. In that case, it was a simple matter of taking their two Salesforce databases, doing an inner join, and then saving the data in a spreadsheet file. In an earlier edition of this newsletter entitled "Minimalism in data work," I encouraged people to start with simple tools and only move on as needed. Specifically, I said:

  • First, spreadsheets
  • Second, apps
  • Third, languages

The idea is to use the "minimally-sufficient tool," or the simplest thing that gets the job done properly and with minimal difficulty. With data, that typically means spreadsheets, unless and until it is no longer easy to do what you need to do. Then move to apps like SPSS or jamovi, unless and until it is no longer easy to do what you need to do there. Only at that point would you need to move on to a data-focused programming language like R or Python.
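
For the rare case like the one mentioned above, where two exports need an inner join before the result can live in a spreadsheet, here is a minimal sketch using only Python's standard library. The table contents, the column names, and the `contact_id` key are all hypothetical; this is not the actual client data or a Salesforce API, just two small CSV exports joined on a shared key.

```python
import csv
import io

# Hypothetical exports; column names and values are invented for illustration.
CONTACTS_CSV = """contact_id,name
1,Ana
2,Ben
3,Cy
"""

DONATIONS_CSV = """contact_id,amount
1,50
3,20
3,75
4,10
"""


def read_rows(text):
    """Parse CSV text into a list of dictionaries, one per row."""
    return list(csv.DictReader(io.StringIO(text)))


def inner_join(left, right, key):
    """Keep only the row pairs whose key value appears in both tables."""
    index = {}
    for row in right:
        index.setdefault(row[key], []).append(row)
    joined = []
    for row in left:
        for match in index.get(row[key], []):
            joined.append({**row, **match})
    return joined


def to_csv(rows):
    """Serialize rows as CSV text that any spreadsheet can open."""
    if not rows:
        return ""
    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=list(rows[0].keys()),
                            lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return out.getvalue()


joined = inner_join(read_rows(CONTACTS_CSV), read_rows(DONATIONS_CSV),
                    "contact_id")
print(to_csv(joined))
```

Rows that appear in only one table (Ben, and the donation from contact 4) are dropped, which is what distinguishes an inner join from the other join types. Once the joined file is saved as CSV, the rest of the work can go back to a spreadsheet.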

There are a few important reasons for this recommendation when it comes to working with the 99%:

  • Spreadsheets are universally available. This is a tool that your clients will already have access to and will be familiar with. Also, when the data is properly prepared (for example, in the "tidy data" style advocated by Hadley Wickham) and saved in CSV files (the "comma-separated values" format that serves as a generic standard for spreadsheets), then nearly any data app or program in the world can read it with minimal effort.
  • Spreadsheets require you to keep your work simple. Spreadsheets are great for organizing data, for sorting variables, for computing descriptive statistics, and for creating bar charts and line charts. More complicated work may be possible, but it can be difficult, and so the basic approaches are reinforced. In my experience, this small set of operations will answer at least 90% of the questions that small businesses may have.
  • Spreadsheets facilitate communication. By using a common format on a universally-accessible tool, your clients will be able to see everything you did. In addition, by relying on spreadsheets to create your charts, your choices are limited to the options that are typically most effective. This enforced simplicity promotes clear, concise communication, and that is always good. (As an example, I used Google Sheets to make the two bar charts that appeared earlier in this newsletter. It was a quick process, and the charts are easy to understand.)
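
As a rough illustration of the first two points, here is a minimal sketch of what tidy data in CSV form looks like (one row per observation, one column per variable) and how little it takes to get descriptive statistics from it. The products and numbers are invented, and the `summarize` function is a hypothetical helper; the same summary could be produced with a pivot table in any spreadsheet.

```python
import csv
import io
import statistics
from collections import defaultdict

# Hypothetical tidy data: one row per observation, one column per variable.
TIDY_CSV = """product,week,units_sold
bread,1,40
bread,2,44
bread,3,48
cake,1,10
cake,2,16
cake,3,13
"""


def summarize(text, group_col, value_col):
    """Compute count, mean, and median of value_col for each group."""
    groups = defaultdict(list)
    for row in csv.DictReader(io.StringIO(text)):
        groups[row[group_col]].append(float(row[value_col]))
    return {
        name: {
            "n": len(values),
            "mean": statistics.mean(values),
            "median": statistics.median(values),
        }
        for name, values in groups.items()
    }


summary = summarize(TIDY_CSV, "product", "units_sold")
for product, stats in summary.items():
    print(product, stats)
```

Because the file is tidy, the same few lines work for any grouping column and any numeric column, with no reshaping needed first.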

Businesses are best served when you can give them actionable insights. In small businesses, that can often be just a "yes" or "no" to a critical question. In most cases, you can provide a data-driven answer to those questions using simple tools and procedures.

(FREE COURSE: For more information on actionable insights, see my new LinkedIn Learning course "Actionable Insights and Business Data in Practice." This course is a hands-on approach to making your data work directly useful to your clients. Use this link and it will be free to you for 24 hours.)

4. Connect with the 99%

Image by Antenna on Unsplash (https://unsplash.com/photos/ohNCIiKVT1g)

Fortunately, finding the 99% is easy because they're everywhere. Every city and every town has small businesses that could benefit from thoughtful data work. An easy first step is to connect with your local Small Business Development Center or your local Chamber of Commerce to describe your skills and ways that you think you could help. They can give you great ideas to connect with people who would value your work.

It also helps to connect with groups that focus on particular topics. For example, I live in Utah, and I have connected with the Utah Nonprofits Association and the Utah Cultural Alliance. I have also presented on data topics at the Mountain West Arts Conference, where I formed some wonderful, lasting connections.

Connecting through networking events, such as Meetup groups in your area, can also be a productive way of finding the 99%. And, finally, there is the opportunity to connect with local nonprofits, most of which are small businesses, through service events. Here in Utah, I organized several editions of our "Data Charrette," where we connected data-savvy volunteers with local nonprofits in two-day, hackathon-like events. These events were great experiences where volunteers made important connections and honed their data skills. We'll launch a modified one-day version of the event again later this fall, with the goal of doing local and remote versions at least twice a year.

5. Enable the 99%

Image by Amy Hirschi on Unsplash (https://unsplash.com/photos/K0c8ko3e6AA)

Finally, one of the best things you can do for the small businesses you work with is help them develop their own data skills for ongoing work. You can do this by clearly organizing your work in an easy-to-follow format, and providing reusable templates, such as spreadsheets and presentations. You can share training materials with them that guide them in their own work. (For example, I have a range of courses at LinkedIn Learning, which is a subscription service, and free courses through my own company, datalab.cc, but many other options are available.)

It's also worth considering working as a full-time member of one of the small businesses near you. They may not need a full-time data scientist, but your ability to think with data, as well as your ability to learn new, vital skills can make you a valuable addition to any company. (As a personal note, I have four degrees in research psychology – BS, MA, MPhil, and PhD – and I am employed as a full-time faculty member at a university. However, when it comes to the work that I am actually paid for, which is teaching statistics, I rely on the empirical and interpretative mindset that I developed in graduate school, but the technical skills that I use I have learned on my own since then.)

Your data skills and insights are valuable, not just to the tech giants but to nearly every business and organization around you. Once you learn to better understand those organizations, what their goals are, and how you can adapt your data expertise to meet those goals, then you can fulfill the promise of your training and your passion in new and unexpected ways.


Thanks for joining me here. And remember, sharing is caring! Follow me on LinkedIn and share this newsletter with a friend who you think would benefit from it.
