Exploring potential business locations using Foursquare APIs & Data Science
An evening view of the Toronto skyline. Image courtesy of Wikipedia

A study to determine an appropriate neighborhood for opening a Vegetarian / Vegan specialty restaurant, based on trending Yoga Studio locations in Toronto, Canada

1. Introduction / Business Problem: 

The restaurant industry is one of the most dynamic industries in the world. To be successful, one needs to find the perfect recipe - not just for the food but for the venture itself.

One of the key initial business decisions is choosing the location. It has to be selected carefully, based on demographics and competition.

This report is based on a data science study undertaken to identify suitable neighborhoods in the city of Toronto, Canada for a Vegetarian / Vegan restaurant.

Hypothesis:

This study is based on the hypothesis that a majority of Yoga practitioners will eventually adopt a plant-based diet, as prescribed by the discipline of yoga. The study does not attempt to prove this relationship; the assumption rests on previously published material, cited in the Data Sources section of this report.

2. Data Sources:

  1. Foursquare API to identify the Trending venues in Toronto

Foursquare provides rich location content drawn from over 100K trusted sources and driven by millions of consumers. For this study, we queried the Foursquare database through API calls.

Foursquare’s explore query provides a list of trending venues based on the location data submitted to it. https://developer.foursquare.com/developer/

2. Wikipedia: List of Neighborhoods in Toronto

https://en.wikipedia.org/wiki/List_of_postal_codes_of_Canada:_M

3. List of Postal Codes in Toronto - Coursera Database

http://cocl.us/Geospatial_data

4. Yoga & Vegetarianism relationship Assumption

Henning Steinfeld, Pierre Gerber, Tom Wassenaar, Vincent Castel, and Mauricio Rosales, Livestock's Long Shadow: Environmental Issues and Options (Food and Agriculture Organization of the United Nations, 2006).

Copyright 2008 by Sharon Gannon. Published by Mandala Publishing. All rights reserved.  

3. Tools & Skills Used:

  1. IBM Watson Studio - Platform used for hosting the entire research data and processing 
  2. Jupyter Notebooks - Environment for executing the code 
  3. Python - Programming Language 
  4. Pandas - Python Library for processing the dataframes
  5. Matplotlib - Python Library for plotting & Charting 
  6. Numpy - Python Library for processing the data as arrays
  7. Scikit-learn (sklearn) - Python Library for Machine Learning algorithms
  8. Folium - Python Library for interactive geo-map plotting & analysis
  9. RESTful APIs - For importing the real time data from the Foursquare database
  10. JSON - API data response format
  11. K-Means - Unsupervised machine learning algorithm for clustering the neighborhood data
  12. BeautifulSoup - Library for scraping the data from web pages
  13. GitHub - Final code repository. All the programming code for this study is available for review and further use at the following GitHub location:

https://github.com/sanjeetmanchanda/Coursera_Capstone/blob/master/Neighburhood_projects.ipynb

4. Methodology:

  1. We started by creating a Jupyter notebook in the IBM Watson Studio environment. This environment supports multiple programming languages.

2. Within the notebook, the initial code was written to scrape the list of neighborhoods in Toronto from a Wikipedia page. 
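
The scraping step can be sketched roughly as follows. This is a minimal illustration rather than the notebook's actual code: it parses a small inline HTML sample instead of fetching the live Wikipedia page, and the table class, column names, and sample rows are assumptions made for the example.

```python
import pandas as pd
from bs4 import BeautifulSoup

# Illustrative stand-in for the downloaded page; the real page layout may differ.
SAMPLE_HTML = """
<table class="wikitable">
  <tr><th>Postcode</th><th>Borough</th><th>Neighborhood</th></tr>
  <tr><td>M1A</td><td>Not assigned</td><td>Not assigned</td></tr>
  <tr><td>M4K</td><td>East Toronto</td><td>The Danforth West, Riverdale</td></tr>
  <tr><td>M5A</td><td>Downtown Toronto</td><td>Regent Park, Harbourfront</td></tr>
</table>
"""


def parse_postal_table(html):
    """Turn the first 'wikitable' in the HTML into a DataFrame."""
    soup = BeautifulSoup(html, "html.parser")
    table = soup.find("table", class_="wikitable")
    rows = table.find_all("tr")
    # The first row holds the column headers.
    header = [th.get_text(strip=True) for th in rows[0].find_all("th")]
    records = []
    for row in rows[1:]:
        cells = [td.get_text(strip=True) for td in row.find_all("td")]
        if len(cells) == len(header):  # skip malformed rows
            records.append(dict(zip(header, cells)))
    return pd.DataFrame(records, columns=header)


neighborhoods = parse_postal_table(SAMPLE_HTML)
# Drop postal codes that have no borough assigned (assumed placeholder text).
neighborhoods = neighborhoods[neighborhoods["Borough"] != "Not assigned"]
neighborhoods = neighborhoods.reset_index(drop=True)
print(neighborhoods)
```

In practice the HTML string would come from an HTTP request to the Wikipedia URL listed under Data Sources.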

3. The next step was to download the latitude & longitude information for all the postal codes in Toronto.

4. We then combined the two datasets by joining them on their common column (neighborhood) and cleaned up the junk data.
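
A minimal sketch of the load, join, and cleanup steps is shown below. It reads an inline CSV in place of the geospatial file, and the key column name (`Postal Code` here), the other column names, and all values are placeholders chosen for illustration; substitute whichever column the two tables actually share.

```python
import io

import pandas as pd

# Stand-in for the downloaded geospatial CSV; coordinates are placeholders.
GEO_CSV = """Postal Code,Latitude,Longitude
M4K,43.68,-79.35
M5A,43.65,-79.36
M1A,43.80,-79.20
M9Z,43.60,-79.50
"""

# Stand-in for the scraped neighborhood table (one row has a missing name).
neighborhoods = pd.DataFrame({
    "Postal Code": ["M4K", "M5A", "M1A"],
    "Neighborhood": ["The Danforth West, Riverdale",
                     "Regent Park, Harbourfront",
                     None],
})

geo = pd.read_csv(io.StringIO(GEO_CSV))

# An inner join keeps only the keys present in both tables.
merged = neighborhoods.merge(geo, on="Postal Code", how="inner")

# Cleanup: drop rows that lack a neighborhood name or coordinates.
merged = merged.dropna(subset=["Neighborhood", "Latitude", "Longitude"])
merged = merged.reset_index(drop=True)
print(merged)
```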

5. We then ran a high-level analysis of the neighborhood data.

6. We generated an interactive map of all the selected neighborhoods of Toronto to see the scope of our study.

7. Foursquare API

The Foursquare API’s explore query returns the trending venues for a given location. We assume that trending venues are the best-known locations in a neighborhood.

The Foursquare API also provides the “Category” of each venue. We pass the latitude and longitude of each neighborhood in scope and get back the list of trending places with their respective categories.

This gives a final list of 1669 trending venues, with their categories, across all the neighborhoods of Toronto.
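
The request and response handling in this step can be sketched as follows. This is an illustration under assumptions, not the notebook's code: it targets the legacy v2 `venues/explore` endpoint, the credentials are placeholders, the `version`, `radius`, and `limit` defaults are guesses, and the sample response is a hand-written stand-in with hypothetical venue names.

```python
from urllib.parse import urlencode

EXPLORE_URL = "https://api.foursquare.com/v2/venues/explore"


def build_explore_url(lat, lng, client_id, client_secret,
                      version="20180605", radius=500, limit=100):
    """Build the explore request URL for one neighborhood's coordinates."""
    params = {
        "client_id": client_id,
        "client_secret": client_secret,
        "v": version,            # API version date (assumed value)
        "ll": f"{lat},{lng}",    # latitude,longitude of the neighborhood
        "radius": radius,        # search radius in metres (assumed value)
        "limit": limit,          # maximum number of venues (assumed value)
    }
    return f"{EXPLORE_URL}?{urlencode(params)}"


def extract_venues(neighborhood, payload):
    """Flatten an explore response into one row per venue."""
    rows = []
    for group in payload.get("response", {}).get("groups", []):
        for item in group.get("items", []):
            venue = item.get("venue", {})
            categories = venue.get("categories") or [{}]
            rows.append({
                "Neighborhood": neighborhood,
                "Venue": venue.get("name"),
                "Venue Category": categories[0].get("name"),
            })
    return rows


# Hand-written stand-in for a JSON response; venue names are hypothetical.
sample_payload = {"response": {"groups": [{"items": [
    {"venue": {"name": "Example Yoga", "categories": [{"name": "Yoga Studio"}]}},
    {"venue": {"name": "Example Cafe", "categories": [{"name": "Coffee Shop"}]}},
]}]}}

url = build_explore_url(43.65, -79.36, "YOUR_CLIENT_ID", "YOUR_CLIENT_SECRET")
venues = extract_venues("Regent Park, Harbourfront", sample_payload)
print(url)
print(venues)
```

In a real run, the URL would be fetched once per neighborhood and the parsed JSON passed to `extract_venues`, with the rows from all neighborhoods collected into one dataframe.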

8. The hypothesis is that the presence of trending Yoga studio venues in a neighborhood indicates a robust potential clientele for a Vegetarian / Vegan restaurant. Upon analysis, we found that these two venue types are categorized in the Foursquare data as 'Yoga Studio' & 'Vegetarian / Vegan Restaurant'.

9. To check that both categories are adequately represented in the data, we counted the venues in each category and drew a box plot of the counts.

The counts of Vegetarian / Vegan restaurants & Yoga studios are 13 & 12 respectively. Both fall in the top quarter of the per-category counts (between the 3rd quartile and the maximum), so the two categories are well represented in the data and are not extreme values.
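
The quartile check behind the box plot can be sketched numerically as below. The toy category counts are invented for illustration and do not reproduce the study's data; the 1.5 x IQR fence is the usual box plot convention for flagging outliers and is assumed here, as is the `Venue Category` column name.

```python
import pandas as pd

# Toy data: number of venues per category (made up for illustration).
toy_counts = {
    "Coffee Shop": 6, "Yoga Studio": 4, "Vegetarian / Vegan Restaurant": 4,
    "Park": 2, "Bakery": 1, "Gym": 1, "Bank": 1, "Pharmacy": 1,
}
venues = pd.DataFrame({
    "Venue Category": [cat for cat, n in toy_counts.items() for _ in range(n)]
})

# Count venues per category, then summarize the distribution of those counts.
category_counts = venues["Venue Category"].value_counts()
q1, median, q3 = category_counts.quantile([0.25, 0.5, 0.75])
upper_fence = q3 + 1.5 * (q3 - q1)  # box plot whisker limit


def in_upper_quartile(category):
    """True if the category's count is at or above Q3 but not an outlier."""
    count = category_counts[category]
    return bool(q3 <= count <= upper_fence)


for category in ["Yoga Studio", "Vegetarian / Vegan Restaurant"]:
    print(category, category_counts[category], in_upper_quartile(category))
```

Passing `category_counts` to Matplotlib's `boxplot` function would draw the corresponding plot.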


10. We did some further data cleanup to keep only the neighborhoods that have trending venues in the 'Vegetarian / Vegan Restaurant' or 'Yoga Studio' categories. We also joined this with the original geo-location data so that it could be plotted on a map.
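
A minimal sketch of this filter-and-join step follows. The two category labels come from the Foursquare data described above; the neighborhood names, coordinates, and the other column names are placeholders.

```python
import pandas as pd

TARGET_CATEGORIES = ["Yoga Studio", "Vegetarian / Vegan Restaurant"]

# Toy venue list and coordinates (placeholders for illustration).
venues = pd.DataFrame({
    "Neighborhood": ["A", "A", "B", "C"],
    "Venue Category": ["Yoga Studio", "Coffee Shop",
                       "Vegetarian / Vegan Restaurant", "Park"],
})
coords = pd.DataFrame({
    "Neighborhood": ["A", "B", "C"],
    "Latitude": [43.65, 43.68, 43.70],
    "Longitude": [-79.36, -79.35, -79.40],
})

# Keep only venues in the two categories of interest.
selected = venues[venues["Venue Category"].isin(TARGET_CATEGORIES)]

# Attach coordinates so each remaining venue can be placed on a map.
selected = selected.merge(coords, on="Neighborhood", how="left")
selected = selected.reset_index(drop=True)
print(selected)
```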

11. Before proceeding, we plotted an interactive map to visualize the neighborhoods that have trending Yoga studios or Vegetarian / Vegan restaurants.

(Blue dots represent neighborhoods with trending yoga studios & red dots represent those with trending vegetarian restaurants.)

12. To prepare this data for mathematical and machine learning evaluation, we applied one-hot encoding to it.
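
One-hot encoding turns the single category column into one 0/1 column per category. A minimal sketch with pandas, using toy rows and assumed column names:

```python
import pandas as pd

# Toy venue rows (placeholders for illustration).
venues = pd.DataFrame({
    "Neighborhood": ["A", "A", "B"],
    "Venue Category": ["Yoga Studio", "Vegetarian / Vegan Restaurant",
                       "Yoga Studio"],
})

# One indicator column per category; each row has exactly one 1.
onehot = pd.get_dummies(venues["Venue Category"], dtype=int)

# Put the neighborhood back as the first column for later grouping.
onehot.insert(0, "Neighborhood", venues["Neighborhood"])
print(onehot)
```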

13. We grouped the rows by neighborhood and took the mean of each one-hot column, which gives the frequency with which each category appears among a neighborhood's trending venues. This shows how the trending venues in each neighborhood are split between the two categories.
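
Because each one-hot column holds only 0s and 1s, its mean per neighborhood is the share of that neighborhood's venues falling in the category. A small sketch with toy values and assumed column names:

```python
import pandas as pd

# Toy one-hot table: one row per venue (placeholders for illustration).
onehot = pd.DataFrame({
    "Neighborhood": ["A", "A", "A", "B", "C", "C"],
    "Vegetarian / Vegan Restaurant": [1, 0, 0, 0, 1, 1],
    "Yoga Studio": [0, 1, 1, 1, 0, 0],
})

# Mean of the indicators per neighborhood = frequency of each category.
frequency = onehot.groupby("Neighborhood").mean().reset_index()
print(frequency)
```

In this toy table, neighborhood A has one restaurant and two studios, so its frequencies are one third and two thirds.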

14. We used this information to apply the K-Means clustering algorithm, splitting the neighborhoods into 3 clusters based on the mean frequency, i.e. how frequently those venue categories appeared in the trending data.
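
The clustering step can be sketched with scikit-learn as below. The frequency values are toy numbers, and the `n_init` and `random_state` settings are assumptions made so the example is repeatable; only the choice of 3 clusters comes from the study.

```python
import pandas as pd
from sklearn.cluster import KMeans

# Toy mean-frequency table (placeholders for illustration).
frequency = pd.DataFrame({
    "Neighborhood": ["A", "B", "C", "D", "E", "F"],
    "Vegetarian / Vegan Restaurant": [1.0, 1.0, 0.0, 0.0, 0.5, 0.5],
    "Yoga Studio": [0.0, 0.0, 1.0, 1.0, 0.5, 0.5],
})

# Cluster on the frequency columns only, not on the neighborhood name.
features = frequency.drop(columns="Neighborhood")
kmeans = KMeans(n_clusters=3, n_init=10, random_state=0).fit(features)

# Attach each neighborhood's cluster label to the table.
frequency["Cluster"] = kmeans.labels_
print(frequency.sort_values("Cluster"))
```

The label numbers assigned by K-Means are arbitrary, so each cluster still has to be inspected to decide which one corresponds to which kind of neighborhood.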

5. Results & Discussion

Upon visual inspection of the interactive map, we can clearly see that the east side of the city has trending Yoga studios but no trending Vegetarian / Vegan restaurants. From the presence of trending Yoga studios, we infer a potential clientele for a Vegetarian / Vegan specialty restaurant.

On the map, red dots mark the neighborhoods with trending Vegetarian / Vegan restaurants and blue dots mark those with trending Yoga Studio locations.

The study results divide the neighborhoods into 3 clusters:

Cluster 1: Low Opportunity Zone: The neighborhoods in this zone already have trending Vegetarian / Vegan restaurants. While there is an existing target clientele in these neighborhoods, new restaurants will face heavy competition and will need to displace existing players.

These neighborhoods may be attractive for businesses looking to open a new Vegetarian / Vegan restaurant in a mature market with active demand. Businesses will need to work hard to earn their market share.

Cluster 2: High Opportunity Zone: The neighborhoods in this cluster have multiple trending Yoga studios but no trending Vegetarian / Vegan restaurants.

Based on the hypothesis that Yogis (Yoga practitioners) eventually adopt a vegetarian diet, these neighborhoods are an untapped market for a Vegetarian / Vegan specialty restaurant.

A business can adopt a strategy of focused, targeted marketing to Yoga studio clients to build its first base of patrons.

Cluster 3: Balanced Zone: Neighborhoods in this zone have both trending Vegetarian / Vegan restaurants and trending Yoga studios, each in good number.

Further research is needed to gauge how saturated this market is and to identify business opportunities for this demographic.

6. Conclusion

Based on our research hypothesis, the following neighborhoods in Toronto present ample opportunity for opening a Vegetarian / Vegan restaurant:

  1. Regent Park, Harbourfront
  2. Thorncliffe Park
  3. The Danforth West, Riverdale
  4. Studio District
  5. North Toronto West, Lawrence Park
  6. University of Toronto, Harbord
  7. Church and Wellesley
  8. Business reply mail Processing Centre

A business can adopt a strategy of focused, targeted marketing to Yoga studio clients to build its first base of patrons.



Footnotes & Citations

1.  Henning Steinfeld, Pierre Gerber, Tom Wassenaar, Vincent Castel, and Mauricio Rosales, Livestock's Long Shadow: Environmental Issues and Options (Food and Agriculture Organization of the United Nations, 2006).

Copyright 2008 by Sharon Gannon. Published by Mandala Publishing. All rights reserved. 

2. https://www.crummy.com/software/BeautifulSoup/bs4/doc/


