Mining social media data using Python (2) - Make API request with Python

Step 1. Before making an API request: register an application and get a client key

Every social media platform requires you to register your client application before you can interact with its API, so the first step is client registration. Using LinkedIn as an example, you can visit its Developers page to create your app. The "Create an app" steps are simple: provide your app name and LinkedIn page URL, upload an app logo, and click "okay".

Once you have created your first app, you can see the app details. Click "Auth" to get the information you need.

  1. Client ID & Secret: The ID and password of your client app, required during the OAuth process.
  2. Access Token Lifetime: How long an issued token remains valid. Set it according to how each app is used.
  3. Authorized redirect URLs: The redirect URLs that may be used during the OAuth grant.
  4. OAuth 2.0 scopes: The scopes determine which APIs can be called on the platform. Be aware that social platforms do not provide full scopes by default. For example, LinkedIn provides only three limited scopes to your application when it is created. If you need more scopes, you have to submit a request and get approval from LinkedIn.
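
The values from the Auth tab are easier to manage if they stay out of the script itself. Here is a minimal sketch of one way to do that, assuming hypothetical environment variable names such as LINKEDIN_CLIENT_ID and LINKEDIN_CLIENT_SECRET (these names are my own choice, not something LinkedIn defines). The defaults for the redirect URL and scope are the ones used later in this article.

```python
import os


def load_client_config(env=os.environ):
    """Read the app settings from environment variables (variable names are hypothetical)."""
    client_id = env.get("LINKEDIN_CLIENT_ID")
    client_secret = env.get("LINKEDIN_CLIENT_SECRET")
    if not client_id or not client_secret:
        raise RuntimeError("Set LINKEDIN_CLIENT_ID and LINKEDIN_CLIENT_SECRET first")
    return {
        "client_id": client_id,
        "client_secret": client_secret,
        # Defaults below match the values used in this article
        "redirect_uri": env.get("LINKEDIN_REDIRECT_URI", "http://localhost/"),
        "scope": env.get("LINKEDIN_SCOPE", "r_liteprofile"),
    }


# Example with an explicit dictionary instead of the real environment
config = load_client_config({"LINKEDIN_CLIENT_ID": "abc", "LINKEDIN_CLIENT_SECRET": "xyz"})
print(config["redirect_uri"], config["scope"])
```

Passing a plain dictionary, as in the last lines, makes the function easy to try without touching your real environment.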

Step 2. Understand the OAuth grant type supported

In general, OAuth 2.0 supports four types of authorization grant (for details, see this documentation as a reference):

  1. Authorization Code**: The most commonly used type, because it is optimized for server-side applications, where the source code is not publicly exposed and the confidentiality of the Client Secret can be maintained. This method also gives the Resource Owner the best experience when interacting through a web browser. (The "iHerb Login" mentioned in the previous article also uses this grant type.) ** This article focuses on how to complete OAuth 2.0 with this grant type.
  2. Implicit: This grant is similar to Authorization Code in that it is a redirection-based flow; however, the confidentiality of the access token is not guaranteed. The access token is given to the user-agent to forward to the application, so it may be exposed to the user and to other applications on the user’s device. This flow also does not authenticate the identity of the application, and relies on the redirect URI (that was registered with the service) to serve this purpose.
  3. Resource Owner Password Credential: This is a simple grant type that has no redirect flow. Users provide their service credentials (username and password) directly to the application, which uses them to obtain an access token from the service. It should only be used if the application is trusted by the user (e.g. it is owned by the service, or it is the user’s desktop OS).
  4. Client Credential: This grant type gives an application a way to access its own service account. It is a direct way to access your own account when the resource owner is also the owner of the application.
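
To make the difference between the four grants more concrete, here is a small sketch that lists the "response_type" and "grant_type" parameter values the OAuth 2.0 specification (RFC 6749) assigns to each of them. The dictionary layout and the helper name uses_redirect are my own, for illustration only; a platform may support only some of these grants, so always check its documentation.

```python
# Parameter values defined by the OAuth 2.0 specification (RFC 6749) for each grant.
# None means the grant does not use that request at all.
GRANTS = {
    "Authorization Code": {"response_type": "code", "grant_type": "authorization_code"},
    "Implicit": {"response_type": "token", "grant_type": None},
    "Resource Owner Password Credential": {"response_type": None, "grant_type": "password"},
    "Client Credential": {"response_type": None, "grant_type": "client_credentials"},
}


def uses_redirect(grant_name):
    """A grant is redirection-based when it starts with an authorization request."""
    return GRANTS[grant_name]["response_type"] is not None


for name, params in GRANTS.items():
    print(name, params, "redirect-based:", uses_redirect(name))
```

The two redirection-based grants are the ones that send the user to a login page in the browser, which is exactly what Step 3 below does with "response_type" set to "code".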

Step 3. Use Python to get Linkedin API access via grant type "Authorization Code"

>> pip install requests

  • It is time to open Python IDLE and start writing the script. Before you start, make sure the "requests" package is installed; if it is not, install it with "pip". The "json" module ships with Python's standard library, so it does not need to be installed.
  • Also, I have set my redirect URL to "http://localhost", which returns the request to a web page on my own machine. To serve your own web page, you can enable the IIS service in Windows. (Link for IIS enablement: https://teckangaroo.com/enable-iis-windows-10/)

3.1. Send GET request to Linkedin and return the authentication link

import requests


# Step 1 - send a GET request to obtain the authentication link
def get_auth_link():
    URL = "https://www.garudax.id/oauth/v2/authorization"
    client_id = "YOUR_CLIENT_ID"  # replace with your own client ID
    redirect_uri = 'http://localhost/'
    scope = 'r_liteprofile'
    PARAMS = {'response_type': 'code', 'client_id': client_id, 'redirect_uri': redirect_uri, 'scope': scope}
    r = requests.get(url=URL, params=PARAMS)
    return_url = r.url  # URL of the response, with the query parameters encoded
    print('Please copy the URL and paste it into a browser to get the authorization code')
    print('')
    print(return_url)

get_auth_link()

  • Our first function, "get_auth_link()", makes a GET request to LinkedIn for a link that lets the resource owner log in. Making a GET request is easy in Python: you just use the get() method (i.e. requests.get()).
  • The LinkedIn documentation lists the URL and parameters needed for the GET request.
  • Define all the variables (URL, client_id, redirect_uri, scope, PARAMS) in the script, and pass them to the get() method.
  • After the request is made, LinkedIn returns a URL that allows users to log in to the service. So, at the end of "get_auth_link()", I print the returned URL, which will be used to access the login page.
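
As a variation, the link can also be assembled locally without sending any request, because it is just the authorization URL plus the encoded parameters. The sketch below is one possible way to do that with the standard library. The function name build_auth_url is my own, and the extra "state" parameter is an optional anti-forgery value from the OAuth 2.0 specification that the script above does not use, so treat it as an assumption and check the platform's documentation before relying on it.

```python
import secrets
from urllib.parse import urlencode


def build_auth_url(base_url, client_id, redirect_uri, scope, state=None):
    """Assemble the authorization link locally, without sending an HTTP request."""
    # 'state' is optional in OAuth 2.0; a random value is generated when none is given
    state = state or secrets.token_urlsafe(16)
    params = {
        "response_type": "code",
        "client_id": client_id,
        "redirect_uri": redirect_uri,
        "scope": scope,
        "state": state,
    }
    return base_url + "?" + urlencode(params), state


url, state = build_auth_url(
    "https://www.garudax.id/oauth/v2/authorization",  # same URL as in get_auth_link()
    "YOUR_CLIENT_ID",  # placeholder, replace with your own client ID
    "http://localhost/",
    "r_liteprofile",
)
print(url)
```

If you use "state", keep the returned value so you can compare it with the "state" parameter that comes back on the redirect.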

3.2. Copy the return URL and paste it into a web browser

  • You will be prompted to log in to LinkedIn and click the "Allow" button to grant permission. LinkedIn then redirects you back to your website’s URL with an authorization code embedded in the URL.
  • Check the redirected URL in the browser and find the authorization code in the "code" parameter. Copy the value and move to the next step.
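
If you would rather not pick the code out of the address bar by eye, you can paste the whole redirected URL into a small helper. This is a minimal sketch using only the standard library; the function name extract_auth_code and the example code value are made up for illustration. It also checks for the "error" parameter that OAuth 2.0 uses when the user denies access.

```python
from urllib.parse import urlparse, parse_qs


def extract_auth_code(redirected_url):
    """Return the value of the 'code' query parameter from the redirected URL."""
    query = parse_qs(urlparse(redirected_url).query)
    if "error" in query:
        # OAuth 2.0 reports a denied or failed authorization via an 'error' parameter
        raise ValueError("Authorization failed: " + query["error"][0])
    if "code" not in query:
        raise ValueError("No 'code' parameter found in the URL")
    return query["code"][0]


# The code value here is made up for illustration
print(extract_auth_code("http://localhost/?code=EXAMPLE_CODE_123"))
```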

3.3. Make a POST request to exchange the Authorization Code for an Access Token

import requests

def get_access_token():
    headers = {'Content-Type': 'application/x-www-form-urlencoded', 'User-Agent': 'OAuth gem v0.4.4'}
    AUTH_CODE = "YOUR_AUTHORIZATION_CODE"  # replace with the code you got from the URL in Step 3.2
    ACCESS_TOKEN_URL = 'https://www.garudax.id/oauth/v2/accessToken'
    client_id = "YOUR_CLIENT_ID"  # replace with your client ID
    client_secret = "YOUR_CLIENT_SECRET"  # replace with your client secret
    redirect_uri = 'http://localhost/'
    PARAM = {'grant_type': 'authorization_code',
      'code': AUTH_CODE,
      'redirect_uri': redirect_uri,
      'client_id': client_id,
      'client_secret': client_secret}
    response = requests.post(ACCESS_TOKEN_URL, data=PARAM, headers=headers, timeout=600)
    data = response.json()  # parse the JSON body of the response
    print(data)
    access_token = data['access_token']
    return access_token

# Keep the returned token so it can be used for the API call in Step 4
access_token = get_access_token()

  • Our second function, "get_access_token()", sends a POST request to LinkedIn with our authorization code (obtained in Step 3.2) and gets the access token. The documentation specifies the URL and parameters we need; they appear as ACCESS_TOKEN_URL and PARAM in the script.
  • Furthermore, the documentation reminds us that the POST request requires a Content-Type header of "application/x-www-form-urlencoded". That is why we add "headers" to the script this time.
  • The POST request returns a JSON response, so we use the .json() method to parse it and then print the result with "print(data)".
  • This JSON contains two fields: access_token and expires_in. Extract the token with data['access_token'] and return it at the end of the function.
  • Done! We have gone through the whole OAuth procedure and obtained our access token. Let's use it to make a data request.
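
One thing the function above does not handle is a failed request: if the authorization code is wrong or already used, the JSON has no access_token field and data['access_token'] raises a KeyError. Here is a minimal sketch of a more defensive way to read the response, which also turns expires_in (a number of seconds) into an expiry time. The function name parse_token_response and the sample values are my own, and the "error" and "error_description" fields are the ones the OAuth 2.0 specification defines for error responses.

```python
from datetime import datetime, timedelta


def parse_token_response(data, now=None):
    """Return (access_token, expiry time) from the parsed JSON of the token request."""
    if "access_token" not in data:
        # OAuth 2.0 error responses carry 'error' and optionally 'error_description'
        raise RuntimeError("Token request failed: {}".format(
            data.get("error_description", data.get("error", data))))
    now = now or datetime.now()
    expires_at = now + timedelta(seconds=int(data["expires_in"]))
    return data["access_token"], expires_at


# Made-up values, shaped like the two fields described above
token, expires_at = parse_token_response({"access_token": "EXAMPLE", "expires_in": 5184000})
print(token, expires_at)
```

Knowing the expiry time lets you decide when to run the OAuth steps again instead of waiting for an API call to fail.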

Step 4. Make our 1st API call - Getting the owner's personal information from Profile API

  • Before making the request, visit the Profile API documentation to understand the request structure and the scope required. For example, let's look at Retrieve Current Member's Profile:
  • Permissions: This tells us whether the scope of our access token allows us to call this API. Our access token is authorized to call it because our scope includes "r_liteprofile".
  • Request: This request is straightforward. No parameters are needed; we only need the URL to make the GET request.
  • Header: The tricky part is that the access token from Step 3.3 is not passed as a parameter of the GET request. Instead, it must be placed in the "Authorization" header, with the value "Bearer " followed by the access token.

def get_profile(access_token):
    URL = "https://api.linkedin.com/v2/me"
    # The access token goes in the "Authorization" header, not in the query parameters
    headers = {'Content-Type': 'application/x-www-form-urlencoded','Authorization':'Bearer {}'.format(access_token),'X-Restli-Protocol-Version':'2.0.0'}
    response = requests.get(url=URL, headers=headers)
    print(response.json())

# access_token is the value returned by get_access_token() in Step 3.3
get_profile(access_token)

  • Our final function, "get_profile()", takes an argument called "access_token", which lets us pass in the token returned by our previous function, "get_access_token()". It prints the JSON response returned by the Profile API.
  • Great! We have successfully connected with the Profile API and retrieved our personal information from Linkedin.
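
Since every later API call needs the same "Authorization" header, it can help to build it in one place. The sketch below is only a suggestion: the helper name bearer_headers is my own, and the commented usage lines assume the "requests" package and the access_token variable from the steps above.

```python
def bearer_headers(access_token):
    """Build the headers for an authenticated call; the token goes in 'Authorization'."""
    return {
        "Authorization": "Bearer {}".format(access_token),
        "X-Restli-Protocol-Version": "2.0.0",  # same value as in get_profile()
    }


# Made-up token, just to show the resulting header
print(bearer_headers("EXAMPLE_TOKEN"))

# Possible usage with the requests package (needs a real token, so it is left commented out):
# response = requests.get("https://api.linkedin.com/v2/me", headers=bearer_headers(access_token), timeout=30)
# response.raise_for_status()  # raises an exception on 4xx/5xx, e.g. when the token has expired
# print(response.json())
```

Calling raise_for_status() before reading the JSON makes an expired token or a missing scope show up as a clear exception rather than as an unexpected error body.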

Summary

In these two articles, I have used LinkedIn as an example to explain what OAuth 2.0 is and to illustrate how to extract data via its RESTful API. In fact, this procedure can be applied to any platform that supports OAuth 2.0 and a RESTful API. The procedure is:

  • Register a client application on the platform's developer website.
  • Read the API documentation to check which OAuth grant types are supported and what the sample procedure looks like.
  • Use Python to get an access token. If the grant type is "Authorization Code", follow the instructions in "Step 3" of this article.
  • Use the access token to make API calls.

I hope this article helps you understand how to extract the data you want from a social platform. If you want to review the concept of OAuth 2.0, please read my previous article: Mining social media data using Python (1) - REST API & OAuth

Also, you can download my sample python script HERE for reference. To use the script, remember to enter your own client ID and secret in the script.

What's next?

Actually, I wanted to share the data available in the LinkedIn API for analysis. However, LinkedIn applies strong restrictions to personal pages, so we cannot analyze our page statistics without LinkedIn's permission.

So, maybe let us explore "Flask", a micro web framework written in Python. Why is a web framework needed? It helps a lot in our analytics work, for example by creating an interface that makes our RESTful API calls much easier, or even by leveraging the Tableau Extensions API to enrich your Tableau capabilities!

Stay Tuned and happy learning!

Hi. How can I request for more scopes for my created app? Thank you!

Thanks for sharing. Really nice step by step guide. LinkedIn has introduced a web-based access token generator process https://www.garudax.id/developers/news/featured-updates/token-generator-tool And yes LinkedIn's stance is restrictive in comparison to Google. I find my YouTube data more easily accessible than my LinkedIn data.

More articles by Marvin W.
