How to consume Google Trends with Python in real time?
Numbers represent search interest relative to the highest point on the chart for the given region and time. A value of 100 is the peak popularity for the term. A value of 50 means that the term is half as popular. A score of 0 means there was not enough data for this term.
Source: https://trends.google.com/trends/. More explanation of Google Trends values is available at https://support.google.com/trends/answer/6248105.
I work in IT and solve my clients' problems. One recent request was how to monitor Google Trends in real time. In this short article I will show you how to do it; you decide what fits your need.
Problem Diagram
Celery is a task queue. It executes tasks in separate worker processes, which may be relevant because of IP rotation and API rate limit problems. If you would like to mirror real-time updates from the Google Trends API, you can configure Celery to execute the same task at a defined frequency. Celery may be only one part of your system: you can create an API client that creates tasks for different keywords such as python, ethereum, ...
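One way to express "the same task at a defined frequency" is a Celery beat schedule. The sketch below is only a plain settings module, of the kind Celery can load with app.config_from_object. It assumes a Redis broker on localhost, a five-minute interval, and a hypothetical task registered as tasks.fetch_trends that takes a keyword; none of these come from a real setup.

```python
from datetime import timedelta

# Assumption: a Redis broker on localhost; use whatever broker you run.
broker_url = "redis://localhost:6379/0"

# Keywords to monitor (the two examples mentioned above).
keywords = ["python", "ethereum"]

# Assumption: poll every five minutes; pick a value that respects rate limits.
poll_interval = timedelta(minutes=5)

# One periodic entry per keyword. "tasks.fetch_trends" is a hypothetical
# task name; it must match a task registered in your own Celery app.
beat_schedule = {
    f"fetch-trends-{keyword}": {
        "task": "tasks.fetch_trends",
        "schedule": poll_interval,
        "args": (keyword,),
    }
    for keyword in keywords
}
```

Keeping one schedule entry per keyword means a new keyword is a one-line change, and each keyword runs as its own task.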
Google Trends API
I haven't found documentation for a Google Trends API. Open your web browser console (in Firefox, press Ctrl+Shift+I), switch to the Network tab, and run a search on Google Trends. For this example I used the keyword python.
The first request that the Google Trends application makes:
https://trends.google.com/trends/api/explore?hl=pl&tz=-120,-120&req={"comparisonItem":[{"keyword":"python","geo":"US","time":"today 12-m"}],"category":0,"property":""}
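The req parameter in this URL is plain JSON placed in the query string. A minimal sketch of building the same URL with the standard library follows. The function name build_explore_url and its defaults are illustrative, and because the endpoint is undocumented, the parameter set reflects what was observed in the browser, not a contract.

```python
import json
from urllib.parse import parse_qs, urlencode, urlsplit

EXPLORE_URL = "https://trends.google.com/trends/api/explore"


def build_explore_url(keyword, geo="US", time="today 12-m", hl="pl", tz="-120"):
    """Build the explore URL for a single keyword (illustrative helper)."""
    req = {
        "comparisonItem": [{"keyword": keyword, "geo": geo, "time": time}],
        "category": 0,
        "property": "",
    }
    # Assumption: tz is a timezone offset in minutes, as the observed value suggests.
    query = urlencode({
        "hl": hl,
        "tz": tz,
        "req": json.dumps(req, separators=(",", ":")),
    })
    return f"{EXPLORE_URL}?{query}"


url = build_explore_url("python")
print(url)

# Round trip: decode the query string again to check what was encoded.
decoded_req = json.loads(parse_qs(urlsplit(url).query)["req"][0])
print(decoded_req["comparisonItem"][0])
```

Letting urlencode handle the escaping avoids hand-writing quotes and braces into the URL.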
As a result you get a list of configuration items for the Google Trends application widgets. A widget is a component that can be added to a website as a standalone element; it contains visual parameters and business logic, in our case which resources to load for the widget.
The first widget has this configuration:
{
  "request": {
    "time": "2020-06-22 2021-06-22",
    "resolution": "WEEK",
    "locale": "pl",
    "comparisonItem": [
      {
        "geo": {
          "country": "US"
        },
        "complexKeywordsRestriction": {
          "keyword": [
            {
              "type": "BROAD",
              "value": "python"
            }
          ]
        }
      }
    ],
    "requestOptions": {
      "property": "",
      "backend": "IZG",
      "category": 0
    }
  },
  "token": "APP6_UEAAAAAYNN3iWA7yWIEtpwO5kG4SY4rL5B5z8S_",
  "version": "1"
}
It contains the information needed to create a request to the Google Trends API. As there is no documentation for the API, we must experiment with different properties. I guess you call it reverse engineering :).
If you would like to embed Google Trends widgets, that is possible; Google allows it:
<script type="text/javascript" src="https://ssl.gstatic.com/trends_nrtr/2578_RC01/embed_loader.js"></script>
<script type="text/javascript">
trends.embed.renderExploreWidget("TIMESERIES", {"comparisonItem":[{"keyword":"python","geo":"US","time":"today 12-m"}],"category":0,"property":""}, {"exploreQuery":"geo=US&q=python&date=today 12-m","guestPath":"https://trends.google.com:443/trends/embed/"});
</script>
As you can see, it contains information similar to what I showed in the previous fragment.
The first widget executes this request:
https://trends.google.com/trends/api/widgetdata/multiline?hl=pl&tz=-120,-120&req={"time":"2020-06-22 2021-06-22","resolution":"WEEK","locale":"pl","comparisonItem":[{"geo":{"country":"US"},"complexKeywordsRestriction":{"keyword":[{"type":"BROAD","value":"python"}]}}],"requestOptions":{"property":"","backend":"IZG","category":0}}&token=APP6_UEAAAAAYNN3iWA7yWIEtpwO5kG4SY4rL5B5z8S_
and this is our API: https://trends.google.com/trends/api/widgetdata/multiline. Here multiline is the name of the widget; we can replace it with another widget name such as comparedgeo or relatedsearches.
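A widget's request and token are therefore enough to build the data URL. The sketch below assumes a widget dictionary shaped like the configuration shown earlier; build_widget_url is an illustrative name, and the token is the expired one from this article, used only to show the shape.

```python
import json
from urllib.parse import parse_qs, urlencode, urlsplit

WIDGET_DATA_URL = "https://trends.google.com/trends/api/widgetdata"


def build_widget_url(widget, widget_name="multiline", hl="pl", tz="-120"):
    """Build the widgetdata URL from a widget's request and token (illustrative)."""
    query = urlencode({
        "hl": hl,
        "tz": tz,
        "req": json.dumps(widget["request"], separators=(",", ":")),
        "token": widget["token"],
    })
    return f"{WIDGET_DATA_URL}/{widget_name}?{query}"


# The first widget's configuration, copied from the explore response above.
widget = {
    "request": {
        "time": "2020-06-22 2021-06-22",
        "resolution": "WEEK",
        "locale": "pl",
        "comparisonItem": [{
            "geo": {"country": "US"},
            "complexKeywordsRestriction": {
                "keyword": [{"type": "BROAD", "value": "python"}],
            },
        }],
        "requestOptions": {"property": "", "backend": "IZG", "category": 0},
    },
    "token": "APP6_UEAAAAAYNN3iWA7yWIEtpwO5kG4SY4rL5B5z8S_",
}

widget_url = build_widget_url(widget)
print(widget_url)
```

Passing widget_name="relatedsearches" or "comparedgeo" changes only the path segment; whether those widgets accept the same request body is something to verify by experiment.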
Google Trends API token
If you remove the token, the request fails with Error 400, which means the token is required to authorize the request. The first problem we encounter is that the token expires, perhaps after some period of inactivity on the Google Trends application page.
When I executed the embedded widget, I verified that the API token is injected into the widget itself on the server side. At the time of writing I am not sure how long one token is valid.
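Because the token's lifetime is unknown, one defensive approach is to request a fresh token whenever a data request comes back with status 400. The sketch below shows only that retry logic. fetch_token and fetch_data are hypothetical callables you would implement with your own HTTP client, and treating every 400 as an expired token is an assumption.

```python
def fetch_with_token_refresh(fetch_token, fetch_data, max_refreshes=1):
    """Call fetch_data(token); on HTTP 400, get a new token and retry.

    fetch_token() returns a token string.
    fetch_data(token) returns a (status_code, body) tuple.
    max_refreshes must be 0 or greater.
    """
    token = fetch_token()
    for attempt in range(max_refreshes + 1):
        status, body = fetch_data(token)
        if status != 400:
            return status, body
        if attempt < max_refreshes:
            # Assumption: a 400 means the token expired, so ask for a new one.
            token = fetch_token()
    return status, body


# Demonstration with fake callables instead of real HTTP requests.
tokens = iter(["stale-token", "fresh-token"])
calls = []


def fake_fetch_token():
    return next(tokens)


def fake_fetch_data(token):
    calls.append(token)
    return (200, "data") if token == "fresh-token" else (400, "")


result = fetch_with_token_refresh(fake_fetch_token, fake_fetch_data)
print(result, calls)
```

Passing the two operations in as callables keeps the retry rule separate from the HTTP details, so it can be tested without touching the network.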
If you would like to use a ready-made library, check this one: https://github.com/GeneralMills/pytrends/blob/master/pytrends/request.py. I haven't tried it. It is great that somebody has covered the problem, although one thing to keep in mind is that each abstraction layer takes away control over the process, so for research or prototyping purposes I think it is easier to write a simple script.
This will give you all you need to connect:
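import json
import requests

# Load the start page first to collect cookies for the API calls.
cookies = dict(requests.get('https://trends.google.com/?geo=PL').cookies.items())

# Ask the explore endpoint for the widget configurations of one keyword.
explore_request = {
    'comparisonItem': [{'keyword': 'python', 'geo': 'US', 'time': 'today 12-m'}],
    'category': 0,
    'property': '',
}
widgets = requests.get('https://trends.google.com/trends/api/explore',
                       params={'hl': 'pl', 'tz': 360, 'req': json.dumps(explore_request)},
                       cookies=cookies,
                       headers={'accept-language': 'pl'})
widgets.raise_for_status()  # fail early on HTTP errors

# The body starts with a few non-JSON characters that must be cut off before parsing.
TRIM_CHARACTER_SLICE = ")]}',"
text_trimmed = widgets.text[len(TRIM_CHARACTER_SLICE):]
data = json.loads(text_trimmed)
print(data)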
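Once the response is parsed, the next step is to pick the widget you need. The sketch below assumes the explore response has a top-level "widgets" list whose entries carry an "id" such as "TIMESERIES" (the identifier used in the embed snippet). That structure is an assumption about an undocumented response, and the sample dictionary is hand-made for illustration.

```python
def find_widget(explore_data, widget_id="TIMESERIES"):
    """Return the first widget with the given id, or None if it is missing."""
    for widget in explore_data.get("widgets", []):
        if widget.get("id") == widget_id:
            return widget
    return None


# Hand-made sample standing in for the parsed explore response.
sample = {
    "widgets": [
        {"id": "TIMESERIES", "token": "example-token", "request": {"resolution": "WEEK"}},
        {"id": "OTHER_WIDGET", "token": "other-token", "request": {}},
    ]
}

timeseries = find_widget(sample)
print(timeseries["token"])
```

Returning None for a missing widget lets the caller decide whether that is an error or simply a keyword without enough data.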
Trim Character
One interesting part is this line: https://github.com/GeneralMills/pytrends/blob/bac0caea18817630e98b503a38e9b445e7b5add1/pytrends/request.py#L109. This was new to me too, because the response we get from the fragment above is not valid JSON. Why? The library's author commented that some responses start with garbage characters, like ")]}',", and I wonder why. Maybe somebody will explain?
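A likely explanation is that the prefix is deliberate: putting )]}' in front of a JSON body is a common guard against cross-site script inclusion, since it makes the response invalid JavaScript if another site tries to load it through a script tag. Clients are expected to strip it before parsing. The sketch below removes the guard only when it is present instead of slicing a fixed number of characters; parse_guarded_json is an illustrative name.

```python
import json

XSSI_PREFIX = ")]}'"


def parse_guarded_json(text):
    """Parse JSON that may start with the )]}' guard, an optional comma and whitespace."""
    if text.startswith(XSSI_PREFIX):
        # Drop the guard, then any comma or whitespace that follows it.
        text = text[len(XSSI_PREFIX):].lstrip(",\r\n\t ")
    return json.loads(text)


print(parse_guarded_json(")]}',\n{\"keyword\": \"python\"}"))
print(parse_guarded_json("{\"keyword\": \"python\"}"))
```

Checking for the prefix first means the same function keeps working if an endpoint returns plain JSON or varies the characters after the guard.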
Summary
Reverse engineering is a great learning experience: as you try to understand how things work from the other side, you slowly see how everything is connected.
This article was written purely for research and training purposes, and my goal was not to encourage anyone to use the Google Trends API in the way described above.