Google Data
Google Ads and Google GA4 Data
Some practice today. My previous posts were mostly about marketing data, so it makes sense to share some peculiarities of such data from an analytical point of view. I'll start with Google Ads and Google GA4.
The data
The main difference between Google Ads (and ad platforms in general) and GA4 lies in the data itself. The Google Ads Reporting API can only give us reports: a set of dimensions and metrics. We can get daily/monthly cost, clicks, impressions, conversions, etc., broken down by campaign, ad group, or country, but we won't be able to pull individual conversions or clicks. It means that when we merge such data with the CRM data, we won't be able to assign anything to individual users, leads, or orders. We can join the aggregated data only. This won't give us any new information about a particular client, but it should be enough to determine channel performance.
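To illustrate what "join the aggregated data only" can look like, here is a minimal sketch in plain Python. The field names (`date`, `campaign`, `cost`, `revenue`) and the sample rows are made up for the example, and it assumes the CRM stores the campaign for each order (for instance, from UTM tags). The CRM orders are first rolled up to the same grain as the Ads report and only then joined.

```python
from collections import defaultdict

# Hypothetical daily Google Ads report rows (aggregated, no user-level data).
ads_report = [
    {"date": "2024-01-01", "campaign": "brand", "cost": 100.0, "clicks": 50},
    {"date": "2024-01-01", "campaign": "generic", "cost": 200.0, "clicks": 40},
]

# Hypothetical CRM orders; the campaign is assumed to come from stored UTM tags.
crm_orders = [
    {"order_id": 1, "date": "2024-01-01", "campaign": "brand", "revenue": 150.0},
    {"order_id": 2, "date": "2024-01-01", "campaign": "brand", "revenue": 250.0},
    {"order_id": 3, "date": "2024-01-01", "campaign": "generic", "revenue": 100.0},
]


def channel_performance(ads_rows, orders):
    """Aggregate CRM orders to the report grain, then join on (date, campaign)."""
    totals = defaultdict(lambda: {"orders": 0, "revenue": 0.0})
    for order in orders:
        key = (order["date"], order["campaign"])
        totals[key]["orders"] += 1
        totals[key]["revenue"] += order["revenue"]

    result = []
    for row in ads_rows:
        # .get() does not create missing keys, so campaigns without orders get zeros.
        crm = totals.get((row["date"], row["campaign"]), {"orders": 0, "revenue": 0.0})
        # Return on ad spend; None when there was no spend to divide by.
        roas = crm["revenue"] / row["cost"] if row["cost"] else None
        result.append({**row, **crm, "roas": roas})
    return result


performance = channel_performance(ads_report, crm_orders)
for line in performance:
    print(line)
```

The result tells us how each campaign performs per day, but nothing about which particular order came from which click.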
GA4 is a completely different story. It allows pulling individual events with all their parameters. What does it give us? Suppose we configure GA4 properly, meaning we send some lead/user identifiers along with the events. In this case, GA4 data can give us the device type, user location, UTMs to identify the channel, and other helpful information. Once we pull the data, we can join it with the CRM, significantly enriching our knowledge about clients.
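A rough sketch of such enrichment, again in plain Python. It assumes the events carry a custom `lead_id` parameter that matches the CRM key; this name, the flat event layout, and the sample data are hypothetical and only serve the illustration (a real GA4 export is nested and would need flattening first).

```python
# Hypothetical GA4 events, already flattened; "lead_id" is an assumed custom parameter.
ga4_events = [
    {"event_name": "generate_lead", "lead_id": "L-1", "device": "mobile",
     "country": "DE", "utm_source": "google", "utm_medium": "cpc"},
    {"event_name": "page_view", "lead_id": "L-1", "device": "desktop",
     "country": "DE", "utm_source": "newsletter", "utm_medium": "email"},
    {"event_name": "generate_lead", "lead_id": "L-2", "device": "desktop",
     "country": "FR", "utm_source": "google", "utm_medium": "organic"},
    {"event_name": "page_view", "device": "tablet", "country": "US"},  # no identifier sent
]

# Hypothetical CRM leads.
crm_leads = [
    {"lead_id": "L-1", "status": "won"},
    {"lead_id": "L-2", "status": "lost"},
    {"lead_id": "L-3", "status": "open"},  # never seen in GA4
]

ENRICHMENT_FIELDS = ("device", "country", "utm_source", "utm_medium")


def enrich_leads(leads, events):
    """Attach attributes of the first GA4 event seen for each lead to the CRM record."""
    first_event = {}
    for event in events:
        lead_id = event.get("lead_id")
        if lead_id is not None:
            # Keep only the first event per lead (events are assumed to be time-ordered).
            first_event.setdefault(lead_id, event)

    enriched = []
    for lead in leads:
        event = first_event.get(lead["lead_id"], {})
        extra = {field: event.get(field) for field in ENRICHMENT_FIELDS}
        enriched.append({**lead, **extra})
    return enriched


enriched_leads = enrich_leads(crm_leads, ga4_events)
for lead in enriched_leads:
    print(lead)
```

Taking the first event per lead is just one possible choice; a last-touch or any other attribution rule would slot into the same place.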
The instruments
We've discussed the data, but how do we pull it?
👉 Data Studio / Google Sheets connectors. These systems have connectors (by Google and 3rd party) that can connect directly to the Google systems and pull the data. Whenever any filter is changed on the dashboard, a new request will be sent to the Google API to fetch the data. It means that the data will always be fresh, but it also means that it can be very slow.
👉 Google Data Transfer Service / BigQuery Link. If one uses BigQuery as a DWH, this service can pull the data. No coding is needed; it's purely UI configuration. GA4 and Google Ads are pulled differently, though: always a batch update (e.g., daily) for Google Ads and a possibility to stream GA4 data. It's very easy to configure but not flexible: only standard reports can be pulled for Google Ads, which means that if one creates some fancy custom report, it'll be difficult to recreate it from data delivered this way. For GA4, it's better: everything in GA4 is an event, and we import events, so we don't need much flexibility here.
👉 3rd party SaaS services. The most popular ones are Fivetran, Stitch, and Airbyte. All of them can be configured via UI, and they are more flexible, allowing one to configure particular dimensions and metrics. The downside is that one needs to pay for the services.
👉 Open-source frameworks. Singer, Meltano, PipelineWise (by TransferWise). They all share a very similar idea, which the 3rd party SaaS services also use under the hood. This idea is to have extractors (taps) and loaders (targets) developed and maintained separately. One must go through a configuration process (that is not always properly documented) and wrap everything up in some script to make them play together. The scheduler implementation will also require some work. Though it sounds like a complicated task, with just a little experience, one can do this configuration within several hours. I once posted an example of such a stack. The big plus of this approach is that it gives much flexibility. As we utilize open-source frameworks, if any new dimension/field is needed, one can always code it.
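To make the tap/target idea more tangible, here is a toy sketch in the spirit of the Singer message format, where a tap emits JSON lines of type SCHEMA, RECORD, and STATE, and a target consumes them. The stream name `ads_daily`, the fields, and the in-memory "warehouse" are assumptions for the example; real taps and targets are separate programs, usually connected with a shell pipe.

```python
import json


def tap(rows, stream="ads_daily"):
    """Toy extractor: yield Singer-style messages as JSON lines."""
    yield json.dumps({
        "type": "SCHEMA",
        "stream": stream,
        "key_properties": ["date", "campaign"],
        "schema": {
            "type": "object",
            "properties": {
                "date": {"type": "string"},
                "campaign": {"type": "string"},
                "cost": {"type": "number"},
            },
        },
    })
    for row in rows:
        yield json.dumps({"type": "RECORD", "stream": stream, "record": row})
    if rows:
        # The state is a bookmark that lets the next run continue incrementally.
        yield json.dumps({"type": "STATE", "value": {"last_date": rows[-1]["date"]}})


def target(lines):
    """Toy loader: collect records per stream in memory and remember the last state."""
    loaded, state = {}, None
    for line in lines:
        message = json.loads(line)
        if message["type"] == "SCHEMA":
            loaded.setdefault(message["stream"], [])
        elif message["type"] == "RECORD":
            loaded.setdefault(message["stream"], []).append(message["record"])
        elif message["type"] == "STATE":
            state = message["value"]
    return loaded, state


# Hypothetical report rows standing in for an API response.
report_rows = [
    {"date": "2024-01-01", "campaign": "brand", "cost": 100.0},
    {"date": "2024-01-02", "campaign": "brand", "cost": 120.0},
]

loaded, state = target(tap(report_rows))
print(loaded)
print(state)
```

Because the two sides only agree on the message format, any tap can be combined with any target, and adding a new dimension means touching the tap only.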
In conclusion, it's always better to start with the simplest solutions (direct connectors) and then move down the list if more flexibility is needed.