S2 E4 - D3 asks...Do Data Aggregators Put Data Quality At the Forefront of Their Value Proposition?
D3 helps Data Owners Bring Their Solutions to Market

The question I'm asking is:  how obviously vital is data quality to the leading data aggregators and data solution providers? 

In other words, what outward signs do we see from the best of the best…that data quality is something that is tied directly to their core value proposition, and that they provide obvious ways for a potential customer to evaluate that quality?

Over the next few blogs, I’ll share my findings using public information about the largest b2b data aggregators in the world. 

To be clear, this isn’t an analysis of their actual data quality…it’s a snapshot of how much clarity and transparency they offer into the quality of their data, including how they define and manage it.

The questions I’m asking are:

1.  Does the vendor make obvious references to data quality being important to them?

2.  Does the vendor define data quality in some way, either qualitatively or quantitatively?

3.  Do they indicate - operationally and technically - how they manage data quality?

4.  Do they share if, and how, customer feedback is essential to how they manage data quality?

5.  Is there any evidence that data quality is built into their commercial licenses as a form of assurance to their customers about what they can expect to receive? 

PLEASE suggest some additional questions to help bolster this approach even further!

This week’s focus is on Dow Jones

Part of the well-known News Corp family of companies, they’ve been a household name for decades and have built an enviable reputation for providing insightful, market-moving intelligence within capital and financial markets.

Their Narrative

Dow Jones (“DJ” / https://www.dowjones.com/) make an array of references to aspects of data quality, using descriptors such as:

  • Trust and trusted
  • Unrivaled
  • Market-moving
  • Exclusive
  • Rich
  • Industry-leading

These suggest quality and are provided in the form of benefit statements, as opposed to a strict definition or formula of what makes DJ’s data different.

Drilling down a bit to a key division, Dow Jones Risk & Compliance (“DJRC”): DJRC offer up novel insights twice per year in the form of an external audit of their “Sanctions Platform and Data Set” and a Data Quality Report. In the 2024 version of the latter (DJRC’s “2024 Data Quality Report”), they claim that data quality sits at the core (“heart”) of their business and is a shared area of focus between them and their clients. A brief definition is provided:  “...to supply the most accurate, timely and relevant information that allows our customers to make risk decisions quickly and with confidence”.

This report also provides operational, technical and product-related details on how they manage data quality.

D3’s Score:  to date, this is the strongest evidence I’ve found of a company demonstrating their commitment to data quality:  “A”

Data Quality Definition

Despite an excellent level of detail provided in the above reports - especially the 2024 Data Quality Report - DJ / DJRC do not provide a singular, overarching quantitative definition of data quality.

But they do outline four key qualitative factors that summarize their operational focus:  accuracy, completeness, validity and timeliness. In support of their overall focus on DQ, DJRC leverages two novel metrics:

  • Precision Score - revealing the accuracy of content within a given company profile; and
  • Recall Score - covering completeness of the data they offer within a given company profile, in relation to publicly available sources

The Scores are generated by their Quality Assurance team after analyzing a “portion of new and changed profiles” of their overall dataset.  It isn’t clear how large the sample is. For your interest, they claim an average Precision score of 99.74% for the 12 months ending July 2024.

Lastly, these two metrics are averaged into a single Accuracy Score, which is presented across the past 10 years.
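
For the technically minded, here is a minimal sketch of how scores like these could be computed. The report's exact formulas aren't covered here, so this assumes the textbook definitions of precision and recall at the profile level and a simple arithmetic mean for the combined score. The class and field names are hypothetical, not DJRC's.

```python
from dataclasses import dataclass


@dataclass
class ProfileAudit:
    """Hypothetical QA result for one sampled profile (names are assumptions)."""
    fields_in_profile: int         # data points published in the profile
    fields_correct: int            # of those, how many the reviewer confirmed
    fields_in_public_sources: int  # data points found in public sources
    fields_captured: int           # of those, how many appear in the profile


def precision(audit: ProfileAudit) -> float:
    # Share of published data points that are correct.
    return audit.fields_correct / audit.fields_in_profile


def recall(audit: ProfileAudit) -> float:
    # Share of publicly available data points that the profile captured.
    return audit.fields_captured / audit.fields_in_public_sources


def accuracy_score(audits: list[ProfileAudit]) -> float:
    # Assumed combination: plain average of mean precision and mean recall.
    mean_precision = sum(precision(a) for a in audits) / len(audits)
    mean_recall = sum(recall(a) for a in audits) / len(audits)
    return (mean_precision + mean_recall) / 2
```

A plain average treats precision and recall as equally important. A vendor could just as easily weight them differently, which is one reason the underlying formula matters to a buyer.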

Score:  “A” on a comparative basis.  This is best-in-class so far with some room for improvement to be discussed later on.  Time for me to get picky!?!

Data Quality Process & Technology

To operationalize their focus on the four DQ factors, DJ/DJRC’s “Data Quality Assurance program” takes a tiered, sequenced approach that includes:

  • Data Quality checks at point of ingestion or origination
  • Regular internal audits using the above DQ Scores, where anomalies are fed back to the teams, or individuals, to apply “human expertise” (Dow Jones “Authentic Intelligence”) on future profiles.
  • Automated quality checks - “hundreds” are used on a regular basis to review all content

An overarching Quality Assurance program is in place to enable continuous improvement of the core elements in their process, per above.

DJ/DJRC also provides a detailed breakdown of their “accuracy” process - 6 steps ranging from strict content definitions to external assurance reviews.

Overall, this is a strong program that would benefit from even more transparency.  Similar to S&P’s in terms of scope and rigour, but with a bit more ‘meat on the bones’...score of “A-”.

Customer Feedback and Engagement

Aside from stating - as we all do - that customers are important to their business, I could not find any specific information on HOW DJ/DJRC engages them in managing (improving) data quality according to their needs.

The Data Quality Report concludes with a section called “Keeping Customers Informed” but it doesn’t suggest there’s any formal way that their customers are solicited and/or their products are designed with features to invite feedback on DQ.

Given how strong they are in all other areas…I’d give them a D here.  

Commercial Licensing

I was unable to find any details on standard data licensing terms despite several attempts.  

Evaluation:  D.

Overall?

Dow Jones is part of News Corp., a ~$10.1B news and information company, and contributes $2.2B to their topline. 

From this initial review, I believe they are best-in-class from the perspective of transparency around their data quality definitions, metrics and processes. 

Improvements can be made in how they engage customers in optimizing quality, and in providing details on their standard licensing. On licensing, the burning question is:  is there an assurance that their Accuracy Score won’t fall below a certain level (in the manner of a technical SLA metric) over the life of an agreement?
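
To make that question concrete, here is a rough sketch of what an SLA-style check on a quality score might look like. The floor, the period labels and the scores below are all made up for illustration. Nothing here reflects Dow Jones' actual terms, which I could not find.

```python
def sla_breaches(scores_by_period: dict[str, float], floor: float) -> list[str]:
    """Return the reporting periods whose score fell below the agreed floor."""
    return [period for period, score in scores_by_period.items() if score < floor]


# Hypothetical numbers only: a contractual floor of 99.0% and
# three semi-annual scores over the life of an agreement.
example_scores = {"H1": 99.6, "H2": 98.9, "H3": 99.7}
breaches = sla_breaches(example_scores, floor=99.0)
```

In a real license, the interesting parts would be who measures the score, how often, and what remedy applies when a period lands on the breach list.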

Time to do some more digging!



Feedback please!!  Was this a helpful exercise for you?  Are you interested in more content on Data Quality?  Do you have a POV that you’d be willing to share with me?  DM me on LinkedIn or at drew@d3data.ca.  THANKS!

