DATA ARCHITECT

A data architect is an information technology (IT) specialist who designs and manages data systems, sets policies for how data is stored and accessed, coordinates various data sources within an organization, and integrates new data technologies into existing IT infrastructures. Data architects may act as liaisons between the IT side of an organization and other departments, aligning data collection and distribution policies with the organization’s operational and strategic objectives. They also typically work with members of a data team, which may include data engineers, data miners, data scientists, and data analysts, in areas related to data collection, data storage, data security, and data systems access.

Data Architecture

Data architecture refers both to the IT systems that facilitate the collection, storage, distribution, and consumption of data within an organization, and to the policies that govern how data is collected, stored, distributed, and accessed within an organization. From an IT standpoint, an organization’s data architecture typically includes data storage and warehousing systems (e.g., databases), computer networks that serve as data pipelines and provide access to stored data, and software platforms and analytics applications that process data in order to further an organization’s goals. In terms of organizational structure, data architecture may encompass personnel who have access to relevant and potentially sensitive data, policies governing data access, and the protocols for the secure distribution of data to relevant parties, including analytics specialists, operations managers, marketing departments, and others, depending on the size and type of the organization.
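The policy side of this definition can be made concrete with a small sketch. The Python below is only an illustration under stated assumptions: the names `Sensitivity`, `DataAsset`, and `AccessPolicy`, the three sensitivity levels, and the example roles are all hypothetical and do not come from TOGAF or any particular product. It shows one simple way a policy might tie stored data assets to the roles permitted to read them.

```python
from dataclasses import dataclass, field
from enum import Enum


class Sensitivity(Enum):
    """Hypothetical sensitivity levels; real organizations define their own."""
    PUBLIC = 1
    INTERNAL = 2
    RESTRICTED = 3


@dataclass(frozen=True)
class DataAsset:
    name: str                 # e.g. a table or dataset
    store: str                # the storage system that holds it
    sensitivity: Sensitivity  # how carefully access must be controlled


@dataclass
class AccessPolicy:
    # Maps each role to the highest sensitivity level it may read.
    clearance: dict[str, Sensitivity] = field(default_factory=dict)

    def can_read(self, role: str, asset: DataAsset) -> bool:
        level = self.clearance.get(role)
        # Roles that are not listed in the policy get no access at all.
        return level is not None and level.value >= asset.sensitivity.value


# Illustrative roles and asset (assumed for this example only).
policy = AccessPolicy({
    "marketing": Sensitivity.INTERNAL,
    "analytics": Sensitivity.RESTRICTED,
})
orders = DataAsset("customer_orders", "warehouse", Sensitivity.RESTRICTED)

print(policy.can_read("analytics", orders))   # True
print(policy.can_read("marketing", orders))   # False
print(policy.can_read("contractor", orders))  # False
```

Keeping the policy separate from the asset definitions means access rules can change without touching how the data is stored. This is a design choice of the sketch, not a requirement of any framework.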

The Open Group, a consortium of IT industry leaders committed to “the development of open, vendor-neutral technology standards and certifications” in the realm of data technologies, maintains a framework for understanding data architecture in the context of other aspects of an organization’s IT infrastructures. The Open Group Architecture Framework (TOGAF) describes four types of architecture:

  • Business architecture, which defines the business strategy, governance, organization, and key business processes of the organization.
  • Data architecture, which describes the structure of an organization’s logical and physical data assets and the associated data management resources.
  • Applications architecture, which provides a blueprint for the individual systems to be deployed, the interactions between the application systems, and their relationships to the core business processes of the organization.
  • Technical architecture, or technology architecture, which describes the hardware, software, and network infrastructure needed to support the deployment of core, mission-critical applications.

In practice, it can be difficult to draw a clear line between data architecture and business, applications, and technical architectures, depending on the nature of an organization, its goals, and its size. As a consequence, it can be useful to conceptualize data architecture more broadly as a logical outgrowth of a business strategy or business architecture, which itself may follow from a larger strategic plan, sometimes referred to as enterprise architecture.

The Role of the Data Architect

In its TOGAF guidelines, the Open Group delineates three primary areas of concern for data architects: data management, data migration, and data governance. These are broad concerns that may require the attention of a data architecture group or department in larger organizations. In order to address these concerns effectively, data architects rely on a deep knowledge of contemporary data technology systems, including common operating and database systems, networking protocols, data analytics tools and methods, business intelligence software, and other data-centric elements of IT infrastructures.

Data engineers are typically tasked with the technical challenge of constructing and maintaining an organization’s data storage and distribution systems, while data analysts and data scientists generally handle functions related to data modeling, data interpretation, and data reporting. These roles may overlap with that of the data architect, whose primary responsibility is constructing and maintaining the technical infrastructures and policy frameworks for the secure and efficient collection, storage, and distribution of data within an organization.

In the “Skills Framework” articulated by the Open Group in its TOGAF guidelines, there are seven general skill areas that are considered central for data and/or IT architects:

  1. Generic Skills – typically comprising leadership, teamwork skills, inter-personal skills, etc.
  2. Business Skills and Methods – typically comprising business cases, business process, strategic planning, etc.
  3. Enterprise Architecture Skills – typically comprising modeling, building block design, applications and role design, systems integration, etc.
  4. Program or Project Management Skills – typically comprising the management of change within a business, as well as project management methods and tools, etc.
  5. IT General Knowledge Skills – typically comprising brokering applications, asset management, migration planning, SLAs, etc.
  6. Technical IT Skills – typically comprising software engineering, security, data interchange, data management, etc.
  7. Legal Environment – typically comprising data protection laws, contract law, procurement law, fraud, etc.

TOGAF acknowledges that “‘IT Architecture’ and ‘IT Architect’ are widely used but poorly defined terms in the IT industry today.” As a result, professional responsibilities, skills, and knowledge areas for data architects may vary depending on the employer, the job, and the nature of an organization’s data systems and IT infrastructures.

How to Become a Data Architect

Just as there is no clear-cut, industry-wide definition of data architecture, there is no clearly defined pathway to becoming a data architect. As a general rule, data architects are professionals who have formal training and/or professional experience in IT management, computer programming, and data systems engineering, as well as in the processes by which data is mined, sorted, stored, and analyzed. Data architects typically interact with and respond to the needs of non-technical managers and business administrators within an organization, which may require some training in professional communication. It may also be helpful for data architects to have familiarity and/or experience with business intelligence systems, data mining tools, and data analytics operations.

Training to become a data architect might begin with an undergraduate degree in computer science, computer engineering, or a related field. A graduate from a bachelor’s or associate degree program who possesses a strong background in computer programming and IT systems may find entry-level employment in a data-intensive field, such as IT administration, computer programming, data mining, or data analytics. These types of work experience can be a pathway to a career in data architecture.

While master’s programs in data architecture are rare, there are graduate programs that provide training and instruction in computer engineering, IT systems management, and business data systems (i.e., business intelligence and analytics), many of which may include advanced coursework in data architecture, data warehousing, and data engineering. Master’s programs in these and similar technical fields can provide data professionals with the academic credentials and practical skills required to advance in the field of data architecture. There are also several professional credentials and certification programs that may be advantageous for data architects, including:

  • DAMA International’s Certified Data Management Professional (CDMP) certification
  • The Hortonworks DataFlow Certified NiFi Architect (HDFCNA) certification
  • The IBM Certified Data Architect – Big Data certification
  • The Salesforce Certified Data Architecture and Management Designer credential
  • The Open Group Certified Architect (Open CA) credential

