DATA SCIENCE


1 Introduction

How can we effectively and efficiently teach data science to students with little to no background in computing and statistical thinking? How can we equip them with the skills and tools for reasoning with various types of data and leave them wanting to learn more? This article describes an introductory data science course that is our (working) answer to these questions.

At its core, the course focuses on data acquisition and wrangling, exploratory data analysis, data visualization, inference, modeling, and effective communication of results.

ABSTRACT

The proliferation of large, complex datasets has challenged universities to keep up with the demand for graduates trained in both the statistical and the computational skills required to effectively plan, acquire, manage, analyze, and communicate the findings of such data. To keep up with this demand, attracting students to data science early on and providing them with a solid foray into the field become increasingly important. We present a case study of an introductory undergraduate course in data science that is designed to address these needs. Offered at Duke University, this course has no prerequisites and serves a wide audience of aspiring statistics and data.

2 Background and Related Work

An exact characterization of what the field of data science is meant to encompass is still debated. However, in this article, we define data science as the “science of planning for, acquisition, management, analysis of, and inference from data.” We also review four of the most recent curriculum guidelines for undergraduate programs in data science, statistics, and computer science to assess how the case study course measures up against them.

While the 2013 Computer Science Curricula of the Association for Computing Machinery (ACM) (Sahami et al.) does not offer suggestions for integrating data science into a computer science major, the 2019 report by the ACM Task Force on Data Science Education suggests core competencies that a graduating data science student should have. Each competency corresponds to one of nine data science knowledge areas: computing fundamentals; data acquirement and governance; data management, storage, and retrieval; data privacy, security, and integrity; machine learning; data mining; big data; analysis and presentation; and professionalism. The report also suggests that a full data science curriculum should integrate courses in “calculus, discrete structures, probability theory, elementary statistics, advanced topics in statistics, and linear algebra.” We note, however, that this document was still a draft at the time of writing this article.

