Eric Colson

Los Gatos, California, United States
11K followers 500+ connections

About

Data science, AI, big data, machine learning, statistical learning, social algorithms…

Experience

  • Activation Fund

    Austin, Texas, United States

  • -

    San Francisco Bay Area

  • -

    San Francisco Bay Area

  • -

    San Francisco Bay Area

  • -

    Los Gatos, CA

  • -

    Los Gatos, CA

Education

  • Stanford University

    - Present

    Activities and Societies: Stanford Industrial Affiliates

    Master's degree in MS&E with a focus on Statistical Learning.
    Advisor: Ramesh Johari

    Research Topics:
    * "Human‐Machine Systems to Leverage More Data", Directed Study with Professor Ashish Goel
    * "The Benefits of Generalists over Specialists in a Dynamic Environment", Directed Study with Professor Bob Sutton

Publications

  • Feature Selection and Validation for Human Classifiers

    Association for the Advancement of Artificial Intelligence (www.aaai.org)

    Algorithmic approaches to prediction and recommendation can often be improved by combining the results with the curation of human experts. Hybrid machine-human recommendation systems can combine the best of both large-scale machine learning and expert-human judgement. In this paper, we outline an approach for measuring, training, and understanding the human contribution to the combined system. This approach provides a practical strategy for optimizing the role and experience of the human experts. We share a motivating example from Stitch Fix, an online personal styling service that commits to its recommendations through the physical delivery of merchandise to clients.

  • ETD: A Design Pattern for Building Web-Based Analytics Dashboards in R

    The R User Conference 2014

    ETD, an abbreviation for extract-transform-display, is a design pattern that the Stitch Fix data team
    observed while building reporting and analytics dashboards in R, using the Shiny package. Formalizing
    this pattern reduces the complexity involved in creating web-based dashboards. It also provides a
    templatized approach for creating dashboards and promotes re-use and encapsulation of R and data
    extraction code.

    Developing interactive web-based dashboards typically involves three distinct stages. First, the
    ‘Extract’ stage pulls data from a data source, typically a relational database using SQL. The extracted
    data is pulled into an R data structure where complex calculations can be applied (e.g. cross-tabulation,
    cleansing routines, conditional probabilities, complex metric definitions, etc.). This is the
    ‘Transform’ stage. Finally, the transformed information is displayed using standard R visualization
    packages like ggplot or googleVis. This is the ‘Display’ stage. This three-stage workflow is analogous
    to the extract-transform-load (ETL) pattern prevalent in data warehousing. The important distinction is
    the final stage where, rather than loading the data for system consumption, we are rendering the
    information for end-user consumption.
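
    The three stages described above can be sketched as three small functions. This is only an
    illustration of the extract-transform-display idea, written in Python with the standard library
    rather than in R and Shiny; the `orders` table, its columns, and every function name here are
    hypothetical and do not come from the Stitch Fix code base.

```python
import sqlite3
from collections import defaultdict


def build_demo_db():
    """Create a tiny in-memory database (hypothetical data, for illustration only)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE orders (region TEXT, status TEXT, amount REAL)")
    conn.executemany(
        "INSERT INTO orders VALUES (?, ?, ?)",
        [
            ("west", "kept", 10.0),
            ("west", "kept", 20.0),
            ("west", "returned", 5.0),
            ("east", "kept", 7.5),
        ],
    )
    return conn


def extract(conn):
    """Extract: pull raw rows from a relational source using SQL."""
    return conn.execute("SELECT region, status, amount FROM orders").fetchall()


def transform(rows):
    """Transform: cross-tabulate total amount by region and status."""
    table = defaultdict(lambda: defaultdict(float))
    for region, status, amount in rows:
        table[region][status] += amount
    return {region: dict(cols) for region, cols in table.items()}


def display(table):
    """Display: render the cross-tab as text for an end user (a stand-in for a plot)."""
    statuses = sorted({s for cols in table.values() for s in cols})
    lines = ["region".ljust(10) + "".join(s.rjust(10) for s in statuses)]
    for region in sorted(table):
        cells = "".join(f"{table[region].get(s, 0.0):10.1f}" for s in statuses)
        lines.append(region.ljust(10) + cells)
    return "\n".join(lines)


if __name__ == "__main__":
    print(display(transform(extract(build_demo_db()))))
```

    Keeping each stage behind its own function is what makes the pattern easy to template: a new
    dashboard would swap in a different query, a different transformation, and a different renderer
    without changing how the three are chained.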

    Many data scientists lack the requisite skills to build web-based analytics dashboards. However,
    packages like Shiny provide a layer of abstraction that enables them to build web-based applications in
    R without having to learn HTML, JavaScript and CSS. Our ETD design pattern takes it one step further
    by taming the complexities of Shiny’s reactive programming framework and making it possible to
    templatize the creation of typical analytics dashboards. Using the ETD pattern in the development of
    Shiny dashboards helps our data scientists build complex web-based dashboards quickly while keeping
    our R code-base modular, clean, and extensible.

  • Using Human and Machine Processing in Recommendation Systems

    Association for the Advancement of Artificial Intelligence (www.aaai.org)

    Customer-item recommendation systems are used by many ecommerce companies. For this task, machine-learning algorithms are efficient at processing structured data for ranking vast catalogs of merchandise in the context of a customer. Yet, machines are notoriously challenged when it comes to processing unstructured data such as images and free-form text. Humans, on the other hand, can process unstructured data effectively. They can also better contextualize the results and perceive more nuanced distinctions vs. machines alone. However, human processing is inefficient on large sets of unranked items. By using the two resources together in a single system, more data and processing can be leveraged.
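
    One way to picture the two resources working in a single system is a two-step pipeline: the
    machine ranks the full catalog on structured features and passes a short list to a human, who
    applies judgement to unstructured information. The sketch below is a hypothetical illustration
    only, not the system described in the paper; the feature names, weights, and the `reviewer`
    callback are all assumptions.

```python
def machine_rank(catalog, weights, k):
    """Machine step: score every item on structured features and keep the top k."""
    def score(item):
        return sum(weights.get(name, 0.0) * value
                   for name, value in item["features"].items())
    return sorted(catalog, key=score, reverse=True)[:k]


def human_curate(shortlist, reviewer, n):
    """Human step: a person reviews only the short list, preserving machine order."""
    return [item for item in shortlist if reviewer(item)][:n]


# Hypothetical catalog: structured features plus a free-form note only a human reads.
catalog = [
    {"id": "A", "features": {"fit": 0.9, "price_match": 0.8}, "note": "soft cotton"},
    {"id": "B", "features": {"fit": 0.8, "price_match": 0.9}, "note": "itchy wool"},
    {"id": "C", "features": {"fit": 0.6, "price_match": 0.6}, "note": "linen blend"},
    {"id": "D", "features": {"fit": 0.2, "price_match": 0.1}, "note": "soft cotton"},
]
weights = {"fit": 1.0, "price_match": 0.5}


def reviewer(item):
    """Stand-in for human judgement on unstructured text (assumed rule)."""
    return "itchy" not in item["note"]


shortlist = machine_rank(catalog, weights, k=3)
picks = human_curate(shortlist, reviewer, n=2)
print([item["id"] for item in picks])
```

    The human never sees the unranked catalog, which reflects the abstract's point that human
    processing is inefficient on large sets of unranked items.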

Projects

  • Stitch Fix Algorithms Tour

    An interactive tour of some of the algorithms in use at Stitch Fix.

  • Human-in-the-loop machine learning, by Ted Cuzzillo

    What do you call a practice that most data scientists have heard of, few have tried, and even fewer know how to do well? It turns out, no one is quite certain what to call it. In our latest free report Real-World Active Learning: Applications and Strategies for Human-in-the-Loop Machine Learning, we examine the relatively new field of “active learning” — also referred to as “human computation,” “human-machine hybrid systems,” and “human-in-the-loop machine learning.” Whatever you call it, the field is exploding with practical applications that are proving the efficiency of combining human and machine intelligence.

  • Strata 2015: Data (art &) Science

    For years now we’ve espoused bringing data-driven decision making into the organization. And, while we still need to take this further, there is equal opportunity in internalizing the “art” that exists within the organization. The judgement and cultural values that reside within the brains of our employees can be harvested and married with data science to produce new capabilities. In this talk we will share new ideas about how to systematically combine the assets of the organization, be they machines or humans.

  • ACM Recsys 2014: Blending Human Computation and Machine Algorithms for Personalized Style Recommendations

    Machine algorithms are great for tasks that require processing large amounts of objective and structured data. However, they have difficulty with tasks that are relatively simple for skilled humans, for example interpreting concepts in an image or discerning tone in language. Yet, there is a class of problems that calls for precisely the combination of these tasks. This concept of human-assisted algorithmic processing is not new. It is inherent to many processes that we are familiar with. However, there are very few systems that embrace humans and machines as two resources within a single system. Instead, they are often independent and non-collaborating agents. In this talk, we explain how a single task-processing system can be architected to use diverse resources, be they human or machine. Such a system not only better utilizes each resource, but also produces better results and gets better with experience.

  • Fashioning Data: How fashion industry leaders innovate with data, By Julie Steele and Liza Kindred

    The traditional cycle in the fashion industry starts on the runway, with designs that are finally available for consumer purchase several months later. But as this O’Reilly report shows, consumers are now an integral part of the full fashion cycle—even before some styles come to fruition—as fashion innovators find new ways to bring data analytics to the industry.

    Through interviews with several fashion startups, authors Liza Kindred and Julie Steele reveal that these pioneers are talking to customers and getting valuable data from them, in ways that other industries would be wise to emulate. Some aspects of fashion are becoming more agile, as startups such as Poshly, Rent the Runway, and Stitch Fix respond to customer input on sizing, preference, and more.

    As you’ll discover, data science has already made big alterations to the $3 trillion fashion industry, via a growing number of fashion data tools and trends, including custom fit and local manufacturing. At the same time, there is lots of room for exploration and innovation, especially in the areas of machine vs human insight, image processing, and online vs offline data collection.

    Liza Kindred is the founder of Third Wave Fashion, a fashion tech think tank, and the author of the upcoming O’Reilly book How We Buy Now.

    Julie Steele is Director of Communications at Silicon Valley Data Science and coauthor of Beautiful Visualization and Designing Data Visualizations (both O’Reilly).

  • Strata 2013: "Committing to Recommendation Algorithms"

    Developing recommendation algorithms so accurate that we can commit to them.

Recommendations received

3 people have recommended Eric
