Pedro Larroy

Pacifica, California, United States

About

25 years of diverse Engineering experience solving all kinds of problems with diverse…

Experience

  • NVIDIA Graphic

    NVIDIA

    Santa Clara, California, United States

  • -

    San Francisco Bay Area

  • -

    Palo Alto, California, United States

  • -

    Palo Alto

  • -

    Berlin Area, Germany

  • -

    Berlin Area, Germany

  • -

    Berlin Area, Germany

  • -

    Berlin Area, Germany

  • -

    Berlin

  • -

    Barcelona

  • -

    Barcelona

Education

  • Universitat Politècnica de Catalunya

    Ingeniero superior de telecomunicación (Telecommunications Engineering degree)

    -

    Activities and Societies: Robotics and embedded software course during vacation. Printed circuit design and microchip PIC programming, spare time activities. Open source projects.

    Plan 92: http://www.etsetb.upc.edu/info_sobre/estudis/pla_92/eng_telecos/

Publications

  • Fairness Measures for Machine Learning in Finance

    PMR

    The authors present a machine learning pipeline for fairness-aware machine learning (FAML) in finance that encompasses metrics for fairness (and accuracy). Whereas accuracy metrics are well understood and the principal ones are used frequently, there is no consensus as to which of several available measures for fairness should be used in a generic manner in the financial services industry. The authors explore these measures and discuss which ones to focus on at various stages in the ML pipeline, pre-training and post-training, and they examine simple bias mitigation approaches. Using a standard dataset, they show that the sequencing in their FAML pipeline offers a cogent approach to arriving at a fair and accurate ML model. The authors discuss the intersection of bias metrics with legal considerations in the United States, and the entanglement of explainability and fairness is exemplified in the case study. They discuss possible approaches for training ML models while satisfying constraints imposed from various fairness metrics and the role of causality in assessing fairness.

  • Amazon SageMaker Clarify: Machine Learning Bias Detection and Explainability in the Cloud

    ACM

    Understanding the predictions made by machine learning (ML) models and their potential biases remains a challenging and labor-intensive task that depends on the application, the dataset, and the specific model. We present Amazon SageMaker Clarify, an explainability feature for Amazon SageMaker that launched in December 2020, providing insights into data and ML models by identifying biases and explaining predictions. It is deeply integrated into Amazon SageMaker, a fully managed service that enables data scientists and developers to build, train, and deploy ML models at any scale. Clarify supports bias detection and feature importance computation across the ML lifecycle, during data preparation, model evaluation, and post-deployment monitoring. We outline the desiderata derived from customer input, the modular architecture, and the methodology for bias and explanation computations. Further, we describe the technical challenges encountered and the tradeoffs we had to make. For illustration, we discuss two customer use cases. We present our deployment results including qualitative customer feedback and a quantitative evaluation. Finally, we summarize lessons learned, and discuss best practices for the successful adoption of fairness and explanation tools in practice.

  • Fairness Measures for Machine Learning in Finance

    AWS

    We present a machine learning pipeline for fairness-aware machine learning (FAML) in finance that encompasses metrics for fairness (and accuracy). Whereas accuracy metrics are well understood and the principal ones used frequently, there is no consensus as to which of several available measures for fairness should be used in a generic manner in the financial services industry. We explore these measures and discuss which ones to focus on, at various stages in the ML pipeline, pre-training and post-training, and we also examine simple bias mitigation approaches. Using a standard dataset we show that the sequencing in our FAML pipeline offers a cogent approach to arriving at a fair and accurate ML model. We discuss the intersection of bias metrics with legal considerations in the US, and the entanglement of explainability and fairness is exemplified in the case study. We discuss possible approaches for training ML models while satisfying constraints imposed from various fairness metrics, and the role of causality in assessing fairness.

  • AutoGluon-Tabular: Robust and Accurate AutoML for Structured Data

    arXiv

    We introduce AutoGluon-Tabular, an open-source AutoML framework that requires only a single line of Python to train highly accurate machine learning models on an unprocessed tabular dataset such as a CSV file. Unlike existing AutoML frameworks that primarily focus on model/hyperparameter selection, AutoGluon-Tabular succeeds by ensembling multiple models and stacking them in multiple layers. Experiments reveal that our multi-layer combination of many models offers better use of allocated training time than seeking out the best. A second contribution is an extensive evaluation of public and commercial AutoML platforms including TPOT, H2O, AutoWEKA, auto-sklearn, AutoGluon, and Google AutoML Tables. Tests on a suite of 50 classification and regression tasks from Kaggle and the OpenML AutoML Benchmark reveal that AutoGluon is faster, more robust, and much more accurate. We find that AutoGluon often even outperforms the best-in-hindsight combination of all of its competitors. In two popular Kaggle competitions, AutoGluon beat 99% of the participating data scientists after merely 4h of training on the raw data.

  • Peer to peer synchronization using vector clocks and repository update clocks

    GitHub

    A method is proposed to detect concurrent changes, conflicts and causality violations in a large set of data files or key-value pairs which are shared and synchronized across a cluster of compute nodes in which the wall clock is not necessarily synchronized. The method proposed also guarantees that on reconnection only the list of files or keys that have changed since the last synchronization is transmitted.
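
    The vector clock comparison that underlies this kind of conflict detection can be sketched as follows. This is a minimal illustration of standard vector clocks, not the publication's actual method: the function names and the dictionary representation are assumptions, and the repository update clocks mentioned above are not modeled.

```python
from typing import Dict

# Hypothetical representation: node id -> number of updates seen from that node.
VectorClock = Dict[str, int]


def increment(clock: VectorClock, node: str) -> VectorClock:
    """Return a copy of the clock with this node's counter advanced by one."""
    updated = dict(clock)
    updated[node] = updated.get(node, 0) + 1
    return updated


def merge(a: VectorClock, b: VectorClock) -> VectorClock:
    """Element-wise maximum, used after two replicas synchronize."""
    return {n: max(a.get(n, 0), b.get(n, 0)) for n in set(a) | set(b)}


def compare(a: VectorClock, b: VectorClock) -> str:
    """Classify the causal relation between two versions of the same key."""
    nodes = set(a) | set(b)
    a_le_b = all(a.get(n, 0) <= b.get(n, 0) for n in nodes)
    b_le_a = all(b.get(n, 0) <= a.get(n, 0) for n in nodes)
    if a_le_b and b_le_a:
        return "equal"
    if a_le_b:
        return "before"      # a happened before b
    if b_le_a:
        return "after"       # b happened before a
    return "concurrent"      # neither dominates: a conflict to resolve


# Two nodes edit the same file starting from a shared version.
base = increment({}, "A")
on_a = increment(base, "A")
on_b = increment(base, "B")
relation = compare(on_a, on_b)
```

    Here `relation` is "concurrent" because each node made an update the other has not seen, which is the case such a method would flag as a conflict without relying on synchronized wall clocks.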

  • GMM Based multimodal biometric identification

    http://www.enterface.net/enterface05/docs/results/reports/project5.pdf

    Gaussian Mixture Model expectation maximization model for sensor fusion.

    Other authors
    • Yannis Stylianou
    • Yannis Pantazis
    • Felipe Calderero
    • Francois Severin
    • Rolando Bonal
    • Federico Matta
    • Athanasios Valsamakis

Patents

  • Map data compatibility processing architecture

    Filed US US20180239828A1

    Systems and methods are provided for executing a filter on map data. The filter receives a first notification that a version of first map data from a first map data source is available. The filter determines that the version of first map data is compatible using one or more version rules stored in the filter. The filter processes the version of first map data, when the version of first map data is compatible. The filter generates a second notification that a processed version of first map data is available.
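
    The notify, check, process, notify flow described in this abstract could look roughly like the sketch below. Every name here (`Notification`, `MapDataFilter`, `on_notification`) is hypothetical, and requiring all version rules to pass is an assumption; the abstract only says that one or more rules are stored in the filter.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional


@dataclass(frozen=True)
class Notification:
    """Announces that a version of map data is available from a source."""
    source: str
    version: str


@dataclass
class MapDataFilter:
    name: str
    # Each rule is a predicate over a version string (illustrative only).
    version_rules: List[Callable[[str], bool]] = field(default_factory=list)

    def is_compatible(self, version: str) -> bool:
        # Assumption: a version is compatible only if every rule accepts it.
        return all(rule(version) for rule in self.version_rules)

    def process(self, version: str) -> str:
        # Placeholder for the real map data transformation.
        return f"{self.name}:{version}"

    def on_notification(self, note: Notification) -> Optional[Notification]:
        """Process compatible data and announce the result; otherwise skip."""
        if not self.is_compatible(note.version):
            return None
        self.process(note.version)
        return Notification(source=self.name, version=note.version)


# A filter that only accepts 2.x map data.
only_v2 = MapDataFilter("filter-1", [lambda v: v.startswith("2.")])
accepted = only_v2.on_notification(Notification("provider-a", "2.1"))
rejected = only_v2.on_notification(Notification("provider-a", "1.9"))
```

    In this sketch `accepted` is a second notification emitted by the filter, while `rejected` is `None` because the incompatible version is never processed.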

  • Fresh hybrid routing independent of map version and provider

    Issued US US9874451B2

    Systems, methods, and apparatuses are described for providing fresh hybrid routing independent of map version and provider. A set of routing data is received in response to a routing request. The set of routing data includes road segments. An analysis may be performed of a local map and the set of routing data. At least one unmatched road segment between the local map and the set of routing data is identified based on the analysis. A request for update data for the at least one unmatched road segment is made. Using the local map, the set of routing data, and the update data for the at least one unmatched road segment a navigation action is generated.

  • Method for identifying and diagnosing interferences in RF signals and particularly television signals

    Filed EU EP2048801

    A method for identifying and diagnosing interferences in television signals, whether analogue or digital, in which, for each frequency, frequency interval (1, 2, 3, 4) or carrier wave (P1 to P9,...), there is a predetermined optimal level of a spectrum control variable (D, A), based on time (spectrum), such as for example, intensity, radiated power, MER, BER, etc.


Courses

  • Algorithms I (Coursera - Princeton)

    -

  • Algorithms II (Coursera - Princeton)

    -

  • Architecture of operating systems

    11512

  • Artificial Intelligence (Udacity)

    -

  • Databases (Bases de dades)

    M2009

  • Computational Finance

    COMP510

  • Computational Investing, Part I

    -

  • Concurrent programming

    11520

  • Data mining

    UOC

  • Data transmission, cryptography and cryptology

    11557

  • Functional Programming Principles in Scala (Coursera)

    -

  • Game theory (Coursera)

    -

  • Machine Learning (Andrew Ng)

    -

  • Machine Learning From Data

    230625

  • Microcontroller and PCB design for robotics

    -

  • Networks and communication services

    11522

  • Optical communications

    11513

  • Principles of Reactive Programming

    -

  • Radio communications

    11521

  • Team skills training and communication (Nokia)

    -

Languages

  • Spanish

    Native or bilingual proficiency

  • English

    Native or bilingual proficiency

  • German

    Limited working proficiency

  • Catalan

    Professional working proficiency
