Sarah Aerni

Sarah Aerni · 2026-04-23T05:57:23.647Z

Incredible opportunities!!! Encourage you to take a look

San Francisco, California, United States
10K followers 500+ connections

Intuit

Stanford University

About

Technology leader with experience in Machine Learning for over 15 years. 9 years in…

Activity

LLMs need a trusted platform. That's why 90% of the top AI companies run their businesses on Salesforce, along with these leading enterprises…

Liked by Sarah Aerni
A few days in-person with our Intuit team in San Diego, shaping a sharper view of where the work goes next. The industry is in the middle of a real…

Liked by Sarah Aerni
Just got back from #TDX2026 — and MAN what an awesome experience. The speed that we, (as a MASSIVE company), are innovating in the AI space, is…

Liked by Sarah Aerni


Experience

Intuit

California, United States

San Francisco

San Francisco, California, United States

San Francisco Bay Area

San Francisco Bay Area

San Francisco

Education

Stanford University

2006 - 2012

Activities and Societies: Founding member of the Stanford Association for Multi-Disciplinary Medicine and Science (SAMMS), organizing committee member for Biomedical Computing at Stanford (a student-run conference), elected student representative to the executive committee of the BMI program, organizing industry panels.

Primary Faculty Advisor: Serafim Batzoglou
Co-advisor: Stuart Kim
William R. Hewlett Fellow (Stanford Graduate Fellowships), National Science Foundation Graduate Research Fellow

2011 - 2011

2009 - 2009

2001 - 2005

Volunteer Experience

Tutor

Boys & Girls Clubs of America
Tutor

Stanford ScienceBus

Publications

A Bioinformatics Guide for Molecular Biologists

Cold Spring Harbor Laboratory Press June 1, 2014
Informatics can vastly assist progress in research and development in cell and molecular biology and biomedicine. However, many investigators are either unaware of the ways in which informatics can improve their research or find it inaccessible due to a feeling of “informatics anxiety.” This sense of apprehension results from improper communication of the principles behind these approaches and of the value of the many tools available. In fact, many researchers are inherently distrustful of these tools. A more complete understanding of bioinformatics offered in A Bioinformatics Guide for Molecular Biologists will allow the reader to become comfortable with these techniques, encouraging their use—thus helping to make sense of the vast accumulation of data. To make these concepts more accessible, the editors approach the field of bioinformatics from the viewpoint of a molecular biologist, (1) arming the biologist with a basic understanding of the fundamental concepts in the field, (2) presenting approaches for using the tools from the standpoint of the data for which they are created, and (3) showing how the field of informatics is quickly adapting to the advancements in biology and biomedical technologies. All concepts are paired with recommendations for the appropriate programming environment and tools best suited to solve the particular problem at hand. It is a must-read for those interested in learning informatics techniques required for successful research and development in the laboratory.

Automated Cellular Annotation for High Resolution Images of Adult C. elegans

Bioinformatics [ISMB/ECCB] 2013 July 1, 2013
Motivation: Advances in high-resolution microscopy have recently made possible the analysis of gene expression at the level of individual cells. The fixed lineage of cells in the adult worm Caenorhabditis elegans makes this organism an ideal model for studying complex biological processes like development and aging. However, annotating individual cells in images of adult C. elegans typically requires expertise and significant manual effort. Automation of this task is therefore critical to enabling high-resolution studies of a large number of genes.

Results: In this article, we describe an automated method for annotating a subset of 154 cells (including various muscle, intestinal and hypodermal cells) in high-resolution images of adult C. elegans. We formulate the task of labeling cells within an image as a combinatorial optimization problem, where the goal is to minimize a scoring function that compares cells in a test input image with cells from a training atlas of manually annotated worms according to various spatial and morphological characteristics. We propose an approach for solving this problem based on reduction to minimum-cost maximum-flow and apply a cross-entropy–based learning algorithm to tune the weights of our scoring function. We achieve 84% median accuracy across a set of 154 cell labels in this highly variable system. These results demonstrate the feasibility of the automatic annotation of microscopy-based images in adult C. elegans.

Automated cellular annotation for high-resolution images of adult Caenorhabditis elegans

Bioinformatics 2013
Reconstructing cancer genomes from paired-end sequencing data

BMC Bioinformatics April 19, 2012
Reconstruction of genealogical relationships with applications to Phase III of HapMap.

Bioinformatics [ISMB/ECCB] 2011
*Authors should be regarded as joint First Authors.

Analysis of gene regulation and cell fate from single-cell gene expression profiles in C. elegans

Cell October 30, 2009
The C. elegans cell lineage provides a unique opportunity to look at how cell lineage affects patterns of gene expression. We developed an automatic cell lineage analyzer that converts high-resolution images of worms into a data table showing fluorescence expression with single-cell resolution. We generated expression profiles of 93 genes in 363 specific cells from L1 stage larvae and found that cells with identical fates can be formed by different gene regulatory pathways. Molecular signatures identified repeating cell fate modules within the cell lineage and enabled the generation of a molecular differentiation map that reveals points in the cell lineage when developmental fates of daughter cells begin to diverge. These results demonstrate insights that become possible using computational approaches to analyze quantitative expression from many genes in parallel using a digital gene expression atlas.

BJ Raphael, S Volik, P Yu, C Wu, G Huang, EV Linardopoulou, BJ Trask, FM Waldman, J Costello, KJ Pienta, GB Mills, K Bajsarowicz, Y Kobayashi, S Shivaranjani, P Paris, Q Tao, SJ Aerni, RP Brown, A Bashir, JW Gray, JF Cheng, P de Jong, M Nefedov, T Ried, H

BT Messmer*, B Raphael*, SJ Aerni, GF Widhopf, LZ Rassenti, JG Gribben, NE Kay, TJ Kipps "Computational Identification Of CDR3 Sequence Archetypes Among Immunoglobulin Sequences in Chronic Lymphocytic Leukemia" Leukemia Research, Volume 33, Issue 3, Pages

Reconstructing Cancer Genome Organization

BMC Bioinformatics
A cancer genome is derived from the germline genome through a series of somatic mutations. Somatic structural variants - including duplications, deletions, inversions, translocations, and other rearrangements - result in a cancer genome that is a scrambling of intervals, or "blocks" of the germline genome sequence. We present an efficient algorithm for reconstructing the block organization of a cancer genome from paired-end DNA sequencing data.

We demonstrate that PREGO efficiently identifies complex and biologically relevant rearrangements in cancer genome sequencing data. An implementation of the PREGO algorithm is available at http://compbio.cs.brown.edu/software/.

SJ Aerni, E Eskin, “10 Years of the International Conference on Research in Computational Molecular Biology (RECOMB)”, RECOMB 2006: 546-562


Patents

AUTOMATIC DETERMINATION OF ALTERNATIVE PATHS FOR A PROCESS FLOW USING MACHINE LEARNING

Issued October 1, 2024 US 12,105,725
Methods, systems, apparatuses, devices, and computer program products are described. A system may identify, from an event log including log entries for a tenant of a multi-tenant database system, a pattern of log entries corresponding to main actions and satisfying a frequency threshold. The system may identify log entries associated with the pattern and corresponding to the main actions, detailed actions, or both. The system may retrieve data corresponding to a history field of a data object associated with the pattern and may determine at least a portion of a process flow for the data object according to the pattern and based on the log entries and the historical data. The process flow may include operations to perform using the data object. In some cases, the system may transmit, to a user device, an indication of the portion of the process flow for user review and implementation.

RAPID PROCESSING OF BIOLOGICAL SEQUENCE DATA

Issued July 11, 2017 US 9,703,925
In general, one aspect of the subject matter described in this specification is embodied in operations of processing sequence data by selecting a distribution key according to a type of one or more tasks to be performed on the data. The key is one or more data fields of a sequence data file, e.g., a sequence alignment/map (SAM) format or binary sequence alignment/map (BAM) format file, or derived from one or more data fields of a sequence data file. The sequence data is then distributed to multiple nodes of a parallel processing relational database system. The system performs the tasks of processing the sequence data by executing database queries. The system executes the database queries on multiple nodes in parallel. The system can use query optimization functions built into the database to expedite performance of each task.

IN-DATABASE SINGLE-NUCLEOTIDE GENETIC VARIANT ANALYSIS

Issued March 14, 2017 US 9,594,777
Genetic data in row-wise flat files, such as VCF and VCF-like files, comprising a plurality of data elements of different types is analyzed using a parallel framework in an MPP shared-nothing distributed database having a plurality of distributed segments by first parsing the data into groups of data elements of the same types, converting the data into entry-wise genetic data such that the same types of data elements are in a column, and distributing and storing the entry-wise genetic data in the distributed segments. SQL database queries are used to analyze the genetic data, including locating probable significant associations between genotype and phenotype data.

ELEMENT IDENTIFICATION IN DATABASE

Issued February 14, 2017 US 9,569,464
This document describes, among other things, a computer-implemented method. The method includes obtaining a structured data object having a plurality of nodes that represent elements in the data object. One or more tables that define a table representation of the data object can be generated. The one or more tables can include a plurality of table entries that correspond to the plurality of nodes, respectively. For each of one or more first nodes from among the plurality of nodes, the method can include identifying information about one or more second nodes that are determined to be adjacent or otherwise related to the first node by performing window functions along two or more coordinate systems in the one or more tables. The window function can be centered on a particular table entry that corresponds to the first node of the data object.


Courses

Machine Learning at Stanford University

CS 229

Languages

English

Native or bilingual proficiency

German

Native or bilingual proficiency

Swiss German

Native or bilingual proficiency

French

Limited working proficiency

More activity by Sarah

Incredible opportunities!!! Encourage you to take a look

Shared by Sarah Aerni
