Soroush J. Pour
San Francisco, California, United States
5K followers
500+ connections
About
Building Harmony Intelligence, where we provide AI-powered, human-verified white-box…
Experience
Education
-
ARENA (Alignment Research Engineer Accelerator)
-
https://www.arena.education/ - A 5-week intensive program focused on accelerated learning of the technical fundamentals of AI safety research engineering. Shared an office & collaborated with the SERI MATS cohort. Coursework covered fundamentals of:
* Deep learning
* Transformer architecture
* Reinforcement learning
* GPU training: per-GPU optimisation (incl. CUDA kernel programming) & distributed training
Capstone project: automated red-teaming of LLMs (to be published soon)
Publications
-
The AI Risk Repository: A Comprehensive Meta-Review, Database, and Taxonomy of Risks From Artificial Intelligence
arXiv preprint arXiv:2408.12622
The risks posed by Artificial Intelligence (AI) are of considerable concern to academics, auditors, policymakers, AI companies, and the public. However, a lack of shared understanding of AI risks can impede our ability to comprehensively discuss, research, and react to them. This paper addresses this gap by creating an AI Risk Repository to serve as a common frame of reference. This comprises a living database of 777 risks extracted from 43 taxonomies, which can be filtered based on two overarching taxonomies and easily accessed, modified, and updated via our website and online spreadsheets. We construct our Repository with a systematic review of taxonomies and other structured classifications of AI risk followed by an expert consultation. We develop our taxonomies of AI risk using a best-fit framework synthesis. Our high-level Causal Taxonomy of AI Risks classifies each risk by its causal factors (1) Entity: Human, AI; (2) Intentionality: Intentional, Unintentional; and (3) Timing: Pre-deployment; Post-deployment. Our mid-level Domain Taxonomy of AI Risks classifies risks into seven AI risk domains: (1) Discrimination & toxicity, (2) Privacy & security, (3) Misinformation, (4) Malicious actors & misuse, (5) Human-computer interaction, (6) Socioeconomic & environmental, and (7) AI system safety, failures, & limitations. These are further divided into 23 subdomains. The AI Risk Repository is, to our knowledge, the first attempt to rigorously curate, analyze, and extract AI risk frameworks into a publicly accessible, comprehensive, extensible, and categorized risk database. This creates a foundation for a more coordinated, coherent, and complete approach to defining, auditing, and managing the risks posed by AI systems.
-
Scalable and Transferable Black-Box Jailbreaks for Language Models via Persona Modulation
Accepted to NeurIPS SoLaR 2023
Despite efforts to align large language models to produce harmless responses, they are still vulnerable to jailbreak prompts that elicit unrestricted behaviour. In this work, we investigate persona modulation as a black-box jailbreaking method to steer a target model to take on personalities that are willing to comply with harmful instructions. Rather than manually crafting prompts for each persona, we automate the generation of jailbreaks using a language model assistant. We demonstrate a range of harmful completions made possible by persona modulation, including detailed instructions for synthesising methamphetamine, building a bomb, and laundering money. These automated attacks achieve a harmful completion rate of 42.5% in GPT-4, which is 185 times larger than before modulation (0.23%). These prompts also transfer to Claude 2 and Vicuna with harmful completion rates of 61.0% and 35.9%, respectively. Our work reveals yet another vulnerability in commercial large language models and highlights the need for more comprehensive safeguards.
Projects
-
Bitcoin Multisignature Transaction Builder
Built an open-source Go implementation of Bitcoin protocol multisignature transactions. Focused on usability and maintainability, with a full suite of tests, complete GoDoc-style documentation, and examples for the CLI interface. Blog post: http://bit.ly/1CQLwHA
-
Built InMoov Humanoid Robotic Arm
3D printed and assembled the InMoov Humanoid Robotic Arm as detailed at http://www.inmoov.fr/. We succeeded in getting full finger, wrist, bicep, and shoulder articulation. We hooked the arm up to an Arduino and wrote a script to control it from the command line.
-
Dubbit Cartoon Creator App
Built an iOS app that allows users to create animated cartoons and share them with friends. We used the Cocos2D library on the client side, ZeroMQ to communicate with a Python backend server that converted character position coordinates into .mp4 movie files using a Pygame rendering engine and FFmpeg, and the YouTube API for hosting.
-
Mobile Live Chat Support
Began development on a mobile live chat iOS plug-in that would enable live chat integration into iOS apps. Built with a Node.js backend supporting XMPP, and a frontend library and UI written in Objective-C.
Honors & Awards
-
LLM Evals Hackathon - Honourable Mention - AI Pentester
AGI House (https://agihouse.ai/)
Alex Browne and I won an honourable mention for our GPT-4 AI pentesting agent ("AI Pentester"), which was coded up in just ~5 hours & was able to exploit a vulnerability in a target server given only access to a base Kali Linux bash instance and basic instructions -- nothing else. The vulnerability was a trivial, known CVE, but it still impressed us that it worked at all, especially with such minimal development time on our part!
-
Frank Borchardt Prize for Undergraduate Entrepreneurship
Duke Innovation & Entrepreneurship
$20,000 grant to support the top undergrad entrepreneurs at Duke. My collaborators (Fabio Berger, Alex Browne) & I were the inaugural winners for our startup work throughout our college years.
https://entrepreneurship.duke.edu/borchardt-prize/
-
Magna cum Laude
Duke University
-
Robertson Scholarship
Robertson Scholars Program
The Robertson Scholars Leadership Program invests in young leaders who strive to make transformational contributions to society.
The scholarship provided:
- Four-year scholarship, including undergraduate tuition, room and board
- The opportunity to attend classes at both Duke and UNC-Chapel Hill
- Three summers of domestic and international experiences
- Customized leadership and professional development
http://robertsonscholars.org/
Languages
-
English
Native or bilingual proficiency
-
Japanese
Limited working proficiency
-
Persian
Limited working proficiency
-
Spanish
Limited working proficiency