Jay Joshi

Santa Clara, California, United States
4K followers 500+ connections

Carnegie Mellon University

About

My key technical skills include, but are not limited to:

Java, Python, AWS…

Activity

Jack announced a 40% cut in their workforce because of 'Intelligence Tools' that are accelerating the way of working with flatter teams. I think…

Liked by Jay Joshi
Hard truth! Happy Wednesday!

Liked by Jay Joshi

Experience

LinkedIn

Mountain View, CA
-

San Francisco Bay Area
-

San Francisco Bay Area

Education

Carnegie Mellon University

-

2017 - 2018

Courses

Applied Science – I

107002
Applied Science – II

107009
Basic Electrical Engineering

103004
Computer Networks

310250
Data Communications

-
Data Structures and Algorithms

210244
Database Management Systems

310241
Design & Analysis Algorithm

410441
Digital Electronics and Logic Design

210243
Digital Signal Processing

-
Discrete Structures

210241
Engineering Mathematics – I

107001
Engineering Mathematics – II

107008
Engineering Mathematics – III

207003
Engineering Mechanics

101010
Fundamentals of Programming Languages

110003
Humanities and Social science

207005
Microprocessors and Microcontrollers

-
Programming & problem solving

210242
Business Analytic & Intelligence

410451
Cloud Computing

15619
Computer Architecture and Organization

210252
Computer Graphics

210251
Data Mining Technology & Application

410444D
Data Structures

210225
Data Structures

210250
Data Structures for Application Programmers

17683
Finance and Management Information Systems

310251
J2EE Web Application Development

17682
Java for Application Programmers

17681
Law of Computer Technology

17662
Microprocessors and Interfacing Techniques

210249
Microprocessors and interfacing

210254
Object Oriented Programming & Computer Graphics Laboratory

210253
Parallel and High Performance Computing

410449
Pervasive Computing

410445B
Principles of Compiler Design

410442
Principles of Programming Languages

310249
Smart System Design & Application

410443
Software Design Method & Test

410448
Systems Programming & Operating System

310252
Theory of Computation

-

Projects

Big Data Analysis - Kaggle Challenge of Home Depot's Product Search Relevance

May 2018

Improved Home Depot customers' shopping experience by developing a model to accurately predict the relevance of search results, using Apache Spark's MLlib and Python's numpy and pandas libraries in an interactive Jupyter notebook. Created a basic pipeline of Tokenizer, HashingTF, and Linear Regression, then iteratively improved performance by experimenting with other transformers and estimators such as Word2Vec and Random Forest. Preprocessed the data using stemming and a stop words remover. Used cosine similarity, Jaccard similarity, and matched words to compute the resulting feature vectors. Improved performance using Spark's parameter grid tuning and deployed the Spark job on a YARN (Hadoop) cluster. Achieved a final root mean square error of 0.506 within a deadline of 2 weeks (for context, the best RMSE among the 2125 teams that participated was 0.432).

Tools, technologies, and APIs used: Apache Spark's MLlib, Python's pandas and numpy libraries, Jupyter/Zeppelin notebook, Anaconda Python 3 distribution, Hortonworks Data Platform, HDFS
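The similarity features and the error metric named above can be illustrated with a small plain-Python sketch. This is not the project's Spark code: the whitespace tokenizer stands in for real stemming and stop word removal, and the function names and sample strings are assumptions made for illustration only.

```python
import math
from collections import Counter


def tokenize(text):
    """Lowercase and split on whitespace (a stand-in for real stemming)."""
    return text.lower().split()


def jaccard_similarity(query, title):
    """Size of the token-set intersection divided by the size of the union."""
    a, b = set(tokenize(query)), set(tokenize(title))
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def cosine_similarity(query, title):
    """Cosine of the angle between the two term-count vectors."""
    a, b = Counter(tokenize(query)), Counter(tokenize(title))
    dot = sum(a[t] * b[t] for t in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def matched_words(query, title):
    """Number of distinct query tokens that also appear in the title."""
    return len(set(tokenize(query)) & set(tokenize(title)))


def rmse(predicted, actual):
    """Root mean square error between two equal-length sequences."""
    squared = sum((p - a) ** 2 for p, a in zip(predicted, actual))
    return math.sqrt(squared / len(actual))


# Hypothetical search query and product title.
features = (
    jaccard_similarity("angle bracket", "steel angle bracket"),
    cosine_similarity("angle bracket", "steel angle bracket"),
    matched_words("angle bracket", "steel angle bracket"),
)
```

Each query and product pair yields one such feature tuple, which a regression model can then map to a relevance score. Lower RMSE means predictions are closer to the labeled relevance.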
Auto Text Completion engine

Apr 2018

- Designed and implemented a MapReduce solution to pre-process a large text-based dataset for data cleaning purposes.
- Built a probabilistic language model using MapReduce batch processing to store the count and probability of phrases up to 5-grams in a 10 GB dataset.
- Configured and deployed a storage caching layer using Amazon ElastiCache and Redis (Jedis) to improve the performance of the text completion.
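The n-gram counting and probability steps above can be sketched in a single process, with plain functions standing in for the MapReduce map and reduce phases. The function names and the tiny corpus are hypothetical, and the caching layer is not shown.

```python
from collections import defaultdict


def map_ngrams(line, max_n=5):
    """Map step: emit (ngram, 1) for every 1..max_n word sequence in a line."""
    words = line.lower().split()
    for n in range(1, max_n + 1):
        for i in range(len(words) - n + 1):
            yield " ".join(words[i:i + n]), 1


def reduce_counts(pairs):
    """Reduce step: sum the counts emitted for each n-gram."""
    counts = defaultdict(int)
    for ngram, count in pairs:
        counts[ngram] += count
    return dict(counts)


def next_word_probabilities(counts, prefix):
    """Estimate P(word | prefix) as count(prefix + word) / count(prefix)."""
    prefix_count = counts.get(prefix, 0)
    if prefix_count == 0:
        return {}
    length = len(prefix.split()) + 1
    probs = {}
    for ngram, count in counts.items():
        words = ngram.split()
        if len(words) == length and " ".join(words[:-1]) == prefix:
            probs[words[-1]] = count / prefix_count
    return probs


# Hypothetical three-line corpus.
corpus = ["the cat sat", "the cat ran", "the dog sat"]
counts = reduce_counts(pair for line in corpus for pair in map_ngrams(line))
suggestions = next_word_probabilities(counts, "the")
```

A completion engine would return the highest-probability words for the typed prefix. In a real deployment those lookups would typically be served from a cache rather than recomputed per request.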
Uber-like driver matching service

Apr 2018

- Generated a stream of data using Kafka producer and made it available for a Samza consumer on AWS.
- Designed and implemented a solution for a driver matching service like Uber by joining and processing multiple real-time streams of GPS data and driver data using the Samza API.
- Implemented an auto-scaling cluster with AWS APIs to dynamically adjust server instances based on real-time load.
Consistency Models and Multithreading

Mar 2018

- Compared and contrasted the advantages and disadvantages of using replication in distributed key-value stores.
- Studied the pros and cons of different techniques to improve consistency, availability, and partitioning in a system.
- Discussed the various levels of consistency that can be employed in a distributed data store.
- Used multithreading to achieve strong and eventual consistency models for a distributed key-value store in different geographic regions.
Data Mining CRM - Product Selection and Success Prediction

Mar 2018

Implemented popular Data Mining algorithms such as K-Nearest Neighbors and Decision Trees, to predict product selection information, as well as the potential success of the newly introduced products, based on buying habits of existing customers and their profiles, and historical data on the sales volume of past products. Cross-validated the algorithms on the given training set and optimized weights programmatically to get maximum accuracy on the training data. Used the optimized weights that yielded the highest accuracy on the given test data set to predict the product selection (89% accuracy) and success likelihood (97% accuracy).
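A minimal sketch of K-Nearest Neighbors classification with per-feature weights follows. It assumes that the "weights" mentioned above scale each feature's contribution to the distance, which the description does not confirm. The customer records, labels, and function names are hypothetical.

```python
import math
from collections import Counter


def weighted_distance(a, b, weights):
    """Euclidean distance with a per-feature weight."""
    return math.sqrt(sum(w * (x - y) ** 2 for x, y, w in zip(a, b, weights)))


def knn_predict(train, query, weights, k=3):
    """Majority label among the k training rows closest to the query."""
    nearest = sorted(train, key=lambda row: weighted_distance(row[0], query, weights))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


def accuracy(train, test, weights, k=3):
    """Fraction of test rows whose predicted label matches the true label."""
    hits = sum(knn_predict(train, features, weights, k) == label for features, label in test)
    return hits / len(test)


# Hypothetical customers: (age, income in thousands) -> chosen product.
train = [
    ((25, 40), "basic"), ((30, 45), "basic"), ((28, 42), "basic"),
    ((50, 120), "premium"), ((55, 130), "premium"), ((48, 110), "premium"),
]
test = [((27, 41), "basic"), ((52, 125), "premium")]
score = accuracy(train, test, weights=(1.0, 1.0), k=3)
```

Tuning would repeat the `accuracy` call over candidate weight vectors on held-out folds and keep the best-scoring one.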
Horizontal Scaling and Auto-Scaling Web Application

Feb 2018

• Implemented a horizontally scaling web application with GCP, Azure, and AWS APIs able to process a load of 3000+ RPS
• Implemented an auto-scaling web application with AWS APIs to dynamically adjust server instances based on real-time load
Contextual Design and UI Testing

Nov 2017

• Designed the user tasks and interviewer script of a contextual inquiry for existing drug store websites
• Implemented and improved user interface of a hypothetical online drug store based on contextual inquiry, heuristic analysis, and usability testing
Networking solutions for vehicles and infrastructure

Oct 2017

Developed a vehicle-to-vehicle (V2V) networking and communication solution.
Developed a networking solution for vehicle-to-infrastructure communication.
Developed a multi-modular Infotainment system.

1. Proposed application for Real-Time routing and weather related navigation using DSRC technology.
2. Applications within project explored DSRC, Over the Air, CAN, ECU and Wi-Fi technologies.
IoT Solutions for Contemporary Healthcare (Ubiquitous Computing)

Sep 2017

Developed contemporary IoT solutions for a healthcare institution.

1. Proposed applications in Surgery, Human Resources, Cost accounting and food services.
2. Charted cost estimation for the complete project.
3. Led a team of five people
Cloud Based Health and video Service

Oct 2017 - Oct 2018

- Created a simulated system to score users based on their dietary habits, as part of a cloud computing project at CMU, using AWS EC2, S3, SNS, Lambda, RDS (MySQL), Rekognition, Docker and Clarifai.
- Modified the existing architecture into a YouTube-like video service that lets users search videos by their content, using the labels generated in the previous step and CloudSearch. Users can also preview a video by hovering over it.
Twitter Analytics Web Service

Mar 2018 - Apr 2018

- Implemented a web-service to extract tweets and users given trending topics, hashtags and a time frame.
- Designed and implemented a high performance, fault-tolerant and scalable cloud deployment strategy responding to live load while meeting infrastructure and budgetary needs.
- Performed ETL on a 1 TB dataset to load data into MySQL and HBase systems using MapReduce and Spark frameworks on AWS, GCP, and Azure.
- Raised the throughput of the service from 3,000 RPS to 10,000 RPS by modeling effective schemas, sharding the database and optimizing server threads while utilizing the same resources.
- Configured the service to handle data from all languages, including emoji.
- Deployed the web service using Docker images on Kubernetes across multiple cloud services.
Search Engine Optimization

Feb 2018 - Mar 2018

Performed competitor analysis and keyword research using Google Trends, and computed inverse document frequency to identify attractor and discriminatory terms. Raised a low-ranked website into the top 3 results in CMU's Indri Search Engine through URL structure changes, targeted HTML changes, keyword-concentrated content to increase term frequency, and link building.
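The inverse document frequency step can be sketched as follows. This uses the plain textbook form, idf(t) = log(N / df(t)), which is an assumption; the exact variant used in the project or by Indri is not stated. The sample pages and names are hypothetical.

```python
import math


def inverse_document_frequency(documents):
    """idf(t) = log(N / df(t)), where df(t) counts documents containing t."""
    total = len(documents)
    doc_freq = {}
    for doc in documents:
        # Use a set so a term repeated within one page is counted once.
        for term in set(doc.lower().split()):
            doc_freq[term] = doc_freq.get(term, 0) + 1
    return {term: math.log(total / df) for term, df in doc_freq.items()}


# Hypothetical competitor pages.
pages = [
    "cheap flights to pittsburgh",
    "cheap hotels in pittsburgh",
    "cheap car rental",
]
idf = inverse_document_frequency(pages)
ranked = sorted(idf, key=idf.get, reverse=True)
```

Terms that appear on every page score zero and do little to distinguish a site, while rarer terms score higher and are better candidates for discriminatory keywords.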
Social Networking Timeline with Heterogeneous Backends

Feb 2018 - Mar 2018

- Compared the advantages and disadvantages of utilizing flat files, SQL databases, and NoSQL database solutions.
- Performed ETL for a dataset.
- Integrated SQL and NoSQL databases to build a social networking website.
- Responded to complex queries that span multiple databases.
- Implemented user access functionality such as login and logout on SQL.
- Stored a social graph using HBase.
- Built a timeline using MongoDB.
- Built a recommendation engine for the social network.
Big Data Analytics

Jan 2018 - Feb 2018

- Processed a large text dataset (Wikipedia) using MapReduce running within distributed frameworks on AWS (EMR), Azure (HDInsight) and GCP.
- Derived popular trends based on tags such as "Donald Trump".
Mutual Fund Web Application

Jan 2018 - Feb 2018

• Implemented a web application that allows users or super users to buy, sell or manage Mutual Funds
• Led the software design and implementation of the MVC components and deployment on the AWS Cloud
• Implemented a RESTful web service based on this web application
Locating targets through mentions in Twitter (Python) using Sentiment Analysis

Jan 2015 - Apr 2015

- Categorized a pool of Twitter users by location, tweet content, social factors (number of followers, social influence), and user activity to generate a list of top users likely to retweet.
- In addition, performed sentiment analysis on the tweets to match the most relevant tweets using NLP and NLTK, and ranked the users using a modified version of a Support Vector Machine.
Bus Tracker Hybrid application (iOS and Android) for Port Authority of Allegheny County

-

Developed an Android application for bus tracking in Port Authority of Allegheny County. Core functionality includes: listing the ETA of all buses (via the PAAC API) for a bus stop clicked on the map (rendered using the Google Maps API), and a search toolbar, powered by the Google Places API, that searches for the input destination and displays all possible buses to that destination from the current location. In addition, an alarm feature notifies the user at a custom time prior to the actual arrival of a bus at a chosen stop.

Tools, technologies, and APIs used: Android SDK, Android Studio, Java, Google Maps API, Google Places API, PAAC API.

Recommendations received

Andrii Korotkov

“I had an opportunity to work with Jay on the Tools team as a part of Infrastructure team at Verkada. He's good at learning new areas and can operate both in a mode to deliver results fast and taking longer time with higher quality. He's good at investigating and fixing things, doing exploration of domains. He's also kind and respectful.”

1 person has recommended Jay
