“I had an opportunity to work with Jay on the Tools team as a part of Infrastructure team at Verkada. He's good at learning new areas and can operate both in a mode to deliver results fast and taking longer time with higher quality. He's good at investigating and fixing things, doing exploration of domains. He's also kind and respectful.”
About
My key technical skill set includes, but is not limited to:
Java, Python, AWS…
Activity
-
Jack announced a 40% cut in their workforce because of 'Intelligence Tools' that are accelerating the way of working with flatter teams. I think…
Liked by Jay Joshi
Experience
Education
Courses
-
Applied Science – I
107002
-
Applied Science – II
107009
-
Basic Electrical Engineering
103004
-
Computer Networks
310250
-
Data Communications
-
-
Data Structures and Algorithms
210244
-
Database Management Systems
310241
-
Design & Analysis Algorithm
410441
-
Digital Electronics and Logic Design
210243
-
Digital Signal Processing
-
-
Discrete Structures
210241
-
Engineering Mathematics – I
107001
-
Engineering Mathematics – II
107008
-
Engineering Mathematics – III
207003
-
Engineering Mechanics
101010
-
Fundamentals of Programming Languages
110003
-
Humanities and Social science
207005
-
Microprocessors and Microcontrollers
-
-
Programming & problem solving
210242
-
Business Analytic & Intelligence
410451
-
Cloud Computing
15619
-
Computer Architecture and Organization
210252
-
Computer Graphics
210251
-
Data Mining Technology & Application
410444D
-
Data Structures
210225
-
Data Structures
210250
-
Data Structures for Application Programmers
17683
-
Finance and Management Information Systems
310251
-
J2EE Web Application Development
17682
-
Java for Application Programmers
17681
-
Law of Computer Technology
17662
-
Microprocessors and Interfacing Techniques
210249
-
Microprocessors and interfacing
210254
-
Object Oriented Programming & Computer Graphics Laboratory
210253
-
Parallel and High Performance Computing
410449
-
Pervasive Computing
410445B
-
Principles of Compiler Design
410442
-
Principles of Programming Languages
310249
-
Smart System Design & Application
410443
-
Software Design Method & Test
410448
-
Systems Programming & Operating System
310252
-
Theory of Computation
-
Projects
-
Big Data Analysis - Kaggle Challenge of Home Depot's Product Search Relevance
Improved Home Depot's customers' shopping experience by developing a model that accurately predicts the relevance of search results, using Apache Spark's MLlib and Python's numpy and pandas libraries in an interactive Jupyter notebook. Created a basic pipeline of Tokenizer, HashingTF and Linear Regression, then iteratively improved performance by experimenting with other transformers and estimators such as Word2Vec and Random Forest. Preprocessed the data using stemming and a stop words remover. Used cosine similarity, Jaccard similarity and matched words to compute the resulting feature vectors. Improved performance further using Spark's parameter grid tuning and deployed the Spark job on a YARN (Hadoop) cluster. Achieved a final root mean square error (RMSE) of 0.506 within a deadline of 2 weeks (for context, the best RMSE among the 2125 teams that participated was 0.432).
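The similarity features mentioned above can be sketched in plain Python. This is a minimal illustration, not the project's Spark code: the function names, the whitespace tokenizer (standing in for the stemming and stop-word steps) and the sample query and title are all assumptions made for the example.

```python
import math
from collections import Counter


def tokenize(text):
    """Lowercase and split on whitespace (a stand-in for stemming and stop-word removal)."""
    return text.lower().split()


def jaccard_similarity(a, b):
    """Size of the token-set intersection divided by the size of the union."""
    set_a, set_b = set(a), set(b)
    union = set_a | set_b
    return len(set_a & set_b) / len(union) if union else 0.0


def cosine_similarity(a, b):
    """Cosine of the angle between the two term-count vectors."""
    counts_a, counts_b = Counter(a), Counter(b)
    dot = sum(counts_a[t] * counts_b[t] for t in counts_a)
    norm_a = math.sqrt(sum(v * v for v in counts_a.values()))
    norm_b = math.sqrt(sum(v * v for v in counts_b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def matched_words(query, title):
    """Number of query tokens that also appear in the title."""
    title_set = set(title)
    return sum(1 for t in query if t in title_set)


# Hypothetical search query and product title.
query = tokenize("cordless drill")
title = tokenize("DeWalt 20V cordless drill kit")
features = [
    jaccard_similarity(query, title),
    cosine_similarity(query, title),
    matched_words(query, title),
]
```

A feature vector like `features` would then be passed to a regressor, which is the role Linear Regression and Random Forest play in the pipeline described above.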
Tools, Technologies and APIs used: Apache Spark's MLlib, pandas and numpy libraries from Python, Jupyter/Zeppelin notebooks, Anaconda Python 3 distribution, Hortonworks Data Platform, HDFS -
Auto Text Completion engine
- Designed and implemented a MapReduce solution to pre-process a large text-based dataset for data cleaning purposes.
- Built a probabilistic language model using MapReduce batch processing to store the counts and probabilities of phrases of up to 5-grams in a 10 GB dataset.
- Configured and deployed a storage caching layer using Amazon ElastiCache and Redis (Jedis) to improve the performance of text completion. -
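The counting and probability steps of such a language model can be sketched in a single Python process. This is only an illustration under stated assumptions: the original ran as MapReduce jobs over a 10 GB dataset, while the function names and the tiny corpus here are hypothetical. The estimate used is the plain maximum-likelihood ratio count(prefix + word) / count(prefix).

```python
from collections import Counter


def count_ngrams(tokens, max_n=5):
    """Count every n-gram of length 1..max_n (the job a map + reduce pair would do)."""
    counts = Counter()
    for n in range(1, max_n + 1):
        for i in range(len(tokens) - n + 1):
            counts[tuple(tokens[i:i + n])] += 1
    return counts


def next_word_probabilities(counts, prefix):
    """Estimate P(word | prefix) as count(prefix + word) / count(prefix)."""
    prefix = tuple(prefix)
    prefix_count = counts[prefix]
    if prefix_count == 0:
        return {}
    probabilities = {}
    for ngram, count in counts.items():
        if len(ngram) == len(prefix) + 1 and ngram[:-1] == prefix:
            probabilities[ngram[-1]] = count / prefix_count
    return probabilities


# Hypothetical toy corpus.
corpus = "the cat sat on the mat the cat ran".split()
ngram_counts = count_ngrams(corpus)
suggestions = next_word_probabilities(ngram_counts, ["the"])
best_completion = max(suggestions, key=suggestions.get)
```

In a deployed service the per-prefix suggestions would typically be precomputed and looked up by key, which is the kind of access pattern a Redis cache serves well.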
Uber-like driver matching service
- Generated a stream of data using Kafka producer and made it available for a Samza consumer on AWS.
- Designed and implemented a solution for a driver matching service like Uber by joining and processing multiple real-time streams of GPS data and driver data using the Samza API.
- Implemented an auto-scaling cluster with AWS APIs to dynamically adjust server instances based on real-time load. -
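The stream join at the heart of a driver matching service can be sketched as follows. This is plain Python rather than the Samza API, and it rests on assumptions: the class and method names are hypothetical, and matching the nearest available driver by great-circle distance is one plausible criterion, not necessarily the one the project used.

```python
import math


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance between two points, in kilometres."""
    earth_radius_km = 6371.0
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    d_phi = math.radians(lat2 - lat1)
    d_lambda = math.radians(lon2 - lon1)
    a = math.sin(d_phi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(d_lambda / 2) ** 2
    return 2 * earth_radius_km * math.asin(math.sqrt(a))


class DriverMatcher:
    """Keeps the latest location of each available driver and matches ride requests."""

    def __init__(self):
        self.available = {}  # driver_id -> (lat, lon)

    def on_driver_location(self, driver_id, lat, lon, is_available=True):
        # Event from the GPS/driver stream: update or drop the driver's state.
        if is_available:
            self.available[driver_id] = (lat, lon)
        else:
            self.available.pop(driver_id, None)

    def on_ride_request(self, lat, lon):
        # Event from the request stream: pick the nearest driver and reserve them.
        if not self.available:
            return None
        nearest = min(
            self.available,
            key=lambda d: haversine_km(lat, lon, *self.available[d]),
        )
        del self.available[nearest]
        return nearest


# Hypothetical events.
matcher = DriverMatcher()
matcher.on_driver_location("d1", 40.440, -79.990)
matcher.on_driver_location("d2", 40.500, -79.900)
first_match = matcher.on_ride_request(40.441, -79.991)
second_match = matcher.on_ride_request(40.441, -79.991)
third_match = matcher.on_ride_request(40.441, -79.991)
```

In a stream processor the `available` dictionary would live in a fault-tolerant state store rather than in process memory.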
Consistency Models and Multithreading
- Compared and contrasted the advantages and disadvantages of using replication in distributed key-value stores.
- Studied the pros and cons of different techniques to improve consistency, availability, and partitioning in a system.
- Discussed the various levels of consistency that can be employed in a distributed data store.
- Used multithreading to achieve strong and eventual consistency models for a distributed key-value store in different geographic regions. -
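The difference between the two consistency models can be illustrated with a toy replicated store. This is a sketch under assumptions, not the project's implementation: the class, its methods and the region names are hypothetical, replicas are in-memory dictionaries, and concurrent eventual writes get no conflict resolution.

```python
import threading


class ReplicatedStore:
    """Toy key-value store with one in-memory replica per region."""

    def __init__(self, regions):
        self.replicas = {region: {} for region in regions}
        self._lock = threading.Lock()
        self._background = []

    def put_strong(self, key, value):
        # Strong consistency: every replica is updated before the call returns.
        with self._lock:
            for replica in self.replicas.values():
                replica[key] = value

    def put_eventual(self, origin, key, value):
        # Eventual consistency: write locally, propagate to other regions in the background.
        with self._lock:
            self.replicas[origin][key] = value
        worker = threading.Thread(target=self._propagate, args=(origin, key, value))
        self._background.append(worker)
        worker.start()

    def _propagate(self, origin, key, value):
        for region, replica in self.replicas.items():
            if region != origin:
                with self._lock:
                    replica[key] = value

    def get(self, region, key):
        with self._lock:
            return self.replicas[region].get(key)

    def wait_for_convergence(self):
        # Block until all background propagation has finished.
        for worker in self._background:
            worker.join()
        self._background.clear()


store = ReplicatedStore(["us-east", "eu-west", "ap-south"])
store.put_strong("balance", 100)
strong_reads = [store.get(region, "balance") for region in store.replicas]

store.put_eventual("us-east", "balance", 250)
local_read = store.get("us-east", "balance")
store.wait_for_convergence()
converged_reads = [store.get(region, "balance") for region in store.replicas]
```

Between `put_eventual` returning and `wait_for_convergence` finishing, a read from another region may still return the old value, which is the trade-off eventual consistency makes for lower write latency.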
Data Mining CRM - Product Selection and Success Prediction
Implemented popular Data Mining algorithms such as K-Nearest Neighbors and Decision Trees, to predict product selection information, as well as the potential success of the newly introduced products, based on buying habits of existing customers and their profiles, and historical data on the sales volume of past products. Cross-validated the algorithms on the given training set and optimized weights programmatically to get maximum accuracy on the training data. Used the optimized weights that yielded the highest accuracy on the given test data set to predict the product selection (89% accuracy) and success likelihood (97% accuracy).
-
Horizontal Scaling and Auto-Scaling Web Application
• Implemented a horizontally scaling web application with GCP, Azure, and AWS APIs, able to process a load of 3000+ RPS
• Implemented an auto-scaling web application with AWS APIs to dynamically adjust server instances based on real-time load -
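The scaling decision behind such a setup can be sketched as a small capacity calculation. This is an assumption-laden illustration: the function names, the per-instance capacity of 500 RPS and the instance limits are hypothetical, and no cloud provider API is called here.

```python
import math


def desired_instance_count(current_rps, rps_per_instance, min_instances=1, max_instances=10):
    """Instances needed to serve the current load, clamped to the allowed range."""
    if rps_per_instance <= 0:
        raise ValueError("rps_per_instance must be positive")
    needed = math.ceil(current_rps / rps_per_instance)
    return max(min_instances, min(max_instances, needed))


def scaling_action(running, desired):
    """Translate the gap between running and desired instances into an action."""
    if desired > running:
        return ("launch", desired - running)
    if desired < running:
        return ("terminate", running - desired)
    return ("hold", 0)


# Hypothetical measurement: 3000 RPS with instances that each handle 500 RPS.
target = desired_instance_count(3000, 500)
action = scaling_action(running=4, desired=target)
```

A control loop would evaluate this periodically against measured load and pass the resulting action to the provider's instance management API.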
Contextual Design and UI Testing
• Designed the user tasks and interviewer script of a contextual inquiry for existing drug store websites
• Implemented and improved the user interface of a hypothetical online drug store based on contextual inquiry, heuristic analysis, and usability testing -
Networking solutions for vehicles and infrastructure
Developed a vehicle-to-vehicle (V2V) networking and communication solution.
Developed a networking solution for vehicle-to-infrastructure communication.
Developed a multi-modular infotainment system.
1. Proposed an application for real-time routing and weather-related navigation using DSRC technology.
2. Applications within project explored DSRC, Over the Air, CAN, ECU and Wi-Fi technologies. -
IoT Solutions for Contemporary Healthcare (Ubiquitous Computing)
Developed contemporary IoT solutions for a healthcare institution.
1. Proposed applications in Surgery, Human Resources, Cost accounting and food services.
2. Charted cost estimation for the complete project.
3. Led a team of five people -
Cloud Based Health and video Service
-
- Created a simulated system that scores users based on their dietary habits, built as part of a cloud computing project at CMU using AWS EC2, S3, SNS, Lambda, RDS (MySQL), Rekognition, Docker and Clarifai.
- Modified the existing architecture into a YouTube-like video service that lets users search videos by their content, using the labels generated in the previous step and CloudSearch. Users can also preview a video by hovering over it.
-
Twitter Analytics Web Service
-
- Implemented a web service to extract tweets and users given trending topics, hashtags and a time frame.
- Designed and implemented a high performance, fault-tolerant and scalable cloud deployment strategy responding to live load while meeting infrastructure and budgetary needs.
- Performed ETL on a 1 TB dataset to load data into MySQL and HBase systems using MapReduce and Spark frameworks on AWS, GCP, and Azure.
- Raised the service's throughput from 3,000 RPS to 10,000 RPS by modeling effective schemas, sharding the database and optimizing server threads while using the same resources.
- Configured the service to handle data from all languages, including emoji.
- Deployed the web service using Docker images on Kubernetes across multiple cloud services -
Search Engine Optimization
-
Performed competitor analysis and keyword research using Google Trends, and computed inverse document frequency to identify attractor and discriminatory terms. Raised a low-ranked website into the top 3 results in CMU's Indri search engine through URL structure changes, targeted HTML changes, keyword-concentrated content to increase term frequency, and link building.
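Inverse document frequency can be computed with a few lines of Python. This sketch uses the textbook definition log(N / df); the function name and the sample documents are hypothetical, and the exact formula or ranking function used in the project is not stated in the source.

```python
import math
from collections import Counter


def inverse_document_frequency(documents):
    """Map each term to log(N / df), where df is the number of documents containing it."""
    total = len(documents)
    document_frequency = Counter()
    for document in documents:
        # Count each term once per document, however often it repeats.
        document_frequency.update(set(document.lower().split()))
    return {term: math.log(total / df) for term, df in document_frequency.items()}


# Hypothetical page texts.
pages = [
    "cheap flights to pittsburgh",
    "cheap hotels in pittsburgh",
    "cheap car rental",
]
idf = inverse_document_frequency(pages)
most_discriminating = sorted(idf, key=idf.get, reverse=True)[:3]
```

Terms that occur in every document score zero and do little to separate pages, while rare terms score high and discriminate strongly between them.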
-
Social Networking Timeline with Heterogeneous Backends
-
- Compared the advantages and disadvantages of utilizing flat files, SQL databases, and NoSQL database solutions.
- Performed ETL for a dataset.
- Integrated SQL and NoSQL databases to support a complex social networking website.
- Responded to complex queries that span multiple databases.
- Implemented user access functionality such as login and logout on SQL.
- Stored a social graph using HBase.
- Built a timeline using MongoDB.
- Built a recommendation engine for the social network. -
Big Data Analytics
-
- Processed a large text dataset (Wikipedia) using MapReduce running within distributed frameworks on AWS (EMR), Azure (HDInsight) and GCP.
- Derived popular trends based on tags such as "Donald Trump". -
Mutual Fund Web Application
-
• Implemented a web application that allows users or super users to buy, sell or manage Mutual Funds
• Led the software design and implementation of the MVC components and deployment on the AWS Cloud
• Implemented a RESTful web service based on this web application -
Locating Targets through Mentions on Twitter (Python) using Sentiment Analysis
-
- Categorized a pool of Twitter users based on their location, tweet content, social factors (number of followers, social influence) and user activity to generate a list of the top users likely to retweet.
- In addition, performed sentiment analysis on the tweets to match the most relevant tweets using NLP and NLTK, and ranked the users using a modified version of a Support Vector Machine.
-
Bus Tracker Hybrid Application (iOS and Android) for Port Authority of Allegheny County
-
Developed an Android application for bus tracking for the Port Authority of Allegheny County. Core functionality includes: listing the ETA of all buses (via the PAAC API) for a bus stop clicked on the map (rendered using the Google Maps API); and a search toolbar, powered by the Google Places API, that searches for the input destination and displays all possible buses to that destination from the current location. In addition, an alarm feature notifies the user at a custom time prior to the actual arrival of a bus at a chosen stop.
Tools, Technologies and APIs used: Android SDK, Android Studio, Java, Google Maps API, Google Places API, PAAC API.
Recommendations received
1 person has recommended Jay