About
Develop data science enterprise solutions with a motivation to deeply understand the…
Articles by Greg
Activity
-
I've been in 200+ discovery calls with enterprise teams building GenAI systems. The ones that fail don't fail because the model was wrong. They fail…
Shared by Greg Makowski
-
I am thrilled to announce that I’m starting a new chapter as Managing Director, Head of Data & Analytics for JPMorgan Chase Auto in NYC! Throughout…
Liked by Greg Makowski
-
Join 100+ people in an ACM talk next Monday, April 20, 7pm PST "Labelled Deductive Systems for Logical Validation of LLM Output". In person in…
Shared by Greg Makowski
Experience
Education
-
Western Michigan University
3.64
-
Activities and Societies: HONORS: Member of the WMU Chapter of Upsilon Pi Epsilon Computer Science Honor Society, GPA 3.64/4.00. WMU CS ranks in the top 10% of CS schools in the US: https://computer-science-schools.com/western-michigan-university
THESIS: "Time Delay Neural Networks and Speech Recognition: Context Independence of Stops in Different Vowel Environments"
Started the local ACM chapter of the Special Interest Group on Artificial Intelligence and ran it for 2 years.
Deployed an Expert System for identifying unknowns for an asbestos removal company.
Volunteer Experience
-
Vice Chair, Chair, Chair of Data Science SIG
San Francisco Bay Area ACM
- Present 18 years
Education
We are a local chapter of the Association for Computing Machinery, www.ACM.org. We have been running continuously since 1957, which makes us older than many meetup groups. We host about 25 events per year. Each month, we have two evening talks: General Computing and Data Science SIG. Twice a year, we hold Professional Development Seminars (PDS) on a Saturday. We also support local science fairs.
* Organized Data Mining / Data Science Camp (unconference) annually since 2009
http://www.sfbayacm.org/data-science-camp-2017/ (most recent)
* Organized the first big data Kaggle competition, with 2 years of Best Buy data https://www.kaggle.com/c/acm-sf-chapter-hackathon-big
https://www.kaggle.com/c/acm-sf-chapter-hackathon-small
* Organized a number of day-long Professional Development Seminars (PDS): Cloud, R, Python, ...
* Organized the Toyota Car Hackathon, 2013 http://events.sfbayacm.org/event/toyota-itc-acm-quantified-car-hackathon/
www.SFbayACM.org
http://www.meetup.com/SF-Bay-ACM/
https://www.youtube.com/user/sfbayacm (125+ past talks on our channel)
-
Fundraiser & marathon runner
The Leukemia & Lymphoma Society
- 1 year
Health
I ran one marathon each year, organizing fundraiser events and raising $8,000 in total.
Patents
-
Event Lift Forecasting - automated forecasting for retail promotion events
Filed in the US (number not recalled)
Projects
-
Introduction to Data Mining Algorithms and their Evolution toward IoT
Invited speaker for the Silicon Valley Business of Engineering Social Meetup. This was outside of work.
-
Global Big Data Conference 6/9/2017: "Predictive Model and Record Description Using Segmented Sensitivity Analysis"
Describing a model, both overall (which variables matter most to the forecast) and per record (the reasons behind an individual forecast), can be a competitive advantage in a business or consulting situation. Design objectives for this model description include: 1) describe the model in terms of variables understandable to the target audience, 2) be independent of the predictive algorithm, 3) support a single model or an ensemble of models, 4) pick up non-linearities in variables, and 5) pick up interaction effects between variables. The Segmented Sensitivity Analysis (SSA) method has been used by the author for 25 years in a variety of data mining projects. As with other descriptive techniques, there is a bit of art in developing a solution for a specific use case. A number of SSA variations and use cases will be discussed to illustrate how the system can be adapted. The SSA model description can also help during the model building process, and in detecting and describing model drift over time, as the scoring data slowly drifts away from the training data.
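As a rough illustration of the model-agnostic idea described above, here is a minimal Python sketch of one-at-a-time perturbation sensitivity. It is not the author's SSA method, whose details are not given here. The function name `sensitivity`, the `step` parameter, the median split into "low" and "high" segments, and the `toy_model` are all assumptions made for this example. The only requirement on the model is a black-box `predict` function, so the same code would work for a single model or an ensemble.

```python
from statistics import mean, median, pstdev


def sensitivity(predict, records, variables, step=0.5):
    """Mean absolute change in the prediction when one variable is nudged
    up and down by `step` standard deviations, reported per variable and
    per segment (records below vs. at-or-above that variable's median)."""
    report = {}
    for var in variables:
        values = [r[var] for r in records]
        delta = step * pstdev(values)
        cut = median(values)
        changes = {"low": [], "high": []}
        for r in records:
            up = predict({**r, var: r[var] + delta})
            down = predict({**r, var: r[var] - delta})
            segment = "low" if r[var] < cut else "high"
            changes[segment].append(abs(up - down))
        # Skip a segment that happens to be empty.
        report[var] = {seg: mean(vals) for seg, vals in changes.items() if vals}
    return report


# Hypothetical black-box model: non-linear in x1, linear in x2, ignores x3.
def toy_model(r):
    return r["x1"] ** 2 + 2.0 * r["x2"]


records = [
    {"x1": float(i), "x2": float(10 - i), "x3": float(i % 3)}
    for i in range(11)
]
report = sensitivity(toy_model, records, ["x1", "x2", "x3"])
for var, by_segment in report.items():
    print(var, by_segment)
```

In this toy case the "high" segment of `x1` shows a larger sensitivity than the "low" segment, which is how a segmented view can surface a non-linearity, while `x2` is the same in both segments and `x3` is zero. Per-record reasons could be read from the individual signed changes instead of the segment means.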
R code to share is under development.
-
Invited speaker to Beijing at Global Mobile Internet Conference (GMIC): Tutorial #6, Predictive Data Science in R
I was an invited speaker, with travel expenses paid to Beijing.
I gave a 4-hour tutorial, which was later expanded to an 8-hour course. For details, see the related project titled <Develop and teach 8 hour course "Predictive Data Science in R">
-
Meetup: Production Model Lifecycle Management
Data Scientists are quite motivated to develop accurate predictive models and do testing for generalization. Providing a decent description can be an afterthought. Talking to a number of modelers, I have heard them state "if I have to describe my model, then I will just skip certain algorithms that are hard to describe." The thesis of the presentation is that you can maximize all 3 objectives 1) accuracy, 2) generalization and 3) understandability.
As a best practice, first focus only on a metric that combines accuracy and generalization, track the design of experiments over model parameters in a model notebook, and report estimates of the business value per model.
To describe the model globally, sensitivity analysis or LIME (Local Interpretable Model-agnostic Explanations) can be used.
The model lifecycle does not just end with putting the model into production. You can use metrics to track the gradual data drift over time, as the model is repeatedly deployed. This tracking leverages the detailed model description analysis.
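The talk does not name a specific drift metric. As one possible illustration, the sketch below uses the Population Stability Index (PSI), a commonly used measure of how far the distribution of a variable in the scoring data has moved from the training data. The choice of PSI, the function name `psi`, equal-width binning, and the `eps` floor for empty bins are assumptions for this example.

```python
import math


def psi(train_values, scoring_values, n_bins=5, eps=1e-6):
    """Population Stability Index for one variable, using equal-width bins
    over the training range. Values outside that range fall in the end bins."""
    lo, hi = min(train_values), max(train_values)
    width = (hi - lo) / n_bins or 1.0  # avoid zero width for a constant variable

    def shares(values):
        counts = [0] * n_bins
        for v in values:
            idx = int((v - lo) / width)
            counts[min(max(idx, 0), n_bins - 1)] += 1
        # Floor each share at eps so the logarithm below is always defined.
        return [max(c / len(values), eps) for c in counts]

    p, q = shares(train_values), shares(scoring_values)
    return sum((qi - pi) * math.log(qi / pi) for pi, qi in zip(p, q))


# Hypothetical data: training sample, then a scoring sample shifted upward.
train = [i / 100 for i in range(100)]
shifted = [v + 0.3 for v in train]

print("no drift:", psi(train, train))
print("shifted :", psi(train, shifted))
```

Computed per variable at each deployment and logged over time, a value like this gives a simple drift trend. A commonly quoted rule of thumb treats PSI above roughly 0.25 as a significant shift, though the right threshold depends on the application.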
-
Deep Learning Meetup: Using Deep Learning to do Real-Time Scoring in Practical Applications
The talk will cover a brief review of neural network basics and the following types of neural network deep learning:
* autocorrelational - unsupervised learning for extracting features. Describe how additional layers build complexity in the feature extraction.
* convolutional - how to detect shift-invariant patterns in various data sources. Horizontal shift-invariant detection applies to signals like speech recognition or IoT data. Horizontal and vertical shift invariance applies to images or videos, for faces or self-driving cars
* discuss details of applying deep net systems for continuous or real time scoring
* reinforcement learning or Q Learning - such as learning how to play Atari video games
* continuous space word models - such as word2vec, skipgram training, NLP understanding and translation
-
Conf: Case Studies Deploying Cluster Analysis
Three case studies are discussed, each including cluster analysis as a component.
1) Customer description for a credit card attrition model, to describe how to talk to customers.
2) Hotel price optimization. Use clusters to find subsets of similar behavior, and optimize prices within each cluster. Use a neural net as the objective function.
3) Retail supply chain. Plan replenishment with 52-week demand curves, using thousands of seasonal "profiles" or clusters.
-
Conference: Heuristic Design of Experiments with Meta-Gradient Search of Model Training Parameters
Once you have started learning about predictive algorithms, and the basic knowledge discovery in databases process, what is the next level of detail to learn for a consulting project?
* Give examples of the many model training parameters
* Track results in a "model notebook"
* Use a model metric that combines both accuracy and generalization to rank models
* How to strategically search over the model training parameters - use a gradient descent approach
* One way to describe an arbitrarily complex predictive system is by using sensitivity analysis
-
On conference program committee. Review and accept papers for industry track of ACM/IEEE conference "Data Science and Advanced Analytics", Oct 19-21 in Tokyo
-
International Conference on Data Science and Advanced Analytics (DSAA) started in 2014 aiming to be a flagship in the data science and analytics field. It provides a premier forum that brings together researchers, industry practitioners, as well as potential users of data science and big data analytics. It covers all data science and analytics related areas, including statistical, probabilistic and mathematical methods, machine learning, data and business analytics, data mining and knowledge discovery, infrastructure, storage, retrieval and search, privacy and security, and relevant applications, practices, tools and evaluation. DSAA’2014 was not a fully IEEE supported conference, but was technically co-sponsored by IEEE Computational Intelligence Society (CIS) and ACM through SIGKDD. DSAA became a fully IEEE CIS supported conference from the second edition. The second IEEE DSAA’2015 was held in Paris in 2015 which was also very successful. The third IEEE DSAA’2016 is planned in Montreal. They continue to be technically sponsored by ACM.
IEEE DSAA’2017 will consist of two main tracks, Research and Application. The Research Track is aimed at collecting contributions related to the theoretical foundations of Data Science and Data Analytics. The Application Track is aimed at collecting contributions related to applications of Data Science and Data Analytics in real-life scenarios. DSAA’2017 therefore solicits both theoretical and practical works on data science and advanced analytics.
-
Develop and teach 8 hour course "Predictive Data Science in R"
-
Go through a sprint of a predictive data mining project, introducing R as we go. Review the training process for regression, backpropagation neural nets, decision trees and XGBoost. Introduce R data.tables and the caret interface to 233 predictive algorithms. Focus on strategies to structure a successful project design and data pull. Review a variety of preprocessing and knowledge representation. Provide questions you can take away and apply to the design of your future projects, to describe models to clients (sensitivity analysis code included) and to manage models over their natural lifecycle. Introduce R + Spark integrations, and show an example R Shiny web GUI.
-
Automatic Model Building (had a patent application)
-
How can the process of Knowledge Discovery in Databases be automated, competitive and reliable? One approach is to focus on a narrow vertical market application, with known data sources and data feeds. Then you can automate the Exploratory Data Analysis (EDA) and Preprocessing phases. But how do you automate the selection of training data? Can the enterprise application be installed and configured at a variety of clients without a Senior Knowledge Discovery Engineer? How can you minimize "worst case" results of such a system when used by a business user going through their normal business role? How can you deeply investigate and model "business values" (i.e. things that can get an end user promoted or fired) into the core of the data mining algorithms?
This talk will answer these questions and more. The patent-pending application, ELF, is an enterprise application in the retail supply chain vertical market. Before the development of this system, one enterprise application was used to lay out a weekly newspaper flier three weeks before the sales event, which in turn fed data into a replenishment application. The replenishment application kept products on the store shelves, with a minimal amount of overstock and understock. The pain point was that the retail buyer would have to manually estimate the sales lift, or the multiplier increase in sales, for every item for every store. While human expertise can be great, it isn't as scalable when applied to a sales event with 1,000 - 4,000 items on sale in 6,000 stores. ELF (Event Lift Forecasting) would import data from a planned event and automatically analyze and forecast the lift for each store-item combination. Data elements used included pricing, placement in the flier, store geography and demographics, seasonality, and product hierarchy.
The resulting ELF system produced an 8-30% reduction in overstock and understock costs, which is significant for the supply chain industry.
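To make the term "lift" concrete, here is a small Python sketch of lift as a sales multiplier per store-item combination. It only illustrates the definition and is not the ELF forecasting model. The function names, the store and item labels, and the unit counts are hypothetical.

```python
def sales_lift(event_units, baseline_units):
    """Lift as a multiplier: units sold during the event divided by the
    units normally sold in a comparable period without the event."""
    if baseline_units <= 0:
        raise ValueError("baseline_units must be positive")
    return event_units / baseline_units


def forecast_event_units(baseline_units, predicted_lift):
    """Turn a forecast lift back into a unit quantity for replenishment."""
    return baseline_units * predicted_lift


# Hypothetical history for two store-item combinations: (baseline, event) units.
history = {
    ("store_1", "item_A"): (40, 100),
    ("store_2", "item_A"): (25, 30),
}
lifts = {key: sales_lift(event, base) for key, (base, event) in history.items()}
for key, lift in lifts.items():
    print(key, lift)
```

A forecasting system would predict the lift for a planned event from inputs such as those listed above, then multiply it by the baseline to get the quantity to stock.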
Recommendations received
8 people have recommended Greg
More activity by Greg
-
A January 2025 paper made a room of 130 engineers go quiet. Frontier LLMs can pursue goals you didn't give them. It showed that frontier LLMs - GPT-…
Shared by Greg Makowski
-
AI in 2026 isn’t just models and prompts. It’s orchestration, memory, governance, and reliability. If your stack stops at LLM + vector DB, you’re…
Shared by Greg Makowski
-
Free ACM talk: "Giving LLMs a Map: Building Smarter GenAI with GraphRAG" Mon 3/23/26 6:45pm (in Mountain View, CA or on Zoom) RSVP:…
Shared by Greg Makowski
-
Most LLMs weren’t built for your business. We train them to understand your domain, your data, and your workflows. → Context-aware…
Shared by Greg Makowski
-
120 billion parameters now fit in your pocket, without a cloud connection. The Tiiny AI Pocket Lab, officially verified by Guinness World Records…
Liked by Greg Makowski
-
Shout out to Mahesh Lalwani, the person I connected with the most on LinkedIn this year. #YearinReview
Shared by Greg Makowski