Data Analysis and Decision-Making


  • View profile for James Zou

    Associate Professor at Stanford University

    18,727 followers

    Today in Nature Medicine we report that AI can predict 130 diseases from 1 night of sleep 🛌. We trained a foundation model (#SleepFM) on 585K hours of sleep recordings from 65K people—brain, heart, muscle & breathing signals combined. AI learns the language of sleep! Paper: https://lnkd.in/grpRD3Qp Open source code: https://lnkd.in/g_MqFCm6 Participants are linked to their EHR. SleepFM predicts risks for diverse diseases--including dementia, heart failure, kidney disease, and stroke--years before clinical diagnosis. It substantially outperforms using demographic features, which are strong predictors. SleepFM uses a new architecture to integrate multimodal sleep time-series data. CNNs learn local features, transformers aggregate information across time + channels, and leave-one-modality-out contrastive learning trains robust representations. This design generalizes across sites and diverse populations. We spend 1/3 of our lives sleeping but it has been underexplored with AI. Most work focuses on narrow tasks like sleep staging and apnea detection. By learning a holistic representation of sleep, SleepFM opens new doors for studying the science and medicine of sleep. Truly wonderful collaboration with Emmanuel JM Mignot's lab, led by Rahul Thapa and Magnus Ruud Kjaer! Thanks to all the awesome collaborators: Bryan He, Ian Covert, Hyatt Moore, Umaer Hanif, Gauri G., M Brandon Westover, Poul Jennum, Andreas Brink-Kjær 👏
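The post names the training objective (leave-one-modality-out contrastive learning) without showing the mechanics. As a rough illustration only — a toy, pure-Python sketch, not the paper's implementation — each modality's embedding can be contrasted against the average of the remaining modalities, with the positive pair being the same recording (an InfoNCE-style loss):

```python
import math

def cosine(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv)

def leave_one_out_contrastive(embeddings, temperature=0.1):
    """embeddings: dict modality -> list of per-recording vectors (same order).
    For each modality, contrast each recording's embedding against the mean
    of the OTHER modalities' embeddings; the positive is the same recording."""
    modalities = list(embeddings)
    n = len(next(iter(embeddings.values())))
    total, count = 0.0, 0
    for m in modalities:
        others = [o for o in modalities if o != m]
        # per-recording mean of the remaining modalities ("leave one modality out")
        anchors = []
        for i in range(n):
            dims = len(embeddings[m][i])
            anchors.append([sum(embeddings[o][i][d] for o in others) / len(others)
                            for d in range(dims)])
        for i in range(n):
            # InfoNCE: -log softmax probability of the matching recording
            sims = [cosine(embeddings[m][i], anchors[j]) / temperature
                    for j in range(n)]
            mx = max(sims)
            log_z = mx + math.log(sum(math.exp(s - mx) for s in sims))
            total += -(sims[i] - log_z)
            count += 1
    return total / count
```

With aligned modalities the loss is near zero; shuffling one modality's recordings drives it up, which is what makes the objective a useful cross-modal training signal.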

  • View profile for Jan-Benedict Steenkamp
Jan-Benedict Steenkamp is an Influencer

    Massey Distinguished Professor | Editor in Chief Journal of Marketing | Award-winning author | Top 0.02% scientist worldwide | Creator of the 4-factor Grit Scale

    27,345 followers

STATISTICAL V. SUBSTANTIVE SIGNIFICANCE Take a moment to consider the following scenario. One study with n = 100 reports a focal effect with an associated p-value of 0.02. Another study with n = 1000 reports a focal effect with an associated p-value of 0.02. Which study presents stronger evidence that the effect is really there? This scenario is adapted from Bakan (Psych. Bull. 1966). Many scholars chose the second study. They are wrong. I quote Bakan (p. 429): “The rejection of the null hypothesis when the number of cases is small speaks for a more dramatic effect in the population [larger effect size]; and if the p-value is the same, the probability of committing a Type I error remains the same.” Many papers implicitly or explicitly equate statistical significance with substantive significance. Yet a p-value does not inform you whether the effect has any real-world meaning. Ralph Tyler (Educ. Res. Bulletin, 1931) already wrote that a statistically significant difference is not necessarily an important difference, and a difference that is not statistically significant may be an important difference. Unfortunately, we are still making the same mistake 90 years later. A statistically significant result may be substantively nonsignificant (trivial). But also, a statistically nonsignificant result may be substantively significant. I see so many studies in our field reporting regression coefficients with *** and I have no idea how large the effect is. This tendency to equate statistical with substantive significance persists to the extent that the prestigious American Statistical Association (not exactly an organization afraid of advanced statistics) came out with a formal statement on p-values — the “ASA Statement on Statistical Significance and P-Values” — cautioning researchers: “Statistical significance is not equivalent to scientific, human, or economic significance. Smaller p-values do not necessarily imply the presence of larger or more important effects, and larger p-values do not imply a lack of importance or even lack of effect. Any effect, no matter how tiny, can produce a small p-value if the sample size or measurement precision is high enough, and large effects may produce unimpressive p-values if the sample size is small or measurements are imprecise.” I am not arguing against statistical significance. Rather, I argue that articles should report statistical AND substantive significance. In my view, our primary task is uncovering factors that make a meaningful difference. That calls for effect sizes. As a nice “bonus,” substantive significance is less amenable to p-hacking than statistical significance. If you enjoyed this, share it with others and follow me, Jan-Benedict Steenkamp, for more writing. Journal of Marketing
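Bakan's point can be made concrete with the normal approximation: for a fixed two-sided p-value, the implied standardized effect size shrinks as n grows. A small sketch (one-sample case, treating the test statistic as approximately normal):

```python
import math

def norm_cdf(x):
    """Standard normal CDF via the error function."""
    return 0.5 * (1 + math.erf(x / math.sqrt(2)))

def z_from_p_two_sided(p):
    """Invert the normal CDF by bisection: find z with 2*(1 - Phi(z)) = p."""
    lo, hi = 0.0, 10.0
    for _ in range(100):
        mid = (lo + hi) / 2
        if 2 * (1 - norm_cdf(mid)) > p:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

def implied_effect_size(p, n):
    """Standardized effect size (Cohen's d, one-sample case) implied by a
    two-sided p-value and sample size: z = d * sqrt(n)  =>  d = z / sqrt(n)."""
    return z_from_p_two_sided(p) / math.sqrt(n)

d_small_n = implied_effect_size(0.02, 100)    # ~0.23, a small-to-medium effect
d_large_n = implied_effect_size(0.02, 1000)   # ~0.07, a trivial effect
```

Same p-value, but the n = 100 study implies an effect roughly three times larger — exactly why the smaller study is the stronger evidence for a substantively meaningful effect.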

  • View profile for Vitaly Friedman
Vitaly Friedman is an Influencer

    Practical insights for better UX • Running “Measure UX” and “Design Patterns For AI” • Founder of SmashingMag • Speaker • Loves writing, checklists and running workshops on UX. 🍣

    225,928 followers

📈 How To Deal With Statistical Significance In UX (https://lnkd.in/dTzksfBM), a helpful reminder that statistical significance may not matter as much as practical significance, and useful insights often emerge in small-sample studies with just enough practical significance. Written by Rachel Banawa, PhD. ✅ STATISTICAL significance → result is unlikely to occur by chance. ✅ It means: probability value (p-value) < preset threshold (0.05). ✅ It gives us reliability: same test likely to return same outcomes. 🤔 It doesn’t say how large or impactful it is for users/businesses. ✅ Reliability ≠ Impact → It doesn’t mean it’s worth acting upon. ✅ PRACTICAL significance → good enough to matter in real life. ✅ It means: Effect Size is big enough to have an impact. 🤔 Large sample sizes make small differences look “significant”. 🤔 Small sample sizes are more vulnerable to noise and outliers. 🤔 As a result, they can hide or skew meaningful patterns. I love Rachel's point that statistical significance doesn’t tell us if a difference is noticeable to users, meaningful for experiences, or influential for a business. We need to understand whether a difference is meaningful enough that it would affect users and businesses in the real world. Teams often obsess about discovering ultimate truths by arguing about statistical significance between events — but in reality, it may not matter if it won’t have a noticeable impact. Instead, we need to uncover user needs and find leverage to dial up success moments and reduce pain points. There are 2 key questions we ask to evaluate practical significance: ⌾ 1. User Perception: Would real users actually notice this change? We are exploring whether a particular change will meaningfully impact the user’s experience, e.g. in terms of perception of speed, reducing frustrations or hesitations, minimizing confusion, or changing behavior. It's also where we look at the frequency and severity of the problem a change addresses. ⌾ 2. Business Value: Does this difference matter to the organization? Does a result affect important business outcomes, e.g. saving expenses, reducing customer acquisition costs, increasing efficiency, reducing errors or failures? We also explore its impact on business KPIs and design KPIs. --- Small improvements can compound in high-volume contexts, while larger improvements may be irrelevant if they don’t support strategic goals. In practice, even if we have low statistical significance but high qualitative evidence (“practical significance”), we can initiate experiments, roll them out to a small percentage of users (5%) and iterate from there. And: as Nikki Anderson pointed out, statistical significance was never designed for qualitative research. It ensures that our findings aren’t random. The real question is how many users we can dismiss as "irrelevant" or "not representative" until we realize that there is a problem that's worth solving. Continues in comments ↓
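The "large samples make small differences look significant" point is easy to demonstrate with a two-proportion z-test (toy conversion numbers, normal approximation):

```python
import math

def two_proportion_pvalue(x1, n1, x2, n2):
    """Two-sided two-proportion z-test, normal approximation."""
    p1, p2 = x1 / n1, x2 / n2
    pooled = (x1 + x2) / (n1 + n2)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    z = (p1 - p2) / se
    # two-sided p-value from the standard normal CDF
    return 2 * (1 - 0.5 * (1 + math.erf(abs(z) / math.sqrt(2))))

# A 0.2-point lift (10.0% -> 10.2%) is highly "significant" at 500k users per arm...
p_big_n = two_proportion_pvalue(51_000, 500_000, 50_000, 500_000)

# ...while the exact same lift at 1,000 users per arm is nowhere near significance.
p_small_n = two_proportion_pvalue(102, 1_000, 100, 1_000)
```

The effect is identical in both cases; only the sample size changed. Whether a 0.2-point lift is worth shipping is the practical-significance question the p-value cannot answer.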

  • View profile for Jeff Winter
Jeff Winter is an Influencer

    Industry 4.0 & Digital Transformation Enthusiast | Business Strategist | Avid Storyteller | Tech Geek | Public Speaker

    173,084 followers

    Your business strategy and data strategy are no longer two strategies. They have become one. For years, companies treated them as separate tracks. Business leaders made decisions. Data teams produced reports. IT kept the systems alive. And the two worlds occasionally touched but rarely moved as a single unit. That era is over. As digital systems expanded, as operations became more connected, and as the pace of decision making accelerated, data stopped being a supporting function and became the structural backbone of how a company competes. The lines blurred. Then they merged completely. Today, a business strategy is a data strategy because every growth lever depends on it. Customer experience. Supply chain. Sales. Operations. R&D. Finance. Everything. 𝐁𝐮𝐭 𝐡𝐞𝐫𝐞 𝐢𝐬 𝐭𝐡𝐞 𝐬𝐡𝐢𝐟𝐭 𝐦𝐨𝐬𝐭 𝐨𝐫𝐠𝐚𝐧𝐢𝐳𝐚𝐭𝐢𝐨𝐧𝐬 𝐚𝐫𝐞 𝐰𝐚𝐤𝐢𝐧𝐠 𝐮𝐩 𝐭𝐨: Being data-driven is no longer a differentiator. It is the minimum requirement for being AI-ready. AI will not thrive on partial visibility, inconsistent definitions, disconnected systems, or gut-based decision making. It needs high-quality, contextualized, governed data that flows across the enterprise. And as AI becomes central to competitiveness, the companies that win will be the ones whose business strategy was designed from the start around the data needed to power it. This is why we are entering the era of the data-driven business strategy, where data is not an enabler but the language of the business itself. AI is simply accelerating the trend that was already unfolding. 𝐋𝐢𝐤𝐞 𝐭𝐡𝐢𝐬 𝐩𝐨𝐬𝐭 𝐚𝐧𝐝 𝐰𝐚𝐧𝐭 𝐭𝐨 𝐫𝐞𝐚𝐝 𝐦𝐨𝐫𝐞, 𝐢𝐧𝐜𝐥𝐮𝐝𝐢𝐧𝐠 𝐚𝐝𝐯𝐢𝐜𝐞? https://lnkd.in/euSANUJN

  • View profile for Dr. Kedar Mate
Dr. Kedar Mate is an Influencer

    Founder & CMO of Qualified Health-genAI for healthcare company | Faculty Weill Cornell Medicine | Former Prez/CEO at IHI | Co-Host "Turn On The Lights" Podcast | Snr Scholar Stanford | Continuous, never-ending learner!

    23,858 followers

Information as a Determinant of Health Yesterday for our podcast #TurnOnTheLights, Don Berwick and I interviewed the brilliant Joshua M. Sharfstein and incomparable Joanne Kenen on their new book “Information Sick”. During the conversation Josh and Joanne made a pitch for something that I had not thought of before: the information ecosystem that each of us lives in may determine our health even more than biology or the home that we live in. We’ve long known that a person's ZIP code matters as much or more than their genetic code when it comes to health outcomes. But here's what Josh and Joanne were saying: The information ecosystem someone inhabits may be just as powerful a determinant of health. Our choices for where we get our health news—CDC, TV, medical journals, social media, or WhatsApp message groups—predict our health-seeking behaviors, which in turn predict our health outcomes. Right now, parents are deciding whether to vaccinate themselves or their children not based on biology or genetic risk, but on the information streams they have come to trust. The Facebook groups they're in. The podcast they listened to. The Instagram influencer who shared a video. The friend whose story seemed so compelling. This is information—and increasingly, misinformation and disinformation—operating as a determinant of health in real time. Two parents with identical children, identical insurance, identical access to pediatric care can make radically different vaccination decisions based solely on their information environments. One child gets protected against measles. One doesn't. We've built magnificent systems to understand biological and social determinants of health. But we're barely beginning to grapple with information as a determinant. I’ve seen my role as a physician to be a supplier of accurate, scientific information about health and care. 
But I’ve rarely understood the information ecosystem that my patients live in every minute of every day—the very info environment they immerse themselves in the minute they leave my exam room. Until we in healthcare meet people inside their information ecosystems—the ones they actually live in, not the ones where we wish they lived—we're missing something fundamental about how health gets created or destroyed in our communities. Josh and Joanne are opening a new front in how we create health in our world. Not just in biology or genomics, or in sociology or economics, but also in the information ecosystems that our patients inhabit. "Health is created at home," my colleague Nigel Crisp once wrote…and, perhaps in a very 21st-century rider, "health is also created online." If this resonates, please share your thoughts. How are information ecosystems shaping your health decisions, or the decisions of the communities you serve? #HealthInformation #InformationasDeterminantofHealth #HealthEquity #SocialDeterminantsOfHealth #PublicHealth #Misinformation

  • View profile for João António Sousa

    Solutions Engineering @ Hightouch | Ex-McKinsey

    9,141 followers

    Reporting is NOT delivering insights. Unfortunately, many data & analytics professionals think it is. Reporting dashboards show WHAT's happening and enable basic slicing and dicing, but fail to deliver WHY. Example - "Performance is down 15% WoW" This is just stating the obvious. It's not a real insight. It's not actionable. This leaves many business leaders frustrated. When business stakeholders ask for more dashboards, what they are ultimately trying to achieve is "I need to know what's impacting my key business metrics and what I should do to improve it". Adding 15 more charts/views/slices won't help much to understand what's impacting the key business metrics and which actions should be taken. The key to REAL INSIGHTS that can move the needle? ROOT-CAUSE ANALYSIS to find the WHY (i.e., DIAGNOSTIC analytics) This is the most effective way to drive change with data & analytics. This can make the data & analytics team a TRUSTED ADVISOR and get a seat at the leadership and decision-making table. Insights need to be: 🟢SPEEDY: business stakeholders need quick insights into performance changes to make decisions before it's too late 🟢PROACTIVE: don't wait for business stakeholders to ask. Monitor key metrics and proactively share insights to become that trusted advisor 🟢IMPACT-ORIENTED: focus on the key drivers that drove most of the change and communicate accordingly 🟢EFFECTIVELY COMMUNICATED to drive the right action #data #analytics #impact #diagnosticanalytics
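One hedged sketch of what "diagnostic" beyond the dashboard can look like: instead of reporting "performance is down 15% WoW," attribute the change to the segments that drove it (segment names and numbers here are hypothetical):

```python
def segment_contributions(last_week, this_week):
    """Attribute a week-over-week change in a total metric to its segments.
    last_week / this_week: dict segment -> metric value.
    Returns segments sorted by absolute contribution to the change."""
    total_delta = sum(this_week.values()) - sum(last_week.values())
    contrib = {}
    for seg in last_week:
        delta = this_week.get(seg, 0.0) - last_week[seg]
        contrib[seg] = {
            "delta": delta,
            "share_of_change": delta / total_delta if total_delta else 0.0,
        }
    return dict(sorted(contrib.items(),
                       key=lambda kv: abs(kv[1]["delta"]), reverse=True))

last = {"mobile": 1000.0, "desktop": 800.0, "tablet": 200.0}
this = {"mobile": 700.0, "desktop": 810.0, "tablet": 190.0}
drivers = segment_contributions(last, this)
```

Here the output immediately shows that mobile accounts for essentially all of the drop, turning "what happened" into "where to look for the why" — the first step of a root-cause analysis.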

  • View profile for Beth Kanter
Beth Kanter is an Influencer

    Trainer, Consultant & Nonprofit Innovator in digital transformation & workplace wellbeing, recognized by Fast Company & NTEN Lifetime Achievement Award.

    521,968 followers

This Stanford study examined how six major AI companies (Anthropic, OpenAI, Google, Meta, Microsoft, and Amazon) handle user data from chatbot conversations. Here are the main privacy concerns. 👀 All six companies use chat data for training by default, though some allow opt-out 👀 Data retention is often indefinite, with personal information stored long-term 👀 Cross-platform data merging occurs at multi-product companies (Google, Meta, Microsoft, Amazon) 👀 Children's data is handled inconsistently, with most companies not adequately protecting minors 👀 Limited transparency in privacy policies, which are complex and hard to understand and often lack crucial details about actual practices Practical Takeaways for Acceptable Use Policy and Training for nonprofits in using generative AI: ✅ Assume anything you share will be used for training - sensitive information, uploaded files, health details, biometric data, etc. ✅ Opt out when possible - proactively disable data collection for training (Meta is the one where you cannot) ✅ Information cascades through ecosystems - your inputs can lead to inferences that affect ads, recommendations, and potentially insurance or other third parties ✅ Special concern for children's data - age verification and consent protections are inconsistent Some questions to consider in acceptable use policies and to incorporate in any training. ❓ What types of sensitive information might your nonprofit staff share with generative AI? ❓ Does your nonprofit currently specifically identify what is considered “sensitive information” (beyond PID) that should not be shared with generative AI? Is this incorporated into training? ❓ Are you working with children, people with health conditions, or others whose data could be particularly harmful if leaked or misused? ❓ What would be the consequences if sensitive information or strategic organizational data ended up being used to train AI models? How might this affect trust, compliance, or your mission? 
How is this communicated in training and policy? Across the board, the Stanford research points out that developers’ privacy policies lack essential information about their practices. They recommend policymakers and developers address data privacy challenges posed by LLM-powered chatbots through comprehensive federal privacy regulation, affirmative opt-in for model training, and filtering personal information from chat inputs by default. “We need to promote innovation in privacy-preserving AI, so that user privacy isn’t an afterthought.” How are you advocating for privacy-preserving AI? How are you educating your staff to navigate this challenge? https://lnkd.in/g3RmbEwD

  • View profile for Phil Dinh

    Data Analyst | Analytics Engineer | Data Engineer | Tech Skills & Business Thinking 🔥

    3,888 followers

❌ I spent 5 months learning Machine Learning… and never used it once as a Data Analyst When I started my data journey, I didn’t know what to focus on, and I had no clear pathway for what I needed to learn or how to stand out among thousands of applicants. At that time, AI was growing rapidly and becoming so popular and trendy. Terms like “Machine Learning”, “Python”, and “AI” immediately captured my attention because they sounded so powerful and fancy. I thought if I added them to my resume, I would become more competitive than other applicants. On top of that, I also got distracted by job descriptions for Junior Data Analyst roles that listed requirements like Python, ETL pipelines, and even predictive modeling—which made me believe those were must-have skills from day one. But I was wrong. 🚫 I wasted too much time studying things that a Data Analyst doesn’t really need and rarely uses in a career. I’m honestly surprised how many people have reached out to me and said they faced the same struggle—without a clear pathway, they also didn’t know what to focus on. Even many universities offering Business Analytics courses put heavy emphasis on R, Python, and Machine Learning. ✨ From my experience, here’s what you should focus on to secure a Data Analyst role: Data Analyst: Work with structured data to identify patterns, create reports, and provide insights that guide business decisions. Core tools: Power BI / Tableau (build dashboards), SQL (Beginner → Intermediate), Excel (Power Query, Macros, VBA). 💡 My best tip: Data Analysts live and breathe data visualization. Since many people associate the role with dashboards, a strong Power BI portfolio can instantly capture HR’s attention. I tested this myself (and experienced it from many successful people), and it really works—once I focused on building and sharing more Power BI projects on LinkedIn, the number of interviews I landed increased significantly. 
Data Engineer: Transform raw data into structured data, build pipelines, and maintain systems that make data reliable and accessible. Core tools: Python, SQL, Cloud platforms (AWS/Azure/GCP), ETL pipelines. Data Scientist: Apply statistics and machine learning to explore data, build predictive models, and uncover deeper business opportunities. Core tools: Python, R, ML frameworks, Statistics, Mathematics. ⚠️ Don’t let job descriptions trick you. Many will list every tool under the sun, but the truth is: ➡️ Focus on SQL, Excel, and BI tools first. ➡️ Build projects (Dashboards) that show you can turn data into insights. ➡️ Save Machine Learning and Python for later, if you decide to move into Data Science and Data Engineering. ✨ Let’s connect and share your ideas (I’d love to hear from you). Thank you very much! #DataAnalytics #PowerBI #SQL #CareerGrowth #DataVisualization
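To make the SQL-first advice concrete, here is a minimal, self-contained sketch — using Python's built-in sqlite3 with an entirely hypothetical `orders` table — of the aggregate-filter-rank pattern that makes up much of day-to-day analyst work:

```python
import sqlite3

# Hypothetical sales data, in-memory, so the example runs anywhere.
con = sqlite3.connect(":memory:")
con.executescript("""
CREATE TABLE orders (region TEXT, amount REAL, order_date TEXT);
INSERT INTO orders VALUES
  ('North', 120.0, '2024-01-05'),
  ('North',  90.0, '2024-01-12'),
  ('South', 200.0, '2024-01-07'),
  ('West',   50.0, '2024-01-20');
""")

# The bread-and-butter analyst query: aggregate by a dimension, rank by a metric.
rows = con.execute("""
    SELECT region, SUM(amount) AS revenue, COUNT(*) AS n_orders
    FROM orders
    GROUP BY region
    ORDER BY revenue DESC
""").fetchall()
```

The same GROUP BY / ORDER BY skeleton, pointed at a warehouse instead of SQLite, is what feeds most Power BI and Tableau dashboards.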

  • View profile for Dipu Patel, DMSc, MPAS, ABAIM, PA-C

    “Change happens at the speed of trust.” Shaping the AI-Ready Clinician | Designing Intelligent Systems for Healthcare Education | Speaker | Strategist | Author

    6,124 followers

    This article maps bias across the full lifecycle of medical AI: training data (who is in the dataset and what’s missing) --> labels (how “ground truth” encodes human bias) --> model development and evaluation --> real-world implementation --> which models get published and from where. It illustrates concrete clinical risks, from melanoma models that underperform on dark skin to ICU mortality models with recall as low as 25% in underrepresented groups, and shows how biased systems can drive substandard decisions for the very patients who most need better care. The authors argue that mitigation must go beyond technical fixes, combining diverse datasets, fairness-aware modeling, interpretability, stronger standards, and clinical trials that explicitly test for unbiased performance. Key takeaways - Bias enters early: imbalanced cohorts, nonrandom missing data, and the absence of social determinants of health all push models to work best for already advantaged groups. - “Ground truth” is not neutral: labels reflect provider behavior, misclassification, and structural inequities, so models can learn and amplify existing clinical biases rather than correct them. - Whole-cohort metrics like AUC can hide harm; subgroup performance, fairness metrics, and interpretability tools are essential to detect and mitigate inequity in model outputs. - Real-world deployment introduces new bias: models can fail on populations unlike the training data (Epic sepsis model is a key example), and clinician use/override patterns can themselves be inequitable. - Publication and funding ecosystems skew what gets built and validated, with over half of clinical AI models using US or Chinese data, and radiology dominating the literature. Dipu’s Take If AI in medicine isn’t explicitly designed and governed for equity, it will quietly operationalize our worst blind spots at scale. 
Accuracy alone is a distraction metric; the harder questions are “for whom, in which contexts, and at what clinical cost?” The leadership opportunity here is to treat debiasing as core safety and quality work: mandate diverse data, require subgroup reporting and fairness metrics, bake bias monitoring into post-deployment oversight, and tie reimbursement and approvals to demonstrated equitable performance in trials.
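The subgroup-reporting recommendation is straightforward to operationalize. A minimal sketch computing recall per subgroup, with toy data chosen to mirror the post's point that a whole-cohort metric can look fine while one group's recall sits at 0.25:

```python
def recall_by_group(y_true, y_pred, groups):
    """Recall (sensitivity) computed separately for each subgroup —
    the kind of disaggregated check that whole-cohort AUC can hide."""
    out = {}
    for g in set(groups):
        tp = sum(1 for t, p, gg in zip(y_true, y_pred, groups)
                 if gg == g and t == 1 and p == 1)
        fn = sum(1 for t, p, gg in zip(y_true, y_pred, groups)
                 if gg == g and t == 1 and p == 0)
        out[g] = tp / (tp + fn) if (tp + fn) else float("nan")
    return out

# Toy cohort: all 8 patients are true positives; the model catches every case
# in group A but only 1 of 4 in group B. Overall recall (5/8) masks the gap.
y_true = [1, 1, 1, 1, 1, 1, 1, 1]
y_pred = [1, 1, 1, 1, 1, 0, 0, 0]
groups = ["A", "A", "A", "A", "B", "B", "B", "B"]
per_group = recall_by_group(y_true, y_pred, groups)
```

The same pattern extends to any metric; the leadership ask is simply that this per-group table be reported alongside the headline number.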

  • View profile for Antonio Vizcaya Abdo

    Sustainability Leader | Governance, Strategy & ESG | Turning Sustainability Commitments into Business Value | TEDx Speaker | 126K+ LinkedIn Followers

    126,225 followers

    Double Materiality Assessment Checklist 🌎 Double materiality provides a structured way to connect sustainability issues with both financial performance and broader societal and environmental impacts. It supports more informed decision-making by highlighting areas of convergence between business risk and external impact. The process begins by understanding how the organization is structured across regions, business units, and value chains. This includes identifying where operational control exists, how influence is exercised, and how these factors shape the boundaries of the assessment. Governance frameworks and internal reporting structures provide important reference points. Identifying sustainability topics involves analyzing external frameworks, stakeholder expectations, and sector-specific risks. Trends such as regulatory developments, investor pressure, or resource volatility often help signal material issues. Segmenting stakeholder perspectives prevents generalization and adds precision to the process. Both risks and opportunities are relevant. Some topics may represent cost or compliance exposure, while others point to potential for innovation or improved market positioning. Capturing this range contributes to a more complete understanding of where sustainability intersects with business strategy. Assessing impact significance and financial relevance requires evaluating the scale, likelihood, and persistence of potential outcomes. Financial implications can include cost increases, revenue loss, or capital constraints. A combination of internal data, expert input, and scenario testing helps ground these evaluations. Treating impact and financial relevance as separate dimensions supports more accurate prioritization. When issues are collapsed into a single score, important nuances are often lost. A two-axis approach makes it easier to identify high-priority topics that require targeted responses across different functions. 
The output of the assessment should link to existing systems such as risk registers, compliance tools, or sustainability reporting frameworks. This improves internal coherence and ensures that material topics inform both risk oversight and performance tracking. To remain effective, the assessment framework should be updated regularly to reflect new data, evolving stakeholder expectations, and changes in the external environment. Continuous alignment with strategic objectives and governance processes ensures that results stay relevant and actionable. #sustainability #sustainable #esg #business #materiality
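The two-axis idea above can be sketched in a few lines: score each topic on impact and financial relevance separately (1–5 scales and thresholds are assumptions for illustration) and bucket rather than collapse into a single score:

```python
def double_materiality_matrix(topics, threshold=3):
    """Bucket topics on two separate axes instead of averaging them.
    topics: list of (name, impact_score, financial_score), scores on 1-5."""
    buckets = {"both material": [], "impact-material": [],
               "financially material": [], "not material": []}
    for name, impact, financial in topics:
        if impact >= threshold and financial >= threshold:
            buckets["both material"].append(name)
        elif impact >= threshold:
            buckets["impact-material"].append(name)
        elif financial >= threshold:
            buckets["financially material"].append(name)
        else:
            buckets["not material"].append(name)
    return buckets

# Hypothetical topics and scores, purely illustrative.
topics = [("climate transition", 5, 5), ("water use", 4, 2),
          ("local sponsorships", 1, 1), ("supply disruption", 2, 4)]
matrix = double_materiality_matrix(topics)
```

A topic like "water use" (high impact, low financial relevance) keeps its distinct priority here, whereas a single averaged score would have flattened it into the middle of the pack.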
