Machine Learning Algorithms

Explore top LinkedIn content from expert professionals.

  • View profile for Andrew Ng

    DeepLearning.AI, AI Fund and AI Aspire

    2,471,241 followers

Last week, I described four design patterns for AI agentic workflows that I believe will drive significant progress: Reflection, Tool use, Planning and Multi-agent collaboration. Instead of having an LLM generate its final output directly, an agentic workflow prompts the LLM multiple times, giving it opportunities to build step by step to higher-quality output.

Here, I'd like to discuss Reflection. It's relatively quick to implement, and I've seen it lead to surprising performance gains.

You may have had the experience of prompting ChatGPT/Claude/Gemini, receiving unsatisfactory output, delivering critical feedback to help the LLM improve its response, and then getting a better response. What if you automate the step of delivering critical feedback, so the model automatically criticizes its own output and improves its response? This is the crux of Reflection.

Take the task of asking an LLM to write code. We can prompt it to generate the desired code directly to carry out some task X. Then, we can prompt it to reflect on its own output, perhaps as follows:

"Here's code intended for task X: [previously generated code]. Check the code carefully for correctness, style, and efficiency, and give constructive criticism for how to improve it."

Sometimes this causes the LLM to spot problems and come up with constructive suggestions. Next, we can prompt the LLM with context including (i) the previously generated code and (ii) the constructive feedback, and ask it to use the feedback to rewrite the code. This can lead to a better response. Repeating the criticism/rewrite process might yield further improvements.

This self-reflection process allows the LLM to spot gaps and improve its output on a variety of tasks including producing code, writing text, and answering questions. And we can go beyond self-reflection by giving the LLM tools that help evaluate its output; for example, running its code through a few unit tests to check whether it generates correct results on test cases, or searching the web to double-check text output. Then it can reflect on any errors it found and come up with ideas for improvement.

Further, we can implement Reflection using a multi-agent framework. I've found it convenient to create two agents, one prompted to generate good outputs and the other prompted to give constructive criticism of the first agent's output. The resulting discussion between the two agents leads to improved responses.

Reflection is a relatively basic type of agentic workflow, but I've been delighted by how much it improved my applications' results. If you're interested in learning more about reflection, I recommend:

- Self-Refine: Iterative Refinement with Self-Feedback, by Madaan et al. (2023)
- Reflexion: Language Agents with Verbal Reinforcement Learning, by Shinn et al. (2023)
- CRITIC: Large Language Models Can Self-Correct with Tool-Interactive Critiquing, by Gou et al. (2024)

[Original text: https://lnkd.in/g4bTuWtU ]
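To make the generate/criticize/rewrite loop concrete, here is a minimal sketch. The `llm(prompt)` helper is hypothetical (it stands in for whatever chat-completion API you use), and the prompts and fixed iteration count are illustrative, not prescriptive.

```python
def llm(prompt: str) -> str:
    """Hypothetical wrapper around your chat-completion API of choice."""
    raise NotImplementedError

def reflect_and_rewrite(task: str, rounds: int = 2) -> str:
    # Step 1: generate an initial draft directly.
    draft = llm(f"Write code for the following task:\n{task}")
    for _ in range(rounds):
        # Step 2: ask the model to criticize its own output.
        critique = llm(
            f"Here's code intended for this task: {task}\n\n{draft}\n\n"
            "Check the code carefully for correctness, style, and efficiency, "
            "and give constructive criticism for how to improve it."
        )
        # Step 3: rewrite, with the previous draft and the feedback as context.
        draft = llm(
            f"Task: {task}\n\nPrevious code:\n{draft}\n\n"
            f"Feedback:\n{critique}\n\nRewrite the code using this feedback."
        )
    return draft
```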

  • View profile for Sreedath Panat

    MIT PhD | IITM | 100K+ LinkedIn | Co-founder Vizuara & Videsh | Making AI accessible for all

    117,450 followers

YOLO (You Only Look Once) revolutionized object detection by solving a fundamental problem: how to detect objects in real time with just one forward pass through a neural network.

Here is how it works in simple terms: instead of scanning an image multiple times like traditional methods, YOLO divides the entire image into a grid (7x7 in the original paper). Each grid cell becomes responsible for predicting whether it contains an object and what that object is.

For every grid cell, the algorithm predicts three key things:
1. Objectness confidence - how likely is it that there is an object here?
2. Class probability - what type of object is it?
3. Bounding box parameters - where exactly is the object located?

The genius is in the "only look once" approach. Traditional object detection methods would run multiple scans across different regions of an image. YOLO does everything in a single pass, making it incredibly fast for real-time applications. The backbone is typically a CNN that processes the entire image simultaneously. The final confidence score combines the objectness probability with the intersection-over-union (IoU) between predicted and true boxes, giving you both detection accuracy and precise localization.

Of course, vanilla YOLO has limitations - it struggles with small objects, crowded scenes, and unusual aspect ratios. But its speed and simplicity made it a game-changer for computer vision applications.

If you are just getting started with object detection, I recently created an introductory lecture breaking down YOLO for total beginners on Vizuara's YouTube channel: https://lnkd.in/gwEEzqiT

What is your experience with real-time object detection? Have you implemented YOLO in any projects?
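To make the single-pass idea concrete, here is a minimal NumPy sketch of decoding a YOLOv1-style output tensor, assuming the original paper's settings (S=7 grid, B=2 boxes per cell, C=20 classes). The tensor layout is an assumption for illustration, and a real pipeline would also apply non-max suppression.

```python
import numpy as np

S, B, C = 7, 2, 20  # grid size, boxes per cell, classes (YOLOv1 settings)

def decode(pred: np.ndarray, conf_threshold: float = 0.25):
    """pred has shape (S, S, B*5 + C): per cell, B boxes of
    (x, y, w, h, objectness) followed by C class probabilities."""
    detections = []
    for row in range(S):
        for col in range(S):
            cell = pred[row, col]
            class_probs = cell[B * 5:]
            for b in range(B):
                x, y, w, h, objectness = cell[b * 5:(b + 1) * 5]
                # Class-specific confidence = P(class | object) * objectness.
                scores = class_probs * objectness
                cls = int(np.argmax(scores))
                if scores[cls] < conf_threshold:
                    continue
                # (x, y) are offsets within the cell; convert to image-relative.
                cx, cy = (col + x) / S, (row + y) / S
                detections.append((cx, cy, w, h, cls, float(scores[cls])))
    return detections
```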

  • View profile for Federico Danieli

    AI/ML Research Scientist

    2,480 followers

We taught LSTMs to run in parallel. Now they've grown to 7B parameters, and are ready to challenge Transformers.

For years, we've assumed RNNs were doomed—inherently sequential, too slow to train, impossible to scale—and looked at Transformers as the go-to choice for large language modelling. Turns out we just needed better math.

Introducing ParaRNN: Unlocking Parallel Training of Nonlinear RNNs for LLMs

👉 [TL;DR] We can now train nonlinear RNNs at unprecedented scales, by parallelising what was previously considered inherently sequential—the unrolling of recurrent computations. If you care about fast inference for LLMs, or are into time-series analysis, we've got good news for you: RNNs are back on the menu.

🐍 But wait, doesn't Mamba parallelise this too? Sure, but here's the catch: Mamba requires state-space updates to be linear, fundamentally limiting expressivity. We want the freedom to apply nonlinearities sequence-wise.

💡 Our approach: recast the sequence of nonlinear recurrences as a system of equations, then solve them in parallel using Newton's method. As a bonus, make everything blazingly fast with custom CUDA kernels.

⚡ The result? Up to 665x speedup over naive sequential processing, and training times comparable to Mamba, even with the extra overhead from Newton's iterations.

📈 So we took LSTM and GRU architectures—remember those from the pre-Transformer era?—scaled them to 7B parameters, and achieved perplexity comparable to similarly-sized Transformers. No architectural tricks. Just pure scale, finally unlocked.

🔥 Why this matters: Mamba challenged the Transformer's monopoly. ParaRNN expands the search space of available architectures. It's time to get back to the drawing board and use these tools to start designing the next generation of inference-efficient models.

💻 To aid with this, we're releasing open-source code to parallelise RNN applications out of the box. No need to implement your own parallel scan, nor to remember how Newton's method works: just prescribe the recurrence relationship, flag any structure in your hidden-state update, and watch GPUs go brrrrrrr.

Paper: https://lnkd.in/dTEGh5Jp
Code: https://lnkd.in/d_Ven9Y2
Collaborators: Pau Rodriguez Lopez, Miguel Sarabia, Xavier Suau, Luca Zappella

💼 And if you're a PhD student interested in working on these topics, we have a fresh internship position just for you: https://lnkd.in/dDVSsfJj

Time to explore what truly nonlinear RNNs can do at scale
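For intuition about the math, here is a toy NumPy sketch of the core idea under my own simplifications (scalar states, a made-up cell f, and the scan written sequentially); the actual ParaRNN work operates on vector LSTM/GRU states with custom CUDA kernels. The point: each Newton step on the stacked system is a *linear* recurrence, exactly what a parallel scan can evaluate in O(log T) depth.

```python
import numpy as np

def f(h_prev, x):
    return np.tanh(0.5 * h_prev + x)          # stand-in for an RNN cell

def df_dh(h_prev, x):
    return 0.5 * (1.0 - np.tanh(0.5 * h_prev + x) ** 2)

def newton_unroll(x, h0=0.0, iters=12):
    T = len(x)
    h = np.zeros(T)                           # guess all T hidden states at once
    for _ in range(iters):
        prev = np.concatenate(([h0], h[:-1]))
        g = h - f(prev, x)                    # residual of the stacked system
        a = df_dh(prev, x)
        # Bidiagonal Newton system reduces to d_t = a_t * d_{t-1} + g_t,
        # written sequentially here but scan-able in parallel.
        d, acc = np.zeros(T), 0.0
        for t in range(T):
            acc = a[t] * acc + g[t]
            d[t] = acc
        h = h - d
    return h

# Compare against plain sequential unrolling; the gap should be near machine precision.
x = np.random.default_rng(0).normal(size=64)
h_seq, hp = [], 0.0
for xt in x:
    hp = f(hp, xt)
    h_seq.append(hp)
print(np.max(np.abs(newton_unroll(x) - np.array(h_seq))))
```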

  • View profile for Timothy Armoo

    Business Builder | Global Speaker | #1 Sunday Times Bestselling Author

    212,328 followers

You think Shein is just another fast fashion company. But really, they are one of the biggest technology companies you've never truly understood. Here are 3 things this Chinese-owned fashion giant did to be worth more than H&M and Zara combined! 👇

👉 Shein uses its own proprietary tools, along with Google Trends, to figure out what styles are trending through search and social media. These then go to their team of 800 designers, who create designs based on the styles. Designs are created within 36 hours. Social trends generate hype before the item has even been produced by Shein. This is incredibly smart: they sell into pent-up demand for a certain style. Their products then leverage that hype to build more hype, this time specifically around their brand.

👉 Shein creates small batches of products - as small as 10 pieces - then puts them on its site. Using AI, they track behaviour like browsing product details, the number of add-to-carts, and how long people spend on a particular product. All 3,000 of their suppliers are given access to Shein's ERP platform, which lets them see in real time which products are being clicked on the most, then instantly create more of those. This real-time model reduces the time from design to finished product to 3 days and drastically reduces excess waste. For context, their nearest competitor Zara takes 5 weeks, and that's considered fast. There is fast fashion... then there is supersonic fashion.

👉 Shein adds over 1,000 new styles every single day in over 220 countries! To put this into context, Zara adds 300 per month.

In reality, Shein is an optimisation machine - ingesting what we like, then feeding us more of that... and doing it in a localised way. A second algorithm weighs how deeply a user engages with the site: a product with more viewing time gets a higher weighting than one people merely clicked on for 2 seconds. The user is then shown similar products. You and I could go on Shein and be shown drastically different products due to our browsing data. If this hyper-personalisation sounds very similar, it's because another super app launched in China has the same thing... TikTok.

--

It's no surprise that Shein has been called the TikTok of commerce. Whether you love them or hate them - and there are many reasons to hate them - you can't deny that they're building the future of what commerce looks like. Commerce is moving from a human-designed POV to what the machines tell us. I believe that's generally where consumption is moving: when there is so much choice, let the algorithms decide. There will be no need for human curation. Even more important, I believe we're just at the beginning of this wave...

  • View profile for Dr. Ayesha Khanna

    AI Entrepreneur. Board Member. Reuters Trailblazing Woman in Enterprise AI (2026). Forbes Groundbreaking Female Entrepreneur in Southeast Asia. LinkedIn Top Voice for AI.

    92,178 followers

SHEIN, the fast fashion giant, nearly doubled its carbon emissions in 2023, becoming the industry's biggest polluter.

Here's the twist: its AI and machine learning tech, designed to optimize demand and supply, is fueling this rapid growth and the environmental fallout.

An eye-opening article from Grist underscores my concerns: many companies are so dazzled by AI's profit potential that they overlook its societal and sustainability impacts.

Key Takeaways:
► AI personalizes clothing trends in almost real time, leading to a surge of new styles. With all 5,400 suppliers accessing an AI platform, SHEIN adds up to 10,000 new items daily to keep up with trends.
► New designs can go from concept to garment in just 10 days, significantly spiking carbon emissions. Plus, heavy reliance on air freight further worsens their carbon footprint compared to sea or land transport.

The worst part? The more AI churns out cheap, trendy items, the more customers buy, and the cycle continues. 😔

Bottom line: while AI streamlines SHEIN's operations, it also amplifies the harmful effects of fast fashion, leading to increased emissions, waste, and resource depletion. We need to rethink our approach.

Read the article here: https://lnkd.in/ezP7PDJy

#artificialintelligence #fastfashion #innovation

  • View profile for Andrew Jones

    Data Science Infinity | 100k+ Followers | Amazon | PlayStation | 6x Patents | Author | Advisor

    116,926 followers

PCA (Principal Component Analysis) is a tricky concept to grasp. Here is a MATH-FREE explanation:

Principal Component Analysis is a technique often used in Data Science & ML for "dimensionality reduction." This means it can help us reduce a large set of variables or features down to a smaller set that still contains much of the original information, or variance!

For example's sake, let's say our original dataset contained 10 numeric columns (features). PCA could reduce this set of ten features down to a smaller number (let's say 3), each of which is a "principal component."

These newly created features, or principal components, are somewhat abstract. They are a blend of some of the original features, where the algorithm found they were correlated. By blending the original variables rather than simply removing them (like we might with feature selection techniques), we hope to keep much of the key information that is held within our original feature set.

To be completely clear: in our example so far, the PCA algorithm itself did not choose to create 3 components; we, the Data Scientists, pre-specified this number. Similar to algorithms like k-means, we have to tell the algorithm how many components we want to end up with - otherwise it will just construct a component for every original feature!

So how do we decide how many components we want or need? There is no right or wrong answer - we have a trade-off on our hands! We need to understand how much variance from the original feature set is captured by each additional principal component. Based on this, we must decide what is best for our task!

[Pro Tips] Before applying PCA:
- Standardize your original features to ensure they all exist on a comparable scale
- Accept that you will lose some of the information/variance contained in your original data
- Accept that it may become more difficult to interpret the outputs of a model using components as inputs vs. the original features

#datascience #analytics #data #datascienceinfinity
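Here is a minimal scikit-learn sketch of the workflow described above, on a synthetic 10-feature dataset (the data and the choice of 3 components are illustrative); `explained_variance_ratio_` is the quantity you inspect when deciding whether 3 components are enough.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler

# Toy dataset: 200 rows, 10 numeric features, with some correlated columns.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 10))
X[:, 3] = X[:, 0] + 0.1 * rng.normal(size=200)
X[:, 7] = X[:, 1] - X[:, 2] + 0.1 * rng.normal(size=200)

# Pro tip #1: standardize first so all features are on a comparable scale.
X_scaled = StandardScaler().fit_transform(X)

# We, not the algorithm, choose the number of components.
pca = PCA(n_components=3)
X_reduced = pca.fit_transform(X_scaled)

print(X_reduced.shape)                     # (200, 3)
# How much of the original variance each component captures - the
# trade-off information used to decide how many components to keep.
print(pca.explained_variance_ratio_)
print(pca.explained_variance_ratio_.sum())
```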

  • View profile for Shailendra Sahu, FRM, CQF

    HFT || Risk Management & Analytics || Data Science

    9,742 followers

Factor Analysis vs. Principal Component Analysis

Many people often confuse factor analysis (FA) and principal component analysis (PCA). While both are dimensionality reduction techniques, they serve different purposes.

Principal Component Analysis (PCA)

PCA transforms the original variables into a new set of uncorrelated variables called principal components. These are linear combinations of the original variables, ordered so that the first principal component explains the maximum possible variance in the data, the second explains the next highest variance, and so on. The main goals of PCA are:
1. Variance explanation: PCA aims to explain as much of the total variance in the dataset as possible, by finding components that capture the maximum variance.
2. Dimensionality reduction: by selecting a subset of the principal components, PCA reduces the dimensionality of the data while retaining most of the variability present in the original variables.
3. Orthogonality: principal components are orthogonal to each other, ensuring that they capture distinct aspects of the data's variance.

Factor Analysis (FA)

FA is a statistical method used to identify latent variables, or factors, that explain the observed correlations among the original variables. These latent factors are not directly observed but are inferred from the patterns of covariance among the observed variables. The primary objectives of FA are:
1. Covariance explanation: FA focuses on explaining the covariance among the original variables, seeking underlying factors that account for the shared variance.
2. Latent variables: the goal is to identify a smaller number of unobserved factors that describe the relationships among the observed variables. These factors are assumed to be the source of the observed correlations.
3. Model-based approach: FA is based on a specific model in which the observed variables are expressed as linear combinations of the factors plus unique error terms.

Key Differences
1. Purpose: PCA reduces dimensionality by explaining the total variance in the data, while FA uncovers latent factors that explain the covariance among variables.
2. Components vs. factors: PCA produces principal components that are linear combinations of the original variables and aim to capture as much variance as possible. FA identifies latent factors inferred from the observed variables and aims to explain the covariance structure.
3. Variance vs. covariance: PCA maximizes the variance explained by the components, whereas FA models the covariance structure of the data.

In summary, while both PCA and FA are used for reducing the dimensionality of data, they serve different purposes and are based on different conceptual frameworks.

#quant #regression #pca #factor #variance
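The distinction shows up clearly in code. Below is a small scikit-learn sketch (the data is my own synthetic example) comparing the two on data generated from an explicit factor model: note that `FactorAnalysis` estimates a separate noise variance per variable (the "unique error terms" in the FA model), something PCA has no notion of.

```python
import numpy as np
from sklearn.decomposition import PCA, FactorAnalysis

# Synthetic data with genuine latent structure: 2 hidden factors drive
# 6 observed variables, plus per-variable noise - exactly what FA assumes.
rng = np.random.default_rng(42)
factors = rng.normal(size=(500, 2))                 # latent variables
loadings = rng.normal(size=(2, 6))                  # factor -> observable map
noise = rng.normal(size=(500, 6)) * np.array([0.1, 0.5, 0.2, 0.8, 0.3, 0.4])
X = factors @ loadings + noise

pca = PCA(n_components=2).fit(X)
fa = FactorAnalysis(n_components=2).fit(X)

# PCA: orthogonal directions of maximum total variance.
print(pca.explained_variance_ratio_)
# FA: estimated loadings plus a *separate* noise variance per variable.
print(fa.components_)
print(fa.noise_variance_)
```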

  • View profile for Aishwarya Srinivasan
    627,890 followers

Most people building with AI today have never really thought about GANs. But if you want to understand how modern AI systems learn to get better, GANs are one of the most important ideas to know.

A Generative Adversarial Network (GAN) works by training two models together:
1/ Generator: creates synthetic data from noise
2/ Discriminator: tries to detect whether the data is real or fake

The generator improves by trying to fool the discriminator. The discriminator improves by learning to catch the generator's mistakes. This competitive loop pushes both models to improve.

What many people miss is that GANs were never just about generating images. Long before generative AI became mainstream, GANs were already being used in practical machine learning systems. When I was working as a data scientist at IBM, we experimented with GAN-style setups to improve model performance. In practice, this meant the system was constantly being exposed to harder edge cases instead of only clean training data, which helped improve the model's accuracy and generalization. The generator creates harder problems, and the model becomes smarter by learning to solve them.

That's why understanding GANs still matters today. You'll see these ideas show up in:
✦ Synthetic data generation
✦ Data augmentation for small datasets
✦ Image super-resolution
✦ Medical imaging and drug discovery
✦ Robustness testing for ML systems

Even though diffusion models and LLMs dominate the conversation today, the core idea of adversarial learning is still everywhere in modern AI systems. So if you find yourself at a crossroads where model accuracy has stalled, try using a GAN.
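To make the adversarial loop concrete, here is a minimal PyTorch sketch that trains a generator to mimic a 1-D Gaussian. It is a toy illustration of the loop described above, not production GAN code (real GANs use convolutional networks and various stability tricks).

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Generator: noise -> synthetic sample. Discriminator: sample -> real/fake logit.
G = nn.Sequential(nn.Linear(8, 32), nn.ReLU(), nn.Linear(32, 1))
D = nn.Sequential(nn.Linear(1, 32), nn.ReLU(), nn.Linear(32, 1))

opt_g = torch.optim.Adam(G.parameters(), lr=1e-3)
opt_d = torch.optim.Adam(D.parameters(), lr=1e-3)
bce = nn.BCEWithLogitsLoss()
batch = 128

for step in range(3000):
    real = 4.0 + 1.5 * torch.randn(batch, 1)   # target distribution N(4, 1.5)
    fake = G(torch.randn(batch, 8))

    # Discriminator step: learn to label real as 1 and fake as 0.
    d_loss = bce(D(real), torch.ones(batch, 1)) + \
             bce(D(fake.detach()), torch.zeros(batch, 1))
    opt_d.zero_grad(); d_loss.backward(); opt_d.step()

    # Generator step: try to make the discriminator call fakes "real".
    g_loss = bce(D(fake), torch.ones(batch, 1))
    opt_g.zero_grad(); g_loss.backward(); opt_g.step()

with torch.no_grad():
    samples = G(torch.randn(1000, 8))
# Should land roughly near the target's mean 4 and std 1.5.
print(f"mean={samples.mean():.2f}, std={samples.std():.2f}")
```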

  • View profile for Sahar Mor

    I help researchers and builders make sense of AI | ex-Stripe | aitidbits.ai | Angel Investor

    41,888 followers

Researchers from Oxford University just achieved a 14% performance boost in mathematical reasoning by making LLMs work together like specialists in a company.

In their new MALT (Multi-Agent LLM Training) paper, they introduced a novel approach where three specialized LLMs - a generator, verifier, and refinement model - collaborate to solve complex problems, similar to how a programmer, tester, and supervisor work together.

The breakthrough lies in their training method:
(1) Tree-based exploration - generating thousands of reasoning trajectories by having models interact
(2) Credit attribution - identifying which model is responsible for successes or failures
(3) Specialized training - using both correct and incorrect examples to train each model for its specific role

Using this approach on 8B-parameter models, MALT achieved relative improvements of 14% on the MATH dataset, 9% on CommonsenseQA, and 7% on GSM8K. This represents a significant step toward more efficient and capable AI systems, showing that well-coordinated smaller models can match the performance of much larger ones.

Paper: https://lnkd.in/g6ag9rP4

—
Join thousands of world-class researchers and engineers from Google, Stanford, OpenAI, and Meta staying ahead on AI: http://aitidbits.ai
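Based only on the description above, here is a rough, hypothetical sketch of how the three trained roles might be composed at inference time. The paper's actual contribution is the training pipeline (tree-based exploration and credit attribution); `call_model` is a made-up helper standing in for three separately fine-tuned checkpoints.

```python
def call_model(role: str, prompt: str) -> str:
    """Hypothetical: route to the generator/verifier/refiner checkpoint."""
    raise NotImplementedError

def malt_style_answer(question: str, max_rounds: int = 2) -> str:
    # Generator proposes a solution.
    answer = call_model("generator", f"Solve step by step:\n{question}")
    for _ in range(max_rounds):
        # Verifier inspects it, like a tester reviewing a programmer's code.
        critique = call_model(
            "verifier",
            f"Question: {question}\nProposed solution:\n{answer}\n"
            "Point out any errors in the reasoning.",
        )
        # Refinement model produces a corrected solution from the feedback.
        answer = call_model(
            "refiner",
            f"Question: {question}\nSolution:\n{answer}\n"
            f"Verifier feedback:\n{critique}\nProduce a corrected solution.",
        )
    return answer
```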

  • View profile for Tomasz Tunguz
    405,481 followers

I discovered I was designing my AI tools backwards. Here's an example.

This was my newsletter processing chain: reading emails, calling a newsletter processor, extracting companies, & then adding them to the CRM. This involved four different steps, costing $3.69 for every thousand newsletters processed.

Before: Newsletter Processing Chain (first image)

Then I created a unified newsletter tool which combined everything, using the Google Agent Development Kit, Google's framework for building production-grade AI agent tools: (second image)

Why is the unified newsletter tool more complicated? It includes multiple actions in a single interface (process, search, extract, validate), implements state management that tracks usage patterns & caches results, has rate limiting built in, & produces structured JSON outputs with metadata instead of plain text.

But here's the counterintuitive part: despite being more complex internally, the unified tool is simpler for the LLM to use, because it provides consistent, structured outputs that are easier to parse, even though those outputs are longer.

To understand the impact, we ran tests of 30 iterations per test scenario. The results show the impact of the new architecture: (third image)

We were able to reduce tokens by 41% (p=0.01, statistically significant), which translated linearly into cost savings. The success rate improved by 8% (p=0.03), & we were able to hit the cache 30% of the time, which is another cost savings. While individual tools produced shorter, "cleaner" responses, they forced the LLM to work harder parsing inconsistent formats. Structured, comprehensive outputs from unified tools enabled more efficient LLM processing, despite being longer.

My workflow relied on dozens of specialized Ruby tools for email, research, & task management. Each tool had its own interface, error handling, & output format. By rolling them up into meta tools, the ultimate performance is better, & there are tremendous cost savings.

You can find the complete architecture on GitHub.
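As an illustration of the pattern (not the author's actual Ruby/ADK code), here is a hypothetical Python sketch of a unified tool: one entry point with an `action` parameter, built-in caching, and a single JSON envelope for every response, which is what makes the outputs consistent for the LLM to parse.

```python
import json
import time

# Hypothetical unified tool: the handler names and envelope fields are
# invented for illustration of the pattern described above.
_cache: dict = {}

def newsletter_tool(action: str, payload: dict) -> str:
    """Single interface for process / search / extract / validate."""
    key = (action, json.dumps(payload, sort_keys=True))
    if key in _cache:                        # caching lives inside the tool,
        return _cache[key]                   # not in the LLM's workflow

    handlers = {
        "process": lambda p: {"summary": f"processed {p.get('id')}"},
        "search": lambda p: {"hits": []},
        "extract": lambda p: {"companies": []},
        "validate": lambda p: {"valid": True},
    }
    if action not in handlers:
        result = {"ok": False, "error": f"unknown action: {action}"}
    else:
        result = {"ok": True, "data": handlers[action](payload)}

    # Every response shares one envelope: status, data, and metadata -
    # longer than plain text, but consistent and easy to parse.
    envelope = json.dumps({**result, "meta": {"action": action, "ts": time.time()}})
    _cache[key] = envelope
    return envelope

print(newsletter_tool("process", {"id": "newsletter-42"}))
```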
