GitHub has announced that starting April 24, interaction data from Copilot Free, Pro, and Pro+ users will be used to train its AI models. Users are opted in by default and must manually disable the setting; Copilot Business and Enterprise users are excluded. The scope includes accepted or modified outputs, code snippets, repository structure, navigation patterns, and more. Private repository code can be collected when actively working with Copilot, and collected data may also be shared with Microsoft and its subsidiaries.

Community reaction has been largely negative. Developers have called the opt-in-by-default approach a dark pattern, raised concerns about model collapse from training on AI-generated code, and flagged potential GDPR issues with GitHub's "legitimate interest" basis for processing personal data.

For organizations, the policy creates a practical risk: individual users on personal-tier licenses could inadvertently expose proprietary code if they don't opt out. GitHub's FAQ clarifies that data from paid organization repositories is excluded regardless of subscription tier.

If you're a Copilot user, check your settings before April 24.

#github #copilot #ai #privacy #gdpr #opensource #softwaredevelopment https://lnkd.in/eyDcuKkJ
Steef-Jan Wiggers’ Post
More Relevant Posts
-
GitHub changed its policy on April 24. Three days ago.

If your developers use Copilot (GitHub's AI coding tool), the ones on personal Free, Pro, or Pro+ accounts are now sharing their interaction data with GitHub for AI model training. Prompts they type. Code they show Copilot. Context from whatever they're building. All of it. By default.

The Business and Enterprise plans are protected: they're contractually excluded. But here's the thing: in most organisations I know, not every developer is on the company-managed plan. Some are on personal accounts. Some have their own Copilot subscription from before the company set one up. Some are on the Free tier because nobody told them there was a company plan to join.

Those people are now, unless they opted out, sharing work-related code interactions with GitHub.

This isn't a breach. It's a policy gap. The kind that happens when AI tools move faster than procurement.

Three things to do this week:
→ Find out how your developers are actually licensed for Copilot
→ For anyone on a personal account using it for work: opt out (GitHub Settings > Privacy > disable AI model training) or move them onto the Business plan
→ Add "which AI tools, on which accounts" to your AI use policy

It's a thirty-minute audit. The alternative is finding out the hard way.

https://lnkd.in/g_ZH5-pd

#Cybersecurity #AI #Leadership #AIGovernance #Copilot
-
This should be setting off alarm bells in every engineering and security organization right now. GitHub quietly flipped a default switch on April 24th, and unless your developers on personal Copilot accounts actively opted out, your proprietary source code, prompts, and internal context are now being used to train AI models, with no breach notification, no warning, and no fanfare.

This isn't a theoretical risk. Think about what your developers actually type into Copilot: internal architecture decisions, unreleased product code, authentication logic, business-sensitive workflows. All of that is now fair game under GitHub's updated policy for personal account holders.

The uncomfortable truth is that most organizations have no idea how many of their developers are using personal Copilot accounts for work. Shadow AI adoption is real, it's widespread, and policies almost never keep pace with how fast developers actually adopt these tools.

This is a 30-minute audit that could prevent significant intellectual property and compliance exposure. Do it this week: find out who is using what, on which account, and close the gap before your legal or compliance team finds out the hard way.
-
⚠️ Warning! GitHub is quietly backtracking on its commitment not to use your code to train its AI. Starting April 24, 2026, Microsoft (owner of GitHub) will 𝐚𝐮𝐭𝐨𝐦𝐚𝐭𝐢𝐜𝐚𝐥𝐥𝐲 𝐞𝐧𝐫𝐨𝐥𝐥 GitHub Copilot Free, Pro, and Pro+ users in a program that uses their "Interaction Data" to train AI models 𝐮𝐧𝐥𝐞𝐬𝐬 𝐭𝐡𝐞𝐲 𝐦𝐚𝐧𝐮𝐚𝐥𝐥𝐲 𝐨𝐩𝐭 𝐨𝐮𝐭.

🔵 What is GitHub?
GitHub is a repository of the world's collective engineering logic. If your company builds software, your most valuable trade secrets likely live on GitHub. In 2018, Microsoft bought it for $7.5 billion.

🔵 What is actually changing?
GitHub has always promised that your private repositories will stay private. That is still mostly true "at rest" (while the code is sitting on their servers). The moment you start typing, that code is "in flight." Under the new policy, the snippets sent to the AI, and how you interact with them, are fair game for training.

🔵 Why now?
The AI industry has hit a wall. Public data (the internet) has been "scraped dry." To make the next generation of AI "agentic" (capable of reasoning), companies need human-in-the-loop data.
▸ They don't just want your code.
▸ They want to see why you rejected a suggestion.
▸ They want to see how you fixed a bug.
They are harvesting professional intuition.

🔵 The Real Signal
Opt-in is the gold standard for trust. By making this opt-out, GitHub is banking on inertia. They are assuming users are too busy to check the fine print. Notably, they aren't touching Enterprise accounts or students. That leaves the mid-market and individual pros as the primary "training crop" for Microsoft's next trillion-dollar model.

🔵 The Strategic Implication
If you are in a regulated industry, a family office, or a high-growth tech firm, your "logic" is your competitive advantage. Allowing it to be subsumed into a global model is IP leakage.

🔵 The Fix
→ Log in to GitHub.
→ Settings > Copilot > Privacy.
→ Toggle OFF "Allow GitHub to use my data for AI model training."
🍋🟩 My Take
Let's call this what it actually is: mass-scale, involuntary RLHF.

Reinforcement Learning from Human Feedback (RLHF) is the "secret sauce" that makes AI feel smart. It's the process where humans grade the AI's homework. Usually, companies pay millions to specialized data labelers for this. By flipping this switch, 𝐌𝐢𝐜𝐫𝐨𝐬𝐨𝐟𝐭 𝐢𝐬 𝐞𝐟𝐟𝐞𝐜𝐭𝐢𝐯𝐞𝐥𝐲 𝐭𝐮𝐫𝐧𝐢𝐧𝐠 𝐲𝐨𝐮 𝐢𝐧𝐭𝐨 𝐚𝐧 𝐮𝐧𝐩𝐚𝐢𝐝 𝐝𝐚𝐭𝐚 𝐥𝐚𝐛𝐞𝐥𝐞𝐫. Every time you "fix" a Copilot suggestion, you are providing the high-value correction data that Microsoft needs to refine its products.

Opt out now. Don't pay a monthly fee to act as an unpaid data labeler for a trillion-dollar company that just redefined your private work as its free training data.

✍ What's your take? Is default opt-in for model training a reasonable cost of innovation, or a breach of trust for a professional platform?
-
Action Required: Disable GitHub Copilot's New Data Training Policy

If you use GitHub Copilot, there is a critical update you need to know about. Starting April 24, 2026, GitHub will begin using your interactions, including code snippets, inputs, and outputs, to train its AI models by default for Free, Pro, and Pro+ accounts. While AI advancement is exciting, many developers and companies have strict privacy requirements regarding their proprietary logic and data flow.

How to opt out (in under 60 seconds):
→ Go to your GitHub Settings.
→ Click on Copilot in the left-hand sidebar.
→ Look for the dropdown "Allow GitHub to use my data for AI model training".
→ Change the selection to "Disabled".

Why this matters:
• IP Protection: Ensure your unique solutions don't inadvertently influence future model outputs.
• Privacy: Keep your active session context private.
• Consent: This is an opt-out system, meaning it's active unless you manually turn it off.

Note: This change does not currently apply to Copilot Business or Enterprise tiers, but a quick audit is worthwhile for every developer to ensure your settings align with your privacy preferences.

Spread the word to your fellow devs! 💻🔒

#GitHub #GitHubCopilot #DataPrivacy #SoftwareEngineering #DeveloperTips #AIEthics
-
I think this is a little worrying, personally. We were all training AI models through CAPTCHAs, and this is another play to gain a dataset and human labour. https://lnkd.in/gMqYqVXS
-
GitHub is moving Copilot from Premium Requests to GitHub AI Credits. What does this mean in practice?

It means that, most probably, you cannot build a multi-million-dollar company on top of LLMs while assuming the cost is "€20 per month" or "basically free."

As I have said many times in previous posts, LLMs are expensive. The real cost of your "ChatGPT therapist," your "LLM developer," your "AI designer," or your "team of agents" is not the monthly subscription price you see today. That price is often a subsidized entry point.

The reason major vendors offer increasingly powerful models at very low prices is not that LLMs are magically cheap. It is because they want adoption, dependency, and eventually vendor lock-in. They want you to build your workflows, your products, your habits, your teams, and your operating model around their LLM layer. And then, once you rely on it deeply enough, the pricing starts moving closer to the actual cost of consumption.

I predicted this in previous years. It is not a prediction anymore. It is happening.

GitHub, for example, is moving Copilot from Premium Requests to GitHub AI Credits. In practice, this means usage will be priced based on the tokens consumed and the model you choose. So if you have built "amazing agents" that take care of every detail, run long workflows, use large context windows, produce very detailed responses, and coordinate multiple sub-agents, then the equation is simple:

--> The more tokens you burn, the more you pay. <--

And I am watching many LLM-based solutions being used for problems that are actually deterministic. Problems that could have been solved with a Python script. Problems that could have been solved with a simple workflow engine. Problems where the LLM is not being used for reasoning, ambiguity, language understanding, or judgment, but simply as an expensive replacement for basic automation.

That is where the cost problem becomes dangerous. LLMs are powerful. But if you use them everywhere, for everything, without understanding where they actually add value, your "AI-first" architecture can very quickly become an "AI-cost-first" architecture.

Curious to hear your view. Share your experience in the comments. Repost if this is a discussion more people should have, or reach out if you are trying to understand where AI actually adds value, and where simple automation is the better choice.
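To make the token equation concrete, here is a minimal back-of-the-envelope sketch in Python. Every price and volume below is a hypothetical placeholder, not GitHub's actual AI Credits rates; the point is only how quickly multi-call agentic workflows multiply token spend compared to a single-call design.

```python
# Back-of-the-envelope monthly LLM cost estimate.
# All prices and usage volumes are hypothetical, for illustration only.

def monthly_cost(runs_per_day, calls_per_run, in_tokens, out_tokens,
                 price_in_per_m, price_out_per_m, days=22):
    """Estimate monthly spend, given per-1M-token input/output prices."""
    per_call = (in_tokens * price_in_per_m
                + out_tokens * price_out_per_m) / 1_000_000
    return runs_per_day * calls_per_run * per_call * days

# A lean workflow: 50 runs/day, one model call, small context.
simple = monthly_cost(runs_per_day=50, calls_per_run=1,
                      in_tokens=2_000, out_tokens=500,
                      price_in_per_m=3.0, price_out_per_m=15.0)

# A multi-sub-agent workflow: same 50 runs/day, but 12 calls per run,
# large context windows, and verbose outputs.
agentic = monthly_cost(runs_per_day=50, calls_per_run=12,
                       in_tokens=40_000, out_tokens=4_000,
                       price_in_per_m=3.0, price_out_per_m=15.0)

print(f"simple:  {simple:,.2f} per month")   # ~15
print(f"agentic: {agentic:,.2f} per month")  # ~2,376
```

Same request volume, same model prices, yet the agentic variant costs over a hundred times more, which is exactly the gap a flat subscription used to hide and usage-based billing exposes.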
-
🚨 Most developers ignore platform notifications. That's risky.

Today I noticed an important update from GitHub regarding GitHub Copilot interaction data and AI model training. Starting April 24, GitHub states that Copilot interaction data may be used for AI model training unless users choose to opt out.

Many developers still don't fully understand what "opt out" means. In simple terms:
• If you do nothing → the feature remains enabled
• If you opt out → you manually disable participation

This is a reminder that:
• privacy settings matter
• AI tools may use interaction data
• developers should read platform updates carefully
• platforms often expect users to manage their own preferences and privacy settings

Important: this update concerns GitHub Copilot interaction data, not simply pushing code to GitHub repositories. As developers, we should understand the tools we use daily instead of blindly accepting every update.

You can review your settings here:
GitHub → Settings → Copilot → Privacy / Data Controls → "Allow GitHub to use my data for AI model training" → Enable/Disable

Official update from GitHub: https://lnkd.in/diCddST8

Have you checked your GitHub Copilot privacy settings yet?

#GitHub #GitHubCopilot #AI #Privacy #DeveloperAwareness #SoftwareEngineering #Programming
-
Microsoft's GitHub Copilot drastically raises prices for annual plans by switching to usage-based pricing.

Yesterday, GitHub announced that, effective June 1, 2026, all Copilot plans will transition from request-based billing to token-based usage measured in GitHub AI Credits. Existing annual subscribers will face significantly higher model multipliers in the interim and will ultimately need to convert to monthly plans.

This change coincides precisely with the amended Microsoft-OpenAI partnership, also announced on April 27, which ends exclusivity, restructures revenue sharing, and grants OpenAI greater freedom to partner with other cloud providers. The message is clear: the era of heavy subsidization of AI tools by Microsoft is coming to an end.

Yet the fundamentals remain compelling. Raw AI inference costs are still very low in absolute terms. The user base for AI-assisted coding continues to grow rapidly, and for professionals who apply these tools strategically, the productivity gains are substantial.

For now, the most effective approach is to combine the strongest models rather than depending on a single provider. OpenAI's Codex GPTs and Anthropic's Claude Opus models deliver excellent depth for initial feature drafting, complex reasoning, and architectural work. In my experience, GPT 5.5 beats Opus 4.7 at the moment; what matters more, though, is that both continue to operate under competitive, partially subsidized token economics. Pairing them with a model focused on coding tasks, e.g., Grok Code Fast, for lightning-fast iteration, code reviews, refactoring, and agentic workflows creates an optimal mix for modern development.

Having the right AI roles, workflows, and guardrails in place lays the foundation for the transition from vibe-coding to agentic programming. Developers and engineering teams who master multi-model workflows will maintain a clear advantage as the industry fully shifts to usage-based pricing.

If you are navigating these changes and would like expert guidance on optimizing your AI coding stack, building cost-efficient workflows, or maximizing return on investment in this new environment, I offer targeted consultancy sessions for individuals and teams. Feel free to send me a direct message or comment "AI WORKFLOW" below, and I will be pleased to arrange a conversation.

The subsidy era is closing. The era of strategic, high-leverage AI usage is just beginning.

#AgenticProgramming #DeveloperProductivity #AgenticShift
-
GitHub just quietly announced they'll train AI models on your Copilot interaction data starting April 24. If you're on Free or Pro, you're opted in. By default.

That means your prompts, accepted suggestions, code context around the cursor, file names, repo structure, navigation patterns: all of it feeding the next generation of Copilot models.

Here's the thing most people are missing: Business and Enterprise accounts are exempt. Read that again. GitHub is basically telling you that your company's code is training data unless you're paying enterprise rates. That's not a privacy policy. That's a pricing strategy.

What to do before April 24:
✅ Audit which Copilot plan every developer is on
✅ If you're on Pro, go to Settings → Copilot → toggle off data training
✅ If you're building anything regulated (finance, healthcare, gov), upgrade to Business. The $19/seat is cheaper than the compliance conversation later
✅ Document your AI tool data policies. Your clients will ask.

This isn't about being paranoid. It's about knowing where your intellectual property goes before someone else decides for you.

What's your team's policy on AI tool data? Or is that conversation still "on the list"?

#EnterpriseAI #GitHubCopilot #DevTools #CTO #AIGovernance