March 04, 2026

Proprietary Source Code Is Leaking Through AI Assistants

Developers are sharing proprietary code with ChatGPT, Copilot, and Claude. Major companies have responded with bans. Here's why that's not enough.

Blacksight Team

Software development was one of the first knowledge work domains to embrace generative AI. Code completion, bug detection, refactoring suggestions, and documentation generation are genuinely transformative capabilities. But every line of proprietary code that enters an AI tool is a line of intellectual property that has left the organization’s control.

The Scale of the Problem

Developers interact with AI tools differently than most knowledge workers. Rather than asking broad questions, they paste specific code blocks, function signatures, architecture diagrams, and error traces. The context required for useful AI assistance in software engineering is inherently detailed and inherently proprietary.

This creates a unique exposure profile. A marketing employee pasting a draft press release into ChatGPT is sharing information that will likely become public anyway. A developer pasting an authentication module, a proprietary algorithm, or an internal API specification is sharing the organization’s core intellectual property.

The pattern is widespread. Stack Overflow's annual developer survey, among others, has consistently found that a majority of professional developers use AI coding tools in their daily work. Many of those interactions involve proprietary codebases, and most organizations have little visibility into what code is being shared.

Corporate Responses: The Ban Wave

Several of the world’s largest technology companies have restricted or banned employee use of external AI coding tools:

  • Apple internally restricted employee use of ChatGPT and external AI tools, reportedly due to concerns about confidential data being shared with third-party services.
  • Amazon warned employees not to share confidential information with ChatGPT after discovering that AI-generated responses closely resembled internal Amazon data, suggesting that proprietary information was already present in training sets.
  • Samsung banned ChatGPT usage entirely after the semiconductor trade secret leak, as covered in a previous article in this series.
  • JPMorgan Chase restricted employee use of ChatGPT, a move that extended across the broader financial services industry.
  • Deutsche Bank, Goldman Sachs, and Citigroup all implemented various levels of restriction on external AI tool usage.

These bans reflect a real understanding of the risk. They also reflect the absence of better options: without tooling to monitor and control what data enters AI tools, the only available lever is to prohibit usage entirely.

What Gets Leaked

The types of source code most commonly shared with AI tools are precisely the types that carry the most intellectual property value:

  • Authentication and authorization logic including token handling, permission models, and session management
  • Proprietary algorithms that represent competitive advantages in pricing, recommendation, search ranking, and other domains
  • Database schemas and data models that reveal the structure of internal systems
  • Internal API contracts that expose microservice architectures and integration points
  • Infrastructure-as-code including cloud configurations, deployment scripts, and environment details that reveal attack surfaces
  • Security implementations including encryption routines, key management patterns, and vulnerability mitigations

Why Bans Fall Short

Blanket bans on AI coding tools face a fundamental adoption problem. Developers who have experienced the productivity gains of AI-assisted coding are reluctant to give those tools up. The result is predictable: usage moves to personal devices, personal accounts, and networks that bypass corporate controls entirely.

When usage goes underground, the organization loses all visibility. A developer using ChatGPT through the corporate network with a managed browser at least generates network logs that could theoretically be audited. A developer using Claude on their personal laptop over a mobile hotspot is completely invisible to the security team.

The more effective approach is to allow AI tool usage within a governed framework. This means deploying controls that can inspect AI interactions in real time, detect when proprietary code patterns are being submitted, and enforce policies that prevent the most sensitive intellectual property from leaving the organization.
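As a concrete illustration, here is a minimal sketch of what the detection side of such a control might look like, written in Python. The acme_internal_ prefix, the pattern set, and the inspect_prompt function are hypothetical stand-ins for whatever naming conventions and secret formats a real organization would configure; this is not a description of any particular product.

```python
import re

# Illustrative patterns only: a real deployment would derive these from the
# organization's own codebases, naming conventions, and secret formats.
PATTERNS = {
    "internal identifier": re.compile(r"\bacme_internal_\w+", re.IGNORECASE),
    "private key material": re.compile(r"-----BEGIN [A-Z ]*PRIVATE KEY-----"),
    "source code syntax": re.compile(r"^\s*(def |class |function |import )", re.MULTILINE),
}

def inspect_prompt(prompt: str) -> list[str]:
    """Return the categories of proprietary content detected in an outbound prompt."""
    return [name for name, pattern in PATTERNS.items() if pattern.search(prompt)]

if __name__ == "__main__":
    prompt = "Why does acme_internal_auth reject this token?\ndef validate(token): ..."
    print(inspect_prompt(prompt))  # ['internal identifier', 'source code syntax']
```

A real deployment would layer richer signals on top of simple patterns, but even this level of matching is cheap enough to run inline on every outbound prompt.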

Building a Governed Approach

A practical framework for managing source code exposure through AI tools includes several components:

  • Classification. Not all code is equally sensitive. Public open-source contributions, documentation, and boilerplate do not require the same protection as core algorithms and security infrastructure.
  • Detection. Real-time scanning of AI prompts for code patterns, including language-specific syntax, internal naming conventions, and proprietary identifiers.
  • Policy enforcement. Configurable rules that can warn, log, or block submissions based on the sensitivity of the detected content (a minimal sketch follows this list).
  • Developer experience. Controls that are transparent and minimally disruptive. Developers will work around tools that create excessive friction.
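Tying those pieces together, the sketch below shows how classification tiers could drive the warn, log, or block decision. The Sensitivity tiers, the POLICY table, and the stub classifier are all illustrative assumptions; the point is only that enforcement becomes a small, configurable mapping once classification and detection are in place.

```python
from enum import Enum

class Sensitivity(Enum):
    PUBLIC = 1      # open-source contributions, docs, boilerplate
    INTERNAL = 2    # internal tooling, non-critical services
    RESTRICTED = 3  # core algorithms, auth logic, security infrastructure

# Configurable policy: each sensitivity tier maps to an action.
POLICY = {
    Sensitivity.PUBLIC: "allow",
    Sensitivity.INTERNAL: "warn",
    Sensitivity.RESTRICTED: "block",
}

def enforce(prompt: str, classify) -> str:
    """Apply the configured action for the tier that `classify` assigns to a prompt."""
    tier = classify(prompt)
    action = POLICY[tier]
    if action == "block":
        raise PermissionError(f"blocked: {tier.name} content detected in prompt")
    if action == "warn":
        print(f"warning: {tier.name} content detected; submission logged")
    return action

# Stub classifier: treats the hypothetical internal prefix as RESTRICTED.
def stub_classify(prompt: str) -> Sensitivity:
    return Sensitivity.RESTRICTED if "acme_internal_" in prompt else Sensitivity.PUBLIC

enforce("refactor this sorting helper", stub_classify)        # returns "allow"
# enforce("why does acme_internal_auth fail?", stub_classify) # raises PermissionError
```

The warn tier matters as much as the block tier: logged warnings give security teams the visibility that outright bans destroy, without interrupting the developer's flow.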

The goal is not to stop developers from using AI. It is to ensure that when they do, the organization’s most valuable intellectual property stays where it belongs.

Protect your organization from AI data leaks.

Blacksight AI monitors every AI interaction without reading prompts. Deploy in minutes, get visibility in seconds.