On May 7, 2026, WIRED reported that security researchers at RedAccess had discovered more than 5,000 web applications built with popular AI coding tools — Lovable, Replit, Base44, and Netlify — sitting on the open internet with virtually no security. No authentication. No access controls. Just a URL between the public internet and sensitive corporate data.
Around 40 percent of those apps exposed data that should never be public: medical records with personally identifiable information of doctors, corporate ad spending breakdowns, go-to-market strategy documents, full customer chatbot conversation logs including names and contact details, shipping cargo manifests, and financial records from dozens of companies. In some cases, the researchers could escalate to admin privileges and remove other administrators entirely.
The apps were trivially easy to find. Because the AI coding platforms host apps on their own domains, the researchers simply ran Google and Bing searches against those domains. No special tools. No hacking. Just search queries that anyone could run right now.
This is not a theoretical risk. It is happening at scale, today, and most organizations have no idea they are exposed.
What Is Vibe Coding, and Why Does It Matter?
“Vibe coding” is the term for using AI tools to generate complete, working applications from natural language descriptions. You tell the tool what you want: “build me a dashboard that tracks our cargo shipments,” or “create a patient scheduling app for our clinic.” The tool generates the code, deploys it, and gives you a live URL. The entire process can take minutes.
The tools that enable this, including Lovable, Replit, Base44, and others, have made application development accessible to people who have never written a line of code. Marketing managers, operations leads, sales directors, and clinic administrators can now spin up fully functional web applications without involving engineering or IT.
The productivity gain is real. The security implications are catastrophic.
The S3 Bucket Crisis, Reloaded
Security professionals will recognize this pattern. Between 2017 and 2020, the industry dealt with an epidemic of exposed Amazon S3 storage buckets. Companies from Verizon to World Wrestling Entertainment accidentally left massive datasets publicly accessible due to confusing default settings and a lack of guardrails. Hundreds of millions of records were exposed. The cybersecurity industry partially blamed Amazon for security configurations that so consistently tripped up customers.
RedAccess founder Dor Zvi draws the same parallel. The AI coding platforms are creating identical dynamics: powerful tools with weak defaults, handed to users who lack security context, producing internet-facing applications that nobody in IT knows about.
But this time, the problem is arguably worse. S3 buckets were provisioned by engineers and ops teams who at least had some awareness of cloud security concepts, even if they misconfigured a setting. Vibe-coded apps are being created by people who have no security background at all and no reason to think about authentication, access controls, or data exposure. The concept of “this app is publicly accessible on the internet” may not even register as a concern for someone who just wanted an internal tool.
The Platform Response: “Not Our Problem”
When WIRED contacted the four AI coding platforms about the findings, the responses followed a familiar playbook. Replit’s CEO said that public apps being accessible on the internet is “expected behavior” and that users can toggle privacy settings with a single click. Lovable said it “gives builders the tools to build securely, but how an app is configured is ultimately the creator’s responsibility.” Base44’s parent company Wix said that public accessibility “reflects a user configuration choice, not a platform vulnerability.”
These responses are technically correct and entirely beside the point.
When the majority of your users are non-technical people building applications for the first time, “the user should have configured it correctly” is not a security strategy. It is an abdication of responsibility. Amazon made the same argument about S3 before ultimately redesigning its defaults and adding multiple layers of warnings and safeguards. The AI coding platforms will likely end up in the same place, but not before a significant amount of damage is done.
The Real Problem: Ungoverned Application Creation
The exposed data is the symptom. The underlying disease is that organizations have lost control over who can create internet-facing applications and what data those applications can access.
Traditional software development has guardrails for a reason. Code review catches security flaws. Security scans identify vulnerabilities before deployment. IT governance ensures that new applications go through an approval process. Network controls determine what is accessible from the internet and what is not.
Vibe-coded applications bypass every single one of these controls.
An operations manager builds a tracking dashboard using company data and deploys it to a public URL. A marketing coordinator creates a lead management tool that stores customer information in a database with no access controls. A clinic administrator puts together a scheduling app that includes patient information. None of these people are doing anything malicious. They are trying to be productive. But they are creating internet-facing applications that handle sensitive data, and they are doing it entirely outside the organization’s software development lifecycle.
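To make the failure mode concrete, here is a minimal sketch of the kind of endpoint that tends to result: a hypothetical Flask app (the framework, routes, and data are all invented for illustration) that serves records to anyone who requests the URL.

```python
# A sketch of a vibe-coded endpoint with no access control. The framework,
# routes, and data are hypothetical; the pattern is the point.
from flask import Flask, jsonify

app = Flask(__name__)

# In a real incident this would be genuine customer data, pasted into the
# prompt or loaded into a database the same prompt created.
CUSTOMERS = [
    {"id": 1, "name": "Jane Doe", "email": "jane@example.com"},
    {"id": 2, "name": "John Roe", "email": "john@example.com"},
]

@app.route("/api/customers")
def list_customers():
    # No session check, no token check, no rate limit. Anyone who
    # discovers the URL gets every record.
    return jsonify(CUSTOMERS)

@app.route("/api/admin/users/<int:user_id>", methods=["DELETE"])
def remove_admin(user_id):
    # The privilege-escalation findings map to endpoints like this one:
    # an "admin" action gated by nothing at all.
    return jsonify({"removed": user_id})

if __name__ == "__main__":
    app.run()
```

The missing control is a few lines of session or token validation before each handler. A professional developer adds it reflexively. Someone who described the app in a sentence has no reason to know it was ever needed.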
As Zvi told WIRED: “Anyone from your company at any moment can generate an app, and this is not going through any development cycle or any security check. People can just start using it in production without asking anyone. And they do.”
This is shadow IT on a completely different scale. The old shadow IT problem was employees signing up for a SaaS tool. The new shadow IT problem is employees building entire applications.
The Phishing Dimension
The security implications extend beyond data exposure. RedAccess also found numerous phishing sites hosted on Lovable’s domain that impersonated major brands including Bank of America, Costco, FedEx, Trader Joe’s, and McDonald’s.
This is a compounding problem. Phishing pages hosted on legitimate platform domains inherit the domain’s reputation. Security tools that evaluate URL trustworthiness may not flag a page on a known, reputable domain. Employees trained to check for suspicious domains may let their guard down when the URL belongs to a recognized technology company. The AI coding platforms are inadvertently providing infrastructure for phishing at scale.
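A toy example shows why. Suppose a URL filter trusts anything served from a known platform’s hosting domain; the domain list below is an illustrative assumption about where these platforms serve apps, not a claim about any specific security product.

```python
# A toy URL filter that trusts known platform hosting domains.
# The domain list is an illustrative assumption, not a description of
# how any particular security product actually works.
from urllib.parse import urlparse

TRUSTED_SUFFIXES = ("lovable.app", "replit.app", "netlify.app")

def is_trusted(url: str) -> bool:
    host = urlparse(url).hostname or ""
    return any(host == s or host.endswith("." + s) for s in TRUSTED_SUFFIXES)

# A phishing page on an attacker-created subdomain inherits the trust:
print(is_trusted("https://bankofamerica-login.lovable.app/verify"))  # True
```

Any reputation signal keyed to the platform’s domain rather than the individual app fails the same way.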
Why Traditional Security Tools Cannot See This
The architectural gap here is worth understanding, because it explains why this problem has grown so large without most organizations noticing.
Network-level controls like firewalls and web proxies operate at the perimeter. They can block known domains or restrict outbound traffic. But they cannot prevent an employee from opening a browser on their personal device, typing a prompt into an AI coding tool, and deploying a live application that contains corporate data. The application creation happens entirely within the AI platform’s interface. The data exposure happens on the AI platform’s infrastructure. The organization’s security stack never touches either transaction.
Endpoint detection and response tools monitor for malware, suspicious processes, and known threat indicators. An employee using a legitimate AI coding tool does not trigger any of these detections. The activity looks identical to normal web browsing.
Data loss prevention solutions scan email, file transfers, and cloud storage for sensitive content. They do not scan the contents of web applications deployed to third-party platforms. An employee can paste an entire customer database into an AI coding tool’s interface, deploy it as a live web app, and the organization’s DLP solution will never see it.
The gap is not a misconfiguration or a missing rule. It is a fundamental blind spot in how current security tools understand data flows. Data is no longer just being exfiltrated through files and messages. It is being transformed into live applications.
What Organizations Should Be Doing
The RedAccess findings should trigger an immediate reassessment of how organizations monitor and govern AI tool usage. Several steps are critical:
Gain visibility into AI tool usage across the organization. You cannot govern what you cannot see. Organizations need to understand which employees are using AI coding tools, what data they are providing to those tools, and what applications are being created. This requires tooling that monitors AI interactions specifically, not general-purpose web filtering that treats AI tools the same as any other website.
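Even before specialized tooling is in place, exported proxy or DNS logs can surface who is touching these platforms. A minimal first-pass sketch, assuming a CSV export with user and host columns; the domain list is illustrative and incomplete:

```python
# A first-pass inventory: scan exported proxy or DNS logs for traffic to
# AI app builders. The domain list is illustrative and incomplete, and the
# column names are assumptions about your particular log export.
import csv
from collections import Counter

AI_BUILDER_DOMAINS = ("lovable.dev", "replit.com", "base44.com", "cursor.com")

def inventory(log_path: str) -> Counter:
    hits = Counter()
    with open(log_path, newline="") as f:
        for row in csv.DictReader(f):
            host = row["host"].lower()
            if any(host == d or host.endswith("." + d) for d in AI_BUILDER_DOMAINS):
                hits[(row["user"], host)] += 1
    return hits

for (user, host), count in inventory("proxy_log.csv").most_common(20):
    print(f"{user} -> {host}: {count} requests")
```

Note the limitation: this shows who visited which tool, not what data they submitted. Closing that gap requires inspecting the interaction itself, which is the next step.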
Implement real-time policy enforcement for sensitive data. Blocking AI tools outright is counterproductive — it pushes usage underground and eliminates visibility entirely. Instead, organizations need the ability to detect when sensitive data categories (PII, PHI, financial data, source code, credentials) are being submitted to AI tools and enforce appropriate policies in real time. Some data should be blocked. Some should trigger a warning. Some should be logged for audit. The policy should be granular, not binary.
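Mechanically, tiered enforcement looks something like the sketch below: simple category detectors feeding a policy table that returns block, warn, or log. The regex patterns are deliberately simplified; a production system needs far more robust classification than this.

```python
# A sketch of tiered policy enforcement: classify a prompt by sensitive
# data category, then return the strongest configured action rather than
# a binary allow-or-deny. Patterns are simplified for illustration.
import re

DETECTORS = {
    "credential": re.compile(r"(?i)\b(api[_-]?key|secret|password)\s*[:=]\s*\S+"),
    "ssn":        re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    "card":       re.compile(r"\b(?:\d[ -]?){13,16}\b"),
    "email":      re.compile(r"\b[\w.+-]+@[\w-]+\.[\w.]+\b"),
}

# Granular policy: which categories are blocked, warned, or just logged.
POLICY = {"credential": "block", "ssn": "block", "card": "warn", "email": "log"}
SEVERITY = {"block": 3, "warn": 2, "log": 1}

def evaluate(prompt: str) -> str:
    verdict = "allow"
    for category, pattern in DETECTORS.items():
        if pattern.search(prompt):
            action = POLICY[category]
            if SEVERITY[action] > SEVERITY.get(verdict, 0):
                verdict = action
    return verdict

print(evaluate("build a dashboard, db password = hunter2"))  # block
print(evaluate("import these leads: jane@example.com"))      # log
```

The design point is the policy table, not the detectors: the same detection event can mean different actions for different data categories, teams, or tools.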
Extend your security perimeter to include AI interactions. The traditional perimeter of firewalls, email gateways, and cloud access security brokers was built for a world where data left the organization through files, messages, and API calls. AI tools have created an entirely new exfiltration surface: the prompt window and the application builder. Security architecture needs to evolve to include this surface.
Establish governance for AI-created applications. Organizations need policies that address who is authorized to create applications using AI tools, what data those applications can access, and what review process applies before an AI-created application handles production data. This does not mean banning non-developers from using AI. It means ensuring that the same governance principles that apply to traditionally developed software also apply to AI-generated applications.
Audit your current exposure. If employees in your organization have access to AI coding tools — and statistically, they almost certainly do — there may already be vibe-coded applications sitting on the public internet with your data in them. The same search techniques that RedAccess used are available to anyone. Your security team should be running them before someone else does.
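A reasonable starting point is generating the site-scoped queries and running them by hand. The hosting domains and keywords below are illustrative assumptions; substitute your own company and product names:

```python
# Generates the kind of site-scoped search queries the researchers used.
# Domains and keywords are illustrative assumptions; paste the output into
# Google or Bing manually, or feed it to a search API you are licensed to use.
PLATFORM_DOMAINS = ["lovable.app", "replit.app", "base44.app", "netlify.app"]
KEYWORDS = ['"internal"', '"dashboard"', '"admin"', '"YourCompanyName"']

for domain in PLATFORM_DOMAINS:
    for kw in KEYWORDS:
        print(f"site:{domain} {kw}")
```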
The Window Is Closing
The S3 bucket epidemic taught the industry a painful lesson about what happens when powerful tools are given to users without adequate guardrails. The damage from misconfigured storage buckets affected hundreds of millions of people and took years to fully address.
The vibe coding data exposure crisis is following the same trajectory, but faster. AI coding tools are growing explosively. The barrier to creating a public-facing web application is now a single sentence typed into a chat interface. And the people creating these applications are, in many cases, completely unaware that they are exposing data.
The 5,000 exposed applications that RedAccess found represent only what was discoverable on the AI platforms’ own domains. The actual number, including applications deployed to custom domains, is almost certainly much higher. And it is growing every day.
Organizations that wait for the AI coding platforms to solve this problem are making the same bet that companies made with Amazon S3 in 2017. It did not end well then. It will not end well now.
The question is not whether your organization has employees vibe coding applications with corporate data. The question is whether you can see it when they do.
This Is Why We Built Blacksight
Blacksight was built specifically for this class of problem. When an employee opens an AI coding tool and pastes in customer data, financial records, or internal documents to generate an application, Blacksight sees it in real time — before the data ever reaches the AI service.
Our browser extension and endpoint agent sit between the employee and the AI tool, scanning every interaction for sensitive data patterns: PII, credentials, financial records, health information, source code, and custom patterns you define. When a policy violation is detected, Blacksight can block the request entirely, redact the sensitive fields and let the cleaned prompt through, or log it for your security team to review — depending on the policy you set.
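As a toy illustration of what redaction means mechanically (a simplified sketch of the general technique, not our production pipeline):

```python
# A toy illustration of prompt redaction: replace detected spans with
# typed placeholders and forward the cleaned prompt. Not a production
# implementation; patterns are simplified for readability.
import re

PATTERNS = {
    "EMAIL": re.compile(r"\b[\w.+-]+@[\w-]+\.[\w.]+\b"),
    "SSN":   re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
}

def redact(prompt: str) -> str:
    for label, pattern in PATTERNS.items():
        prompt = pattern.sub(f"[REDACTED-{label}]", prompt)
    return prompt

print(redact("Build a CRM seeded with jane@example.com, SSN 123-45-6789"))
# -> Build a CRM seeded with [REDACTED-EMAIL], SSN [REDACTED-SSN]
```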
Critically, all of this scanning happens locally on the employee’s device. The prompt content never passes through our servers. We see the verdict — allowed, blocked, or redacted — plus metadata like which AI tool was used, when, and by whom. Your security team gets full visibility in the dashboard. The actual data stays on your network.
For organizations where employees use AI coding tools like Lovable, Replit, or Cursor, this means you can detect the moment someone feeds sensitive corporate data into an application builder. You can enforce policies that prevent regulated data from ever reaching those tools. And you can see exactly which AI services are being used across your organization, including ones your IT team has never heard of.
The vibe coding exposure problem is a shadow AI problem. And shadow AI visibility is exactly what Blacksight does. Start a free trial and see what your employees are sending to AI in the first ten minutes.