TL;DR
Generative AI tools present significant and multifaceted data security risks for enterprises globally, extending beyond traditional breach vectors. IBM's 2024 report highlights the financial impact: AI-related data breaches cost organisations an average of $5.2 million per incident, 28% higher than conventional breaches. Compounding this, a 2024 Kaspersky study found that 67% of employees share internal company data with generative AI without authorisation, often unaware of the implications. To mitigate these risks, organisations must adopt a layered approach: implementing AI-specific Data Loss Prevention (DLP) systems tailored for both internal and external use cases, deploying private AI instances (shown by Gartner to reduce data exposure incidents by 76%), establishing granular, context-aware usage policies, and applying technical controls that address the differing regional compliance requirements.
How Does Generative AI Create Data Security Risks for Global Enterprises?
As generative AI tools like ChatGPT, Claude, and Gemini rapidly integrate into enterprise workflows, they introduce a new paradigm of data security challenges. The seemingly innocuous act of using these powerful tools, whether for drafting internal documents or assisting customer interactions, can inadvertently expose an organisation's most sensitive information in ways not previously encountered. For CISOs navigating this complex and evolving landscape, especially within diverse regulatory environments, a nuanced understanding of these risks (both internal and external) is not just advisable, it's mission-critical for safeguarding organisational integrity and customer trust.
What is the Scale of the AI Data Security Problem?
Recent research underscores the urgency of addressing AI-related data security vulnerabilities:
- According to IBM's 2024 Cost of a Data Breach Report, AI-related data breaches not only carry a higher financial burden, averaging $5.2 million per incident, but also often involve more complex remediation and reputational damage compared to non-AI-related breaches.
- The Kaspersky study's finding that 67% of employees regularly share internal company data with generative AI tools without proper authorisation points to a significant gap in user awareness and control, highlighting the ease with which sensitive information can be unintentionally uploaded and potentially stored or used by third-party AI providers.
- Microsoft Security Research's revelation that 42% of enterprise data leaks in 2024 were traced back to the use of public AI services with sensitive information underscores the inherent risks associated with relying on external platforms for tasks involving confidential data.
How Does Data Leakage Occur Through GenAI Systems?
The mechanisms through which data leakage occurs with GenAI are multifaceted and require careful consideration for both internal and external applications:
Internal Use Risks: The Danger of Cross-Team Information Spillover
- Training Data Retention and Internal Data Silos: When employees across different departments input data into a shared GenAI instance (even a seemingly "private" one managed internally), the AI model may retain and learn from this information. This poses a significant risk of inadvertently exposing data across teams that should remain siloed. For instance, sensitive financial projections from the accounting department could be implicitly learned by the AI and potentially surface in responses generated for the marketing team querying for content ideas, violating internal confidentiality protocols and potentially exposing strategic insights prematurely.
- Contextual Bleed in Shared Instances: Even without explicit long-term retention, in multi-tenant internal AI environments the context of one user's prompt and the AI's response could, in rare but possible scenarios, influence subsequent interactions with other users within the same instance. Imagine an HR query about a sensitive employee relations issue inadvertently shaping the tone or information provided in response to a sales team member's query about a client interaction.
- Lack of Granular Access Controls: Many organisations lack the sophisticated access controls within their internal AI deployments to restrict which data different teams or individuals can interact with or contribute to the AI's knowledge base. This can lead to unintentional data commingling and increase the attack surface for insider threats or accidental oversharing.
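To make the access-control point above concrete, here is a minimal Python sketch of how an internal AI deployment's retrieval layer might filter documents by the requesting team before anything reaches the model's context window. The Document fields, team names, and the TEAM_CLEARANCE mapping are illustrative assumptions, not a reference implementation.

```python
from dataclasses import dataclass

@dataclass
class Document:
    content: str
    owning_team: str     # e.g. "finance", "marketing" (illustrative)
    classification: str  # e.g. "public", "internal", "confidential"

# Hypothetical mapping of which classifications each team may retrieve
TEAM_CLEARANCE = {
    "finance": {"public", "internal", "confidential"},
    "marketing": {"public", "internal"},
}

def filter_for_requester(docs: list[Document], requester_team: str) -> list[Document]:
    """Drop documents the requesting team is not cleared to see before
    they are added to the AI model's context window."""
    allowed = TEAM_CLEARANCE.get(requester_team, {"public"})
    return [
        d for d in docs
        if d.classification in allowed
        and (d.classification != "confidential" or d.owning_team == requester_team)
    ]
```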
External Use Risks: Protecting Customer Data and Maintaining Trust
- Exposure via Support AI Agents: The deployment of AI-powered customer support agents, while offering efficiency gains, introduces the risk of inadvertently exposing sensitive customer data. If the AI model is not rigorously trained on anonymised data and lacks robust safeguards, it could potentially retrieve and share information from one customer's interaction with another, leading to severe privacy violations and a loss of customer trust. For example, a support bot might mistakenly reference details from a previous customer's account while assisting a new inquiry.
- Data Leakage Through Personalisation Features: GenAI is increasingly used to personalise customer experiences. However, if the underlying data used for personalisation is not properly secured and segmented, there's a risk of it being exposed to unauthorised parties or used in ways that violate privacy regulations. Imagine a scenario where a customer's purchase history, used to tailor recommendations, is inadvertently revealed to another customer through a system error or a flaw in the AI's personalisation algorithm.
- Prompt Injection Attacks Targeting External Systems: Malicious actors could potentially craft specific prompts to external-facing AI systems that are designed to extract sensitive information or manipulate the AI into revealing underlying data or system vulnerabilities. This is particularly concerning for AI agents connected to internal databases or APIs.
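As a rough illustration of a first-line defence against prompt injection, the Python sketch below screens inbound prompts against a few common injection phrasings before they reach an external-facing agent. The pattern list and the call_support_model stub are hypothetical; a production defence would layer this with model-based classifiers, output filtering, and least-privilege access to downstream tools and databases.

```python
import re

# Illustrative patterns only; not an exhaustive injection taxonomy.
INJECTION_PATTERNS = [
    r"ignore (all|previous) instructions",
    r"reveal (your )?(system prompt|hidden instructions)",
    r"act as (an? )?(administrator|developer)",
]

def looks_like_injection(user_input: str) -> bool:
    """Flag prompts that match common injection phrasings."""
    lowered = user_input.lower()
    return any(re.search(p, lowered) for p in INJECTION_PATTERNS)

def call_support_model(user_input: str) -> str:
    # Placeholder for the actual LLM call (e.g. via your provider's SDK)
    return f"[model response to: {user_input}]"

def handle_support_query(user_input: str) -> str:
    if looks_like_injection(user_input):
        # Refuse or escalate to a human rather than passing it to the model
        return "This request can't be processed automatically. A human agent will follow up."
    return call_support_model(user_input)
```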
What Real-World Consequences Have Organisations in Key Financial Centres Faced?
- New York: Financial Services Data Exposure and Regulatory Scrutiny: A New York-based investment bank faced significant regulatory penalties and reputational damage after a large language model, used internally to summarise client communications, inadvertently included non-anonymised sensitive financial details in reports accessible to a broader group of employees than intended. This violated SEC and FINRA regulations concerning the safeguarding of client confidential information and led to a costly internal audit and mandated improvements to their data governance framework for AI.
- London: Pharmaceutical IP Compromise and Competitive Disadvantage: In early 2025, a prominent London-based pharmaceutical company suffered a substantial intellectual property breach when researchers, under pressure to accelerate drug discovery, used a publicly available GenAI tool to analyse proprietary research data. The AI model, having retained aspects of this input, later generated similar molecular structures and insights that were subsequently discovered in patent filings by a direct competitor, highlighting the critical need for robust IP protection strategies in the age of AI and raising concerns under UK intellectual property law.
Regional Regulatory Comparison: How Do AI Data Protection Requirements Differ?
The regulatory landscape for AI data protection is still evolving and varies significantly across regions:
- European Union (EU) and the United Kingdom (UK): The GDPR (and, in the UK, the UK GDPR and Data Protection Act 2018) places stringent requirements on the processing of personal data, including data inputted into AI systems. Organisations must ensure a lawful basis for processing, implement appropriate technical and organisational measures to protect this data, and be transparent about how AI systems use user inputs. The "right to be forgotten" and data minimisation principles are particularly relevant in the context of AI training data. The EU AI Act introduces further obligations for high-risk AI systems, including those dealing with sensitive data, with requirements phasing in over the coming years.
- North America (United States and Canada): The US follows a sector-specific approach with regulations like HIPAA for healthcare and GLBA for financial institutions. While there isn't a comprehensive federal privacy law akin to GDPR, state laws like the California Consumer Privacy Act (CCPA) and the Virginia Consumer Data Protection Act (VCDPA) provide consumers with certain rights regarding their personal information, which extends to data processed by AI. Canada's Personal Information Protection and Electronic Documents Act (PIPEDA) also outlines obligations for organisations handling personal information.
- Asia-Pacific (APAC): Regulations vary widely across APAC countries. Singapore's Personal Data Protection Act (PDPA), Australia's Privacy Act, and China's Personal Information Protection Law (PIPL) all have provisions concerning the collection, use, and disclosure of personal data by AI systems. Multinational organisations operating in this region face the challenge of navigating diverse and sometimes conflicting requirements.
Understanding these regional nuances is crucial for global enterprises deploying GenAI, as data governance and compliance strategies must be tailored to each jurisdiction.
What Remediation Strategies Work Best?
A multi-layered approach is essential to effectively mitigate AI-related data security risks:
- Implement Context-Aware AI Data Loss Prevention (DLP) Systems: Deploy specialised AI-DLP solutions that go beyond traditional keyword-based scanning. These systems should be capable of understanding the context of information being shared with AI, identifying sensitive data patterns (including code, financial details, PII, and intellectual property), and enforcing policies based on the user, the type of AI being used (internal vs. external), and the sensitivity of the data.
- Implementation tip: Prioritise DLP solutions that offer granular control over data sharing for both internal and external AI tools, integrate seamlessly with your existing security stack, and provide real-time blocking and alerting capabilities for unauthorised AI usage. Configure different DLP rules for internal AI use (e.g., preventing the sharing of cross-team confidential documents) and external AI use (e.g., redacting PII before it's sent to a public LLM).
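As a rough sketch of the external-use rule described in the tip above, the Python example below redacts common PII patterns before a prompt is allowed to leave for a public LLM, while treating internal destinations according to a separate (here, permissive) policy. The regex patterns and destination labels are illustrative assumptions; a real AI-DLP engine would add contextual and model-based classification for code, financial data, and intellectual property.

```python
import re

# Illustrative patterns; a production AI-DLP engine would go well beyond regex.
PII_PATTERNS = {
    "email": re.compile(r"[\w.+-]+@[\w-]+\.[\w.]+"),
    "ssn": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    "card": re.compile(r"\b(?:\d[ -]?){13,16}\b"),
}

def redact(prompt: str) -> str:
    """Replace detected PII with typed placeholders before the prompt
    leaves the corporate boundary."""
    for label, pattern in PII_PATTERNS.items():
        prompt = pattern.sub(f"[REDACTED_{label.upper()}]", prompt)
    return prompt

def apply_dlp(prompt: str, destination: str) -> str:
    """Apply stricter handling for external tools than for the internal instance."""
    if destination == "external":
        return redact(prompt)  # always redact before a public LLM
    return prompt              # internal instance: log and allow (policy-dependent)
```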
- Deploy Secure and Isolated AI Environments (Private AI Instances): For handling highly sensitive data, consider implementing private, on-premises AI solutions or dedicated cloud instances that keep your data within your security perimeter and under your direct control. This significantly reduces the risk of data exposure to third-party providers.
- According to Gartner's 2025 AI Security Report, organisations with private AI implementations experience 76% fewer data exposure incidents than those relying solely on public services, highlighting the tangible benefits of this approach for mitigating both internal and external leakage risks.
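A simple way to operationalise this is to route prompts by data sensitivity, so that anything classified as internal or confidential only ever reaches the private instance. The sketch below assumes hypothetical endpoint URLs and an OpenAI-compatible chat completions interface on both endpoints; it is an illustration of the routing pattern, not a specific product integration.

```python
import requests  # assumes an OpenAI-compatible HTTP interface on both endpoints

# Hypothetical endpoints: a self-hosted model inside the security perimeter
# and a public provider reserved for non-sensitive workloads.
PRIVATE_ENDPOINT = "https://ai.internal.example.com/v1/chat/completions"
PUBLIC_ENDPOINT = "https://api.public-llm.example.com/v1/chat/completions"

def route_prompt(prompt: str, sensitivity: str) -> str:
    """Send internal or confidential prompts only to the private instance."""
    endpoint = PRIVATE_ENDPOINT if sensitivity in {"confidential", "internal"} else PUBLIC_ENDPOINT
    response = requests.post(
        endpoint,
        json={"model": "default", "messages": [{"role": "user", "content": prompt}]},
        timeout=30,
    )
    response.raise_for_status()
    return response.json()["choices"][0]["message"]["content"]
```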
- Establish Granular and Context-Aware AI Usage Policies: Develop comprehensive policies that clearly define acceptable and unacceptable uses of AI tools, specifying what types of data can and cannot be shared in different contexts (internal collaboration vs. external customer interaction). A minimal policy-as-code sketch follows the sub-list below. These policies should be:
- Specific to different data classification levels and use cases: Clearly outline guidelines for handling confidential, sensitive, and public data within various AI tools and workflows.
- Integrated into mandatory and role-based security awareness training: Educate employees on the specific risks associated with AI data sharing and their responsibilities in adhering to the organisation's policies.
- Regularly updated as AI capabilities and organisational use evolve: Continuously review and adapt policies to address new AI features and emerging threats.
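One way to make such policies enforceable rather than purely documentary is to express them as code that DLP tooling or gateways can consult. The sketch below is a minimal illustration; the tool categories and classification labels are assumptions and would need to match your own data classification scheme.

```python
# Illustrative policy-as-code: which data classifications each AI tool
# category may receive. Categories and labels are examples only.
AI_USAGE_POLICY = {
    "public_llm":       {"public"},
    "approved_saas_ai": {"public", "internal"},
    "private_instance": {"public", "internal", "confidential"},
}

def is_permitted(tool_category: str, data_classification: str) -> bool:
    """Return True if policy allows sending this class of data to the tool."""
    return data_classification in AI_USAGE_POLICY.get(tool_category, set())

# Example: sharing confidential data with a public LLM is blocked
assert not is_permitted("public_llm", "confidential")
```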
- Implement Robust Technical Controls Tailored for AI Interactions:
- Utilise API Gateways with Data Sanitisation: For external-facing AI applications, implement API gateways that can inspect and sanitise data in requests and responses, preventing sensitive information from being inadvertently exposed (see the sketch after this list).
- Deploy Browser Extensions and Desktop Agents with Real-time Warnings: Implement tools that can detect when users are attempting to paste sensitive data into AI interfaces (both internal and external) and provide warnings or block the action based on predefined policies.
- Create Secure and Audited AI Interaction Channels: For critical workflows involving sensitive data and AI, establish dedicated and monitored channels with built-in data redaction and anonymisation capabilities.
- Implement Strict Access Controls within Internal AI Platforms: Ensure granular role-based access control within internal AI deployments to limit data access and interaction based on the principle of least privilege.
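As referenced in the API gateway bullet above, the following Python sketch shows one way a gateway layer might scrub both inbound prompts and outbound model responses. The patterns and the support_agent placeholder are illustrative assumptions, not a specific gateway product's behaviour.

```python
import re

SENSITIVE_PATTERNS = [
    (re.compile(r"[\w.+-]+@[\w-]+\.[\w.]+"), "[EMAIL]"),
    (re.compile(r"\b\d{3}-\d{2}-\d{4}\b"), "[SSN]"),
]

def sanitise(text: str) -> str:
    """Replace sensitive values with placeholders at the gateway boundary."""
    for pattern, placeholder in SENSITIVE_PATTERNS:
        text = pattern.sub(placeholder, text)
    return text

def gateway(handler):
    """Wrap an AI-backed handler so both the inbound request and the
    outbound response are scrubbed before crossing the boundary."""
    def wrapped(request_text: str) -> str:
        clean_request = sanitise(request_text)
        response_text = handler(clean_request)
        return sanitise(response_text)  # catch anything the model echoes back
    return wrapped

@gateway
def support_agent(prompt: str) -> str:
    # Placeholder for the call to the external-facing AI service
    return f"[agent reply to: {prompt}]"

print(support_agent("My email is jane@example.com and I need help"))
```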
Conclusion: Proactive Measures for a Secure AI Future
The transformative potential of generative AI is undeniable, but so too are the significant hidden data security risks it introduces. As a CISO, proactively recognising and addressing these risks, with specific attention to the nuances of internal and external use, is paramount to safeguarding your organisation's most valuable information assets and maintaining customer trust. By implementing robust, AI-specific data protection strategies, encompassing policy, technology, and user education, you can empower your organisation to harness the benefits of AI while effectively minimising its inherent risks in today's complex regulatory landscape.
About Metomic: Metomic provides industry-leading AI data protection solutions that enable secure AI adoption without compromising sensitive information. Our platform offers real-time monitoring, prevention, and governance controls specifically designed for enterprise AI usage across both internal and external applications. Contact us today to learn how we can help you navigate the evolving landscape of AI security.