Protect multi-cloud and Edge Generative AI applications with F5 Distributed Cloud

Introduction

The release of ChatGPT in 2022 saw Generative AI and Large Language Models (LLMs) move from a theoretical field of study to a driving force for an increasing number of real applications. Bloomberg is estimating the GenAI market to reach a size of $1.3 trillion in 2032, an explosive increase of over $40 billion in 2022. The same research points to the synergy between rolling out new GenAI applications and the ongoing move of workloads to the public clouds.

The public cloud providers (AWS, Google, and Microsoft) seem very well positioned to support the massive demand for computation power required by GenAI and there is already stiff competition between them to attract developers and enterprises by expanding their GenAI-supporting features. Customers wanting to leverage the best tool and functionalities from each cloud provider may end up deploying their applications in a distributed way, across multiple cloud providers.

This also has some drawbacks, the complexity of utilizing different environments makes it more difficult to find the diverse skills needed, the lack of unified visibility hinders operations and inconsistent policy enforcement can lead to potential security vulnerabilities.

Securing distributed GenAI workloads with F5 Distributed Cloud

F5’s response with Distributed Cloud is to simplify connectivity and security across clouds. It can serve both legacy and modern applications, ensuring a consistent SaaS experience. It abstracts away the application delivery and security layers from the underlying infrastructure, preventing vendor lock-in and facilitating workload migrations between the public cloud providers. It also seamlessly integrates with an extensive partner ecosystem, allowing 3rd party service insertion and avoiding its lock-in.

As a testament to the speed of development in this area, a new direction is already being explored: running GenAI at the Edge. This move is partially driven by the power consumption (and therefore cost) projected to be needed in case GenAI models will keep following the existing trend of being deployed mainly in data centers, see Tirias Research’s “Generative AI Breaks The Data Center Part 1 and 2”.

Generation latency, security, and privacy regulations might be other reasons to consider deploying GenAI models at the Edge, at least for inference and potentially fine-tuning while the training may remain on the cloud. For example, research papers like “An Overview on Generative AI at Scale with Edge-Cloud Computing” show some potential future directions for architecting GenAI applications.

Research has also been carried out on the environmental impact of GenAI, for example, “Reducing the Carbon Impact of Generative AI Inference (today and in 2035)”, one of the mitigation measures being the intelligent distribution of requests to improve the carbon footprint but also maintain user experience, by minimizing user-response latency.

Edge computing has the potential to offer low latency, custom security, privacy compliance, and better cost management. The downsides are similar to the multi-cloud scenario, where multi-vendor complexity is driving up the total cost of ownership and increasing time to market.

F5’s Distributed Cloud AppStack offers a fully integrated stack that enables a consistent deployment model (on-prem or public/private cloud), lowering TCO and shortening TTM.

OWASP Top 10 LLM vulnerabilities

Securing GenAI and LLMs is a new and rapidly evolving topic but OWASP Foundation has already compiled the first Top 10 list of LLMs vulnerabilities and associated suggested mitigations.

LLM01: Prompt Injection - This manipulates a large language model (LLM) through crafty inputs, causing unintended actions by the LLM. Direct injections overwrite system prompts, while indirect ones manipulate inputs from external sources.

LLM02: Insecure Output Handling - This vulnerability occurs when an LLM output is accepted

without scrutiny, exposing backend systems. Misuse may lead to severe consequences like XSS, CSRF, SSRF, privilege escalation, or remote code execution.

LLM03: Training Data Poisoning - This occurs when LLM training data is tampered with, introducing vulnerabilities or biases that compromise security, effectiveness, or ethical behavior. Sources include Common Crawl, WebText, OpenWebText, & books.

LLM04: Model Denial of Service - Attackers cause resource-heavy operations on LLMs,

leading to service degradation or high costs. The vulnerability is magnified due to the resource-intensive nature of LLMs and the unpredictability of user inputs.

LLM05: Supply Chain Vulnerabilities - LLM application lifecycle can be compromised by

vulnerable components or services, leading to security attacks. Using third-party datasets, pre-trained models, and plugins can add vulnerabilities.

LLM06: Sensitive Information Disclosure - LLMs may inadvertently reveal confidential data in their responses, leading to unauthorized data access, privacy violations, and security breaches. It’s crucial to implement data sanitization and strict user policies to mitigate this.

LLM07: Insecure Plugin Design - LLM plugins can have insecure inputs and insufficient

access control. This lack of application control makes them easier to exploit and can result in consequences like remote code execution.

LLM08: Excessive Agency - LLM-based systems may undertake actions leading to

unintended consequences. The issue arises from excessive functionality, permissions, or autonomy granted to the LLM-based systems.

LLM09: Overreliance - Systems or people overly depending on LLMs without

oversight may face misinformation, miscommunication, legal issues, and security vulnerabilities due to incorrect or inappropriate content generated by LLMs.

LLM10: Model Theft - This involves unauthorized access, copying, or exfiltration

of proprietary LLM models. The impact includes economic losses, compromised competitive advantage, and potential access to sensitive information.

F5 can protect LLMs wherever they are deployed. BIG-IP Advanced WAF and NGINX App Protect use their own advanced AI/ML techniques to protect GenAI workloads deployed on-prem or in containerized environments such as Kubernetes. When GenAI workloads are deployed on the Edge, F5 Distributed Cloud can bring the same level of protection and make it consistent regardless of the environment. For more information on various ways to deploy F5 Distributed Cloud and implementation examples (both manual through the SaaS console and automation), you can consult the “Deploy WAF on any Edge with F5 Distributed Cloud” DevCentral article.

For examples on how NGINX App Protect and F5 Distributed Cloud MultiCloud Networking can secure GenAI workloads, including protection against OWASP’s Sensitive Information Disclosure (LLM06), you can check the following demo:

As shown in this demo, F5 Distributed Cloud enables GenAI applications to be distributed across multiple public clouds (as well as on-prem and private clouds), seamlessly connecting their components with a unitary, single pane of glass, Multicloud Networking (MCN) solution.
F5 XC MCN solution employs Customer Edge sites as "portals" between different environments, allowing services from one environment to be exposed in the other. In the demo above, the LLM remote service from AWS/EKS is being advertised as local to GCP/GKE, to be used by the GenAI application. Since the service is exposed through an HTTP Loadbalancer XC object, a wide range of security features can be enabled for this service, helping secure the MCN connection.
F5 XC Secure MCN (S-MCN) is therefore a complete solution, connecting and securing multicloud and on-prem deployments, regardless of their location.

API Discovery and enforcement is one of the critical features of F5 Distributed Cloud in this context. Another would be API Rate Limiting, enabling protection against OWASP’s Model Denial of Service (LLM04). You can check the “Protect LLM applications against Model Denial of Service” for an implementation example.

Conclusion

F5 Distributed Cloud capabilities allow customers to use a single platform for connectivity, application delivery, and security of GenAI applications in any cloud location and at the Edge, with a consistent and simplified operational model, a game changer for streamlined operational experience for DevOps, NetOps, and SecOps.