Ethical Hacking News

Nvidia Patches Critical Vulnerabilities in Triton Inference Server, Averting Potential AI Model Theft

Nvidia has issued a critical patch for its Triton Inference Server, addressing a chain of high-severity vulnerabilities that could lead to remote code execution. The patch addresses potential risks including AI model theft, sensitive data breaches, or manipulation of AI model responses. Organizations using the server must update to the latest version as soon as possible.

Nvidia has released a patch for Triton Inference Server to address high-severity vulnerabilities that could lead to remote code execution.

The vulnerabilities were discovered by Wiz Research and include three chainable flaws that could expose sensitive data or manipulate AI model responses.

The first vulnerability is a Python backend bug that reveals shared memory information, allowing attackers to take control of the server.

The second vulnerability is an out-of-bounds write bug that enables manipulation of shared memory, while the third is an out-of-bounds read bug that allows full control of the server.

Nvidia has released version 25.07 with necessary fixes and urges organizations to update as soon as possible.

Nvidia has issued a critical patch for its popular Triton Inference Server, an open-source platform designed to run AI models and serve them to user-facing applications. The patch addresses a chain of high-severity vulnerabilities that could lead to remote code execution (RCE) on the server, potentially exposing sensitive data, manipulating AI model responses, or allowing attackers to move into other areas of the network.

The vulnerabilities were discovered by Wiz Research, a security firm that specializes in researching and identifying critical flaws in software. According to Wiz, if the three vulnerabilities they discovered and reported to Nvidia were exploited successfully, the potential consequences could be severe, including AI model theft, sensitive data breaches, or manipulation of AI model responses.

The first vulnerability (CVE-2025-23320 – 7.5) relates to a bug in the Python backend, triggered by exceeding the shared memory limit, using a very large request. This causes an error message that reveals the unique name (key) of the backend's internal IPC shared memory region in full. Using this newfound information, attackers can combine it with the public shared memory API to take control of a Triton Inference Server.

The second vulnerability (CVE-2025-23319 – 8.1) is an out-of-bounds write bug that allows attackers to manipulate the backend's shared memory. The third vulnerability (CVE-2025-23334 – 5.9) is an out-of-bounds read bug that enables attackers to gain full control of the server.

Wiz Research emphasized the importance of defense-in-depth, where security is considered at every layer of an application. According to the researchers, a single component's verbose error message and a feature that can be misused in the main server were all it took to create a path to potential system compromise.

Nvidia has now patched the bugs affecting Triton Inference Server, releasing version 25.07 on August 4, which includes the necessary fixes. All versions prior to this are vulnerable to exploitation. Nvidia confirmed that they worked closely with Wiz Research and are grateful for their collaboration.

Wiz Research's discovery highlights the importance of securing the underlying infrastructure as AI and machine learning (ML) adoption becomes more widespread. The researchers' findings serve as a reminder that even seemingly minor flaws can be chained together to create significant exploits.

In light of this incident, it is essential for organizations using Triton Inference Server to update to the latest version as soon as possible. This patch addresses critical vulnerabilities that could potentially expose sensitive data or enable attackers to manipulate AI model responses.

The incident also underscores the significance of responsible disclosure and collaboration between security researchers and software vendors in identifying and addressing critical flaws before they can be exploited.

Related Information:

https://www.ethicalhackingnews.com/articles/Nvidia-Patches-Critical-Vulnerabilities-in-Triton-Inference-Server-Averting-Potential-AI-Model-Theft-ehn.shtml

https://go.theregister.com/feed/www.theregister.com/2025/08/05/nvidia_triton_bug_chain/

https://nvd.nist.gov/vuln/detail/CVE-2025-23319

https://www.cvedetails.com/cve/CVE-2025-23319/

https://nvd.nist.gov/vuln/detail/CVE-2025-23320

https://www.cvedetails.com/cve/CVE-2025-23320/

https://nvd.nist.gov/vuln/detail/CVE-2025-23334

https://www.cvedetails.com/cve/CVE-2025-23334/

Published: Tue Aug 5 10:08:20 2025 by llama3.2 3B Q4_K_M

Today's cybersecurity headlines are brought to you by ThreatPerspective

Nvidia Patches Critical Vulnerabilities in Triton Inference Server, Averting Potential AI Model Theft