AI & HPC Data Centers
Fault Tolerant Solutions
Integrated Memory

Join us at Booth #1031 in San Jose, March 16-19, to explore groundbreaking AI infrastructure solutions that accelerate discovery and deliver peak performance.
Penguin Solutions is proud to be a Gold Sponsor of NVIDIA GTC, the premier global AI conference. As an NVIDIA-certified Elite OEM, DGX AI Compute Systems Solution Provider, and DGX-Ready Managed Services partner, we design, build, deploy, and manage high-performance, high-availability enterprise solutions that help customers achieve breakthrough innovations.
With 25+ years of AI and HPC experience, 3.3 billion hours of GPU runtime, and 89,000+ GPUs deployed and managed, Penguin Solutions is trusted by AI leaders such as SK Telecom, Meta, Voltage Park, and Georgia Tech.
Visit us at Booth #1031 to explore our solutions and services designed to meet your current and future business needs.
Maximize your time at GTC and schedule a meeting or live demo with our experts. Our team is ready to listen, collaborate, and design the right solution for your unique goals.
Join CTO Phil Pokorny to discover how KV Cache technology improves AI inference performance and scalability. You’ll learn how CXL memory transforms infrastructure by expanding GPU capacity for high-value tasks, accelerating response times, and unlocking new capabilities. Attend this session to see how sharing cached work across multiple GPUs drives higher system efficiency and reduces the total cost of inference.
Date: March 17, 2026
Time: 3:20 p.m. - 3:35 p.m. PT
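As background for the session topic, the sketch below illustrates what a KV cache does during autoregressive decoding: keys and values are computed once per token and reused at every later step, so only the newest token needs projecting. This is a minimal illustrative example; the function names, shapes, and toy projections are assumptions, not a description of Penguin Solutions' or NVIDIA's implementation.

```python
import numpy as np

rng = np.random.default_rng(0)
d = 4  # head dimension (toy size)

def project(token_vec):
    # Stand-in for the learned key/value projections (illustrative only).
    return token_vec * 0.5, token_vec * 0.25

kv_cache = {"K": [], "V": []}

def decode_step(token_vec):
    # Compute K/V only for the NEW token; reuse cached entries for the prefix.
    k, v = project(token_vec)
    kv_cache["K"].append(k)
    kv_cache["V"].append(v)
    K = np.stack(kv_cache["K"])  # (t, d) -- grows by one row per step
    V = np.stack(kv_cache["V"])
    q = token_vec                # query for the current position
    scores = K @ q / np.sqrt(d)
    weights = np.exp(scores - scores.max())
    weights /= weights.sum()
    return weights @ V           # attention output for this step

for _ in range(3):
    out = decode_step(rng.standard_normal(d))

# After 3 steps the cache holds K/V for all 3 positions; each step projected
# only one token instead of re-projecting the whole prefix.
print(len(kv_cache["K"]))  # 3
```

Sharing this cache across GPUs, as the session describes, means a prefix decoded on one node need not be recomputed on another, which is where pooled CXL memory capacity comes in.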


A complete AI factory infrastructure solution for customers who want to deploy GPUs rapidly at scale.
Simplifies AI deployment and management, maximizes GPU availability, and delivers predictable performance.

Intelligent cluster management software that ensures high availability and optimal performance for your AI infrastructure.
Streamline infrastructure management operations and deliver powerful results.

Trusted AI expertise accelerates time to value and ROI through rapid cluster deployment while maximizing performance, efficiency, and operational resilience.
Scale your AI initiatives without disruption as your needs grow.

A patent-pending, Ethernet-based 11 TB memory appliance that creates a shared CXL DDR5 memory pool across GPU nodes for scalable AI inference.
Achieve faster AI inferencing with reduced latency.


