Dive into a hands-on workshop designed exclusively for AI developers. Learn to leverage the power of Google Cloud TPUs, the custom accelerators behind Google Gemini, for highly efficient LLM inference using vLLM. In this workshop, you will build and deploy Gemma 3 27B on Trillium TPUs with vLLM and Google Kubernetes Engine (GKE). Explore advanced tooling like Dynamic Workload Scheduler (DWS) for TPU provisioning, Google Cloud Storage (GCS) for model checkpoints, and essential observability and monitoring solutions.
Location: Room 207
Duration: 1 hour
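
The deployment described above — serving Gemma 3 27B with vLLM on a Trillium (TPU v6e) slice in GKE — might look roughly like the following Kubernetes manifest. This is a minimal sketch only: the container image tag, model ID, TPU topology, and chip count are assumptions and will vary with your cluster, quota, and vLLM build.

```yaml
# Illustrative sketch, not a production manifest.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: vllm-gemma-3-27b
spec:
  replicas: 1
  selector:
    matchLabels:
      app: vllm-gemma-3-27b
  template:
    metadata:
      labels:
        app: vllm-gemma-3-27b
    spec:
      nodeSelector:
        # Schedule onto a Trillium (TPU v6e) node pool; topology is an assumption.
        cloud.google.com/gke-tpu-accelerator: tpu-v6e-slice
        cloud.google.com/gke-tpu-topology: "2x4"
      containers:
      - name: vllm
        # Hypothetical image tag -- substitute your own TPU-enabled vLLM build.
        image: vllm/vllm-tpu:latest
        args:
        - --model=google/gemma-3-27b-it
        - --tensor-parallel-size=8
        ports:
        - containerPort: 8000
        resources:
          requests:
            google.com/tpu: "8"
          limits:
            google.com/tpu: "8"
```

In practice you would layer the workshop's other tools on top of this: DWS for provisioning the TPU node pool, and a GCS bucket (e.g., via a volume or model path flag) for model checkpoints.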

Niranjan Hira
As a Product Manager on our AI Infrastructure team, Hira focuses on how Google Cloud offerings can help customers and partners build more helpful AI experiences for users. With over 30 years of experience building applications and products across multiple industries, he likes to hog the whiteboard and tell developer tales.

Don McCasland
Don leads the Cloud Developer Relations team for AI Infrastructure at Google Cloud. A 20-year veteran of Developer Operations, he is focused on empowering the global developer community to build and scale the next generation of AI applications on Google's cutting-edge platforms.
Google Cloud provides leading infrastructure, platform capabilities, and industry solutions. We deliver enterprise-grade cloud solutions that leverage Google’s cutting-edge technology to help companies operate more efficiently and adapt to changing needs, giving customers a foundation for the future. Customers in more than 150 countries use Google Cloud as their trusted partner to solve their most critical business problems.