AI Infra Summit - Workshop Agenda

Agenda Days:

Wednesday, 10 Sep, 2025

09:00 AM
UALink Breakfast Briefing

Location: Room 201
Duration: 40 minutes

UALink Consortium

Website: https://ualinkconsortium.org/

The Ultra Accelerator Link (UALink) Consortium, incorporated in October 2024, is the open industry standard group dedicated to developing the UALink specifications, a high-speed, scale-up accelerator interconnect technology that advances next-generation AI & HPC cluster performance. The consortium is led by a board made up of stalwarts of the industry: Alibaba, AMD, Apple, Astera Labs, AWS, Cisco, Google, HPE, Intel, Meta, Microsoft, and Synopsys. The Consortium develops technical specifications that facilitate breakthrough performance for emerging AI usage models while supporting an open ecosystem for data center accelerators. For more information on the UALink Consortium, please visit www.UALinkConsortium.org.

1:30 PM
Scaling LLM Inference with vLLM and AWS Trainium

Join us in this hands-on workshop to learn how to deploy and optimize large language models (LLMs) for scalable inference at enterprise scale. Participants will learn to orchestrate distributed LLM serving with vLLM on Amazon EKS, enabling robust, flexible, and highly available deployments. The session demonstrates how to utilize AWS Trainium hardware within EKS to maximize throughput and cost efficiency, leveraging Kubernetes-native features for automated scaling, resource management, and seamless integration with AWS services.
Location: Room 206
Duration: 1 hour

Author:

Asheesh Goja

Principal GenAI Solutions Architect
AWS

Author:

Pinak Panigrahi

Sr. Machine Learning Architect - Annapurna ML
AWS

AWS

Website: https://aws.amazon.com/

Since 2006, Amazon Web Services has been the world’s most comprehensive and broadly adopted cloud. AWS has been continually expanding its services to support virtually any workload, and it now has more than 240 fully featured services for compute, storage, databases, networking, analytics, machine learning and artificial intelligence (AI), Internet of Things (IoT), mobile, security, hybrid, media, and application development, deployment, and management from 114 Availability Zones within 36 geographic regions, with announced plans for 12 more Availability Zones and four more AWS Regions in New Zealand, the Kingdom of Saudi Arabia, Taiwan, and the AWS European Sovereign Cloud. Millions of customers—including the fastest-growing startups, largest enterprises, and leading government agencies—trust AWS to power their infrastructure, become more agile, and lower costs. To learn more about AWS, visit aws.amazon.com.

2:45 PM
From Rack to Response: Build & Deploy Generative AI in 30 Minutes with NeuReality

Experience the future of GenAI inference architecture with NeuReality’s fully integrated, enterprise-ready NR1® Inference Appliance. In this hands-on workshop, you'll go from cold start to live GenAI applications in under 30 minutes using our AI-CPU-powered system. The NR1® Chip, the world’s first AI-CPU purpose-built for inference, pairs with any GPU or AI accelerator and optimizes any AI data workload. We’ll walk you through setup, deployment, and real-time inference using models like LLaMA, Mistral, and DeepSeek on our disaggregated architecture, built for smooth scalability, superior price/performance, and near-100% GPU utilization (vs. <50% with traditional CPU/NIC architecture). Join us to see how NeuReality eliminates infrastructure complexity and delivers enterprise-ready performance and ROI today.
Location: Room 201
Duration: 1 hour

Author:

Paul Piezzo

Enterprise Sales Director
NeuReality

Author:

Gaurav Shah

VP of Business Development
NeuReality

Author:

Naveh Grofi

Customer Success Engineer
NeuReality

NeuReality

Website: https://www.neureality.ai/

Founded in 2020, NeuReality is revolutionizing AI with its complete NR1® AI Inference Solutions powered by the NR1® Chip – the world's first true AI-CPU built for inference workloads at scale. This powerful chip redefines AI by combining computing—6x more powerful than traditional CPUs—with advanced networking capabilities in an AI-NIC, all in one cohesive unit. This includes on-chip inference orchestration, video, and audio capabilities, ensuring businesses and governments maximize their AI hardware investments.
Our innovative technology solves critical compute and networking bottlenecks where expensive GPUs often sit idle. The NR1 pairs with any AI accelerator (GPU, FPGA, or ASIC), boosting utilization to nearly 100% from <50% today with traditional CPUs. This unlocks wasted capacity, delivering superior price/performance, unparalleled energy efficiency, and higher AI token output within the same cost and power.
The NR1 Chip is the heart of our ready-to-go NR1® Inference Appliance which can be built with any GPU. This compact server comes preloaded with our comprehensive NR Software suite, including all necessary SDKs and Inference APIs. Furthermore, it's equipped with optimized AI models for computer vision, generative AI, and agentic AI, featuring popular choices such as Llama 3, DeepSeek, Qwen, and Mixtral. Our mission is to make the AI revolution accessible and affordable, dismantling the barriers of excessive cost, energy consumption, and complexity for all organizations.

DataBank Workshop - Title TBC

Location: Room 206
Duration: 1 hour

DataBank

Website: https://www.databank.com/

DataBank helps the world’s largest enterprises, technology, and content providers ensure their data and applications are always on, always secure, always compliant, and ready to scale to meet the needs of the artificial intelligence era. Recognized by Deloitte in 2023 and 2024, and Inc. 5000 in 2024 as one of the fastest-growing private US companies, DataBank’s edge colocation and infrastructure footprint consists of 65+ HPC-ready data centers in 25+ markets, 20 interconnection hubs, and on-ramps to an ecosystem of cloud providers with virtually unlimited reach.

Google Workshop - Title TBC

Location: Room 207
Duration: 1 hour

Google

Google Cloud provides leading infrastructure, platform capabilities, and industry solutions. We deliver enterprise-grade cloud solutions that leverage Google’s cutting-edge technology to help companies operate more efficiently and adapt to changing needs, giving customers a foundation for the future. Customers in more than 150 countries use Google Cloud as their trusted partner to solve their most critical business problems.

4:00 PM
Runtime Attested HPC Cluster Reference Architecture for Confidential Computing

The rapid evolution of high-performance computing (HPC) clusters has been instrumental in driving transformative advancements in AI research and applications. These sophisticated systems enable the processing of complex datasets and support groundbreaking innovation. However, as their adoption grows, so do the critical security challenges they face, particularly when handling sensitive data in multi-tenant environments where diverse users and workloads coexist. Organizations are increasingly turning to Confidential Computing as a framework to protect AI workloads, emphasizing the need for robust HPC architectures that incorporate runtime attestation capabilities to ensure trust and integrity.
In this session, we present an advanced HPC cluster architecture designed to address these challenges, focusing on how runtime attestation of critical components – such as the kernel, Trusted Execution Environments (TEEs), and eBPF layers – can effectively fortify HPC clusters for AI applications operating across disjoint tenants. This architecture leverages cutting-edge security practices, enabling real-time verification and anomaly detection without compromising the performance essential to HPC systems.
Through use cases and examples, we will illustrate how runtime attestation integrates seamlessly into HPC environments, offering a scalable and efficient solution for securing AI workloads. Participants will leave this session equipped with a deeper understanding of how to leverage runtime attestation and Confidential Computing principles to build secure, reliable, and high-performing HPC clusters tailored for AI innovations.
Location: Room 201
Duration: 1 hour

Author:

Jason Rogers

CEO
Invary

Jason Rogers is the Chief Executive Officer of Invary, a cybersecurity company that ensures the security and confidentiality of critical systems by verifying their Runtime Integrity. Leveraging NSA-licensed technology, Invary detects hidden threats and reinforces confidence in an existing security posture. Previously, Jason served as the Vice President of Platform at Matterport, successfully launched a consumer-facing IoT platform for Lowe's, and developed numerous IoT and network security software products for Motorola.

Author:

Ayal Yogev

CEO & Co-founder
Anjuna

Confidential Computing Consortium

Website: https://confidentialcomputing.io/

The Confidential Computing Consortium is a community focused on projects securing data in use and accelerating the adoption of Confidential Computing through open collaboration.
The Confidential Computing Consortium (CCC) brings together hardware vendors, cloud providers, and software developers to accelerate the adoption of Trusted Execution Environment (TEE) technologies and standards.
CCC is a project community at the Linux Foundation dedicated to defining and accelerating the adoption of Confidential Computing. It embodies the open governance and open collaboration that have aided the success of similarly ambitious efforts. The effort includes commitments from numerous member organizations and contributions from several open source projects.

AWS Workshop - Title TBC

Location: Room 206
Duration: 1 hour

IMEC Workshop - Title TBC

Location: Room 207
Duration: 1 hour

IMEC

Website: https://www.imec-int.com/en

Imec is a world-leading research and innovation center in nanoelectronics and digital technologies. Imec leverages its state-of-the-art R&D infrastructure and team of more than 5,500 employees and top researchers for R&D in advanced semiconductor and system scaling, silicon photonics, artificial intelligence, beyond 5G communications, and sensing technologies.
As imec’s application-specific IC (ASIC) division, imec.IC-link serves start-ups, SMEs, OEMs, and universities by supporting the full ASIC development process from design and IP services to production, packaging, and testing. It realizes over 600 yearly tape-outs in CMOS, GaN-on-SOI, and silicon photonics technologies.
