Accelerate LLM fine-tuning and production deployment with NVIDIA NIM and Domino
Domino Data Lab

Published Oct 10, 2024

Using generative AI and large language models (LLMs) with proprietary data holds immense potential. Yet the time- and resource-intensive development process makes it difficult to scale these LLMs into production workloads.

While commercial LLM APIs offer convenience, they fall short on customization, security, and cost-effectiveness. NVIDIA NIM, part of NVIDIA AI Enterprise, is a set of containerized inference microservices that accelerate the deployment of foundation models tailored to your specific needs and data, securely and cost-effectively.

In this webinar, we’ll demonstrate how deploying a NIM model alongside Domino’s Enterprise AI Platform can accelerate your AI development lifecycle through optimized resource utilization, enhanced performance, reduced operational costs, and industry-leading governance.

What you will learn:

Impact of NVIDIA NIM: Learn how NVIDIA NIM accelerates the deployment of foundation models across infrastructures, including local workstations, cloud environments, and on-premises data centers.

Optimized resource utilization and performance: Run a NIM container and Domino in the same cluster for efficient resource allocation and cost savings.
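
Here's a minimal sketch of what launching a NIM container looks like, driven from Python via Docker. The image name, cache path, and port are illustrative; check the NGC catalog for the model you actually want to serve, and note that NGC_API_KEY must be set to pull model weights the first time.

```python
import os
import subprocess

# Minimal sketch: launch a NIM inference container via Docker.
# Image name and cache path are illustrative, not prescriptive.
image = "nvcr.io/nim/meta/llama3-8b-instruct:latest"

subprocess.run(
    [
        "docker", "run", "--rm", "--gpus", "all",
        "-e", f"NGC_API_KEY={os.environ['NGC_API_KEY']}",  # NGC credentials for model download
        "-v", os.path.expanduser("~/.cache/nim") + ":/opt/nim/.cache",  # persist downloaded weights
        "-p", "8000:8000",  # NIM serves an OpenAI-compatible API on this port
        image,
    ],
    check=True,
)
```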

Robust governance and security: Centralize governance by securely connecting to a NIM foundation model using Domino AI Gateway for controlled user access and comprehensive LLM activity logging.
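
As a sketch of what that connection looks like in code: NIM exposes an OpenAI-compatible API, so the standard OpenAI Python client works against it. The endpoint URL, API key, and prompt below are placeholders; in a Domino deployment, AI Gateway brokers access to the endpoint, enforces user permissions, and logs the traffic.

```python
from openai import OpenAI

# Sketch: query a NIM endpoint through its OpenAI-compatible API.
# The base_url is a hypothetical endpoint; in practice the gateway
# would sit in front of it to control access and log usage.
client = OpenAI(
    base_url="http://nim.example.internal:8000/v1",  # placeholder endpoint
    api_key="placeholder",  # a gateway would enforce real credentials here
)

response = client.chat.completions.create(
    model="meta/llama3-8b-instruct",  # model id served by the NIM container
    messages=[{"role": "user", "content": "Summarize this quarter's churn drivers."}],
    max_tokens=200,
)
print(response.choices[0].message.content)
```

Because the API surface matches OpenAI's, existing client code can be repointed at a governed endpoint with only a base URL change.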

Advanced model training and deployment: Use NVIDIA NeMo in Domino to fine-tune a NIM foundation model using parameter-efficient fine-tuning (PEFT), with full reproducibility.
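
The webinar uses NeMo for this step; as a self-contained illustration of the PEFT idea itself, the sketch below attaches a LoRA adapter with the Hugging Face `peft` library instead. The model name and hyperparameters are illustrative only.

```python
from transformers import AutoModelForCausalLM
from peft import LoraConfig, get_peft_model

# Illustration of PEFT via LoRA using the Hugging Face `peft` library
# (the webinar itself uses NVIDIA NeMo; this only shows the concept).
# LoRA freezes the base weights and trains small low-rank adapter
# matrices injected into selected layers.
base = AutoModelForCausalLM.from_pretrained("meta-llama/Meta-Llama-3-8B-Instruct")

lora = LoraConfig(
    r=16,                                 # adapter rank
    lora_alpha=32,                        # adapter scaling factor
    target_modules=["q_proj", "v_proj"],  # attach to attention projections
    lora_dropout=0.05,
    task_type="CAUSAL_LM",
)

model = get_peft_model(base, lora)
model.print_trainable_parameters()  # typically well under 1% of all weights
```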

Scalable, flexible deployment: Deploy the fine-tuned model back to NIM, with the flexibility to leverage hybrid and multi-cloud environments for resilient AI infrastructure at production scale.
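
One way this step can look, assuming NIM's documented support for serving LoRA adapters alongside a base model: mount the adapter directory into the container and point the NIM_PEFT_SOURCE environment variable at it. All paths and names below are illustrative.

```python
import os
import subprocess

# Sketch, assuming NIM's LoRA-serving support: mount a directory of
# fine-tuned adapters and set NIM_PEFT_SOURCE so NIM can load them.
# Paths, adapter names, and the image tag are illustrative.
adapters = os.path.expanduser("~/loras")  # e.g. contains llama3-8b-customer-support/

subprocess.run(
    [
        "docker", "run", "--rm", "--gpus", "all",
        "-e", f"NGC_API_KEY={os.environ['NGC_API_KEY']}",
        "-e", "NIM_PEFT_SOURCE=/opt/nim/loras",   # where NIM looks for adapters
        "-v", f"{adapters}:/opt/nim/loras",       # mount local adapters into the container
        "-p", "8000:8000",
        "nvcr.io/nim/meta/llama3-8b-instruct:latest",
    ],
    check=True,
)
```

Once loaded, an adapter can typically be requested through the same OpenAI-compatible API by passing its name as the model parameter.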
