Deployment and Usage Guide for Running AI Workloads on Red Hat OpenShift and NVIDIA DGX Systems with IBM Spectrum Scale

An IBM Redpaper publication


Published on 30 November 2020



ISBN-10: 0738459097
ISBN-13: 9780738459097
IBM Form #: REDP-5610-00


Authors: Simon Lorenz, Gero Schmidt and Thomas Schoenemeyer


    Abstract

    This IBM® Redpaper publication describes the architecture, installation procedure, and results for running a typical training application that works on an automotive data set in an orchestrated and secured environment that provides horizontal scalability of GPU resources across physical node boundaries for deep neural network (DNN) workloads.

    This paper is intended primarily for systems engineers, system administrators, and system architects who are responsible for data center infrastructure management and typical day-to-day operations such as system monitoring, operational control, asset management, and security audits.

    This paper also describes IBM Spectrum® LSF® as a workload manager and IBM Spectrum Discover as a metadata search engine to find the right data for an inference job and automate the data science workflow. With this solution, the data location (which may be on different storage systems) and the time of availability for the AI job can be fully abstracted, which provides valuable information for data scientists.

    Table of Contents

    Chapter 1. Overview

    Chapter 2. Proof of concept environment

    Chapter 3. Installation

    Chapter 4. Preparation and functional testing

    Chapter 5. Deep neural network training on the Audi Autonomous Driving Dataset semantic segmentation data set

     
