Making Data Smarter with IBM Spectrum Discover: Practical AI Solutions

A draft IBM Redbooks publication

Updated 03 September 2020

IBM Form #: SG24-8488-00
(150 pages)

Authors: Ivaylo B. Bozhinov, Isom Crawford Jr., PhD, Joseph Dain, Mathias Defiebre, Maxime Deloche, Kiran Ghag, Vasfi Gucer, Xin Liu, Abeer Selim, Gauthier Siri, Christopher Vollmar

"IBM Spectrum Discover: Insights into your files for better TCO with IBM Spectrum Archive EE" presented by Isom Crawford Jr., PhD, Client Technical Specialist IBM USA (See Chapter 5 for a detailed discussion of this scenario.)


More than 80% of all data that is collected by organizations is not in a standard relational database. Instead, it is trapped in unstructured documents, social media posts, machine logs, and so on. Many organizations face significant challenges in managing this deluge of unstructured data, such as:

  • Pinpointing and activating relevant data for large-scale analytics, machine learning (ML), and deep learning (DL) workloads.
  • Lacking the fine-grained visibility that is needed to map data to business priorities.
  • Removing redundant, obsolete, and trivial (ROT) data and identifying data that can be moved to a lower-cost storage tier.
  • Identifying and classifying sensitive data as it relates to various compliance mandates, such as the General Data Protection Regulation (GDPR), the Payment Card Industry Data Security Standard (PCI-DSS), and the Health Insurance Portability and Accountability Act (HIPAA).

IBM Spectrum Discover is modern metadata management software that provides data insight for petabyte-scale file and object storage, both on premises and in the cloud. This software enables organizations to make better business decisions and to gain and maintain a competitive advantage.

IBM Spectrum Discover provides a rich metadata layer that enables storage administrators, data stewards, and data scientists to efficiently manage, classify, and gain insights from massive amounts of unstructured data. It improves storage economics, helps mitigate risk, and accelerates large-scale analytics to create competitive advantage and speed critical research.

This IBM® Redbooks® publication presents several use cases focused on artificial intelligence (AI) solutions with IBM Spectrum Discover. This book helps storage administrators and technical specialists plan and implement AI solutions using IBM Spectrum Discover and several other IBM Storage products.

Table of contents

Chapter 1. IBM Spectrum Discover overview
Chapter 2. Generic imagery use cases
Chapter 3. AI pipeline using IBM Spectrum Discover
Chapter 4. Using artificial intelligence in medical imaging - JFR Challenge
Chapter 5. IBM Spectrum Discover integration with IBM Spectrum Archive Enterprise Edition
Appendix A. Product details: IBM Spectrum Scale, IBM Spectrum Archive and IBM Tape libraries

These pages are Web versions of IBM Redbooks- and Redpapers-in-progress. They are published here for those who need the information now and may contain spelling, layout and grammatical errors. This material has not been submitted to any formal IBM test and is published AS IS. It has not been the subject of rigorous review. Your feedback is welcomed to improve the usefulness of the material to others.
