Tuesday, January 29, 2019

Accelerating AI and Deep Learning with Dell EMC Isilon and NVIDIA GPUs - Dell EMC Certifications


Over the last few years, Dell EMC and NVIDIA have established a strong partnership to help organizations accelerate their AI initiatives. For organizations that prefer to build their own solution, we offer Dell EMC’s ultra-dense PowerEdge C-series with NVIDIA Tesla V100 Tensor Core GPUs, which supports scale-out AI solutions from four up to hundreds of GPUs per cluster. For customers looking to leverage a pre-validated hardware and software stack for their deep learning initiatives, we offer Dell EMC Ready Solutions for AI: Deep Learning with NVIDIA, which also feature Dell EMC Isilon All-Flash storage. Our partnership is built on the philosophy of offering flexibility and informed choice across a broad portfolio.

To give organizations even more flexibility in how they deploy AI with breakthrough performance for large-scale deep learning, Dell EMC and NVIDIA have recently collaborated on a new reference architecture that combines Dell EMC Isilon All-Flash scale-out NAS storage with NVIDIA DGX-1 servers for AI and deep learning (DL) workloads.

To validate the new reference architecture, we ran multiple industry-standard image classification benchmarks using 22 TB datasets to simulate real-world training and inference workloads. This testing was done on systems ranging from one DGX-1 server, all the way to nine DGX-1 servers (72 Tesla V100 GPUs) connected to eight Isilon F800 nodes.

This blog post summarizes the DL workflow, the training pipeline, the benchmark methodology, and finally the results of the benchmarks.

Key components of the reference architecture shown in figure 1 include:

  • Dell EMC Isilon All-Flash scale-out NAS storage, which delivers the scale (up to 33 PB), performance (up to 540 GB/s), and concurrency (up to millions of connections) needed to eliminate the storage I/O bottleneck and keep the most data-hungry compute layers fed, accelerating AI workloads at scale.
  • NVIDIA DGX-1 servers, which integrate up to eight NVIDIA Tesla V100 Tensor Core GPUs fully interconnected in a hybrid cube-mesh topology. Each DGX-1 server can deliver 1 petaFLOPS of AI performance and is powered by the DGX software stack, which includes NVIDIA-optimized versions of the most popular deep learning frameworks for maximized training performance.


Benchmark Methodology Summary


To measure the performance of the solution, we ran various benchmarks from the TensorFlow Benchmarks repository. This suite of benchmarks trains an image classification convolutional neural network (CNN) on labeled images. Essentially, the system learns whether an image contains a cat, dog, car, train, etc.

The well-known ILSVRC2012 image dataset (often referred to as ImageNet) was used. This dataset contains 1,281,167 training images in 144.8 GB[1]. All images are grouped into 1000 categories or classes. This dataset is commonly used by deep learning researchers for benchmarking and comparison studies.

When running the benchmarks on the 148 GB dataset, it was found that the storage I/O throughput gradually decreased and became virtually zero after a few minutes. This indicated that the entire dataset was cached in the Linux buffer cache on each DGX-1 server. Of course, this is not surprising since each DGX-1 server has 512 GB of RAM and this workload did not significantly use RAM for other purposes. As real datasets are often significantly larger than this, we wanted to determine the performance with datasets that are not only larger than the DGX-1 server RAM, but larger than the 2 TB of coherent shared cache available across the 8-node Isilon cluster. To accomplish this, we simply made 150 exact copies of each image archive file, creating a 22.2 TB dataset.
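The sizing argument above can be checked with a few lines of arithmetic. The following is a minimal sketch, not part of the benchmark tooling: the figures come from the text, while the variable names and the use of decimal units (1 TB = 1000 GB) are assumptions made for illustration.

```python
# Figures quoted in the text above.
BASE_DATASET_GB = 148    # dataset size used in the initial runs
COPIES = 150             # exact copies made of each image archive file
DGX1_RAM_GB = 512        # RAM per DGX-1 server
ISILON_CACHE_GB = 2000   # 2 TB shared cache; decimal units are an assumption


def replicated_size_gb(base_gb, copies):
    """Total size of a dataset built from `copies` exact copies of the base."""
    return base_gb * copies


total_gb = replicated_size_gb(BASE_DATASET_GB, COPIES)
total_tb = total_gb / 1000

# The base dataset fits in one server's RAM, so reads end up served from
# the Linux buffer cache rather than from storage.
fits_in_server_ram = BASE_DATASET_GB <= DGX1_RAM_GB

# The replicated dataset is larger than the Isilon cluster's shared cache,
# so it cannot be held entirely in cache on either side.
exceeds_isilon_cache = total_gb > ISILON_CACHE_GB

print(f"Replicated dataset: {total_tb:.1f} TB")
print(f"Base dataset fits in DGX-1 RAM: {fits_in_server_ram}")
print(f"Replicated dataset exceeds Isilon cache: {exceeds_isilon_cache}")
```

With these inputs, the replicated dataset is roughly 43 times the RAM of a single DGX-1 server and about 11 times the Isilon shared cache, which is why caching alone cannot absorb the workload.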

Conclusion


Here are some of the key findings from our testing of the Isilon and NVIDIA DGX-1 server reference architecture:

  • Achieved compelling performance results across industry-standard AI benchmarks from 8 through 72 GPUs without degradation to throughput or performance.
  • Linear scalability from 8 to 72 GPUs, delivering up to 19.9 GB/s while keeping the GPUs pegged at >97% utilization.
  • The Isilon F800 system can deliver up to 96% of the throughput of local memory, bringing it extremely close to the maximum theoretical performance limit an NVIDIA DGX-1 system can achieve.
  • Isilon-based DL solutions deliver the capacity, performance, and high concurrency to eliminate storage I/O bottlenecks for AI. This provides a rock-solid foundation for large-scale, enterprise-grade DL solutions with a future-proof scale-out architecture that meets your AI needs of today and scales for the future.
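To put the aggregate throughput figure in per-device terms, here is a rough back-of-the-envelope sketch. It assumes that the 19.9 GB/s peak corresponds to the largest configuration (nine DGX-1 servers, 72 GPUs) and that the load is spread evenly across servers and GPUs; neither assumption is stated explicitly in the findings, and the helper name is hypothetical.

```python
# Figures quoted in the text.
SERVERS = 9
GPUS_PER_SERVER = 8
PEAK_THROUGHPUT_GBPS = 19.9  # GB/s; assumed here to be the 72-GPU aggregate


def even_share(total, parts):
    """Split an aggregate rate evenly across `parts` consumers."""
    return total / parts


total_gpus = SERVERS * GPUS_PER_SERVER
per_server_gbps = even_share(PEAK_THROUGHPUT_GBPS, SERVERS)
# Convert GB/s to MB/s using decimal units (an assumption).
per_gpu_mbps = even_share(PEAK_THROUGHPUT_GBPS, total_gpus) * 1000

print(f"GPUs in the largest configuration: {total_gpus}")
print(f"Storage throughput per server: {per_server_gbps:.2f} GB/s")
print(f"Storage throughput per GPU: {per_gpu_mbps:.0f} MB/s")
```

Under these assumptions, each server draws a little over 2 GB/s from storage at peak, which gives a sense of the sustained read rate the storage layer has to supply per node.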
