Today is a big day. Amazon AWS launched new deep learning / AI focused instances featuring the NVIDIA Tesla V100 GPU. The Tesla V100 carries NVIDIA's latest technology, including the new Tensor Cores. Suffice it to say, these are monster chips, highly anticipated by the deep learning / AI crowd. At the same time, the V100 is very costly, so if you train models only infrequently, the $150–180K going price for an 8x NVIDIA Tesla V100 server may be less cost-effective than running the workload in the cloud.
Each NVIDIA Tesla V100 Volta-generation GPU has 5,120 CUDA Cores and 640 Tensor Cores. That provides 125 TFLOPS of mixed-precision performance, 15.7 TFLOPS of single precision (FP32) performance, and 7.8 TFLOPS of double precision (FP64) performance.
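As a sanity check, those headline figures follow directly from the core counts and the clock. The SXM2 boost clock of roughly 1.53 GHz is taken from NVIDIA's published specs, not from this article:

```python
# Back-of-the-envelope check of the Tesla V100 peak throughput figures.
# Assumes the SXM2 boost clock of ~1.53 GHz (per NVIDIA's spec sheet).
BOOST_CLOCK_HZ = 1.53e9

cuda_cores = 5120
tensor_cores = 640

# FP32: each CUDA core retires one fused multiply-add (2 FLOPs) per clock.
fp32_tflops = cuda_cores * 2 * BOOST_CLOCK_HZ / 1e12

# FP64: the GV100 FP64 units run at half the FP32 rate.
fp64_tflops = fp32_tflops / 2

# Tensor Cores: each performs a 4x4x4 matrix FMA per clock,
# i.e. 64 multiply-adds = 128 FLOPs, on FP16 inputs with FP32 accumulate.
tensor_tflops = tensor_cores * 128 * BOOST_CLOCK_HZ / 1e12

print(f"FP32:   {fp32_tflops:.1f} TFLOPS")    # ~15.7
print(f"FP64:   {fp64_tflops:.1f} TFLOPS")    # ~7.8
print(f"Tensor: {tensor_tflops:.1f} TFLOPS")  # ~125
```

The numbers land within rounding of the quoted 15.7 / 7.8 / 125 TFLOPS figures, which is a good reminder that the 8x "mixed precision" gain comes entirely from the Tensor Cores' matrix units.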
The Amazon AWS EC2 P3 instances also include NVLink for ultra-fast GPU to GPU communication.
Amazon AWS EC2 P3 Instances Specs and Pricing
Here is a table with the three instance types and hourly pricing based on the Tokyo region price list. Note that not every availability zone offers the P3 instances at the time of publication.
If you do need AWS EC2 P3 instances on a regular basis, a 12-month all-upfront reserved term for the 8-GPU p3.16xlarge is $176,601 (USD), compared to our estimate of $150–180K for an 8x Tesla V100 server plus power, cooling, and networking.
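To put that reserved figure in perspective, a quick break-even sketch against on-demand usage is instructive. The on-demand rate below is an illustrative placeholder, not a quoted AWS price; substitute the actual rate from the pricing table above:

```python
# Rough break-even between on-demand hours and a 12-month all-upfront
# reservation. ON_DEMAND_PER_HOUR is a stand-in value only; plug in the
# real Tokyo on-demand rate from the table above.
RESERVED_12MO_UPFRONT = 176_601   # USD, p3.16xlarge, per the AWS list
ON_DEMAND_PER_HOUR = 40.0         # USD/hr, illustrative placeholder

HOURS_PER_YEAR = 365 * 24         # 8,760

breakeven_hours = RESERVED_12MO_UPFRONT / ON_DEMAND_PER_HOUR
utilization = breakeven_hours / HOURS_PER_YEAR

print(f"Break-even: {breakeven_hours:,.0f} on-demand hours "
      f"(~{utilization:.0%} utilization over the year)")
```

The point of the exercise: if your training jobs keep the instance busy well below the break-even utilization, on-demand (or spot) pricing wins; above it, the reservation or an on-premises server starts to make sense.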
Make no mistake, NVIDIA is shipping the Tesla V100. As AWS launches the p3.2xlarge, p3.8xlarge, and p3.16xlarge instances with 1, 4, and 8 GPUs respectively, we expect Google and others to follow suit in short order.
The bigger implication is that NVIDIA has been shipping Volta for some time now, and we are well past the point where consumers get a new architecture first. We also wonder at what point NVIDIA will have its architectures diverge. The Tensor Core is a big deal from what we are seeing on the deep learning side, but it is far less useful in a gaming scenario.
You can read more about the AWS announcement here.