I gave a talk on using public cloud to host grid computing/HPC workloads. The elastic, on-demand nature of public cloud is a great fit for spiky workloads like grid computing. I’d had some fun building an autoscaling MATLAB HPC cluster (scaling out and back in) and talked about it at a breakfast briefing with Hentsu.
I ran a project to deploy an HPC cluster using on-demand AWS Elastic Compute Cloud (EC2) resources. The HPC cluster gives researchers the compute resources to quickly run mathematical simulations across very large datasets. This deployment replaced aging on-premises HPC hardware and was an opportunity to trial AWS in a hybrid cloud configuration.
High security implementation:
- One-way firewall rules between the company network and AWS (the company connects out to AWS resources; AWS resources can’t connect in)
- Encryption of data in transit and at rest
- AWS Direct Connect connecting the company to AWS
I hit a stumbling block when adding compute nodes to the new HPC cluster. You may see the following errors when joining compute nodes to the cluster during a Microsoft HPC Pack 2012 R2 deployment:
HPC Node Manager Service unreachable
and
System.Runtime.Remoting.RemotingException: An error occurred while processing the request on the server: System.Runtime.Remoting.RemotingException: User identity is not authorized to connect to this endpoint
The solution is to add your installation credentials to the administrators group before installing HPC Pack on the compute nodes.
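As a rough illustration of that fix, here is a minimal Python sketch that builds (and optionally runs) the standard Windows `net localgroup` command to add an account to a group. It assumes the group in question is the local Administrators group on each compute node, which the post does not spell out. The account name `CORP\hpc-install` and both function names are hypothetical placeholders, and the script defaults to a dry run that only returns the command.

```python
import subprocess
import sys
from typing import List


def build_add_admin_command(account: str) -> List[str]:
    """Build the `net localgroup` command that adds `account` to the
    local Administrators group (assumed to be the group that matters)."""
    if not account.strip():
        raise ValueError("account must not be empty")
    return ["net", "localgroup", "Administrators", account, "/add"]


def add_to_local_admins(account: str, dry_run: bool = True) -> List[str]:
    """Return the command; execute it only on Windows when dry_run is False.

    Running it for real needs an elevated (administrator) session on the
    compute node, before HPC Pack is installed there.
    """
    command = build_add_admin_command(account)
    if dry_run or sys.platform != "win32":
        return command
    subprocess.run(command, check=True)
    return command


if __name__ == "__main__":
    # Hypothetical install account; replace with your own credentials.
    print(" ".join(add_to_local_admins(r"CORP\hpc-install")))
```

Calling `add_to_local_admins(account, dry_run=False)` on each compute node before starting the HPC Pack installer would apply the change. In practice the same thing is often done through Group Policy or a one-line command in the node's bootstrap script.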