CapeSym
Cloudformation can be used to launch a simple cluster of xSYMMIC in the Cloud instances on Amazon Web Services.
AWS Cloudformation Service provides an easy way to create and launch Amazon EC2 resources through a JSON-formatted template. The template creates a Virtual Private Cloud (VPC) that defines the contiguous block of IP addresses for a private network inside the cloud. Each subnet of the VPC resides within a single availability zone (e.g. us-east-1a), but a VPC can span multiple zones within a region (e.g. us-east-1). For a cluster based on xSYMMIC in the Cloud, all instances are assigned private IP addresses within a single subnet with 8 bits of host address (a /24 block). For example, for a subnet located at 10.0.0.0, private IP addresses 10.0.0.4 through 10.0.0.254 are available, because AWS reserves the first four addresses and the last address of every subnet.
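The address range quoted above can be checked with a short sketch using Python's standard ipaddress module. It assumes the AWS convention that the first four addresses and the last address of every subnet are reserved; the function and constant names are only illustrative.

```python
import ipaddress

# Assumption: AWS reserves the first four addresses and the last address
# of every subnet, so they cannot be assigned to instances.
AWS_RESERVED_HEAD = 4
AWS_RESERVED_TAIL = 1


def usable_addresses(cidr):
    """Return the addresses in a subnet that remain after the AWS reservations."""
    network = ipaddress.ip_network(cidr)
    all_addresses = list(network)  # every address, including network and broadcast
    return all_addresses[AWS_RESERVED_HEAD:len(all_addresses) - AWS_RESERVED_TAIL]


addresses = usable_addresses("10.0.0.0/24")
print(addresses[0], addresses[-1], len(addresses))  # 10.0.0.4 10.0.0.254 251
```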
The public IP address for one of the instances may be used to administer the cluster from the user's desktop through a Secure Shell (SSH) connection. A security group is used to block all inbound traffic except for SSH access from the user's desktop. A placement group is used to place instances together on the underlying hardware to achieve low network latency and high network throughput. A local network file system provides the file input/output for cluster computations.
The Cloudformation launch template makes it easy to set up all these resources. The launch template not only creates the VPC, subnet, security group, and placement group, but also runs scripts on each instance that find the other members of the cluster and configure the files for communicating between them.
From the AWS console (console.aws.amazon.com), go to Services >
Management & Governance > Cloudformation, which will bring up the
Cloudformation console. NOTE: The AWS console must be set to one of the
U.S. regions where SYMMIC in the Cloud is available, and you must have already subscribed to
the SYMMIC in the Cloud product
(xSYMMIC in the Cloud)
in order to use Cloudformation to launch a SYMMIC cluster.
To begin creating a cluster, click on Create stack.
The first step in creating the Cloudformation cluster is to upload or provide a link to the template file.
CapeSym provides an example template which may be accessed from our S3 bucket:
https://s3.amazonaws.com/symmic-cloudformation-templates/SYMMIC-AWScluster.json
or downloaded by current customers from the downloads area on our website.
This example template will allow you to launch a cluster of instances that may then
be accessed from the Remote Run dialog in the SYMMIC GUI to run simulations in the cloud.
In the next step, you must enter a unique name for the Cloudformation stack you are creating.
There are a number of additional details that must be specified on this page. These are all parameters defined in the template. The first parameter is to select an availability zone within the current region. Not all instance types are available in every zone, so the choice should consider the desired instance type as well.
The cluster will consist of a single "master" node and multiple "compute" nodes. SYMMIC will access just one of the instances to run a simulation, typically the "master" node. Within the VPC, the template defines a subnet of 256 addresses. Instances are assigned private IP addresses in this subnet. Up to 251 instances can be assigned to the subnet, so there can be up to 250 compute nodes in addition to the master node.
Every instance in the cluster will be of the type specified in InstanceType. NOTE: Your AWS account has specific limits on how many instances of each type you can run simultaneously, so check your Limits from the EC2 Dashboard to make sure AWS will allow you to create the cluster.
To provide access to the cluster, you must select an existing public-private key pair from your AWS account. All existing key pairs for the current region will be listed in the pull-down list. If you have not yet defined any key pairs, you will need to do so before you can use the Cloudformation template.
The first of the final three parameters is the size of the temporary storage volume used by the instances. This volume only needs to hold the input and output files for the simulation run(s), so a large volume is not usually needed. The default size of 256 GB is sufficient for most problems. One exception may be when running a long transient simulation on a very large mesh.
You will access the cluster from your desktop using the SSH protocol with public-private key pair authentication. You specify the IP addresses that are allowed to access the cluster in the SSHLocation parameter. The entered value must be in CIDR block format. For example, 0.0.0.0/0 would give access to the entire internet, while 203.0.113.0/24 would allow only the 256 addresses at 203.0.113.x to access the cluster. We recommend giving access only to your desktop. To do this, search the web for "My IP address" and enter that value followed by "/32" to make the CIDR block consist of only one address.
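As a rough illustration of how CIDR blocks behave, the sketch below builds a single-address block and tests whether a given address falls inside a block. The helper names are hypothetical, and the 203.0.113.x addresses come from a range reserved for documentation; substitute your own public IP address.

```python
import ipaddress


def single_host_cidr(ip):
    """Return the /32 CIDR block that contains only the given IPv4 address."""
    address = ipaddress.IPv4Address(ip)  # raises ValueError on malformed input
    return str(ipaddress.ip_network(f"{address}/32"))


def is_allowed(client_ip, ssh_location):
    """Check whether client_ip lies inside the CIDR block ssh_location."""
    return ipaddress.ip_address(client_ip) in ipaddress.ip_network(ssh_location)


desktop_block = single_host_cidr("203.0.113.25")
print(desktop_block)                                  # 203.0.113.25/32
print(is_allowed("203.0.113.25", desktop_block))      # True
print(is_allowed("203.0.113.26", desktop_block))      # False
print(is_allowed("203.0.113.26", "203.0.113.0/24"))   # True
print(is_allowed("203.0.113.26", "0.0.0.0/0"))        # True
```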
The final parameter is the address of the VPC. Each VPC you create should use a different /24 network address, and this address must come from one of the three ranges reserved for private networks (10.0.0.0/8, 172.16.0.0/12, or 192.168.0.0/16). By default, the VPC will be created in 10.0.0.0/24. If you create a second cluster, you should give it a different address, such as 10.0.1.0. This parameter is not a CIDR block; the "/24" will be appended automatically.
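A candidate VPC address can be sanity-checked before it is entered. The following is a minimal sketch, assuming the three private ranges are those defined in RFC 1918 and that the template appends "/24" as described; vpc_cidr is a hypothetical helper, not part of the template.

```python
import ipaddress

# The three address ranges reserved for private networks (RFC 1918).
PRIVATE_RANGES = [
    ipaddress.ip_network(cidr)
    for cidr in ("10.0.0.0/8", "172.16.0.0/12", "192.168.0.0/16")
]


def vpc_cidr(address):
    """Append /24 to a VPC address and verify that it is a private /24 block."""
    # ip_network raises ValueError if host bits are set (e.g. 10.0.1.5).
    network = ipaddress.ip_network(f"{address}/24")
    if not any(network.subnet_of(private) for private in PRIVATE_RANGES):
        raise ValueError(f"{network} is not inside a private address range")
    return str(network)


print(vpc_cidr("10.0.0.0"))  # 10.0.0.0/24
print(vpc_cidr("10.0.1.0"))  # 10.0.1.0/24
```

Calling vpc_cidr with a public address such as 8.8.8.0, or with an address that is not on a /24 boundary, raises ValueError.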
There is no need to configure any stack options, so you can just skip the next page and scroll
to the bottom of the Review page of the launch wizard.
Press Create stack at the bottom of this page to actually create the cluster.
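The same launch can in principle be scripted instead of clicked through. The sketch below uses boto3, the third-party AWS SDK for Python, and assumes that AWS credentials are already configured. Only the parameter names InstanceType and SSHLocation appear in the text above; any other required keys (key pair, availability zone, and so on) would have to be read from the template itself. The stack name, region, and parameter values are placeholders.

```python
TEMPLATE_URL = (
    "https://s3.amazonaws.com/symmic-cloudformation-templates/"
    "SYMMIC-AWScluster.json"
)


def build_parameters(values):
    """Convert a plain dict into the list-of-dicts form used by create_stack."""
    return [
        {"ParameterKey": key, "ParameterValue": str(value)}
        for key, value in values.items()
    ]


def create_cluster_stack(stack_name, values, region="us-east-1"):
    """Start stack creation and return the new stack's ID."""
    import boto3  # imported here so build_parameters works without boto3

    client = boto3.client("cloudformation", region_name=region)
    response = client.create_stack(
        StackName=stack_name,
        TemplateURL=TEMPLATE_URL,
        Parameters=build_parameters(values),
    )
    return response["StackId"]


# Placeholder values; template parameters not listed here keep their defaults.
params = build_parameters(
    {"InstanceType": "r5.24xlarge", "SSHLocation": "203.0.113.25/32"}
)
print(params)
```

Nothing is launched until create_cluster_stack is called explicitly, which keeps the example safe to run as written.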
Once the creation process has begun, the creation events will be visible in the console, as shown below. You may need to press the refresh button occasionally to monitor the progress. If an error occurs during stack creation, the error message will be displayed in the Events tab and the partially created resources will be automatically rolled back.
When the entire stack has been created, the final event will be the stack name followed by CREATE_COMPLETE.
After the stack has been successfully created, the Outputs tab will be filled from the outputs specified in the Cloudformation template. Here you will find a reminder of all of the important cluster parameters. Of particular interest will be the Public IP address of the master node. Copy this address to the clipboard for pasting into the Remote Run dialog.
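For scripted workflows, the Outputs tab has an equivalent in the describe_stacks response. This is a sketch only: the output key MasterPublicIP is a made-up name for illustration (the real keys are defined by the template), and the sample response below is hand-written rather than captured from AWS.

```python
def stack_outputs(describe_stacks_response):
    """Flatten the Outputs of the first stack in a describe_stacks response."""
    stack = describe_stacks_response["Stacks"][0]
    return {
        item["OutputKey"]: item["OutputValue"]
        for item in stack.get("Outputs", [])
    }


def wait_and_fetch_outputs(stack_name, region="us-east-1"):
    """Block until the stack reaches CREATE_COMPLETE, then return its outputs."""
    import boto3  # third-party AWS SDK; requires configured credentials

    client = boto3.client("cloudformation", region_name=region)
    client.get_waiter("stack_create_complete").wait(StackName=stack_name)
    return stack_outputs(client.describe_stacks(StackName=stack_name))


# Hand-written sample with a hypothetical output key and a documentation address.
sample_response = {
    "Stacks": [
        {
            "StackName": "demo",
            "StackStatus": "CREATE_COMPLETE",
            "Outputs": [
                {"OutputKey": "MasterPublicIP", "OutputValue": "198.51.100.7"}
            ],
        }
    ]
}
outputs = stack_outputs(sample_response)
print(outputs["MasterPublicIP"])  # 198.51.100.7
```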
Open the SYMMIC GUI on your desktop, open a template, and then open the Remote Run dialog. Paste the IP address from the Cloudformation stack Outputs tab into Host IP Address of the dialog. Under Remote Machine Type select the AWS Cluster radio button. This should immediately update the dialog with the correct username and working directory. Then enter the private key file name for SSH access to the cluster. You may now press the Connect button to establish the SSH connection with the master node of the cluster.
Whenever an SSH connection is established with an AWS cluster, the Remote Run dialog will automatically upload the private key to the master node and download the list of private IP addresses of all the nodes. The IP addresses of the cluster are stored locally in a file called ipaddresses, which is automatically added to the dialog's hostfile edit box. This file can then be used as the hostfile for the cluster. Select the MPI parameters appropriate for the cluster. For example, if the cluster consists of 4 instances of the r5.24xlarge type with 48 cores each, you can easily spread 64 processes over the 4 machines by assigning 16 processes per machine, as shown in the figure below. Once these parameters are specified, simply press the Launch button to start the remote command execution.
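The arithmetic behind the process layout can be written out as a small helper. This is only a sketch: the requirements that the processes divide evenly across machines and that no machine gets more processes than cores are assumptions made here for illustration, not rules stated by SYMMIC.

```python
def processes_per_machine(total_processes, machines, cores_per_machine):
    """Return how many MPI processes each machine gets in an even split."""
    per_machine, remainder = divmod(total_processes, machines)
    if remainder:
        # Assumption: an uneven split is rejected rather than rounded.
        raise ValueError("processes do not divide evenly across machines")
    if per_machine > cores_per_machine:
        # Assumption: oversubscribing the cores is treated as an error.
        raise ValueError("more processes per machine than available cores")
    return per_machine


# 64 processes over 4 machines with 48 cores each
print(processes_per_machine(64, 4, 48))  # 16
```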
By default, when running on an AWS Cluster, the command option to use the PETSc libraries will be set, so pressing the Launch button will show that the xSYMMIC command is issued with the -usePETSc flag. To use a different solver, change the Options in the dialog before pressing the Launch button.
When the solution has been computed, it will automatically be downloaded to your desktop as shown in this figure.
After the solution has been downloaded, you may use Load solution from the Solve menu to display and examine the results. Don't forget to return to the Cloudformation console to Delete the Stack when you are done using it. As long as the cluster still exists you are being billed by AWS, even if you are not running any simulations.
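Deleting the stack can also be scripted so that it is harder to forget. The sketch below assumes a boto3 Cloudformation client is passed in; the function name and the stack name in the usage comment are placeholders.

```python
def delete_stack(client, stack_name):
    """Request deletion of a stack and block until the deletion has finished."""
    client.delete_stack(StackName=stack_name)
    # The waiter polls the stack status until it reports DELETE_COMPLETE.
    client.get_waiter("stack_delete_complete").wait(StackName=stack_name)


# Example usage (requires boto3 and configured AWS credentials):
#   import boto3
#   delete_stack(boto3.client("cloudformation", region_name="us-east-1"),
#                "my-symmic-cluster")
```

Passing the client in as an argument keeps the helper easy to exercise with a stand-in object before pointing it at a real account.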