Klaus' Log

Mon 21 September 2020

Searching for Mersenne primes with Amazon Web Services

Posted by Klaus Eisentraut in howto   

Since around 2008 I have sporadically contributed to the Great Internet Mersenne Prime Search (GIMPS) project. Recently, I wanted to extend my AWS skills, in particular with EC2 instances, which I had never used before. Therefore, I set myself the goal of checking one Mersenne prime candidate for primality on an AWS EC2 compute instance.

GIMPS software

The software for GIMPS can be downloaded from their website. To contribute to GIMPS, you usually get a candidate for a prime and then run a long primality test. The calculation itself does not require an internet connection and can be interrupted at any time. Such a workload is therefore ideal for EC2 spot instances: it tolerates interruption and should run at the lowest possible cost.

In this article, I decided not to test a previously untested candidate. Those are currently somewhere in the exponent range of 2^99,XXX,XXX-1 (roughly two to the power of 100 million) and take up to a few weeks of calculation. Instead, I used the second work type: verifying a previous result, a DoubleCheck in GIMPS terminology. The double-checks lag many years behind the first-time tests because fewer people want to do them. A double-check is also less likely to find a new Mersenne prime, because it can only do so if the previous run had an error in it. However, for testing the setup it is the better choice: the exponent is smaller (currently in the 2^55,XXX,XXX-1 range), so much less calculation is required and the primality test finishes in a few days instead of a few weeks.
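To give a feeling for what the "long primality test" computes, here is a minimal sketch of the Lucas-Lehmer test, the classic test for Mersenne numbers. It is only a toy: the function name lucas_lehmer is made up for this example, it is valid only for odd prime exponents, and it is limited to exponents up to 31 so that all intermediate values fit into 64-bit shell integers. The real GIMPS software uses FFT-based multiplication to handle exponents in the tens of millions.

```bash
#!/bin/bash
# Toy Lucas-Lehmer test: M_p = 2^p - 1 is prime exactly when s_(p-2) == 0,
# where s_0 = 4 and s_(i+1) = s_i^2 - 2 (mod M_p).
# Only valid for odd prime p; limited to p <= 31 so that s*s fits in 64 bits.
lucas_lehmer() {
    local p=$1
    local m=$(( (1 << p) - 1 ))
    local s=4
    local i=0
    while [ "$i" -lt $(( p - 2 )) ]; do
        # "+ m" keeps the intermediate value non-negative when s is 0 or 1
        s=$(( (s * s - 2 + m) % m ))
        i=$(( i + 1 ))
    done
    if [ "$s" -eq 0 ]; then echo prime; else echo composite; fi
}

for p in 3 5 7 11 13; do
    echo "2^$p-1: $(lucas_lehmer "$p")"
done
```

The loop reports 2^11-1 as composite (2047 = 23 * 89) and the other four as prime. Note that the number of iterations is p-2, which is why the runtime grows so quickly with the exponent.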

Preparing AWS permissions

We will be using us-east-2 because it seems to be the cheapest region for c5 EC2 spot instances. This can change in the future, so please use the AWS Spot Instance Advisor to check for yourself. Note that we can choose the AWS region and availability zone freely, because low network latency is not required at all for this type of calculation.

In the first step, I created an IAM user named mersenne for this task and generated access keys for it. Because I use more than one IAM user with the AWS CLI, I created a new profile in the local AWS CLI configuration on my laptop:

$ cat ~/.aws/credentials

[mersenne]
aws_access_key_id = ABCDEFGHI1234567833X
aws_secret_access_key = sOgFzBZoaGx...redacted....rOq8o/rXzjopmQB

$ cat ~/.aws/config

[profile mersenne]
region = us-east-2

The second step was to create an S3 bucket named klaus-mersenne in the us-east-2 region and to grant the IAM user full access to it. To do this, I created the following policy, s3-access-for-klaus-mersenne, and attached it directly to the IAM user:

{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "VisualEditor0",
            "Effect": "Allow",
            "Action": [
                "s3:*"
            ],
            "Resource": [
                "arn:aws:s3:::klaus-mersenne",
                "arn:aws:s3:::klaus-mersenne/*"
            ]
        },
        {
            "Sid": "VisualEditor1",
            "Effect": "Allow",
            "Action": "s3:ListAllMyBuckets",
            "Resource": "*"
        }
    ]
}

This gives the IAM user mersenne full access to its bucket, but this is not enough. I also attached the AWS managed policy AmazonEC2FullAccess to the IAM user. To be able to use aws sts decode-authorization-message for debugging purposes, I created a new policy, sts-read-access, and attached it, too:

{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "VisualEditor0",
            "Effect": "Allow",
            "Action": [
                "sts:GetSessionToken",
                "sts:GetFederationToken",
                "sts:DecodeAuthorizationMessage",
                "sts:GetAccessKeyInfo",
                "sts:GetCallerIdentity",
                "sts:GetServiceBearerToken"
            ],
            "Resource": "*"
        }
    ]
}

Configuring mprime

Next, I downloaded mprime from the GIMPS download page and ran mprime -m once on my local computer to get an assignment and configure everything locally. I configured mprime to get an LL DC (Lucas-Lehmer double-check) assignment, not to receive any CERT work, and to quit immediately after finishing the current assignments (option NoMoreWork=1). Then I used aws s3 cp --recursive . s3://klaus-mersenne/ to copy everything to the bucket. That was all the preparation needed, and now I was able to actually start the number crunching!
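Before uploading, it can be worth checking that the NoMoreWork=1 option really ended up in the configuration, because otherwise the instance would fetch new assignments and keep running. The following is a small sketch of such a check. It assumes that mprime stores the option in a file named prime.txt in its working directory; the function name check_no_more_work is made up for this example.

```bash
#!/bin/bash
# Print "ok" if the given config file contains NoMoreWork=1, else "missing".
# Assumption: mprime keeps this setting in prime.txt in its working directory.
check_no_more_work() {
    local file=$1
    if grep -q '^NoMoreWork=1' "$file" 2>/dev/null; then
        echo ok
    else
        echo missing
    fi
}

# Example call; prints "missing" when prime.txt does not exist or lacks the option.
check_no_more_work prime.txt
```

If the check prints "missing", adding the line to the file by hand before the upload should be enough.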

Starting the EC2 spot instance

I wrote a little bash script which downloads the mprime executable and its save files from S3 when the instance starts. It also stops mprime and backs everything up when an EC2 spot termination notice arrives or the maximum runtime is reached. The script is fairly self-explanatory and printed below:

$ cat startup.sh

#!/bin/bash

BUCKET=klaus-mersenne        # S3 bucket holding mprime and its save files
MAXTIME=$((24*60*60))        # maximum runtime in seconds (24 hours)
TIME=0                       # seconds elapsed so far

# Stop mprime, upload everything except the binaries to S3, then shut down.
function saveprogressandend {
    cd /root/mersenne
    killall mprime
    sleep 10                 # give mprime time to write its save files
    ls | grep -v mprime | grep -v libgmp | while read -r i; do
        aws s3 cp "$i" "s3://$BUCKET/$i"
    done
    poweroff
}

mkdir -p /root/mersenne
cd /root/mersenne
aws s3 cp --recursive "s3://$BUCKET/" .

chmod +x ./mprime
./mprime -d | tee -a output.log &

while true; do
    # The metadata endpoint answers 404 until a termination notice exists.
    if [ -z "$(curl -Is http://169.254.169.254/latest/meta-data/spot/termination-time | head -1 | grep 404 | cut -d ' ' -f 2)" ]; then
        saveprogressandend
        break
    fi
    if [ "$TIME" -gt "$MAXTIME" ]; then
        saveprogressandend
        break
    fi
    # Spot instance not yet marked for termination, check again in 5s.
    sleep 5
    TIME=$((TIME+5))
done
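One weakness of the polling loop above is that anything other than a 404 counts as a termination notice, including an empty answer when curl fails. A possible refinement, sketched below, is to classify the HTTP status line explicitly and only react to a 200. The function name notice_state is made up for this example, and the sketch has not been tested on a real spot instance.

```bash
#!/bin/bash
# Classify the first header line returned by the spot termination endpoint.
#   200 -> a termination time is set      -> "pending"
#   404 -> no termination notice exists   -> "none"
#   anything else (empty, 401, 5xx, ...)  -> "unknown"
notice_state() {
    local status_line=$1
    local code
    code=$(printf '%s\n' "$status_line" | head -1 | cut -d ' ' -f 2)
    case "$code" in
        200) echo pending ;;
        404) echo none ;;
        *)   echo unknown ;;
    esac
}

# Example with a hard-coded status line instead of a live curl call.
notice_state "HTTP/1.1 404 Not Found"
```

In the loop, the output of curl -Is ... | head -1 would be passed to notice_state, and saveprogressandend would only be called for "pending". An "unknown" result (for example a 401 on instances that require session tokens for the metadata service) could simply be retried on the next iteration.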

Launch instance

Now we have everything set up and can finally launch an instance! This is done with a single long AWS CLI command, printed below:

#!/usr/bin/bash

aws --profile mersenne \
    ec2 request-spot-instances \
    --spot-price "0.0205" \
    --instance-count 1 \
    --type "one-time" \
    --launch-specification '{
"ImageId": "ami-07c8bc5c1ce9598c3",
"IamInstanceProfile" : { "Arn" : "arn:aws:iam::401234567891:instance-profile/mersenne" },
"InstanceType": "c5.large",
"Placement": { "AvailabilityZone" : "us-east-2c" },
"KeyName": "laptop-privat",
"SecurityGroupIds": [ "sg-d8cde2a9" ],
"UserData": "'$(cat startup.sh | base64 -w 0)'"}'
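The UserData field has to contain the base64 encoding of the startup script on a single line, and EC2 limits user data to 16 KB before encoding. The sketch below wraps both points in a helper. The function name encode_user_data is made up for this example, and tr -d '\n' is used instead of the -w 0 flag only because not every base64 implementation supports that flag.

```bash
#!/bin/bash
# Print the single-line base64 encoding of a file, or fail if the file
# exceeds the 16 KB (16384 bytes) limit for EC2 user data.
encode_user_data() {
    local file=$1
    local size
    size=$(wc -c < "$file") || return 1
    if [ "$size" -gt 16384 ]; then
        echo "user data too large: $size bytes" >&2
        return 1
    fi
    # Strip line breaks so the result can be embedded in the JSON string.
    base64 < "$file" | tr -d '\n'
}
```

In the launch command, $(encode_user_data startup.sh) could then take the place of $(cat startup.sh | base64 -w 0).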

This creates and launches an EC2 spot instance which runs our startup.sh script. The spot instance will shut itself down after the maximum runtime or when it receives a termination notice. Progress can be tracked on the GIMPS website, where you can see your assignments. Additionally, you can look into the output.log file in the S3 bucket, which is uploaded when the instance saves its progress.

Costs

Unfortunately, even when using the cheapest region and spot instances, there is a simpler and much cheaper way to run a primality test: just do it on the hardware which you already have.

My Acer Aspire F15 F5-573G-5371 laptop is from 2016 and has an Intel i5-6200U @ 2.30GHz CPU, which takes around 10 ms per iteration for a double check in the 57,XXX,XXX exponent range. The test needs about as many iterations as the exponent itself (around 57 million), so a double check takes between six and seven days in total. With the high electricity cost in Germany of around 30 ct/kWh and a power consumption of the whole laptop of ~33 W, the cost per hour is only about 1.0 ct. This gives a total of around 1.60 Euro for a single double-check.

On a c5.large instance I get almost identical computation speed, but the price is 2.05 ct/h. Therefore, doing the calculation on my private laptop is only half as expensive as the cheapest AWS spot instance. This calculation is a little flawed (it does not price in the cost of the laptop), but 50 percent savings are quite a lot.
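The comparison above can be reproduced with a few lines of shell and awk. This is only a back-of-the-envelope sketch: all inputs are the rounded figures from this post, 57000000 stands in for the unspecified 57,XXX,XXX exponent, and the function names are made up for this example. Like the text, it treats the cents of both prices as the same unit.

```bash
#!/bin/bash
# All inputs are rounded figures from this post; 57000000 is an assumed
# stand-in for the unspecified 57,XXX,XXX exponent.

# runtime_hours ITERATIONS MS_PER_ITERATION -> total hours, one decimal
runtime_hours() {
    awk -v n="$1" -v ms="$2" 'BEGIN { printf "%.1f\n", n * ms / 1000 / 3600 }'
}

# electricity_cost WATTS CT_PER_KWH HOURS -> total cost, two decimals
electricity_cost() {
    awk -v w="$1" -v ct="$2" -v h="$3" 'BEGIN { printf "%.2f\n", w / 1000 * ct * h / 100 }'
}

# hourly_cost CT_PER_HOUR HOURS -> total cost, two decimals
hourly_cost() {
    awk -v ct="$1" -v h="$2" 'BEGIN { printf "%.2f\n", ct * h / 100 }'
}

hours=$(runtime_hours 57000000 10)
echo "runtime: $hours h"
echo "laptop:  $(electricity_cost 33 30 "$hours")"
echo "spot:    $(hourly_cost 2.05 "$hours")"
```

With these inputs the sketch prints a runtime of 158.3 hours (about 6.6 days), a laptop cost of 1.57 and a spot cost of 3.25, which matches the rough factor of two described above.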

Therefore, I won't do primality testing on AWS in the future, but it was a nice project to learn about EC2 :)