Hi there, I would like to switch to the Fusion file system in my existing AWS Batch infrastructure configuration (which includes specific IAM roles with appropriate policies, custom AMIs with and without EBS-autoscaling, compute environments and job queues able to use Spot or On-Demand EC2 instances, etc.), but I don't fully understand how to change all this to make it compatible with Fusion. Are there specific instances I should use? And what should I do about EBS-autoscaling? It is also not obvious to me how to place the tmp directory on the NVMe storage. Thank you for your help!
If you use Fusion and set process.scratch = false, then you don't need EBS-autoscaling.
You can use Fusion with EBS or with NVMe disks (the recommended way).
- With EBS: you only need to provide a good enough EBS boot disk. What counts as good enough depends on the load and the kind of pipelines you run, but I'd recommend testing with an EBS gp3 disk of ~100 GB and ~325 MB/s throughput.
- With NVMe: highly recommended for cost efficiency (especially when using big files). You then need three things:
  1. Use EC2 families with NVMe disks (the c6id, m6id and r6id families are recommended).
  2. Format and mount all disks in the EC2 launch template onto a host path (e.g. /mynvmedisks).
  3. Mount that host path inside the containers as /tmp (e.g. aws.batch.volumes = '/mynvmedisks:/tmp').
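The settings mentioned so far could be collected in a Nextflow config along these lines. This is a minimal sketch, not a tested setup: `fusion.enabled` and `wave.enabled` are the standard Nextflow options for turning Fusion on (they are not spelled out in this thread), `/mynvmedisks` is only the example host path used above, and the file name `fusion.config` is arbitrary. It is written as a shell snippet so the file can be generated in one step.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Hypothetical output file name; pass it to Nextflow with `-c fusion.config`.
CONFIG_FILE="${CONFIG_FILE:-fusion.config}"

cat > "$CONFIG_FILE" <<'EOF'
// Assumption: Fusion is enabled together with Wave, as in the Nextflow documentation.
fusion.enabled = true
wave.enabled = true

// No task scratch directory, so EBS-autoscaling is not needed.
process.scratch = false

// Mount the host path holding the NVMe disks as /tmp inside the containers.
aws.batch.volumes = '/mynvmedisks:/tmp'
EOF

echo "Wrote $CONFIG_FILE"
```

The host path on the left of `aws.batch.volumes` has to match whatever directory the launch template mounts the NVMe disks on.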
Thank you very much @jordeu for the reply, that makes it much clearer! Can I just ask what you mean in point 2 when you say to format all disks? Do you mean running something like the following in the launch template?
sudo file -s /dev/nvme1n1
sudo mkfs -t xfs /dev/nvme1n1
sudo mkdir /mynvmedisks
sudo mount /dev/nvme1n1 /mynvmedisks
Thanks!
Here is a more complete example that can handle multiple NVMe disks:
mkdir -p /scratch/fusion
# Collect the device paths of all NVMe instance store disks
NVME_DISKS=($(nvme list | grep 'Amazon EC2 NVMe Instance Storage' | awk '{ print $1 }'))
NUM_DISKS=${#NVME_DISKS[@]}
if (( NUM_DISKS > 0 )); then
    if (( NUM_DISKS == 1 )); then
        # Single disk: format and mount it directly
        mkfs -t xfs "${NVME_DISKS[0]}"
        mount "${NVME_DISKS[0]}" /scratch/fusion
    else
        # Several disks: join them into one LVM logical volume, then format and mount it
        pvcreate "${NVME_DISKS[@]}"
        vgcreate scratch_fusion "${NVME_DISKS[@]}"
        lvcreate -l 100%FREE -n volume scratch_fusion
        mkfs -t xfs /dev/mapper/scratch_fusion-volume
        mount /dev/mapper/scratch_fusion-volume /scratch/fusion
    fi
fi
# Make the mount point writable for the containers
chmod a+w /scratch/fusion
You need to install nvme-cli for this to work: yum install nvme-cli
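One detail when placing a script like this in the launch template: AWS Batch expects launch template user data in MIME multi-part archive format. A minimal sketch of that wrapper, generated from the shell, might look as follows. The boundary string, the file name `userdata.txt`, and the shortened body are placeholders; the full disk setup script from this thread would go where the comment indicates.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Hypothetical output file; its contents go into the launch template's user data.
USERDATA_FILE="${USERDATA_FILE:-userdata.txt}"

cat > "$USERDATA_FILE" <<'EOF'
MIME-Version: 1.0
Content-Type: multipart/mixed; boundary="==MYBOUNDARY=="

--==MYBOUNDARY==
Content-Type: text/x-shellscript; charset="us-ascii"

#!/bin/bash
# Placeholder body: install the NVMe CLI, then run the disk setup shown above.
yum install -y nvme-cli
mkdir -p /scratch/fusion
# ... format and mount the NVMe disks on /scratch/fusion here ...
chmod a+w /scratch/fusion

--==MYBOUNDARY==--
EOF

echo "Wrote $USERDATA_FILE"
```

With this mount point, the matching Nextflow setting would be aws.batch.volumes = '/scratch/fusion:/tmp'.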