Cloud instance performance - running as a cluster?
Posted: Fri Feb 09, 2024 9:07 am
Hello
We are evaluating running Syncovery in the AWS cloud. Currently we have a Docker instance running on a 4 vCPU / 4 GB Fargate task for testing.
Last night I started a test between an AWS S3 bucket and a GCS bucket, with 220 files totalling 4 TB.
The job is running 5 files in parallel
The transfer speed runs around 80-90 MBytes/s, which leaves us with about 50 hours for that big batch of transfers.
Unfortunately this performance is too slow for our objectives, as we need to transfer continuously ~200 files / 4 TB per day.
Looking at CPU and memory usage, everything is under control: less than 20% of the 4 GB of memory is used, and CPU stays below 50% most of the time, with some peaks at 75%.
We are looking for ways to increase the performance.
We can't simply run multiple independent Syncovery instances: since we are copying from 1 source to 1 destination, multiple instances running the same job definition would risk transferring the same files in each instance, creating duplicates and making the problem worse instead of solving it.
So we are wondering whether it is possible to run multiple Syncovery transfer agents in a cluster, under the control of a single job scheduler that distributes the copies across those agents.
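To illustrate the kind of split we have in mind (this is just a sketch of the idea, not anything Syncovery provides as far as we know): each instance could deterministically hash the file names it sees and only claim the files belonging to its own shard, so no coordination between instances is needed and no file is ever transferred twice. The `shard_for` function and the worker-ID scheme below are entirely hypothetical:

```python
import hashlib

def shard_for(filename: str, num_workers: int) -> int:
    # Deterministic: every instance computes the same shard for a given
    # file name, so two instances can never claim the same file.
    digest = hashlib.sha256(filename.encode("utf-8")).hexdigest()
    return int(digest, 16) % num_workers

# Each instance would be started with its own worker id (0..num_workers-1)
# and only transfer the files whose shard matches that id.
files = ["video_001.mxf", "video_002.mxf", "video_003.mxf", "video_004.mxf"]
my_worker_id = 2
mine = [f for f in files if shard_for(f, num_workers=4) == my_worker_id]
```

With something like this, scaling up or down would only mean changing `num_workers` consistently across all instances before a run.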
The other alternative is to try running on a big EC2 instance with a 10 Gb network interface and 1 TB of temp storage, but that would be less flexible than a cluster we could scale up and down as needed.
Thanks for any suggestion