Scalr.net Performance Improvements
If you used Scalr.net over the last few days, you probably ran into some MySQL connection errors. These were caused by a very large number of concurrent connections, typically when a large farm (100+ instances) is launched: the launch triggers too many requests too quickly, before Scalr can react to the load.
This was more of an architectural flaw, so we worked to reduce the number of requests every instance makes. In many cases we got it down to a single request. For example, when an instance requests the list of instances of a role, it now gets all the information for all roles in the single initial request. The same goes for the config_opts queries from instances, which we optimized in the same way: rebuilding /etc/aws/hosts now takes a single request instead of 5.
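To illustrate the idea, here is a rough Python sketch of rebuilding a hosts file from one combined response instead of one request per role. It is not Scalr's actual code: the payload shape, the `fetch_all_roles` stand-in, and the `role-N` hostname format are all assumptions made for the example.

```python
from typing import Dict, List

# Hypothetical payload shape: role name -> list of internal IPs.
RolesPayload = Dict[str, List[str]]


def fetch_all_roles() -> RolesPayload:
    """Stand-in for the single combined request; returns canned data.

    A real instance would make one HTTP call here instead of one call
    per role (assumption: the response covers every role in the farm).
    """
    return {
        "app": ["10.0.0.11", "10.0.0.12"],
        "mysql": ["10.0.0.21"],
    }


def render_hosts(roles: RolesPayload) -> str:
    """Render hosts-file lines for every role from one payload."""
    lines = []
    for role in sorted(roles):
        # Hostname format "<role>-<n>" is made up for this sketch.
        for index, ip in enumerate(roles[role], start=1):
            lines.append(f"{ip}\t{role}-{index}")
    return "".join(line + "\n" for line in lines)


if __name__ == "__main__":
    # One fetch, then everything else is local work on the instance.
    print(render_hosts(fetch_all_roles()), end="")
```

The point of the sketch is only the shape of the exchange: the instance asks once, then does all the formatting locally, so a farm of 100+ instances costs roughly one request each at launch rather than several.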
Next, we tuned MySQL to handle thousands of connections, adjusted over 100 other settings and sysctl options, and then optimized our database structure (including adding new indexes).
We also moved some work higher up the stack to nginx, so that it is served faster.
We took the opportunity to rewrite the client dashboard so that logs load instantly. You'll notice this when you first log in.
The bottom line is that things are faster for you and put less load on us.
October 14th, 2009 - 14:10
Good to hear!
Would you share some of your my.cnf? Curious about your MySQL fine-tuning.
Jonathan
October 22nd, 2009 - 02:25
There’s some security risk to sharing the information, sorry.