As you know, Azure came out with Traffic Manager, which is essentially another load-balancing layer on top of its standard round-robin load balancer.
Before Traffic Manager, one workaround was to handle the RoleEnvironment.StatusCheck event to get the latest state of your service and call SetBusy to take a particular instance out of the load balancer rotation for 10 seconds. Similarly, there were a few other workarounds that required you to keep track of what your VMs were doing and redirect traffic accordingly, to prevent any one instance from getting bogged down and becoming unresponsive.
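To make the SetBusy idea concrete, here is a toy Python model of a round-robin rotation that skips any instance that has flagged itself busy. This is only a sketch of the concept, not Azure code: the Instance class, the busy flag, and the dispatch function are made-up names for illustration, and the real load balancer's internals are not described here.

```python
from itertools import cycle


class Instance:
    """Hypothetical stand-in for a role instance (not an Azure type)."""

    def __init__(self, name):
        self.name = name
        self.busy = False  # assumption: plays the role of calling SetBusy
        self.handled = 0


def dispatch(instances, request_count):
    """Hand out requests round robin, skipping instances that report busy."""
    rotation = cycle(instances)
    for _ in range(request_count):
        # Try each instance at most once per request.
        for _ in range(len(instances)):
            candidate = next(rotation)
            if not candidate.busy:
                candidate.handled += 1
                break
        else:
            raise RuntimeError("all instances are busy")
    return {instance.name: instance.handled for instance in instances}


instances = [Instance("web_0"), Instance("web_1"), Instance("web_2")]
instances[1].busy = True  # web_1 has taken itself out of rotation
result = dispatch(instances, 6)
print(result)
```

With web_1 marked busy, the six requests are split between the other two instances, which is the effect the workaround was after.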
But there is still no load-balancing technique that actually balances by a load metric such as CPU usage.
So what happens in the following case?
Normally, if you have multiple requests coming in from the same IP address and have "Keep Alive" turned on, all those requests would be sent to just one instance, right?
Yes. This round-robin technique might be OK for a standard website, but not for an application that receives tons of data from many single-point sources and must be optimized for speed.
Turning keep-alive off would solve the problem, but each request would then have to establish a new connection, and the performance hit is about tenfold.
The good news is that the Traffic Manager round-robin pattern behaves very differently here. For some reason we get the performance gain of having "Keep Alive" on, and the round robin is dispersed by request and not by source location, which basically simulates a CPU-usage-based load-balancing technique. Hooray!
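The difference between the two behaviors is easy to see in a small simulation. The Python sketch below is a toy model under stated assumptions, not a description of how Azure implements either layer: the instance names and both functions are hypothetical. It compares round robin applied once per keep-alive connection with round robin applied to every request.

```python
from itertools import cycle

# Hypothetical instance names, for illustration only.
INSTANCES = ["instance_0", "instance_1", "instance_2"]


def per_connection(clients, requests_per_client):
    """Round robin per connection: a keep-alive client stays on one instance."""
    rotation = cycle(INSTANCES)
    load = {name: 0 for name in INSTANCES}
    for _ in range(clients):
        target = next(rotation)  # chosen once, when the connection opens
        load[target] += requests_per_client
    return load


def per_request(clients, requests_per_client):
    """Round robin per request: every request can land on a different instance."""
    rotation = cycle(INSTANCES)
    load = {name: 0 for name in INSTANCES}
    for _ in range(clients * requests_per_client):
        load[next(rotation)] += 1
    return load


# One heavy sender pushing 900 requests over a single keep-alive connection.
sticky = per_connection(clients=1, requests_per_client=900)
spread = per_request(clients=1, requests_per_client=900)
print("per connection:", sticky)
print("per request:   ", spread)
```

In the per-connection case the single sender lands all 900 requests on one instance, while in the per-request case each instance gets 300. With only a handful of heavy senders, that per-request spread is what keeps any one instance from being swamped.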