Tuesday, June 08, 2010

Current challenges for Application Performance Engineering

Application performance engineering is a discipline encompassing expertise, tools, and methodologies to ensure that applications meet their non-functional performance requirements. Performance engineering has understandably become more complex with the rise of multi-tier, distributed application architectures that include SOA, BPM, SaaS, PaaS, cloud and others. Although performance engineering ideally should be applied across the lifecycle, we’re seeing more factors that unfortunately push it into the production phase, typically to resolve problems that have already gotten out of hand. That is clearly a tougher challenge, so how did we get to this point?

In the client-server past, performance optimization was something that folks in the IT department typically figured out through trial and error. Developers learned to write more efficient database queries, database administrators learned to index and cache, and system administrators monitored CPU and memory to upgrade when needed.

As application architectures became more complex, the dependencies increased and it became harder for any one team to track down problems without chasing its tail. More organizations adopted something that was previously used only by enterprises with highly scalable, reliable, mission-critical applications: the performance testing lab. Vendors like Mercury created popular load testing tools like LoadRunner, and organizations invested millions in lab hardware and software in an attempt to recreate production environments that they could control for testing purposes.

Unfortunately, these performance labs became very difficult to justify on cost. First, it always seemed to take too much time and money to set up the realistic test environments you’d like, particularly as apps became more distributed. Next, projects were often already behind schedule when it came time to test, so lab time often had to be cut short. Factors like these reduced the lab’s value, but the real killer was the high maintenance cost of all that hardware and software, along with the data center and staff.

This put many IT organizations in a tough spot. With limited means to perform system-wide performance testing, and the inclusion of more SaaS/PaaS/cloud services in their architecture, they had to make do with whatever subsystem-level performance testing they could get. After that, it’s finger-crossing and resigning yourself to further optimization in production.

Unfortunately, production can be a very frustrating place to try to optimize performance, particularly when you have performance problems and growing complaints from customers, partners, etc. It’s in these pressured environments that you need true performance engineers who follow a methodical and systematic end-to-end approach. Performance bottlenecks can reside in a myriad of places in highly distributed architectures, and you need to follow a disciplined methodology to analyze dependencies, isolate problem areas, and then leverage best-of-breed tools to trace, profile, and optimize each of the tiers and technologies in the application delivery path. This takes a lot of skill and expertise.
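To make the "isolate problem areas" step a little more concrete, here is a minimal sketch in Python. It assumes you have already collected per-request latency samples for each tier; the tier names, the sample values, and the helper functions `p95` and `rank_tiers` are all hypothetical and purely illustrative, not taken from any particular tool.

```python
from statistics import quantiles


def p95(samples):
    """95th percentile of a list of latency samples (milliseconds)."""
    # n=20 yields 19 cut points; the last one is the 95th percentile.
    return quantiles(samples, n=20, method="inclusive")[-1]


def rank_tiers(timings):
    """Return (tier, p95 latency) pairs, slowest tier first."""
    ranked = [(tier, p95(samples)) for tier, samples in timings.items()]
    return sorted(ranked, key=lambda pair: pair[1], reverse=True)


# Hypothetical per-tier latencies in milliseconds (illustrative data only).
timings = {
    "web": [12, 15, 11, 14, 13, 16],
    "app": [40, 42, 38, 45, 41, 39],
    "database": [120, 95, 300, 110, 105, 98],
}

for tier, latency in rank_tiers(timings):
    print(f"{tier}: p95 = {latency:.1f} ms")
```

Ranking by a high percentile rather than the average is a deliberate choice here, since the occasional slow request is usually what drives the complaints. In a real system the ranking only tells you where to point the tracing and profiling tools next, not what the root cause is.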

In short, the challenges faced by today’s application performance engineer in production settings are a far cry from the client-server days of in-house tuning and experimentation. We expect that the role of Performance Engineer will grow in importance as SOA, BPM, cloud, and SaaS/PaaS implementations increase, at least until more viable options for pre-production system performance testing become available.