
So Your Boss Wants You to Performance Test…The Practicalities of Getting Started

Expectations have been established about the Why, What, Who, When and How. Now it's time to move from the definition stage to planning the effort. Planning a performance benchmark, or even a small-scale performance test, is no cup of tea. It's challenging from a time, resource and skill perspective; not all of the needed resources are in-house, and it's not as if you have hours and hours of spare time to assemble the work.

There are some basic tenets of performance benchmarking and testing that can be followed to pull off this feat. It starts with setting goals, targets and thresholds, which I will describe later. You will subsequently need to define both functional scenarios and data requirements for test interactions. No performance test is complete without writing test automation. Don't fall hard for using "real users" to perform your tests. Why, you might ask? Well, a) it's not scientific, b) it's not repeatable, and c) it's pretty tough to get hundreds of users to participate in a synchronized activity. Before a single test is run, you will need to plan how the scenarios will be distributed, sequenced and balanced, as well as the attributes of a test such as duration, workload, arrival/departure rates, think time, timeouts and abandonment.

Goals, Targets and Thresholds

Goals, targets and thresholds are lumped together, but they are not the same thing. A goal is just another way of saying performance objective. It should be measurable and traceable. The best goals align to the vision and direction of the business. If a goal can be called out as a performance/scalability requirement, you are better off. Good goals often reference a response time percentile, a throughput metric, a workload condition and a data condition.

A target is measurable as well. Targets are more aligned to resources such as CPU, memory, I/O or network, and can be expressed as utilization or saturation metrics. They can also be aligned to throughput metrics such as database transactions or IOPS. The key point is that targets influence performance goals, but do not define them. Goals/objectives are about the user experience, whereas targets are about the system conditions behind it. A CPU running at 95% doesn't necessarily affect a user; it may affect whether 100 users can receive sub-1s response times, or whether an additional 100 users can interact within a sub-1s SLA.

A threshold is essentially one or more acceptance-criteria values. It too must be measurable, and it is conditional in nature. Typically thresholds are negatively framed, such as a failure rate or a count of HTTP 400/500 responses. They can also be response time values beyond which abandonment might set in. The key point is that a threshold is the boundary between what you are and are not willing to accept.
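The distinction between goals and thresholds can be made concrete in a few lines. This is a minimal sketch, not part of any tool: the function name, field names and the 1s/1% figures are illustrative assumptions chosen to mirror the examples above (a p95 response time goal and a failure-rate threshold).

```python
import math

def percentile(samples, pct):
    """Return the pct-th percentile (nearest-rank method) of the samples."""
    ordered = sorted(samples)
    rank = max(1, math.ceil(pct / 100.0 * len(ordered)))
    return ordered[rank - 1]

def evaluate_run(response_times_s, error_count, total_requests,
                 goal_p95_s=1.0, error_rate_threshold=0.01):
    """Compare a test run against a goal (p95 response time) and a
    threshold (acceptable failure rate). Both are measurable, but the
    goal describes user experience while the threshold is a boundary
    of acceptance."""
    p95 = percentile(response_times_s, 95)
    error_rate = error_count / total_requests
    return {
        "p95_s": p95,
        "goal_met": p95 <= goal_p95_s,
        "error_rate": error_rate,
        "threshold_ok": error_rate <= error_rate_threshold,
    }
```

A run can meet its goal while tripping a threshold (or vice versa), which is exactly why the two are worth keeping separate in your reporting.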
Functional Scenarios

You might only need one, or you may have many. Whatever the case, defining what you are functionally testing is more than "click here… then here… and submit this." A lot goes into planning what you are functionally testing. You have to align your functional scenarios to your performance objectives. For instance, if you have an initiative around supporting the submission of online assessments/exams by students in a lab-based setting, it makes sense to build a scenario that accomplishes that goal. It doesn't make sense to define scenarios that are unrelated, irrelevant, not time- or seasonally appropriate, or wasteful. Sometimes you will hear about "noise scripts"; my advice is to avoid them like the plague. Define the functional scenarios that you want, need and have to have.

A lot goes into a functional scenario. You have to define navigation and page interaction, of course; those are a given. You also have to define the conditions of the data. For example, if your script assumes that an exam hasn't been taken, then your user can't attempt to retake a submitted exam unless you script for that condition. If the user is to take a 25-question test, don't provide a 20-question or 50-question test. Make sure that the data aligns to your functional needs; otherwise, your scripting requirements are going to become very complex and tricky.
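One way to keep a scenario honest about its data assumptions is to declare them alongside the steps, so a runner can refuse to execute against a mismatched test bed. This is a hypothetical sketch; none of these names come from a real tool.

```python
# Hypothetical: a functional scenario paired with the data conditions
# it assumes, mirroring the 25-question exam example above.

def take_exam_scenario():
    return {
        "name": "submit_25_question_exam",
        # Conditions the script assumes about the test bed record.
        "data_conditions": {
            "exam_question_count": 25,
            "exam_already_submitted": False,
        },
        "steps": ["login", "open_exam", "answer_all_questions", "submit"],
    }

def preconditions_met(scenario, test_bed_record):
    """True only when the record satisfies every declared condition."""
    return all(test_bed_record.get(k) == v
               for k, v in scenario["data_conditions"].items())
```

Failing fast on a bad precondition is far cheaper than discovering mid-run that half your virtual users were retaking already-submitted exams.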
Test Bed (Data Model)

I personally consider the test bed to be the most important part of a performance test or benchmark. My main reason is that your performance goals should define the data conditions that shape the workload conditions of a test: "I want to test 1,000 concurrent exams of 25 questions each with no more than 3s response time for submission." Your functional scenarios ultimately have to match up to the data set, just like piecing together a puzzle.

With all of this in mind, don't go into a performance testing or benchmarking exercise without synthesizing your test bed. You can have background data, but the data you test with has to be controlled and organized to accommodate the conditions of a test. If all you are doing is testing 1,000 users taking the same assessment, then prepare a single assessment. Don't forget to provide a means to either restore the data or clean it up. Never let your tests run against the same data repeatedly; it's just not a valid test. If your test has to run through multiple iterations, make sure to either set up enough unique accounts that cannot repeat the scenario, or provide additional data for the same users to interact with.

Sometimes your tests require the data to be set up in a transformative state, meaning that your functional scenarios require the data you are interacting with to exist in a variety of lifecycle states. For example, you may have a scenario to take an exam for the first time; this requires your test bed data to be pristine. You may also have a scenario where you are scoring the results of a completed exam; this requires exams with submitted responses that are not yet graded. Finally, you may have a scenario where a teacher reviews scored assessments in the grade center. Each of these represents a transformative state of the data. It's important to know what you need in terms of both the structure of the data and its lifecycle state.
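Synthesizing a test bed with explicit lifecycle states might look like the following sketch. The state names echo the exam example above (pristine, submitted-but-ungraded, graded); the record fields are illustrative assumptions, not a real schema.

```python
import itertools

# Illustrative lifecycle states for exam attempts in the test bed.
STATES = ("pristine", "submitted_ungraded", "graded")

def build_test_bed(records_per_state=1000):
    """Seed one unique exam attempt per virtual user per state, so no
    scenario ever reuses another scenario's data."""
    ids = itertools.count(1)
    bed = []
    for state in STATES:
        for _ in range(records_per_state):
            bed.append({"attempt_id": next(ids),
                        "state": state,
                        "question_count": 25})
    return bed

def records_in_state(bed, state):
    """Hand each scenario only the records in the state it expects."""
    return [r for r in bed if r["state"] == state]
```

Because every attempt has a unique ID, restoring the bed between iterations is a matter of regenerating it rather than untangling mutated state.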
Scripting

Choosing the right scripting framework and tool can be quite complicated. First you need to identify what you can afford and what you are capable of leveraging. A lot of scripting tools are driven by languages like C, Python, JavaScript or Perl; some use their own proprietary languages, and others use object-oriented languages like Java or C#. Whatever the case, picking the language/framework is often secondary to picking what you can afford.

There are commercial tools like LoadRunner, MSVSTS, SOASTA, Rational and Silk Performer. They are very similar in nature, but offer different features, capabilities and, most importantly, price points. There are open source projects like JMeter, Grinder, OpenSTA, Multi-Mechanize and Curl-Loader; remember, with open source tools you get what you get and you don't get upset. Then there are rich client and browser tools like Browser-Mob, Selenium, SOASTA, LISA and WebPageTest.

Whatever you choose, consider what you can afford and what you can support. One last point: beware of "record and playback" tools, which the commercial vendors often tout as their differentiator. I've never been a fan of them. With record and playback you lose a lot of control, and you often get garbage in your scripts that can render them functionally invalid.
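The alternative to record and playback is hand-writing the script as explicit, timed steps. Here is a minimal sketch of that idea, assuming each step is an injectable callable (an HTTP call in a real script); the function names are mine, not from any of the tools listed above.

```python
import time

def run_script(steps, think_time_s=0.0):
    """Execute hand-written steps in order, timing each one.

    `steps` is a list of (name, callable) pairs. Writing the steps by
    hand, rather than replaying a recording, means the script contains
    only the requests you intend, and injecting the callables keeps it
    testable without a live server.
    """
    timings = []
    for name, action in steps:
        start = time.perf_counter()
        action()
        timings.append((name, time.perf_counter() - start))
        if think_time_s:
            time.sleep(think_time_s)  # simulated user think time
    return timings
```

Every request in the result is one you deliberately scripted, which is precisely the control a recorder takes away.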
Scenario Planning (Interaction Model)

Scenario planning is different from functional scenario definition. Functional scenario definition is about defining what you are going to do and against what data model condition. Scenario planning is about defining when your scenarios will run, how often they will run (frequency), their dependencies on other scenarios, and so on. It's important to see this activity as orchestrating the sequence and order of a test or benchmark.
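A common way to express the "how often" part of an interaction model is a weighted scenario mix. This is an illustrative sketch with made-up scenario names and weights, not a prescription.

```python
import random

# Illustrative mix: relative weights for how often each scenario
# should run during the test.
SCENARIO_MIX = {
    "take_exam": 70,       # roughly 70% of iterations
    "review_grades": 20,
    "browse_course": 10,
}

def next_scenario(mix, rng=random):
    """Draw the next scenario to run according to the weighted mix."""
    names = list(mix)
    weights = [mix[n] for n in names]
    return rng.choices(names, weights=weights, k=1)[0]
```

Sequencing and dependency rules (e.g. "grading only runs after submissions exist") layer on top of the draw, but the mix is the backbone of the interaction model.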
Load Test Definition

The load test definition is a critical piece in setting up your actual performance test or experiment. It covers the key attributes of the test, such as the user workload, the duration, the sequencing of scenarios, think time, arrival/departure rates, timeout conditions and so on. Keep in mind that there are many ways to run performance tests: you might choose a soak test, a steady-state test, a staircase (stepped) test, or a fixed arrival rate run for a predefined duration. You define not only the attributes of the test workload, but also what is being measured and instrumented, at what sampling rate, and which systems are generating the load.
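Pulling those attributes together, a load test definition can be captured as plain data. The field names and numbers below are hypothetical, chosen only to illustrate a staircase workload with the attributes discussed above.

```python
def staircase_workload(start_users, step_users, steps, step_duration_s):
    """Expand a staircase definition into (elapsed_s, concurrent_users)
    checkpoints: load increases by step_users at each step boundary."""
    return [(i * step_duration_s, start_users + i * step_users)
            for i in range(steps)]

# Hypothetical load test definition; not tied to any particular tool.
LOAD_TEST = {
    "duration_s": 3600,
    "think_time_s": (5, 15),        # min/max pause between pages
    "arrival_rate_per_s": 2,        # new virtual users per second
    "timeout_s": 30,                # abandon a request after this
    "workload": staircase_workload(100, 100, 5, 720),
}
```

Writing the definition down this explicitly makes the test repeatable, which, as argued earlier, is the whole point of automating instead of herding real users.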