Category Archives: BbWorld

Slides for So Your Boss Wants You to Performance Test Blackboard

Thanks to all of you who attended my session, “So Your Boss Wants You to Performance Test Blackboard,” at DevCon 2011. It was a delight to meet so many customers with such great questions.

You can access the slides here:


So Your Boss Wants You to Performance Test…The Practicalities of Getting Started

Expectations have been established about Why, What, Who, When and How. Now it’s time to move from the definition stage to planning the effort. Planning a performance benchmark or even a small-scale performance test is no cup of tea. It’s challenging from a time, resource and skill perspective. Not all of the resources are in-house to take on the feat. Of course it’s not like you have hours and hours of spare time to assemble the work.

There are some basic tenets of performance benchmarking and testing that can be followed to pull off this feat. It starts by setting goals, targets and thresholds, which I will describe later. You will subsequently need to define both functional scenarios and data requirements for test interactions. No performance test is complete without writing test automation. Don’t fall hard for using “real users” to perform your tests. Why, you might ask? Well, a) it’s not scientific, b) it’s not repeatable and c) it’s pretty tough to get hundreds of users to participate in a synchronized activity. Before a single test is run, you will need to plan how the scenarios will be distributed, sequenced and balanced, as well as the attributes of a test such as duration, workload, arrival/departure rates, think time, timeouts and abandonment.

Goals, Targets and Thresholds

Goals, targets and thresholds are lumped together, but they are not the same thing. A goal is just another way of saying performance objective. It should be measurable and traceable. The best goals align to the vision and direction of the business. If goals can be called out as a performance/scalability requirement, you are better off. Good goals will often reference a response time percentile, a throughput metric, a workload condition and a data condition.

A target is measurable as well. Targets are more aligned to resources such as CPU, memory, I/O or network, and can be presented as utilization or saturation metrics. They can also be aligned to throughput metrics such as database transactions or IOPS. The key point is that they influence performance goals, but do not define them. Goals/objectives are about the user experience, whereas targets are about the user conditions. A CPU running at 95% doesn’t necessarily affect a user. It may affect whether 100 users can receive sub-1s response times or whether an additional 100 users can interact within a sub-1s SLA.

A threshold is essentially one or more acceptance criteria values. It too must be measurable. It’s conditional in nature. Typically thresholds are negative-based, such as failure rate or HTTP 400/500 counts. They can also be response time values beyond which users might abandon. The key point is that a threshold is the boundary between what you are willing to accept and what you are not.

Functional Scenarios

You might only need one or you may have many. Whatever the case, defining what you are functionally testing is more than click here…then here…and submit this. A lot goes into planning what you are functionally testing. You have to align your functional scenarios to your performance objectives. For instance, if you have an initiative around supporting the submission of online assessments/exams by students in a lab-based scenario, it makes sense to build a scenario that accomplishes that goal. It doesn’t necessarily make sense to define scenarios that are unrelated, not relevant, not time or seasonally appropriate, or wasteful. Sometimes you will hear about “noise scripts”…well, my advice is to avoid them like the plague. Define the functional scenarios that you want, need and have to have.

A lot goes into a functional scenario. You have to be able to define navigation and page interaction, of course. Those are a given. You also have to define conditions of data. For example, if your script assumes that an exam hasn’t been taken, then your user can’t attempt to retake a submitted exam unless you script for that condition. If you are to take a 25 question test, don’t provide a 20 question test or a 50 question test. Make sure that the data aligns appropriately to your functional needs. Otherwise, your scripting requirements are going to be very complex and tricky.

Test Bed (Data Model)

I personally consider the test bed to be the most important part of a performance test or benchmark. My main reason is that your performance goals should define the data conditions that affect the workload conditions of a test. For example, a goal might read: I want to test 1000 concurrent exams of 25 questions each with no more than 3s response time for submission. Your functional scenarios ultimately have to match up to the data set just like piecing together a puzzle.

With all of this in mind, don’t go into a performance testing or benchmarking exercise without synthesizing your test bed. You can have background data, but the data you test with has to be controlled and organized to accommodate the conditions of a test. If all you are doing is testing 1000 users taking the same assessment, then prepare a single assessment. Don’t forget to provide a means to either restore the data or clean it up. Never let your tests run against the same data repeatedly. It’s just not a valid test. If your test has to run through multiple iterations, make sure to either set up enough unique accounts that no account has to repeat the scenario, or provide additional data for the same users to interact with.

Sometimes your tests require your data to be set up in a transformative state, meaning that your functional scenarios require the data you are interacting with to be presented in a variety of lifecycle states. For example, you may have a functional scenario to take an exam for the first time. This would require your test bed data to be pristine. You may also have a scenario where you are scoring the results of a completed exam. This will require you to create tests with submitted responses that are not graded. Finally, you may require a scenario where a teacher reviews scored assessments in the grade center. Each of these represents a transformative state of the data. It’s important to know what you need in terms of the structure of the data and the lifecycle state of the data.

Scripting

Choosing the right scripting framework and tool can be quite complicated. First you need to identify what you can afford and what you are capable of leveraging. A lot of tools are scripted in languages like C, Python, JavaScript or Perl. There are some that use their own proprietary languages and others that use object-oriented languages like Java or C#. Whatever the case, picking the language/framework is often a secondary exercise to picking what you can afford.

There are commercial tools like LoadRunner, MSVSTS, SOASTA, Rational and Silk Performer. They are very similar in nature, but offer different features, capabilities and, most importantly, price points. There are open source projects like JMeter, Grinder, OpenSTA, Multi-Mechanize and Curl-Loader. Remember, with open source tools you get what you get and you don’t get upset. Then there are even rich client and browser tools like Browser-Mob, Selenium, SOASTA, LISA and WebPageTest.

Whatever you choose, consider what you can afford and what you can support. One last point is to beware of “Record and Playback” tools. I’ve never been a fan of those tools, which the commercial vendors often tout as their differentiator. With “Record and Playback” you lose a lot of control. A lot of times you get garbage in your scripts that can cause them to be functionally invalid.

Scenario Planning (Interaction Model)

Scenario planning is different from functional scenario definition. Essentially, functional scenario definition is about defining what you are going to do and against what data model condition. Scenario planning is about defining when your scenarios will run, how often they will run (frequency), dependencies on other scenarios running, etc. It’s important to see this activity as orchestrating the sequence and order of a test or benchmark.

Load Test Definition

The load test definition is a critical piece in setting up your actual performance test or experiment. It covers the key attributes of the test such as the workload of users, time of test (duration), some sequencing of scenarios, think time, arrival/departure rates, timeout conditions, etc. Keep in mind that there are many ways to run performance tests. You might choose a soak test, a steady-state (staircase) test, or a fixed arrival rate test running for a predefined duration. Not only do you define the attributes of the test workload, but you also define what’s being measured and instrumented at a particular rate, as well as what systems are sending the load.
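To make the load test definition a little more concrete, here is a minimal sketch in Python of how the attributes listed above could be written down as data. None of this comes from a particular tool: the class name, the field names and the linear ramp-up/ramp-down shape are all assumptions made for illustration, and a real tool will have its own way of expressing arrival and departure rates.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LoadTestDefinition:
    """Hypothetical container for the key attributes of one test run."""
    peak_users: int       # workload: virtual users at full load
    duration_s: int       # total length of the test, in seconds
    ramp_up_s: int        # arrival period (assumed linear)
    ramp_down_s: int      # departure period (assumed linear)
    think_time_s: float   # pause between user actions
    timeout_s: float      # give up on a request after this long

    def users_at(self, t: float) -> int:
        """Virtual users that should be active t seconds into the test."""
        if t < 0 or t > self.duration_s:
            return 0
        if t < self.ramp_up_s:
            # Arrival phase: users join at a constant rate.
            return int(self.peak_users * t / self.ramp_up_s)
        ramp_down_start = self.duration_s - self.ramp_down_s
        if t > ramp_down_start:
            # Departure phase: users leave at a constant rate.
            return int(self.peak_users * (self.duration_s - t) / self.ramp_down_s)
        return self.peak_users


# Example: 1000 users, one hour, 10 minute ramp-up, 5 minute ramp-down.
exam_test = LoadTestDefinition(
    peak_users=1000, duration_s=3600, ramp_up_s=600,
    ramp_down_s=300, think_time_s=5.0, timeout_s=30.0,
)

if __name__ == "__main__":
    for minute in (0, 5, 10, 30, 55, 60):
        print(minute, "min:", exam_test.users_at(minute * 60), "users")
```

Writing the definition down this way, before any tool is involved, makes it easier to review the workload with your boss and to rerun exactly the same test later.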

So Your Boss Wants You To Performance Test…The Realities of Expectation Setting

You have received your marching orders from your boss to move forward with a benchmark of the next release of Blackboard. At first glance you might consider this a fairly easy task. You might have even done this before with another application or even with a previous release of Blackboard. Unless it’s your day-to-day job, tackling this problem might be the greatest challenge you have ever faced.

No matter how you look at this some money is going to be spent. You are going to experience costs in terms of gaining skills, paying for consulting, tools, or simply taking your time away from other priorities and projects. Time costs money and time is what it’s going to take to get this task complete. Accomplishing this kind of project takes more than grit and determination. Rather it takes skills and capabilities. Do you or your team have the right stuff to get the job done?

Then of course your boss has some expectations in their head of how this is going to go. He or she wants this to be a flawless and seamless exercise. The notion of any problem, small or big, isn’t exactly in their head. That’s actually the worst case scenario, because in their mind it means the software vendor or the developers of the project were wrong. It also means your boss might have put their reputation on the line in order to take the latest and greatest feature sets before the community was ready for them or, worse, before the product was ready.

Setting Expectations

It’s important to understand why you are asking or being asked to go through a performance/scalability testing exercise. There has to be some set of transparent drivers for going through such an expensive project. Are the goals offensive or defensive in nature? Are we looking for greater accuracy in determining the deployment? An exercise in testing won’t necessarily provide precision or accuracy with the deployment. The outcome of a testing project should be learning experiences and planning, not a guarantee.

Personally, I use testing in my lab to tell me what I can’t do and not necessarily what I can do. It doesn’t mean I can’t increase my confidence in what I can do, but the results of the testing that comes out of my lab certainly do not give me a guarantee that I’m ready to broadcast out to my fellow system administrators. Start simply by asking these very elementary questions below:

  • Why are you going through this exercise?
  • What do you expect to get out of it?
  • Who will be working this effort?
  • When will it be accomplished?
  • How much will it cost?

The road ahead will be long and tiresome. Putting together a benchmark takes a lot of work in terms of preparation, execution and analysis. The best place to start is by figuring out what you can and can’t do to accomplish this project. Once you have a better idea of what you can’t do and a somewhat cloudy perspective on what you can do, you are at a point of plugging your gaps.

Beyond Project Expectations

There’s a lot more to a project like this than the actual preparation, execution and analysis work. Identifying measurable goals is a difficult and challenging exercise. Good attributes of a performance goal would identify criteria around page responsiveness (performance) and workload/data conditions under exponentially increasing load (concurrency/parallelism).

Performance Goals need to be measurable and traceable. Goals need to align to the vision and direction of your organization. Goals must be attainable and realistic.
Response times are necessary for performance. Avoid averages, maximum and minimum values unless you pair them with the standard deviation. Try to use percentiles as data points.
Throughput measurements are needed for scalability. Capture key data points over consistent time intervals. For example, evaluate bytes sent and received per second, per minute, or any comparative time sample that makes sense and does not mask the impact of latency.
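To show why percentiles are a better data point than an average, here is a small sketch in Python. The sample response times are made up for illustration, and the nearest-rank method used here is only one of several common ways to compute a percentile; your load testing tool may use a different one.

```python
import math
import statistics


def percentile(samples, pct):
    """Nearest-rank percentile: the smallest sample that has at least
    pct percent of all samples at or below it."""
    if not samples:
        raise ValueError("samples must not be empty")
    if not 0 < pct <= 100:
        raise ValueError("pct must be in the range (0, 100]")
    ordered = sorted(samples)
    rank = math.ceil(pct * len(ordered) / 100)
    return ordered[rank - 1]


# Made-up response times in seconds: nine quick pages and one very slow one.
response_times_s = [0.4, 0.5, 0.5, 0.6, 0.6, 0.7, 0.7, 0.8, 0.9, 9.0]

if __name__ == "__main__":
    print("mean:", statistics.mean(response_times_s))
    print("50th percentile:", percentile(response_times_s, 50))
    print("90th percentile:", percentile(response_times_s, 90))
    print("95th percentile:", percentile(response_times_s, 95))
```

With these numbers a single slow page pulls the average well above what most users experienced, while the 50th and 90th percentiles describe the typical experience and the 95th exposes the outlier.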

Be prepared for more than measurable goals. You will also need to define acceptance criteria, both positive and negative. When I test, I accept a very small percentage of error in my results. The percentage is less than 1% for business transactions and less than 0.01% for HTTP 400’s. We never accept any HTTP 500’s. Ideally I want no errors, but if that were the case, I would be testing infinitely.

You can approach this a couple of ways. You could evaluate based on business transactions. If you tested 1000 samples of different transactions, you would not accept any test results that yielded more than 10 failures (1% of 1000). Another approach would be to evaluate HTTP 400’s and 500’s. As I mentioned above, we never accept any HTTP 500’s as they imply something is not working correctly in our application. Let’s say the same 1000 business transaction test produced 20,000 HTTP 200’s and 30,000 HTTP 300’s, or 50,000 responses in total. If I accept HTTP 400’s on up to 0.01% of all responses, then in this case I would not accept more than 5 HTTP 400’s.
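The arithmetic above can be sketched as a small acceptance check in Python. The function and constant names are hypothetical, and the rates are simply the ones from my own testing described above; substitute whatever thresholds you and your boss agree on. Exact fractions are used instead of floats so the limits come out as whole numbers.

```python
from fractions import Fraction

# Acceptance thresholds taken from the text above.
BUSINESS_FAILURE_RATE = Fraction(1, 100)   # 1% of business transactions
HTTP_4XX_RATE = Fraction(1, 10_000)        # 0.01% of all HTTP responses


def max_allowed(total: int, rate: Fraction) -> int:
    """Largest whole number of errors that stays within the given rate."""
    return int(total * rate)


def run_is_acceptable(transactions: int, failed_transactions: int,
                      http_responses: int, http_4xx: int, http_5xx: int) -> bool:
    """Apply all three thresholds; any HTTP 500 rejects the run outright."""
    if http_5xx > 0:
        return False
    if failed_transactions > max_allowed(transactions, BUSINESS_FAILURE_RATE):
        return False
    return http_4xx <= max_allowed(http_responses, HTTP_4XX_RATE)


if __name__ == "__main__":
    responses = 20_000 + 30_000  # HTTP 200's plus HTTP 300's from the example
    print("failed transactions allowed:", max_allowed(1000, BUSINESS_FAILURE_RATE))
    print("HTTP 400's allowed:", max_allowed(responses, HTTP_4XX_RATE))
    print("acceptable:", run_is_acceptable(1000, 10, responses, 5, 0))
```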

What to Avoid

A poor way to approach defining performance and scalability goals would be to define system and resource utilization requirements, such as “the system shall consume 80% CPU utilization” or “the JVM will reach 4GB of memory.” Even worse would be to use ambiguous words, such as “pages have to be fast” and “the system can’t throttle.” Be as specific and descriptive as possible when you define your goals.

So Your Boss Wants You to Performance Test (Intro)

So here’s the problem. Your boss comes to you and says, “We are going to go live with the next version of Blackboard, and it better be fast and scalable.” He or she sat through a bunch of marketing presentations that all say this is by far the most scalable version of Blackboard ever! Expectations are high and the demands are even higher.

There’s a lot of pressure from faculty, students and administrators on both your boss and you to make this the most stable and fastest performing release ever.

You have had a few bumpy years administering Blackboard in the past. With this upcoming release you plan to have more users for distance learning, heavy adoption of social tools, mobile integration and, of course, no solution would be complete without full online exams.

Funding is also a little tight. Resources are sparse and spread across 19 other IT projects. Oh and by the way…we are going to take on the new version in 9 to 12 weeks…so time is of the essence.

You have to do this all on your own or with the student intern.

Good luck…You will need it!

BbWorld 2010 Presentations

Wow…I’m the biggest blogging slacker on the face of the earth. I do apologize to my 5 loyal readers 😉 for not posting more over the past few months. I do have a ton to write…just need to convert my posts from my internal Bb blog over to this one. I wanted to post links to my presentations from BbWorld 2010 in Orlando. It was a great conference. I can’t wait until next year. I’m hoping to have an entire performance track next year in Las Vegas with every intention to cover Load Testing, Code Profiling, Designing for Performance, Query Analysis, System Optimization, Product Optimization, etc. If anyone has any thoughts on presentations they would like me, my team or even yourself to present, throw them my way!

DevCon Presentation: Deploying a Highly Available Blackboard Solution

Day 1 Session: Best Practices for Optimizing Your Blackboard Learn Environment

Day 2 Session: Scaling Blackboard Technology for Large Scale Distance Learning and Online Communities