Monthly Archives: March 2011

Top 10 Users…Why This is Important

I’ve been meaning to write this blog for quite a while. Procrastination has definitely held me back…that and the fact that what I want to do is no simple task. When you want a complex task done, you’d better have your thoughts organized.

Over the years, I’ve considered so many different ways to forensically study what people do in the system. I’ve looked at logs. I’ve looked at our ACTIVITY_ACCUMULATOR table. I’ve looked at aggregates of data as well. I’ve brought in tools like Coradiant, Dynatrace and Quest User Performance Management. None of these tools has ever met my real needs. The reason is that I haven’t been able to articulate what I am really in search of.

I think I’ve had a few eureka moments as of late about what I’m interested in seeing. I know that I want to see what is being done, and when, in our product. I know that I want to understand the sequence of events and where in the system those events happen. I want to understand the probability of something happening. I want to see the frequency of something happening. In the case of frequency, I want to understand the volume associated with it. I think all of this data is relevant because it will give us more insight into predicting patterns of usage of a system.

Where a lot of this has come from centers around conversations I’ve had recently about Assessment performance. A lot of customers have been complaining about high-stakes assessments in which they have hundreds of students taking tests all within a lab. They have been complaining about both memory issues (makes sense) and I/O issues (inserts/updates on QTI_RESULT_DATA), which also makes sense. In the case of I/O, they didn’t really call it out. Rather, after discussing, I pointed out that there likely were some I/O issues based on the behavior of an assessment. One of the things I’ve been suggesting to customers is to query the QTI_RESULT_DATA table for a resultset of rows inserted versus modified, then put it in a scatter plot (from an isolated period of time) to see the volumes of inserts versus updates and the timeslices in which these events were occurring. From that data, they can go into their I/O sub-system, graph their IOPS for those same periods of time and overlay the two charts…

SQL> desc QTI_RESULT_DATA;
 Name                                      Null?    Type
 ----------------------------------------- -------- ----------------------------
 QTI_ASI_DATA_PK1                          NOT NULL NUMBER(38)
 PK1                                       NOT NULL NUMBER(38)
 POSITION                                           NUMBER(38)
 ASI_PLIRID                                         VARCHAR2(255)
 ASI_TITLE                                          NVARCHAR2(255)
 DATA                                               BLOB
 BBMD_RESULTTYPE                                    NUMBER(38)
 PARENT_PK1                                         NUMBER(38)
 BBMD_DATE_ADDED                                    DATE
 BBMD_DATE_MODIFIED                                 DATE
 BBMD_GRADE                                         NVARCHAR2(32)

 

Back to My Point

So all of this talk about using scatter plots to isolate when certain events happened en masse got me thinking about why I wasn’t getting what I really wanted (aka…my rambling above). What I really wanted was to create an identity of a user. I didn’t care about their name, just their role. I would call them “Insanely Ambitious Student” or “Constantly Connected Teacher”. It really doesn’t matter. What matters is that you can start building profiles about these users. Before you can build the profile, you have to have a starting point.

My starting point is to look at every entity in the system. I would like to be able to directly or indirectly trace back a row of data to a user. It’s not as simple as you think. First off, not every table has a foreign key relationship to USERS. Some tables have a tie back to COURSE_USERS, which is not a problem per se, but it’s not a straight-up look at each table with USER_PK1 foreign keys.

As a starting point, I would like to do a gap analysis to determine which entities can be directly tied back to the user. From that, we need to know whether the row entry can be presented as a time/date value. In some cases, the entity can even distinguish the initial INSERT from an UPDATE. We really need to understand this system-wide, which means yes, we could/would touch the monster ACTIVITY_ACCUMULATOR table.

We could even start with a single entity as a starting point. I would even compromise for an entity that stores USERS_PK1 in it. It has to be a table that can present a many-to-one reference of rows to a user. A good example might be MSG_MAIN as a starting point, since it covers all of the criteria.

We could easily look at time series data by user, as well as aggregate statistics. Both are relevant, but obviously time series is a little more visual. I think you need aggregate statistics or at a minimum binned data (binned by time series per user) like aggregate counts by user over each week as a key data point.

SQL> desc MSG_MAIN;
 Name                                      Null?    Type
 ----------------------------------------- -------- ----------------------------
 PK1                                       NOT NULL NUMBER(38)
 DTCREATED                                 NOT NULL DATE
 DTMODIFIED                                         DATE
 POSTED_DATE                                        DATE
 LAST_EDIT_DATE                                     DATE
 LIFECYCLE                                 NOT NULL VARCHAR2(64)
 TEXT_FORMAT_TYPE                                   CHAR(1)
 POST_AS_ANNON_IND                         NOT NULL CHAR(1)
 CARTRG_FLAG                               NOT NULL CHAR(1)
 THREAD_LOCKED                             NOT NULL CHAR(1)
 HIT_COUNT                                          NUMBER(38)
 SUBJECT                                            NVARCHAR2(300)
 POSTED_NAME                                        NVARCHAR2(255)
 LINKREFID                                          VARCHAR2(255)
 MSG_TEXT                                           NCLOB
 BODY_LENGTH                                        NUMBER(38)
 USERS_PK1                                          NUMBER(38)
 FORUMMAIN_PK1                             NOT NULL NUMBER(38)
 MSGMAIN_PK1                                        NUMBER(38)
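To make the binned-aggregate idea concrete, here is a minimal Python sketch with made-up rows standing in for MSG_MAIN’s USERS_PK1 and DTCREATED columns. It produces per-user counts binned by week, the key data point described above:

```python
from datetime import date, timedelta
from collections import defaultdict

def weekly_counts_by_user(rows):
    """Bin MSG_MAIN-style (user_pk1, dtcreated) rows into per-user
    counts keyed by the Monday that starts each week."""
    counts = defaultdict(int)
    for user_pk1, dtcreated in rows:
        week_start = dtcreated - timedelta(days=dtcreated.weekday())
        counts[(user_pk1, week_start)] += 1
    return dict(counts)

# Fabricated posting history for two users over three weeks.
rows = [
    (101, date(2011, 3, 1)), (101, date(2011, 3, 3)),   # week of Feb 28
    (101, date(2011, 3, 9)),                            # week of Mar 7
    (202, date(2011, 3, 15)), (202, date(2011, 3, 16)), # week of Mar 14
]
print(weekly_counts_by_user(rows))
```

The same shape falls out of a GROUP BY on USERS_PK1 and a truncated DTCREATED; the point is that once the rows trace back to a user, the time series and the aggregates are both one step away.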

Static Code Analysis of LoadRunner C Code

One of the things I would like our new Performance Test lead engineer to work on this year is improved C coding for our LoadRunner library. There’s very little we do in terms of proactive code management. We obviously have a lot of functions. Many of the functions need to be deprecated. At the same time, all we do are light code reviews, and even in those we struggle to provide very basic guidance. What I would like to do is implement static code analysis for our C code, similar to the way we are doing this with our Sonar project.
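As a rough illustration of the kind of check I have in mind, a first pass could be as simple as scanning the C scripts for calls on a deny-list of functions slated for deprecation. The function names below are hypothetical stand-ins, not our actual library, and a regex scan is naive (it ignores comments and strings), but it shows the shape of the idea:

```python
import re

# Hypothetical deny-list of library functions we want to retire.
DEPRECATED = {"login_legacy", "nav_frames", "submit_form_v1"}

# Match identifiers that are immediately followed by an open paren.
CALL_RE = re.compile(r"\b([A-Za-z_]\w*)\s*\(")

def find_deprecated_calls(source, filename="<stdin>"):
    """Scan C source text and report calls to deny-listed functions."""
    findings = []
    for lineno, line in enumerate(source.splitlines(), start=1):
        for name in CALL_RE.findall(line):
            if name in DEPRECATED:
                findings.append((filename, lineno, name))
    return findings

sample = """\
Action()
{
    login_legacy("student01", "password");
    web_url("home", "URL=http://localhost/", LAST);
    return 0;
}
"""
for f, n, name in find_deprecated_calls(sample, "action.c"):
    print(f"{f}:{n}: call to deprecated function {name}()")
```

A real implementation would hang rules like this off the analysis engine rather than a standalone script, but even a deny-list check wired into the build would be more proactive code management than we do today.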

Here’s the good news…It appears that we can actually integrate directly with Sonar. I’m not sure which tool powers the rules engine they use. It looks like a custom rule set. We could also look at the following tools to see if they integrate with Sonar.

 

What Was the Best Advice You Ever Got?

What was the best advice you ever received? Who was that sage person who dropped a little nugget of life on your ears? That’s the question on my mind this morning. Sadly, I can’t recall who gave it to me, but I certainly remember the message. I think I was 15 years old and finishing up my first year of high school. I was trying to get through biology, which isn’t really all that tough a class. It was a challenge to me for the first time in my life, or better yet, it was a challenge over which I internally struggled: quit and do poorly, or take the class by the reins and do well.

The message was pretty straightforward…work hard now and have fun later, or have fun now and pay later. The point was I needed to put in the work now in order to see any rewards in life later.

So now that I’ve shared my message with you, what was the best advice you received?

 

Moving from E2E to EoE

I wrote my first reference to E2E back at the Hotsos Symposium in 2009. Quoting that blog…

“I was thinking we would then as part of our DOE, try to understand E2E response time from the Client to the Web/App to the DB layer and visually overlay response times with the diagram. This would be a great way to show design anti-patterns in a quantifiable way.”

The whole point of using the term E2E was to explain end-to-end response time and resource breakdown for the purpose of identifying software anti-patterns. I had the right idea, but it was the wrong term.

What I really meant to say was EoE (EoE = Execution of Experiment). Patrick recently helped me realize that E2E just didn’t make sense because it implied everything was end to end, meaning a UI experiment was required. Not everything requires a UI experiment. Some experiments are run by non-UI mechanisms, and the term might therefore cause confusion.

So the solution going forward is to change E2E to EoE. The execution of experiment is intended to be a full analysis from end to end, but the dependency on a UI simulation is really driven by the test type and approach.

That Got Me Thinking About Our DoE

There are a few things missing with our DoE’s right now. First, our DoE’s do not necessarily have goals. By goals I’m really talking about non-functional requirements for performance and scalability. I think we need to address this gap in Sprint 7 going forward so that we have a goal to strive toward and then attempt to go beyond. Otherwise, how do we really know when the DoE is complete?

The other thing that’s missing is our testing approach. We should really justify why we are going to use a particular test tool. I was talking with Patrick the other day and mentioned that we have all of these different ways to run a test. We could run a Selenium test. We could execute a batch script. We could run a JUnitPerf test. We could run a SQL script. We could even use LoadRunner. I’m sure the team is laughing at that one because I’ve been against using LoadRunner for DoE tests. I now want to make it such that any tool (so long as it can be automated from Galileo) can be used for EoE, as long as we justify it in the DoE. We really should justify our test approach in the DoE document.

 

Welcome 2011…I’m Back

So it’s been a few weeks since my last blog. I was lucky enough to take the last week and a half of 2010 off for some much needed R&R. Going into Monday, I was pretty ready to start the new year off with a bang, and lucky me, I had a high fever (~101) which was not so good for coming into the office. Turns out I had a little virus, and some fluids (i.e., Gatorade and water) did the trick.

  

As I said, I’m back and ready for business. I’m going to be putting together some informative blogs over the next few days about some of the changes I would like to introduce to the team in 2011. So keep an eye out for those blogs. I will leave you with some pictures of my favorite Christmas present, which my sister-in-law gave me.

 

The Mind of a Performance Hacker

I had an unusual eureka moment in my car this evening. I get them now and then. Well, actually, according to Confluence, I’ve had 5 other “eureka” blogs since 2007. Apparently I had none this year. I had 3 in 2007, which must have been a good year for the team, and one each in 2008 and 2009. So I was definitely due…

Within the last few months I have had the opportunity to reframe my technology perspective with the addition of Stephanie Tan to our team. Stephanie runs our Security practice. As I attempt my hardest to provide her with modest leadership and direction, I’ve found myself engaged like a madman trying to minimize my learning curve in software security. I’ve re-read some old books that I had on the shelf. I’ve subscribed to a few periodicals. I’ve found myself at times scouring Google for hours upon hours researching new terms and concepts so that I have more context when I talk with Stephanie and others about security.

While I’ve been doing this, it finally hit me that security engineering (the safe kind we want to practice) is just a clandestine form of performance engineering. Now before you challenge me or turn the page, hear me out. If we say the mandate of a security engineer is to engineer or construct solutions with the intent of penetrating, breaking or dismantling a software system, then I think it’s safe to say that a performance engineer has an almost identical focus. The performance engineer should be intent on breaking the system to affect responsiveness and/or scalability. In my mind they are one and the same…both are trying to break the system. We are trying to break the system from a fairly positive perspective. We want to determine when users will abandon, when processes become unbearable and when the system shuts down.

 

Is that what we are doing today? “Not really”…says my inner voice. I don’t think we ask the question or questions about breaking a use case, component or system. I think deep in the back of our minds we want to ask this question. Our intention during our SPE exercises is to ask questions that should or could lead us to that ultimate question or question set about breaking a system. Unfortunately, we don’t ask the question. The reason I know we don’t ask the question is that we don’t build experiments with the intention of breaking the use case, component or system. Lately we have been building experiments that tell us how fast or how slow something is. That something is usually a microscopic view into the use case, component or system. It typically interacts with “realistic” data attributes under “realistic” or normal behavioral characteristics.

 

Should we stop doing that? Well, not really. We need to take a step back and lead from the alternative. I say alternative because today our comfort zone is asking questions about “normal”, “realistic” and “common” attributes and characteristics tied to what we are building. It’s OK to ask those questions, as they are essential to giving us better context on our product. The alternative questions have to be skewed toward “performance hacking”, meaning how we can break the user experience from a responsiveness and scalability perspective. We don’t care about making a motherboard fry or burning a CPU to the ground (figuratively speaking). We do care about use cases that can single-handedly bring a system to its knees. We care about processes that aren’t predictable and run for undisclosed periods of time. We need answers about how we can literally make a page unresponsive or a process die midstream. We need to see if we can force out-of-memory exceptions, deadlock a row or table, or even cause a thread to lock. We should look at a scheduled task meant to run every hour and try to figure out if we can make each iteration run for 65 minutes…then what?
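That last question is easy to reason about with a back-of-the-envelope simulation. This purely illustrative Python sketch shows what happens when a 65-minute task fires on a 60-minute schedule and runs serially: every iteration starts 5 minutes further behind its scheduled time, and the drift never stops growing.

```python
def schedule_drift(interval_min=60, runtime_min=65, runs=10):
    """Simulate a serial task that is scheduled every interval_min
    but takes runtime_min; return minutes behind schedule per run."""
    next_scheduled, free_at, drift = 0, 0, []
    for _ in range(runs):
        start = max(next_scheduled, free_at)   # wait out the previous run
        drift.append(start - next_scheduled)   # minutes behind schedule
        free_at = start + runtime_min
        next_scheduled += interval_min
    return drift

print(schedule_drift())
```

After ten iterations the task is already 45 minutes behind, and if the scheduler queues missed firings instead of skipping them, the backlog grows without bound. That is exactly the kind of unpredictable process a performance hacker should go hunting for.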

 

We need to put our performance hacker hats on and figure out our performance and scalability vulnerabilities…

Building a Case for Some Fast Path Principle Implementations

For a while now I’ve had this idea about providing new ways to improve time on task. This is more of a usability concept than anything, but from the perspective of performance, I’m specifically talking about providing a mechanism to shorten the critical path and reduce the time and effort a user spends performing a task in the system. So the idea is nothing spectacular, but it happens to be something that just doesn’t exist in the system. It’s something I would like for PerfEng to model and prototype so that it could conceivably be brought into the product in a future release.

What’s My Big Idea

Over the years, the thing I’ve learned about Blackboard more than anything is that while the system is intended to improve the learning and livelihood of students, it’s really a system designed from the perspective of teachers and instructional designers. What we seem to do really well is provide a solid canvas for structuring and organizing content. In fact, there are several fast paths for authoring or manipulating content, such as:

  • Edit Mode
  • Context Menus for Constructing Content and Other Artifacts

Where I see an opportunity is to provide fast path capabilities for teachers to assess content interactions, content contributions and even user activity by students and/or class participants. Imagine I’m a teacher and I’m stepping through my course. I decide to go into one of my discussion forums and threads to monitor participation, or quite possibly reply to a thread. I’m presented with a screen below with a list of participants in the class. Most are students, maybe my TA or even my own posts are listed as well.

 

Now, I’m given a mechanism to perform other operations associated with the user, because I have a drop-down context menu next to their name, or I’m able to roll over their name and a hover window is presented (either is possible). The menu/window gives me fast path views directly associated with the user. For example, I roll over Amina Brook’s name and a context menu gives me the ability to do some of the following operations:

  • Drill into Amina’s Personalized View of the Grade Center
  • Send Amina a message
  • Review unread messages from Amina
  • See a 360 view of Amina’s activity
  • Look at Amina’s Course Map
  • Drill into Amina’s Discussion Posts for the Course
  • See other tools (maybe from B2’s) that present data from Amina

The possibilities are endless. The idea centers around giving teachers the ability to quickly access the content that has the most relevant purpose at that moment. I may be working on Amina’s end-of-year comments (another interesting feature) to send to her or, if she’s in K-12, to her parents. I’m trying desperately to get a more holistic view of Amina’s performance and contributions to the course. In this case, I need a fast way to aggregate data, as well as drill into areas of the application to inspect participation.

On Code Quality

Centering the Problem

So I’m a little confused by some of the mixed messaging. I would like the opportunity for each of you to weigh in on the actual problem set in front of me. In Jessica’s email for the off-site, she has asked for me and Mike to spend roughly 2 hours discussing Automated Unit testing. That’s fine and dandy, but after reading through Steve’s Quality Plan, discussing one on one with Mike and another one on one with Marc Nadeau, I’m not convinced that everyone is aligned on the problem.

Each of you has one or more arguments about unit testing. In general, each of you desires greater coverage of our APIs through more unit testing, so that we can hand over a more functionally stable product. From my conversation with Marc, he’s concerned that the unit tests today are not only lacking in coverage, but also fail to be executed by engineering on a consistent basis. Rather, the only execution occurs during the nightly builds and is then managed and controlled by QA. Essentially, our engineers are throwing their code over the wall with the promise of the nightly testing loops catching their mistakes. When I discussed this further with David Ashman, and ad hoc with a few engineering managers, there was confirmation that unit tests were not being run on any consistent basis. The reason…the API harness takes too long to run. So it’s safe to say that there are opportunities for improvement with unit testing in categories such as: coverage, decomposition, execution and accountability.

When I talked with Mike, he was of a slightly different opinion about the problem set I am to tackle. He called out the problem of improving the reliability and timing of code handoffs. This is covered to a degree in your quality plan as item #3 (Requirements – Collaboration, Validation & Traceability). I’m not quite sure if this is a problem that I can get my hands around, as I might be too removed from the Sprint lifecycle to fairly address the problem.

My request would be to focus on the framework process of improved quality as part of the engineering lifecycle. Below are some of my initial thoughts…

Code Quality, Unit Testing and Process Changes

Steve’s slide 17 does a great job at calling out the key issues of our unit test approach: coverage gaps, quality issues, time management, execution and measurement. I think they are a good place for us to start. My only change would be to separate static analysis and even dynamic analysis from unit testing, and keep it under the umbrella of code quality.

Coverage: I personally have not performed a coverage analysis effort as of yet. So ideally, I would like to have the time to perform an audit, whether that be by an independent inspector or by a small team responsible for providing thorough measurements. I would love to be able to go back and introduce more coverage where we are lacking. We should, as we always intended, empower the module/product teams to do so, while at the same time augmenting the coverage through some professional services. The coverage problem is really about the go-forward plan. Steve calls out an interesting point about unit tests not being accounted for with more precision during the sprints. I would add an additional point: unit tests are not collaboratively discussed and transparently presented as sprint deliverables.

Talking Point 1 on Coverage: I recommend we bring in an outside consulting firm with our $133k to help us measure our true coverage and the quality of our tests, and help us work through the problems of collaborative unit test development. There are companies out there like Stelligent, or we could bring the former president of Stelligent, Andrew Glover, in to work on this problem with us. I actually grew up with Andrew and we have traded a few emails over the past 3 years. He’s got a lot of ideas and could be really helpful in addressing our deficiencies. If a consultant comes in, it’s going to take time away from team members, whether those team members are Engineering Managers, Architects, Engineers or Performance Engineers. We also invested in Clover, but are we seeing a return on that investment?

Talking Point 2 on Coverage: We have said over and over that the teams would be tasked with going back and adding coverage. It’s been lip service up to now. We haven’t seized any opportunities in our schedule to take this on. In fact, we haven’t even made it a measurable output for the development teams. I propose that we put together a measurable objective for development managers to plan for unit test coverage windows throughout 2011. We put coverage objectives for all legacy APIs in the plans for all contributing engineers. We have to make time and set targets. This might require us to do a proof of concept with a product team for a short period of time to gauge feasibility and effort, as well as work out the wrinkles in rolling something of this magnitude out to the team.

Talking Point 3 on Coverage: We only get so far with Talking Point 2, as our developers are consumed with new functional development and maintenance. I suggest we invest some of our professional services funds to tackle the coverage backlog. I’m not sure how much Talking Point 1 will cost. I would assume at least $50k, so that might only leave us a little over $80k to spend. That’s not a whole lot, and might only add up to about 1 or 2 full-time off-shore developers. This might be good for chipping away at consistent contributions, but how productive would 1 or 2 contractors really be? It may make sense to try to obtain enough funding for each product/module team. If contractors are used, it’s going to require management of those people, meaning planning, training, reviewing, etc. from one or more people.

Talking Point 4 on Coverage: As it relates to new development, or even maintenance development for that matter, I strongly advise that we have each team be 100% transparent about their unit tests. I might have to explain this in person, but I would propose a scenario in which each engineer would have to produce a unit test plan as part of the sprint. The plan could obviously change each week depending on changes in requirements and use cases. You make the plan a measurable artifact that the engineer has to produce for the architects and development managers. This would be accessible to the entire team, including QA, UX, etc. The team would have to account for the plan, as well as completing the work, in their estimation and development efforts. So in order to do something like this, we need to figure out how to make it fit without taking away functional development time. My main motivation is to empower the engineers on the team to be responsible for more than just their code. They have to present a plan on the testability of their code and execute on that plan.

Unit Testing: If we look beyond coverage, there are some tangible problems for us to tackle. The first problem, which is more engineering oriented, is the issue of the time it takes to execute unit tests. It appears that the unit testing framework that exists today is not well decomposed. For example, if I’m working on the Grade Center, I can’t just run a suite of tests in an automated fashion tied to the Grade Center unless I manually target the one or two tests that I want to run. If I run a sub-set of tests, then I obviously don’t have to wait an hour. Considering there is a belief that the coverage we do have is minimal (less than 30%), if we add more tests, we are only making the process last longer. Second, there is evidence that unit tests are being avoided on some of the most complex pieces of code in the product. Third, not all code should be tested by a JUnit test. We may have to consider alternative testing approaches such as HttpUnit, JSUnit, DBUnit, or even our own Selenium scripts. We can’t look at testing as the sole responsibility of our QA organization.
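To make the decomposition point concrete, here is a toy sketch in Python’s unittest (the test classes and their contents are invented stand-ins, not our actual Java harness) showing how suites grouped by module let you run just the Grade Center slice instead of waiting on the whole harness:

```python
import unittest

# Invented example tests, grouped by module so subsets can run in isolation.
class GradeCenterTests(unittest.TestCase):
    def test_weighted_total(self):
        # 40% of a 90 plus 60% of an 80 should average to 84.
        self.assertAlmostEqual(0.4 * 90 + 0.6 * 80, 84.0)

class DiscussionBoardTests(unittest.TestCase):
    def test_subject_fits_column(self):
        # Subjects must fit the 300-character SUBJECT column.
        self.assertLessEqual(len("Week 1 intro"), 300)

def suite_for(prefix):
    """Build a suite from only the test classes matching a module prefix."""
    loader = unittest.TestLoader()
    classes = [c for c in (GradeCenterTests, DiscussionBoardTests)
               if c.__name__.startswith(prefix)]
    return unittest.TestSuite(loader.loadTestsFromTestCase(c) for c in classes)

if __name__ == "__main__":
    # Run just the Grade Center slice instead of the hour-long full harness.
    unittest.TextTestRunner().run(suite_for("GradeCenter"))
```

The mechanics differ in our API harness, but the principle is the same: decompose the suite along module boundaries so an engineer touching one area has a fast, targeted loop to run before check-in.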

Talking Point 1 on Unit Testing: I think this is a solvable problem, but one that I don’t have firsthand experience with. So I would need to either run the tests myself and see how the API test harness has been coded, or assign one of my engineers. We could establish a project that tackles making the harness faster, as well as more decomposed for better decoupling of tests within the suite.

Talking Point 2 on Unit Testing: As I noted above in Talking Point 4 on Coverage, we have to make a change in the way we do business by making our teams accountable for the transparency of their unit tests and approaches to testability as part of the planning/design during a sprint. The teams should agree that the most complex areas of code and the areas with the greatest risk should in fact have the greatest amount of engineering testability. This will require executive motivation and measurement with accountability.

Talking Point 3 on Unit Testing: We shouldn’t look at every engineering test evaluation effort as a JUnit test. We need an open framework and training for engineers to make use of alternative unit testing tools. If I’m writing complex SQL, I should be able to write a DBUnit test. If I’m writing a JavaScript interaction, I should write a JSUnit test or a Selenium script. You probably get my point that JUnit is not the answer to every problem.

Talking Point 4 on Unit Testing: I mentioned the need for transparency and collaboration on unit testing plans. I also mentioned that the more complex or riskier the code, the stronger the need for a unit test. That still doesn’t address QA’s problem that engineering is not running their tests. So I am of the mindset that we need to establish some process changes around code check-ins. We mandate code reviews. As part of that process, or as a supplementary process, we need to establish a review of unit tests, new and old, for the affected area or component being developed against. It might be as simple as providing the results of the unit tests as part of the check-in changelist. It may be as simple as having team members run the suite and authorize the check-in based on the success of the unit tests. In either case, engineers have to be accountable for running their tests and measured as part of their code commit lifecycle.

Code Quality: I would prefer not to lump code quality and static analysis under unit testing, but rather group unit testing and static analysis under the category of code quality. It’s clear that we failed as an organization to implement static code analysis as both a formal process and an enterprise tool suite. The code project that engineering undertook in the fall missed the mark. The project was more of a tool evaluation which never made it into our lifecycle. It never formalized into our methodology. It was sadly a tool project. I believe static analysis has to be a process and workflow with decision points and accountability. No one tool can solve all of our problems. Each team should have different stakes and interests. For example, we in performance engineering make use of 2 tools (FindBugs and PMD), as well as custom rules. Security engineering will attempt to use those tools as well, but we plan on using IBM’s AppScan source code suite and a handful of other tools to help us meet our static analysis goals. Engineering may need additional tools; so too may UX. Right now we aren’t doing anything formal from a UX code quality standpoint other than accessibility testing. We are doing some light JSLint, but nothing really formal on our CSS.

Talking Point 1 on Code Quality: My first point, about building a process and not simply selecting a tool set, is absolutely critical. We need to figure out how static analysis can be incorporated into our lifecycle. How and when do we define rules? Who is responsible for responding to rules, and when? How do we guarantee that these tools are not simply ignored, but are built into our workflow? How do we make teams accountable for responding to their issues?

Talking Point 2 on Code Quality: The tool set does become important when we have established processes. We need to make the tool sets more enterprise oriented, meaning that they are not running on someone’s desktop only. Desktop execution is for IDE execution only. Server side continuous analysis is what I am talking about. We need to integrate these tools into a formal configuration management process. The workflow discussed in Talking Point 1 on Code Quality has to establish rules and procedures for maturing rules, phasing them out when they are not applicable and most importantly responding to issues.

Talking Point 3 on Code Quality: Any team contributing code and/or inspecting code, such as DBAs, UX, Perf and Security, has to identify the appropriate static tools for inspection. We can’t approach a one-tool-fits-all model. We have to be open to the idea that the tools are interchangeable and the process is maturing, but consistent.

 

Getting More Out of LoadRunner

Internal Blog from August 2010…

Maybe I’m getting antsy with SOASTA or maybe I’m just generally frustrated with how little we get out of LoadRunner. I’m not really sure. What I do know is that all morning long I’ve been thinking about pulling my investment out of SOASTA and putting it somewhere else. The somewhere else is probably in licenses of Dynatrace…but maybe even back into more LoadRunner.

Nonetheless, I’m generally frustrated with LoadRunner. I need to do something about it. I’ve started by adding Shlomi Nissim, Director of HP’s Center of Excellence, to my LinkedIn network. I’ve also put in a request to Amy Feldman, part of the Product Marketing team at HP, as well. Hopefully, she will figure out if we are related and accept my request. My intention in linking to them is truly to be able to develop a relationship between Bb and HP.

I’ve got a few other things that I am planning on doing as well. First, I’ve reached out to a guy in Colorado named Piotr Trzeciak who handles all of our renewals. I’ve made Piotr pretty rich over the past three years by doing my renewals early, and little has come of it other than a price break here and there. So I’ve asked Piotr to talk later today, with the intention of figuring out who our account manager is. I then want to talk to our account manager on-site and figure out how we can get more out of the massive investment we have already made. Years ago we purchased Performance Center, but we abandoned it almost immediately when it didn’t meet our needs at the time.

There were a number of factors that influenced our abandoning Performance Center at the time. First, we couldn’t run unattended LoadRunner tests through Performance Center; our only option was to treat our Performance Center controllers as though they were plain LoadRunner controllers. Second, Mercury had just been sold to HP and the company was in major flux. They sent a consultant (actually two) to help us implement, and in both cases we struggled and gave up.

I want to solve more problems with LoadRunner. For example, I want to address some of our scripting flaws with richer interfaces. I don’t feel as though we do this particularly well on our own, so I want to at least evaluate how LoadRunner is solving these problems. They claim to have better recording capabilities that deal with AJAX more efficiently. Another example: I want more out of our monitoring. This has always been a fatal flaw of our LoadRunner infrastructure; we have used the bare minimum. Why not implement HP Diagnostics? This is obviously something we need to learn more about. I don’t know whether we already get it because we have Performance Center, or whether it’s something we have to buy. If we have to buy it, is it really worth the cost?

Last but not least, I think we need to make a trip to HP Software Universe 2011. We missed it this year and it was literally in our own backyard. To our credit, we didn’t get a single email or old-fashioned letter inviting us. Whose fault is that? Next year it’s in Vegas, right before BbWorld, which is also in Vegas. I don’t think I could justify two trips within 8 weeks of each other. Someone else on the team could go…

LoadRunner in the Cloud

One of the lessons I’ve learned from the SOASTA experience is that I still want and need to be able to test from the cloud. Why? Two reasons: first, the tests run from outside our firewall; second, we could get more VUsers. What I’ve learned is that it’s really difficult and expensive to move 7 years of load testing capability outside our network.

So I want to learn more about LoadRunner in the cloud. Check out this video.


More Cowbell Please

I think Will Ferrell is one of the funniest guys in the world. There was a Saturday Night Live skit he did years ago about “More Cowbell” for “Don’t Fear the Reaper”…It was by far the funniest skit I’ve seen in a while. For some strange reason I was thinking about that cowbell skit…I guess it was in the context of a past blog in which I was pushing for the team to start thinking about, and better yet implementing, weekly test loops and monthly test plans outside the scope of the PVT.

I wanted to talk more about that blog and why it’s important. Each SPE that’s on a product or project team is still trying to sort out their identity on the team. I’ve said the token line…”Each of you is the owner of performance and scalability for your product or project team!”…but the clarity behind what that means is still not there.

I’m not trying to answer that for each of you, but rather I’m trying to get every one of you on the team to collectively answer this question. What I do believe is that our #1 goal has to be to provide feedback to the development team. Feedback comes in a lot of forms: reviewing requirements, escalating antipatterns, raising bugs, sharing performance test data. One big kink in our feedback armor is that we are not testing out of cycle. The times between cycles are simply too big or too long, whichever way you look at it: too big from a change perspective and too long from a time perspective.

So I see more testing (i.e., weekly test efforts and monthly test plans) as that cowbell. More testing leads to more analytics…more analytics leads to more awareness of our subsystems and use cases…and more awareness of our product areas leads to more feedback, because we will be able to share with confidence our thoughts and concerns about the performance and scalability of our areas of ownership.

“I’ve got a fever…and the only prescription is MORE COWBELL!”