Broken Window Theory…Exceptions and Log Messages

When I first read The Tipping Point years ago, I was struck by Gladwell’s fascinating story about the NYC police and how they cracked down on crime by replacing broken windows and painting over graffiti every single day or week. If a window was broken by a stone, a brick, a bullet, whatever…it was replaced as quickly as possible. If graffiti was on a building wall or a city vehicle, it was quickly painted over. The core idea was that if vandalism was left in place, disorder would invite even more disorder: a small deviation from the norm would set in motion a cascade of more vandalism. Addressing it in a timely manner established a standard that vandalism would not be tolerated.

So what does that have to do with Blackboard Learn?

Good question, Steve. Let’s start by talking about exceptions and log messages. I’ve written two blog posts, one about the cost of an exception and the second about using Dynatrace to evaluate exceptions and logs. Both discuss the insight we can get quite easily with Dynatrace.

In the visual below you have a case of neglect. Yuck! We have broken windows everywhere. The visual is a snapshot of almost 1 million exceptions thrown during our PVT test, which runs for about an hour. This test had about 4100 unique user sessions, which performed 56,000+ page requests. You saw what I wrote, right? I said almost 1 million exceptions from an hour’s worth of user activity by only 4100 unique user sessions. That’s over 230 exceptions a second. Holy smokes, Batman!
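To put those numbers in perspective, here is a rough back-of-the-envelope sketch. The inputs are the rounded figures quoted above (the exact exception count and test duration are assumptions), and the class and method names are made up for illustration.

```java
public class ExceptionRate {

    // Events per second over a test window.
    public static double perSecond(long events, long durationSeconds) {
        return (double) events / durationSeconds;
    }

    // Average number of events per unit (for example, per page request or per session).
    public static double perUnit(long events, long units) {
        return (double) events / units;
    }

    public static void main(String[] args) {
        long exceptions = 1_000_000L;   // assumption: "almost 1 million", rounded up
        long durationSeconds = 3_600L;  // assumption: "about an hour"
        long pageRequests = 56_000L;    // figure quoted in the post
        long sessions = 4_100L;         // figure quoted in the post

        System.out.printf("exceptions per second:       %.1f%n",
                perSecond(exceptions, durationSeconds));
        System.out.printf("exceptions per page request: %.1f%n",
                perUnit(exceptions, pageRequests));
        System.out.printf("exceptions per session:      %.1f%n",
                perUnit(exceptions, sessions));
    }
}
```

Because the inputs are rounded, treat the output as an order-of-magnitude check rather than a measurement. The per-request figure is the one I find most useful: it says how many broken windows a single page view leaves behind.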

 

I call out the broken window theory because I see this problem of throwing exceptions as epidemic in the product. Fortunately, I’ve got the Flex team starting to look into why we are throwing so many exceptions. We are wasting a ton of memory and CPU time by throwing these exceptions. Our goal is to get to no exceptions. This could, should and will become the standard going forward.
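As an illustration of the kind of fix I have in mind, here is a minimal sketch, assuming a hypothetical case where a request parameter is parsed as a non-negative integer. None of these names come from Learn; they are made up for the example. The first method uses an exception for an expected condition (bad or missing input), which allocates an exception object and captures a stack trace every time. The second checks the input first and never throws on that path.

```java
public class ExceptionFreeParsing {

    // Exception-driven version: every malformed or missing value creates a
    // NumberFormatException (with its stack trace) just to return a default.
    public static int parseWithException(String raw, int fallback) {
        try {
            return Integer.parseInt(raw);
        } catch (NumberFormatException e) {
            return fallback;
        }
    }

    // Check-first version: rejects bad input before parsing, so the expected
    // "missing or malformed" case costs a few comparisons and no exception.
    // Assumption: only non-negative values of at most 9 digits are valid here,
    // which keeps the check simple and rules out int overflow.
    public static int parseNonNegativeOrDefault(String raw, int fallback) {
        if (raw == null || raw.isEmpty() || raw.length() > 9) {
            return fallback;
        }
        for (int i = 0; i < raw.length(); i++) {
            char c = raw.charAt(i);
            if (c < '0' || c > '9') {
                return fallback;
            }
        }
        return Integer.parseInt(raw); // cannot throw after the checks above
    }

    public static void main(String[] args) {
        System.out.println(parseWithException("abc", -1));
        System.out.println(parseNonNegativeOrDefault("abc", -1));
        System.out.println(parseNonNegativeOrDefault("42", -1));
    }
}
```

The two methods are not drop-in equivalents: the check-first version deliberately treats negative numbers and values longer than 9 digits as invalid. The point is the pattern, which is to handle the expected case with a check and save exceptions for things that are truly exceptional.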

The visual below is from the same PVT test. It shows all of the WARN, ERROR and SEVERE log messages (stack traces) that were raised during the test. That’s about 1100 log messages, or about 1 log message every 4 seconds. Not good…and yes, another standard that we need to address.

 

One thought on “Broken Window Theory…Exceptions and Log Messages”

  1. David CT

    Steve,

    Really enjoy your blog. In the vein of “broken windows” – I haven’t tested 9.1 SP6 yet, but I know that up to that version, Blackboard ships with CSS that references non-existent files.

    One of the performance optimizations I take is to look at the logs for 404 errors. Since a 404 can’t be cached it represents a cost to both the server and the end user.

    Thanks
