Digging Into Dynatrace Exceptions and Logging

We have started to make use of Dynatrace in our day to day lives now as performance engineers. I’m trying to do my part as well so that I can contribute to the test analysis lifecycle. Today I’m going to start by providing an overview of two types of Run-Time Events: Exceptions and Logging. Both are critical features of Dynatrace as they provide us aggregate statistics of problems that are so-called “under the covers”. Exceptions are used to provide error-handling capabilities for applications. DynaTrace makes use of run-time bytecode instrumentation to capture throwable classes, messages, and stack traces of java classes that extend java.lang.Exception. Logging APIs and Frameworks allow the developer to control which logging information is processed with arbitrary granularity. The Logging dashlet shows an aggregated view of the logging events, grouped by the class and method that issued the logging event, the DynaTrace Agent that recorded the logging event, the message string, severity, and name of the logger.

Quick View Into Exceptions

The visual you are looking at below comes from a Windows PVT run. It appears that we have nearly 800k exceptions that were generated during the course of a test. The first two exceptions were responsible for 550k exceptions as well. We don’t quite know what the cost of an exception is to date. Dynatrace is still working that problem. I did some informal correlations. I’m still working the problem, but there are a couple of data points that I’m willing to share.

 

Let’s say you take 2 identical PurePaths and perform a PurePath comparison. The first PurePath is riddled with exceptions and the second one is completely clean, meaning no exceptions. In my example I was looking at a recorded session called “Win20minscalib_run”, specifically looking at (1) transaction: “LS Grade Center: Grade Detail Attempt Save And Continue Action”. We had 171 samples of this transaction according to Dynatrace. The worst transaction took 5.7s. It had a total of 17 exceptions (6 key not found and 11 constraint violations for NULL values.

 

What’s amazing in my mind is that when you review the exceptions in question, none of the columns throwing exceptions are actually “NOT NULL” columns. It’s a completely wasted exception.

SQL> desc attempt;
 Name                                      Null?    Type
 ----------------------------------------- -------- ----------------------------
 GRADEBOOK_GRADE_PK1                       NOT NULL NUMBER(38)
 PK1                                       NOT NULL NUMBER(38)
 QTI_RESULT_DATA_PK1                                NUMBER(38)
 LINKREFID                                          VARCHAR2(255)
 SCORE                                              NUMBER(15,5)
 GRADE                                              NVARCHAR2(32)
 STUDENT_COMMENTS                                   NCLOB
 INSTRUCTOR_COMMENTS                                NCLOB
 FEEDBACK_FORMAT_TYPE                               CHAR(1)
 COMMENT_PUBLIC_IND                        NOT NULL CHAR(1)
 EXEMPT_IND                                NOT NULL CHAR(1)
 OVERRIDE_IND                              NOT NULL CHAR(1)
 INSTRUCTOR_NOTES                                   NCLOB
 NOTES_FORMAT_TYPE                                  CHAR(1)
 STATUS                                             NUMBER(38)
 ATTEMPT_DATE                              NOT NULL DATE
 DATE_ADDED                                         DATE
 DATE_MODIFIED                                      DATE
 LATEST_IND                                         CHAR(1)
 GROUP_ATTEMPT_PK1                                  NUMBER(38)
 STUDENT_SUBMISSION                                 NCLOB
 TEXT_FORMAT_TYPE                                   CHAR(1)

SQL> desc gradebook_grade;
 Name                                      Null?    Type
 ----------------------------------------- -------- ----------------------------
 GRADEBOOK_MAIN_PK1                        NOT NULL NUMBER(38)
 PK1                                       NOT NULL NUMBER(38)
 COURSE_USERS_PK1                          NOT NULL NUMBER(38)
 COMMENTS                                           NCLOB
 NOTES_FORMAT_TYPE                                  CHAR(1)
 FOR_STUDENT_COMMENTS                               NCLOB
 FEEDBACK_FORMAT_TYPE                               CHAR(1)
 DATE_ADDED                                         DATE
 DATE_MODIFIED                                      DATE
 STATUS                                    NOT NULL NUMBER(38)
 HIGHEST_ATTEMPT_PK1                                NUMBER(38)
 LOWEST_ATTEMPT_PK1                                 NUMBER(38)
 LAST_ATTEMPT_PK1                                   NUMBER(38)
 FIRST_ATTEMPT_PK1                                  NUMBER(38)
 EXEMPT_ATTEMPT_IND                        NOT NULL CHAR(1)
 AVERAGE_SCORE                                      NUMBER(15,5)
 MANUAL_GRADE                                       NVARCHAR2(32)
 MANUAL_SCORE                                       NUMBER(15,5)
 VERSION                                   NOT NULL NUMBER(38)
 EXEMPT_IND                                NOT NULL CHAR(1)
 EXCLUDED_IND                              NOT NULL CHAR(1)

The second transaction, which is identical from a use case perspective. It was smaller in object size 13 versus 163 for the previouse (another data point to consider btw). It was faster (only 104.52ms) and less CPU time (5.8ms versus 264.27ms). This one only had a single exception, an exception mind you that did not show up in the first PurePath. This exception was “Entity type not found for key: caliper”.

Below is a quick PurePath Comparison of these two PurePaths in question. You can see a completely different structural difference between the two PurePaths. I’m sure some of it was data related. When you see so many more method calls you have to call into question why? There could be a lot of reasons. It could be an application error. It could be a LoadRunner error. It could be an instrumentation error. The conclusion that I’m inferring is that there is a penalty for all of the exceptions resulting in more method calls to account for the exceptions.

 

Quick View Into Logging

During the course of the run we issued close to 800 WARN, WARNING and ERROR messages via log4j and java.util.logging. These are obviously a lot more managable then the exceptions. You can literally sort through ever single exception if necessary and tie the message to a PurePath, which can then help you tie to a LoadRunner transaction (CPID and VUser). From a forensic perspective this is awesome as we can open up a ticket and have someone from Chris’ team dig into each of these.

 

I say Chris and AppLabs as a starting point (Tier 1) to review the issues because most of these appear as though they are scripting inefficiencies. You dig into the exceptions and they look as though we are trying to do something we aren’t allowed to. A fair number of them also look like simple defects scu as the “Not Found” resources listed at the bottom.

Digging Into Logging

My approach was to isolate PurePaths responsible for issue a logging message. The best way to do that is the following:

  • Open Up the Diagnose Events –> Logging for your recorded session
  • Sort Descending by Count
  • Right Mouse Click on the Logging Event and Drill Into PurePaths

For the vast majority you will see that it’s very specific transactions issuing the same logging messages. I would start by evaluating the transaction in question first via LoadRunner to see if the exception can be generated on Demand. I would then evaluate the same use case by hand with careful inspection to ensure I am functionally completing the scenario. I would evaluate the LR code to see that it’s in fact posting/getting everything correct. If I’m unable to determine that it’s a LR issue, meaning I could create the same problem by hand, I would open up a bug.

Advertisements

One thought on “Digging Into Dynatrace Exceptions and Logging

  1. Pingback: Broken Window Theory…Exceptions and Log Messages « Seven Seconds

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s