Hotsos Symposium 2008 Day Two

Day two is in the books…well, we still have a dinner/reception tonight, but essentially all of the lectures are complete. Today was far better than yesterday. I’ll give my session-by-session synopsis below.

Why You Can’t See Real Performance Problems
Cary Millsap gave an awesome presentation on “Why You Can’t See Your Real Performance Problems.” He opened with a visualization of the execution of a critical business process in the form of a sequence diagram. The diagram was great because it clarified what a bottleneck really is.

Traditionally, a bottleneck has been defined as the resource with the highest utilization. So in his visual, imagine a simple sequence diagram with the following components left to right: Application Server, Local Area Network, Database, Operating System. Millsap defined utilization as (Amount of a Resource Used / Amount of Resource Available). In his example (I’ll post the diagram once I get it), the database was the most utilized resource, but it was not necessarily the bottleneck. The bottleneck was where the greatest amount of latency was occurring; in Millsap’s fictitious example, that was the OS.

To figure out the resource demands of each component, Millsap re-emphasized the importance of profiling. A profile is a spanning, non-overlapping account of response time for a given task. After walking through the example, Millsap redefined the bottleneck. Quoting one of my favorite capacity planners, Daniel Menasce, he defined it as the component where a task spends most of its time.
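As a rough illustration (this is not Millsap’s actual profiler), you can approximate where one session’s time went using Oracle’s wait interface; the SID bind is a placeholder:

```sql
-- crude per-session breakdown: where did this session's time go?
select event, time_waited_micro / 1e6 as seconds
from v$session_event
where sid = :sid
order by time_waited_micro desc;
```

Note that a true profile must also account for CPU time (e.g. from v$sesstat), so that the components span the entire response time without overlapping.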

So what does this all mean? Basically, we need to re-prioritize performance improvements based on business needs. Make sure your system works for your business, not your business for the system.

I recommend that you read his paper. The PowerPoint he presented touched only a small part of it; the paper talks mostly about skew. Take a look at his example below:
Example: All of the following lists have a sum of 10, a count of 5, and thus a mean of 2:

A = (2, 2, 2, 2, 2) Has No Skew
B = (2, 2, 3, 1, 2) Has a Little Skew
C = (0, 0, 10, 0, 0) Has a High Skew

Essentially, if we don’t understand our skew factor, whether it be for response times or resource instrumentation metrics, then we are not effectively looking at our data.
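A quick way to see the difference numerically is to compute the spread of each list: the population standard deviation is 0 for list A, about 0.63 for list B, and 4 for list C, even though all three share a mean of 2. A minimal Oracle SQL sketch for list C, using nothing beyond DUAL:

```sql
-- population standard deviation of list C = (0, 0, 10, 0, 0):
-- same mean as lists A and B, but a far larger spread
select avg(v) as mean, stddev_pop(v) as spread
from (select 0  v from dual union all
      select 0    from dual union all
      select 10   from dual union all
      select 0    from dual union all
      select 0    from dual);
```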

Trending and Charting Using Sesspack and Excel
One thing I really like about the Hotsos community is its sharing and openness. One guy in particular is Tanel Poder. He has written a number of awesome tools that we could easily implement and make use of in our lab. The first tool is called Sesspack. It’s a lot like Statspack, but does session-level analysis. The tool can be downloaded here. What I like about this tool is the variety of options that can be specified. Below is a snippet from the readme he included about how to use Sesspack and the options you can pass to capture data:

list.sql shows you snapshots with brief descriptions
srs.sql shows you the session event deltas between selected snapshots
sesspack.snap_me - snaps current session stats
sesspack.snap_sid - snaps given SIDs stats
sesspack.snap_orauser('') - snaps all sessions' stats for a given ORACLE user
sesspack.snap_osuser('') - snaps all sessions' stats for a given OS user
sesspack.snap_program('') - snaps all sessions' stats for a given program
sesspack.snap_machine('') - snaps all sessions' stats for a given machine
sesspack.snap_spid('spid') - snaps all given SPID session stats
sesspack.snap_cpid('cpid') - snaps all given client PID session stats
sesspack.snap_all - snaps all session stats
sesspack.snap_bg - snaps background processes sessions stats
sesspack.snap_fg - snaps foreground processes sessions stats
sesspack.snap_sidlist_internal('select sid from v$sesstat where ')
- snaps all SIDs returned by custom SQL
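A hedged sketch of a typical workflow with these calls; the SID, snapshot IDs, and exact argument forms are illustrative, so check the readme before using them:

```sql
exec sesspack.snap_sid(145)   -- first snapshot of SID 145
-- ...run the workload you want to measure...
exec sesspack.snap_sid(145)   -- second snapshot of the same session
@list                         -- list.sql: show the snapshots with descriptions
@srs 1 2                      -- srs.sql: session event deltas between snaps 1 and 2
```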

If you don’t want to install the Sesspack package, you can run a lightweight version called Session Snapper. He wrote the tool in response to an interesting blog post about the lack of session-level information. You can [download the script here|http://www.tanelpoder.com/files/scripts/snapper.sql]. He’s written a few blog posts about Snapper.
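Snapper needs no installed objects; you just run the script from SQL*Plus. The argument syntax varies between Snapper versions, so treat the call below as an assumed example rather than gospel (SID 145 and the sampling parameters are made up):

```sql
-- sample SID 145 once over a 5-second interval (arguments vary by version;
-- see the header comments inside snapper.sql for the exact syntax)
@snapper ash 5 1 145
```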

He also introduced a tool called Perfsheet. It’s not quite ready for use yet, and I can’t find it on his web site; he said he would post it by tomorrow. This tool is amazing. Basically, he’s built a very simple program within Excel that can connect to a database, run a query, and show visually exactly which wait events the SQL is spending its time on. I’ll post an individual blog entry on it once it’s available.

One final thing he showed us was some really simple formatting tricks in SQL*Plus. I had no idea you could generate HTML-formatted output with so little effort. He provided some scripts for demonstration purposes. You can write the output to Excel as well.
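These aren’t his scripts, but the core trick is SQL*Plus’s SET MARKUP command; something along these lines spools a query straight to an HTML page (the file name and query are my own):

```sql
-- generate an HTML report directly from SQL*Plus
set markup html on spool on
spool sessions.html
select sid, username, status from v$session;
spool off
set markup html off
```

The resulting HTML table also opens cleanly in Excel, which is presumably how he got the data into his spreadsheets.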

RAC Be Nimble…RAC Be Quick
I finally got to meet Bert Scalzo from Quest Software. Bert authored a pretty awesome Dell/Quest paper on Oracle RAC a few years back that we used during [our benchmark at the Dell lab|http://www.dell.com/downloads/global/solutions/public/White_Papers/hied_blackboard_dell_whitepaper.pdf] in 2006. Bert did an OK job presenting his paper, but it could have been a little better. It was a marketecture presentation on the three core tools of Quest: Toad, Benchmark Factory, and Spotlight. There was one link from Quest I would like to investigate. Oh, and he mentioned Kyle Hailey’s paper defining Oracle waits.

RAC SQL Performance Tax
I finally sat in on a presentation by Michael Erwin from Hotsos. He’s considered by many in the Oracle world to be the Father of RAC. We brought him to Blackboard last year to work with ASP. Much of his presentation was about query parallelism (degree of parallelism) and how it can have a huge effect on RAC performance.

| Id | Operation | Name | Rows | Bytes | Cost (%CPU) | TQ |IN-OUT| PQ Distrib |
| 0 | SELECT STATEMENT | | 107 | 2782 | 3 (34) | | | |
| 1 | PX COORDINATOR | | | | | | | |
| 2 | PX SEND QC (RANDOM) | :TQ10001 | 107 | 2782 | 3 (34) | Q1,01 | P->S | QC (RAND) |
| 3 | HASH GROUP BY | | 107 | 2782 | 3 (34) | Q1,01 | PCWP | |
| 4 | PX RECEIVE | | 107 | 2782 | 3 (34) | Q1,01 | PCWP | |
| 5 | PX SEND HASH | :TQ10000 | 107 | 2782 | 3 (34) | Q1,00 | P->P | HASH |
| 6 | HASH GROUP BY | | 107 | 2782 | 3 (34) | Q1,00 | PCWP | |
| 7 | PX BLOCK ITERATOR | | 107 | 2782 | 2 (0) | Q1,00 | PCWP | |
| 8 | TABLE ACCESS FULL| EMP | 107 | 2782 | 2 (0) | Q1,00 | PCWP | |
• Table is set to Parallel 2
• PCWP = Parallel Child With Parent
• P->P = Parallel to Parallel
• P->S = Parallel to Serial
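The slides didn’t include the exact statement, but a group-by on EMP at degree 2 produces a plan of this shape; something like the following sketch would reproduce it:

```sql
-- set the table's degree of parallelism to 2, then inspect the plan
alter table emp parallel 2;

explain plan for
  select deptno, count(*) from emp group by deptno;

select * from table(dbms_xplan.display);
```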

The example plan above is what you would expect for a simple query with a degree of parallelism of 2. In the example below, we have a query that goes from serial to parallel. Refer to Metalink article 263414.1 as to why this is bad: we are taking a serial query and attempting to parallelize it. There’s no gain in doing this; it only makes performance worse.

|Id| Operation |Name |Rows|Bytes|Cost| TQ |IN-OUT|PQ Distrib|
| 0| SELECT STATEMENT | | 106| 2862| 3 | | | |
|*1| HASH JOIN | | 106| 2862| 3 |23,01| P->S | QC (RAND)|
| 3| TABLE ACCESS FULL|EMPLOYEES | 107| 1177| 1 |23,01| PCWP | |