Hotsos Symposium 2008 Day Two

Day two is in the books…well, we still have a dinner/reception tonight, but essentially all of the lectures are complete. Today was far better than yesterday. I'll give my session-by-session synopsis below.

Why You Can’t See Real Performance Problems
Cary Millsap gave an awesome presentation on "Why You Can't See Your Real Performance Problems." He started the presentation with a visualization of the execution of a critical business process in the form of a sequence diagram. The diagram was awesome because it made plain what a bottleneck actually is.

Traditionally, a bottleneck has been defined as the resource with the highest utilization, where Millsap defined utilization as the amount of a resource used divided by the amount of the resource available. In his visual (I'll post it once I get it), imagine a simple sequence diagram with the following components left to right (Application Server — Local Area Network — Database — Operating System). The database was the most utilized resource, but it was not necessarily the bottleneck. The bottleneck was where the greatest amount of latency was occurring, and in Millsap's fictitious example, that was the OS.

To figure out the resource demands of each component, Millsap re-emphasized the importance of profiling. A profile is defined as a spanning, non-overlapping account of response time for a given task. After he went through the example, Millsap redefined the bottleneck, quoting one of my favorite capacity planners, Daniel Menasce, who defines a bottleneck as the component where a task spends most of its time.
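These two definitions are easy to sketch in code. The following toy Python snippet (the component names and timings are my own fabricated example, not Millsap's) builds a profile as a spanning, non-overlapping account of response time and picks the bottleneck as the component with the most time, not the most utilization:

```python
def profile(components):
    """components: dict of component name -> seconds a task spent there.
    Returns (name, seconds, fraction of total) rows, largest first."""
    total = sum(components.values())
    return sorted(
        ((name, secs, secs / total) for name, secs in components.items()),
        key=lambda row: row[1],
        reverse=True,
    )

# Hypothetical response-time breakdown for one business task.
task = {"app server": 0.4, "LAN": 0.1, "database": 0.9, "OS": 2.6}

prof = profile(task)
bottleneck = prof[0][0]
print(bottleneck)  # the component holding the most response time: OS
for name, secs, frac in prof:
    print(f"{name:10s} {secs:4.1f}s {frac:6.1%}")
```

The profile is "spanning and non-overlapping" because the component times partition the task's total response time, so the fractions sum to one.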

So what does this all mean? Basically, we need to re-prioritize performance improvements based on business needs. Essentially, make sure your system works for your business, not your business for the system.

I recommend that you read his paper. The PowerPoint he presented only touched on a small part of the paper. The paper talks mostly about skew. Take a look at his example below:
Example: All of the following lists have a sum of 10, a count of 5, and thus a mean of 2:

A = (2, 2, 2, 2, 2) Has No Skew
B = (2, 2, 3, 1, 2) Has a Little Skew
C = (0, 0, 10, 0, 0) Has a High Skew

Essentially, if we don’t understand our skew factor, whether it be for response times or resource instrumentation metrics, then we are not effectively looking at our data.

Trending and Charting Using Sesspack and Excel
One thing I really like about the Hotsos community is its sharing and openness. One guy in particular is Tanel Poder. This guy has written a number of awesome tools that we could easily implement and make use of in our lab. The first tool is called Sesspack. It's a lot like Statspack, but does session-level analysis. The tool can be downloaded here. What I like about this tool is the variety of options that can be specified. Below is a snippet from the readme he included about how to use Sesspack and the options you can pass to capture data:


Reference:
list.sql shows you snapshots with brief descriptions
srs.sql shows you the session event deltas between selected snapshots
sesspack.snap_me - snaps current session's stats
sesspack.snap_sid - snaps given SID's stats
sesspack.snap_orauser('') - snaps all given Oracle user's sessions' stats
sesspack.snap_osuser('') - snaps all given OS user's sessions' stats
sesspack.snap_program('') - snaps all given program's session stats
sesspack.snap_machine('') - snaps all given machine's session stats
sesspack.snap_spid('spid') - snaps given SPID's session stats
sesspack.snap_cpid('cpid') - snaps given client PID's session stats
sesspack.snap_all - snaps all session stats
sesspack.snap_bg - snaps background processes' session stats
sesspack.snap_fg - snaps foreground processes' session stats
sesspack.snap_sidlist_internal('select sid from v$sesstat where ')
- snaps all SIDs returned by custom SQL
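The core idea behind snapshot tools like Sesspack (and Statspack) is simple: Oracle's v$ views expose cumulative counters, so taking two snapshots and subtracting them gives you the activity in between. A toy Python sketch of that diffing step (the statistic names and numbers are fabricated for illustration):

```python
def delta(snap1, snap2):
    """Per-statistic change between two cumulative snapshots of one session."""
    return {stat: snap2[stat] - snap1.get(stat, 0) for stat in snap2}

# Two hypothetical snapshots of cumulative session statistics.
before = {"session logical reads": 1000, "physical reads": 40, "parse count (total)": 12}
after  = {"session logical reads": 5500, "physical reads": 90, "parse count (total)": 12}

print(delta(before, after))
# {'session logical reads': 4500, 'physical reads': 50, 'parse count (total)': 0}
```

The deltas, not the raw counters, are what tell you what a session actually did during the interval you care about.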

If you don't want to install the Sesspack package, you can run a lightweight version called Session Snapper. He wrote the tool in response to an interesting blog post about the lack of session-level information. You can [download the script here|http://www.tanelpoder.com/files/scripts/snapper.sql]. He's written a few blog posts about Snapper.

He also introduced a tool called Perfsheet. It's not quite finished for use by any of us, and I can't find it on his web site yet; he said he would post it by tomorrow. This tool is amazing. Basically, he's built a very simple program within Excel that can connect to a database. Within the program you can run a query and then see visually exactly which wait events the SQL is spending its time on. I'll post an individual blog entry on this one once it is available.

One final thing he showed us was the really simple things you can do with formatting in SQL*Plus. I had no idea you could generate HTML-formatted output with so little effort. He provided some scripts for demonstration purposes. You can write the output to Excel as well.

RAC Be Nimble…RAC Be Quick
I finally got to meet Bert Scalzo from Quest Software. Bert authored a pretty awesome Dell/Quest paper on Oracle RAC a few years back that we used during [our benchmark at the Dell lab|http://www.dell.com/downloads/global/solutions/public/White_Papers/hied_blackboard_dell_whitepaper.pdf] back in 2006. Bert did an OK job presenting his paper, but it could have been a little better. It was a marketecture presentation on the three core tools of Quest: Toad, Benchmark Factory and Spotlight. There was one link from Quest I would like to investigate. Oh, and he mentioned a paper on Oracle waits defined by Kyle Hailey.

RAC SQL Performance Tax
I finally sat in on a presentation by Michael Erwin from Hotsos. He's considered by many in the Oracle world to be the Father of RAC. We brought him to Blackboard last year to work with ASP. Much of his presentation was about parallelism (degree of parallelism) in queries and the huge effect it can have on RAC performance.


----------------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU) | TQ |IN-OUT| PQ Distrib |
----------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 107 | 2782 | 3 (34) | | | |
| 1 | PX COORDINATOR | | | | | | | |
| 2 | PX SEND QC (RANDOM) | :TQ10001 | 107 | 2782 | 3 (34) | Q1,01 | P->S | QC (RAND) |
| 3 | HASH GROUP BY | | 107 | 2782 | 3 (34) | Q1,01 | PCWP | |
| 4 | PX RECEIVE | | 107 | 2782 | 3 (34) | Q1,01 | PCWP | |
| 5 | PX SEND HASH | :TQ10000 | 107 | 2782 | 3 (34) | Q1,00 | P->P | HASH |
| 6 | HASH GROUP BY | | 107 | 2782 | 3 (34) | Q1,00 | PCWP | |
| 7 | PX BLOCK ITERATOR | | 107 | 2782 | 2 (0) | Q1,00 | PCWP | |
| 8 | TABLE ACCESS FULL| EMP | 107 | 2782 | 2 (0) | Q1,00 | PCWP | |
----------------------------------------------------------------------------------------------------
• Table is set to Parallel 2
• PCWP = Parallel Child With Parent
• P->P = Parallel to Parallel
• P->S = Parallel to Serial

The execution plan above is what you would expect for a simple query against a table with a degree of parallelism of 2. In the example below, we have a serial-to-parallel (S->P) step. Refer to Metalink article 263414.1 for why this is bad: we are taking a serial row source and attempting to parallelize it, and there is no gain in doing so. It only makes performance worse.


------------------------------------------------------------------------------
|Id| Operation |Name |Rows|Bytes|Cost| TQ |IN-OUT|PQ Distrib|
------------------------------------------------------------------------------
| 0| SELECT STATEMENT | | 106| 2862| 3 | | | |
|*1| HASH JOIN | | 106| 2862| 3 |23,01| P->S | QC (RAND)|
| 2| TABLE ACCESS FULL|DEPARTMENTS | 27| 432| 2 |23,00| S->P | BROADCAST|
| 3| TABLE ACCESS FULL|EMPLOYEES | 107| 1177| 1 |23,01| PCWP | |
------------------------------------------------------------------------------

Hotsos Symposium 2008 Day One

Day One is in the books for the 2008 Hotsos Symposium. It was pretty crazy getting here: I arrived around midnight last night after 9 hours of airplane madness. Who would have thought that Dallas would have massive thunderstorms the first week of March? Certainly not me…

The conference started off somewhat uneventfully. There was no crazy welcome with smoke and flashing lights; Gary Goodman, the CEO of Hotsos, simply kicked off the event by covering logistics and welcoming the keynote speaker, [Cary Millsap|http://carymillsap.blogspot.com/]. For those of you who do not know Cary, he's one of the best performance engineers in the world. He wrote Optimizing Oracle Performance back in 2003, and it continues to be the most-read performance book in my library. It's quite frankly my favorite performance engineering book ever.

Cary's keynote was a retrospective of his keynote from 2003. Back in the late '90s and early 2000s, "Oracle tuning" was based on trial and error. The old way of thinking was to study the system as an aggregate, make a single change, observe the new performance characteristics, and start the process over again. Much of this process was based on improving percentages and ratios that had nothing to do with how users actually perceived performance. That's where Method-R comes into play. Method-R is about focusing on the operations/transactions that are most important to the business. With Method-R, the idea of tuning is replaced by the process of optimization.

Method-R forces the performance engineer to ask better questions:

* Are the tasks fast? (Quantify/Measure)
* Are the tasks efficient?
* Is the system efficient?
* What would happen if?

Semantic Query Optimization
The first presentation I attended was on Semantic Query Optimization by Toon Koppelaars. Two queries are semantically equivalent if they return the same answer for any database state satisfying a given set of integrity constraints; SQO exploits this by identifying weaker predicates and replacing them with stronger, sounder ones that the constraints guarantee are equivalent. Definitely take a look at the presentation. It includes tons of example scripts in the zip archive.

Leveraging Oracle’s Extended SQL Trace
The next presentation I attended was Leveraging Oracle's Extended SQL Trace Data to Expedite Software Development. The idea behind this presentation is to use the 10046 Oracle trace event as an instrumentation tool for software developers. Included in the archive is a bunch of Perl scripts to parse the 10046 trace data. The premise of the presentation is to use 10046 to identify software anti-patterns that are either not at all obvious, or incredibly obvious but senseless from a performance standpoint.
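To give a flavor of what those parsing scripts do, here is a toy Python parser (not one of the presentation's Perl scripts) that sums elapsed time per wait event from the WAIT lines of an extended SQL trace. The sample lines are fabricated but follow the general WAIT-line shape; ela is reported in microseconds on 10g and later:

```python
import re
from collections import Counter

# Matches lines like: WAIT #3: nam='db file sequential read' ela= 15313 ...
WAIT_RE = re.compile(r"WAIT #\d+: nam='([^']+)' ela=\s*(\d+)")

def waits_by_event(lines):
    """Total elapsed microseconds per wait event across the trace lines."""
    totals = Counter()
    for line in lines:
        m = WAIT_RE.search(line)
        if m:
            totals[m.group(1)] += int(m.group(2))
    return totals

trace = [
    "WAIT #3: nam='db file sequential read' ela= 15313 file#=4 block#=171",
    "WAIT #3: nam='db file sequential read' ela= 8120 file#=4 block#=209",
    "WAIT #3: nam='SQL*Net message from client' ela= 1200000 driver id=...",
]

print(waits_by_event(trace).most_common(1))
# [('SQL*Net message from client', 1200000)]
```

Even this crude aggregation surfaces the kind of anti-pattern the talk described: here the fabricated session spends far more time waiting on the client than doing I/O.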

Dealing with Software Agoraphobia
The third presentation I attended was a Solaris-focused system performance engineering presentation on software agoraphobia. It was a heavily Sun-focused presentation with an emphasis on prstat, lockstat and DTrace. The emphasis of the presentation was to focus on CPU latency rather than aggregate measures. Take a look at the speaker's blog.