Velocity Session on Evolving Performance Metrics at MSN

The previous session surprisingly ran over, which is odd for such a not-so-great session. Boo!

Past/Present/Future

Past Primary: Time to Last Byte
Today Primary: Time to visual content
Today Secondary: Time to First Byte, Onload, Page Bottom
Future Primary: Time to Render (First render, above fold) and Time to respond

The major theme from this speaker seems to be measuring rendering. It is something that is lacking today and is not cross-browser. Measuring responsiveness also appears to be limited today. What we really need is a standardized methodology and tools, with timings for both initial and continuous responsiveness.

Note to self: Our NFRs need to be in the millisecond range, not the seconds range. Why is it acceptable to have response times measured in seconds? We need to figure out how to get our product to sub-second responsiveness. Would it be possible to get a sub-200ms response time on a page?

Measurement Systems at MSN

Requirements for:

  • Engineering Cycle: measuring prototypes and internal milestones, in-depth analysis
  • Real-User Truth: measure
  • Rendering and Responsiveness
  • Geo-Distributed Infrastructure
  • Competitive: measure competitors

Synthetic Tools: Use Performance Lab and Keynote
Real User Measurement: In-page & Server-side instrumentation, plus browser plug-in (toolbar)

A/B Testing

Impact on business metrics is the ultimate truth of whether a change is worthwhile.

A/B testing, split testing or bucket testing is a method of marketing testing by which a baseline control sample is compared to a variety of single-variable test samples in order to improve response rates. A classic direct mail tactic, this method has been recently adopted within the interactive space to test tactics such as banner ads, emails and landing pages.

Significant improvements can be seen through testing elements like copy text, layouts, images and colors. However, not all elements produce the same improvements, and by looking at the results from different tests, it is possible to identify those elements that consistently tend to produce the greatest improvements.

Those who use A/B testing distribute multiple samples of a test, including the control, to see which single variable is most effective in increasing a response rate or other desired outcome. To be effective, the test must reach an audience large enough that there is a reasonable chance of detecting a meaningful difference between the control and the other tactics: see Statistical power.
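To get a feel for what "sufficient size" means, here is a rough sketch of the standard two-proportion sample-size approximation. It was not shown in the session; the function name, the 10% baseline rate, and the 5% significance / 80% power defaults are all assumptions picked for illustration.

```python
import math
from statistics import NormalDist


def sample_size_per_group(baseline_rate, relative_lift, alpha=0.05, power=0.80):
    """Approximate users needed in EACH bucket of a two-sided A/B test.

    Uses the normal approximation for comparing two proportions.
    All inputs are illustrative assumptions, not MSN figures.
    """
    p1 = baseline_rate
    p2 = baseline_rate * (1.0 + relative_lift)
    # z-scores for the chosen significance level and statistical power
    z_alpha = NormalDist().inv_cdf(1.0 - alpha / 2.0)
    z_beta = NormalDist().inv_cdf(power)
    # Sum of the per-group Bernoulli variances
    variance = p1 * (1.0 - p1) + p2 * (1.0 - p2)
    return math.ceil((z_alpha + z_beta) ** 2 * variance / (p2 - p1) ** 2)


if __name__ == "__main__":
    # Hypothetical scenario: 10% baseline response rate
    for lift in (0.10, 0.01):
        n = sample_size_per_group(0.10, lift)
        print(f"relative lift {lift:.0%}: about {n:,} users per bucket")
```

Under these assumptions, detecting a 1% relative lift takes on the order of a million users per bucket, roughly a hundred times more than detecting a 10% lift. That is probably why this kind of measurement only works on a high-traffic site.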

This method differs from multivariate testing, which applies statistical modeling by which a tester can try multiple variables within the samples distributed.

Key Point: Even a small change (less than 1%) in a business metric can be huge in the scheme of things.

Interesting Notes

First note is that MSN uses jQuery. One thing they do is load it from a CDN. The second thing to note is that they load it asynchronously, using a small early-stage JS library to do the loading. This ends up being a net-zero increase in the size of their inline JS. Curious whether we could take a similar approach with our JS library.

Another note is about embedding thumbnails. They use data URIs to embed thumbnails within the base page, placed at the end of the HTML (with chunked transfer encoding) to avoid blocking rendering of the textual content. This eliminated round-trips and extra TCP connections, for a gain of about 200 to 500 ms in performance.
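For reference, building a data URI is just base64-encoding the image bytes and prefixing the MIME type. Below is a minimal sketch of that step, not MSN's implementation. The helper names and the placeholder bytes are made up for illustration.

```python
import base64
import html


def to_data_uri(image_bytes, mime_type="image/png"):
    """Encode raw bytes as a base64 data URI (hypothetical helper)."""
    encoded = base64.b64encode(image_bytes).decode("ascii")
    return f"data:{mime_type};base64,{encoded}"


def inline_img_tag(image_bytes, alt, mime_type="image/png"):
    """Build an <img> tag whose src is the embedded thumbnail itself."""
    src = to_data_uri(image_bytes, mime_type)
    return f'<img src="{src}" alt="{html.escape(alt, quote=True)}">'


if __name__ == "__main__":
    # Placeholder bytes (the PNG file signature only), not a real thumbnail
    fake_thumbnail = b"\x89PNG\r\n\x1a\n"
    print(inline_img_tag(fake_thumbnail, "story thumbnail"))
```

One trade-off to keep in mind: base64 makes the payload about a third larger than the raw bytes, so this is mostly a win for small images where the saved round-trip outweighs the extra bytes.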
