
Continuous Delivery…Continuous Integration…Continuous Deployment…How About Continuous Measurement?

I spend a lot of my free time these days mentoring startups in the Washington, DC and Baltimore, Maryland markets. I mentor a few CEOs who are building software for the first time, as well as a few folks in the roles of VP of Engineering or Director of Development. It’s fun and exciting in so many ways. I feel connected to a lot of these startups and personally feel a lot of satisfaction mentoring some really great people, who are willing to put it all out there for the sake of fulfilling an entrepreneurial spirit.

I’m not just partial to startups. I enjoy collaborating with peers and colleagues that work at more tenured companies. I think it’s important to get alternative perspectives and different outlooks on various subjects such as engineering, organizational management, leadership, quality, etc…


For about four years or so there’s been a common theme amongst many of my peers and the folks I mentor. Everyone wants to be agile. They also want to be lean. There’s a common misconception that agile = lean. Yikes! I’ve also noticed that a lot of them want to follow the principles of Continuous Delivery. Many assume that Continuous Delivery also means Continuous Deployment. The two are related, but they are not one and the same. Many of them miss that Continuous Integration is development oriented while Continuous Delivery focuses on bridging the gap between Development and Operations (aka…the DevOps Movement). Note: DevOps is a movement, people…not a person or a job.

The missing piece…and I say this sincerely, by the way…is that there still remains this *HUGE* gap with regards to “What Happens to Software In Production?”. My observation is that the DevOps movement and the desire for being Continuous prompted a lot of developers and operations folks to automate their deployments. The deployments themselves have become more sophisticated in terms of the packaging and bundling of code, auto-scaling, self-destructing resiliency tools, route-based monitoring, graphing systems galore, automated paging systems that make you extra-strong cappuccinos, etc…Snarky Comment: Developers and Operations Engineers can’t be satisfied with deploying an APM and/or RUM tool and calling it a day.


Continuous Measurement is really what I’m getting at. It’s not just the 250k metrics that Etsy collects, although that’s impressive or maybe a little obsessive to say the least. I would define Continuous Measurement as the process of collecting, analyzing, costing, quantifying and creating action from measurable data points. The consumers of this data have to be more than just Operations folks. Developers, architects, designers and product managers all need to consume this data. They *need* to do something with it. The data needs to be actionable and the consumer needs to be responsive to the data and thoughtful going forward for next generation or future designs.
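To make the idea a bit more concrete, here’s a rough sketch of what a single “continuous measurement” might carry beyond the raw number. The field names are mine, invented purely for illustration, and not taken from any particular tool:

```python
# Hypothetical shape of one continuous-measurement data point.
# Field names and values are illustrative only.
from dataclasses import dataclass

@dataclass
class Measurement:
    name: str         # what was measured, e.g. "checkout.render_time_ms"
    value: float      # the observed value
    cost_usd: float   # estimated compute cost attributed to this work
    consumer: str     # who must act on it: developer, architect, PM...
    action: str       # what the data should drive next

m = Measurement(
    name="checkout.render_time_ms",
    value=840.0,
    cost_usd=0.0004,
    consumer="product manager",
    action="decide whether the new recommendations widget earns its cost",
)
print(m)
```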

In the state of Web Operations today, companies like Etsy or Netflix make a tremendous amount of meaning from the data they collect. The data drives their infrastructure and operations. Their environments are more elastic…more resilient and most of all scalable. I would ask some follow-up questions. For example, how efficient is the code? Do they measure the Cost of Compute? (aka: the cost to operate and manage executing code)

Most companies don’t think about the Cost of Compute. With the rise of metered computing, it’s striking to consider the lost economic potential and the implied costs of inefficient code. Continuous Measurement should strive to recover that lost economic opportunity (aka…lost profit). Compute should be measured as precisely as possible at the service, feature and even patch-set level.
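To put some (entirely made-up) numbers behind that, here’s a rough sketch of what costing compute at the service and feature level might look like. The rate, service names and usage figures are my own assumptions, not real data:

```python
# Hypothetical back-of-envelope Cost of Compute calculation.
# Rates, service names and usage figures are invented for illustration.

HOURLY_RATE = 0.12  # assumed cost of one CPU-hour of metered compute

# CPU-hours consumed last month, broken down by (service, feature)
usage = {
    ("checkout", "apply_coupon"): 1_800,
    ("checkout", "calculate_tax"): 4_200,
    ("search", "autocomplete"): 9_600,
}

def cost_of_compute(cpu_hours, hourly_rate=HOURLY_RATE):
    """Dollar cost of the compute a service or feature actually burned."""
    return cpu_hours * hourly_rate

for (service, feature), hours in usage.items():
    print(f"{service}/{feature}: ${cost_of_compute(hours):,.2f}")
```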

A lot of software companies measure the Cost to Build. Some companies measure the Cost to Maintain. Even fewer measure the Cost to Compute. Every now and again you see emphasis placed on the Cost to Recover. Wouldn’t it be a more complete story with regards to profit if one were able to combine the Cost to Build with the Cost to Maintain and the Cost to Compute?
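As a hedged illustration of that more complete story (every figure below is invented), combining those costs against the revenue a feature earns might look something like this:

```python
# Hypothetical profitability view of a single feature.
# All numbers are made up purely to illustrate combining the costs.

cost_to_build = 120_000    # one-time engineering effort to ship the feature
cost_to_maintain = 30_000  # yearly support, bug fixes and upkeep
cost_to_compute = 18_000   # yearly metered compute the feature burns
revenue = 200_000          # yearly revenue attributed to the feature

yearly_running_cost = cost_to_maintain + cost_to_compute
first_year_profit = revenue - cost_to_build - yearly_running_cost
print(f"First-year profit: ${first_year_profit:,}")  # -> $32,000
```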

Maybe the software community worries about the wrong things. Rather than being focused on speed of delivery of code and features, maybe there should be greater emphasis placed on efficiency of code and effectiveness of features. Companies like Tesla restrict their volume so that each part and component can be guaranteed. Companies like Nordstrom and the Four Seasons are very focused on profit margins, but at the same time they value brand loyalty. I used to think that of Apple, but it’s painfully obvious that market domination and profitability have gotten in the way of reliable craftsmanship. I love my Mac and iPhone, but I wish they didn’t have so many issues.


I have no magic beans or a formula for success per se. I would argue that if additional emphasis were placed on Continuous Measurement, many software organizations would have completely different outcomes in their never-ending quest to achieve Continuous Delivery, Continuous Integration and Continuous Deployment. It just takes a little bit of foresight to consider the notion that Continuous Measurement is equally important.


Quality is Free

I’ve been thinking a lot lately about why everything I have to assemble for my daughter comes with extra parts. It used to be that you would buy something that required assembly and by the time it arrived at your house it was DOA because of a missing or broken part. Nowadays it’s pretty difficult to find a doll house or a toy car requiring assembly that arrives broken or with missing parts. Why, you might ask? Well, because it’s a lot harder to get away with that in today’s consumer marketplace. If the quality of a product or good isn’t up to snuff, then the consumer is going to go elsewhere.

Do we feel as though we are immune to consumers making another choice? I don’t think we believe that intentionally, but often we neglect to realize that Quality is Free. (Note to self: I didn’t invent the phrase Quality is Free; it’s the title of a book by Philip Crosby). Well, maybe it’s not totally free, but it’s a whole lot cheaper.

Let’s take an example…today we received a series of emails from Engineering Services asking us to benchmark a Solaris cluster of Vista 4.2 because of a reported issue only seen on Solaris. Ordinarily this kind of request wouldn’t have been too outrageous. If the request had come in for Linux, it could have been taken care of in minutes. Because it was on Solaris, for which we have limited equipment, it required some juggling of hardware and re-arranging of schedules so Anand could work on the problem. Needless to say, as I write this blog Anand is still having trouble getting an environment up and running.

There’s nothing out of the ordinary with this example, unless you ask the question “Why are 10 clients reporting this issue when we never saw it in our own lab?” Well, if you understand the equipment on hand (limited Solaris equipment) and the amount of time it would take to do cycles on Solaris, you would know that we have done very little testing on Solaris. Most of our Unix work has been on Linux. The main reason is that our Solaris environments cost 2 to 4 times more than our Linux environments.

So let’s add up the costs. Suppose we had purchased Solaris PVT servers: we would need a minimum of five (~$9,000 each), for a total of $45,000. Factor in that we would need to run PVT cycles for Solaris, which would cost about $3,000 for one engineer to perform what would be an additional month of work during a release. We will throw in an additional $2,000 in miscellaneous expenses to make our grand total about $50,000.

From a cost perspective $50,000 isn’t all that much considering we spent several hundred thousand dollars on hardware as a department. What does the dollar value add up to when we handle these issues after the fact? Let’s forget about all of the expenses that Support Engineers and Engineering Services had to absorb. Let’s also forget that one of our engineers had to stop working on his current assignment in favor of this one.

If we focus solely on the effect this issue could have in terms of contract value, let’s just hypothesize that this issue becomes the final straw that breaks the camel’s back. We’ve had 10 clients (large and small) report this issue affecting their semester. If we take a low-end average of $10,000 per contract (we all know that these contracts are probably 2 to 5 times larger…but for fun we will go with the $10k), that’s $100,000 in lost revenue per year. Offset that against the $50,000 we never spent on equipment and we end up losing $50,000 net in year 1. If the contract value compounds over four years (which is the life we usually get out of our hardware), we would have lost $350,000, assuming no other clients left and these particular clients did not change their license level or fees. We wouldn’t have spent more than $50,000 on the equipment. According to FASB accounting standards, we could amortize the capital expenditure over the period of 4 years. That means we would have only spent about $12,500 per year. So instead of the net loss being $50k in year one, it’s more like $87.5k.
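For anyone who wants to see the arithmetic laid out in one place, here’s the same back-of-envelope math as a quick sketch; every figure is the hypothetical one from the example above:

```python
# Back-of-envelope math for the Solaris example; all figures hypothetical.

solaris_servers = 5 * 9_000   # five PVT servers at ~$9,000 each = $45,000
engineer_time = 3_000         # one engineer, one extra month of PVT cycles
misc = 2_000
equipment_cost = solaris_servers + engineer_time + misc       # $50,000

clients_lost = 10
contract_value = 10_000       # low-end average per client, per year
hardware_life_years = 4

lost_revenue_year_1 = clients_lost * contract_value               # $100,000
lost_revenue_4_years = lost_revenue_year_1 * hardware_life_years  # $400,000

# Net position: lost revenue minus the equipment money we "saved"
net_loss_year_1 = lost_revenue_year_1 - equipment_cost            # $50,000
net_loss_4_years = lost_revenue_4_years - equipment_cost          # $350,000

# Amortizing the capital expenditure over the hardware's four-year life
amortized_per_year = equipment_cost / hardware_life_years         # $12,500
net_loss_year_1_amortized = lost_revenue_year_1 - amortized_per_year  # $87,500

print(net_loss_year_1, net_loss_4_years, net_loss_year_1_amortized)
```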

There are some people who are going to read this blog and say to themselves that our problem is that we didn’t budget money to handle Solaris PVTs. That’s not the whole message I am trying to convey. Rather, I am trying to say that making the effort to address quality long before our product reaches the consumer market is a heck of a lot cheaper than waiting until the fire engulfs us. So then why do we neglect quality? We do it every day…without even realizing it…we forget to put instructions in our packaged materials…we put that cracked piece of wood at the bottom of the box…or we forget to include all of the screws and washers.

Ask yourself every chance you get…”What do we get by sacrificing quality?”