Seven Months Into Startup Life…Tell Me Again Why I Haven’t Been Blogging About It?

Yep…dead giveaway with the title. I’m starting my eighth month at Contrast Security this week. It’s been pretty hectic, fun, tiring, all-consuming. Is there another word to describe how startup life has basically taken over my life, my wife’s life and even my kids’ lives? I wouldn’t trade it for the world. I figured I would use this blog to catch my precious readers up on what’s going on.

Hiring Some Great Talent

I waited a few months before I brought in former colleagues who shared a lot of my technical and business ideology. I think it was the right thing to do, as I established some credibility and formed a good foundation with my team first. I did nonetheless bring in some excellent colleagues who worked with me in the past. So far I’ve brought in four members from my past teams, as well as two other colleagues from other companies whom I valued for their hard work and dedication.

I really should start by saying I inherited great talent. The folks on my team who were with Contrast when I joined in September were/are very talented. Who says technical talent can’t exist outside of Silicon Valley? I can attest after 15 years in the business that great tech talent can be found all over the world. Keep an open eye and an open perspective…then you will find them.

Move to AWS…It Really Will Make a Difference

The best part about technology is ubiquitous access. Nearly all of our technology that we use and build is cloud-based. In fact, it’s 100% cloud based. I don’t think I would ever want to host technology again given what I know today about costs, management and simplicity.

Right now we are about 30% Atlassian Cloud, 30% AWS cloud, 30% Firehost Cloud and the remaining 10% at various point technology clouds like Office 365, New Relic and Balsamiq (yes…Balsamiq has a SaaS version of their product). Our goal is to make our AWS investment over 60% of our cloud by summer. Shocking that Amazon still isn’t profitable.

Building an Office

It’s been a long and tireless effort to get us an office space. We started looking at building a space, but settled on a great location in an even cooler building at the Natty Boh Tower in Canton.

Both co-founders live in Baltimore near me. They have opinions about what they want. I have to say, I really agree with every opinion they have. The opinions kind of vary from our CEO’s opinion btw…He’s out in Palo Alto, the mecca of startups and tech companies. He outfitted a spacious open floor plan in Palo Alto’s industrial district. He was able to get furniture for like 10 cents on the dollar. No such luck in Baltimore. It’s hard to get nice furniture on a dime in Baltimore in bulk.

Don’t get me wrong, you can find items here and there. But finding a dozen desks, some chairs and a few conference rooms’ worth of goodies is tough unless you make huge compromises like going to Ikea, or somehow stumble across the goose that laid the golden egg.

We were lucky. We found the amazing people over at Hyperspace to help us outfit our furniture. They’ve been great…I gave them a budget and they managed it to a tee.

We haven’t moved into the space yet. We get the keys on Friday, May 1st. We haven’t settled on our furniture either. I’m hoping that happens this week. Once we move in, I’ll be sure to post some pics of the place…

Why It’s Important to Build a Better Product

As an engineer I’ve always felt critical of the various applications I’ve purchased or downloaded for free. If I’m going to take the time to use an application, I really want it to work. I want it to work all of the time. I don’t want to have to be super technical to figure some ridiculous workaround. I certainly don’t want a kludgy experience. I simply want the software to work.

So I have these expectations about the software developers who build the products that I use. I expect them to build the best products. I just assume their code is tested. I assume it’s scalable and responsive. I maybe take for granted that it’s secure, but with a blind eye I assume it is secure. When I report issues, I guess I expect them to drop everything they are doing and fix my issue. Why do I have those expectations? My attitude is that if your core product doesn’t work correctly, why are you building more product? Shouldn’t your core product work first before you introduce new product?

Motivation for this Blog

For weeks I’ve been reaching out to various contacts we have at the company and then through my own personal network to find customers who are great at giving us feedback. Feedback is essential for software companies looking to expand and grow. When you are lucky to find someone who will tell you that your product has issues, but they will continue to use it…well those kinds of customers are like the equivalent of gold.

What Features Are Considered Most Important That We Stop Developing New Features?

When I joined Blackboard in 2003, we had a really feature-rich product. We had about 30 sub-systems (web apps). Each of those sub-systems had dozens to, in some cases, hundreds of use cases. We had bugs…lots of bugs. Customers reported a lot of those bugs. We had so many bugs that in my first year, we had more support engineers than we had real engineers because we couldn’t keep up with the in-bound calls from customers. Some customers would log 10+ issues a day.

We knew we had a huge problem on our hands as an engineering team at Blackboard because we wanted to build more features. We didn’t really want to touch old code. We were focused on the new features that would help us get newer, cooler, better paying customers. On the one hand, it’s great to want to land those bigger fish. On the other hand, you have a pool of dedicated customers who quite frankly were your early adopters. They took a risk on us. Some took a small risk financially and some took a bigger risk financially. All took the risk of spending their time using our product. We need to treat time like money.

I bring all of this up because when we attempted to filter through the minutiae of issues, the team came to a collective agreement about two things. We agreed that we would be responsive to customer escalations when issues tied to a set of core features and use cases were sent to support. These core features were our bread and butter features. They were features that absolutely had to be fixed. We had to be responsive, because we knew we simply couldn’t spend the time improving our test coverage. In the early days we followed 3 core principles:

Step 1: Identify our Most Important Features and Capabilities

As a team (Engineering, Product Management and Support) we sat down and categorized every feature and use case in the product. We missed a few here and there, but met again and re-prioritized the list. We defined escalation paths for reported issues that we provided to our Support team to leverage when customer issues were reported.

Step 2: When an Issue Tied to these Most Important Features Was Reported…We Agreed to Re-Prioritize Our Work and Fix It

Since Product Management was in the discussions from the beginning, they understood that new features would sometimes be delayed due to buggy legacy features. They would review the issues coming in as well and would more often than not give the thumbs up to delay a feature in favor of making a core feature more robust and reliable.
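To make Steps 1 and 2 a bit more concrete, here is a minimal sketch of what a categorized feature list and an escalation rule could look like. The feature names and the `triage` function are hypothetical, made up purely for illustration. This is not how Blackboard actually tracked it.

```python
from dataclasses import dataclass

# Hypothetical catalog of "bread and butter" features (illustrative names only).
CORE_FEATURES = {"gradebook", "assessments", "login"}


@dataclass
class Issue:
    feature: str   # feature the customer issue is tied to
    summary: str   # short description from Support


def triage(issue: Issue) -> str:
    """Route an issue: core-feature issues are escalated, the rest are queued."""
    if issue.feature.lower() in CORE_FEATURES:
        return "escalate"   # re-prioritize current work and fix it
    return "backlog"        # handled in the normal planning cycle
```

Something like `triage(Issue("Gradebook", "grades not saving"))` would come back as `"escalate"`, while an issue against a non-core feature lands in the backlog.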

Step 3: Always Make the Product Better

As we were forming this 3-step model, every single person chimed in that we should stop development in favor of test automation. Let me just say this once and only once…“That, my friends, will never happen…” Software has to be fluid. We can’t stop producing and releasing software, because that makes our software stale. Customers want small variations of change (not huge changes) on a reliable and consistent cadence. If the train were to stop, getting it back on the tracks is really…really hard.

We need to carve out time in three ways:

a) When a customer issue is reported and we fix it…we need to add automated test coverage to minimize re-introducing issues.

b) When a customer reports one issue…we should dig into shared parts of the code and make sure nothing else is broken.

c) When we build new features, we need to incorporate the time to include better automated test coverage.
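As a rough illustration of (a) and (b), here is a minimal sketch of a regression test that pins a fixed bug and also checks the neighboring behavior in the same code path. The `average_score` function and the bug it guards against are hypothetical, invented only for this example.

```python
import unittest


def average_score(scores):
    """Hypothetical function that (in this example) once crashed on an empty list."""
    if not scores:  # the fix: guard the empty case instead of dividing by zero
        return 0.0
    return sum(scores) / len(scores)


class AverageScoreRegressionTest(unittest.TestCase):
    def test_empty_list_does_not_crash(self):
        # (a) pins the customer-reported bug so it cannot silently come back
        self.assertEqual(average_score([]), 0.0)

    def test_normal_path_still_works(self):
        # (b) checks the shared code path around the fix
        self.assertEqual(average_score([80, 90, 100]), 90.0)
```

Saved in a test module, this runs with `python -m unittest`. The point is that the test ships in the same change as the fix.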

Bees With Machine Guns

So apparently I’m late to the party. A really cool tool called Bees with Machine Guns is making the rounds this week on one of my favorite podcasts (Reply All). The kind folks at the Chicago Tribune created this little app to spin up tons of EC2 micro-instances for load-testing. Who needs JMeter, Apache ab, Grinder or even commercial tools like LoadRunner or SOASTA when you could get away with a nice little tool like this? The tool is a few years old based on this blog. It’s gone through 2 releases now. It might be a worthy project to make a pull request against.

Take a look at this blog here for a full breakdown of setup and running of the tool.

Time to Start Blogging Again…Hopefully More to Come

I have been meaning to post a blog for quite some time. I’ve been busy growing my new team, building new product and establishing a new identity for myself in the AppSec space. It’s been more than rewarding to say the least. So I figured I would put up a quick blog for those few readers out there who have wondered where I’ve been all of these months.

Let’s Start with Velocity 2013

I got a call around 6:30am PST on June 21, 2013 from my former boss at Blackboard. She had been my boss for the better part of a decade and she knew I was on the West Coast, so I was surprised to get a call so early. It was one of those quick calls. It lasted 2 minutes. Basically, she said she was leaving Blackboard to move on to another job, that some changes were coming and that I would be meeting my new boss in a couple of days.

Gene Kim and DevCon 2013

A few weeks after Velocity was our own conference in July. I had been planning it for months with a few of my colleagues at Blackboard. I was to co-emcee the Developer Conference with Mike McGarr with special guest Gene Kim as our keynote. The conference was my defining moment at Blackboard. I had given dozens of talks over the years at DevCon and BbWorld, but this was a conference I had been dreaming of putting on for years. It went off without a hitch. Mike and I nailed it…our customers were psyched…Gene Kim rocked the house!

Carbon Based Units and Lifestyle Jobs

When we all got back from Vegas, many of us who were part of Bb’s core LMS division realized we were not in Kansas anymore. There was new leadership in town. The company was going to change whether we wanted it to change or not.

I don’t want to go into too many details. From BbWorld 2013 to when I left before BbWorld 2014, I saw some of the best engineers in the world…many who had spent the better part of a decade working with me, decide to pick up and leave. Most if not all of these engineers worked long days and delivered good work. In the eyes of some, they were compared to CBUs (Carbon Based Units) and their work was considered a lifestyle job because they worked from home or in remote offices.

For those of us who were close to the ground and had a finger on the pulse, we knew that this was a talented bunch of engineers, passionate about their work and loyal to the brand. I saw more engineers walk out the door than walk in. I tried to bring in some new engineers and for a short period, I was successful. In the end, I too succumbed to being miserable and had to pack up 11 years’ worth of books and memories.

Summer of Steve

I took the whole month of July off before I would start work again. My wife was kind and patient enough to let me do my own thing. It was the best 3 weeks a guy could ask for. I would start my days taking my kids to swim practice. I would go to the gym or hang out with the other parents. On Mondays I would volunteer with my daughter’s golf practice. Tuesdays through Thursdays I would volunteer at tennis. Every day I played golf. I got in 19 rounds over the course of 3 weeks.

The best part is that every day I was excited to wake up and do it all over again…

6 Weeks of Consulting

I took a 6-week tour for a gig that I will look back upon and say I was a really good consultant. I wasn’t a consultant though…I went to work with one of my longtime colleagues from Blackboard. Sadly, I feel I let him down. The job just wasn’t for me. My expectations were different than theirs. They wanted me to do one thing…I thought I was brought on to do something else…In the end, when Contrast came calling, I realized it was better to cut bait at 6 weeks before it was too late.

Starting Over Has Its Perks

The “new normal” as my wife called it was all about re-establishing the passion in my career. For 9.5 of my 11 years at Blackboard, I woke up every day excited to take on the world. Those last 18 months were quite possibly the most painful 18 months I’ve ever gone through. I won’t allow that to happen again…Mark my words 😉

It Feels Like I’m 28 All Over Again

The best part of my day starts around 6am when my alarm clock goes off. I love to get up in the morning. I love to look over my left shoulder and see my beautiful wife. I love waking my kids up to get them ready for school. As it relates to me…well, I absolutely love to get going on work. Every new day at Contrast brings a different challenge and an excitement that reminds me of when I was 28 and started to build the PerfEng team at Blackboard.

Growing Our Team…Building Some Cool Products

I was fortunate enough to inherit a good-size engineering organization (9 engineers) considering the size of our company. Most of our employees are engineers. We have a rich engineering culture. I’ve added to it in my short time. I’ve brought on 4 engineers so far. I have 2 more starting in a couple of days to get the count to 6 new engineers. It’s the right size for where we are now. We are lean…nimble…We can do almost anything we want.

A Promise…Of Sorts

It’s been entirely too long since I last blogged. It may be an empty promise, but I am going to do my best to keep blogging. This blog was more for me than anyone…If anyone got something out of it, I’m glad I could help.

The Power of Rundeck

A big part of the DevOps movement is the passion and commitment to “automate everything” and provide as much self-service as humanly possible. I’m a big believer in automation…not because I’m lazy, but rather because I have a desire to make all things repeatable, reliable and robust. I call those the “Three R’s” and they are a huge part of why I became a big believer in a 4th “R” which is called Rundeck. Note, I’m not the author of Rundeck. The awesome guys at SimplifyOps were the authors. I’m just a user, fan and admirer of cool, easy to use technology. Below is a quick passage about Rundeck…

Rundeck is an open-source software Job scheduler and Run Book Automation system for automating routine processes across development and production environments. It combines task scheduling, multi-node command execution, workflow orchestration and logs everything that happens. Access control policy governs who executes actions across nodes via the configured “node executor” (default for unix uses SSH) and does not require any additional remote software to be installed on them. Jobs and plugins can be written in scripting languages or Java. The workflow system can be extended by creating custom step plugins to interface external tools and services.

Wikipedia Comparison of Open Source Automation Tools

I’m a big fan of Rundeck for a number of reasons. My first reason is pretty straightforward. Basically, it’s a simple web application that provides the basic controls and workflow for self-service. The simple web gui is just so easy to use that anyone can understand how to use it with little training. My second reason is that it pretty much can make use of any automation/scripting framework out on the market. Third, it gives developers, operations engineers or even support staff a simple workflow for doing work on a server without ever logging into the server. Fourth and certainly not last is that it provides an audit and tracking system. There are other key things such as scheduling and reporting, which are super easy-to-use features to enjoy as well.

Source: Rundeck.org

Long before the SimplifyOps guys built Rundeck, my old team at Blackboard built an automation engine we called Galileo. My team built it in Groovy/Grails. It was a lot like Rundeck, but not as simple to contribute to and extend. It served a great purpose during its time. It helped us achieve so many of the needs I listed above. It required a listener on each destination client. Rundeck works without an installation on the client system. All that’s needed is an SSH key or simply passing login credentials for a trusted user within the script.

Crowd-Sourcing Development

One of the cool things that the SimplifyOps guys do is crowd-source their development via Trello, which is one of the best kanban boards available (for freemium btw) on the market. Their board is public for everyone to follow, vote and even contribute.

Making Time for a Side Project Using a Commitment Device

My team is getting used to my style and attitude about work. One core value I believe in is making time for other work (that’s relevant to one’s career) outside of the normal velocity of a sprint to accomplish additional learning or work. If you have a chance, take a look at my presentation about PTOn which is about applying a commitment device (i.e., scheduling of time for Paid Time On) to ensure that the work is accounted for and is not disruptive to a team’s work velocity.

A really good friend of mine (David Hafley) sent me this article today which is directly in line with my presentation about PTOn (Paid Time On). Teams (and individuals) need time to work through a work problem (project of role) or a problem that could yield incredible inspiration (project of passion). The challenge that I see with software development (engineering teams in general) is that teams focus on scheduling every ounce of time imaginable. If the team has 12 months in a year and they follow a 1-month velocity, then they have 12 units. The same applies to a 2-week velocity in which the team works off a 26-unit schedule. What I’m really getting at is that software teams tend to build utilization models that account for work and vacation. Occasionally, these work models account for training or an off-site. You get my point, which is that teams tend to over-schedule their team members like they are a bunch of Carbon Based Units.
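To put some numbers on the unit math above, here is a small sketch of a yearly plan that sets aside a few units for PTOn up front instead of scheduling every unit for delivery. The `plan_year` helper and the choice of 2 reserved units are my own assumptions for illustration, not a recommendation from the article.

```python
def plan_year(total_units: int, pton_units: int) -> dict:
    """Split a year's sprint units into delivery work and reserved PTOn time."""
    if not 0 <= pton_units <= total_units:
        raise ValueError("pton_units must be between 0 and total_units")
    return {"delivery": total_units - pton_units, "pton": pton_units}


monthly_units = 12          # 1-month velocity: 12 units per year
two_week_units = 52 // 2    # 2-week velocity: 26 units per year

# Hypothetical choice: reserve 2 of the 26 two-week units for PTOn.
plan = plan_year(two_week_units, pton_units=2)
```

With those assumed numbers, the team plans 24 units of delivery work and knows from day one that 2 units belong to learning.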

If you read the article closely, you will see it emphasizes creating “personal time” which I personally find difficult. I have a wife, kids, hobbies, etc…I will agree that finding personal time is important, but in the same vein, I would suggest that in the 40+ hours we spend at work (some 60+), we need to “find work time” for learning.

Balancing Testing versus Measurement

One of the advantages of having a SaaS application is the ability to capture true production telemetry. This telemetry consists of functional and non-functional (performance and security) data points. These data points can and should be put to use by our team to make us a more informed development team about the quality of our product. This by no means implies that live production metrics should be leveraged 100% in lieu of testing. There should be a balance of testing and measurement.

I covered my testing philosophy in one of my earliest blogs in which I stressed and advocated for the need for robust build/test pipelines complete with quality inspection (unit, static, integration and acceptance). This pipeline is nothing original or unique that I’m proposing. The pipeline is a component of Continuous Integration in which developers commit early and often. The pipeline grows in complexity and maturity in an iterative fashion with each day as the team’s commits become a robust product or module ready for deployment. Consider this early phase more of an incubation phase in which the product is nothing more than executable code, but not deployable or usable. When code is being incubated, teams should be placing more emphasis on testing and evaluation. This testing is more Unit and API, not acceptance testing.

If the product is ready for acceptance testing, then the product is ready for a deployment (synthetic or production). If the product is deployed, then it should be measured with deep telemetry (dynamic analysis) such as RUM (Real User Measurement), APM (Application Performance Management) and ASM (Application Security Management). Artifacts such as log files and live telemetry from component systems (Queuing Systems, Ephemeral Caches, RDBMS and Non-Relational Structures) should be captured and used. Why…Because the data is there. Why ignore passive data that can be analyzed, captured and organized in an automated fashion?

I can’t really explain why the data often gets ignored. It simply does because so many development organizations focus on the discrete activities of testing. They often fail to capture the more meaningful data that comes from embedding telemetry into the development process. That same telemetry data that can be captured in the testing process can be captured from live production systems. It’s like a golden egg that gets laid every day. The team has to take advantage of this goldmine of data.

I had the chance to talk with Badri Sridharan from LinkedIn about a year ago. Badri and I both ran Performance Engineering practices in our careers. We were exchanging perspectives on the current and future of Performance Engineering. During the call, Badri shared insight into a system called EKG that the Development and Operations teams introduced at LinkedIn. The blog was written by the Operations team, so it shows a lot of infrastructure data points visually. If you look toward the bottom of the blog, you will see the reference to exception counts and a “variety of other metrics”. Those other metrics as Badri explained are functional verification data points. Teams at LinkedIn can get live production data for their Canary and A/B deployments before they promote code throughout the whole system.

EKG compares exception counts, network usage, CPU usage, GC performance, service call fanout, and a variety of other metrics between the canary and the control groups, helping to quickly identify any potential issues in the new code.
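To show the general idea of a canary-versus-control comparison, here is a minimal sketch. To be clear, this is not how EKG is implemented. The metric names, the 10% tolerance and the assumption that "higher is worse" for every metric are all mine.

```python
def compare_canary(control: dict, canary: dict, tolerance: float = 0.10) -> dict:
    """Return metrics where the canary is worse than control by more than `tolerance`.

    Assumes every metric is "higher is worse" (exception counts, CPU, GC time).
    The result maps metric name -> (control value, canary value).
    """
    regressions = {}
    for name, baseline in control.items():
        observed = canary.get(name)
        if observed is None:
            continue  # metric not reported by the canary; nothing to compare
        if baseline == 0:
            worse = observed > 0  # any increase from zero counts as a regression
        else:
            worse = (observed - baseline) / baseline > tolerance
        if worse:
            regressions[name] = (baseline, observed)
    return regressions


# Hypothetical sample data for a control group and a canary group.
control = {"exceptions": 100, "cpu_pct": 40.0, "gc_ms": 0}
canary = {"exceptions": 150, "cpu_pct": 41.0, "gc_ms": 0}
flagged = compare_canary(control, canary)
```

With this sample data only `exceptions` gets flagged, since a 50% jump is well over the assumed tolerance while the CPU change is not. A real system would need per-metric thresholds and some statistics on top of this.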

I’m still learning what telemetry exists in our systems right now. I’m eager to hear from all of our teams about what data is captured, where it is stored, how it’s made actionable and how the data is brought back into the development process.