December 20th, 2020
Over my career in software engineering management, there are many things that I found out the hard way. Many lessons have been learnt through trial and error and first-hand experience in delivering software. I have used these lessons to streamline my development teams' workflow and ability to deliver effectively.
By the end of this article, you will have learned how to remove friction from your development process, hold folks accountable for their contribution and enable your teams to deliver software more effectively.
OK, so I know software development doesn't involve horses or any other animals (apart from the rubber duck), but this lesson is important, and the tacky analogy is permitted. Many organisations want the focus to be on delivering software, which seems reasonable. However, to deliver software effectively, some software needs to be re-written or refactored to accommodate the new requirements or to remove code that always causes problems. When teams keep ploughing ahead without taking the time to refactor or replace bad code, we call this flogging a dead horse.
Code is complex
Software systems that have been around for a long time have a lot of code. Not all code is created equal. We have some good code, and we have some not so good code. Code is organised into methods and functions. These methods and functions are organised into classes or libraries. These classes and libraries can be used in many different projects to provide the features that software needs to operate.
As a result, a complex web of dependencies can be created in software systems, much of which the original developers would not have envisaged when they cut the first code. Each new developer re-uses these methods or functions in slightly different ways to deliver new functionality. Sometimes these functions may be changed to deliver the requirements that the business now desires.
Changes become difficult
Successive generations of these changes result in situations where things don't quite work as they should. The developers always seem to need to make little hacks or workarounds to get the code to work as it should. Making changes to this code becomes hard without totally breaking something else.
Changes become expensive
Consequently, whenever you need to make changes to your software, it ends up being much more expensive because of the additional test time required, not to mention the number of bugs that slip out into production.
In these situations, you need to bite the bullet, replace the problem code in question and move on. The longer you tolerate code that is toxic to your developer velocity, the more time and money you are throwing down the toilet.
How do we go about doing this, I hear you say? Well, there are a few things that need to be addressed before you jump into cutting code.
Understand the scope of the change
It is important to have a full understanding of the scope of the change. This issue may have been logged as a technical debt item, in which case a reasonable amount of analysis will already have been completed. If not, then we need to ensure that your engineering leads have performed a proper analysis and design exercise to ensure they have full visibility on the change.
Design a solution
Once we understand the full scope of this problem, ensure your team takes time to do a proper solution design. Make sure all the requirements are documented upfront, with particular attention paid to non-functional requirements such as security, performance and load requirements. Once this is complete, your team can go ahead and come up with a solution design. Ensure this design is peer-reviewed by other engineering, quality and architecture leads so that we have ticked all the boxes in terms of our solution design.
Ensure adequate test coverage
When the change is fully understood and our design is complete, we need to ensure we have adequate test coverage (unit testing, integration testing, automated regression) in place before we start cutting code. It is important to ensure we fully understand what's expected of the code's functionality, so the new code can fully match it and perform as expected. Without full test coverage in place (especially unit tests), big changes to the code base become problematic.
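One way to pin down "what's expected of the code's functionality" before replacing it is to record the current behaviour as a set of cases and run both the old and the new code against them. The following is only a minimal sketch of that idea: `legacy_discount`, `new_discount` and the recorded cases are hypothetical stand-ins invented for illustration, not code from any real system.

```python
def legacy_discount(order_total, is_member):
    """Hypothetical stand-in for the problem code that is being replaced."""
    if order_total > 100:
        return 0.10 if is_member else 0.05
    return 0.02 if is_member else 0.0


# Behaviour recorded from the existing code before any change is made.
# Each entry is (arguments, expected result). The values are made up.
RECORDED_CASES = [
    ((50, False), 0.0),
    ((50, True), 0.02),
    ((100, True), 0.02),
    ((150, False), 0.05),
    ((150, True), 0.10),
]


def behaviour_mismatches(implementation):
    """Return every recorded case where the implementation disagrees."""
    mismatches = []
    for args, expected in RECORDED_CASES:
        actual = implementation(*args)
        if actual != expected:
            mismatches.append((args, expected, actual))
    return mismatches


def new_discount(order_total, is_member):
    """Hypothetical replacement: same behaviour, expressed as a lookup table."""
    rates = {
        (False, False): 0.0,
        (False, True): 0.02,
        (True, False): 0.05,
        (True, True): 0.10,
    }
    return rates[(order_total > 100, is_member)]


# Both implementations should agree with the recorded behaviour.
print(behaviour_mismatches(legacy_discount))
print(behaviour_mismatches(new_discount))
```

If the replacement changes behaviour by accident, the mismatch list shows exactly which recorded case broke, which is much cheaper to find here than in production.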
Go ahead and make the change
Once all this prep work is complete, the team can go ahead and implement the change. Of course, your team will have their procedures in place, and with the prep work completed, these changes will run much more smoothly.
The next lesson is to ensure you have a streamlined development workflow in place. Development workflow refers to the tooling, environments and processes required to get a change completed and deployed into production. Some people would refer to this as DevOps, and I would be happy for that moniker to be applied. Taking cues from the lean movement, especially around removing waste from a system, we are looking to make the pathway from code complete to production as short and frictionless as possible.
Now we are not advocating any shortcuts in our quality processes, but rather ensuring that all of these processes are as automated and streamlined as possible, and that we have suitable environments and test data in place to facilitate testing and validation.
The key measure of the speed of the development workflow is what I call developer velocity. Developer velocity refers to the speed with which software changes flow through the development workflow and into production.
Now many things can affect developer velocity. I have cherry-picked a couple of these here to discuss in more detail.
Environments / Test data
The lack of suitable environments and test data to support the test process is a major killer of developer velocity. If the developers need to wait for an environment to be provisioned, or if the quality folk need to create test data from scratch each time a change is completed, the whole workflow slows down.
Documentation is outdated
I know that many people feel that in the age of agility, documentation is no longer important. But I am here to respectfully disagree. Developers, business analysts and quality folks all need documentation that is up to date to do their jobs. When this documentation is outdated or non-existent, developer velocity is slowed because we need to re-confirm requirements or business processes before work starts. This takes extra time, especially when this documentation should have been ready well in advance.
Complex version support
One real killer of developer velocity is when the developers need to deploy new changes to old versions of the codebase. This is often the case with legacy software vendors and is not so much of an issue with cloud vendors. Having to port new features to old software versions is a nightmare for developers and testers alike. To be done correctly, it requires more environments and test effort for very little gain in terms of the software as a whole.
In this instance, we need to make changes to the development workflow to remove as much waste as possible. Below I have listed a few of the major items which will improve your development workflow.
Automation is your friend
Automation is a great place to start to streamline the development workflow and thereby improve development velocity. Below I have listed several of the key areas of automation needs:
Streamline version support (extra points for cloud)
A bit of a business decision here, but it doesn't make much sense to try to support multiple versions of your software for different clients. Standardisation is the key here. Make some difficult choices around the number of prior versions and potentially different platforms to support (non-web).
Documentation is kept up to date
A key piece of analysis is required to determine the state of the documentation available for your systems. Also, we need to define the documentation that will be updated and by whom. Documentation updates should then become part of the team's definition of done. Documentation should be stored in a standard repository that is accessible by all stakeholders.
Measure developer velocity
Keep tabs on your developer velocity and look to further improve and streamline. It's a simple metric that measures the time, in days, from a requirement being delivered to its deployment in production. The lower the number, the better. Keep your measurements meaningful by ensuring a like-for-like measurement, removing public holidays, weekends, etc. Make this one of the measures you track. What types of things are affecting your team's developer velocity?
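To make the metric concrete, here is a minimal sketch of how it could be calculated. It assumes you can pull two dates per change (requirement delivered, deployed to production) from your tracking tool. The function names, the sample dates and the choice to average across changes are my own assumptions for illustration.

```python
from datetime import date, timedelta


def working_days_between(start, end, holidays=frozenset()):
    """Count working days after `start`, up to and including `end`.

    Weekends and any dates in `holidays` are skipped to keep the
    measurement like for like.
    """
    if end < start:
        raise ValueError("end must not be before start")
    days = 0
    current = start
    while current < end:
        current += timedelta(days=1)
        # weekday() returns 0-4 for Monday to Friday
        if current.weekday() < 5 and current not in holidays:
            days += 1
    return days


def developer_velocity(changes, holidays=frozenset()):
    """Average working days from requirement delivered to production."""
    if not changes:
        raise ValueError("at least one change is required")
    durations = [
        working_days_between(delivered, deployed, holidays)
        for delivered, deployed in changes
    ]
    return sum(durations) / len(durations)


# Hypothetical sample: (requirement delivered, deployed to production)
changes = [
    (date(2020, 12, 1), date(2020, 12, 8)),
    (date(2020, 12, 7), date(2020, 12, 10)),
]

print(developer_velocity(changes))  # 4.0
```

Tracking this number release over release matters more than its absolute value, since the goal is to see whether workflow changes are shortening the path to production.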
The third lesson I have learned is that we cannot afford to hide folks who are poor performers. Everyone is hired to perform a given role. Their job description should be well defined and their objectives understood, so as to remove any confusion around their contribution. If underperformance continues, it is your role as a manager to step in and address this.
Businesses today, more so than ever, are lean and mean, so when people don't perform, it has a massive impact on the entire team. All teams have set objectives that they need to meet. When we are missing the contribution of any members of the team, it becomes difficult and almost impossible to meet those objectives.
Now where this becomes nuanced is performance issues that stem from someone being bullied or otherwise discriminated against in the workplace. These types of issues are separate from those of performance, but nonetheless affect the team's ability to meet their objectives.
There are a few different scenarios where we see poor performance in our teams. As always, this is not an exhaustive list and I will walk through these below.
Doesn't enjoy their role/team
Often we see performance issues from people that are not enjoying their role. Folks that don't enjoy their work or indeed don't enjoy working with particular people on their team will find it difficult to perform at their best. These people are otherwise very capable folks and do not have any trouble performing the work required. They just hate doing it. It is a case of a good person but the wrong role.
Can't do the job - ever
In this category, the person's behaviour isn't a problem; it is just that they cannot do the work required as part of their job description. No amount of coaching or training can make this work. It could be a case of someone fibbing on their resume or potentially someone attempting a stretch role which doesn't work out. Again we have a situation where we have a good person doing the wrong role.
Can't do the job - now
Now this person is very similar to the person above. Again their behaviour isn't a problem. It is that they cannot do the work required as part of their job description. No amount of coaching or training can make this work. Now in this instance, the role or technology has changed, making their job more difficult or different from what they signed up for originally. Again we have a situation where we have a good person doing the wrong role.
Personal or health issues
In this instance, we have great people who were initially doing a great job. However, this falls away due to health or personal issues taking their toll. These issues keep going for an extended period and severely impact the contribution that a person makes. We have a good person in the right role who, through circumstances outside of their control, is struggling to perform to the level agreed in their job description.
Problems with poor performance need to be addressed before they drag on for too long and affect the team's ability to deliver against their objectives. Time is the only thing that you cannot purchase more of, so once the time has passed, it is extremely difficult to make up for the lost productivity and output.
Set expectations around contribution
Set clear expectations, responsibilities and contribution to the overall success of the team and company. A good way to do this is via job description and objectives.
Following on from the above point, all of your folks should have a current job description for their role. A current job description is non-negotiable. If you think about it for a second, it is very difficult for someone to achieve in their job if they have no idea what their job is all about.
Make sure the job description (as a minimum) spells out clearly:
There are more things you can fit into job descriptions, but it is important not to overcomplicate things.
Similar to job descriptions, objectives provide a firm understanding of the expected contribution for the coming year. Objectives are a bit of a fine art in management circles. Some objectives are not detailed enough and don't push the team hard for results, whilst others are too detailed and strict, which can also stifle achievement and demotivate staff. The best way to tackle objectives is to have an overall vision for what the team will need to deliver and allocate objectives on this basis.
Talk to them
Issues with staff performance should be addressed early and discussed often. A great way to manage staff and provide feedback is via regular one on ones. These regular discussions should be open and frank around expectations, allowing both manager and staff member to be on the same page.
Don't hide poor performers
Make sure you are also having regular conversations with your own manager around your direct reports' performance. Make sure you mention problems early to avoid your manager being surprised when things go bad. Your manager may have some great ideas to help you get this person back on track, so it is important to involve them early on.
Create an action plan
During your discussions with your folks around their performance, an action plan should be created to break down the improvements required into manageable chunks to make them easier to achieve. The action plan should focus on items that the staff member will do, and also on the support you will provide. This support could be in the form of development opportunities, training, etc. An important thing to note is that dates and deadlines should be added to the action plan to provide a timeframe for improvement.
If we see no improvement in the performance of the staff in question, we should look towards the various HR disciplinary actions. These vary from company to company, and indeed with the legislation of different countries. HR discipline is indeed a difficult step to take. However, we owe it to the team and the company in general to ensure these problems are resolved.
Health / Personal issues
For health and/or personal issues, look for options to support the person in question. Ideally, these folks could be backfilled to enable you to keep on track with delivery while they get themselves well. If this is not an option, those discussions you were having with your line manager come in very handy. Continue to make your case around backfill and assistance required.
The fourth lesson I will cover concerns test coverage. Now, as everyone knows, there is no way we can get to 100% test coverage. So we need to be selective about which parts of our code we focus on for testing. The most high-risk areas of our system must be covered under testing. Not having your most high-risk areas protected with test coverage is like a builder working on a tall building without the appropriate scaffolding.
I define high-risk areas of a system as those areas that will cause the biggest impact to our users in case of failure. These areas should have test plans/test cases and be included in regression suites. In addition, they should also be targeted with unit testing at the code level and automated testing where possible. To be able to do this, we need to understand what these high-risk areas are.
The main problem we face here is that for most software teams, it is impossible to test every piece of code we have. We don't have enough hours in the day or resources for a full suite of automation testing. Sometimes (actually all the time) we inherit codebases which have poor test coverage. But we still need to deliver business value in the way of code changes to our systems.
Only so many hours in the day
We live in a world with a set number of hours each day. We cannot test every single line of code we have. It is true that setting up unit testing at the code level and test automation can get us closer to the full test coverage we desire. However, this will never equate to full test coverage.
Some code needs more attention
In all systems, there are areas of code which need more attention. This is the code that always has bugs after a release. This is the code that always causes problems before a release. This code always seems to command the attention of the developers more so than other areas of the system. But is it really important?
Some code is not important
Some code can command the attention of the developers by always breaking or being difficult to maintain. However, the business can survive without having this function or feature available to them. While not a total waste of time, focusing effort in these areas is less important than other areas of the system which have a bigger impact on the business.
Some code is important
Some code, on the other hand, is very important for the business to operate. This code relates to functionality that is business critical and needs to operate correctly 100% of the time. These functions not operating correctly can have a bottom-line impact to the business or in some cases, depending on the system could put lives at risk. It is important to consider
How do we know what is important and what is not?
At the risk of repeating myself here, we need to ensure that our software system has all the high-risk areas under test coverage. This means test cases written to fully test these areas, and these test cases included in regression test suites or test planning documents. Additionally, these areas of code need to be fully unit tested and covered with automated integration tests to ensure problems don't slip through the cracks.
The process we should follow in this regard is to perform a risk analysis on our code base and rank code (features/functions) in risk order, with the highest risk items at the top. Then we need to work our way down this list and ensure all our highest risk code is tested properly. So where do we start?
Inventory of features
Have a clear inventory of all code functions in the system. I know this is a tough one to implement because some systems are very big and it is difficult to do a full inventory. I like to use user story maps to give me a high-level view of the system under development and use this as an easy way to visualise the inventory of features. Once you have this inventory of system functionality together, you can start to rank according to risk.
Perform risk analysis on code inventory
Work through the functional inventory and for each item, perform a risk assessment based on your company's risk framework. Speak to your corporate services department to get these details. Risk frameworks will vary from company to company, but they will identify various risks and associated classifications for these.
Go through the entire list of features and rank them all according to your company's risk framework. Order the list with high-risk items at the top and the lowest at the bottom. This becomes the order for your quality team to work through. One final step you should undertake is to run this list past your business stakeholders to ensure they agree with your team's assessments. This could uncover some differences based on their experience or point of view.
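As a rough illustration of the ranking step, here is a small sketch. Your company's risk framework will define its own classifications, so the scoring used here (likelihood multiplied by impact, each on a 1 to 5 scale) is only an assumed placeholder, and the feature names and scores are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Feature:
    name: str
    likelihood: int  # 1 (rare) to 5 (almost certain); assumed scale
    impact: int      # 1 (minor) to 5 (severe); assumed scale

    @property
    def risk_score(self):
        # Assumed scoring rule; substitute your company's risk framework.
        return self.likelihood * self.impact


def rank_by_risk(features):
    """Return the features ordered from highest to lowest risk."""
    return sorted(features, key=lambda f: f.risk_score, reverse=True)


# Hypothetical feature inventory
inventory = [
    Feature("Report export", likelihood=4, impact=1),
    Feature("Payment processing", likelihood=3, impact=5),
    Feature("User login", likelihood=2, impact=5),
]

ranked = rank_by_risk(inventory)
for feature in ranked:
    print(feature.risk_score, feature.name)
```

Note how "Report export" fails most often in this made-up data yet lands at the bottom of the list, because its impact on the business is low.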
High-risk items covered
Ensure all high-risk functions of the system are sufficiently covered under test (unit, regression, etc.). This risk ranking lets the team focus on the important areas of the system without worrying so much about the non-important items.
The final lesson I will cover today is that the business should always set the priorities for the development team. As much as we think we know what's right for the development directions of the business systems, we don't. The business folks are the people who are employed to drive the direction of the business. Therefore they are best placed to judge business value and priorities accordingly.
It is a lesson I have learned over the years to ensure the business folks are driving the development priorities. There are a couple of main reasons here behind this. 1) They know what is important to the business because they live and breathe this every day. 2) They will get behind and support the things they want, making change management easier.
I will walk through a couple of scenarios which illustrate this.
In the know
Business folks know what the business needs. They are involved in strategy/planning sessions to plot a way forward for the business. These folks are also on the front line and see the impact technology decisions have on their operations. With this in mind, while the business folks may not have the technical background to implement a solution, they do have a great understanding of the problems they need to solve.
Support for what's important
The business will support what they need. If there is a critical change required, the business will support pushing this through development, testing and change management. For initiatives that the business deems unimportant, the business will not find the time to support these changes. So changes advocated by IT will struggle to get traction with the business if these are not a slam dunk in terms of meeting business needs.
Business prioritisation process
The key element here is to establish a process to collect, triage and prioritise requirements with the business. The business should decide on the priorities the development team delivers. These prioritisation meetings should occur on a cycle that matches the development release cycle. In my current company, this is a quarterly cycle, but it can be more frequent to meet your company's needs.
Technical debt items are also important to raise and prioritise with the business. It is the development team's role to ensure these important items are conveyed to the business with the priority required.
Planning for releases
Once we have the priorities of the business, the development team can plan out how to deliver these into the release cycle. This is an important step because the software won't write itself, you know. It may also involve delivering some business priorities ahead of others due to delivery considerations. For example, say the first four items on the business priority list are all large items, but the 5th item is small. In this instance, we could plan to deliver items 1, 2 and 5 in the next release cycle, as this would fit with our resourcing profile. This planning process will inform these decisions.
A key element of the business prioritisation process is ensuring that approved delivery plans have been communicated to all stakeholders. If the entire business is not on the same page in terms of delivery of these priorities, we will waste time communicating this time and time again, not to mention the confusion the business users will feel in being left in the dark around plans. A clear-cut communication strategy will address this and ensure everyone is on the same page.
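The "items 1, 2 and 5" example can be sketched as a simple pass over the priority list: take each item in business priority order if it still fits the remaining capacity, and skip it otherwise. The effort figures and the capacity of 20 are hypothetical numbers chosen to reproduce the example. Real planning will also weigh dependencies and skills, which this sketch ignores.

```python
def plan_release(backlog, capacity):
    """Pick items in priority order that fit the release capacity.

    `backlog` is a list of (priority, effort) tuples, already sorted by
    business priority. Effort and capacity share one unit (e.g. days).
    """
    planned = []
    remaining = capacity
    for priority, effort in backlog:
        if effort <= remaining:
            planned.append(priority)
            remaining -= effort
    return planned


# Hypothetical backlog: four large items followed by one small item
backlog = [(1, 8), (2, 8), (3, 8), (4, 8), (5, 2)]

print(plan_release(backlog, capacity=20))  # [1, 2, 5]
```

Items 3 and 4 are not dropped; they simply roll into the next release cycle, where they sit at the top of the list.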
I hope you have enjoyed this article around the lessons from a software engineering manager. This article highlights some key considerations that should always be front of mind for any software engineering manager. These boil down to lessons around developer efficiency, code quality, test coverage and business prioritisation. Applying the key strategies mentioned in each section will put your engineering team on the path to delivering great software.