Agile in IT Support and IT Operations Teams
How can we try agile ways of working for IT operations and IT support teams?
A few colleagues and I discussed this question, and we have tried to capture our experience below.
During our engagement we devoted some time to understanding IT Operations and IT Support teams: to discover their world, and how it differs from that of the IT Product Development team.
A few questions we asked during our transformation drive:
- How do they measure their success?
- What elements of agility can we bring to those teams?
- How do they think, and how do they operate?
- How can we converge these two worlds to maximize value?
What we observed was a conflict of goals: the IT Support team measures itself by stability (less turbulence), while the Product Development team measures itself by frequent releases. More releases mean more chance of chaos!
We tried to bring harmony by getting both parties on the same page.
The IT Support team manages its work on a visual management board (Kanban board), and measures it by ticket inflow and outflow.
Flow time is a critical measurement for these teams. Another measure, used to manage system stability, is MTTR (mean time to recovery, repair, respond, or resolve).
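To make these two measurements concrete, here is a minimal Python sketch. It assumes MTTR is read as "mean time to recover" and computed as the average of (resolved minus detected) per incident; the incident timestamps, the weekly ticket counts, and the function names are made up for illustration and do not come from any particular ticketing tool.

```python
from datetime import datetime, timedelta

# Hypothetical incident records: (detected_at, resolved_at).
incidents = [
    (datetime(2024, 1, 3, 9, 0), datetime(2024, 1, 3, 10, 30)),     # 90 minutes
    (datetime(2024, 1, 10, 14, 0), datetime(2024, 1, 10, 14, 45)),  # 45 minutes
    (datetime(2024, 1, 20, 22, 15), datetime(2024, 1, 21, 0, 0)),   # 105 minutes
]


def mean_time_to_recover(records):
    """Average of (resolved - detected) over all incidents."""
    if not records:
        raise ValueError("at least one incident is required")
    total = sum((resolved - detected for detected, resolved in records), timedelta())
    return total / len(records)


def net_flow(tickets_opened, tickets_closed):
    """Inflow minus outflow; a positive value means the backlog is growing."""
    return tickets_opened - tickets_closed


mttr = mean_time_to_recover(incidents)
backlog_change = net_flow(tickets_opened=42, tickets_closed=37)
print(f"MTTR: {mttr}, backlog change this week: {backlog_change:+d}")
```

With these sample numbers the MTTR is 80 minutes and the backlog grows by 5 tickets in the week. Which of the four "R" meanings a team uses changes where the clock starts and stops, so it is worth agreeing on that before comparing numbers.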
We list below a few challenges to watch out for.
IT service and IT support teams experience many challenges.
Most of these challenges are not confined to the IT Support and IT Operations areas. They originate upstream in the organization, in IT Product Development and the IT business areas. IT Operations and IT Infrastructure teams are highly resistant to change. They are also good at creating the watermelon effect: everything is reported as going well (green on the outside) while problems remain underneath (red on the inside).
Challenges:
- Siloed organizations. IT Product Development, IT Infrastructure Support, and IT Operations are totally separate units. Each has its own goals and its own rules: a separate kingdom! They neither collaborate nor are ready to cooperate with each other. Services are fragmented among multiple departments.
- Long cycle time to serve customer requests. There are many layers within the organization, horizontal and vertical, and it takes time to traverse the long path. The release process is lengthy and manual: manual deployment, manual approval, manual steps, manual tests, etc.
- Measurement: IT Operations and IT Product Development measure different things. IT Operations measures itself on stability; IT Product Development measures itself on speed of delivery. IT Operations does not want too much turbulence, but IT Product Development would like to deploy as many releases as feasible.
- Process: IT Operations and IT Infra support teams run on heavyweight processes to reinforce stability. They tend to follow a sequential, approval-based process. IT Product Development tries to develop products rapidly with minimal process overhead. With the traditional process, time to market lengthens and response slows down.
- Skills: IT Operations and IT Infra support team members are slow to adopt new automation tools and most of the time rely on manual steps. Skills and competencies have not been modernized for automation.
- Goal: Each unit has conflicting goals, there is no easy way to align the whole team, and team members are distracted by multiple streams of work. It takes time to figure out the most appropriate way to resolve a ticket faster.
- Mindset: Most team members in IT Operations and IT Infra service work in a traditional organization with a traditional management style. These two organizations have completely different cultures compared with IT Product Development.
- Raising tickets, assigning them across departments, tracking, and following up create a lot of hand-offs and waste in the system. Lack of trust is an issue most of the time. Conflicts between IT Operations and the IT Product Development team are common.
- Team ownership is missing most of the time. Activities lie on departmental borders, and the blaming that follows causes a lot of confusion and trust issues.
- Old, outdated infrastructure, tools, applications, and processes that are very fragile to operate or deploy. There are many defects, and IT Operations teams are loaded with L1, L2, and L3 issues.
- A traditional command-and-control organization leads to a risk-averse climate where mistakes are punished politically. Team members want to stay in their comfort zone to avoid making any mistakes.
- When an application is architected, little thought is given to how easy or difficult it will be to operate or maintain, which causes waste in IT Operations.
- Too many specialist team members bloat team size and encourage siloed operation (missing cross-functional team members, missing autonomous, empowered teams). Outdated skills, a legacy fixed mindset, a traditional leadership style, and anti-agile behaviour.
- Once applications are developed and deployed, the IT Product Development team’s ownership and commitment diminish. It is then up to the IT Infra and IT Operations teams to figure out all the disruptions happening with the application. Why are IT Product Development teams not accountable for the application’s failures?
- IT Operations and IT Support teams tend to have many team members with specialized skills. Every role is justified and, in a short time, many team members are added.
- Poor customer feedback in surveys. Long delivery cycle time. Little feedback is collected after a service is delivered, customer collaboration is minimal, and collaboration among teams is poor.
- When anything goes wrong, the person behind it is punished. The system is filled with fear: team members do not feel psychologically safe to take any step that is not written in a document or an approval.
- The development team has no production-like environment, so more errors appear when features are deployed to production for the first time. This delays the release of features to the market.
Collect “as-is” data about the systems:
1. Release frequency
2. How many changes are deployed?
3. Cycle time
4. Rework time
5. Approval process time, if any
6. Build failure %, deployment failure %
7. Number of critical bugs discovered monthly/weekly
8. Recovery time from failure
9. Service availability
10. Infra usage %
All this information will reveal whether there are inefficiencies, and where.
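As a rough sketch of how a few of the items above could be turned into numbers, the Python below computes release frequency, deployment failure %, and service availability from raw counts. The function names and the sample figures are our own assumptions for illustration; real data would come from whatever release and monitoring tools the teams already use.

```python
def releases_per_week(release_count, period_days):
    """Release frequency, normalized to releases per week."""
    if period_days <= 0:
        raise ValueError("period_days must be positive")
    return release_count / (period_days / 7)


def failure_rate_pct(failed, total):
    """Share of builds or deployments that failed, as a percentage."""
    if total == 0:
        return 0.0
    return 100.0 * failed / total


def availability_pct(downtime_minutes, period_minutes):
    """Service availability: the share of the period the service was up."""
    if period_minutes <= 0:
        raise ValueError("period_minutes must be positive")
    return 100.0 * (period_minutes - downtime_minutes) / period_minutes


# Hypothetical baseline for one application over four weeks.
baseline = {
    "releases_per_week": releases_per_week(release_count=8, period_days=28),
    "deployment_failure_pct": failure_rate_pct(failed=3, total=40),
    "availability_pct": availability_pct(downtime_minutes=43.2, period_minutes=30 * 24 * 60),
}
print(baseline)
```

For the sample figures this gives 2 releases per week, a 7.5% deployment failure rate, and roughly 99.9% availability. Capturing such a baseline before any change makes it possible to show later whether the transformation actually moved the numbers.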
The figure above shows a typical organizational composition. The IT Product Development organization serves numerous business units. It is aligned with the business to produce new features, comply with new regulatory policies, etc.
IT Infra service and IT Operations are sometimes one organization and sometimes two separate organizations, depending on the volume of work these units have to deal with.
IT Infra service and IT Operations serve business continuity and ensure that all hosted applications are up and running all the time. These organizations work behind the scenes of the business. The applications and systems are directly accessed by millions of customers day and night.
All these organizations work in concert to present a better customer experience.
The agility of an organization is measured by how quickly it can deliver its offerings at any given time. It all depends on the agility of the people, systems, structure, process, and culture.
What can we work out?
We need to concentrate mainly on:
- People
- Structure
- Process
- Leadership
- Technology
As the picture above shows, all these areas have to be examined to ensure that they enable the transformation to happen smoothly. Agile coaches have to work together with the leadership team to strengthen all these areas for better value generation. Transformation is a journey: by taking various actions, we change the organizational culture.
Here is what we think we can work out to bring value to those teams.
The picture above shows what an ideal high-performance team could look like: team members come from diverse expertise groups and form a Squad with a common mission and purpose. There are no silos in this model. The Squad continuously improves to rapidly generate value for the end customer. It collaborates with other Squads to manage dependencies. Such a formation and maturity develop after working through several iterations, experimenting, and learning from mistakes. The Squad has one backlog and one common metric by which it measures its success.
Short-term Goals:
- Give training on systems thinking tools, and coach on applying them. How do we visualize the whole so that we align all the teams? Minimize friction and maximize collaboration for a common purpose. What is the common purpose when concentrating on customer-centricity?
- Centralize the tools team. Standardize tools through a central Tools CoE team.
- Maximize automation, with a 90% automation target. Identify manual steps and push to automate them (automated testing, automated deployment, etc.). Monitor in real time and automatically detect, alert on, and resolve issues.
- Encourage frequent security and reliability audits to examine the resiliency of IT Infra and IT Operations.
- Coach to maximize collaboration among IT product teams, the IT infra team, and IT operations teams (mindset, structure, process, and leadership).
- Strengthen the customer-centric mindset. Look for places where this attribute is missing. Catch behaviour that does not align with this approach.
- Inject a continuous improvement culture. Track the monthly actions taken by the teams to improve the overall system. Move from doing Agile to “being” Agile.
- Drive Value Stream Mapping and find the flow time (find the bottleneck, optimize the flow).
- Look into the defect trend and find patterns. Take action to mitigate such defects (what are the issues due to?).
- Gather data on Mean Time to Detect, Mean Time to Recover, etc., and ask what these data tell us. Then go deeper, beyond the data: what do the culture, mindset, people, and process say? What can we improve?
- Look for a fail-fast culture and mindset. The leadership style needs to encourage experimenting and learning from the results. Coach people to do more of this.
- Create a one-team concept, “you build it, you run it”: everyone is responsible. Minimize hand-offs. Create a virtual team from IT Development + IT Support + IT Operations. Create a cross-functional, self-organizing, self-driven, autonomous team. Invite all team members to the various iteration events, collaborate as much as possible, and align on the common goal.
- Create visual management and information radiators everywhere. Do Gemba walks with the leadership team and inspect the data. Give awards to the teams that demonstrate the most improvement.
- Apply Kanban in all IT Operations and IT Service teams; watch the WIP limits, and measure lead time and cycle time. Manage the batch size.
- Maintain proper value, flow, and quality.
- Coach for pull-based teamwork. Look for hero culture and minimize it. Create a culture of collaboration.
- Build a Squad of transformation evangelists who will drive the culture transformation.
- Conduct Lean Coffee sessions to understand the challenges, the impediments, and the things that are working.
- Communicate, communicate, and communicate. Use Open Space Agility to solve complex organizational problems. Involve everyone in solving the problem. Town halls, hackathons, blogs, etc. should be used to bring about change.
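The Kanban item in the list above mentions WIP limits, lead time, and cycle time. The following Python sketch shows one common way to read those terms, which we assume here: lead time runs from the moment a ticket is raised to the moment it is done, cycle time runs from the moment work starts to the moment it is done, and a team pulls new work only while the number of in-progress tickets is below the WIP limit. The `Ticket` class, its fields, and the sample tickets are hypothetical.

```python
from dataclasses import dataclass
from datetime import date
from typing import List, Optional


@dataclass
class Ticket:
    key: str
    created: date                  # when the request was raised
    started: Optional[date] = None  # when someone began working on it
    done: Optional[date] = None     # when it was delivered


def lead_time_days(ticket: Ticket) -> int:
    """Customer view: from request raised to done."""
    return (ticket.done - ticket.created).days


def cycle_time_days(ticket: Ticket) -> int:
    """Team view: from work started to done."""
    return (ticket.done - ticket.started).days


def in_progress(tickets: List[Ticket]) -> List[Ticket]:
    """Tickets that have been started but are not yet done."""
    return [t for t in tickets if t.started is not None and t.done is None]


def can_pull(tickets: List[Ticket], wip_limit: int) -> bool:
    """Pull new work only while in-progress work is below the WIP limit."""
    return len(in_progress(tickets)) < wip_limit


board = [
    Ticket("OPS-1", created=date(2024, 3, 1), started=date(2024, 3, 4), done=date(2024, 3, 6)),
    Ticket("OPS-2", created=date(2024, 3, 2), started=date(2024, 3, 5)),
    Ticket("OPS-3", created=date(2024, 3, 3), started=date(2024, 3, 6)),
    Ticket("OPS-4", created=date(2024, 3, 5)),
]

print(lead_time_days(board[0]), cycle_time_days(board[0]))
print(can_pull(board, wip_limit=2))
```

For OPS-1 the lead time is 5 days and the cycle time is 2 days; the 3-day gap is time the ticket spent waiting before anyone touched it. With two tickets already in progress and a WIP limit of 2, the team should finish something before pulling OPS-4.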
Long-term Goals:
- Change the organizational structure to create the one-team concept. Reserve a percentage of capacity for core development and a percentage for support activities. Adopt the mindset that everyone is capable of doing most of the work, and reskill people to that level.
- Look at the organization’s architectural strategy. Look to move away from monolithic architecture towards microservice architecture. Bring data and evidence to demonstrate that, because of poor architecture, team members are spending a lot of time on IT operations. Architect for better operation. Architect for security. Architect systems for resiliency.
- Reward and recognize people for rapid value delivery and for bringing the changes that enable the organization and teams to achieve it (change people’s thinking, change the process steps, change the tools and infra).
- Coach to adopt a one-team mindset. Coach to take an ownership mindset. Coach to overcome the silo mindset.
- Maintain optimum team loading (we have often seen support team members working on many applications, where the chance of making mistakes is high).
- Maximize automation wherever feasible: build automation expertise, train team members, and procure automation tools. Minimize human dependency and manual work wherever feasible. Create a long-term plan for this.
- Look for opportunities to improve team ownership and team commitment, and to do more with less (work smart). Improve these through the Kaizen approach.
- Involve the support team so that they learn about upcoming dev features; when the features reach production, they are no longer a surprise. The Product Management team should sync up frequently with the IT support team so that each learns more about the other’s world.
- What works today will not keep us relevant, so what new concepts, policies, process innovations, etc. should we try? Both speed and quality have to be achieved.
- Auto-monitoring, auto-rollback, automatic mail triggers, and auto-deployment are key (reliability, security, and stability).
- Come up with metrics and measurements that help teams reflect the real-time inflow and outflow status of the system. Educate people to use the data in the various governance meetings.
- Identify and coach people who are not aligned with the modern product development process and are comfortable with the traditional one. Where IT Operations sees its role as preventing frequent changes to production systems rather than enabling them, let us change that approach.
- Create a learning organization by encouraging people to share knowledge, rewarding them, and recognizing the people who perform remarkable work. Add goals to each individual’s targets to achieve these learning and sharing attributes.
- Build capability (people capability, process capability, and so on). Measure the maturity of the capability building.
- Push for Infrastructure as Code automation. This may remove the need for application-specific engineering and production support team members.
- Create new Agile roles (e.g., site reliability engineers, automation architects), retire some traditional roles (e.g., incident manager, service level agreement manager), and update HR policy if required to accommodate those changes.
- Release on demand: the system is ready to be released at any point in time, and it will just work.
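To illustrate the auto-monitoring and auto-rollback item in the list above, here is a small Python sketch of the decision such automation might make after a deployment. It is an assumption on our part, not a description of any specific tool: the function name, the 5% error-rate threshold, and the minimum sample size of 100 requests are all illustrative values that each team would tune for its own systems.

```python
def should_roll_back(error_count, request_count,
                     max_error_rate=0.05, min_requests=100):
    """Decide whether a fresh deployment should be rolled back.

    Assumed rule: wait until enough requests have been observed, then
    roll back if the share of failed requests exceeds the threshold.
    """
    if request_count < min_requests:
        # Too little traffic to judge; keep monitoring.
        return False
    return error_count / request_count > max_error_rate


# Hypothetical monitoring samples taken after a release.
samples = [
    {"errors": 10, "requests": 50},   # not enough traffic yet
    {"errors": 5, "requests": 200},   # 2.5% errors, healthy
    {"errors": 12, "requests": 200},  # 6% errors, roll back
]

for sample in samples:
    decision = should_roll_back(sample["errors"], sample["requests"])
    print(sample, "->", "roll back" if decision else "keep")
```

Only the third sample triggers a rollback. Writing the rule down as code, even one this simple, lets development and operations agree in advance on what "unhealthy" means, rather than debating it during an incident.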
Coaching Strategy:
- Conduct Team Agility Maturity and Enterprise Agility Maturity assessments. Identify the gaps in knowledge and skills (start with high-priority/high-value teams).
- Create the right ecosystem for success: conduct Value Stream Mapping, create a roadmap for the transformation, and establish governance to measure progress.
- Run a discovery workshop to explore the business problem: the current customer/stakeholder experience, the plan to improve it (process design), MVP creation, risks, issues, dependencies, etc.
- Come up with a plan and proposal for the training, mentoring, and coaching strategy, based on the gap analysis.
- Train and coach teams on CALMS (Culture, Automation, Lean flow, Measurement, Sharing).
- Look for dysfunctional team behaviour (the five dysfunctions of a team) and coach to improve it.
- Reformulate some of the team metrics. Define team success criteria.
- Reformulate some of the team’s roles and responsibilities. Reskill the people affected by the transformation and change.
- Coach to establish a cross-skilled, self-driven team. Help them to build their backlog.
- Train and coach the team to look into the automation strategy.
- Train and coach the team to look into the architectural strategy (legacy applications, modernization of those applications, etc.).
- Train and coach the team to look into the testing strategy (Agile Testing Quadrants).
- Coach the team to apply Scrum and Kanban in different situations in the IT Operations and IT Service context.
- Look into IT Service Strategy, Service Design, and the other phases to see where agility can be brought in, and to what extent.
- Develop leaders into Lean-Agile leaders so that they redesign the whole system in a more Agile way. Leadership should drive all the changes to create the new culture.
- Coach for continuous delivery and on-demand release (work on the cultural change needed for these to happen).
- Teach design thinking techniques so that the team can design services and customer experiences, create new service concepts, identify customer opportunities, etc.
Work with the leadership team to achieve:
- Increased frequency, quality, and security of product innovation
- Decreased deployment risk with increased learning cycles
- Accelerated solution time-to-market
- Improved solution quality and reduced lead time for fixes
- Reduced severity and frequency of failures and defects
- Improved Mean Time to Recover (MTTR) from production incidents
In a transformation, most of the steps are emergent. It works well when all team members exhibit a growth mindset, which means we discover better ways of working as we progress.
A transformation can be bottom-up or top-down; it can be incremental, with many small pilots, or a big bang. The context drives the decision about what works best in each scenario.