Modern software engineering is a fast-paced, collaborative endeavour of extreme complexity. Modern Agile development methodologies, designed to ensure that an engineering team can adapt to rapid and often significant change while maintaining project momentum, are predicated on the premise that ever-changing requirements, constraints, and customer needs will continually impact all areas of a project throughout its life cycle [1,2]. In such an operational environment, a fundamental key to successful outcomes is highly efficient and effective dissemination of information among project stakeholders. Depending on project team structure, these stakeholders include engineers, designers, quality assurance staff, project managers, customers, and others. As information changes, or new information becomes available, existing information repositories and artefacts must be updated immediately to avoid 1) injecting defects into the software being developed and 2) wasting effort on incorrect information. This can be modelled as a communication problem between a potentially large, diverse group of actors.
In addition to the challenge of accurately recording and communicating changes to fundamental requirements and constraints, there exists a vast amount of information related to source code, testing, tasking, tacit knowledge, system configuration, and other facets of project status and knowledge. This knowledge must be maintained and immediately updated to avoid defect injection or inefficient workflows. Dissemination of information in near real-time can take a heavy toll on project resources, requiring continuous meetings and document creation and ingestion by all team members. This can dramatically reduce the productivity of project teams, as communication of status and changes consumes a large share of project effort.
Two modern software development movements provide partial solutions to this problem: Data-driven development [3] and Automated DevOps [4,5]. Data-driven development is a software development methodology promoting the continual collection of data throughout the SDLC to inform prioritisation, tasking, and adaptations of the process and product to ensure project success within temporal and budgetary constraints. DevOps is a movement within software engineering that professes to bring software developers and operations staff (those in charge of infrastructure, quality control, packaging, and release of software products) into close alignment, to ensure harmonious tasking and smooth transition of project artefacts through interoperable processes and tools. It should be noted that the authors discuss DevOps in the context of internal project systems and processes, in addition to its traditional context of deployment and the transition between development and operations staff. Iterative software development processes can apply these same DevOps concepts and tools to enable internal continuous delivery of software, systems, and all other generated project artefacts. Both of these movements can be seen as a push to automate the following fundamental communication tasks of the software development process: 1) record data from the complex system of a running software development project, 2) distill that data into information and knowledge, and 3) communicate required knowledge in near real-time to all actors who rely on it to perform their job functions optimally. The hierarchical concepts of data, information, and knowledge are derived from the DIKW Pyramid, a fundamental model of Information Science, in which data is distilled into information, and information is used to create knowledge [6,7].
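The three communication tasks above can be sketched as a minimal DIKW-style pipeline. This is an illustrative sketch only: the event types, role labels, and the simple threshold rule below are assumptions for demonstration, not part of any cited methodology.

```python
from collections import Counter

# Illustrative raw events recorded from a running project (the "data" layer).
events = [
    {"type": "commit", "author": "dev1"},
    {"type": "test_failure", "suite": "integration"},
    {"type": "commit", "author": "dev2"},
    {"type": "test_failure", "suite": "integration"},
    {"type": "test_failure", "suite": "unit"},
]

def distill(raw):
    """Data -> information: aggregate raw events into counts per event type."""
    return Counter(e["type"] for e in raw)

def to_knowledge(info):
    """Information -> knowledge: a role-relevant interpretation of the counts."""
    if info.get("test_failure", 0) > info.get("commit", 0):
        return "QA: failures outpace commits; investigate the test suites."
    return "QA: failure rate within normal bounds."

info = distill(events)
print(info)                  # Counter({'test_failure': 3, 'commit': 2})
print(to_knowledge(info))
```

In a real automated system, the `distill` and `to_knowledge` steps would be performed continuously by tooling, with the resulting knowledge pushed to the actors who need it.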
This type of data-driven approach is consistent, to varying degrees, with numerous modern software development methodologies [3,8,9,10]. While data-driven development methodologies espouse the need for broad data collection and information extraction, the DevOps movement provides the automated mechanisms to enable efficient collection and organisation of data, such that knowledge can be determined and efficiently communicated to appropriate actors without high levels of human interaction. In short, specialised automated systems take the place of some human actors in the SDLC, providing effective and immediate extraction and communication of knowledge.
BACKGROUND
The following section provides background on software development processes and practices, leading to a discussion of the data-driven project management and automated DevOps practices that motivate the research presented.
I. The Software Development Life Cycle
The Software Development Life Cycle (SDLC) is a conceptual model used to describe a process for planning, designing, testing, and deploying an information system. Various methodologies exist to manage the SDLC, including the well-known “waterfall” model (Figure 1), which views the process as a sequence of stages moving steadily downward to project completion [11,12]. In this sequence, the output of each stage becomes the input of the next. Though simple to conceptualise and plan, this approach is viewed as having significant shortcomings when applied to the practice of software development [13,14], specifically the difficulty of adjusting to change or new information throughout the development process. The inability to adapt to change has been seen to increase the risk of failure in many projects. This deficiency has led other methodologies to gain footing in the modern software industry.
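The stage-to-stage handoff described above can be sketched as a trivial pipeline. The stage names follow the conventional waterfall sequence; the string payloads are purely illustrative placeholders for each stage's artefacts.

```python
# Each waterfall stage consumes the previous stage's output and produces
# the input for the next; there is no path back upstream.
def requirements(idea):      return f"requirements for {idea}"
def design(reqs):            return f"design from ({reqs})"
def implementation(des):     return f"code built to ({des})"
def verification(code):      return f"tested ({code})"
def deployment(tested):      return f"deployed ({tested})"

stages = [requirements, design, implementation, verification, deployment]

artefact = "inventory app"
for stage in stages:
    artefact = stage(artefact)   # strictly forward flow, stage by stage
print(artefact)
```

The one-way data flow is exactly what makes the model brittle: a requirements change discovered during `verification` has no representation in the pipeline.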
II. Agile Software Methodologies
In response to the drawbacks of the waterfall model and other similarly change-resistant software development methodologies, the Agile software development model (Figure 2) was proposed [15]. Agile methodologies are software development methods that focus on iterative and incremental development, often emphasising direct and constant communication with stakeholders, adaptive planning, and ever-evolving requirements. Practitioners believe that designing processes to adapt to change and new information effectively and efficiently reduces project risk and significantly enhances project outcomes. To constantly adapt to change, teams implementing Agile practices require highly effective communication. These teams often leverage specialised tools and techniques to ensure rapid and robust communication both within teams and between teams and stakeholders.
III. Data-Driven Development and Communication
Measuring project progress and status through the SDLC is imperative, especially when teams are expected to adapt quickly to change. Degraded progress must be detected swiftly to allow a team to adjust its processes and remediate issues. Collection of accurate, up-to-date information for a project is a challenge, especially given the rapidly changing nature of information in Agile projects. In response to this requirement, data-driven project management techniques have gained popularity, espousing the constant collection of large amounts of data that can be used to calculate informative metrics on project status [3]. The results of these metrics can be used to guide project management decisions, adjusting project plans and projections quickly to maintain maximal project team performance. It has been found, however, that though data-driven processes offer measurable gains in performance, human actors will lapse in the collection of useful metric data if collection is a manual process [7]. This finding leads to the conclusion that in order to achieve the benefits of data-driven processes, automated system actors must be empowered to collect data without direct assistance from human actors.
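As a concrete example of the kind of metric such automated collection enables, the sketch below computes per-task cycle time from timestamps that an issue tracker could export automatically. The task records and field names are hypothetical, chosen only to illustrate the calculation.

```python
from datetime import datetime

# Hypothetical task records exported automatically from an issue tracker.
tasks = [
    {"id": 1, "started": "2024-03-01", "finished": "2024-03-04"},
    {"id": 2, "started": "2024-03-02", "finished": "2024-03-09"},
    {"id": 3, "started": "2024-03-05", "finished": "2024-03-06"},
]

def cycle_times_days(records):
    """Per-task cycle time in whole days, derived from recorded timestamps."""
    fmt = "%Y-%m-%d"
    return [
        (datetime.strptime(t["finished"], fmt)
         - datetime.strptime(t["started"], fmt)).days
        for t in records
    ]

def mean_cycle_time(records):
    """Average cycle time across all tasks in the sample."""
    ct = cycle_times_days(records)
    return sum(ct) / len(ct)

print(cycle_times_days(tasks))   # [3, 7, 1]
print(mean_cycle_time(tasks))
```

Because the timestamps are captured as a side effect of normal tool usage, the metric stays current without any manual data entry, which is precisely the property the finding above demands.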
IV. DevOps
Modern software development teams attempt to leverage engaging, semi-autonomous technologies to facilitate the myriad data management and communication tasks necessary throughout the SDLC. By sending messages, collecting and storing data, and providing meaningful data visualisations designed for various actor roles within the project, these systems become actors within the SDLC process, performing complex tasks ideally suited to machine automation and reducing the burden of distraction and context switching on human actors.
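One way such a system actor reduces distraction is by routing each event only to the roles that need it, rather than broadcasting everything to everyone. The role names, event types, and interest table below are illustrative assumptions, not drawn from any particular tool.

```python
# Sketch of an automated "system actor" that routes project events to the
# roles that depend on them (a simple publish/subscribe interest table).
ROLE_INTERESTS = {
    "qa": {"test_failure", "build_broken"},
    "project_manager": {"milestone_reached", "build_broken"},
    "developer": {"test_failure", "review_requested"},
}

def route(event_type):
    """Return the roles that should be notified of this event type."""
    return sorted(
        role
        for role, interests in ROLE_INTERESTS.items()
        if event_type in interests
    )

print(route("build_broken"))      # ['project_manager', 'qa']
print(route("review_requested"))  # ['developer']
```

In practice the interest table would be configuration maintained by the team, and delivery would go through chat, email, or dashboard channels; the filtering step shown is what spares each human actor from irrelevant interruptions.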
Software development, like any complex operation, requires a large number of tools and information systems to manage data and processes. As these technologies became more capable and more essential to the development process, their management and maintenance became a challenge for many teams. In most organisations, while development teams participated in the software development process, independent operations teams managed the required assistive tools and technologies. This separation of specialists led to communication difficulties between development and operations staff, and challenges in organising and prioritising efforts to support an efficient and effective software engineering environment.
Modern evolution of the practice has led to a new concept, termed DevOps, describing the conceptual and operational merging of development and operations needs, teams, and technologies. At the crux of this movement is the integration of operations teams, which support the software development process and often also the testing and release of software products, with the software development teams who design and implement those products. This marriage is intended to maximise the utility of essential software development tools while aligning the priorities of both development and operations staff to incentivise a more successful partnership in working towards the common goal of successful project execution. The conceptual organisational unity inherent to a DevOps paradigm is naturally extended to providing interoperability of software development and operations tools, to ensure maximal