Machine Learning (ML) is now used in a range of systems with results that are reported to exceed, under certain conditions, human performance. Many of these systems, in domains such as healthcare, automotive and manufacturing, exhibit high degrees of autonomy and are safety-critical. Establishing justified confidence in ML forms a core part of the safety case for these systems.
We introduce a methodology for the Assurance of Machine Learning for use in Autonomous Systems (AMLAS). AMLAS comprises a set of safety case patterns and a process for systematically integrating safety assurance into the development of ML components, and for generating the evidence base needed to justify the acceptable safety of those components when integrated into an autonomous system.
The scope of AMLAS covers the following stages of the ML lifecycle: ML safety assurance scoping, ML safety requirements elicitation, data management, model learning, model verification and model deployment.
In particular, the ML safety assurance scoping and the safety requirements elicitation stages explicitly establish the fundamental link between the system‐level hazard and risk analysis and the ML safety requirements.
That is, AMLAS takes a whole-system approach to ML assurance in which safety considerations are only meaningful once scoped within the wider system and operational context. The ML safety requirements are then used to weave these safety considerations into the subsequent ML lifecycle stages. For each stage, we define a safety argument pattern that can be used to explain how, and the extent to which, the generated evidence supports the relevant ML safety claims, explicitly highlighting key assumptions, trade-offs and uncertainties.
The diagram above shows an overview of the six stages of the AMLAS process. For an ML component in a particular system context, the AMLAS process supports the development of an explicit safety case for the ML component. The AMLAS process requires as input the system safety requirements generated from the system safety process. The assurance activities are performed in parallel with the development of the ML component. Further, the AMLAS process is iterative, as indicated by the 'Feedback and Iterate' thread in the diagram above.
Each stage of the AMLAS process is linked to the ‘Feedback and Iterate’ thread and could trigger the need to reconsider information generated or consumed by other stages. This is also necessary because of the interdependencies between the different stages (e.g. an activity in one stage might use artefacts produced by another activity in a previous stage).
The stages of AMLAS may therefore be performed multiple times throughout the development of the ML component. For example, verification activities may reveal that ML safety requirements are not met by the ML component under some conditions. Depending upon the nature of the findings, this may require that stages such as model learning or data management must be revisited, or even that the ML requirements themselves must be reconsidered.
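The iterative flow described above can be sketched as a simple control loop: stages are performed in order, and a failed assurance activity sends the process back to the stage its findings point at. This is a minimal illustrative sketch only; the stage names follow the text, but `run_amlas`, `StageOutcome` and the callback structure are hypothetical, not part of AMLAS itself.

```python
from dataclasses import dataclass, field
from typing import Callable, Optional

# Stage names as described in the text (illustrative identifiers).
STAGES = [
    "ml_safety_assurance_scoping",
    "ml_safety_requirements_elicitation",
    "data_management",
    "model_learning",
    "model_verification",
    "model_deployment",
]

@dataclass
class StageOutcome:
    """Result of performing one stage's development and assurance activities."""
    ok: bool                             # were the stage's safety claims supported?
    revisit_stage: Optional[str] = None  # stage to return to if not
    artefacts: dict = field(default_factory=dict)

def run_amlas(system_safety_requirements: dict,
              perform_stage: Callable[[str, dict], StageOutcome],
              max_revisits: int = 10):
    """Run the stages in order; when a stage's assurance activities report
    that its safety claims are unsupported, jump back to the stage the
    findings point at and re-run the stages in between."""
    artefacts = {"system_safety_requirements": system_safety_requirements}
    trace = []  # order in which stages were actually performed
    revisits, i = 0, 0
    while i < len(STAGES):
        stage = STAGES[i]
        trace.append(stage)
        outcome = perform_stage(stage, artefacts)
        artefacts.update(outcome.artefacts)
        if outcome.ok:
            i += 1
        else:
            revisits += 1
            if revisits > max_revisits:
                raise RuntimeError(f"{stage}: safety claims remain unsupported")
            # e.g. verification findings may require revisiting model
            # learning, data management, or the requirements themselves
            i = STAGES.index(outcome.revisit_stage)
    return trace, artefacts
```

For example, a `perform_stage` callback in which model verification fails once and points back to model learning produces a trace that re-runs model learning and verification before reaching deployment, mirroring the feedback loop in the text.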
Each AMLAS stage follows a common structure.
The description of each stage details the activities to be undertaken and the artefacts produced or required by the activities. The description also discusses common issues and misunderstandings relating to each activity; these are generally provided as notes or examples. Importantly, each stage concludes with an activity for instantiating a safety argument pattern based on the artefacts and evidence generated in the stage.
We adopt a commonly used definition of a safety case as a “structured argument, supported by a body of evidence, that provides a compelling, comprehensible and valid case that a system is safe for a given application in a given operating environment.” A safety case pattern documents a reusable argument structure and types of evidence that can be instantiated to create a specific safety case instance.