As well as having different characteristics, the subject matters affecting the trustworthiness of AI systems can be measured and assessed at different stages of the system life cycle. In some cases, for example when assessing potential societal harms, an assessment needs to take place ex ante, before the system is deployed, often as a precondition of placing the system on the market.
Because impact assessments take place pre-deployment and future harms are inherently unobservable, they cannot assess the subject matter with a high degree of certainty. This uncertainty must be communicated in the assurance conclusion so that users can place justified trust in the predicted level of impact or risk and put appropriate mitigations in place.
Once the system is in use, impact evaluations can increase the level of certainty by taking into account whether any actual incidents of harm have occurred and evaluating their scope and scale.
Similarly, when testing the accuracy of a model, the potential for model drift, where the accuracy of predictions drifts over time away from the model's test performance, lowers the certainty with which a conclusion about accuracy can be stated. Ongoing testing post-deployment can increase this certainty by accounting for model drift.
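Ongoing post-deployment testing of this kind can be sketched in code. The example below is a minimal, illustrative sketch only (all function names, thresholds, and data are assumptions, not taken from the source): it compares a model's accuracy on recent labelled production data against the baseline accuracy recorded at test time, and raises a flag when the degradation exceeds a chosen tolerance.

```python
# Illustrative sketch of post-deployment accuracy monitoring for model drift.
# All names, thresholds and data below are hypothetical, for demonstration only.

def accuracy(predictions, labels):
    """Fraction of predictions that match the true labels."""
    correct = sum(p == y for p, y in zip(predictions, labels))
    return correct / len(labels)

def drift_alert(baseline_accuracy, live_predictions, live_labels, tolerance=0.05):
    """Flag drift when live accuracy falls more than `tolerance`
    below the accuracy measured at test time (the baseline)."""
    live_accuracy = accuracy(live_predictions, live_labels)
    degradation = baseline_accuracy - live_accuracy
    return degradation > tolerance, live_accuracy

# Hypothetical example: test-time accuracy of 0.90, then a batch of
# predictions on recent production data with known outcomes.
baseline = 0.90
preds  = [1, 0, 1, 1, 0, 1, 0, 0, 1, 1]
labels = [1, 0, 0, 1, 0, 0, 0, 1, 1, 1]   # 7 of 10 predictions correct
alert, live_acc = drift_alert(baseline, preds, labels)
# live_acc is 0.7, a 0.2 drop from baseline, so `alert` is True.
```

In practice the tolerance, the size of the monitoring window, and how ground-truth labels are obtained post-deployment would all be set as part of the assurance engagement; the point of the sketch is simply that repeated measurement against a recorded baseline is what allows the certainty of an accuracy conclusion to be maintained over time.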
All content is available under the Open Government Licence v3.0 except where otherwise stated.