The need for standards
Facial recognition technologies are complex and error rates remain significant depending on the imaging process and subject. As deployment and user numbers increase, these errors will become more prevalent without significant modernization of capture procedures.
IEC and ISO work together to develop international standards for ICT through their joint technical committee (ISO/IEC JTC 1). Subcommittee 37 covers biometrics and has begun work on the new ISO/IEC 24358 standard.
e-tech spoke to Patrick J. Grother, who leads the work of SC 37, to find out more about the new standard.
What is facial recognition and where is it being used?
Facial recognition is a process. It starts with taking a photograph of a face. Then a face recognition algorithm, nowadays built with artificial intelligence (AI) technologies, is used to extract identity-related features from the image. These features can then be matched against features previously extracted from other images. These might reside in a database, for example.
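The matching step described above can be illustrated with a minimal Python sketch. It assumes the features have already been extracted into short numeric vectors and that cosine similarity with a fixed threshold is the comparison rule. The vectors, the names `cosine_similarity` and `best_match`, and the threshold of 0.8 are all hypothetical and are not taken from any particular algorithm or from the standard.

```python
import math


def cosine_similarity(a, b):
    """Similarity of two feature vectors, from -1.0 to 1.0."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def best_match(probe, database, threshold=0.8):
    """Return (identity, score) for the closest reference, or
    (None, score) when even the closest one is below the threshold."""
    best_id, best_score = None, -1.0
    for identity, reference in database.items():
        score = cosine_similarity(probe, reference)
        if score > best_score:
            best_id, best_score = identity, score
    if best_score < threshold:
        return None, best_score
    return best_id, best_score


# Hypothetical features previously extracted from enrolled images.
database = {
    "alice": [0.9, 0.1, 0.3],
    "bob": [0.1, 0.8, 0.5],
}

# Hypothetical features extracted from a newly captured photograph.
probe = [0.85, 0.15, 0.35]

identity, score = best_match(probe, database)
print(identity, round(score, 3))
```

Real systems use much longer feature vectors and tuned thresholds, but the structure is the same: extract features once, then compare them against stored references.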
Facial recognition is being used in an ever-increasing array of applications. The main ones are in passport and driving license issuance, but it is also used for building access and border control, and in law enforcement investigations.
Why do we need a standard?
Face recognition systems occasionally make mistakes. They can fail to match a known user – a false negative – or they can incorrectly associate different users – a false positive. These outcomes depend on the properties of the input photographs.
In particular, an image can be degraded by image quality aspects such as poor exposure or blur, or by aspects of how the subject presents to the camera, e.g. by looking down or by making an unusual facial expression. These possibilities motivate the new ISO/IEC 24358 standard. It aims to minimize facial recognition errors by defining a new generation of cameras that understand the image they’re trying to collect. Currently, generic “dumb” cameras are often used, and they naively accept poorly presented images.
So this standard conceives of face-aware cameras tightly coupled to image quality assessment measurements made in real time. In so doing, it aims to bring to face recognition at least the maturity that characterizes fingerprint and iris acquisition. Those modalities benefited from the outset from the use of devices that are aware of the characteristic they’re trying to acquire: friction ridges and circular structures in the eye, respectively. Face recognition has only recently begun to see use of face-aware cameras, particularly in e-Passport gates and mobile phones.
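The two kinds of mistake mentioned in this answer, false negatives and false positives, can be counted once a decision threshold is fixed. The Python sketch below is only an illustration: the comparison scores are invented, the function name `error_rates` is hypothetical, and the threshold of 0.8 is an assumption rather than a value from the standard.

```python
def error_rates(genuine_scores, impostor_scores, threshold):
    """Return (false_negative_rate, false_positive_rate).

    genuine_scores:  scores from comparing two images of the same person
    impostor_scores: scores from comparing images of different people
    """
    # A genuine pair scoring below the threshold is wrongly rejected.
    false_negatives = sum(1 for s in genuine_scores if s < threshold)
    # An impostor pair scoring at or above the threshold is wrongly accepted.
    false_positives = sum(1 for s in impostor_scores if s >= threshold)
    return (
        false_negatives / len(genuine_scores),
        false_positives / len(impostor_scores),
    )


# Invented scores. The low genuine score (0.62) stands in for a
# poorly captured photograph, such as a blurred or downward-looking face.
genuine = [0.95, 0.91, 0.62, 0.88]
impostor = [0.12, 0.35, 0.83, 0.20, 0.05]

fnr, fpr = error_rates(genuine, impostor, threshold=0.8)
print(fnr, fpr)  # 0.25 0.2
```

Raising the threshold trades false positives for false negatives, which is one reason better input photographs matter: they separate the two groups of scores so that fewer errors occur at any threshold.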
So the camera will understand the face. What other capabilities will it have?
This standard is about making a new generation of smart, technically better cameras. A big part of that is acquiring images at higher resolution. Many cell phones and cameras can already take very high-resolution photos, at a higher resolution than you usually need, and that extra information turns out to be useful for multiple reasons. By requiring collection of higher-resolution images, the new standard aims to allow face recognition algorithms to access more fine-grained information in faces. This information supports accurate facial recognition of twins (contemporary systems won’t distinguish between identical twins), improved human adjudication of photos, for example to support courtroom testimony, and better detection of “attack” images (e.g. from spoofing attempts).
What are some of the other drivers?
A growing number of civil identity management and law enforcement applications are using vast numbers of face images, which could later serve as references. There are also new programmes using facial recognition, such as biometric exit confirmation in the European Union. The United States is piloting face recognition for exit checks in airports, while in India, the Aadhaar programme has started allowing face recognition for authentication.
Some technical issues include:
- Face-blind cameras – Most face images are collected using cameras that are not face-aware. This contrasts with the situation with fingerprint and iris biometrics where sensors enable explicit awareness of the kind of image that should be collected. One simple consequence is that some images include two faces, perhaps from someone in the background or from a t-shirt. Such occurrences can undermine recognition.
- Reliance on imaging design specifications – Faces are largely collected using cameras set up according to a documentary standard that regulates geometry and photography. It is also common for photographs to be collected without any quality assessment, relying only on the photographer to check conformance.
- Quality assessment is separated from collection – In many cases a photograph is collected and later submitted to a backend server, where it is assessed for quality. If poor quality is detected (by human or automated means), re-capture is initiated hours or days later, with attendant expense.
- Poor presentation – The largest drivers of recognition failure are subjects who do not present a frontal, neutral-expression, eyes-open face, without eyewear and in the correct position. Such occurrences are inevitable when using non-face-aware cameras.
- Reliance on gains in face recognition accuracy – Face recognition algorithms are heavily researched and accuracy gains have been documented. However, improving face image quality has not received the same research attention.
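Several of the issues listed above (more than one face in the frame, poor presentation, and quality checks that happen only after capture) can be pictured as checks run at the camera before a photograph is accepted. The Python sketch below is a hypothetical illustration only. The `FrameReport` fields, the sharpness scale, the 0.6 cut-off and the retry limit are assumptions made for this example and are not requirements of ISO/IEC 24358.

```python
from dataclasses import dataclass


@dataclass
class FrameReport:
    """Hypothetical measurements a face-aware camera makes on one frame."""
    face_count: int
    frontal: bool
    eyes_open: bool
    neutral_expression: bool
    wearing_eyewear: bool
    sharpness: float  # assumed scale: 0.0 (blurred) to 1.0 (sharp)


def quality_problems(report, min_sharpness=0.6):
    """List the reasons a frame should be rejected (empty if acceptable)."""
    problems = []
    if report.face_count != 1:
        problems.append("expected exactly one face")
    if not report.frontal:
        problems.append("face is not frontal")
    if not report.eyes_open:
        problems.append("eyes are closed")
    if not report.neutral_expression:
        problems.append("expression is not neutral")
    if report.wearing_eyewear:
        problems.append("eyewear detected")
    if report.sharpness < min_sharpness:
        problems.append("image is blurred")
    return problems


def capture(frames, max_attempts=5):
    """Accept the first frame with no problems, checking at capture time.

    Returns (attempt_number, report), or None if every attempt failed.
    """
    for attempt, report in enumerate(frames[:max_attempts], start=1):
        if not quality_problems(report):
            return attempt, report
    return None


# Simulated frames: a bystander in the background, then the subject
# looking down, then an acceptable presentation.
frames = [
    FrameReport(2, True, True, True, False, 0.9),
    FrameReport(1, False, True, True, False, 0.9),
    FrameReport(1, True, True, True, False, 0.8),
]

print(quality_problems(frames[0]))
print(capture(frames))
```

The design point is that rejection and re-capture happen within seconds while the subject is still present, rather than hours or days later on a backend server.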
Humans involved in the facial recognition process make mistakes, especially when image quality is poor:
- Integral role of human adjudication – In identification applications such as watch-listing, human reviewers determine whether hypotheses from automated search algorithms are false positives or true positives. Similarly, in verification, humans review rejections to determine whether they are false or true negatives.
- Human role undermined by automated systems – Automated and human face recognition operate with different kinds of images. Humans need high-resolution views, whereas automated algorithms are largely built around standardized, relatively low-resolution frontal views. Typically, the low-resolution images are used in human adjudication processes because high-resolution images were never collected.
As mentioned earlier, this standard specifies properties of next-generation biometric face capture subsystems intended to improve the suitability of photographs for automated face recognition, reduce the variability in those photographs, improve support for human face identification, and impede tampering and illicit modification of photographs.
It also includes specifications for new functionalities for face image capture subsystems that target the quality of images. Its primary role is in the collection of pristine face images from cooperating subjects that are suited to reside in an authoritative enrolment database. Additionally, it addresses other issues: for example, it adds support for forensic human adjudication, formalizes compression, includes protection against image manipulation and tampering, and merges printing processes.
Find out more about the work of SC 37.