Police want happy childhood photos to train CSAM AI
Updated The Australian Federal Police and Monash University are asking netizens to send in snaps of their younger selves to train a machine-learning algorithm to spot child abuse in photographs.
Researchers are looking to collect images of people aged 17 and under in safe scenarios; they don’t want nudity, even if it’s a relatively innocuous image like a child taking a bath. The crowdsourcing campaign, dubbed My Pictures Matter, is open to people aged 18 and over, who can consent to their photographs being used for research purposes.
All images will be collated into a dataset managed by Monash academics with the aim of training an AI model to differentiate between a minor in a normal environment and one in an exploitative or dangerous situation. The software could, in theory, help law enforcement quickly and automatically identify child sexual exploitation material (aka CSAM) among the thousands upon thousands of photographs under investigation, sparing human analysts from inspecting every snapshot.
Reviewing this horrible material can be a slow process
Australian Federal Police senior officer Janis Dalins said the resulting AI could potentially help identify victims and flag illegal items previously unknown to officers.
“In 2021, the AFP-run Australian Centre to Counter Child Exploitation received more than 33,000 reports of online child exploitation, and each report may contain large volumes of images and videos of children being sexually assaulted or exploited for the gratification of offenders,” he said this week.
Dalins is also co-director of AiLECS Lab, the research collaboration between scholars from the Monash School of Information Technology and AFP that runs the My Pictures Matter project.
“Examining this gruesome material can be a slow process and the constant exposure can cause significant psychological distress to investigators,” he added. “AiLECS Lab’s initiatives will support the police and the children we are trying to protect, and researchers have come up with an innovative way to ethically develop the technology behind them.”
The easiest way to compile a large image dataset is to scrape the open internet. But, as some of the latest AI models, such as OpenAI’s DALL·E 2 and Google’s Imagen, have shown, the quality of this data is difficult to control. Biased or inappropriate images can creep into the dataset, making models problematic and potentially less effective.
Instead, the AiLECS team believes its crowdsourcing campaign offers an easier and more ethical way to collect photographs of children. “To develop an AI capable of identifying exploitative images, we need huge numbers of photographs of children in everyday ‘safe’ contexts that can train and evaluate AI models intended to combat child exploitation,” Campbell Wilson, co-director of AiLECS and an associate professor at Monash University, said.
By obtaining photographs from adults, through informed consent, we try to create technologies that are ethically responsible and transparent
“But sourcing these images from the internet is problematic when there is no way of knowing whether the children in these images have actually consented to their photos being uploaded or used for research. By obtaining photos from adults, through informed consent, we are trying to build technologies that are ethically responsible and transparent.”
People just need to send in their personal photos and an email address as part of the campaign. Nina Lewis, project manager and researcher at the lab, said it was not going to log any other types of personal information. Email addresses will be stored in a separate database, we are told.
“The images and associated data will not contain any identifying information, ensuring that the images used by researchers cannot reveal any personal information about the individuals depicted,” she said. Participants will receive updates at each stage of the project and can request to remove their images from the dataset if they wish.
The project’s lofty goals are ambitious but not technically impossible, given the challenges faced by image recognition systems, such as bias, adversarial attacks, and other limitations, so we look forward to seeing the results.
The Register asked Monash University for further details. ®
Updated to add June 6
Monash’s Dr Lewis has been in touch with more details. She told us the goal was to create a dataset of 100,000 unique images to train the AI model.
“We will use the photos as training and test data for new and existing algorithms that identify and classify ‘safe’ images of children,” she added. “We will also research how these technologies can be applied to assess whether digital files contain ‘dangerous’ images of children.
“The My Pictures Matter project does not train AI on images of children in dangerous situations. We are studying the opposite scenario: how to create consented, ethically sourced datasets for use in machine learning to help combat the growing volume of child abuse images being generated and distributed via online platforms.”
Responding to some of your comments raising concerns about the capability of machine learning systems, Dr Lewis added: “We recognize that automated tools need to be more than blunt instruments, and that, for example, the presence of a high proportion of skin tone in a visual image does not in itself indicate abuse.”
For those concerned about data privacy safeguards, Dr Lewis pointed to the “data handling” section shown on mypicturesmatter.org after clicking “Let’s Go”.
She also stressed that the images collected for the project will be owned and used by the university, not the cops directly.
“This is not a police dataset, and it will not be owned or managed by AFP,” Dr Lewis told us. “This research is undertaken by Monash University, with formal human research ethics clearance on how the data is collected, used and managed.”