Facial recognition’s ‘dirty little secret’: Millions of online photos scraped without consent

By Olivia Solon

Facial recognition can log you into your iPhone, track criminals through crowds and identify loyal customers in stores.

The technology — which is imperfect but improving rapidly — is based on algorithms that learn how to recognize human faces and the hundreds of ways in which each one is unique.

To do this well, the algorithms must be fed hundreds of thousands of images of a diverse array of faces. Increasingly, those photos are coming from the internet, where they’re swept up by the millions without the knowledge of the people who posted them, categorized by age, gender, skin tone and dozens of other metrics, and shared with researchers at universities and companies.

When I first heard about this story I was annoyed but didn’t think too much about it. Then, further down the story, it’s clear they used Creative Commons Flickr photos.

“This is the dirty little secret of AI training sets. Researchers often just grab whatever images are available in the wild,” said NYU School of Law professor Jason Schultz.

The latest company to enter this territory was IBM, which in January released a collection of nearly a million photos that were taken from the photo hosting site Flickr and coded to describe the subjects’ appearance. IBM promoted the collection to researchers as a progressive step toward reducing bias in facial recognition.

But some of the photographers whose images were included in IBM’s dataset were surprised and disconcerted when NBC News told them that their photographs had been annotated with details including facial geometry and skin tone and may be used to develop facial recognition algorithms. (NBC News obtained IBM’s dataset from a source after the company declined to share it, saying it could be used only by academic or corporate research groups.)

And then there is a checker to see if your photos were used in the teaching of machines. After typing my username, I found out I have 207 photo(s) in the IBM dataset. This is one of them:

Not my choice of photo, just the one which comes up when using the website

Georg Holzer uploaded his photos to Flickr to remember great moments with his family and friends, and he used Creative Commons licenses to allow nonprofits and artists to use his photos for free. He did not expect more than 700 of his images to be swept up to study facial recognition technology.

“I know about the harm such a technology can cause,” he said over Skype, after NBC News told him his photos were in IBM’s dataset. “Of course, you can never forget about the good uses of image recognition such as finding family pictures faster, but it can also be used to restrict fundamental rights and privacy. I can never approve or accept the widespread use of such a technology.”

I have a similar view to Georg. I publish almost all my Flickr photos under a Creative Commons non-commercial share-alike licence, and I swear this has been broken. I’m also not sure whether the pictures used were all private or not. But I’m going to find out, thanks to GDPR.

There may, however, be legal recourse in some jurisdictions thanks to the rise of privacy laws acknowledging the unique value of photos of people’s faces. Under Europe’s General Data Protection Regulation, photos are considered “sensitive personal information” if they are used to confirm an individual’s identity. Residents of Europe who don’t want their data included can ask IBM to delete it. If IBM doesn’t comply, they can complain to their country’s data protection authority, which, if the particular photos fall under the definition of “sensitive personal information,” can levy fines against companies that violate the law.

Expect a GDPR request soon, IBM! I’ll do anything I can to send a message that I wasn’t happy with this.

Do you want to know a secret?

Secret

I have installed the Secret app, but every time I look at it I can’t decide if I should sign up or not.

If you don’t know the Secret app:

Secret is a mobile app (iOS and finally Android) that allows people to share messages anonymously within their circle of friends, friends of friends, and publicly. It differs from other anonymous sharing apps such as PostSecret and Whisper in that it is intended for sharing primarily with friends, potentially making it more interesting and addictive for people reading the updates wondering if it’s a friend they know.

The problem I have is: do I trust them to keep my secrets secret? The first clue is usually in the Terms and Conditions and the privacy statement.

Looking at the T&Cs and the privacy statement, there’s nothing insane described, but I’m sure Facebook’s EULA was all smiles at first too, but…

We change these Terms of Service every so often. If we make changes, we will notify you by revising the date at the top of the policy and, in some cases, provide you with additional notice

I imagine after a few months the terms will change and suddenly the secrets are less ummmmm secret?

London Backstage Christmas Party, Saturday 9th Dec

Smooth water

So it's now no real secret: I'm planning a big Christmas party for BBC Backstage. The difference is that rather than going it alone and creating a whole lot of havoc with everyone else's plans, I decided that Backstage should actually talk to the rest of the community groups in London and encourage one big party. This is core to the BBC Backstage values: rather than go it alone, we're going to see what's already out there and see if we can help out and encourage more take-up or participation. Anyway, after the emails went out, I started getting into conversations with different groups and got a line-up to rival all line-ups. And on Friday night at the geekdinner I revealed the party plans.

To make things clearer, I posted the details on the Backstage blog and then, on Rob's advice, hit Upcoming and Eventful.

Yes the rumours are true…

There is a BBC Backstage Christmas Party being planned for Saturday 9th December in London.

Rather than host it ourselves and clash with everyone else's parties, we decided that it would be very fitting for Backstage to bring together some of the best groups and communities in London, then get them under one roof to share in the Christmas Party…

Seemed like a crazy idea, but I would like to introduce our fantastic partners for the Christmas Bash:

Swedish Beers
London Girl Geekdinners
Geekdinners
London Perlmongers
London Webstandards Group
London Ruby User Group
Open Rights Group
London 2.0
Momo Monday

We have an excellent Cuban venue (TBC) all to ourselves deep in the area of Moorgate and Citypoint.

So please keep a note in your calendar, as Saturday 9th December looks to be a fantastic night to be in London.
