IBM DiF project returns the full list of photos scraped without consent

Then I got a further two replies from IBM. One of them is IBM asking if I want my GDPR data for everything regarding IBM. The second one is from IBM’s Diversity in Faces project.
Thank you for your response and for providing your Flickr ID. We located 207 URLs in the DiF dataset that are associated with your Flickr ID. Per your request, the list of the 207 URLs is attached to this email (in the file called urls_it.txt). The URLs link to public Flickr images.
For clarity, the DiF dataset is a research initiative, and not a commercial application and it does not contain the images themselves, but URLs such as the ones in the attachment.
Let us know if you would like us to remove these URLs and associated annotations from the DiF dataset. If so, we will confirm when this process has been completed and your Flickr ID has been removed from our records.
Best regards,
IBM Research DiF Team

So I looked up how to wget all the pictures from the text file they supplied and downloaded the lot, so I can get a sense of which photos were public or private and whether the licence was in conflict. I think hiding behind the notion of the link is a little cheeky, but there’s so much discussion about hyperlinking to material.
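For anyone wanting to do the same without wget (where `wget -i urls_it.txt` reads the URLs from a file), here is a rough Python sketch of the same idea. It assumes the attachment holds one URL per line; the function names and the `dif_photos` output folder are made up for illustration and are not anything IBM or Flickr supplies.

```python
import urllib.error
import urllib.request
from pathlib import Path
from urllib.parse import urlparse


def read_urls(list_path):
    """Return the non-empty, stripped lines of the URL list (assumes one URL per line)."""
    text = Path(list_path).read_text(encoding="utf-8")
    return [line.strip() for line in text.splitlines() if line.strip()]


def local_name(url, index):
    """Build a local file name from the URL path, falling back to a counter."""
    name = Path(urlparse(url).path).name
    return name or f"photo_{index}.jpg"


def download_all(list_path, out_dir="dif_photos"):
    """Fetch every URL in the list and return {url: saved path or error text}."""
    out = Path(out_dir)  # hypothetical output folder
    out.mkdir(parents=True, exist_ok=True)
    results = {}
    for index, url in enumerate(read_urls(list_path)):
        target = out / local_name(url, index)
        try:
            urllib.request.urlretrieve(url, str(target))
            results[url] = str(target)
        except (urllib.error.URLError, OSError, ValueError) as err:
            # Keep going: a failed fetch is itself worth noting for that photo.
            results[url] = f"failed: {err}"
    return results
```

Calling `download_all("urls_it.txt")` would then leave the images in `dif_photos` and give back a record of which URLs failed to fetch.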

Most of the photos are indeed public but there are a few which I can’t find in a public image search. If they are private, then something’s wrong and I’ll be beating IBM over the head with it.

Reply from IBM about my online photos scraped without consent

Diversity in Faces (DiF)

Following my post about facial recognition’s dirty little secret (millions of online photos scraped without consent), I got a reply from Flickr and IBM’s Diversity in Faces project.
First Flickr’s automated email…

Hi ian,

Thanks for reaching out to us!

We’ve received your message and will be responding as quickly as possible. In the meantime, do visit the Flickr Help Forum and our Help Center as the answer to your question may be found there.

We look forward to connecting and will be in touch soon.

Cheerfully,
The Flickr Team

Already Pro? Then expect a response shortly, because you are already in our VIP queue! (Make sure to write to us using the email address on your Pro account.)

Dear Ian,
Thank you for your email.
The Diversity in Faces (DiF) project, referenced in your request below, is a non-commercial, research initiative. The DiF dataset includes a list of URLs (but not the images themselves), linking to publicly available images on Flickr under certain creative commons licenses, along with associated annotations. We have taken great care to ensure that the DiF dataset does not include Flickr IDs or any other Flickr identifiers of individuals.
In order to respond to your request, we will need to locate the URLs in the DiF dataset that are linked to your Flickr ID (if any). To do this, we will need your Flickr ID, along with your express consent to use it for the sole purpose of locating such URLs and responding to your request.  Separately, if you would like us to, we can remove any URLs of images linked to your Flickr ID from the DiF dataset.  Please confirm this by reply.
After conducting our search, we will delete your Flickr ID from our records, and if you so request, we will also remove any URLs and associated annotations from the DiF dataset connected to your Flickr ID. We will confirm when this process has been completed.
With respect to your request to access your personal data processed by IBM outside the DiF project, you will be contacted separately by the IBM Data Subject Rights Operations Team (Email at ibm.com) to proceed with your request.
Let us know if you have any questions or how we can further assist you with your request.
IBM Research DiF Team

Facial recognition’s ‘dirty little secret’: Millions of online photos scraped without consent

By Olivia Solon

Facial recognition can log you into your iPhone, track criminals through crowds and identify loyal customers in stores.

The technology — which is imperfect but improving rapidly — is based on algorithms that learn how to recognize human faces and the hundreds of ways in which each one is unique.

To do this well, the algorithms must be fed hundreds of thousands of images of a diverse array of faces. Increasingly, those photos are coming from the internet, where they’re swept up by the millions without the knowledge of the people who posted them, categorized by age, gender, skin tone and dozens of other metrics, and shared with researchers at universities and companies.

When I first heard about this story I was annoyed but didn’t think too much about it. Then later down the story, it’s clear they used Creative Commons Flickr photos.

“This is the dirty little secret of AI training sets. Researchers often just grab whatever images are available in the wild,” said NYU School of Law professor Jason Schultz.

The latest company to enter this territory was IBM, which in January released a collection of nearly a million photos that were taken from the photo hosting site Flickr and coded to describe the subjects’ appearance. IBM promoted the collection to researchers as a progressive step toward reducing bias in facial recognition.

But some of the photographers whose images were included in IBM’s dataset were surprised and disconcerted when NBC News told them that their photographs had been annotated with details including facial geometry and skin tone and may be used to develop facial recognition algorithms. (NBC News obtained IBM’s dataset from a source after the company declined to share it, saying it could be used only by academic or corporate research groups.)

And then there is a checker to see if your photos were used in the teaching of machines. After typing my username, I found out I have 207 photo(s) in the IBM dataset. This is one of them:

Not my choice of photo, just the one which comes up when using the website

Georg Holzer uploaded his photos to Flickr to remember great moments with his family and friends, and he used Creative Commons licenses to allow nonprofits and artists to use his photos for free. He did not expect more than 700 of his images to be swept up to study facial recognition technology.

“I know about the harm such a technology can cause,” he said over Skype, after NBC News told him his photos were in IBM’s dataset. “Of course, you can never forget about the good uses of image recognition such as finding family pictures faster, but it can also be used to restrict fundamental rights and privacy. I can never approve or accept the widespread use of such a technology.”

I have a similar view to Georg. I publish almost all my Flickr photos under a Creative Commons non-commercial sharealike licence, and I swear this has been broken. I’m also not sure if the pictures are all public or some are private. But I’m going to find out thanks to GDPR.

There may, however, be legal recourse in some jurisdictions thanks to the rise of privacy laws acknowledging the unique value of photos of people’s faces. Under Europe’s General Data Protection Regulation, photos are considered “sensitive personal information” if they are used to confirm an individual’s identity. Residents of Europe who don’t want their data included can ask IBM to delete it. If IBM doesn’t comply, they can complain to their country’s data protection authority, which, if the particular photos fall under the definition of “sensitive personal information,” can levy fines against companies that violate the law.

Expect a GDPR request soon, IBM! Anything I can do to send the message that I wasn’t happy with this.

Over 15 years of Flickr data

All those files to download

It’s been a long haul but finally Flickr is beyond use for me. I briefly tried Flickr Pro for a while but there are so many other options now. It’s a shame; Flickr went through a lot of trouble at the end but was saved from the Yahoo craziness by SmugMug. But even looking at the Pro account prices, I decided that after…

It was time to leave Flickr and just let it start deleting my photos, which I mainly had backed up in multiple places anyway.

I was quite impressed with Flickr’s data portability option; for example, the uploaded files are exactly the same. It would have been great if they had embedded the tags into the original EXIF data, but it seems they kept the tags in the account data. So with some work, it would be possible to pull the whole lot together again? I’m actually surprised no one’s already done this.
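As a first step towards pulling it back together, something like the Python sketch below could pair each exported photo with its tags. It is only a sketch under loud assumptions: that the account data contains one JSON file per photo shaped like `{"id": "123", "tags": [{"tag": "x"}]}`, and that each exported image file name contains the numeric photo id. Check both against a real export before relying on it. Writing the tags into EXIF would need an extra library and is left out.

```python
import json
import re
from pathlib import Path


def load_tags(metadata_dir):
    """Map photo id -> list of tag strings, read from per-photo JSON files.

    ASSUMPTION: each file looks like {"id": "123", "tags": [{"tag": "x"}]}.
    """
    tags_by_id = {}
    for json_path in sorted(Path(metadata_dir).glob("*.json")):
        record = json.loads(json_path.read_text(encoding="utf-8"))
        if not isinstance(record, dict) or "id" not in record:
            continue  # skip account-level files that are not photo records
        tags_by_id[str(record["id"])] = [t["tag"] for t in record.get("tags", [])]
    return tags_by_id


def match_photos(photo_dir, tags_by_id):
    """Pair each image file with its tags by finding a known id in the file name.

    ASSUMPTION: the exported file name contains the numeric photo id.
    """
    matched = {}
    for photo in sorted(Path(photo_dir).iterdir()):
        for candidate in re.findall(r"\d+", photo.stem):
            if candidate in tags_by_id:
                matched[photo.name] = tags_by_id[candidate]
                break
    return matched
```

The result of `match_photos` is a plain dictionary of file name to tags, which could then be fed to whatever tool writes the metadata back into the images.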

Slack, data portability done right?

Slack… love it or hate it, it seems to be going everywhere…

We recently had to move Slacks for reasons not worth mentioning. As one of the founders of the data portability group way back, I was pretty impressed by how easy it was to export one Slack into another Slack. If only more services would take note!

I found the story of Slack so reminiscent of Flickr’s inception via Game Neverending.

Flickr was famously developed as a side feature for the MMO Game Neverending that Butterfield was developing with his then-wife Caterina Fake and the rest of their company, Ludicorp. The team realized that the photo-sharing aspect of the game could be spun off into its own service.

It always reminds me of sitting in the audience at Doors of Perception 6 in Amsterdam when Stewart Butterfield talked about the concept and plans.

My photo used in Seattle and Ride Sharing article

Uber Lux in Amsterdam

Ben Metcalfe sent me a link to my photo, from when I used Uber in Amsterdam, which was used in an article about ride sharing in Seattle. Of course there’s no problem with it because I mark most of my photos Creative Commons attribution, non-commercial, sharealike.

Google photos vs Flickr Pro for my image backup

Speeding car

It all started when I came back from Tokyo to find my SpiderOak storage full. I decided that keeping a terabyte of photos which are hardly private in super secure storage is a little crazy, and it’s time to put them somewhere else so I can make better use of the secure storage.

Originally I did look at using Amazon Glacier but quickly found out that it’s really not for general use in any shape. I looked at Trovebox again to find it has been shut down… but there is a GitHub community project for those wanting to keep developing.

We’ll be shutting down the hosted Trovebox service on March 31, 2015. We may extend this deadline to help accomodate customers to obtain archives of their photos.

A few of my friends said why don’t I use Flickr, especially since I’m already a pro member and have been since 2004!

I thought about it, because I tend to use Flickr to only upload photos I actively want to share rather than as a place to upload all my photos. Basically I never really trusted the privacy options and only upload things which I’m happy being public. It was time to trust Flickr’s privacy model, but to be fair I’m still only uploading stuff where it doesn’t matter too much if it’s public.

I started doing that, then Google announced at I/O 2015 a revamped photo service with unlimited storage (if you are happy with them converting the photos down a bit).

This has got me wondering if I should switch.

Flickr Pro is $24.99 a year but Google is $1.99 a month for 100GB.

Economically it makes sense to stay with Flickr as it’s unlimited even on high resolution photos and I have most of my good photos already there (incumbency advantage). But the Google space purchase would only be used for photos over 2048x2048px big, which I guess is quite a few as I switched to 5 megapixels and above very early on. I guess there’s the option of trusting Google’s image compression. I guess having the extra space in Google Drive would be useful but it’s not a big deal yet.
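To put the two prices on the same footing, here is a tiny Python sketch of the arithmetic. The prices are the ones quoted in this post; the function names are just for illustration.

```python
FLICKR_PRO_PER_YEAR = 24.99     # USD, as quoted in this post
GOOGLE_100GB_PER_MONTH = 1.99   # USD, as quoted in this post


def yearly_cost(monthly_price):
    """Convert a monthly price into a yearly one, rounded to cents."""
    return round(monthly_price * 12, 2)


def yearly_difference():
    """How much more Flickr Pro costs per year than the Google 100GB plan."""
    return round(FLICKR_PRO_PER_YEAR - yearly_cost(GOOGLE_100GB_PER_MONTH), 2)


if __name__ == "__main__":
    print("Google per year:", yearly_cost(GOOGLE_100GB_PER_MONTH))  # 23.88
    print("Flickr minus Google:", yearly_difference())              # 1.11
```

So per year the two are within about a dollar of each other; the real difference is that the Google plan is capped at 100GB while Flickr Pro is unlimited.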

I’m going to keep uploading the photos and let Google Photos shake out a little. When the next year of Flickr comes up, I’ll decide then. I even made a Google task to remind me. Hopefully there will be Flickr to Google Drive exports, or I’ll have gigabit internet and can upload the lot super fast.

Getty pictures go non-commercial but there is a downside

Although I welcome Getty opening up their picture archive for us bloggers to use, there are concerns or downsides…

  • This isn’t Creative Commons licensed. Attribution-NonCommercial-NoDerivatives would have been good enough, but Getty wanted to add additional conditions.
  • All the images are available via an IFRAME tag. And I thought Flickr’s embed was bad!
  • Reading the terms and conditions… They reserve the right to do what they like within that IFRAME including, as you can imagine, advertising…

Be careful out there people…

Following on… Juliapowles on Twitter mentioned this to me. As Dr. Adam D. Kline says later, the first comment says it all. And it really does sum it all up…

As someone who has grown old and weary fighting Getty’s ‘licence first and clear the rights later, if at all’ business model on behalf of impecunious photographers it is difficult to view this development with unalloyed enthusiasm

Linked data on YouTube?

Triples on YouTube?

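I thought I’d try writing some RDF/Turtle as I see it on YouTube.

# dc: is Dublin Core; the imdb:, tmdb: and cities: namespaces are placeholders, not published vocabularies
@prefix dc: <http://purl.org/dc/elements/1.1/> .
@prefix imdb: <http://example.org/imdb#> .
@prefix tmdb: <http://example.org/tmdb#> .
@prefix cities: <http://example.org/cities#> .

<http://youtu.be/jP6kzKygaOs>
  dc:title "Jason Silva on London Real talks about Vanilla Sky" ;
  dc:publisher "London Real" ;
  dc:subject [
    dc:subject "Vanilla Sky" ;
    dc:type "Film" ;
    imdb:homepage <http://www.imdb.com/title/tt0259711/> ;
    tmdb:homepage <http://www.themoviedb.org/movie/1903-vanilla-sky>
  ] ;
  cities:city "London" .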

Kind of reminds me of when people started hacking triples into Flickr by using machine tags. Still, it’s interesting to see YouTube adding the ability to add a triple in a nice clean way (if it’s a well used ontology of course).
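For anyone who missed the machine tag days: a machine tag is written as namespace:predicate=value, which maps neatly onto a triple once you add the photo as the subject. A rough Python sketch of that mapping follows; the pattern is a simplified guess at what counts as a valid namespace and predicate, not Flickr’s exact rule, and the function names are made up.

```python
import re

# Simplified pattern for namespace:predicate=value (an assumption, not Flickr's exact rule).
MACHINE_TAG = re.compile(r"^([A-Za-z][A-Za-z0-9_]*):([A-Za-z][A-Za-z0-9_]*)=(.+)$")


def parse_machine_tag(tag):
    """Return (namespace, predicate, value) or None for an ordinary tag."""
    match = MACHINE_TAG.match(tag.strip())
    if match is None:
        return None
    return match.groups()


def to_triples(subject, tags):
    """Turn the machine tags on one photo into (subject, predicate, object) triples."""
    triples = []
    for tag in tags:
        parsed = parse_machine_tag(tag)
        if parsed is not None:
            namespace, predicate, value = parsed
            triples.append((subject, f"{namespace}:{predicate}", value))
    return triples
```

Ordinary tags such as `sunset` are simply skipped, so a mixed tag list can be passed in as it is.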

My highlights of Thinking Digital 2013

Herb Kim gets TDC13 underway

Herb Kim, the founder and creator of the Thinking Digital conference.

This is my usual best of Thinking Digital… Bear in mind I missed half the conference as noted here.

Julian Treasure got us thinking about our hearing and how important it is. I especially liked two statements he made: sound has an impact on cognition, and don’t architects have ears? He pointed to some very nice spaces with Apple-styled touches, the kind of place most people would agree is nice, but once he mentioned the amount of sound bouncing around the surfaces and reflecting off the floor, it was a different matter.

Maggie Philbin is one of those people who you grew up with on screen and she had become a geek hero of many men in the UK. The Tomorrow’s World presenter talked about technology for a bit then got around to her main points about the lack of diversity. Something about hearing it from Maggie really laid it out for lots of people. I had the pleasure of seeing Maggie giving the Perceptive Radio a once over too. What a woman!

Maria Giudice

Maria Giudice was an interesting lady with an interesting story to tell, but what I really took away was her DEO idea (Design Executive Officer). She correctly pointed out that CEOs mainly don’t have the background of designers, and those who do generally break through because they are natural disruptors, people centric, intuitive, imaginative, etc. She had a DEO toolkit which included…

  1. Change mindsets about design, Design = change and change leads to radical change
  2. Focus on people and relationships
  3. Think we not me, collaboration is the name of the game
  4. Champion creative culture, write on the walls and make it a creative space
  5. Iterate and change, be open to change
  6. Be true to yourself

I guess if I wanted to know more… I would have to buy the book which is coming out soon…

Aza Raskin
Aza Raskin I have had the pleasure of meeting before, years ago when he was working with Mozilla. I also got to eat dinner with him and others on the first night. Aza is one of those people who you can’t help but like. It was a really interesting time to be chatting with him too, because his company Jawbone had just bought BodyMedia for 100 million. Aza had no problem with talking about such things and was happy to talk about the quantified self elements of Jawbone, including the Jawbone UP wristband. It was even more interesting to me after having just been at the Quantified Self Europe conference a week before.

Aza’s main point was about design being the art of turning constraints into solutions. But are we actually asking the right questions? Do we even really understand the problem we’re solving?

Lots of food for thought… And I’d love to know more about the Jawbone hack!

Sugata Mitra
Sugata Mitra is always impressive and was one of my highlights of last year, but with New Zealand teachers Jo Fothergill & Tara Taylor-Jorgensen, who had flown thousands of miles to come and talk at Thinking Digital, the talks were even more epic. You can’t help but feel the educational system will be fine when he talks. I also had the joy of hooking up Jo and Tara with Mr Whirlwind Alan O’Donohoe before they flew back to New Zealand. Chance and opportunity came together at just the right moment I think.

Graham Hughes

Graham Hughes on reflection was maybe the best talk of the conference.

On New Years Day 2009 Graham Hughes, set off on an epic journey from his hometown of Liverpool. He wanted to show that the world is ‘not some big, scary place, but in fact full of people who wanted to help you.’ He used buses, taxis, trains and his own two feet to travel 160,000 miles, 201 countries in exactly 1,426 days – all on a shoestring of just $100 a week.

I’m not usually the biggest fan of talks about the amazing things people have done, but there was something extra special about Graham. He was just an everyday Joe. He made it into every single country on planet earth without flying, not even once. Such an epic story, and it was told so well with some incredible sub-stories. When the videos come out for Thinking Digital 2013, you must see this one. Epic and so down to earth. I also like to think I helped him with an introduction to someone I know at YouTube. I believe his stories are good enough to make him a bit of an internet superstar; hopefully the YouTube connection will be the start of it. Actually I need to check in and see if anything happened.

Jack Andraka
Jack Andraka I didn’t quite get at first, but then he told the story of how he applied his mindset to the problem of pancreatic cancer after losing a family friend to it. Using just Google he researched a new pancreatic cancer test that is 168 times faster, 26,000 times less expensive and potentially almost 100% accurate. He’s only 17, openly gay and has already been described as the Alan Turing of our age. His talk was exactly what you would imagine from a 17 year old guy. All over the place, but understanding the gravitas of what he was explaining, you couldn’t help but feel how epic his journey has been. I really wish I had stopped and chatted to him in the hotel the next day; I could have called a taxi and still made the train.

Tom Scott burns his top to make a serious point

Tom Scott… What can I say. A talk you can only really do once and once only. Fire and Tom’s hoody, heck what more can you ask for? No but seriously, Tom delved into the idea of archiving our memories. This is something I tried to do a while back with my old phone. Memories are funny things, and they certainly make you pause for thought. I for example have my yellow Brazil football top from 1998, and I’m surprised it even fits! The same year I went to Ibiza and got that crazy Brazil haircut. Maybe I should set fire to it too?

Aral Balkan

I already talked about Aral Balkan in a previous blog but he was rather good, even if I disagreed with a lot of what he was saying. Well rehearsed and cleverly put together for the maximum effect each time. I won’t take that away from Aral, well done.

Also worth mentioning…

Chris Thorpe

Rachel Armstrong

Chris Thorpe and Rachel Armstrong for expanding our minds further than I could maybe take at that moment. My notes are pretty flat but I remember being slightly moved by what they were saying.

Alexa Meade

And finally, Alexa Meade for simply stunning pieces of art which I had only seen once or twice. Important never to forget the impact art can have in a new medium. Painting directly on to people is something very special and the time and dedication really impressed me. She was such a lovely lady too. I don’t know if I would ever let anyone paint on top of me. The feeling of uncleanliness would maybe drive me slowly nutty.

Another great Thinking Digital conference, I just wish I had seen more of the first day…

Post Pam Warhurst to Wikipedia

Pam Warhurst

I received this in my Flickr Mail the other day…

Have you thought about uploading one of your Pam Warhurst pictures to Wikipedia? Her profile (en.wikipedia.org/wiki/Pamela_Warhurst) doesn’t have one and I think www.flickr.com/photos/37421747@N00/7323713702 would be a good fit.

FYI: I am publishing a quote using this picture, credited, on Feb. 27 on my blog yahooeysblog.wordpress.com/

Must have been a slight mistake because looking for the quote I had to do a tag search for Pamela Warhurst. Finally I found this page. Right day wrong month.

“There’s so many people that don’t really recognize a vegetable unless it’s in a bit of plastic with an instruction packet on the top.” (Pamela Warhurst, How we can eat our landscapes)

To be honest I’d love to have one of my pictures used for the Wikipedia entry, but it’s a real pain uploading to Wikipedia when you’ve forgotten the account details (*smile*). So once I sort out the login, I’ll make the changes to Pamela’s entry. May have to do the same for a few other people…

Done….

Bring your own bucket of photos

A little break from perceptive media, but don’t worry it will be back once we have something solid in that area.

I was intrigued to see two things happen one after another. I’ll have to store this one under the decentralised magic or something.

First Tim sends me a tweet about OpenPhoto, which is a decentralised Flickr (my words not theirs).

The inception of OpenPhoto was a desire to liberate our photos and take back control. Like you, our photos are the most valuable digital files we have. Also like you, we’ve used Flickr, Picasa and Smugmug and wound up with our photos scattered across numerous sites on the web.

So I signed up and will be looking to move my pictures from Flickr to OpenPhoto when the Flickr subscription runs out in May. Which reminds me, I need to download all my photos using this lovely app on Ubuntu. OpenPhoto requires you to bring your own bucket or storage system, so right now you can use either Amazon S3 or Dropbox.

Then almost like magic, Dropbox quietly announced they were doing Google+ style automatic upload of photos to your Dropbox account.

Even though the holidays have passed, we’re really stoked to give you guys one more present — a new experimental release! Under the wrapping you’ll find a bunch of new toys, including a brand new Camera Upload feature. Here are the release notes:

• Automatically uploads photos and videos in the background using Wi-Fi or data plan

Together you have a complete solution to sidestep Google’s Picasa, Flickr’s reach and all other solutions. I just hope some of Flickr’s magic has rubbed off on OpenPhoto, as a decentralised Flickr is no easy feat. And I have to wonder how they stay sustainable if the money is going to Dropbox and Amazon.

Update – If OpenPhoto and SpiderOak would work together, I wouldn’t have to move anything at all. Although I’m wondering how permissions would work? I wouldn’t want to just hand over my SpiderOak authentication, as it has lots of my private backups. However it’s worth noting SpiderOak does support Dropbox-like shares and OAuth, so that could work…

BBC Backstage in the Guardian again

So now the ebook is out there, this pretty much spells the end of the BBC Backstage project. The Backstage site & blog will go into a deep freeze so none of the links will be broken. It was interesting to read the Guardian wrap up of Backstage; there were some good quotes from our interview way back in December 2010. But what got me after a while was the slideshow from Rainycat. She was so good at documenting things. Of course afterwards I spent about an hour or so going through my own photos tagged bbc backstage.

I’ll say it again, BBC Backstage was an amazing project to be part of and even run. Not just the big stuff but also the small stuff. The good times (the many events we did, the prototypes and finally getting our own backstage playground servers off the ground) and the times when I thought I might be sacked (such as undermining the podcast trial by launching our own using blip.tv or crossing the hacker/BBC divide by sympathising with the DRM protests).

It’s been simply incredible and looking back through the pictures, I see a lot of really happy people. There’s no doubt I’ll be chasing the high of working on Backstage for quite some time to come. I think I remember a conversation I had with Rain after her attachment came to an end.

"Working for the BBC should be that way, and anything we can do to make that happen is always a good thing."

As always, thanks for the tags, hackers of the new world. Hopefully what we do next will be even more exciting than Backstage ever was.

Discovering lovely workplaces

The old twitter offices

I had a message in my Flickr mail today. The subject read awesome workplaces. I thought it may have been spam at first but I checked it out.

Dear Ian,

I found some great pictures of the Citizen Space workplace here: http://www.officedesigngallery.com/template_permalink.asp?id=154#comments

I would really like to add these pictures to my website: WOVOX.com.

WOVOX is a free and open platform for anyone to show and find workplaces. We want to help people find the workplace of their dreams more easily, but also learn from others and find inspiration for their own workplace.

There is room for credits and a link back to your website or social media profile. Your pictures will be available under an attribute & share alike creative commons license.

I look forward to hearing from you!

Thank you,

Arjen Hoekstra

I totally forgot about the pictures I took of CitizenSpace, the old Twitter offices (see the photo above – yes that’s how Twitter used to look back in 2006ish) and the Creative Commons office, all in San Francisco. I’ve got loads more which I forgot about. A good thing about Flickr is that it keeps all these photos, and of course the Creative Commons license is a sign that I’m willing to share.

Wovox.com actually looks pretty sweet. The ethos seems pretty well thought out too. For example here’s a bit from the user guideline page.

Authenticity! Better show a few things with spirit than a lot of stuff without depth. A mobile phone pic in the heat of the moment is worth much more than a non-descript €1000 pose shot.

Plus it’s really good seeing Creative Commons licenses being baked in from day one rather than being an afterthought (I’m looking at you mixcloud.com crew).

Seeing all these workplaces in one place has somewhat inspired me. I hope to add the BBC media city office to the mix in the near future; it will be interesting to see how it grows as we get more people too.