In “Puppyslugs ‘R Us: Part 0”, I started out quite cheekily on a topic I hope to explore here in a bit more serious detail.
I’ll start with the recent Google “DeepDream” release and the so-called Puppyslug images you’ve perhaps encountered, explaining roughly what those are and how they come to be. I will connect that to AI and algorithms in general and then move specifically to how they already appear in your everyday mobile experience. From there we can paint a picture of what’s in store for us, and why I say the Puppyslugs are Us. I’ll conclude by setting up Part 2, and how all of this lands squarely in the lap of Design to deal with.
A quick background on Puppyslugs.
In early June 2015, Google Researchers started showing off some algorithmically generated visuals, which they originally labeled “Inceptionism”, after the DiCaprio movie of a few years ago, Inception. The algorithms in question use a technology known as Artificial Neural Networks (ANN). You can think of an ANN as a sort of database, or “bunch of memories”, the structure and mechanism of which are inspired by what we currently know of how our biological brains work.
To make an Artificial Neural Network — a.k.a. “to train a model” — you take a bunch of memories (data), run them through the generator and ding you have a self-contained understanding-of-the-world based entirely on the memories you fed it…
Hold that thought… let it dissolve on your tongue. ;p
Two weeks later, the same Researchers open-sourced and released the software code they used in their work. Now because “training a model” requires a lot of data and processing, and ostensibly the buzz they were getting around their work had to do with generating funky images, these researchers also made available one or two such “models”, or Neural Networks, or “artificial minds.”
One of these was generated entirely out of many many pictures of puppies (and apparently some other animals).
What came out of this is what we now call Puppyslugs.
How these images are generated is important.
Not technically how, but conceptually how.
If you ask a person a question, assuming they understood your question, they will answer you based on their knowledge. More specifically, they will formulate an answer to your question out of their Memory, the Situation the question was asked in, and what they believe may be Contextually Relevant in that Situation.
If that person’s memory, knowledge and experience is 100% entirely made up of photos of dogs (and places), and you ask them to describe this photo of me:
well… you get this:
Let me illustrate that.
If, using spoken language, you ask a person about me:
If you ask a Neural Network trained on photos of places and puppies about how it understands a photo of me:
The computer spits out what it comprehends based on what it “knows.”
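For the code-curious, here is a toy sketch of that idea. It is nowhere near what DeepDream actually does: the "memories" are three made-up numbers per photo (furriness, has-a-face, has-a-horizon), the "training" is just averaging, and every name and value in it is hypothetical. All it shows is that a model that only knows puppies and places can only answer in puppies and places.

```python
import math

# Hypothetical, hand-made "memories": (furriness, has_face, has_horizon).
# These numbers are invented for illustration, not taken from any real model.
MEMORIES = {
    "puppy": [(0.9, 1.0, 0.1), (0.8, 0.9, 0.0), (1.0, 1.0, 0.2)],
    "place": [(0.0, 0.0, 0.9), (0.1, 0.0, 1.0), (0.0, 0.1, 0.8)],
}


def train(memories):
    """'Training' here is just averaging each label's examples into one centroid."""
    model = {}
    for label, examples in memories.items():
        count = len(examples)
        model[label] = tuple(sum(column) / count for column in zip(*examples))
    return model


def describe(model, features):
    """Return the known label whose centroid is closest to the input."""
    return min(model, key=lambda label: math.dist(model[label], features))


model = train(MEMORIES)

# A photo of me: a bit of hair, definitely a face, not much horizon.
photo_of_me = (0.3, 1.0, 0.1)
answer = describe(model, photo_of_me)
print(answer)  # prints: puppy
```

There is no "person" anywhere in what the model was fed, so "person" is not an answer it can give. The closest thing it knows is a puppy, so a puppy is what I am.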
Now so far, most of the coverage (that I have seen anyways) of the DeepDream and PuppySlug stuff mainly highlights its “nightmarish imagery” producing capabilities. But you may have already noticed that I am taking us along another route. Something I want you to keep in mind is this: these tools were developed by Google Researchers researching Artificial Intelligence & Deep Learning algorithms and processes as a way to visualize what a given Neural Network “knows and understands.” They did not do this “to make trippy visuals.”
(Extra: This is not part of my point per se, but… “Why Google’s new patent applications are alarming”)
Google is not alone here. Everyone’s on this Quest for the Grail: Facebook, Microsoft, Amazon, IBM, every startup whose pitch deck contains the word “graph”… Also, this is not new. More likely than not, you’ve been exposed to “user experiences” backed by some form of AI before: automated call service systems, spam bots… and “he who shall remain unnamed cough Clippy cough.”
More recently, we’ve gotten to know Apple’s Siri, Microsoft’s Cortana, Amazon’s Echo, and of course Google Now.
The question to ask now is “Why? Why are these companies running after this?” There are about a hundred layers of reasons, with the higher levels being “to make money, duh”, but that’s too broad. Further down the “why stack”, we might arrive at “in order to give the user what they want when they want it, we need to predict what they might want before they ask for it.”
As my former colleague Sarah Ferber put it, the service needs to be able to answer “how can I be happy?” without the user saying that or typing it in or punching a button that says “HAPPY! NOW!”
(I’ve heard people refer to their smartphones as “joy killers” for precisely failing at this. Every. Time. You. Look at it.)
So, ok. Remember I asked you to hold that thought earlier? Let’s bring that back:
To make an Artificial Neural Network — a.k.a. “to train a model” — you take a bunch of memories (data), run them through the generator and *ding* you have a self-contained understanding-of-the-world based entirely on the memories you fed it…
There is a line of thought, Continuity of Consciousness in Identity, that holds that what makes you, you, aside from the meat and bones, is your personality. What makes that personality, what makes you “an individual”, is the accumulation of experiences that have shaped you, all of them somehow stored in your memory, and how they manifest themselves relevantly in a given situation.
Now ask yourself: “Who is the one person most likely to know exactly how to make me happy right now?”
Pro tip: It’s you! With all your memories and experience and preferences and behaviors and quirks and history and comments and likes and favs and shares…
Guess who might be even better? An omniscient, omnipotent version of you! A being that knows everything you know… AND knows everything and can reach anything on the Internet.
Getting a bit carried away there. Let me put my point out clearly:
If someone has a record of everything you say and do on the Internet, they can create, using Artificial Neural Networks, “AI” versions of you who, while keeping an eye on you, can also go and fetch information, products and services for you as you appear to need them, without your having to ask for them.
While it most likely isn’t quite the case yet, very soon, very possibly, when you talk to Google Now, Cortana, Siri or others, it won’t be some random generalized AI you’ll be talking to. It’ll be yourself.
Your own… puppyslug… self.
(The point is, you would be the basis of your own highly networked serendipity engine, a ready and aware network bot agent … mapping out your possibility space as it appears on the event horizon…)
I’ll let you chew on that for now. In the next Part, I’ll pick up the “this is design’s problem” thread, as well as explore some of the obstacles still preventing this from happening.
Until then, please enjoy these Deep Learning Advertising Sales Team classics:
Ohhh can’t you seeee… you belong to meee…
Don’t point that camera at me!
p.s.: much inspiration for this craziness came from many long conversations with my friendly neighbourhood @samim. :)