Since the works of Jules Verne (and arguably well before him), science fiction has been at its most interesting when making predictions about our future. Indeed, much of Verne’s writing is often regarded as prophetic rather than simply imaginative. Consider the flip-style communicators in Star Trek that closely resembled the flip phones of the late 1990s and 2000s, or the self-driving cars in Total Recall. As genre writers of books, movies and TV shows imagine our future, a few focus on making it believable rather than magical or fantastic.
A great example is the recent hit TV series The Expanse. If you have never watched it, you are missing out. I can say without reservation that it’s one of the best sci-fi TV shows of all time. I could go on for hours about it, seriously. For the uninitiated, The Expanse follows a small group of characters a few hundred years in our future. They hail from various parts of a new nation-like planetary order made up of Earth, Mars, and The Belt (the asteroid belt and the outer-planet moons). The main characters are shipmates navigating a solar system on the brink of all-out war when a more existential crisis suddenly emerges: true alien technology is discovered.
Their ship, the Rocinante, is a nuclear fusion-powered masterclass in true “hard science fiction”. There are no transporters, shields, or artificial gravity. Our heroes wear pressurized vacuum suits during space combat because weapons fire almost always results in a loss of internal atmosphere. They inject themselves with a cocktail of drugs (affectionately called “the juice”) to avoid stroke or cardiac arrest during high-gravity thrust. Interested yet? The Expanse has pretty much ruined me for all other sci-fi. It all feels so plausible, and details like a character pouring a drink and lifting the glass to capture the slowly escaping contents of the bottle at ⅛ gravity remind you how committed the series creators are to providing grounding and grit to the world they have created.
The predictions the series makes feel solid, if a bit dystopian at times. Earth will become overpopulated and ruined by climate change, humans will colonize Mars with the dream of eventually terraforming it, and we will again revisit our original sin of slavery by creating a de facto labor class of humans born in and physically altered by The Belt’s microgravity, paid to mine asteroids for the inner planets but doomed never to live on any planet again.
During the first season, I found myself wondering about another prediction. Where are the robots? Droids, HAL 9000s, replicants, Terminators: surely artificial intelligence would be manifest in a few hundred years. This is something people like Elon Musk worry about even today. So, where is the AI to be found in The Expanse? This is the really fascinating part, and I had to re-watch the series to figure it out. It’s everywhere. And it’s nowhere.
Everyone has a hand-terminal (a thin and vastly more capable version of a cell phone), and even though they seem limited to typed input, they are essential to daily life, providing alerts for things like radiation or rapid oxygen depletion. Other computers, like the one aboard the Rocinante, are clearly able to understand and process vast amounts of information, including the nuances, vagaries, slang and creoles of all human languages, all while proactively managing the complexities of space navigation, threat assessment, and combat modeling. The only thing missing is the computer’s verbal reply.
Humans don’t have conversations with the computers in The Expanse; all human/computer interactions are verbally one-directional, with humans talking and computers displaying data instead of talking back. Watching it a second time, it seemed like a glaring omission, but that’s when it hit me. This decision was a very conscious one on the part of the series creators. Computers have clearly evolved in the world of The Expanse, but not in the direction that we in the 21st century seem fixated on.
We want our computers to talk to us, to tell us stories, to entertain us when we’re bored. Sometimes we want them to look like us. We definitely want them to sound like us. Maybe this is why most of our dystopian movies conjure a machine apocalypse with humanoid robots gunning us down; we keep trying to create AI in our own image, and much like Frankenstein’s monster, our creation always turns on us (maybe this is our sign).
The Expanse’s vision is different. In its version of the future, artificial intelligence appears to exist, given the complexity of what the computers do, but it manifests in the background. It augments us, supplements us, complements us, but remains silent and ambient. It’s a bit jarring when you think about our current focus on digital voice assistants; we have fixated on making them more like humans rather than a tool or background service.
What is the Google Assistant, really?
Today, we are spoiled for choice with AI-powered voice assistants, such that it would be difficult to conceive of a future without them. Siri, Alexa, Cortana and even Bixby compete for our attention from within their various branded devices, begging us to ask them questions. But let’s be honest: Google’s Assistant stands head and shoulders above its competitors thanks to Google’s depth of user data and mastery of machine learning.
But what is the Google Assistant, actually? I mean that question in a literal way, not metaphorically. What are we actually referring to when we invoke the name? I think the answer to that question really gets at the core of what Google is, as the Assistant is so central to everything Google wants to be. In a lot of ways, the Assistant is a personified avatar for Google itself. Ask it a question and it will do a Google search. And like a traditional web search, it tries to choose the best answer based on available data, and more recently, on what it knows about you. But unlike a regular web search, you only hear the best answer Google can find. One answer, not a page full of them. Think about that for a second.
Imagine if you Googled something and got only one result! That’s what the Assistant is providing you when you ask it a question. This is former Google CEO Eric Schmidt’s dream, and it’s already happening! This isn’t the Assistant’s only trick. Behind the scenes, the Assistant is so much more than just a pretty voice. In fact, I would argue that the Assistant is actually two things: the voice (including the spoken language model and all of the algorithms that govern its conversations) and the brains (the predictive and contextual engines that actually drive the results). Essentially, there’s an Assistant that “says”, and an Assistant that “does”. And they feel quite different in everyday use. Look, I’m not a computer science guy. My thoughts about the Assistant are purely from the perspective of an end user. I will come back shortly to my feelings about what the Assistant says. But the stuff that the Assistant does is nothing short of stunning.
Consider this scenario. I am at a job site talking to a contractor. I know that I should leave at 1:45 pm to pick up my daughter from school. But at 1:15 pm, something curious happens. My phone pops up an Assistant alert that says, “Leave by 1:30 pm to arrive at School on-time.” As I check the route on Google Maps, I realize there’s a traffic jam adding 15 minutes to my drive. But because of the timely alert, I leave early and get my kiddo on time. This scenario is real and it actually happened to me!
That evening, I started to think about all of the moving pieces that made that alert possible. The Assistant had to read my calendar, periodically check traffic, and then figure out when to generate the alert so it reached me in time to depart by the new deadline! Telling me to leave at 1:30 pm is not really optimal advice if it’s already 1:30 pm!
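For the curious, the timing logic behind an alert like that can be sketched in a few lines of Python. This is only an illustration of the reasoning above, not how Google actually does it: the function name, the 15-minute heads-up, the 2:05 pm pickup time, the 35-minute drive, and the date are all assumptions invented to match the numbers in my story.

```python
from datetime import datetime, timedelta


def plan_departure_alert(arrive_by, travel_minutes, heads_up_minutes=15):
    """Return (leave_by, alert_at) for an appointment.

    Hypothetical helper: arrive_by would come from a calendar event,
    travel_minutes from a live traffic estimate, and heads_up_minutes
    is how much warning the user gets before they have to leave.
    """
    leave_by = arrive_by - timedelta(minutes=travel_minutes)
    alert_at = leave_by - timedelta(minutes=heads_up_minutes)
    return leave_by, alert_at


# Assumed numbers: pickup at 2:05 pm, and a normally 20-minute drive
# that a traffic jam has stretched to 35 minutes.
pickup = datetime(2024, 1, 8, 14, 5)
leave_by, alert_at = plan_departure_alert(pickup, travel_minutes=35)
print("Leave by:", leave_by.strftime("%H:%M"))  # 13:30
print("Alert at:", alert_at.strftime("%H:%M"))  # 13:15
```

The important design point is the heads-up margin: the alert time is derived from the departure deadline, not the other way around, so a worse traffic estimate moves both earlier together.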
This is the stuff, people! This is the exciting, cutting-edge promise of artificial intelligence. Google doesn’t want to simply answer questions, but to predict the things you will need to know, before you need to know them! This is Google’s dream, and it is the beating heart of the Assistant. It was also a bit of déjà vu for me. I had personally experienced this exact same moment 10 years ago. But the Assistant didn’t generate the alert. It was another service long forgotten, called Google Now.
Project Majel – The first Google Assistant
For the Star Trek fans out there, we collectively collapsed onto the fainting couch when we heard that the codename for Google’s new AI was a reference to the beloved voice behind our favorite starship’s computer. (Fun fact: the voice of the ship’s computer for all post-Next Generation Star Trek series was Majel Barrett-Roddenberry. She was married to Star Trek creator Gene Roddenberry and played multiple characters across several series. She was also a class act and all-around awesome lady who I had the pleasure of meeting once at a sci-fi convention.)
The Enterprise computer may have sounded like a person, but it only gave you the cold hard facts. No opinions, platitudes, or conversation skills. Although Project Majel didn’t speak, it mimicked the demeanor of the Enterprise computer. Google Now, as it would eventually be called, had an immediate focus on what Google did best: search, data analytics, and predictions based on those analytics. Search for a flight number? It would appear in your Google Now feed later, showing departure and arrival times. Search for a TV show? Google Now would remind you when it airs. And it could also do magic. Back in October 2012, I was at a job site talking to an employee. My phone (a Galaxy Note 2, if I recall correctly) pushed a Google Now alert to me. I can’t remember specifically what it said, but it essentially advised me that traffic to a place I usually go on weekday evenings was unusually heavy, and that I should maybe think about hitting the road sooner rather than later.
I didn’t quite know what to make of it, but I pulled up Google Maps and checked the route to my daughter’s daycare. Sure enough, the interstate was paralyzed with traffic. I jumped in my car and barely made it on time to pick her up. Sound familiar? Google Now anticipated that I needed the info, analyzed it for relevance, and figured out when to alert me, based not on a calendar event like the Assistant’s alert but on somewhere I frequently traveled to. In some ways, I think this was actually more useful than what the Assistant does today.
I recall some mainstream tech publications described the feature as “creepy”, implying that somehow my phone was spying on me without my permission. I think the phrase I would use is more like “saving my butt”. Google Now had its own launcher and lived in the real estate currently occupied by what’s now known as the Google Feed. It was filled with useful cards like incoming storm alerts, calendar appointments, stock price changes, and more.
Even better, it provided API hooks for third-party apps to deliver update cards of their own! Just talking about it makes me nostalgic. In a lot of ways, it had more “rubber meets the road” utility than the current Assistant. The best part is that Google Now was silent, just like the computers of The Expanse. It never uttered a word unless you activated the voice command features, and even then, the voice really only narrated the visual card that it provided you. By all measures, Google Now was a huge success, earning praise from all but the most die-hard Siri fans. So why was it dismantled?
Unlike other services I have written about, Google Now didn’t fail. Not technically, and not popularly. In fact, it was a success that was only supplanted by Google’s desire to make its avatar more interactive. And if I’m being honest, this obsession is what’s holding the current Assistant back.
The Assistant gets a real voice
Prior to the Assistant’s release in 2016, Google Now did have a voice. Sure, it was reminiscent of Irona from Richie Rich, but it was there. It was hardly the centerpiece of the experience, though. And it was certainly not the voice Google wanted spearheading its move into the living room with products like Google Home, and into the car with Android Auto. So, Google created a new voice from the ground up, merging human recordings with new natural voice machine learning algorithms.
I will be the first to say that it was an amazing upgrade: jaw-droppingly amazing when I first heard it. There are quite a few voices to choose from nowadays, with my personal favorite being the one that sounds like Robin Roberts, and they are all stunning advances that I could only have dreamt about when I was a kid. But none of this explains why Google strategically pivoted to trying to make that voice pass the Turing test; to turn the Assistant into a person.
Sure, it would be cool if the Assistant was good at conversation, or even at understanding natural language as we speak it. But it’s not; at least not consistently. And in seven years, it really hasn’t gotten much better. When you consider Moore’s Law about computing power and Google’s vast resources, this is quite a statement. Even after grafting on features like “continued conversations” and shrinking the language model to place it on-device, conversing with the Assistant is an experience that only children or the bravest among us choose to endure. My wife avoids talking to the Assistant, preferring to type commands or use the touch interface of the Google Home display. She just wants things to work without having to remember the correct incantation of words.
Last week I tried creating a fairly wordy reminder, and the experience was so exhausting I abandoned it and just typed it into my phone. Clearly, half the problem is that the Assistant can’t always match the things we are saying to its internal model of available “things it can do”. Typing on a phone display to accomplish tasks is a lot more forgiving than summoning the Assistant by voice only to be told “Sorry, there was a glitch”. Anything beyond task-related commands is a crapshoot. Sometimes you get a concise answer to a question or a voice-scrape from a web page, but complex questions often result in a “Sorry, I don’t know how to help.”
The problems extend to the successful responses too, and they are even more glaring with the upgraded voice. Our brains chafe at the uncanny valley of its oddly pitched, strung-together words and occasional bizarre phrasing. And let’s not even talk about the awkwardness of the phrase “OK Google”, which I manage to mangle half the time, while the wrong device responds the other half. I shouldn’t feel victorious every time the Assistant gets a verbal interaction right, or feel like punching it to pieces when it fails. It should work like my car or my microwave: reliably and predictably.
Google Assistant could learn a few things from Google Now
Normally in these articles, I talk about how Google turned a past failure into a triumph. Although the Google Assistant has been a success for the company, it strikes me that making it into a person is a mistake. We don’t need to create an artificial person. We need a machine that augments the ways in which people exist today. I’m talking about the people like me whose meat brains are bad at a lot of things, like remembering where we parked. To put it in sci-fi terms, we need Google’s Assistant to be the cybernetic brain implant from Black Mirror, not the droids from Star Wars.
Google Assistant is at its best when it’s popping up reminders, showing me forgotten photos, or offering to navigate to that place I always go every other Thursday morning (my barber). I’m sure it will get better at understanding us when we ask it to do certain tasks like playing a TV show or creating a calendar event, but in the meantime it would be great if Google put intuitive visual shortcuts for all tasks on every device with a screen. Give me more buttons to press, not more phrases to memorize.
While they’re at it, they should bring back the Assistant Snapshot! This was the last vestige of Google Now, tucked away behind the Google Android app until recently, when it was silently killed off. As my colleague Michael Perrigo so eloquently described, the Snapshot was “incredibly useful and automated compared to having to ask Assistant for things manually and having to determine with your own intuition when you’d need to trigger things.”
The Snapshot was the best of what the Assistant does: it was genuinely useful to people in their everyday lives. As the Assistant evolves, I am under no illusions that Google will abandon its anthropomorphic aspirations for its creation. But if it envisions a future where the Assistant is more of a useful tool than a purposeless curiosity, I think it should take a few lessons from its humble and silent predecessor, Google Now.