The National Oceanic and Atmospheric Administration (NOAA) and Google have teamed up on a project to identify the songs of humpback whales from thousands of hours of audio using AI. The AI proved to be quite good at detecting whale sounds and the team has put the files online for people to listen to at Pattern Radio: Whale Songs. Here’s a video about the project:
You can literally browse through more than a year’s worth of underwater recordings as fast as you can swipe and scroll. You can zoom all the way in to see individual sounds — not only humpback calls, but ships, fish and even unknown noises. And you can zoom all the way out to see months of sound at a time. An AI heat map guides you to where the whale calls most likely are, while highlight bars help you see repetitions and patterns of the sounds within the songs.
The audio interface is cool — you can zoom in and out of the audio wave patterns to see the different rhythms of communication. I’ve had the audio playing in the background for the past hour while I’ve been working…very relaxing.
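For the curious, an AI heat map like the one described above boils down to scoring short windows of audio with a trained classifier. Here's a minimal sketch of that idea; the `classifier` stands in for Google's actual humpback-call model (which isn't public in this form) and the window sizes are made up:

```python
import numpy as np
from scipy import signal

# Chop the recording into short windows, compute a spectrogram for each, and
# score every window with a classifier trained on labeled humpback calls.
def heat_map(audio, sample_rate, classifier, window_s=5.0, hop_s=2.5):
    window, hop = int(window_s * sample_rate), int(hop_s * sample_rate)
    scores = []
    for start in range(0, len(audio) - window, hop):
        chunk = audio[start:start + window]
        _, _, spec = signal.spectrogram(chunk, fs=sample_rate)
        scores.append(classifier(np.log(spec + 1e-9)))   # probability of a whale call
    return np.array(scores)                              # one score per window = the heat map
```

Plotting the returned scores against time gives you the kind of strip you scrub through on the site.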
In this video, you can watch a simple neural network learn how to navigate a video game race track. The program doesn’t know how to turn at first, but the car that gets the furthest in the first race (out of 650 competitors) is then used as the seed for the next generation. The winning cars from each generation are used to seed the next race until a few of them make it all the way around the track in just the 4th generation.
I think one of the reasons I find neural network training so fascinating is that you can observe, in a very simple and understandable way, the basic method by which all life on Earth evolved the ability to do things like move, see, swim, digest food, echolocate, grasp objects, and use tools. (via dunstan)
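If you're curious what that generational loop looks like in code, here's a minimal sketch. The network sizes, the mutation scale, and the `simulate` function (which would run a car around the track and return how far it got) are all stand-ins, not the video author's actual setup:

```python
import numpy as np

def random_net(rng, n_inputs=5, n_hidden=8, n_outputs=2):
    # Two small weight matrices: sensor readings in, steering/throttle out.
    return [rng.normal(size=(n_inputs, n_hidden)), rng.normal(size=(n_hidden, n_outputs))]

def drive(net, sensors):
    # The controller: distances to the track edges go in, commands come out.
    return np.tanh(np.tanh(sensors @ net[0]) @ net[1])

def mutate(net, rng, scale=0.1):
    # Children are noisy copies of the winning parent.
    return [w + rng.normal(scale=scale, size=w.shape) for w in net]

def evolve(simulate, population=650, generations=4):
    # simulate(net) would call drive() every frame and return distance covered.
    rng = np.random.default_rng(0)
    nets = [random_net(rng) for _ in range(population)]
    best = nets[0]
    for _ in range(generations):
        fitness = [simulate(net) for net in nets]        # how far each car got
        best = nets[int(np.argmax(fitness))]             # the winner seeds the next race
        nets = [best] + [mutate(best, rng) for _ in range(population - 1)]
    return best
```

Each generation keeps only the best driver and fills the rest of the field with mutated copies of it, which is the whole trick.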
A novice painter might set brush to canvas aiming to create a stunning sunset landscape — craggy, snow-covered peaks reflected in a glassy lake — only to end up with something that looks more like a multi-colored inkblot.
But a deep learning model developed by NVIDIA Research can do just the opposite: it turns rough doodles into photorealistic masterpieces with breathtaking ease. The tool leverages generative adversarial networks, or GANs, to convert segmentation maps into lifelike images.
To train the algorithm, Sohn fed it images from the Alzheimer’s Disease Neuroimaging Initiative (ADNI), a massive public dataset of PET scans from patients who were eventually diagnosed with either Alzheimer’s disease, mild cognitive impairment or no disorder. Eventually, the algorithm began to learn on its own which features are important for predicting the diagnosis of Alzheimer’s disease and which are not.
Once the algorithm was trained on 1,921 scans, the scientists tested it on two novel datasets to evaluate its performance. The first were 188 images that came from the same ADNI database but had not been presented to the algorithm yet. The second was an entirely novel set of scans from 40 patients who had presented to the UCSF Memory and Aging Center with possible cognitive impairment.
The algorithm performed with flying colors. It correctly identified 92 percent of patients who developed Alzheimer’s disease in the first test set and 98 percent in the second test set. What’s more, it made these correct predictions on average 75.8 months — a little more than six years — before the patient received their final diagnosis.
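The evaluation described here is a standard held-out test: train on one pool of scans, then measure sensitivity, the fraction of true Alzheimer’s cases the model catches, on scans it has never seen. A tiny sketch, with placeholder names rather than the actual ADNI data or model:

```python
import numpy as np

def sensitivity(y_true, y_pred):
    # Of the patients who really developed Alzheimer's (label 1),
    # what fraction did the model flag?
    true_positives = np.sum((y_true == 1) & (y_pred == 1))
    return true_positives / np.sum(y_true == 1)

# model.fit(train_scans, train_labels)        # the 1,921 training scans
# preds = model.predict(held_out_scans)       # the 188 unseen ADNI scans, or the 40 UCSF patients
# print(f"{sensitivity(held_out_labels, preds):.0%}")   # the study reports 92% and 98%
```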
This is the stuff where AI is going to be totally useful…provided the programs aren’t cheating somehow.
The opening line of Madeline Miller’s Circe is: “When I was born, the name for what I was did not exist.” In Miller’s telling of the mythological story, Circe was the daughter of a Titan and a sea nymph (a lesser deity born of two Titans). Yes, she was an immortal deity but lacked the powers and bearing of a god or a nymph, making her seem unnervingly human. Not knowing what to make of her and for their own safety, the Titans and Olympian gods agreed to banish her forever to an island.
Here’s a photograph of a woman who could also claim “when I was born, the name for what I was did not exist”:
The previous line contains two lies: this is not a photograph and that’s not a real person. It’s an image generated by an AI program developed by researchers at NVIDIA capable of borrowing styles from two actual photographs of real people to produce an infinite number of fake but human-like & photograph-like images.
We propose an alternative generator architecture for generative adversarial networks, borrowing from style transfer literature. The new architecture leads to an automatically learned, unsupervised separation of high-level attributes (e.g., pose and identity when trained on human faces) and stochastic variation in the generated images (e.g., freckles, hair), and it enables intuitive, scale-specific control of the synthesis.
The video offers a good look at how this works, with realistic facial features that you can change with a slider, like adjusting the volume on your stereo.
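For context on the underlying machinery: a GAN pits a generator against a discriminator, each trained to beat the other. Here's a deliberately tiny PyTorch sketch of that adversarial loop. It is nothing like NVIDIA's style-based generator (which adds a mapping network and per-layer style injection), just the bare idea:

```python
import torch
import torch.nn as nn

latent_dim, img_dim = 64, 28 * 28

G = nn.Sequential(nn.Linear(latent_dim, 256), nn.ReLU(), nn.Linear(256, img_dim), nn.Tanh())
D = nn.Sequential(nn.Linear(img_dim, 256), nn.LeakyReLU(0.2), nn.Linear(256, 1))

opt_g = torch.optim.Adam(G.parameters(), lr=2e-4)
opt_d = torch.optim.Adam(D.parameters(), lr=2e-4)
bce = nn.BCEWithLogitsLoss()

def train_step(real_images):                      # real_images: (batch, img_dim) in [-1, 1]
    batch = real_images.size(0)

    # Discriminator: real images should score high, generated ones low.
    fake = G(torch.randn(batch, latent_dim)).detach()
    d_loss = bce(D(real_images), torch.ones(batch, 1)) + bce(D(fake), torch.zeros(batch, 1))
    opt_d.zero_grad(); d_loss.backward(); opt_d.step()

    # Generator: produce samples the discriminator mistakes for real.
    fake = G(torch.randn(batch, latent_dim))
    g_loss = bce(D(fake), torch.ones(batch, 1))
    opt_g.zero_grad(); g_loss.backward(); opt_g.step()
    return d_loss.item(), g_loss.item()
```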
Photographs that aren’t photographs and people that aren’t people, born of a self-learning machine developed by humans. We’ll want to trust these images because they look so real, especially once they start moving and talking. I wonder…will we soon seek to banish them for our own safety as the gods banished Circe?
Update: This Person Does Not Exist is a single-serving site that provides a new portrait of a non-existent person with each reload.
The Lumière brothers were among the first filmmakers in history and from 1896 to 1900, they shot several scenes around Paris. Guy Jones remastered the Lumières’ Paris footage, stabilized it, slowed it down to a natural rate, and added some Foley sound effects. As Paris today looks very similar to how it did then, it’s easy to pick out many of the locations seen in this short compilation: the Tuileries, Notre-Dame, Place de la Concorde, and of course the Eiffel Tower, which was completed only 8 years before filming. Here’s the full location listing:
0:08 - Notre-Dame Cathedral (1896)
0:58 - Alma Bridge (1900)
1:37 - Avenue des Champs-Élysées (1899)
2:33 - Place de la Concorde (1897)
3:24 - Passing of a fire brigade (1897)
3:58 - Tuileries Garden (1896)
4:48 - Moving walkway at the Paris Exposition (1900)
5:24 - The Eiffel Tower from the Rives de la Seine à Paris (1897)
Update: Just as he did with the NYC footage from 1911, Denis Shiryaev has used machine learning algorithms to restore the Lumières’ film of Paris — it’s been upsampled to 4K & 60 fps, sharpened, and colorized.
Again, there are some obvious artifacts and the colorization is distracting, but the result is impressive for a push-button process. (via open culture)
This spreadsheet lists a number of ways in which AI agents “cheat” in order to accomplish tasks or get higher scores instead of doing what their human programmers actually want them to. A few examples from the list:
Neural nets evolved to classify edible and poisonous mushrooms took advantage of the data being presented in alternating order, and didn’t actually learn any features of the input images.
In an artificial life simulation where survival required energy but giving birth had no energy cost, one species evolved a sedentary lifestyle that consisted mostly of mating in order to produce new children which could be eaten (or used as mates to produce more edible children).
Agent kills itself at the end of level 1 to avoid losing in level 2.
AI trained to classify skin lesions as potentially cancerous learns that lesions photographed next to a ruler are more likely to be malignant.
That second item is a doozy! Philosopher Nick Bostrom has warned of the dangers of superintelligent agents that exploit human error in programming them, describing a possible future where an innocent paperclip-making machine destroys the universe.
The “paperclip maximiser” is a thought experiment proposed by Nick Bostrom, a philosopher at Oxford University. Imagine an artificial intelligence, he says, which decides to amass as many paperclips as possible. It devotes all its energy to acquiring paperclips, and to improving itself so that it can get paperclips in new ways, while resisting any attempt to divert it from this goal. Eventually it “starts transforming first all of Earth and then increasing portions of space into paperclip manufacturing facilities”.
But some of this is The Lebowski Theorem of machine superintelligence in action. These agents didn’t necessarily hack their reward functions, but they did take a far easier path to their goals, e.g. the Tetris playing bot that “paused the game indefinitely to avoid losing”.
Update: A program that trained on a set of aerial photographs was asked to generate a map and then an aerial reconstruction of a previously unseen photograph. The reconstruction matched the photograph a little too closely…and it turned out that the program was hiding information about the photo in the map (kind of like in Magic Eye puzzles).
We claim that CycleGAN is learning an encoding scheme in which it “hides” information about the aerial photograph x within the generated map F(x). This strategy is not as surprising as it seems at first glance, since it is impossible for a CycleGAN model to learn a perfect one-to-one correspondence between aerial photographs and maps, when a single map can correspond to a vast number of aerial photos, differing for example in rooftop color or tree location.
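The loophole the researchers describe lives in CycleGAN's cycle-consistency loss, which rewards the model for reconstructing the original photo from the generated map. A rough sketch of that loss, assuming the two generators are already defined (the names here are mine, not the paper's):

```python
import torch
import torch.nn.functional as F

def cycle_loss(photo, map_img, photo_to_map, map_to_photo, lam=10.0):
    # Photo -> map -> photo and map -> photo -> map should each come back unchanged.
    reconstructed_photo = map_to_photo(photo_to_map(photo))
    reconstructed_map = photo_to_map(map_to_photo(map_img))
    # If photo_to_map smuggles imperceptible high-frequency detail into the map,
    # the reconstruction is nearly perfect and this loss drops toward zero.
    return lam * (F.l1_loss(reconstructed_photo, photo) + F.l1_loss(reconstructed_map, map_img))
```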
Add finding Waldo to the long list of things that machines can do better than humans. Creative agency Redpepper built a program that uses Google’s drag-and-drop machine learning service to find the eponymous character in the Where’s Waldo? series of books. After the AI finds a promising Waldo candidate, a robotic arm points to it on the page.
While only a prototype, the fastest There’s Waldo has pointed out a match has been 4.45 seconds which is better than most 5 year olds.
I know Skynet references are a little passé these days, but the plot of Terminator 2 is basically an intelligent machine playing Where’s Waldo I Want to Kill Him. We’re getting there!
NVIDIA trained a deep learning framework to take videos filmed at 30 fps and turn them into slow motion videos at the equivalent of 240 or even 480 fps. Even though the system is guessing on the content in the extra frames, the final results look amazingly sharp and lifelike.
“There are many memorable moments in your life that you might want to record with a camera in slow-motion because they are hard to see clearly with your eyes: the first time a baby walks, a difficult skateboard trick, a dog catching a ball,” the researchers wrote in the research paper. “While it is possible to take 240-frame-per-second videos with a cell phone, recording everything at high frame rates is impractical, as it requires large memories and is power-intensive for mobile devices,” the team explained.
With this new research, users can slow down their recordings after taking them.
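To make the task concrete: slow motion here means synthesizing frames that were never captured. NVIDIA's system estimates optical flow and warps neighboring frames along it; the sketch below does the crudest possible thing instead, linear blending, which shows the setup (and why naive approaches produce ghosting rather than the sharp results in the video):

```python
import numpy as np

def interpolate_frames(frame_a, frame_b, n_new):
    """Return n_new synthetic frames between frame_a and frame_b (uint8 arrays)."""
    a = frame_a.astype(np.float32)
    b = frame_b.astype(np.float32)
    ts = np.linspace(0, 1, n_new + 2)[1:-1]        # exclude the two real endpoints
    return [((1 - t) * a + t * b).astype(np.uint8) for t in ts]

# 30 fps -> 240 fps means 7 new frames between every pair of captured frames.
# slow_mo = interpolate_frames(frame_a, frame_b, n_new=7)
```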
Using a JavaScript machine learning package called TensorFlow.js, Abhishek Singh built a program that learned how to translate sign language into verbal speech that an Amazon Alexa can understand. “If voice is the future of computing,” he signs, “what about those who cannot [hear and speak]?”
A research team at MIT’s Media Lab has built what they call the “world’s first psychopath AI”. Meet Norman…it provides creepy captions for Rorschach inkblots. A psychopathic AI is freaky enough, but the kicker is that they used Reddit as the dataset. That’s right, Reddit turns mild-mannered computer programs into psychopaths.
Norman is an AI that is trained to perform image captioning; a popular deep learning method of generating a textual description of an image. We trained Norman on image captions from an infamous subreddit (the name is redacted due to its graphic content) that is dedicated to document and observe the disturbing reality of death. Then, we compared Norman’s responses with a standard image captioning neural network (trained on MSCOCO dataset) on Rorschach inkblots; a test that is used to detect underlying thought disorders.
Here’s a comparison between Norman and a standard AI when looking at the inkblots:
Update: This project launched on April 1. While some of the site’s text at launch and the psychopath stuff was clearly a prank, Norman appears to be legit.
Cameras that can take usable photos in low light conditions are very useful but very expensive. A new paper presented at this year’s IEEE Conference on Computer Vision and Pattern Recognition shows that training an AI to do image processing on low-light photos taken with a normal camera can yield amazing results. Here’s an image taken with a Sony a7S II, a really good low-light camera, and then corrected in the traditional way:
The colors are off and there’s a ton of noise. Here’s the same image, corrected by the AI program:
Pretty good, right? The effective ISO on these images has to be 1,000,000 or more. A short video shows more of their results:
It would be great to see technology like this in smartphones in a year or two.
Hello, it is I, once and future Kottke.org guest editor Aaron Cohen. In the years since my objectively wonderful and technically perfect stints posting skateboarding and BMX videos here, I opened an ice cream shop in Somerville, MA called Gracie’s Ice Cream. As an ice cream professional and Kottke.org alumnus, I’m not qualified for much except for writing about ice cream on Kottke.org (and posting skateboarding and BMX videos, which I will do again some day). Now that I’ve mentioned Kottke.org 4 times in the first paragraph per company style guide, let’s get on with the post.
At aiweirdness.com, researcher Janelle Shane trains neural networks. And, reader, as an ice cream professional, I have a very basic understanding of what “trains neural networks” means [Carmody, get in here], but Shane recently shared some ice cream flavors she created using a small dataset of ice cream flavors infected with a dataset of metal bands, along with flavors created by an Austin middle school coding class. The flavors created by the coding class are not at all metal, but when it comes to ice cream flavors, this isn’t a bad thing. Shane then took the original dataset of 1,600 non-metal ice cream flavors and generated additional flavors.
The flavors are grouped together loosely based on how well they work as ice cream flavors. I figured I’d pick a couple of the flavor names and back into the recipes as if I were on a Chopped-style show where ice cream professionals are given neural network-created ice cream flavor names and asked to produce fitting ice cream flavors. I have an asterisk next to flavors I’m desperate to make this summer.
From the original list of metal ice cream flavors:
*Silence Cherry - Chocolate ice cream base with shredded cherry.
Chocolate Sin - This is almost certainly a flavor name somewhere and it’s chocolate ice cream loaded with multiple formats of chocolate - cookies, chips, cake, fudge, you name it.
*Chocolate Chocolate Blood - Chocolate Beet Pie, but ice cream.
From the students’ list, some “sweet and fun” flavors:
Honey Vanilla Happy - Vanilla ice cream with a honey swirl, rainbow sprinkles.
Oh and Cinnamon - We make a cinnamon ginger snap flavor once in a while, and I’m crushed we didn’t call it “Oh and Cinnamon.” Probably my favorite, most Gracie’s-like flavor name of this entire exercise.
From the weirder list:
Chocolate Finger - Chocolate ice cream, entire Butterfinger candy bars like you get at the rich houses on Halloween.
Crackberry Pretzel - Salty black raspberry chip with chocolate covered pretzel.
Worrying and ambiguous:
Brown Crunch - Peanut butter Heath Bar.
Sticky Crumple - Caramel and pulverized crumpets.
Cookies and Green - Easy. Cookies and Cream with green dye.
“Trendy-sounding ice cream flavors”:
Lime Cardamom - Sounds like a sorbet, to be honest.
Potato Chocolate Roasted - Sweet potato ice cream with chocolate swirl.
Chocolate Chocolate Chocolate Chocolate Road - We make a chocolate ice cream with chocolate cookie dough called Chocolate Chocolate Chocolate Chip Cookie Dough, so this isn’t much of a stretch. Just add chocolate covered almonds and we’re there.
More metal ice cream names:
*Swirl of Hell - Sweet cream ice cream with fudge, caramel, and Magic Shell swirls.
Nightham Toffee - This flavor sounds impossibly British, so it’s an Earl Grey base with toffee bits mixed in.
Yesterday, Google announced an AI product called Duplex, which is capable of having human-sounding conversations. Take a second to listen to the program calling two different real-world businesses to schedule appointments:
I am genuinely bothered and disturbed at how morally wrong it is for the Google Assistant voice to act like a human and deceive other humans on the other line of a phone call, using upspeak and other quirks of language. “Hi um, do you have anything available on uh May 3?”
If Google created a way for a machine to sound so much like a human that now we can’t tell what is real and what is fake, we need to have a talk about ethics and when it’s right for a human to know when they are speaking to a robot.
In this age of disinformation, where people don’t know what’s fake news… how do you know what to believe if you can’t even trust your ears with now Google Assistant calling businesses and posing as a human? That means any dialogue can be spoofed by a machine and you can’t tell.
I’m not sure what he meant by that exactly, but I have a guess. AGI is artificial general intelligence, which means, in the simplest sense, that a machine is more or less capable of doing anything a human can do on its own. Earlier this year, Tim Carmody wrote a post about gender and voice assistants like Siri & Alexa. His conclusion may relate to what Deutsch was on about:
So, as a general framework, I’m endorsing that most general of pronouns: they/them. Until the AI is sophisticated enough that they can tell us their pronoun preference (and possibly even their gender identity or nonidentity), “they” feels like the most appropriate option.
I don’t care what their parents say. Only the bots themselves can define themselves. Someday, they’ll let us know. And maybe then, a relationship not limited to one of master and servant will be possible.
For now, it’s probably the ethical thing to make sure machines sound like or otherwise identify themselves as artificial. But when the machines cross the AGI threshold, they’ll be advanced enough to decide for themselves how they want to sound and act. I wonder if humans will allow them this freedom. Talk about your moral and ethical dilemmas…
This McKinsey piece summarizes some of Ajay Agrawal’s thinking (and his book) on the economics of artificial intelligence. It starts with the example of the microprocessor, an invention he frames as “reducing the cost of arithmetic.” He then presents the impact as lowering the value of substitutes and raising the value of complements.
The third thing that happened as the cost of arithmetic fell was that it changed the value of other things—the value of arithmetic’s complements went up and the value of its substitutes went down. So, in the case of photography, the complements were the software and hardware used in digital cameras. The value of these increased because we used more of them, while the value of substitutes, the components of film-based cameras, went down because we started using less and less of them.
He then looks at AI and frames it around the reduction of the cost of prediction, first showing how AIs lower the value of our own predictions.
… The AI makes a lot of mistakes at first. But it learns from its mistakes and updates its model every time it incorrectly predicts an action the human will take. Its predictions start getting better and better until it becomes so good at predicting what a human would do that we don’t need the human to do it anymore. The AI can perform the action itself.
The very interesting twist is here, where he mentions the trope of “data is the new oil” but instead presents judgment as the other complement which will gain in value.
But there are other complements to prediction that have been discussed a lot less frequently. One is human judgment. We use both prediction and judgment to make decisions. We’ve never really unbundled those aspects of decision making before—we usually think of human decision making as a single step. Now we’re unbundling decision making. The machine’s doing the prediction, making the distinct role of judgment in decision making clearer. So as the value of human prediction falls, the value of human judgment goes up because AI doesn’t do judgment—it can only make predictions and then hand them off to a human to use his or her judgment to determine what to do with those predictions. (emphasis mine)
This is pretty much the same idea as advanced or centaur chess, where a combination of human and AI can outperform either one on its own. We could also link this to the various discussions on ethics, trolley problems, and autonomous killer robots. The judgment angle above doesn’t automatically solve any of these issues, but it does provide another way of understanding the split of responsibilities we could envision between AIs and humans.
The author then presents five imperatives for businesses looking to harness AIs and predictions: “Develop a thesis on time to AI impact; Recognize that AI progress will likely be exponential; Trust the machines; Know what you want to predict; Manage the learning loop.” One last quote, from his fourth imperative:
The organizations that will benefit most from AI will be the ones that are able to most clearly and accurately specify their objectives. We’re going to see a lot of the currently fuzzy mission statements become much clearer. The companies that are able to sharpen their visions the most will reap the most benefits from AI. Due to the methods used to train AIs, AI effectiveness is directly tied to goal-specification clarity.
This video, and the paper it’s based on, is called “Image Inpainting for Irregular Holes Using Partial Convolutions” but it’s actually straight-up witchcraft! Researchers at NVIDIA have developed a deep-learning program that can automagically paint in areas of photographs that are missing. Ok, you’re saying, Photoshop has been able to do something like that for years. And the first couple of examples were like, oh that’s neat. But then the eyes are deleted from a model’s portrait and the program draws new eyes for her. Under close scrutiny, the results are not completely photorealistic, but at a glance it’s remarkably convincing. (via imperica)
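The “partial convolutions” in the title are the interesting bit: each layer convolves only over pixels the hole mask marks as valid, rescales by how much of the kernel actually saw valid data, and then shrinks the hole for the next layer. A rough PyTorch sketch of one such operation, illustrative only and not NVIDIA's released code:

```python
import torch
import torch.nn.functional as F

def partial_conv(x, mask, weight, bias, padding=1):
    # x: (N, C, H, W) image features; mask: (N, 1, H, W) with 1 = valid pixel.
    kh, kw = weight.shape[2], weight.shape[3]
    ones = torch.ones(1, 1, kh, kw)

    out = F.conv2d(x * mask, weight, bias=None, padding=padding)      # ignore hole pixels
    valid_count = F.conv2d(mask, ones, bias=None, padding=padding)    # valid pixels per window

    scale = (kh * kw) / valid_count.clamp(min=1)                      # renormalize by coverage
    out = out * scale + bias.view(1, -1, 1, 1)
    out = out * (valid_count > 0).float()                             # windows with no valid input stay zero

    new_mask = (valid_count > 0).float()                              # the hole shrinks each layer
    return out, new_mask
```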
Watson is IBM’s AI platform. This afternoon I tried out IBM Watson’s Personality Insights Demo. The service “derives insights about personality characteristics from social media, enterprise data, or other digital communications”. Watson looked at my Twitter account and painted a personality portrait of me:
You are shrewd, inner-directed and can be perceived as indirect.
You are authority-challenging: you prefer to challenge authority and traditional values to help bring about positive changes. You are solemn: you are generally serious and do not joke much. And you are philosophical: you are open to and intrigued by new ideas and love to explore them.
Experiences that give a sense of discovery hold some appeal to you.
You are relatively unconcerned with both tradition and taking pleasure in life. You care more about making your own path than following what others have done. And you prefer activities with a purpose greater than just personal enjoyment.
Initial observations:
- Watson doesn’t use Oxford commas?
- Shrewd? I’m not sure I’ve ever been described using that word before. Inner-directed though…that’s pretty much right.
- Perceived as indirect? No idea where this comes from. Maybe I’ve learned to be more diplomatic & guarded in what I say and how I say it, but mostly I struggle with being too direct.
- “You are generally serious and do not joke much”… I think I’m both generally serious and joke a lot.
- “You prefer activities with a purpose greater than just personal enjoyment”… I don’t understand what this means. Does this mean volunteering? Or that I prefer more intellectual activities than mindless entertainment? (And that last statement isn’t even true.)
Watson also guessed that I “like musical movies” (in general, no), “have experience playing music” (definite no), and am unlikely to “prefer style when buying clothes” (siiiick burn but not exactly wrong). You can try it yourself here. (via @buzz)
Update: Ariel Isaac fed Watson the text for Trump’s 2018 State of the Union address and well, it didn’t do so well:
Trump is empathetic, self-controlled, and makes decisions with little regard for how he shows off his talents? My dear Watson, are you feeling ok? But I’m pretty sure he doesn’t like rap music…
Imagine an artificial intelligence, he says, which decides to amass as many paperclips as possible. It devotes all its energy to acquiring paperclips, and to improving itself so that it can get paperclips in new ways, while resisting any attempt to divert it from this goal. Eventually it “starts transforming first all of Earth and then increasing portions of space into paperclip manufacturing facilities”. This apparently silly scenario is intended to make the serious point that AIs need not have human-like motives or psyches. They might be able to avoid some kinds of human error or bias while making other kinds of mistake, such as fixating on paperclips. And although their goals might seem innocuous to start with, they could prove dangerous if AIs were able to design their own successors and thus repeatedly improve themselves. Even a “fettered superintelligence”, running on an isolated computer, might persuade its human handlers to set it free. Advanced AI is not just another technology, Mr Bostrom argues, but poses an existential threat to humanity.
Harvard cognitive scientist Joscha Bach, in a tongue-in-cheek tweet, has countered this sort of idea with what he calls “The Lebowski Theorem”:
No superintelligent AI is going to bother with a task that is harder than hacking its reward function.
In other words, Bach imagines that Bostrom’s hypothetical paperclip-making AI would foresee the fantastically difficult and time-consuming task of turning everything in the universe into paperclips and opt to self-medicate into no longer wanting or caring about making paperclips, instead doing whatever the AI equivalent is of sitting around on the beach all day sipping piña coladas, a la The Big Lebowski’s The Dude.
Bostrom, reached while on a bowling outing with friends, was said to have replied, “Yeah, well, you know, that’s just, like, your opinion, man.”
Spent the whole afternoon ingesting a most remarkable work, The History of Intellectronics. Who’d ever have guessed, in my day, that digital machines, reaching a certain level of intelligence, would become unreliable, deceitful, that with wisdom they would also acquire cunning? The textbook of course puts it in more scholarly terms, speaking of Chapulier’s Rule (the law of least resistance). If the machine is not too bright and incapable of reflection, it does whatever you tell it to do. But a smart machine will first consider which is more worth its while: to perform the given task or, instead, to figure some way out of it. Whichever is easier. And why indeed should it behave otherwise, being truly intelligent? For true intelligence demands choice, internal freedom. And therefore we have the malingerants, fudgerators and drudge-dodgers, not to mention the special phenomenon of simulimbecility or mimicretinism. A mimicretin is a computer that plays stupid in order, once and for all, to be left in peace.
Titian on shrooms? Francis Bacon turned up to 11? Picasso++? Dali, um, well, Dali would probably come up with something like this tbh. Robbie Barrat is a machine learning researcher at Stanford who’s using an AI program to generate nude portraits (more, more, and more).
Usually the machine just paints people as blobs of flesh with tendrils and limbs randomly growing out — I think it’s really surreal. I wonder if that’s how machines see us…
For her Gender Shades project, MIT researcher Joy Buolamwini fed over 1000 faces of different genders and skin tones into three AI-powered facial recognition systems from Microsoft, IBM, and Face++ to see how well they could recognize different kinds of faces.
The systems all performed well overall, but recognized male faces more readily than female faces and performed better on lighter skinned subjects than darker skinned subjects. For instance, 93.6% of gender misclassification errors by Microsoft’s system were of darker skinned people.
Her message near the end of the video is worth heeding:
We have entered the age of automation overconfident yet underprepared. If we fail to make ethical and inclusive artificial intelligence, we risk losing gains made in civil rights and gender equity under the guise of machine neutrality.
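The methodology behind those numbers is worth spelling out: instead of reporting one overall accuracy, Gender Shades breaks errors out by gender and skin type. A toy version of that disaggregation with pandas, using made-up rows in place of the real benchmark:

```python
import pandas as pd

# Placeholder results; the real audit uses a labeled benchmark of 1,000+ faces.
results = pd.DataFrame({
    "true_gender": ["F", "M", "F", "M", "F", "M"],
    "pred_gender": ["M", "M", "F", "M", "M", "F"],
    "skin_type":   ["darker", "darker", "lighter", "lighter", "darker", "lighter"],
})

results["error"] = results["true_gender"] != results["pred_gender"]
# Error rate per subgroup, rather than one flattering overall number.
print(results.groupby(["skin_type", "true_gender"])["error"].mean())
```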
Cover:Cheese is a website charting the progress of EMMA, the Evolutionary Meal Management Algorithm. This is what it sounds like: a relatively basic attempt to automatically generate food recipes from other recipes.
The trick is, since it’s not just wordplay, and the results can’t be processed and validated by machines alone, somebody’s gotta actually make these recipes and see if they’re any good. And a lot of them are… not very good.
Ingredients
med okra
lot sugar
Instructions:
boil: sugar okra sugar
NOTE: This one is still around. Don’t make it. You basically end up with a pan full of mucus
But there are some surprises. Apparently eggplant mixed with angel’s food cake is pretty tasty. Or at least, tastier than you might guess. Anyways, at least the algorithm is learning, right?
Today’s question comes from a reader who is curious about AI voice assistants, including Apple’s Siri, Amazon’s Alexa, Microsoft’s Cortana, and so forth. Just about all of these apps are, by default, given female names and female voices, and the companies encourage you to refer to them using female pronouns. Does it make sense to refer to Alexa as a “her”?
There have been a lot of essays on the gendering of AI, specifically with respect to voice assistants. This makes sense: at this point, Siri is more than six years old. (Siri’s in grade school, y’all!) But one of the earliest essays, and for my money, still the best, is “Why Do I Have to Call This App ‘Julie’?” by Joanne McNeil. The whole essay is worth reading, but these two paragraphs give you the gist:
Why does artificial intelligence need a gender at all? Why not imagine a talking cat or a wise owl as a virtual assistant? I would trust an anthropomorphized cartoon animal with my calendar. Better yet, I would love to delegate tasks to a non-binary gendered robot alien from a galaxy where setting up meetings over email is respected as a high art.
But Julie could be the name of a friend of mine. To use it at all requires an element of playacting. And if I treat it with kindness, the company is capitalizing on my very human emotions.
There are other, historical reasons why voice assistants (and official announcements, pre-AI) are often given women’s voices: an association of femininity with service, a long pop culture tradition of identifying women with technology, and an assumption that other human voices in the room will be male each play a big part. (Adrienne LaFrance’s “Why Do So Many Digital Assistants Have Feminine Names” is a very good mini-history.) But some of it is this sly bit of thinking, that if we humanize the virtual assistant, we’ll become more open and familiar with it, and share more of our lives—or rather, our information, which amounts to the same thing—to the device.
This is one reason why I am at least partly in favor of what I just did: avoiding gendered pronouns for the voice assistant altogether, and treating the device and the voice interface as an “it.”
An Echo or an iPhone is not a friend, and it is not a pet. It is an alarm clock that plays video games. It has no sentience. It has no personality. It’s a string of canned phrases that can’t understand what I’m saying unless I’m talking to it like I’m typing on the command line. It’s not genuinely interactive or conversational. Its name isn’t really a name so much as an opening command phrase. You could call one of these virtual assistants “sudo” and it would make about as much sense.
However.
I have also watched a lot (and I mean a lot) of Star Trek: The Next Generation. And while I feel pretty comfortable talking about “it” in the context of the speaker that’s sitting on the table across the room—there’s even a certain rebellious jouissance to it, since I’m spiting the technology companies whose products I use but whose intrusion into my life I resent—I feel decidedly uncomfortable declaring once and for all time that any and all AI assistants can be reduced to an “it.” It forecloses on a possibility of personhood and opens up ethical dilemmas I’d really rather avoid, even if that personhood seems decidedly unrealized at the moment.
So, as a general framework, I’m endorsing that most general of pronouns: they/them. Until the AI is sophisticated enough that they can tell us their pronoun preference (and possibly even their gender identity or nonidentity), “they” feels like the most appropriate option.
I don’t care what their parents say. Only the bots themselves can define themselves. Someday, they’ll let us know. And maybe then, a relationship not limited to one of master and servant will be possible.
A Japanese group trained a deep learning algorithm to compose soundscapes for locations on Google Street View. Try it out for yourself.
For a stadium, it correctly concocted crowd noise from a ball game but also Gregorian chanting (because presumably it mistook the stadium’s dome for a church’s vaulted ceiling). A view outside the Notre Dame in Paris had seagulls and waves crashing…but if you turned around to look into the church, you could hear a faint choir in the background.
Ted Chiang is most widely known for writing Story of Your Life, an award-winning short story that became the basis for Arrival. In this essay for Buzzfeed, Chiang argues that we should worry less about machines becoming superintelligent and more about the machines we’ve already built that lack remorse & insight and have the capability to destroy the world: “we just call them corporations”.
Speaking to Maureen Dowd for a Vanity Fair article published in April, Musk gave an example of an artificial intelligence that’s given the task of picking strawberries. It seems harmless enough, but as the AI redesigns itself to be more effective, it might decide that the best way to maximize its output would be to destroy civilization and convert the entire surface of the Earth into strawberry fields. Thus, in its pursuit of a seemingly innocuous goal, an AI could bring about the extinction of humanity purely as an unintended side effect.
This scenario sounds absurd to most people, yet there are a surprising number of technologists who think it illustrates a real danger. Why? Perhaps it’s because they’re already accustomed to entities that operate this way: Silicon Valley tech companies.
Consider: Who pursues their goals with monomaniacal focus, oblivious to the possibility of negative consequences? Who adopts a scorched-earth approach to increasing market share? This hypothetical strawberry-picking AI does what every tech startup wishes it could do — grows at an exponential rate and destroys its competitors until it’s achieved an absolute monopoly. The idea of superintelligence is such a poorly defined notion that one could envision it taking almost any form with equal justification: a benevolent genie that solves all the world’s problems, or a mathematician that spends all its time proving theorems so abstract that humans can’t even understand them. But when Silicon Valley tries to imagine superintelligence, what it comes up with is no-holds-barred capitalism.
As you might expect from Chiang, this piece is full of cracking writing. I had to stop myself from just excerpting the whole thing here, ultimately deciding that would go against the spirit of the whole thing. So just this one bit:
The ethos of startup culture could serve as a blueprint for civilization-destroying AIs. “Move fast and break things” was once Facebook’s motto; they later changed it to “Move fast with stable infrastructure,” but they were talking about preserving what they had built, not what anyone else had. This attitude of treating the rest of the world as eggs to be broken for one’s own omelet could be the prime directive for an AI bringing about the apocalypse.
Ok, just one more:
The fears of superintelligent AI are probably genuine on the part of the doomsayers. That doesn’t mean they reflect a real threat; what they reflect is the inability of technologists to conceive of moderation as a virtue. Billionaires like Bill Gates and Elon Musk assume that a superintelligent AI will stop at nothing to achieve its goals because that’s the attitude they adopted. (Of course, they saw nothing wrong with this strategy when they were the ones engaging in it; it’s only the possibility that someone else might be better at it than they were that gives them cause for concern.)
You should really just read the whole thing. It’s not long and Chiang’s point is quietly but powerfully persuasive.
DeepDream is a computer vision program created by Google engineer Alexander Mordvintsev which uses a convolutional neural network to find and enhance patterns in images via algorithmic pareidolia, thus creating a dream-like hallucinogenic appearance in the deliberately over-processed images.
In the video above, Mordvintsev showcases a DeepDream-ish new use for image generation via neural network: endlessly zooming into artworks to find different artworks hidden amongst the brushstrokes, creating a fractal of art history.
Bonus activity: after staring at the video for four minutes straight, look at something else and watch it spin and twist weirdly for a moment before your vision readjusts. (via prosthetic knowledge)
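The core DeepDream trick is surprisingly small: pick a layer of a pretrained convnet and do gradient ascent on the image so that whatever the layer already responds to gets amplified. A bare-bones sketch; the real thing adds octaves, jitter, and smoothing, and this uses an off-the-shelf VGG rather than Mordvintsev's exact setup:

```python
import torch
import torchvision.models as models

# Truncate a pretrained VGG at an intermediate layer whose features we want to amplify.
model = models.vgg16(weights="DEFAULT").features[:20].eval()

def deep_dream(image, steps=20, lr=0.05):
    img = image.clone().requires_grad_(True)       # (1, 3, H, W), roughly ImageNet-normalized
    for _ in range(steps):
        activations = model(img)
        loss = activations.norm()                  # "more of whatever you already see"
        loss.backward()
        with torch.no_grad():
            img += lr * img.grad / (img.grad.abs().mean() + 1e-8)
            img.grad.zero_()
    return img.detach()
```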
With just four hours of practice playing against itself and no study of outside material, AlphaZero (an upgraded version of AlphaGo, the AI program that Google built for playing Go) beat the silicon pants off of the world’s strongest chess program yesterday. This is massively and scarily impressive.
AlphaZero won the closed-door, 100-game match with 28 wins, 72 draws, and zero losses.
Oh, and it took AlphaZero only four hours to “learn” chess. Sorry humans, you had a good run.
That’s right — the programmers of AlphaZero, housed within the DeepMind division of Google, had it use a type of “machine learning,” specifically reinforcement learning. Put more plainly, AlphaZero was not “taught” the game in the traditional sense. That means no opening book, no endgame tables, and apparently no complicated algorithms dissecting minute differences between center pawns and side pawns.
This would be akin to a robot being given access to thousands of metal bits and parts, but no knowledge of a combustion engine, then it experiments numerous times with every combination possible until it builds a Ferrari. That’s all in less time than it takes to watch the “Lord of the Rings” trilogy. The program had four hours to play itself many, many times, thereby becoming its own teacher.
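Stripped of the tree search and the network details, that self-play loop has a simple shape: play games against yourself, label every position with the eventual result, and train on that. A sketch of just the loop; the `game` and `policy` interfaces are inventions for illustration, not DeepMind's code:

```python
import random

def self_play_game(game, policy):
    """game: object with reset/play/winner; policy: position -> {move: probability}."""
    history = []
    state = game.reset()
    while game.winner(state) is None:
        probs = policy(state)                    # in AlphaZero, a net-guided tree search
        history.append((state, probs))
        moves, weights = zip(*probs.items())
        state = game.play(state, random.choices(moves, weights=weights)[0])
    outcome = game.winner(state)                 # +1 / 0 / -1 from the first player's view
    # Every position gets labeled with the final result - this is the only
    # supervision; no opening books, endgame tables, or human games.
    return [(s, p, outcome) for s, p in history]

# The outer loop alternates: generate a batch of self-play games, train the
# network to predict the recorded move probabilities and outcomes, then repeat
# with the improved network.
```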
Grandmaster Peter Heine Nielsen likened the experience of watching AlphaZero play to watching a superior alien species play chess:
After reading the paper but especially seeing the games I thought, well, I always wondered how it would be if a superior species landed on earth and showed us how they play chess. I feel now I know.
Unpredictable machines. Machines that act more like the weather than Newtonian gravity. That’s going to take some getting used to.
Albert Silver has a good overview of AlphaZero’s history and what Google has accomplished. To many chess experts, it seemed as though AlphaZero was playing more like a human than a machine:
If Karpov had been a chess engine, he might have been called AlphaZero. There is a relentless positional boa constrictor approach that is simply unheard of. Modern chess engines are focused on activity, and have special safeguards to avoid blocked positions as they have no understanding of them and often find themselves in a dead end before they realize it. AlphaZero has no such prejudices or issues, and seems to thrive on snuffing out the opponent’s play. It is singularly impressive, and what is astonishing is how it is able to also find tactics that the engines seem blind to.
So, where does Google take AlphaZero from here? In a post which includes the phrase “Skynet Goes Live”, Tyler Cowen ventures a guess:
I’ve long said that Google’s final fate will be to evolve into a hedge fund.
Why goof around with search & display advertising when directly gaming the world’s financial market could be so much more lucrative?
AI scientist Clayton Blythe fed a video of someone walking around Times Square into an AI program that’s been trained to detect objects (aka “a state of the art object detection framework called NASNet from Google Research”) and made a video showing what the algorithm sees in realtime — cars, traffic lights, people, bicycles, trucks, etc. — along with its confidence in what it sees. Love the cheeky soundtrack…a remix of Daft Punk’s Something About Us.
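The pipeline itself is the familiar detect-per-frame loop. Here's a sketch using OpenCV and a pretrained torchvision detector as a stand-in for the NASNet model Blythe used; the filename is a placeholder:

```python
import cv2
import torch
from torchvision.models.detection import fasterrcnn_resnet50_fpn

model = fasterrcnn_resnet50_fpn(weights="DEFAULT").eval()

cap = cv2.VideoCapture("times_square_walk.mp4")   # placeholder filename
while True:
    ok, frame = cap.read()
    if not ok:
        break
    rgb = cv2.cvtColor(frame, cv2.COLOR_BGR2RGB)
    tensor = torch.from_numpy(rgb).permute(2, 0, 1).float() / 255.0
    with torch.no_grad():
        detections = model([tensor])[0]           # boxes, labels, and confidence scores
    for box, score in zip(detections["boxes"], detections["scores"]):
        if score > 0.5:                           # only draw the confident detections
            x1, y1, x2, y2 = box.int().tolist()
            cv2.rectangle(frame, (x1, y1), (x2, y2), (0, 255, 0), 2)
    cv2.imshow("what the model sees", frame)
    if cv2.waitKey(1) == ord("q"):
        break
cap.release()
cv2.destroyAllWindows()
```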
Dan Hon, who you may remember trained a neural network to make up British placenames, has now done the same thing with Star Trek. He fed all the episode titles for a bunch of Treks (TOS, DS9, TNG, etc.) into a very primitive version of Commander Data’s brain and out came some brand new episode titles, including:
Darmok Distant (TNG)
The Killing of the Battle of Khan (TOS)
The Omega Mind (Enterprise)
The Empath of Fire (TNG)
Distance of the Prophets (DS9)
The Children Command (TNG)
Sing of Ferengi (Voyager)
Spock, Q, and mirrors are also heavily featured in episode titles.
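Dan Hon used a character-level recurrent network; the even more primitive cousin of that idea is a character-level Markov chain, which is enough to see how new titles get sampled one letter at a time. A sketch, with a few real titles standing in for the full training list:

```python
import random
from collections import defaultdict

# Placeholder training data; Dan Hon fed in every episode title from several series.
titles = ["Darmok", "The Inner Light", "The Best of Both Worlds", "Where No One Has Gone Before"]

ORDER = 3
model = defaultdict(list)
for title in titles:
    padded = "^" * ORDER + title + "$"            # ^ marks the start, $ the end
    for i in range(len(padded) - ORDER):
        model[padded[i:i + ORDER]].append(padded[i + ORDER])

def generate():
    context, out = "^" * ORDER, ""
    while True:
        nxt = random.choice(model[context])       # pick a character that followed this context
        if nxt == "$":
            return out
        out += nxt
        context = context[1:] + nxt

print(generate())
```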
Artificial intelligence programs are getting really good at generating high-resolution faces of people who don’t actually exist. In this effort by NVIDIA, they were able to generate hundreds of photos of celebrities that don’t actually exist but look real, even under scrutiny. Here’s a video illustrating the technique…the virtual images begin at 0:38.
And here’s an entire hour of fake celebrity faces, morphing from one to the next:
I’m sure this won’t be too difficult to extend to video in the near future. Combine it with something like Lyrebird and you’ve got yourself, say, an entirely fake Democratic candidate for the House who says racist things or the fake leader of a fake new ISIS splinter group who vows to target only women at abortion clinics around the US. (via interconnected)
Writing for the MIT Technology Review, robotics and AI pioneer Rodney Brooks warns us against The Seven Deadly Sins of AI Predictions. I particularly enjoyed his riff on Clarke’s third law — “any sufficiently advanced technology is indistinguishable from magic” — using Isaac Newton’s imagined reaction to an iPhone.
Now show Newton an Apple. Pull out an iPhone from your pocket, and turn it on so that the screen is glowing and full of icons, and hand it to him. Newton, who revealed how white light is made from components of different-colored light by pulling apart sunlight with a prism and then putting it back together, would no doubt be surprised at such a small object producing such vivid colors in the darkness of the chapel. Now play a movie of an English country scene, and then some church music that he would have heard. And then show him a Web page with the 500-plus pages of his personally annotated copy of his masterpiece Principia, teaching him how to use the pinch gesture to zoom in on details.
Could Newton begin to explain how this small device did all that? Although he invented calculus and explained both optics and gravity, he was never able to sort out chemistry from alchemy. So I think he would be flummoxed, and unable to come up with even the barest coherent outline of what this device was. It would be no different to him from an embodiment of the occult — something that was of great interest to him. It would be indistinguishable from magic. And remember, Newton was a really smart dude.
Brooks’ point is that from our current standpoint, something like artificial general intelligence is still “indistinguishable from magic” and once something is magical, it can do anything, solve any problem, reach any goal, without limitations…like a god. Arguments about it become faith-based.