The State of Machine Learning in 2020
Deep learning. Machine learning. Neural nets. Whatever term is employed, there’s little doubt that artificial intelligence will shape our future: how we work, shop, communicate, monitor our health, drive (or don’t), and more.
Interviews by Andrew Postman
Artwork by Jason Mecier
NYU has long been at the vanguard of the AI revolution, and its prominence in the field has surged of late. With a hyper-collaborative approach, award-winning faculty, and a groundbreaking institute, the university is teaching, studying, and applying the subject seemingly everywhere, in places expected (the Tandon School of Engineering), unexpected (the College of Arts and Science’s Carter Journalism Institute), and new (the interdisciplinary AI Now Institute, a pioneering academy examining the social implications of the technology). What exactly is happening? We’ll let 10 resident experts fill you in, starting with one of the three godfathers of AI and winner of the Turing Award, considered the Nobel Prize of computing.
Professor, Center for Data Science, Center for Neural Science, Courant Institute of Mathematical Sciences, Tandon School of Engineering; Director of AI Research at Facebook
There are a number of ways to be arrogant or naive about AI. If you have not realized after a few years of experience that it’s very difficult to reduce your grand ideas to practice, that means you’re self-deluded or a crackpot. Another way is to oversell what AI does and will do in the future. The history of AI is littered with people who said within 10 years the world champion in chess would be a computer, or by 1992 we’d have HAL 9000 [the computer in 2001: A Space Odyssey]. But a naive solution to almost any AI task often requires a ridiculously, impractically large amount of computation. For example, you could play chess by exploring all possible sequences of moves. But it’s a huge tree of possibilities. It’s just too big to explore fully, even with a powerful computer. We could also be arrogant by deciding what type of AI technology should or should not be developed. But that decision should be in the hands of society at large.
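The chess example above can be made concrete with a little arithmetic. The sketch below is only an illustration: the figures of about 35 legal moves per position and about 80 half-moves per game are common ballpark assumptions, not numbers from the interview, and the function names are hypothetical.

```python
# Rough illustration of why exploring every sequence of chess moves is impractical.
# Assumptions (not from the interview): roughly 35 legal moves per position and
# games of roughly 80 plies (half-moves). Both are ballpark figures used only
# to show how quickly the tree grows.

def count_move_sequences(branching_factor: int, depth: int) -> int:
    """Number of distinct move sequences of exactly `depth` plies,
    assuming the same number of legal moves at every position."""
    return branching_factor ** depth


def years_to_enumerate(sequences: int, per_second: float) -> float:
    """Years needed to visit every sequence at a fixed rate per second."""
    seconds = sequences / per_second
    return seconds / (60 * 60 * 24 * 365)


if __name__ == "__main__":
    for depth in (2, 4, 8, 80):
        n = count_move_sequences(35, depth)
        # len(str(n)) - 1 gives the order of magnitude of n.
        print(f"{depth:>2} plies: roughly 10^{len(str(n)) - 1} move sequences")
    # Hypothetical machine that checks one billion sequences per second.
    print(f"{years_to_enumerate(count_move_sequences(35, 80), 1e9):.1e} years for 80 plies")
```

Under these assumptions the count grows by a factor of 35 with every additional half-move, which is why practical game-playing programs examine only a small part of the tree.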
I’m known for having developed a technique called convolutional neural network, which is used for almost all image and speech recognition applications. It is used in all automatic emergency braking systems in cars. This technology reduces collisions by 40 percent and saves lives. It’s also used in systems that analyze medical images—detecting breast cancer in mammograms, outlining the contour of bones for orthopedic surgery. It makes the radiologist more efficient and reduces the false negative rate. The same underlying technology is used for monitoring wildlife. But it’s also used by the Chinese government to spy on people. The bad uses are generally by governments that are not democratic, so the best protection is strong democratic institutions.
Humans, many mammals, and many birds have autonomous intelligence in that their behavior is not entirely hard-wired but indirectly determined by an objective that is built into us by evolution, in our basal ganglia. It tells us we’re hungry. We need to satisfy this objective by eating. But it won’t tell us how.
Some of us have done experiments where you train the system to predict what will happen. If I turn the wheel to the right, what will happen to me or the cars around me? If you have a predictive system where the task is defined by an objective and not the direct action, then that machine will have emotions. If it predicts that a collision may be coming up, it will avoid it, as if it is animated by fear.
In the last 10 years, there’s been a revolution in AI due to the emergence of deep learning, the enabling technology that has brought about good speech recognition, good autonomous driving, image recognition, text generation, all that stuff.
Thirty years ago, when we started playing with convolutional nets and the backpropagation algorithm that we use to train those neural nets, we thought it was super exciting. Turned out the system needed a lot of data to be trained, and data was rarely available. Then the internet happened and society became digital. Now there’s lots of data, and those methods turned out to really thrive on large data.
In the mid-2000s, a small group of us here that included Rob Fergus [computer science professor at the Courant Institute of Mathematical Sciences] started working on those things. In the late 2000s, there were a few tasks in computer vision that deep learning and convolutional nets were solving really well, like pedestrian detection. Then around 2010, people working in speech recognition switched to using deep learning for what’s called acoustic modeling, which takes the speech signal and identifies the sequence of sounds being pronounced. That brought down the error rates on speech recognition by a big amount. Within 18 months of the first prototypes, it was in everybody’s hands—in smartphones by Google, Microsoft, Apple. That was a watershed moment. At the end of 2012, early 2013, an even bigger revolution happened in computer vision. The ImageNet competition made a large data set available with over one million training samples in 1,000 categories. That competition was won handily by our friends at the University of Toronto. Using very large convolutional nets running on graphics processing units, they reduced the error rate by such a huge margin that people said, “Stop everything. Let’s use that now.”
In 2014, researchers at Google showed that deep learning systems could perform language translation with performance comparable to that of more classical approaches. The following year, a team at the University of Montreal, which included Kyunghyun Cho, now an associate professor of computer science at Courant and of data science at the Center for Data Science [CDS], proposed a neural net architecture for translation that worked so well that within a year, all the Facebooks, Googles, Microsofts, IBMs of the world were using it in their translation services. This was a watershed.
Then, at the end of 2018, a group from Google proposed a new neural net architecture to represent text called BERT. You show this giant neural network a long sequence of words from a text, take some words out, and train the network to predict the words that are missing. In the process of learning to do this, it learns to represent the meaning of text by long lists of numbers (called vectors). This led to a revolution in text understanding by machines, blowing away all the records on a set of tasks used to evaluate computer understanding of text called GLUE, partially designed by Sam Bowman, who is an assistant professor of linguistics in the Faculty of Arts and Science and of data science at CDS. For example, you give a system a sentence like “The trophy doesn’t fit in the suitcase because it is too large,” then ask which word “it” refers to (suitcase or trophy?). Until recently, the best systems were only 60 percent correct. The latest record is 93 percent. Humans get 95 percent.
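The fill-in-the-missing-words training task described above can be sketched in a few lines. This is a simplified illustration of how one training example might be prepared, not the BERT implementation: it works on whole words, omits the neural network entirely, and the `[MASK]` placeholder, the 15 percent masking rate, and the function name are assumptions made for the example.

```python
import random

MASK = "[MASK]"  # placeholder token; the name is an assumption for this sketch


def make_masked_example(words, mask_fraction=0.15, rng=None):
    """Hide a fraction of the words and record what was hidden.

    Returns (masked_words, targets), where targets maps each hidden
    position to the original word a model would be trained to predict.
    """
    if not words:
        raise ValueError("need at least one word")
    rng = rng or random.Random()
    # Always hide at least one word so the example has something to predict.
    n_to_mask = max(1, round(len(words) * mask_fraction))
    positions = sorted(rng.sample(range(len(words)), n_to_mask))
    masked = list(words)
    targets = {}
    for pos in positions:
        targets[pos] = words[pos]
        masked[pos] = MASK
    return masked, targets


if __name__ == "__main__":
    sentence = "the trophy does not fit in the suitcase because it is too large"
    masked, targets = make_masked_example(sentence.split(), rng=random.Random(0))
    print(" ".join(masked))
    print(targets)
```

A network trained on many such pairs never sees labels written by people; the text itself supplies the answers, which is what lets this approach use very large amounts of raw text.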
This has immediate consequences on society. Facebook, YouTube, and Twitter need to be able to take down hate speech. You need a system that takes a piece of text, turns it into a representation of the meaning that’s independent of the language, and then feeds that to a classifier that has been trained with tons of data to tell you: Is this hate speech or not? Those technologies are crucial to be able to do this.
NYU is particularly strong in natural language processing. A team here is developing new paradigms of machine learning. The two big challenges in AI are: How do we make machines learn to reason as opposed to just learning to perceive, and how do we get machines to learn as efficiently as humans and animals? Machines need the equivalent of 80 hours of training playing a simple video game to reach the performance that any human reaches in about 15 minutes. If you want the computer to be good at playing the board game Go, you need to run it for two weeks in parallel on 2,000 computers. It must play 4 million games before it reaches a good performance.
Using similar algorithms, a self-driving car would have to run off a cliff hundreds of times before it figures out it’s a bad idea. It would be better to figure out how to reproduce the type of learning humans perform. We can learn to drive in 20 hours because we have this predictive model that tells us: if I’m near a cliff and I turn the wheel to the right, nothing good will come of it. How we get machines to learn predictive models of the world like humans and animals is the big challenge of the next decade or two.
A good thing is that the vast majority of AI research is published and available for free to anyone. When I joined Facebook, I explained to Mark Zuckerberg that the way you make a research lab work is you mandate people to publish results. If you forbid scientists from publishing, the best ones won’t join. So from the start, we published everything we were doing really quickly, open-sourced all the code, and that prompted Google to do the same.
Stacie Grossman Bloom
Vice Provost for Research
The university’s biggest advantage in AI is our broad expertise—from the most fundamental research areas to the applications across science and society. We also consider the legal, philosophical, and ethical implications of AI. And we encourage people to think outside their disciplines and schools to work collaboratively.
NYU has a unique combination of unparalleled advantages: established visibility in machine learning; a long-standing reputation in applied mathematics; one of the premier data science centers nationwide; an unparalleled location within a short distance of major NYC industry research labs; rapidly growing interest from traditional NYC industries (finance, insurance, healthcare); and a top medical school fully committed to supporting medical AI research.
Further, the interdisciplinary research at NYU makes us a leader in ethical AI and responsible data management. NYU’s ability to bring new products to market and create new start-ups will add significant value to our AI endeavors and drive entrepreneurship. On the education side, the demand for AI-related courses far exceeds the supply: we expect that soon every student in sciences, social sciences, and engineering will want to take an AI-related course.
With investment, NYU will become the intellectual hub for AI in NYC and in the world. We will develop NYU as a world-class leader in AI research, solidifying our established excellence in deep learning and expanding it to key aspects of AI and its myriad applications. NYU will train the next generation of leading AI scientists and will engage stakeholders from across academia, industry, and government to form a robust, innovative network at the cutting edge of the most pressing open questions in AI research today.
Dean, Tandon School of Engineering
Courant, the Center for Data Science, and Tandon kind of jump-started the AI work being done at NYU. We are hiring more, but it’s hard because so is everyone else. Since we are a school of engineering, robotics is a big deal. Robotics without AI does not mean much.
There are lots of terms that are conflated with one another: AI, machine learning, neural network, data science. There really is no accepted definition of these. I’d say AI and data science started in parallel. AI is the science of getting computers to do tasks that normally require human intelligence, which requires the computer or machine to learn, to do predictive analysis, and to adapt.
New faculty member S. Farokh Atashzar, an assistant professor in the departments of electrical and computer engineering, as well as mechanical and aerospace engineering, does neural-assisted and neural-rehabilitation robotics. Let’s say you have a stroke. To regain some mobility, you have robotic attachments to your legs that make you exercise. If you make a wrong movement, they learn how you react and adapt to how they’re moving your legs. This is AI—the learning, predicting, and adapting.
I have worked for more than a decade on biomedical imaging, and the project that still goes on involves middle ear infection. Many otoscopes can take a picture of the tympanic membrane. If there’s an infection, that tympanic membrane will be bulging and red. But it could also be a buildup of water behind the membrane, which produces a similar picture, but there is no infection. Because it’s hard to tell them apart, pediatricians give an antibiotic, which is useless if it’s water buildup. With my collaborators at the University of Pittsburgh School of Medicine, I created an algorithm that takes these images and gives an output that says whether or not there is an infection. Highly skilled pediatricians can make the distinction with close to 80 percent accuracy, while algorithms we developed can do it with close to 95 percent accuracy. We say our work is a diagnostic aid, which means we say to the pediatrician, “Think about it more because what we are seeing might be something different.”
Some exciting Tandon projects include computer science and engineering professor Nasir Memon’s use of AI to fight deepfakes and bots; electrical and computer engineering assistant professor Anna Choromanska’s use of AI in autonomous vehicles; technology management and innovation professor Oded Nov’s work helping medical experts understand AI tools; and chemical and biomolecular engineering assistant professor Miguel Modestino’s use of AI to create sustainable chemical processes to produce precursors of nylon.
Director, Center for Data Science
We have masses of data coming in various shapes and forms from various domains. How do we transform it into knowledge? Even though the tools or methods are borrowed from various fields, you can argue that this is its own new field.
The basic goal of research in AI is to develop the scientific and engineering foundations of automation of tasks. And the focus of data science is to make predictions based on large amounts of data. I’d argue that data science is much wider than AI because we are connecting with neuroscience, medicine, biology, physics, social science, even the humanities. All of these have big data and grapple with similar problems.
The Center for Data Science [CDS] is a provostial unit, not a department or school. NYU is visionary to be one of the first investing in this area. We share most of the faculty at the CILVR [Computational Intelligence, Learning, Vision, and Robotics] Lab, because in some sense all those experts are working in the fields of both AI and data science.
The big successes of deep learning rely on feeding huge amounts of data into the neural networks. Even if we’ve never seen a tiger before, humans will run when they see one, whereas machines have to see a million tigers to understand it’s not safe. CDS faculty, joint with psychology, neuroscience, and linguistics, are trying to understand how humans build these concepts so much faster.
CDS offers a whole gamut of programs—from a bachelor’s degree to a PhD in data science—and we have built ethics and responsibility into the curriculum.
Russel E. Caflisch
Director, Courant Institute of Mathematical Sciences
AI used to be a rather specialized topic, mostly in computer science (CS). With the success of so-called deep learning or neural networks, people from many disciplines became involved. The focus is still mainly on CS, because core CS methods drive most of the applications.
A tipping point for AI occurred around 2011 with the classification of images and recognition of objects. This was done by Yann LeCun here at NYU, Geoffrey Hinton at the University of Toronto, and Yoshua Bengio at the University of Montreal. They pushed it from being an interesting idea to being an explosively powerful method.
One goal is to get away from the need for big data and to enable machines to learn from a small number of examples like people do.
There are a few AI methods generating a lot of success, but I believe that number will grow. One limitation is that when a method doesn’t work, we don’t know why, and when it does work, we also don’t know why. New ideas are often poorly understood at first. For example, it took decades for quantum mechanics to be understood and accepted.
Chair, Department of Computer Science, Courant Institute of Mathematical Sciences
The roots of AI are primarily in computer science, and it’s an increasingly significant part of computer science, integrating influences of, and having impact in, many other disciplines. At NYU, the history of AI goes back to our department’s founding, but our main efforts to expand in this area started in the 2000s, way before AI became a household word. We were very lucky to hire Yann LeCun in 2003. Now about a third, if not a greater fraction, of our faculty in one way or another are involved in AI research.
Yann, Geoffrey Hinton, and Yoshua Bengio were behind the deep learning revolution in the late 2000s. This technology, once considered a dead end, started outperforming most competing approaches. Once this happened, a lot of research and applications shifted to deep learning–based techniques. That resulted in major advances, in areas from computer vision (e.g., facial recognition) to natural language processing (e.g., Google Translate).
Courant, Tandon, and the Center for Data Science started an NYU AI initiative aiming to develop AI at NYU as an integrated whole, across different schools, departments, and centers. Applications of AI are now ubiquitous: some of my colleagues are working with the medical school on AI techniques identifying patients at risk of life-threatening conditions based on medical records. A Tandon faculty member works on applications of AI to surgical and rehabilitation robotics. A collaboration of physicists and computer scientists aims to apply AI techniques to applications in astrophysics. AI is already widely used in biology and is now making inroads in applications from art to crime prediction. The list goes on and on.
Chair, Department of Computer Science and Engineering, Tandon School of Engineering
Tandon is increasingly getting into AI for health, supporting doctors at the Grossman School of Medicine in diagnosis or interpretation of image data, or collecting patient information to do prediction. We work with ophthalmology for studying eye diseases, diagnosis in mammography, learning how the infant brain grows. A second effort is robotics. Four robotics faculty have been hired over the last few years. Another is the concern about responsibly using AI.
We have two computer science departments. Courant may lean more toward theoretical aspects, whereas Tandon historically has a stronger effort in engineering. We have a very strong security program and medical imaging, for example, while Courant has computer vision. About three years ago, when I became chair, there was a strong push to meet regularly and write documents together. A white paper—Computing at NYU—summarized our efforts. The byproduct of the paper was the formation of the university’s AI Now Institute. We increasingly synchronize when we hire faculty. We share the interview talks. We are working on a joint PhD program between our computer science institutions.
This AI initiative will help us be more competitive so we can hire the best faculty. It will also be a very strong incentive for the best of the best undergraduate, graduate, and PhD students, and also postdocs, to come to Tandon.
Assistant Professor, Center for Data Science and Tandon School of Engineering
In addition to computer scientists doing basic research in AI, we have expertise in essentially every domain where AI is used, from urban data to medical data to politics. The political and policy climate in New York City also makes us unique.
I was on New York City’s Automated Decision Systems Task Force. Based on our recommendations, the mayor issued an executive order to establish an Algorithms Management and Policy Officer, who, with assistance from external and internal advisory bodies, will help the city understand how to oversee, audit, and regulate its use of algorithms when making decisions. New York City is the first to have passed a law of this sort, with much participation from NYU.
An example is the Department of Education’s use of algorithms to match children with middle schools and high schools. It uses data, preferences of the families and of the schools, and policy as constraints. We want to ensure that members of the public understand these algorithm-assisted processes, can challenge their decisions, and can provide meaningful input on their design and operation.
Another area I’m very interested in is employment. Companies use algorithmic assistance to screen résumés and even to decide what job ads are shown to which individuals. And we often see disparities—by gender, race, and disability status—in the way these systems respond to individuals.
Part of the difficulty is that the platforms don’t disclose how these systems work. They should be both legally compelled and incentivized to proactively become good citizens in this space. We have indications that platforms actually want to be told what is legitimate, in part to avoid liability.
Cofounder/Codirector, AI Now Institute
I was at Google, and [AI Now Institute] cofounder Kate Crawford was at Microsoft. We were concerned about the effectiveness of AI technologies, about the claims being made, and about what it meant to have technologies created by large corporations implemented in very sensitive contexts, often in ways that weren’t being tested or measured. In 2016, the Obama administration asked us to host one of their Future of AI symposia. In the process of pulling that together, we realized that there wasn’t another center of study that was focused on what it meant to be applying these technologies on humans right now. We decided to start one ourselves.
We have a number of partners. We copublished a toolkit with the ACLU that helps lawyers and civil rights organizations understand the potential implications of AI and algorithmic systems. We host a regular workshop for lawyers trying cases that involve algorithmic systems.
Our research shows that those using AI tend to be those who already have power—employers, police, government agencies. And the people on whom AI is being used tend to be those who do not have power, or particular populations that have borne historical bias or exclusion. In a lot of ways, AI is increasing the power of those who already have it and diminishing the power of those who don’t.
We’ve studied the diversity crisis in AI extensively. The systems are created in shiny conference rooms in Silicon Valley, say, and applied in diverse contexts across the globe, affecting billions. We see a feedback loop between the bias in these technologies and those narrow worldviews. Plus, AI is produced by corporations whose duty is to the shareholders.
Associate Professor, Data Journalism, Carter Journalism Institute at Arts and Science; Affiliate Faculty Member, Center for Data Science; Author, Artificial Unintelligence
I teach data journalism, which is the practice of finding stories in numbers and using numbers to tell stories. I teach journalists how to code, work with data in order to do algorithmic accountability reporting, and do visual journalism or investigative reporting using data.
NYU’s new undergraduate major in data science and its new minor in journalism will allow students to bring together data science education with rigorous journalistic thinking.
Traditionally, the media’s role was to hold decision makers accountable. In a world where decisions are increasingly made by algorithms, it’s important to keep them and their creators accountable. One way is to audit algorithms for racial or gender bias. As a data journalist, I write my own algorithms—and use AI—to hold institutions and algorithm makers accountable.
One example of algorithmic inequality: Amazon tried to write an algorithm to sort through résumés, and the algorithm kicked out all the women. Algorithmic systems try to reproduce the world as it is. So an algorithm looking for successful future employees will just replicate the types of people who are already in power and will reproduce existing inequality. That’s why we need to push back against a kind of bias I call technochauvinism, which says technology is always superior. Technochauvinists claim that technology is always objective and unbiased. That’s simply not true.
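One simple form of the bias audit mentioned above is to compare how often a screening system selects people from different groups. The sketch below is a minimal illustration with made-up data; the function names, the group labels, and the choice of a lowest-to-highest rate ratio as the measure are assumptions for the example, not a method described in the interview.

```python
from collections import defaultdict


def selection_rates(decisions):
    """Share of applicants selected within each group.

    `decisions` is an iterable of (group, selected) pairs,
    where `selected` is True if the system advanced the applicant.
    """
    totals = defaultdict(int)
    chosen = defaultdict(int)
    for group, selected in decisions:
        totals[group] += 1
        if selected:
            chosen[group] += 1
    return {group: chosen[group] / totals[group] for group in totals}


def disparity_ratio(rates):
    """Lowest selection rate divided by the highest; 1.0 means parity."""
    highest = max(rates.values())
    if highest == 0:
        return 1.0  # nobody was selected in any group, so there is no gap to report
    return min(rates.values()) / highest


if __name__ == "__main__":
    # Hypothetical screening outcomes, invented purely for illustration.
    outcomes = [
        ("group_a", True), ("group_a", True), ("group_a", False), ("group_a", False),
        ("group_b", True), ("group_b", False), ("group_b", False), ("group_b", False),
    ]
    rates = selection_rates(outcomes)
    print(rates)
    print(disparity_ratio(rates))
```

A ratio well below 1.0 does not by itself show why the gap exists, but it flags a system whose outcomes deserve the kind of scrutiny described here.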