Artificial intelligence is front and center, with business and government leaders pondering the right moves. But what’s happening in the lab, where discoveries by academic and corporate researchers will set AI’s course for the coming year and beyond? Our own team of researchers from PwC’s AI Accelerator has homed in on the leading developments both technologists and business leaders should watch closely. Here’s what they are and why they matter.
What it is: Deep neural networks, which mimic the human brain, have demonstrated their ability to “learn” from image, audio, and text data. Yet even after more than a decade of use, there is still a lot we don’t know about deep learning, including how neural networks learn or why they perform so well. That may be changing, thanks to a new theory that applies the principle of an information bottleneck to deep learning. In essence, it suggests that after an initial fitting phase, a deep neural network will “forget” and compress noisy data (that is, data sets containing a lot of additional meaningless information) while still preserving information about what the data represents.
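The compression idea can be made concrete with a toy calculation. The sketch below is only an illustration, not the theory's own analysis of how networks train: it measures mutual information, in bits, for a hypothetical four-value input whose label depends on just one of its two bits. All function and variable names are assumptions made for this example.

```python
import math
from collections import Counter

def mutual_information(pairs):
    """Mutual information I(A;B) in bits from a list of equally likely (a, b) samples."""
    n = len(pairs)
    joint = Counter(pairs)
    count_a = Counter(a for a, _ in pairs)
    count_b = Counter(b for _, b in pairs)
    total = 0.0
    for (a, b), count in joint.items():
        p_ab = count / n
        total += p_ab * math.log2(p_ab / ((count_a[a] / n) * (count_b[b] / n)))
    return total

# Hypothetical toy data: the input x carries 2 bits, but only the high bit sets the label.
xs = [0, 1, 2, 3]

def label(x):
    return x // 2          # the low bit is "noise" as far as the label is concerned

def compress(x):
    return x // 2          # a representation that forgets the low bit

info_input_uncompressed = mutual_information([(x, x) for x in xs])
info_input_compressed = mutual_information([(x, compress(x)) for x in xs])
info_label_compressed = mutual_information([(compress(x), label(x)) for x in xs])
```

In this toy case the compressed representation carries 1 bit about the input instead of 2, yet it still carries the full 1 bit about the label, which is the kind of trade the information bottleneck describes.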
Why it matters: Understanding precisely how deep learning works enables its greater development and use. For example, it can yield insights into optimal network design and architecture choices, while providing increased transparency for safety-critical or regulatory applications. Expect to see more results from the exploration of this theory applied to other types of deep neural networks and deep neural network design.
What it is: Capsule networks, a new type of deep neural network, process visual information in much the same way as the brain, which means they can maintain hierarchical relationships. This is in stark contrast to convolutional neural networks, one of the most widely used neural networks, which fail to take into account important spatial hierarchies between simple and complex objects, resulting in misclassification and a high error rate.
Why it matters: For typical identification tasks, capsule networks promise better accuracy via reduction of errors—by as much as 50 percent. They also don’t need as much data for training models. Expect to see the widespread use of capsule networks across many problem domains and deep neural network architectures.
What it is: A type of neural network that learns by interacting with the environment through observations, actions, and rewards. Deep reinforcement learning (DRL) has been used to learn strategies for games such as Atari and Go, including the famous AlphaGo program that beat a human champion.
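The observe, act, and reward loop can be sketched with tabular Q-learning, a simpler relative of DRL that keeps its value estimates in a table rather than in a deep neural network. The five-cell corridor environment, the reward of 1 at the goal, and every parameter value here are assumptions chosen for illustration.

```python
import random

N_STATES = 5          # corridor cells 0..4; cell 4 is the goal (hypothetical toy environment)
ACTIONS = (-1, +1)    # action index 0 moves left, index 1 moves right

def step(state, action):
    """Environment: apply the action and return (next_state, reward, done)."""
    nxt = min(max(state + action, 0), N_STATES - 1)
    done = nxt == N_STATES - 1
    return nxt, (1.0 if done else 0.0), done

def train(episodes=500, alpha=0.5, gamma=0.9, max_steps=100, seed=0):
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_STATES)]   # q[state][action_index]
    for _ in range(episodes):
        state = 0
        for _ in range(max_steps):
            a = rng.randrange(2)                # explore uniformly; Q-learning is off-policy
            nxt, reward, done = step(state, ACTIONS[a])
            target = reward if done else reward + gamma * max(q[nxt])
            q[state][a] += alpha * (target - q[state][a])
            state = nxt
            if done:
                break
    return q

q_table = train()
# Greedy policy: the best action index for every non-goal cell.
policy = [max(range(2), key=lambda a: q_table[s][a]) for s in range(N_STATES - 1)]
```

After training, `policy` holds the preferred action index for each non-goal cell. A DRL system would replace the `q` table with a neural network so that it can cope with state spaces too large to enumerate.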
Why it matters: DRL is the most general-purpose of all learning techniques, so it can be applied to the widest range of business applications. It requires less data than other techniques to train its models. Even more notable is the fact that it can be trained via simulation, which eliminates the need for labeled data entirely. Given these advantages, expect to see more business applications that combine DRL and agent-based simulation in the coming year.
What it is: A generative adversarial network (GAN) is a type of unsupervised deep learning system that is implemented as two competing neural networks. One network, the generator, creates fake data that is meant to look exactly like the real data set. The second network, the discriminator, ingests real and synthetic data and tries to tell the two apart. Over time, each network improves, enabling the pair to learn the entire distribution of the given data set.
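The tug-of-war between the two networks can be sketched in one dimension. This is a deliberately tiny stand-in, not a production GAN: the "networks" are a linear generator and a logistic discriminator with hand-written gradients, and the class name, parameters, learning rate, and target data (numbers centered on 4) are all assumptions for illustration.

```python
import math
import random

def sigmoid(v):
    """Numerically stable logistic function."""
    if v >= 0:
        return 1.0 / (1.0 + math.exp(-v))
    e = math.exp(v)
    return e / (1.0 + e)

class TinyGAN:
    """1-D toy: generator G(z) = a*z + b, discriminator D(x) = sigmoid(w*x + c)."""

    def __init__(self):
        self.a, self.b = 1.0, 0.0   # generator parameters
        self.w, self.c = 0.1, 0.0   # discriminator parameters

    def generate(self, z):
        return self.a * z + self.b

    def discriminate(self, x):
        return sigmoid(self.w * x + self.c)   # estimated probability that x is real

    def train_step(self, real, noise, lr=0.05):
        fake = [self.generate(z) for z in noise]
        # Discriminator update: descend -[log D(real) + log(1 - D(fake))].
        grad_w = grad_c = 0.0
        for x in real:
            err = self.discriminate(x) - 1.0   # gradient w.r.t. the logit, real sample
            grad_w += err * x
            grad_c += err
        for x in fake:
            err = self.discriminate(x)         # gradient w.r.t. the logit, fake sample
            grad_w += err * x
            grad_c += err
        n = len(real) + len(fake)
        self.w -= lr * grad_w / n
        self.c -= lr * grad_c / n
        # Generator update: descend -log D(G(z)), i.e. try to fool the discriminator.
        grad_a = grad_b = 0.0
        for z in noise:
            err = (self.discriminate(self.generate(z)) - 1.0) * self.w  # gradient w.r.t. G(z)
            grad_a += err * z
            grad_b += err
        self.a -= lr * grad_a / len(noise)
        self.b -= lr * grad_b / len(noise)

rng = random.Random(0)
gan = TinyGAN()
for _ in range(300):
    real_batch = [rng.gauss(4.0, 1.0) for _ in range(16)]
    noise_batch = [rng.gauss(0.0, 1.0) for _ in range(16)]
    gan.train_step(real_batch, noise_batch)
```

Each call to `train_step` first sharpens the discriminator and then nudges the generator toward samples the discriminator rates as real. A linear generator can only shift and scale its noise, and toy training like this may oscillate rather than settle, so learning a full distribution in practice relies on deep networks and careful tuning.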
Why it matters: GANs open up deep learning to a larger range of unsupervised tasks in which labeled data does not exist or is too expensive to obtain. They also reduce the load required for a deep neural network because the two networks share the burden. Expect to see more business applications, such as cyber detection, employ GANs.
What it is: The biggest challenge in machine learning (deep learning, in particular) is the availability of large volumes of labeled data to train the system. Two broad techniques can help address this: (1) synthesizing new data and (2) transferring a model trained for one task or domain to another. Techniques such as transfer learning (transferring the insights learned from one task/domain to another) and one-shot learning (transfer learning taken to the extreme, with learning occurring from just one or no relevant examples) are “lean data” learning techniques. Similarly, synthesizing new data through simulations or interpolations helps obtain more data, thereby augmenting existing data to improve learning.
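Augmentation by interpolation can be sketched in a few lines. The function below is a minimal illustration under stated assumptions: it creates synthetic points on the line segment between two randomly chosen examples of the same class, and the function name, data layout, and sample values are hypothetical.

```python
import random

def augment_by_interpolation(samples, n_new, seed=0):
    """Create synthetic rows between random pairs of same-class samples.

    samples: list of (features, label) pairs, where features is a list of floats.
    """
    rng = random.Random(seed)
    by_label = {}
    for features, label in samples:
        by_label.setdefault(label, []).append(features)
    labels = sorted(by_label)
    synthetic = []
    for _ in range(n_new):
        label = rng.choice(labels)
        first = rng.choice(by_label[label])
        second = rng.choice(by_label[label])
        t = rng.random()  # interpolation weight in [0, 1)
        point = [x + t * (y - x) for x, y in zip(first, second)]
        synthetic.append((point, label))
    return synthetic

# Hypothetical labeled data: two examples per class.
samples = [([0.0, 0.0], "a"), ([1.0, 2.0], "a"),
           ([10.0, 10.0], "b"), ([12.0, 14.0], "b")]
augmented = samples + augment_by_interpolation(samples, n_new=50)
```

Whether interpolated points are realistic depends on the data; the approach assumes that the space between two examples of a class also belongs to that class.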
Why it matters: Using these techniques, we can address a wider variety of problems, especially those with less historical data. Expect to see more variations of lean and augmented data, as well as different types of learning applied to a broad range of business problems.
What it is: A high-level programming language that makes it easier for a developer to design probability models and then automatically “solve” these models. Probabilistic programming languages make it possible to reuse model libraries, support interactive modeling and formal verification, and provide the abstraction layer necessary to foster generic, efficient inference in universal model classes.
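The split between writing a model and "solving" it can be sketched in plain Python. This is not a real probabilistic programming language: the model is an ordinary function, and the generic solver is simple rejection sampling, where real systems typically rely on far more efficient inference methods. The coin example and all names are assumptions for illustration.

```python
import random

def coin_model(rng):
    """Generative model: draw a hidden coin bias, then simulate 10 flips."""
    bias = rng.choice([i / 10 for i in range(1, 10)])    # uniform prior over 0.1 .. 0.9
    heads = sum(rng.random() < bias for _ in range(10))
    return bias, heads

def infer(model, observed, n_samples=20000, seed=0):
    """Generic rejection sampler: keep hidden values whose simulated data match the observation."""
    rng = random.Random(seed)
    accepted = []
    for _ in range(n_samples):
        latent, data = model(rng)
        if data == observed:
            accepted.append(latent)
    return accepted

# Question: having seen 8 heads in 10 flips, what bias is plausible?
posterior = infer(coin_model, observed=8)
posterior_mean = sum(posterior) / len(posterior)
```

The `infer` function knows nothing about coins, so the same solver can be reused for any model that returns a hidden value and simulated data. That separation is the abstraction layer the paragraph describes.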
Why it matters: Probabilistic programming languages have the ability to accommodate the uncertain and incomplete information that is so common in the business domain. We will see wider adoption of these languages and expect them to also be applied to deep learning.
What it is: Different types of deep neural networks, such as GANs or DRL, have shown great promise in terms of their performance and widespread application with different types of data. However, deep learning models do not model uncertainty the way Bayesian, or probabilistic, approaches do. Hybrid learning models combine the two approaches to leverage the strengths of each. Some examples of hybrid models are Bayesian deep learning, Bayesian GANs, and Bayesian conditional GANs.
Why it matters: Hybrid learning models make it possible to expand the variety of business problems to include deep learning with uncertainty. This can help us achieve better performance and explainability of models, which in turn could encourage more widespread adoption. Expect to see more deep learning methods gain Bayesian equivalents while probabilistic programming languages start to incorporate deep learning.
What it is: Developing machine learning models requires a time-consuming and expert-driven workflow, which includes data preparation, feature selection, model or technique selection, training, and tuning. AutoML aims to automate this workflow using a number of different statistical and deep learning techniques.
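A stripped-down version of that automation might look like the sketch below, which searches over feature subsets and one tuning parameter and keeps whichever combination scores best on held-out data. It is an illustration only: the nearest-neighbor model, the exhaustive search, the function names, and the synthetic data (where only the first feature is informative) are all assumptions, and commercial AutoML tools use far richer search strategies.

```python
import itertools
import random

def knn_predict(train, point, k, features):
    """Majority vote among the k nearest training rows, using only the chosen feature columns."""
    def distance(row):
        return sum((row[0][f] - point[f]) ** 2 for f in features)
    nearest = sorted(train, key=distance)[:k]
    votes = [label for _, label in nearest]
    return max(set(votes), key=votes.count)

def auto_select(train, valid, ks=(1, 3, 5), n_features=2):
    """Try every (feature subset, k) pair and keep the best validation accuracy."""
    subsets = [combo for size in range(1, n_features + 1)
               for combo in itertools.combinations(range(n_features), size)]
    best = None
    for features, k in itertools.product(subsets, ks):
        hits = sum(knn_predict(train, x, k, features) == y for x, y in valid)
        accuracy = hits / len(valid)
        if best is None or accuracy > best["accuracy"]:
            best = {"features": features, "k": k, "accuracy": accuracy}
    return best

rng = random.Random(0)

def make_rows(n):
    """Hypothetical data: feature 0 separates the classes, feature 1 is pure noise."""
    rows = []
    for i in range(n):
        label = i % 2
        rows.append(([2.0 * label + rng.random(), 10.0 * rng.random()], label))
    return rows

train_rows, valid_rows = make_rows(40), make_rows(20)
best = auto_select(train_rows, valid_rows)
```

Here the search settles on the informative feature alone, standing in for the feature selection and tuning steps a data scientist would otherwise do by hand.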
Why it matters: AutoML is part of what’s seen as a democratization of AI tools, enabling business users to develop machine learning models without a deep programming background. It will also speed up the time it takes data scientists to create models. Expect to see more commercial AutoML packages and integration of AutoML within larger machine learning platforms.
What it is: A digital twin is a virtual model used to facilitate detailed analysis and monitoring of physical or psychological systems. The concept of the digital twin originated in the industrial world, where it has been used widely to analyze and monitor things like wind farms or industrial systems. Now, using agent-based modeling (computational models for simulating the actions and interactions of autonomous agents) and system dynamics (a computer-aided approach to policy analysis and design), digital twins are being applied to nonphysical objects and processes, including predicting customer behavior.
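An agent-based model of customer behavior can be sketched as follows. Everything in it is a hypothetical illustration: each simulated customer has a private willingness to pay, and word of mouth raises that threshold as more peers buy. A real digital twin would be calibrated against observed data rather than the made-up ranges used here.

```python
import random

class Customer:
    """One simulated customer with a private willingness to pay (hypothetical attribute)."""

    def __init__(self, willingness_to_pay):
        self.willingness_to_pay = willingness_to_pay
        self.has_bought = False

    def step(self, price, adoption_rate):
        # Word of mouth: the more peers have bought, the more this agent is willing to pay.
        threshold = self.willingness_to_pay * (1.0 + 0.5 * adoption_rate)
        if not self.has_bought and price <= threshold:
            self.has_bought = True

def simulate_demand(price, n_customers=200, n_rounds=5, seed=0):
    """Run the agent population for a few rounds and return how many customers bought."""
    rng = random.Random(seed)
    customers = [Customer(rng.uniform(5.0, 15.0)) for _ in range(n_customers)]
    for _ in range(n_rounds):
        adoption_rate = sum(c.has_bought for c in customers) / n_customers
        for customer in customers:
            customer.step(price, adoption_rate)
    return sum(c.has_bought for c in customers)

# Ask the simulated market "what if" questions at different price points.
demand_by_price = {price: simulate_demand(price) for price in (8.0, 10.0, 12.0)}
```

Because the same seed produces the same simulated population, price scenarios can be compared like for like, which is the kind of what-if analysis a twin of customer behavior is meant to support.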
Why it matters: Digital twins can help spur the development and broader adoption of the internet of things (IoT), providing a way to predictively diagnose and maintain IoT systems. Going forward, expect to see greater use of digital twins in both physical systems and consumer choice modeling.
What it is: Today, there are scores of machine learning algorithms in use that sense, think, and act in a variety of different applications. Yet many of these algorithms are considered “black boxes,” offering little if any insight into how they reached their outcome. Explainable AI is a movement to develop machine learning techniques that produce more explainable models while maintaining prediction accuracy.
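One widely used way to peer into a black box is permutation importance: shuffle one input at a time and see how much the model's error grows. The sketch below assumes this technique purely for illustration, since the text does not name a specific method, and the stand-in model, data, and function names are hypothetical.

```python
import random

def black_box(row):
    """Stand-in for an opaque model (hypothetical): only feature 0 actually matters."""
    return 3.0 * row[0] + 0.0 * row[1]

def mean_squared_error(model, rows, targets):
    return sum((model(r) - t) ** 2 for r, t in zip(rows, targets)) / len(rows)

def permutation_importance(model, rows, targets, n_features, seed=0):
    """For each feature, the increase in error when that feature's column is shuffled."""
    rng = random.Random(seed)
    baseline = mean_squared_error(model, rows, targets)
    scores = []
    for f in range(n_features):
        column = [r[f] for r in rows]
        rng.shuffle(column)
        shuffled = [r[:f] + [v] + r[f + 1:] for r, v in zip(rows, column)]
        scores.append(mean_squared_error(model, shuffled, targets) - baseline)
    return scores

data_rng = random.Random(1)
rows = [[data_rng.random(), data_rng.random()] for _ in range(50)]
targets = [black_box(r) for r in rows]
importances = permutation_importance(black_box, rows, targets, n_features=2)
```

A large score means the model leans on that input, and a score of zero means the model ignores it. The technique treats the model purely as a function, so it works without access to the model's internals.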
Why it matters: AI that is explainable, provable, and transparent will be critical to establishing trust in the technology and will encourage wider adoption of machine learning techniques. Enterprises will adopt explainable AI as a requirement or best practice before embarking on widespread deployment of AI, while governments may make explainable AI a regulatory requirement in the future.