Facebook AI has built the first AI system that can solve advanced mathematics equations using symbolic reasoning. By developing a new way to represent complex mathematical expressions as a kind of language and then treating solutions as a translation problem for sequence-to-sequence neural networks, we built a system that outperforms traditional computation systems at solving integration problems and both first- and second-order differential equations.
Previously, these kinds of problems were considered out of the reach of deep learning models, because solving complex equations requires precision rather than approximation. Neural networks excel at learning to succeed through approximation, such as recognizing that a particular pattern of pixels is likely to be an image of a dog or that features of a sentence in one language match those in another. Solving complex equations also requires the ability to work with symbolic data, such as the letters in the formula b - 4ac = 7. Such variables can’t be directly added, multiplied, or divided, and, using only traditional pattern matching or statistical analysis, neural networks were limited to extremely simple mathematical problems.
Our solution was an entirely new approach that treats complex equations like sentences in a language. This allowed us to leverage proven techniques in neural machine translation (NMT), training models to essentially translate problems into solutions. Implementing this approach required developing a method for breaking existing mathematical expressions into a language-like syntax, as well as generating a large-scale training data set of more than 100M paired equations and solutions.
When presented with thousands of unseen expressions — equations that weren’t part of its training data — our model performed with significantly more speed and accuracy than traditional, algebra-based equation-solving software, such as Maple, Mathematica, and Matlab. This work not only demonstrates that deep learning can be used for symbolic reasoning but also suggests that neural networks have the potential to tackle a wider variety of tasks, including those not typically associated with pattern recognition. We’re sharing details about our approach as well as methods to help others generate similar training sets.
A new way to apply NMT
Humans who are particularly good at symbolic math often rely on a kind of intuition. They have a sense of what the solution to a given problem should look like — such as observing that if there is a cosine in the function we want to integrate, then there may be a sine in its integral — and then do the necessary work to prove it. This is different from the direct calculation required for algebra. By training a model to detect patterns in symbolic equations, we believed that a neural network could piece together the clues that led to their solutions, roughly similar to a human’s intuition-based approach to complex problems. So we began exploring symbolic reasoning as an NMT problem, in which a model could predict possible solutions based on examples of problems and their matching solutions.
An example of how our approach expands an existing equation (on the left) into an expression tree that can serve as input for a translation model. For this equation, the preorder sequence input into our model would be: (plus, times, 3, power, x, 2, minus, cosine, times, 2, x, 1).
To implement this application with neural networks, we needed a novel way of representing mathematical expressions. NMT systems are typically sequence-to-sequence (seq2seq) models, using sequences of words as input, and outputting new sequences, allowing them to translate complete sentences rather than individual words. We used a two-step approach to apply this method to symbolic equations. First, we developed a process that effectively unpacks equations, laying them out in a branching, treelike structure that can then be expanded into sequences that are compatible with seq2seq models. Constants and variables act as leaves, while operators (such as plus and minus) and functions are the internal nodes that connect the branches of the tree.
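As a rough illustration of this tree-to-sequence step, the short Python sketch below builds the example expression from the figure above and flattens it into a preorder token sequence. The node and token names are illustrative, not the exact vocabulary used by our system.

```python
# Minimal sketch of the tree-to-sequence idea: represent an expression as a
# tree of operators and operands, then flatten it into a prefix (preorder)
# token sequence that a seq2seq model can consume.

from dataclasses import dataclass, field
from typing import List


@dataclass
class Node:
    token: str                        # operator, function, variable, or constant
    children: List["Node"] = field(default_factory=list)


def to_prefix(node: Node) -> List[str]:
    """Flatten an expression tree into a preorder token sequence."""
    tokens = [node.token]
    for child in node.children:
        tokens.extend(to_prefix(child))
    return tokens


# 3 * x^2 + (cos(2 * x) - 1), matching the example in the figure above
expr = Node("plus", [
    Node("times", [
        Node("3"),
        Node("power", [Node("x"), Node("2")]),
    ]),
    Node("minus", [
        Node("cosine", [Node("times", [Node("2"), Node("x")])]),
        Node("1"),
    ]),
])

print(to_prefix(expr))
# ['plus', 'times', '3', 'power', 'x', '2', 'minus', 'cosine', 'times', '2', 'x', '1']
```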
Though it might not look like a traditional language, organizing expressions in this way provides a language-like syntax for equations — numbers and variables are nouns, while operators act as verbs. Our approach enables an NMT model to learn to align the patterns of a given tree-structured problem with its matching solution (also expressed as a tree), similar to matching a sentence in one language with its confirmed translation. This method lets us leverage powerful, out-of-the-box seq2seq NMT models, swapping out sequences of words for sequences of symbols.
Building a new data set for training
Though our expression-tree syntax made it theoretically possible for an NMT model to effectively translate complex math problems into solutions, training such a model would require a large set of examples. And because in the two classes of problems we focused on — integration and differential equations — a randomly generated problem does not always have a solution, we couldn’t simply collect equations and feed them into the system. We needed to generate an entirely novel training set consisting of examples of solved equations restructured as model-readable expression trees. This resulted in problem-solution pairs, similar to a corpus of sentences translated between languages. Our set would also have to be significantly larger than the training data used in previous research in this area, which has attempted to train systems on thousands of examples. Since neural networks generally perform better when they have more training data, we created a set with millions of examples.
Building this data set required us to incorporate a range of data cleaning and generation techniques. For our symbolic integration equations, for example, we flipped the translation approach around: Instead of generating problems and finding their solutions, we generated solutions and found their problem (their derivative), which is a much easier task. This approach of generating problems from their solutions — what engineers sometimes refer to as trapdoor problems — made it feasible to create millions of integration examples. Our resulting translation-inspired data set consists of roughly 100M paired examples, with subsets of integration problems as well as first- and second-order differential equations.
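As a simplified illustration of this backward generation trick, the sketch below uses SymPy to sample a small random expression, differentiate it, and record the (derivative, expression) pair as an integration problem and its solution. The expression sampler and its parameters are illustrative and much simpler than the generator behind our actual data set.

```python
# Rough sketch of "backward" data generation for integration: sample a random
# expression F, differentiate it to get f = F', and store (f, F) as a
# (problem, solution) training pair.

import random
import sympy as sp

x = sp.symbols("x")
LEAVES = [x, sp.Integer(2), sp.Integer(3)]
UNARY = [sp.sin, sp.cos, sp.exp, sp.log]
BINARY = [lambda a, b: a + b, lambda a, b: a * b, lambda a, b: a ** b]


def random_expr(depth: int) -> sp.Expr:
    """Build a small random expression tree over the variable x."""
    if depth == 0 or random.random() < 0.3:
        return random.choice(LEAVES)
    if random.random() < 0.5:
        return random.choice(UNARY)(random_expr(depth - 1))
    op = random.choice(BINARY)
    return op(random_expr(depth - 1), random_expr(depth - 1))


def make_integration_pair(depth: int = 3):
    """Return (problem, solution): the problem is the derivative of a random solution."""
    solution = random_expr(depth)
    problem = sp.diff(solution, x)
    return sp.simplify(problem), solution


for _ in range(3):
    f, F = make_integration_pair()
    print(f"integrate {f} dx  ->  {F}")
```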
We used this data set to train a seq2seq transformer model with eight attention heads and six layers. Transformers are commonly used for translation tasks, and our network was built to predict the solutions for different kinds of equations, such as determining a primitive for a given function. To gauge our model’s performance, we presented it with 5,000 unseen expressions, forcing the system to recognize patterns within equations that didn’t appear in its training. Our model demonstrated 99.7 percent accuracy when solving integration problems, and 94 percent and 81.2 percent accuracy, respectively, for first- and second-order differential equations. Those results exceeded those of all three of the traditional equation solvers we tested against. Mathematica achieved the next best results, with 84 percent accuracy on the same integration problems and 77.2 percent and 61.6 percent for differential equation results. Our model also returned most predictions in less than 0.5 second, while the other systems took several minutes to find a solution and sometimes timed out entirely.
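For a feel of the model size described above, the following PyTorch sketch instantiates a seq2seq transformer with six encoder and decoder layers and eight attention heads. The vocabulary size, embedding dimension, and dummy inputs are assumptions for illustration and do not reproduce our training setup.

```python
# Minimal PyTorch sketch of a seq2seq transformer sized like the model
# described above (6 layers, 8 attention heads). Hyperparameters other than
# the layer/head counts are illustrative.

import torch
import torch.nn as nn


class MathSeq2Seq(nn.Module):
    def __init__(self, vocab_size: int = 512, d_model: int = 512,
                 nhead: int = 8, num_layers: int = 6, max_len: int = 512):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, d_model)
        self.pos = nn.Embedding(max_len, d_model)
        self.transformer = nn.Transformer(
            d_model=d_model, nhead=nhead,
            num_encoder_layers=num_layers, num_decoder_layers=num_layers,
            batch_first=True,
        )
        self.out = nn.Linear(d_model, vocab_size)

    def forward(self, src: torch.Tensor, tgt: torch.Tensor) -> torch.Tensor:
        # src, tgt: (batch, seq) token ids for the problem and solution prefixes
        src_pos = torch.arange(src.size(1), device=src.device)
        tgt_pos = torch.arange(tgt.size(1), device=tgt.device)
        src_emb = self.embed(src) + self.pos(src_pos)
        tgt_emb = self.embed(tgt) + self.pos(tgt_pos)
        # Causal mask so the decoder cannot peek at future solution tokens
        mask = self.transformer.generate_square_subsequent_mask(tgt.size(1)).to(tgt.device)
        hidden = self.transformer(src_emb, tgt_emb, tgt_mask=mask)
        return self.out(hidden)


model = MathSeq2Seq()
problem = torch.randint(0, 512, (2, 20))   # batch of tokenized problems
solution = torch.randint(0, 512, (2, 24))  # batch of tokenized solutions
logits = model(problem, solution)
print(logits.shape)  # torch.Size([2, 24, 512])
```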
Our model took the equations on the left as input — equations that both Mathematica and Matlab were unable to solve — and was able to find correct solutions (shown on the right) in less than one second.
Comparing generated solutions to reference solutions allowed us to easily and precisely validate the results. But our model is also able to produce multiple solutions for a given equation. This is similar to what happens in machine translation, where there are many ways to translate an input sentence.
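The check itself can be done exactly with a computer algebra system: for integration, a predicted antiderivative is accepted if its derivative simplifies back to the integrand, which is why several different-looking outputs can all count as correct. The SymPy sketch below illustrates the idea; the helper name is illustrative only.

```python
# Sketch of exact verification for integration outputs: a candidate
# antiderivative is accepted if its derivative simplifies to the integrand.

import sympy as sp

x = sp.symbols("x")


def is_valid_antiderivative(integrand: sp.Expr, candidate: sp.Expr) -> bool:
    """Return True if d/dx(candidate) equals the integrand (up to simplification)."""
    return sp.simplify(sp.diff(candidate, x) - integrand) == 0


integrand = sp.cos(x) * sp.sin(x)
# Two different-looking candidates; both are correct antiderivatives
# (they differ only by a constant).
candidates = [sp.sin(x) ** 2 / 2, -sp.cos(2 * x) / 4]
for c in candidates:
    print(c, "->", is_valid_antiderivative(integrand, c))
```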
What’s next for equation-solving AI
Our model currently works on problems with a single variable, and we plan to expand it to multiple-variable equations. This approach could also be applied to other mathematics- and logic-based fields, such as physics, potentially leading to software that assists scientists in a broad range of work.
But our system has broader implications for the study and use of neural networks. By discovering a way to use deep learning where it was previously seen as unfeasible, this work suggests that other tasks could benefit from AI. Whether through the further application of NLP techniques to domains that haven’t traditionally been associated with languages, or through even more open-ended explorations of pattern recognition in new or seemingly unrelated tasks, the perceived limitations of neural networks may be limitations of imagination, not technology.
Written by
François Charton
Visiting Entrepreneur, Facebook AI
Guillaume Lample
Research Scientist, Facebook AI