Facebook is getting even closer to a human-level understanding of what people are saying.
The company has developed DeepText, a way to parse text using artificial intelligence that picks up new languages and slang more quickly than traditional approaches.
In a company blog post published on Wednesday, three members of the company’s applied machine learning team (Ahmad Abdulkader, Aparna Lakshmiratan and Joy Zhang) announced the technology, which is already being used across Facebook and Facebook Messenger.
DeepText is able to churn through “several thousands of posts per second” across more than 20 languages and understand what’s being communicated with “near-human accuracy,” according to the announcement post.
Facebook’s ability to comprehend what people are saying on its platform isn’t new. The company has been doing that for years in order to pick out which posts to show in people’s news feeds, which ads to show them and, more recently, which posts to show in its search results.
The new part is how good Facebook is getting at understanding what people are saying and, just as importantly, how quickly it’s going to be able to get even better.
With DeepText, Facebook isn’t taking the traditional approach to computationally understanding text. That approach basically entails writing a combination of the Oxford English Dictionary, Encyclopedia Britannica and a grammar textbook that a computer can reference when it’s processing words, sentences and paragraphs, then repeating the process for each language the computer needs to learn. That human-dependent process limits the computer’s knowledge base to what humans are able to teach it. It would be better for the computer to teach itself, a process called machine learning, and that is exactly the approach Facebook’s team has adopted.
Facebook’s team took inspiration from a research paper, published last year by members of Facebook’s artificial intelligence team, that eschewed the traditional word-based approach for a character-based one. From what I can understand of that research paper, the character-based approach means that a computer doesn’t need a human-compiled dictionary to start learning what words mean and how they relate to one another; it can figure out those meanings and relationships on its own by starting from scratch at the character level.
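To make the word-versus-character distinction concrete, here is a toy Python sketch. It is not DeepText’s model, and it is not the method from the research paper; the tiny dictionary, the “bro” example and the function names are all made up for illustration, and character trigrams are a simple stand-in for whatever a real system learns. It only shows why a fixed dictionary goes blind on slang it has never seen, while character-level features still find something to compare.

```python
# Illustrative only: NOT DeepText's model, just a toy contrast between
# word-level dictionary lookup and character-level features.

# Hypothetical hand-compiled dictionary (the "traditional" resource).
VOCAB = {"brother", "i", "need", "a", "ride"}


def word_level(tokens):
    """Keep each token if the dictionary knows it, else replace it with <unk>."""
    return [t if t in VOCAB else "<unk>" for t in tokens]


def char_ngrams(word, n=3):
    """Return the set of character n-grams of a word, with boundary markers."""
    padded = f"<{word}>"
    return {padded[i:i + n] for i in range(len(padded) - n + 1)}


def overlap(a, b, n=3):
    """Jaccard similarity between the character n-gram sets of two words."""
    ga, gb = char_ngrams(a, n), char_ngrams(b, n)
    return len(ga & gb) / len(ga | gb)


tokens = "bro i need a ride".split()
print(word_level(tokens))         # "bro" is unknown to the dictionary
print(overlap("bro", "brother"))  # shares the trigrams "<br" and "bro"
print(overlap("bro", "ride"))     # shares no trigrams
```

In this sketch the dictionary lookup throws “bro” away entirely, while the character-level comparison still registers that it looks a lot more like “brother” than like “ride.” A learned model would go well beyond surface overlap, but the starting point is the same: nothing in the character-level path depends on a human having listed the word first.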
That is to say, DeepText is able to pick up slang and new languages super-quickly without being limited by humans’ ability to teach it those words. Or put another way, DeepText is able to learn languages the way Keanu Reeves learned kung fu in “The Matrix,” except it doesn’t have to first get its butt kicked by Morpheus.