Alibaba iDST develops a deep-learning model that scores higher than humans in reading comprehension

This is the first time that a machine has outperformed humans in a global reading test.

Update: 2018-01-16 08:00 GMT

Alibaba’s iDST (Institute of Data Science and Technologies), the fundamental artificial intelligence research arm of Alibaba Group, has developed a deep learning model that surpassed human performance on Stanford’s reading comprehension test. This is the first time that a machine has outperformed humans in a global reading test.

SQuAD, or Stanford Question Answering Dataset, is a large-scale reading comprehension dataset, consisting of over 100,000 question-answer pairs based on more than 500 Wikipedia articles.

Participating teams are required to build machine learning models that can provide answers to the questions in the dataset. It is regarded as the world’s top machine reading comprehension test and attracts companies, universities and institutes ranging from Google, Facebook, IBM and Microsoft to Carnegie Mellon University, Stanford University and Allen Research Institute.

On January 11th, the deep neural network model developed by Alibaba scored 82.44 on Exact Match (providing exact answers to questions), beating the human score of 82.304 and marking the first time that a machine has been shown to perform better than humans in reading comprehension.
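As a rough illustration of what an Exact Match score measures, the sketch below counts a prediction as correct only if it equals one of the reference answers after light normalization. The normalization steps (lowercasing, dropping punctuation and English articles, collapsing whitespace) are an assumption modeled on common practice for this kind of benchmark, and the function names, question IDs and sample answers are hypothetical; this is not Alibaba's code or the official scoring script.

```python
import re
import string


def normalize_answer(text: str) -> str:
    """Lowercase, drop punctuation and English articles, collapse whitespace."""
    text = text.lower()
    text = "".join(ch for ch in text if ch not in string.punctuation)
    text = re.sub(r"\b(a|an|the)\b", " ", text)
    return " ".join(text.split())


def exact_match(prediction: str, references: list[str]) -> bool:
    """True if the normalized prediction equals any normalized reference."""
    pred = normalize_answer(prediction)
    return any(pred == normalize_answer(ref) for ref in references)


def exact_match_score(
    predictions: dict[str, str], references: dict[str, list[str]]
) -> float:
    """Percentage of questions whose prediction exactly matches a reference."""
    if not references:
        return 0.0
    hits = sum(
        exact_match(predictions.get(qid, ""), refs)
        for qid, refs in references.items()
    )
    return 100.0 * hits / len(references)


# Hypothetical data: one matching answer, one miss.
sample_predictions = {"q1": "The gravity.", "q2": "Paris"}
sample_references = {"q1": ["gravity"], "q2": ["London", "Lyon"]}
print(exact_match_score(sample_predictions, sample_references))
```

Under this kind of scoring there is no partial credit: an answer that is almost right still counts as a miss, which is why a gap as small as 82.44 versus 82.304 is reported to several decimal places.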

The model leverages a Hierarchical Attention Network that reads from paragraphs to sentences to words in order to locate the precise phrases containing potential answers, and is seen as having great commercial value. The underlying technology has been applied in Alibaba’s Global Shopping Festival over the years, with machines answering large volumes of inbound inquiries during the big sales period.

Luo Si, Chief Scientist of Natural Language Processing (NLP) at Alibaba iDST, commented: “It is our great honor to witness the milestone where machines surpass humans in reading comprehension. That means objective questions such as ‘what causes rain’ can now be answered with high accuracy by machines. To our excitement, we believe the underlying technology can be gradually applied to numerous applications such as customer service, museum tutorials, and online responses to inquiries from patients, freeing up human effort in an unprecedented way.”

“We are thrilled to see NLP research has achieved significant progress over the year. We look forward to sharing our model building methodology with the wider community and exporting the technology to our clients in the near future,” Si added.

The Alibaba iDST NLP team has received the best scores in previous global evaluations, including the ACM CIKM Cup for personalized e-commerce search, Chinese Grammar Error Diagnosis, and the English named entity classification task in the Text Analysis Conference.
