Developing an AI system capable of understanding natural language isn't just time-consuming; it's expensive.

Developers have to collect thousands of voice samples and annotate them by hand, a process that often takes weeks.

That’s why researchers at Amazon’s Alexa division pursued transfer learning, which leverages a neural network — i.e., layers of mathematical functions that mimic neurons in the brain — trained on a large dataset of previously annotated samples to bootstrap training in a new domain with sparse data.
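The core idea of transfer learning can be sketched in a few lines: a network trained on a large annotated dataset is reused as a frozen feature extractor, and only a small task-specific head is trained on the sparse new-domain data. The sketch below is illustrative only; the encoder weights, dimensions, and training loop are hypothetical stand-ins, not the paper's actual architecture.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical pretrained encoder weights, standing in for a network
# trained on a large corpus of previously annotated samples.
W_pretrained = rng.normal(size=(16, 8))

def encode(x):
    # Frozen encoder: reused as-is to produce general-purpose representations.
    return np.tanh(x @ W_pretrained)

# Only this small head is trained on the sparse new-domain data.
W_head = np.zeros((8, 2))  # 2 hypothetical intent classes

# A handful of labeled new-domain examples (synthetic, for illustration).
X_new = rng.normal(size=(20, 16))
y_new = rng.integers(0, 2, size=20)

# Simple gradient descent on the head only (softmax regression).
for _ in range(100):
    h = encode(X_new)
    logits = h @ W_head
    p = np.exp(logits - logits.max(axis=1, keepdims=True))
    p /= p.sum(axis=1, keepdims=True)
    grad = h.T @ (p - np.eye(2)[y_new]) / len(y_new)
    W_head -= 0.5 * grad

def predict(x):
    return (encode(x) @ W_head).argmax(axis=-1)
```

Because the encoder is frozen, only a handful of head parameters need to be fit, which is why far less annotated data is required in the new domain.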

In a newly published paper (“Unsupervised Transfer Learning for Spoken Language Understanding in Intelligent Agents”), Alexa AI scientists describe a technique that taps millions of unannotated interactions with Amazon’s voice assistant to reduce errors by 8 percent.

They'll present the fruit of their labor at the Association for the Advancement of Artificial Intelligence (AAAI) conference in Honolulu, Hawaii, later this year.

These interactions were used to train an AI system to generate embeddings — numerical representations of words — such that words with similar functions were grouped closely together.
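The geometric intuition behind embeddings can be shown with a toy example: each word maps to a vector, and words with similar functions end up with high cosine similarity. The vectors below are hand-picked for illustration; real embeddings are learned from data, and the words and dimensions here are assumptions.

```python
import numpy as np

# Toy 3-d embeddings (hand-picked, hypothetical): words that play similar
# roles in an utterance are placed close together in the vector space.
embeddings = {
    "play":    np.array([0.9, 0.1, 0.0]),
    "stream":  np.array([0.8, 0.2, 0.1]),  # similar function to "play"
    "weather": np.array([0.0, 0.9, 0.3]),  # different function
}

def cosine(u, v):
    # Cosine similarity: near 1.0 for vectors pointing the same way,
    # near 0.0 for unrelated directions.
    return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v)))

sim_same = cosine(embeddings["play"], embeddings["stream"])    # high
sim_diff = cosine(embeddings["play"], embeddings["weather"])   # low
```

Grouping functionally similar words this way is what lets a model trained on one set of commands generalize to paraphrases it has never seen annotated.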

The text above is a summary; you can read the full article here.