Microsoft researchers have released a chatbot version of a cutting-edge text generator trained on tens of millions of Reddit posts—albeit with a disclaimer in place should things get offensive.
The open-source blueprint, DialoGPT, builds on a breakthrough in language-based artificial intelligence called GPT-2, a separate program released earlier this year that can generate free-form text with unprecedented realism and serve as a foundation for more tailored programs like Microsoft’s chatbot.
As one of the early attempts to channel GPT-2’s unpredictable technology into a chatbot, the Microsoft project ships with a precautionary measure: developers must write their own code to translate the model’s output data into readable text.
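To make that precaution concrete: a language model like DialoGPT emits sequences of numeric token IDs, not words, so the withheld decoding step is what maps those IDs back into text. The sketch below is a hypothetical, simplified illustration of that mapping; the toy vocabulary, IDs, and function names are invented for this example and are not part of Microsoft’s release.

```python
# Hypothetical sketch of the step DialoGPT leaves to developers:
# turning the model's raw output (a sequence of token IDs) into
# readable text. The vocabulary and IDs below are invented.

TOY_VOCAB = {0: "<eos>", 1: "hello", 2: "there", 3: "how", 4: "are", 5: "you"}

def decode(token_ids):
    """Map token IDs to words, stopping at the end-of-sequence marker."""
    words = []
    for tid in token_ids:
        token = TOY_VOCAB.get(tid, "<unk>")
        if token == "<eos>":
            break  # the model signals it is done generating
        words.append(token)
    return " ".join(words)

print(decode([1, 2, 3, 4, 5, 0]))  # -> "hello there how are you"
```

In practice a real decoder would use the model’s full tokenizer, and this is the layer where a developer could also filter or moderate the output before showing it to users.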
“The conversational text might be different from any large text corpus that the previous works have been using, in that it is less formal, sometimes trollish and generally much more noisy,” the researchers wrote in an accompanying paper.
“Responses generated using this model may exhibit a propensity to express agreement with propositions that are unethical, biased or offensive—or the reverse, disagreeing with otherwise ethical statements.”
Despite the wildcard potential, some researchers think models like GPT-2 could supercharge advances in machine learning that understands and produces natural language, much as analogous models for image recognition set the scene for the ongoing boom in computer vision AI that began around 2012.