The BBC reports that Facebook has developed a new chatbot that was trained using Reddit content. Yes, you read that right, they trained a chatbot using Reddit. I will let that sink in for a minute. Yes, it is just as bad an idea as it sounds. A quote from the article confirms this:
Numerous issues arose during longer conversations. Blender would sometimes respond with offensive language, and at other times it would make up facts altogether.Facebook uses 1.5bn Reddit posts to create chatbot. (2020, May 4). Retrieved May 7, 2020, from https://www.bbc.com/news/technology-52532930
Just about what you would expect from someone learning how to converse using Reddit as their teaching tool.
I completely understand the desire to create chatbots that learn using machine learning algorithms but shouldn’t there be some level of responsibility in training them using data sets that don’t have a propensity to hate speech and other offensive language? What’s next, training chatbots using 4chan content? It’s time to for developers to wake up and realize that just because you can do something doesn’t mean you should. Were the results interesting? Sure. But I suspect there are better data sets to use to train your chatbot than an online community not known for it’s civility.