Linguistic Evaluation of Machine-Generated ‘Real’ and ‘Fake’ News
Research on machine-generated fake news has often conflated two distinct qualities, being machine-generated and being fake, treating the task of identifying “machine-generated” news as equivalent to the task of identifying “fake news.” In this project, we create datasets of machine-generated “real news” and machine-generated “fake news” by using GPT-Neo 1.3B to perform text generation on input from the LIAR dataset, which contains 12.8K short statements. The project has two main goals: 1) to assess whether this approach is an effective way to create comparable machine-generated real and fake news, and 2) to ascertain whether there are any detectable stylometric or linguistic differences between real and fake news generated in this way.
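To make the second goal concrete, the following is a minimal sketch of how surface-level stylometric measures might be compared across two sets of generated texts. The three features (average word length, type-token ratio, average sentence length), the function names, and the placeholder texts are illustrative assumptions and are not taken from the project itself.

```python
import re
from statistics import mean


def stylometric_features(text):
    """Return a few simple surface-level style measures for one text.

    The feature set is an illustrative assumption, not the project's own.
    """
    # Split on runs of sentence-ending punctuation and drop empty pieces.
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    # Lowercased alphabetic tokens (apostrophes kept for contractions).
    words = re.findall(r"[A-Za-z']+", text.lower())
    if not words or not sentences:
        return {"avg_word_len": 0.0, "type_token_ratio": 0.0, "avg_sentence_len": 0.0}
    return {
        "avg_word_len": mean(len(w) for w in words),
        "type_token_ratio": len(set(words)) / len(words),
        "avg_sentence_len": len(words) / len(sentences),
    }


def corpus_profile(texts):
    """Average each feature over a non-empty list of texts."""
    rows = [stylometric_features(t) for t in texts]
    return {key: mean(row[key] for row in rows) for key in rows[0]}


# Hypothetical placeholder texts standing in for the generated datasets.
generated_from_true_statements = ["The bill passed the senate. It takes effect in May."]
generated_from_false_statements = ["Nobody voted for the bill! It was forced through overnight."]

profile_real = corpus_profile(generated_from_true_statements)
profile_fake = corpus_profile(generated_from_false_statements)

# Per-feature difference between the two profiles.
differences = {key: profile_fake[key] - profile_real[key] for key in profile_real}
print(differences)
```

Averages of this kind only describe the two corpora. Deciding whether a difference is detectable in the sense of the second goal would additionally require a statistical test or a classifier evaluated on held-out texts.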