Towards Fake News Detection
Finished master thesis
- Razan Masood
- Targeted audience
- AI Master
- Ability to read and understand papers written in English.
- Ability to perform academic writing.
- Strong programming skills (e.g. Java; essential)
- Lectures: Information Retrieval or Information Mining (essential)
- Strong machine learning skills: must be aware of different classification techniques and have already used some of them, e.g. for classifying text (essential)
Fake news is one of the critical issues that has drawn attention in the last year. Media outlets have reported that social media played an important role in the outcome of the 2016 US elections. Propaganda, conspiracy theories and other false stories have always been used in the media for secondary gain such as monetization, political advantage and opinion manipulation, and social media has proved a powerful tool for spreading such stories widely. At the same time, journalists and fact checkers, with the strategies available to them, cannot label fake stories before they spread out of control. Automating those strategies is one way to speed up the procedure. To that end, researchers have suggested two families of approaches. Linguistic approaches use natural language processing, supported by artificial intelligence and machine learning techniques, to find patterns in text that reveal lies and language leakage. Network approaches use, for example, Linked Data in knowledge networks, so that new information can be linked to existing knowledge items, which enables the detection of falsification. Social network behavior and its metadata can also be used for identity investigation and other purposes. Combining both families of approaches has been suggested as a way to improve detection results.
To tackle the fake news problem, several competitions were established to engage as many participants as possible. One of these contests was the "Fake News Challenge", established in December 2016, which framed its task as a machine learning problem. Detecting fake news directly is hard for various reasons, so as a first step the challenge proposed stance detection as a precursor to fake news detection. The stance detector should estimate the relative perspective (or stance) of two pieces of text with respect to a topic, claim or issue. If enough stance information is available, one can then approximate the veracity of a news item. In this Master's thesis, the candidate should investigate novel methods for stance detection within the framework of fake news detection. Experimental data is available through the Fake News Challenge.
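As an illustration of the task setup, the following minimal Python sketch treats stance detection as classification of a (headline, body) pair and implements only a crude first stage: deciding from word overlap whether the two texts are related at all. The function names, the Jaccard feature, the example texts and the threshold of 0.1 are assumptions made for this example and are not part of the challenge; a real system would learn the decision from the training data and go on to distinguish finer-grained stances.

```python
import re


def tokenize(text):
    """Lowercase the text and return its set of alphabetic word tokens."""
    return set(re.findall(r"[a-z]+", text.lower()))


def jaccard(a, b):
    """Jaccard similarity of two token sets (0.0 if both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def is_related(headline, body, threshold=0.1):
    """Hypothetical first-stage decision: related vs. unrelated.

    The threshold is an illustrative assumption, not a tuned value.
    """
    return jaccard(tokenize(headline), tokenize(body)) >= threshold


if __name__ == "__main__":
    headline = "Mayor denies flood reports"
    on_topic = "The mayor said reports of a flood in the city were false."
    off_topic = "Stock prices rose sharply on Monday."
    print(is_related(headline, on_topic))
    print(is_related(headline, off_topic))
```

Word overlap alone cannot tell agreement from disagreement; it only serves here to show the shape of the input and of a pairwise feature.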
- Literature scan. This should be done before the actual project starts. The student will be given some initial papers. Based on these, the student should collect further papers, review all of them and prepare a 30-minute oral presentation that introduces the field. This should take 3-5 weeks.
Actual work:
- Preprocessing of data. This can be done automatically using Natural Language Processing techniques.
- Feature extraction and supervised learning. The student should perform automatic feature extraction and feature selection, and apply machine learning to tackle the stance detection task. Both traditional and deep learning approaches should be investigated. The results of the automatic system should be evaluated using the metrics released by the Fake News Challenge.
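To make the evaluation step concrete, here is a hedged Python sketch of a weighted stance score. It assumes the four-label scheme (agree, disagree, discuss, unrelated) and the weighting of the scorer published with the Fake News Challenge, as recalled here: 0.25 for getting the related/unrelated decision right, and a further 0.75 for the exact stance of a related pair. The function name and the toy labels are hypothetical, and the official scorer remains the authoritative reference.

```python
# Labels that count as "related" (assumed four-label scheme).
RELATED = {"agree", "disagree", "discuss"}


def stance_score(gold, predicted):
    """Weighted score over parallel lists of gold and predicted labels."""
    score = 0.0
    for g, p in zip(gold, predicted):
        gold_related = g in RELATED
        pred_related = p in RELATED
        if gold_related == pred_related:
            score += 0.25  # related/unrelated decision is correct
            if gold_related and g == p:
                score += 0.75  # exact stance of a related pair is correct
    return score


if __name__ == "__main__":
    gold = ["agree", "unrelated", "discuss", "disagree"]
    predicted = ["agree", "unrelated", "agree", "unrelated"]
    achieved = stance_score(gold, predicted)
    best = stance_score(gold, gold)  # score of a perfect prediction
    print(f"relative score: {achieved / best:.3f}")
```

Reporting the score relative to that of a perfect prediction keeps results comparable across test sets with different label distributions.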