Applying sentiment analysis for tweets linking to scientific papers

Author: Natalie Friedrich
ISBN: 978-3-668-11271-1

Bachelor Thesis, 2015

66 Pages, Grade: 1,3

Excerpt

1. Introduction

2. Materials and Methods
2.1 Dataset
2.1.1 Bibliographic information of tweeted documents
2.1.2 Tweets
2.2 Sentiment analysis
2.2.1 The definition of a sentiment
2.2.2 Sentiment analysis
2.3 Methods
2.3.1 Intellectual coding of sentiments
2.3.2 Removing Twitter affordances
2.3.3 Removing title terms
2.3.4 Adapting the lexicon
2.3.5 Removing non-English terms
2.3.6 Calculating sentiments per discipline

3. Results and Discussions
3.1 The ground truth
3.1.1 Sentiment analysis I
3.1.2 Sentiment analysis II
3.1.3 Sentiment analysis III
3.1.4 Sentiment analysis IV
3.2 Automated analysis of all tweets
3.3 Results
3.4 Discipline specific results

4. Conclusion

5. References

6. Appendix

1. Introduction

In 1922, Max Weber stated the methodological principle concisely and demanded the value freedom of scientific statements in a single sentence:

“An empirical science cannot tell anyone what he should do, but rather what he can do and, under certain circumstances, what he wishes to do” (Weber, 1956, p. 150).

Thus, he rejected the idea that value judgments can be derived from scientific knowledge. The practical benefit of scientific knowledge is to supply relevant and informative theories for a practical problem. The exchange of scientific knowledge therefore exerts influence within human interaction through its problem-solving approach. Similarly, Merton’s ethos of science, and in particular the norm of universalism, claims that scientists evaluate each other’s work solely on its scientific merit and disregard personal judgements (Merton, 1973). In the present context of scientific communication and the public exchange of views on the social web, it is of interest to analyze whether the communication and views regarding scientific documents on social media contain sentiments. Particularly in the context of altmetrics (Priem et al., 2010), where the number of tweets is counted as impact, it is crucial to determine if and to what extent tweets contain positive or negative opinions about the papers they link to.

This work analyzes tweets linking to scientific papers to find out whether the tweets are positive, negative or do not express an opinion. This informs the meaning of tweets as a measure of impact in the context of altmetrics. The following research questions are examined:

- To what extent can sentiment analysis be used to detect positive or negative statements towards scientific papers expressed on Twitter?
- Do tweets linking to scientific papers express positive or negative opinions? How do sentiments differ by academic discipline?
- How do the results affect the meaning of tweets to scientific papers as an altmetric indicator?

The thesis is organized as follows. Section 1 defines the term altmetrics in the context of scholarly communication and reviews related literature in the subsection Related work. Section 2 presents the applied materials and methods, including the data set (Section 2.1), the tools (Section 2.2) and the techniques (Section 2.3). Section 3 provides and evaluates the results for the steps described in Section 2.3 and discusses them with regard to the methods. The thesis concludes in Section 4 with a summary and some directions for future work. The work focuses on the methods because the adaptation of the data set was complex and extensive.

Related work

In the context of scientific communication, Bar-Ilan, Shema and Thelwall (2014) emphasize that the World Wide Web has had a lasting influence on scholarly communication and raise the question of how the impact of so-called web 2.0 platforms can be measured. With the introduction of Twitter, Facebook and other social networking sites into academia, the idea of using informetric methods to capture traces of research and researchers on the social web to reflect their impact has been introduced and popularized as so-called altmetrics. Four years after Twitter was established, Priem and Costello (2011) identified it as a communication platform used by scholars.

While citations are mostly positive or neutral, at least in the natural sciences, engineering and medicine - and can thus be used as an indicator of scientific influence in these fields - the sentiment of discussions of scientific content on the social web has not yet been analyzed systematically. Considering scientific communication on Twitter, Weller and Puschmann (2011) found a way to identify scientific tweets and to perform citation analysis on Twitter. By identifying URLs, which act as external citations and link to a particular paper, they distinguish between scientific and non-scientific tweets. Although there are various motivations to cite, citations in the academic literature are mostly positive or show no judgement at all - hence, negative judgements are rare (Garfield, 1979). More than 90 years after Weber’s statement, the scientific community discusses research on social networking sites. Do scholars judge on Twitter?

Twitter is a micro-blogging platform that allows registered users to send, share and read short text messages. According to Twitter, 500 million tweets were sent per day in March 2015¹. Twitter is used to share information, to stay in contact with virtual contacts and friends, to post updates regarding personal and daily events and to discuss news (Thelwall et al., 2013). During academic conferences Twitter is used extensively to share information and to discuss specific topics (Weller & Puschmann, 2011). Nevertheless, there is evidence of little activity on Twitter among presenters at academic conferences (Bar-Ilan et al., 2012). Sentiment analysis can be used to discover the opinion towards academic content on the social web and thus help to indicate whether this impact is positive or negative.

Wilson, Wiebe, and Hoffmann (2005) define sentiment analysis as the task of identifying positive and negative opinions. According to Stock (2013), it is a process for recognizing documents that contain positive or negative attitudes towards an object. Sentiment analysis in social media is usually applied to analyze sentiments towards companies or products for marketing purposes. Kasper and Vela (2011) used sentiment analysis to extract opinions about hotels from online information, while Wang, Liu, Song, and Lu (2014) developed a sentiment analysis approach for product reviews. Pang and Lee (2008) described existing techniques and approaches for opinion-oriented information retrieval. Since its launch in 2006, the microblogging platform Twitter has developed into an important communication channel on the web, used for daily chatter and for information sharing (Java, Song, Finin, & Tseng, 2007). Twitter became an object of sentiment analysis (Pak & Paroubek, 2010), with studies analyzing public opinion towards political events (Shamma, Kennedy, & Churchill, 2010) or the public mood in general (Bollen, Mao, & Zeng, 2011). Bollen, Mao, and Zeng (2011) showed that the public mood correlates with the Dow Jones and that it is even possible to predict the stock market’s behavior by analyzing the public mood via Twitter. Twitter can also reveal a specific mood towards a specific topic: even positive events, like the Oscars, may cause negative reactions on Twitter (Thelwall, Buckley, & Paltoglou, 2011). Pak and Paroubek (2010) collected a corpus to train a sentiment classifier that determines positive, negative and neutral sentiments of tweets. For the corpus, TreeTagger was used for POS tagging, and the difference in distributions among the positive, negative and neutral sets was observed. Pak and Paroubek (2010) took emoticons into account for defining the training data.

Barbosa and Feng (2010) presented a two-step sentiment classification for Twitter that first classifies a tweet as subjective or objective and then classifies subjective tweets as positive or negative. Read (2005) created a corpus using emoticons to obtain positive and negative examples and then applied various classifiers. Wilson, Wiebe, and Hoffmann (2009) found that in phrase-level sentiment analysis positive and negative words from a lexicon were used in neutral contexts much more often than in expressions of the opposite polarity - this has to be borne in mind because it may be difficult to assign negative and positive terms in the scientific context. Agarwal, Xie, Vovsha, Rambow, and Passonneau (2011) studied sentiment analysis of Twitter data and reported an average accuracy of 75% for the two-way classification (positive and negative sentiments) and of 60.5% for the three-way classification (positive, negative and neutral). Although the two-way classification is more accurate in automatic sentiment analysis, the three-way classification is more comparable to manual coding. Kouloumpis, Wilson, and Moore (2011) claimed that POS tagging may not be useful in the microblogging domain, whereas microblogging features such as intensifiers, emoticons and abbreviations were the most useful. Using hashtags to collect training data also remains useful.

Considering scientific communication in the microblogging domain, Thelwall, Tsou, Weingart, Holmberg, and Haustein (2013) performed a small manual sentiment analysis. Their study examines whether tweets linking to scientific articles are typically positive, negative, neutral or mixed in tone. For this purpose, a Twitter dataset was collected by running a Twitter query from 4 March 2012 to 16 October 2012 for each of a number of URLs of journals or digital libraries. A random sample of 270 of 159,076 tweets was selected to undertake a content analysis and to construct a set of relevant categories. The four categories (a summary of the tweet, an attribution of the tweeted article, e.g. to a target, a sentiment about the tweet and an expression of interest) showed that 73% of the tweets dealt with the article title and 18% mentioned the author. Only 4.8% of the tweets were marked as positive or interesting and none as negative - so the majority of the tweets have no sentiment at all. Sentiment analyses of scholarly discussions on Twitter are rare; Thelwall et al. (2013) is the only study analyzing the sentiment of tweets to a small sample of journal articles.

This work focuses on sentiment expressed on social media with regard to scientific papers.

2. Materials and Methods

To answer the question of whether scholarly discussions on Twitter contain sentiment, 663,547 tweets linking to 238,281 scientific articles were selected as the corpus of the study. Since it is not possible to identify and analyze the meaning of every tweet intellectually, it was decided to detect sentiments automatically. Since judgments are part of a subjective opinion, opinion-mining tools were used to detect sentiments and to find out Twitter’s role in scholarly communication. For this purpose, 1,000 random tweets were coded intellectually as positive, negative or neutral. This ground truth was then compared to the results of selected sentiment analysis tools to gain an overview of their reliability. As a result, several adaptations were applied to improve the accuracy of the tool and therefore the final results of the automated sentiment detection.
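The comparison between the intellectual coding and a tool's output can be sketched as a simple agreement computation. This is a minimal illustration only, assuming both codings are available as lists of -1, 0 and +1 in the same tweet order; the function names and the sample values are hypothetical and not taken from the thesis.

```python
from collections import Counter


def accuracy(ground_truth, predicted):
    """Share of tweets for which the tool agrees with the intellectual coding."""
    if len(ground_truth) != len(predicted):
        raise ValueError("both codings must cover the same tweets")
    if not ground_truth:
        raise ValueError("no tweets to compare")
    hits = sum(1 for g, p in zip(ground_truth, predicted) if g == p)
    return hits / len(ground_truth)


def confusion(ground_truth, predicted):
    """Count (manual label, tool label) pairs to see where the tool errs."""
    return Counter(zip(ground_truth, predicted))


# Hypothetical example values, not data from the study.
manual = [0, 0, 1, -1, 0]
tool = [0, 1, 1, 0, 0]
print(accuracy(manual, tool))            # 3 of 5 labels agree
print(confusion(manual, tool)[(0, 1)])   # neutral tweets the tool called positive
```

The confusion counts are often more informative than the single accuracy value, because they show whether a tool mainly over-assigns positive or negative labels to neutral tweets.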

In the following sections, the materials used for the study, namely the Twitter dataset (Section 2.1) and the sentiment analysis tools (Section 2.2), are introduced. The last section (Section 2.3) describes the generation and adaptation of the Twitter dataset.

2.1 Dataset

The dataset consists of tweets mentioning scientific articles as captured by Altmetric.com. The study focuses on papers indexed in the Web of Science (WoS) as articles or reviews and published in 2012. The papers were linked to the Altmetric.com database using DOIs, excluding all documents without a DOI in WoS. All tweets published until 30 June 2014 were selected for the study. Overall, 663,547 tweets were linked to 238,281 articles.
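The selection criteria above can be expressed as a small filtering and linking step. The following sketch assumes that papers and tweets are available as lists of dictionaries; the keys ("doi", "year", "doc_type", "date", "text") and the function name are hypothetical, and the actual WoS and Altmetric.com export formats are not described in the text.

```python
from datetime import date

CUTOFF = date(2014, 6, 30)  # last tweet date included, per the text


def link_tweets_to_papers(papers, tweets):
    """Group tweets by the DOI of an eligible paper.

    papers: iterable of dicts with hypothetical keys "doi", "year", "doc_type"
    tweets: iterable of dicts with hypothetical keys "doi", "date", "text"
    """
    # Papers without a DOI cannot be linked and are dropped.
    # DOIs are compared in lower case because they are case-insensitive.
    eligible = {
        p["doi"].lower()
        for p in papers
        if p.get("doi")
        and p["year"] == 2012
        and p["doc_type"] in {"article", "review"}
    }
    linked = {}
    for t in tweets:
        doi = (t.get("doi") or "").lower()
        if doi in eligible and t["date"] <= CUTOFF:
            linked.setdefault(doi, []).append(t)
    return linked
```

Only papers that received at least one tweet before the cutoff appear in the result, which matches a corpus that is defined by tweeted documents.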

2.1.1 Bibliographic information of tweeted documents

Web of Science supplies the bibliographic information including the publication year, document type, DOI, article title and journal name. The classification of the journals is based on the journal level classification of the National Science Foundation (NSF). The NSF system classifies each journal and thus each article into exactly one scientific domain (e.g., Social Sciences), discipline (e.g., Sociology) and specialty (e.g., Social Problems). The tweeted papers belonged to 14 disciplines and 143 specialties.

2.1.2 Tweets

The main research focuses on the data derived from Twitter. The altmetrics.org² database provides the tweets linking to the articles via the DOI. Since collecting tweets via the Twitter API has to be carried out continuously, the tweets were obtained from Altmetric.com. Altmetric.com has collected tweets since June 2011 if they contain the digital object identifier (DOI), the PubMed identifier (PMID) or another identifier, or link to the article on the publisher’s website (Priem, Groth, & Taraborelli, 2012).

Twitter text messages have a maximum length of 140 characters, which covers the actual text but also hashtags, usernames, pictures and URLs. Since users are limited to 140 characters, they use abbreviations, contractions, acronyms and slang, or shorten and truncate their messages.

To stay up to date with published tweets, users can follow other users and be followed by them. Retweets are tweets that are re-sent and do not differ from the original tweet (Java, Song, Finin, & Tseng, 2007). Usually retweets are marked with the label RT, for retweet. Originally, this label was used to mark a quoted tweet; it is now an official Twitter affordance.

Since the number of characters in a tweet is limited, it is assumed that every opinion mentioned in a tweet linking to a paper is related to that paper - so the opinion in the tweet reflects the user’s opinion towards the paper the tweet links to. Retweets were excluded from the data set as they indicate distribution rather than original contribution and were not assumed to carry additional meaning; modified retweets were kept as they may include a subjective opinion in front of the retweeted tweet, and that subjective opinion may be important for the sentiment analysis.
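The distinction between plain and modified retweets can be illustrated with a simple text pattern. The sketch below assumes that a plain retweet starts with the RT label followed by a username, while a modified retweet carries the user's own comment in front of that label; how retweets were actually identified in the data set is not stated in the text, and the function names are hypothetical.

```python
import re

# Assumption: a plain retweet starts with "RT @user"; a modified retweet has
# the user's own comment in front of the "RT @user" marker.
PLAIN_RT = re.compile(r"^\s*RT\s+@\w+", re.IGNORECASE)


def is_plain_retweet(text):
    """True if the tweet text begins with the RT label and a username."""
    return PLAIN_RT.match(text) is not None


def drop_plain_retweets(texts):
    """Keep original tweets and modified retweets, drop plain retweets."""
    return [t for t in texts if not is_plain_retweet(t)]


# Hypothetical example tweets, not taken from the data set.
sample = [
    "RT @alice: New paper http://example.org/1",
    "Great read! RT @alice: New paper http://example.org/1",
    "New paper http://example.org/1",
]
print(drop_plain_retweets(sample))
```

In this example only the first tweet is removed; the second is kept because the comment in front of the label may carry the user's own opinion.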

For the intellectual assessment and the pre-test 1,000 random tweets were chosen from the data set.

2.2 Sentiment analysis

The following section defines a sentiment and describes the process of sentiment analysis and the sentiment analysis tools.

2.2.1 The definition of a sentiment

A sentiment represents a mood, a feeling, a perception or an emotion-based opinion a person might have and might express. Taboada, Brooke, Tofiloski, Voll, and Stede (2011) explained a sentimental orientation as a way to measure subjectivity or opinion in texts. The difference between subjectivity and objectivity in a message is decisive: only subjective texts can be considered to contain sentiments. To define a sentimental orientation, Wiebe, Wilson, and Cardie (2005) divided positive expressions into positive emotions, evaluations and stances. Analogously, negative expressions are subdivided into negative emotions, evaluations and stances. Go, Bhayani, and Huang (2009, p. 2) defined a sentiment as “a personal positive or negative feeling”. To distinguish tweets with and without a sentiment, Go, Bhayani, and Huang introduced a litmus test: if a tweet could appear as a newspaper headline, it was assumed not to carry any sentiment.

2.2.2 Sentiment analysis

Sentiment analysis is also known as opinion mining and sentiment detection. It describes the process of extracting and understanding the sentiments in any kind of text document. In order to create a lexicon for sentiment analysis, sentiments are classified as positive, negative, both or neutral. A sentiment can be expressed by words, punctuation marks and emoticons. Emoticons are strings of characters which, ordered in a special sequence, can express an emotion or a feeling. Sentiment analysis can be applied on at least two levels: word-level and sentence-level sentiment analysis. Word-level analysis determines the sentiment value of a word or a phrase, while sentence-level and document-level analyses identify a dominant or overall sentiment value of a sentence or document (Pang & Lee, 2008). The difficulty of sentence- and document-level analyses is that a sentence or document may contain a mixture of positive and negative sentiment values. Pang and Lee (2008) described the detection of sentiments as the body of work that deals with the computational treatment of opinion, sentiment or subjectivity in all kinds of texts.

A sentiment analysis tool automatically classifies words or sentences into sentimental orientations. There are two-way classifications into positive and negative sentiments and three-way classifications into positive, negative and neutral sentiments. Depending on the tool, the returned values may differ: the assignment of a single value as well as the assignment of two opposite values is possible. Depending on the applied corpus, the methods may also differ. SentiStrength uses an algorithm that simultaneously extracts positive and negative sentiments from short informal texts (Thelwall et al., 2010).

Given the character limit on Twitter, sentence-level analysis is of interest for this study. Building on the manual sample and considering the problems that occurred in the manual analysis, this study attempts to identify sentiments in a large set of tweets by automated sentiment analysis.

Sentiment analysis tools implement the theoretical ideas of sentiment analysis. Depending on the theoretical background, the methods can vary. To find an appropriate sentiment analysis tool, the search query ‘sentiment detection tool’ was run on Google; the results gave a first overview of the available sentiment analysis tools. Suitable tools were selected according to the following criteria. First of all, the tool had to be free or accessible via an educational account. In addition, the analysis of short texts, such as tweets, had to be the main focus of the tool, and it had to be able to classify a large number of tweets automatically. Considering these purposes, two sentiment analysis tools were selected: sentiment140.com³ and SentiStrength (Thelwall, Buckley, & Paltoglou, 2012). As sentiment analysis is a lexicon-based process, some terms cannot be classified correctly. Irony constitutes a major problem, but many other misclassifications can occur: in ‘6,000 times greater’ the term ‘great’ would be classified as a positive sentiment although it is used here as a relational operator. Some expressions that do not contain a clearly negative sentiment may carry a negative connotation that can only be detected intellectually.
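The ‘6,000 times greater’ problem can be reproduced with a deliberately naive matcher. The sketch below assumes a context-blind prefix match against a tiny hypothetical list of positive stems; it does not reproduce the matching rules of either selected tool.

```python
import re

# Hypothetical mini lexicon of positive stems; real lexicons are far larger.
POSITIVE_STEMS = ("great", "nice", "superb")


def naive_positive_hits(text):
    """Return the tokens that a context-blind prefix match would call positive."""
    tokens = re.findall(r"[a-z]+", text.lower())
    return [tok for tok in tokens if tok.startswith(POSITIVE_STEMS)]


# The comparative is flagged although it only expresses a relation.
print(naive_positive_hits("Risk is 6,000 times greater in group A"))  # ['greater']
```

A matcher of this kind has no way to tell an evaluation (‘a great paper’) from a comparison, which is why such cases need intellectual checking or lexicon adaptation.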

sentiment140.com

The sentiment analysis tool sentiment140.com extracts sentiments from short, common and informal texts, especially from tweets. According to the tool’s website, the lexicon is trained on a general corpus of selected tweets posted between 6 April and the end of May 2009. Details on how the training corpus was created or how terms are classified are not known. The training corpus is available, but there is no access to the lexicon of the tool. The corpus is only available in English, so only English tweets can be analyzed correctly. The returned sentiment values are 0 for negative, 2 for neutral and 4 for positive, corresponding to the three-way classification. For better comparison with the second sentiment analysis tool and the manual coding, the values were normalized to -1 for negative, 0 for neutral and +1 for positive sentiments. There is no option for missing values; each tweet is strictly assigned to one of these polarity values. If a tweet can be classified as neither positive nor negative according to the lexicon, it is considered neutral, that is, not containing any sentiment.
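The normalization of the sentiment140.com values described above amounts to a fixed mapping. A minimal sketch follows; the function and constant names are hypothetical, and the way the tool's output file is read is left out because its format is not described in the text.

```python
# Mapping stated in the text: 0 = negative, 2 = neutral, 4 = positive.
SENTIMENT140_TO_POLARITY = {0: -1, 2: 0, 4: 1}


def normalize_sentiment140(label):
    """Map a sentiment140.com value (0, 2 or 4) to -1, 0 or +1."""
    try:
        return SENTIMENT140_TO_POLARITY[int(label)]
    except (KeyError, ValueError, TypeError) as exc:
        raise ValueError(f"unexpected sentiment140 label: {label!r}") from exc


print([normalize_sentiment140(v) for v in (0, 2, 4)])  # [-1, 0, 1]
```

Raising an error for any other value mirrors the statement that the tool has no option for missing values: an unexpected label points to a parsing problem rather than to an unclassified tweet.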

To process the tweets, they have to be uploaded as a .txt-file separated with a line-break. 10,000 tweets can be analyzed at the same time.

SentiStrength

The sentiment analysis tool SentiStrength was written and developed in Java. Although there is no access to the source code, SentiStrength is highly transparent with regard to its functionality. Downloading the full program for academic research provides access to the lexicon, which consists of several .txt files. These files, which contain a collection of 298 positive and 465 negative terms with sentiment strengths from 2 to 5, form the core of the string-matching algorithm. The classifications are based on intellectual decisions made during development. The lexicon includes a booster word list containing words that increase or decrease the values of the following words - for example, the term ‘very’ increases the value of the following word by 1. In questions, negative emotions are ignored. Moreover, the lexicon includes a negating word list containing words that invert the following words: ‘very happy’, valued at +4, would be inverted to -4 if the term ‘not’ were added. The emoticon list contains the strengths for emoticons. Repeated letters as well as repeated punctuation increase the strength by 1; exclamation marks increase the strength by 2 (Thelwall et al., 2010). Each tweet receives a negative and a positive strength from 2 to 5 (Thelwall, 2013). To compare the strengths to the intellectual coding, the negative and the positive strength were converted to an average value and normalized to the values -1 for negative, 0 for neutral and +1 for positive sentiment. Up to 16,000 tweets can be analyzed per run.
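The interplay of lexicon terms, booster words and negating words, followed by the reduction to a single polarity, can be illustrated with a toy scorer. This is a sketch under stated assumptions and not the tool's actual algorithm: the lexicon entries are hypothetical, 0 stands for "no term found" in this sketch, emoticons, repeated letters, punctuation and the question rule are left out, and the normalization is assumed to take the sign of the mean of the two strengths, since the exact conversion is not specified in the text.

```python
# Toy lexicon with hypothetical entries; strengths follow the 2..5 scale
# described in the text, and the sign marks the polarity.
LEXICON = {"happy": 3, "superb": 4, "sad": -3, "awful": -4}
BOOSTERS = {"very": 1}
NEGATORS = {"not"}


def score_tokens(tokens):
    """Return (strongest positive, strongest negative) score; 0 if none found."""
    best_pos, best_neg = 0, 0
    boost, negate = 0, False
    for tok in tokens:
        if tok in NEGATORS:
            negate = True
        elif tok in BOOSTERS:
            boost += BOOSTERS[tok]
        elif tok in LEXICON:
            value = LEXICON[tok]
            # A booster strengthens the term in its own direction.
            value += boost if value > 0 else -boost
            value = max(-5, min(5, value))  # stay within the 5-point scale
            if negate:
                value = -value
            best_pos, best_neg = max(best_pos, value), min(best_neg, value)
            boost, negate = 0, False
        else:
            # Any other word ends the reach of a pending booster or negator.
            boost, negate = 0, False
    return best_pos, best_neg


def to_polarity(pos, neg):
    """Assumed normalization: sign of the mean of both strengths (-1, 0, +1)."""
    mean = (pos + neg) / 2
    return (mean > 0) - (mean < 0)


print(score_tokens("not very happy".split()))                # (0, -4)
print(to_polarity(*score_tokens("very happy".split())))      # 1
```

With this assumed normalization, a tweet whose positive and negative strengths cancel out is treated as neutral, which is one possible reading of "converted to an average value".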

2.3 Methods

2.3.1 Intellectual coding of sentiments

In the pre-test the sentiment of 1,000 random tweets was analyzed intellectually in order to determine the validity and accuracy of the automatic sentiment analyses carried out by the sentiment analysis tools (Section 2.2.2). Tweets were compared to the article titles in order to detect their sentiment regarding the content (si). The evaluation of the tweets covers three sentiments: neutral (0), negative (-1) and positive (1). Table 1 provides examples of the 1,000 intellectually analyzed tweets. Tweets that simply express the content or discuss the topic of the article they link to were classified as neutral and thus flagged with the sentiment value 0 [see Table 1 tweet id: 68676]. Tweets that contained only the original title of the linked article were defined as clearly neutral and were also included in the first category, as they did not include any opinion by the Twitter user [tweet id: 7495]. Similarly, tweets containing ‘Most popular’ [tweet id: 1603054], ‘New Article’ [tweet id: 1636995] and ‘New review’ [tweet id: 1634723] were defined as neutral as they did not reflect any opinion towards the paper. Tweets that contain an explicit positive adjective, e.g. interesting [tweet id: 149186], superb [tweet id: 149186] and nice [tweet id: 1124785], were assumed to express a positive opinion towards the paper and were marked with a positive sentiment value (1). Negatively valued tweets were mostly not identified by a particular term but by the general context within the tweet: in [tweet id: 47177] the term study was used ironically, implying the opposite of something that is defined as a study. The term “flash report” [tweet id: 1796390] does not have a pejorative connotation in the general sense, but in the context of scientific communication on Twitter it demotes the whole opinion towards the linked article. That leads to a negative sentiment (-1). Furthermore, expressions are used with an opposite meaning, as in [tweet id: 1994553]: here ‘lol’, which is an abbreviation for laughing out loud and usually expresses happy feelings, is used to express malice and is thus marked as a negative sentiment.

Table 1. Examples of tweets - 1st intellectual sentiment analysis (si)