• Review Article
  • Published: 09 September 2022

A Review Paper on the Role of Sentiment Analysis in Quality Education

  • Pooja 1 &
  • Rajni Bhalla   ORCID: orcid.org/0000-0003-4032-1645 2  

SN Computer Science, volume 3, Article number: 469 (2022)

Education is a critical indicator of progress and a major factor in well-being. The UN's Sustainable Development Goals establish specific requirements for increasing educational quality and protecting the well-being of children. The UN's Sustainable Development Goal 4, which aims to “ensure inclusive and equitable quality education and promote lifelong learning opportunities for all”, was adopted in India in 2015. Students’ academic success is a vital part of the education system. Predicting student performance has grown more challenging due to the enormous amount of data in educational databases. Low-performing students experience a variety of difficulties, including delayed graduation and even dropping out. Educational institutions should therefore closely monitor the academic progress of their students and provide quick assistance to those who perform poorly. One way to do this is to predict students’ academic achievement, which helps institutions identify and support low-performing students at an early stage. This study presents a systematic review of research on sentiment analysis towards SDG4 (Quality Education) through social media platforms such as Twitter and Facebook, covering 21 studies indexed in Scopus. Social media data, rather than conventional survey data, are used to evaluate the outspoken opinions and feelings of students towards their institutions with respect to quality education. The dataset used in this study, student-performance-data-set, is taken from Kaggle. It consists of two files, student-math and student-por, which record student performance in a mathematics course and a Portuguese language course, respectively, with 33 attributes and 396 records in each. Of the 396 records, 110 were selected as a sample. During visualization, we observed that the SVM model is stable, because minor data changes have no impact on the hyperplane, and that it handles nonlinear data using kernel techniques.

Introduction.

Sustainable development goals establish priorities and direction for governments, corporations and civil society around the world. Education is a right that belongs to everyone. Of the 17 SDGs of the United Nations, the fourth goal concerns Quality Education. It aims to ensure inclusive, equitable quality education and promote lifelong learning opportunities for all by 2030. Owing to their innovative nature, digital technologies such as Artificial Intelligence, its subsets like Deep Learning, and the Internet of Things (IoT) are rapidly being adopted by government and industry. The knowledge, abilities, attitudes, and values that enable people to contribute to and profit from a more inclusive and sustainable future must be developed through education. Sentiment analysis, or opinion mining, plays a promising role in Quality Education (SDG4). Most of the information shared by students on social media platforms includes clear sentimental aspects, and its analysis has become an important research field.

Sentiment Analysis and Literature Review

People’s anxiety about and resistance to new technologies frequently arise as those technologies are developed and applied. This hostile attitude is often fuelled by a collective lack of awareness of, or familiarity with, the technologies’ capabilities and potential advantages. AI applications and related technologies such as machine learning, deep learning, and natural language processing are enabling the automation and optimization of a wide range of human tasks, many of which would aid in the achievement of the SDGs. Countries improve their educational institutions to grow further in society [ 1 , 2 , 3 ]. Education is a key dimension of well-being and a crucial indicator of development. Among the 17 SDGs established in 2015 by the UN's 2030 Agenda, SDG 4 aims to “ensure inclusive and equitable quality education and promote lifelong learning opportunities for all” [ 4 ]. According to [ 5 ], artificial intelligence and other emerging technologies are used in almost every aspect of society, particularly in education. Our society is being revolutionized by digital technologies such as artificial intelligence, blockchain, cloud computing, big data, and deep learning, among others. In their daily initiatives and endeavors, industry and government administrations are increasingly relying on digital technologies [ 6 , 7 , 8 ]. Several studies have emerged in recent years to investigate the problems, benefits, and potential impact of artificial intelligence [ 9 , 10 , 11 , 12 ] and related technologies (e.g. blockchain [ 13 , 14 ]) as drivers for the entire suite of SDGs.

Sentiment analysis is a procedure that automatically analyses natural language utterances, identifies key statements or opinions, and categorizes them based on their emotional attitude. Sentiment analysis techniques can be categorized as shown in Fig. 1.

Fig. 1 Approaches to sentiment analysis

A large number of studies [ 9 , 10 , 11 , 12 , 13 , 14 ] give a more general assessment of the issues, consequences and advantages of applying AI and digital technologies across the entire SDG framework. Given the significance of these papers to the scope of this research, we summarize them below:

Over the past few years, numerous studies on the factors influencing student performance have been published. Some of these works only analyze the traits that directly or indirectly affect students’ results, while others also predict students’ outcomes from those traits using various learning algorithms.

Huang et al. [ 12 ] identified the economic, environmental and social barriers to achieving the SDGs with AI. In e-learning scenarios, the authors highlight the need for personalisation strategies when customizing or recommending teaching materials. They also proposed comprehensive conceptual frameworks that depict how data from various sources interact with AI approaches and digital techniques for various SDG perspectives. They conclude that AI and digital technologies hold a great deal of promise for the world’s economic, environmental and social sustainability.

Vinuesa et al. [ 11 ] used a consensus-based method for eliciting opinions to evaluate the effect of AI-based services on collectively fulfilling the SDGs. According to the analysis, AI services could help achieve 79% of SDG targets while hindering 35% of them, leaving a small share (14%) of “overlapping” targets where, depending on the specific use, AI could be either an enabler of or a barrier to the goal(s). Proper regulation of AI systems could therefore make them more helpful (and reduce the associated risks).

For example, social media platforms now give users a profusion of information, the veracity of which is rarely vetted beforehand. As a result, certain social media platforms may experience social and political division, resulting in social and political friction. On the other hand, some artificial intelligence algorithms based on natural language processing (NLP) may be useful in detecting false information on social media [ 15 ].

Esparza et al. [ 16 ] utilized sentiment analysis to investigate the views of students on teacher performance, and they discovered that the difficulty of assessing teacher evaluation feedback could be solved with the use of social mining.

Altrabsheh et al. [ 17 , 18 ] used SA methods to analyze student feedback and identify students’ positive or negative feelings towards the teaching process. They automatically modeled student feedback using SVM, NB, and Maximum Entropy classifiers. The results revealed that SVM and NB are best for modeling student feedback.

Yadav et al. [ 19 ] use multimodal sentiment analysis to extract deeper emotions that textual analysis alone may sometimes miss. Audio-video equipment aids in detecting the moods of various target users: in multimodal sentiment analysis, audio and video are gathered simultaneously from audio–video input, and emotions are inferred from cues such as facial expressions. Kastrati [ 20 , 21 ] proposed a model for representing documents with rich semantic content that uses deep learning to automatically categorise financial documents. They also used deep learning techniques such as DNNs and CNNs to manage and classify educational content for various search and retrieval applications in order to provide a more personalized learning experience. For classifying and organizing content in a MOOC scenario, they suggested a video classification system that makes use of multiple transcript feature representations and deep learning. The main aim of the study in [ 22 ] is to identify the role of higher education in the SDGs. It summarizes the best practices of QA agencies in promoting the SDGs and provides recommendations for internal and external QA systems for SDG achievement [ 22 ].

Kandhro et al. [ 23 ] used different machine learning methods, such as Support Vector Machine, Multinomial Naive Bayes, Random Forest and Multilayer Perceptron classifiers, to propose an SA model for improving teaching quality in HE institutions. The study investigated various SA models to identify the best model for analyzing students’ classroom feedback. They stressed that social media websites such as Twitter and Facebook could be used as a source of information and for mining opinions related to students’ learning activities. El-Sayad et al. [ 24 ] used statistics and machine learning techniques to analyze the effects of the COVID-19 epidemic on educational systems, particularly on the psychological health of university students. A variety of data were gathered using an online questionnaire, including demographic data, digital tools, sleeping patterns, social contact, academic performance, psychological condition, and a scale for anxiety and sadness.

An automated evaluation approach for assessing student performance has been suggested, along with an analysis of student achievement. To accurately estimate student performance, the author employed the Tree algorithm. Naive Bayes is frequently used to find the parameters that affect student performance, but it suffers from the zero-probability problem [ 25 ]. The author proposed a new algorithm, RB-Bayes, to solve this issue.

A Brief Interpretation of Sentiment Analysis in SDG4 (Quality Education)

A thorough review of the literature shows that many researchers used supervised machine learning techniques, such as classification and regression algorithms, to predict students’ performance. Classification and regression algorithms such as Support Vector Machine (SVM), Naïve Bayes (NB), Decision Tree (DT), K-Nearest Neighbors (KNN), Random Forest (RF), Artificial Neural Network (ANN), and Linear Regression (LR) are the choice of most researchers. Some researchers also used unsupervised machine learning techniques such as K-means, Fuzzy C-Means, Hierarchical, and Apriori algorithms.

Research Gap Identification

Many researchers have developed machine learning models for predicting student academic performance. They focused on academic, marital, psychological and demographic factors. However, there are still some important and unexplored factors that may influence the performance of students.

These factors are:

Eating Habits, Sleeping Hours

Mobile addiction

Availing any scholarship (Single Girl Child, EWS, SC, ST, BC etc.)

Place they live (Urban/Rural)

Transportation Facility

Language Barrier

Mode of Admission (Regular/ Private/Distance)

Gap year/ Pass out school background

Hosteller or day scholar

Effect of friend circle

Parents' intention and cooperation

High expectations of parents

Career Awareness

Self employment

Purpose of Proposed Model

To study and analyze existing research pertaining to the performance of students for identifying the parameters affecting the performance of the students.

To propose an ensemble model for predicting the performance of the students.

To test and validate the proposed model on various performance metrics.

Proposed Framework

Source of dataset.

In this paper, we took the dataset named student-performance-data-set from Kaggle. It consists of two files, student-math and student-por, which record student performance in a mathematics course and a Portuguese language course, respectively. The sizes of these files are 56,993 bytes and 93,220 bytes. The dataset has 33 attributes (such as the student's school, age, sex, address, family size, mother's education (medu), father's education (fedu), mother’s job (mjob), father’s job (fjob), parents’ cohabitation status, grade1 (G1), grade2 (G2), and final_grade (G3)) and 396 records in each file.

We took a sample of 110 records to visualize the relationship between age and grade. The dataset is available at http://www.kaggle.com/datasets/larsen0966/student-performance-data-set

The architecture of the proposed model is shown in Fig. 2 .

Data Collection: Data will be collected in online and offline mode through Google Forms and questionnaires.

Data Preprocessing: Because data collected from online and offline sources do not have a proper structure, they need to be converted into a form suitable for feeding into the proposed model using appropriate preprocessing methods.

Split the dataset: After that, we will split the dataset into two parts:

Training set

Testing set

Proposed Method (SVM/Naive Bayes): Our proposed model is based on a hybrid approach to sentiment analysis (i.e., SVM and Naive Bayes).

Performance of the model: We will compare the performance of the proposed hybrid model with that of an existing model based on either SVM or Naive Bayes.
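The splitting and comparison steps listed above can be illustrated with a minimal sketch. It assumes an 80/20 split and plain accuracy as the comparison metric; neither choice is stated in the paper, and the function names `split_dataset` and `accuracy`, the seed, and the toy labels are hypothetical and carry no experimental meaning.

```python
import random


def split_dataset(records, test_fraction=0.2, seed=42):
    """Shuffle a copy of the records and split it into (training, testing) lists."""
    shuffled = list(records)
    random.Random(seed).shuffle(shuffled)
    n_test = round(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]


def accuracy(true_labels, predicted_labels):
    """Fraction of predictions that match the true labels."""
    matches = sum(t == p for t, p in zip(true_labels, predicted_labels))
    return matches / len(true_labels)


# Stand-in for the 110 sampled records; real rows would hold the 33 attributes.
records = list(range(110))
train, test = split_dataset(records)

# Toy labels and predictions (not results), only to show how two models are compared.
true_labels = ["pass", "fail", "pass", "pass"]
baseline_predictions = ["pass", "pass", "pass", "fail"]
hybrid_predictions = ["pass", "fail", "pass", "fail"]
baseline_accuracy = accuracy(true_labels, baseline_predictions)
hybrid_accuracy = accuracy(true_labels, hybrid_predictions)
```

In practice the predictions would come from the trained SVM, Naive Bayes, and hybrid models evaluated on the testing set, and other metrics could replace accuracy.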

Fig. 2 Structure of the proposed model

Visualization of the Dataset

Figure 3a and b show the graphical representation of the relationship between the age and grade attributes.

Fig. 3 a Age vs. grade. b Scatter diagram of the age and grade variables

During visualization, we observed that the SVM model is stable, because minor data changes have no impact on the hyperplane; it handles nonlinear data using kernel techniques and works quickly when there is a clear margin of separation, but it takes a long time to train on a large dataset. Naive Bayes, on the other hand, assumes that all features are independent; if a categorical value is missing from the training data, it assigns that value zero probability and cannot make a prediction.
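The zero-probability issue can be seen in a small sketch. The example below uses additive (Laplace) smoothing, a standard textbook remedy, and is not the RB-Bayes algorithm cited earlier; the function name `conditional_probability` and the toy feature values are hypothetical.

```python
from collections import Counter


def conditional_probability(value, observed_values, vocabulary, alpha=0.0):
    """Estimate P(feature = value | class) from the values seen in one class.

    alpha = 0 gives the raw frequency; alpha > 0 applies additive smoothing.
    """
    counts = Counter(observed_values)
    return (counts[value] + alpha) / (len(observed_values) + alpha * len(vocabulary))


# Toy feature values (place of residence) for the students of one class.
observed = ["urban", "urban", "urban", "urban"]
vocabulary = {"urban", "rural"}

# "rural" never occurs in this class, so the raw estimate is zero and would
# zero out the whole product of feature probabilities in Naive Bayes.
unsmoothed = conditional_probability("rural", observed, vocabulary)

# With alpha = 1 every value gets one pseudo-count: (0 + 1) / (4 + 2).
smoothed = conditional_probability("rural", observed, vocabulary, alpha=1.0)
```

With smoothing, an unseen category lowers the class score instead of eliminating the class outright.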

In future, we will apply ensemble techniques and will perform comparative analysis to find the best model for prediction.

Future Plan

To make learning and teaching about sustainable development a reality at all levels, it is critical to coordinate the activities of stakeholders in university education, to begin and maintain a systematic discussion on innovation in SDG and enhancement in learning and teaching, and to encourage and support student and teacher feedback on issues related to learning outcomes, assessment, and quality assurance reviews, while taking into account the skills needed to add value [ 29 ].

In future, we will build an ensemble hybrid model based on SVM and NB to overcome the problems of the existing model.

Improved accuracy of classification techniques.

To capture a user’s opinion or thoughts from a text more effectively, ML/DL sentiment analysis approaches and techniques should place a great emphasis on embedding the semantic context using lexical sources such as SentiWordNet, WordNet as well as semantic representation using ontologies [ 19 ].

In this study, we read and reviewed 21 studies indexed in Scopus on sentiment analysis towards Quality Education and how it can be used for various purposes. We identified methods for implementing algorithms in the predictive analysis of students’ performance. Machine learning algorithms such as the Support Vector Classifier (SVC) and Naïve Bayes (NB) were utilized to predict students’ performance.

We visualized the students’ performance according to existing methods. We will obtain the results for the different dimensions separately and compare them with the proposed model. After this step, we aim to show the superiority of the proposed hybrid model over traditional approaches to sentiment analysis in achieving quality education, the fourth of the SDGs, whose main aim for individuals and society is to ensure that all boys and girls complete free, equitable and quality primary and secondary education leading to relevant and effective learning outcomes. It also empowers people everywhere to live healthier and more sustainable lives.

Rivzi P, Lingard B. Globalising education policy. Routledge; 2009.

Costa EB, Fonseca B, Santana MA, de Araujo FF, Rego J. Evaluating the effectiveness of educational data mining techniques for early prediction of students’ academic failure in introductory programming courses. Comput Hum Behav. 2017;73:247–56.

Liao SN, Zingaro D, Thai K, Alvarado C, Griswold WG, Porter L. A robust machine learning technique to predict low- performing students. ACM Trans Comput Educ (TOCE). 2019;19:1–19.

Kaurav RP (2020) Theoretical extension of the new education policy 2020 using twitter mining

Palomares I. Reciprocal recommender system: analysis of state-of-art literature, challenges and opportunities on social recommendation. Inf Fusion Press. 2021. https://doi.org/10.1016/j.inffus.2020.12.001 .

Dunis C. Artificial intelligence in financial markets. Berlin: Springer; 2019.

Lytras MD. The recent development of artificial intelligence for smart and sustainable energy systems and applications. Energies. 2019;12(16):3108.

Mao C. Real time carbon emissions monitoring tool for prefabricated construction: an IoT based system framework. In: ICCREM 2018: Sustainable construction and prefabrication. American Society of Civil Engineers Reston, VA. 2018; pp. 121–7.

Goralski MA. Artificial intelligence and sustainable development. Int J Manag Educ. 2020;18(1):10030.

Truby J. Governing artificial intelligence to benefit the UN sustainable development goals. Sustain Dev. 2020;28(4):946–59.

Vinuesa R. The role of artificial intelligence in achieving the sustainable development goals. Nat Commun. 2020;11(1):1–10.

Huang. Information and communication technologies for sustainable development goals: state-of the-art, needs and perspectives. IEEE Commun Surv Tut. 2018;20(3):2389–406.

Nguyen QK (2016) Blockchain—a financial technology for future sustainable development. In: 2016 3rd International Conference on green technology and sustainable development (GTSD); pp 51–4

Zwitter A and Herman J (2018) Blockchain for sustainable development goals. University of Groningen, Report 2018 7–2018 Ed. 2018

Zhang X, Ghorbani AA. An overview of online fake news: characterization, detection and discussion. Inf Process Manag. 2020;57(2): 102025.

Esparza GG. A sentiment analysis model to analyze students reviews of teacher performance using support vector machines. In: International Symposium on Distributed Computing and Artificial Intelligence. 2017. Springer.

Altrabsheh N. Sentiment analysis towards a tool for analysis real time students feedback. In: IEEE 26th International Conference on tools with artificial intelligence IEEE, 2014.

Altrabsheh N. SA-E: Sentiment Analysis for Education International conference on Intelligent Decision technologies, 2013.

Yadav SK. Multimodal sentiment analysis: sentiment analysis using audiovisual format. In: 2nd International Conference on Computing for Sustainable Global Development. (INDIACom) 2015.

Kastrati Z. The impact of deep learning on document classification using semantically rich representations. Inf Process Manag. 2019;2019(56):1618–32.

Kastrati Z. Integrating word embedding and document topics with deep learning in a video classification framework. Pattern Recognit Lett. 2019;128:85–92.

Rome Communique. 2020. https://erasmusplus.org.ua/novyny/3131-bologna-conference-in-rome-19-nov-2020.html

Kandhro IA. Student feedback sentiment analysis model using various Machine Learning schemes: a review. Indian J Sci Technol. 2019; 12(14).

El-Sayad A, Ewis A, Abdel Rauof MM, Ghoneim O. A new approach in identifying the psychological impact of COVID-19 on university Students’ academic performance. Alexandria Eng J. 2022. https://doi.org/10.1016/j.aej.2021.10.046 .

Bhalla R (2019) A comparative analysis of application of proposed and the existing methodologies on a mobile phone survey. In: International conference on futuristic trends in networks and computing technologies. Springer, Singapore

Tarik A, Aissa H, Yousef F. Artificial Intelligence and Machine Learning to predict Student performance during COVID-19. In: The 3rd International workshop on Big Data and Business Intelligence (BDBI 2021) March 23–26, 2021; Warsaw, Poland. https://doi.org/10.1016/j.procs.2021.03.104

Sekeroglu B, Dimililer K, Tuncal K. Student performance prediction and classification using machine learning algorithms. ICEIT 2019, March 24, Cambridge. 2019. https://doi.org/10.1145/3318396.3318419 .

Dabhade P, Agarwal R, Alameen KP, Fatima AT, Sridharan R, Gopukumar G. Educational Data Mining for predicting student’s academic performance using Machine learning algorithms. Mater Today. 2021. https://doi.org/10.1016/j.matpr.2021.05.646 .

Shuang K. Convolution deconvolution word embedding: an end-to end multi-prototype fusion embedding method for natural language processing. Inf Fusion. 2020;2020(53):112–22.

Author information

Authors and affiliations.

Computer Application, Lovely Professional University, LPU Phagwara, Jalandhar, Punjab, India

School of Computer Application, Lovely Professional University, LPU Phagwara, Jalandhar, Punjab, India

Rajni Bhalla

Corresponding author

Correspondence to Rajni Bhalla .

Ethics declarations

Conflict of interest.

Both authors declare that they have no conflict of interest.

Ethical Approval

This article does not contain any studies with human participants or animals performed by any of the authors.

Informed Consent

Not applicable

Additional information

Publisher's note.

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

This article is part of the topical collection “Intelligent Systems” guest edited by Geetha Ganesan, Lalit Garg, Renu Dhir, Vijay Kumar and Manik Sharma.

Rights and permissions

Springer Nature or its licensor holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article.

Pooja, Bhalla, R. A Review Paper on the Role of Sentiment Analysis in Quality Education. SN COMPUT. SCI. 3 , 469 (2022). https://doi.org/10.1007/s42979-022-01366-9

Received : 16 April 2022

Accepted : 04 August 2022

Published : 09 September 2022

DOI : https://doi.org/10.1007/s42979-022-01366-9

  • Sentiment analysis
  • Quality education
  • Text mining

  • Methodology
  • Open access
  • Published: 16 June 2015

Sentiment analysis using product review data

  • Xing Fang 1 &
  • Justin Zhan 1  

Journal of Big Data, volume 2, Article number: 5 (2015)

Sentiment analysis, or opinion mining, is one of the major tasks of NLP (Natural Language Processing) and has gained much attention in recent years. In this paper, we aim to tackle the problem of sentiment polarity categorization, which is one of the fundamental problems of sentiment analysis. A general process for sentiment polarity categorization is proposed with detailed process descriptions. The data used in this study are online product reviews collected from Amazon.com. Experiments for both sentence-level categorization and review-level categorization are performed with promising outcomes. Finally, we give insight into our future work on sentiment analysis.

Introduction

Sentiment is an attitude, thought, or judgment prompted by feeling. Sentiment analysis [ 1 - 8 ], which is also known as opinion mining, studies people’s sentiments towards certain entities. The Internet is a rich source of sentiment information. From a user’s perspective, people are able to post their own content through various social media, such as forums, micro-blogs, or online social networking sites. From a researcher’s perspective, many social media sites release their application programming interfaces (APIs), prompting data collection and analysis by researchers and developers. For instance, Twitter currently has three different versions of APIs available [ 9 ], namely the REST API, the Search API, and the Streaming API. With the REST API, developers are able to gather status data and user information; the Search API allows developers to query specific Twitter content, whereas the Streaming API is able to collect Twitter content in real time. Moreover, developers can mix those APIs to create their own applications. Hence, sentiment analysis appears to have a strong foundation, supported by massive online data.

However, those types of online data have several flaws that potentially hinder the process of sentiment analysis. The first flaw is that since people can freely post their own content, the quality of their opinions cannot be guaranteed. For example, instead of sharing topic-related opinions, online spammers post spam on forums. Some spam is entirely meaningless, while other spam contains irrelevant opinions, also known as fake opinions [ 10 - 12 ]. The second flaw is that ground truth for such online data is not always available. A ground truth is like a tag attached to a certain opinion, indicating whether the opinion is positive, negative, or neutral. The Stanford Sentiment 140 Tweet Corpus [ 13 ] is one of the datasets that has ground truth and is also publicly available. The corpus contains 1.6 million machine-tagged Twitter messages. Each message is tagged based on the emoticons (☺ as positive, ☹ as negative) discovered inside the message.
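Emoticon-based machine tagging of this kind can be sketched in a few lines. The emoticon lists and the function name `emoticon_label` below are illustrative assumptions, not the corpus authors' actual rules; in particular, the ASCII variants and the handling of messages with mixed emoticons are choices made for this sketch.

```python
# Assumed emoticon sets; only the two Unicode faces come from the text above.
POSITIVE_EMOTICONS = ("☺", ":)", ":-)")
NEGATIVE_EMOTICONS = ("☹", ":(", ":-(")


def emoticon_label(message):
    """Return 'positive' or 'negative' based on emoticons, or None if undecidable."""
    has_positive = any(e in message for e in POSITIVE_EMOTICONS)
    has_negative = any(e in message for e in NEGATIVE_EMOTICONS)
    if has_positive and not has_negative:
        return "positive"
    if has_negative and not has_positive:
        return "negative"
    # No emoticon, or conflicting emoticons: leave the message untagged.
    return None
```

A label obtained this way is only a noisy proxy for the writer's sentiment, which is one reason a star rating is a more reliable ground truth.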

Data used in this paper is a set of product reviews collected from Amazon [ 14 ], between February and April, 2014. The aforementioned flaws have been somewhat overcome in the following two ways: First, each product review receives inspections before it can be posted a . Second, each review must have a rating on it that can be used as the ground truth. The rating is based on a star-scaled system, where the highest rating has 5 stars and the lowest rating has only 1 star (Figure 1 ).

Figure 1 Rating system for Amazon.com.

This paper tackles a fundamental problem of sentiment analysis, namely sentiment polarity categorization [ 15 - 21 ]. Figure 2 is a flowchart that depicts our proposed process for categorization as well as the outline of this paper. Our contributions mainly fall into Phases 2 and 3. In Phase 2: 1) An algorithm is proposed and implemented for negation phrases identification; 2) A mathematical approach is proposed for sentiment score computation; 3) A feature vector generation method is presented for sentiment polarity categorization. In Phase 3: 1) Two sentiment polarity categorization experiments are performed, one at the sentence level and one at the review level; 2) The performance of three classification models is evaluated and compared based on their experimental results.

Figure 2 Sentiment polarity categorization process.

The rest of this paper is organized as follows: In section ‘ Background and literature review ’, we provide a brief review of some related work on sentiment analysis. Software package and classification models used in this study are presented in section ‘ Methods ’. Our detailed approaches for sentiment analysis are proposed in section ‘ Research design and methodology ’. Experimental results are presented in section ‘ Results and discussion ’. Discussion and future work is presented in section ‘ Review-level categorization ’. Section ‘ Conclusion ’ concludes the paper.

Background and literature review

One fundamental problem in sentiment analysis is categorization of sentiment polarity [ 6 , 22 - 25 ]. Given a piece of written text, the problem is to categorize the text into one specific sentiment polarity, positive or negative (or neutral). Based on the scope of the text, there are three levels of sentiment polarity categorization, namely the document level, the sentence level, and the entity and aspect level [ 26 ]. The document level concerns whether a document, as a whole, expresses negative or positive sentiment, while the sentence level deals with each sentence’s sentiment categorization; the entity and aspect level then targets what exactly people like or dislike, as expressed in their opinions.

Since reviews of much work on sentiment analysis have already been included in [ 26 ], in this section, we will only review some previous work upon which our research is essentially based. Hu and Liu [ 27 ] compiled a list of positive words and a list of negative words based on customer reviews. The positive list contains 2006 words and the negative list has 4783 words. Both lists also include some misspelled words that are frequently present in social media content. Sentiment categorization is essentially a classification problem, where features that contain opinions or sentiment information should be identified before the classification. For feature selection, Pang and Lee [ 5 ] suggested removing objective sentences by extracting subjective ones. They proposed a text-categorization technique that is able to identify subjective content using minimum cut. Gann et al. [ 28 ] selected 6,799 tokens based on Twitter data, where each token is assigned a sentiment score, namely the TSI (Total Sentiment Index), marking it as a positive token or a negative token. Specifically, a TSI for a certain token is computed as:

\[ TSI = \frac{p - \frac{tp}{tn} \times n}{p + \frac{tp}{tn} \times n} \]

where p is the number of times a token appears in positive tweets and n is the number of times a token appears in negative tweets. \(\frac {tp}{tn}\) is the ratio of total number of positive tweets over total number of negative tweets.
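As a worked illustration, the index can be computed directly from the four counts. The sketch below assumes the TSI is the normalized difference (p - (tp/tn) n) / (p + (tp/tn) n); the function name `total_sentiment_index` and the example counts are hypothetical.

```python
def total_sentiment_index(p, n, tp, tn):
    """Total Sentiment Index of a token, assuming the normalized-difference form.

    p, n   : occurrences of the token in positive and negative tweets
    tp, tn : total numbers of positive and negative tweets
    """
    ratio = tp / tn                      # rebalances unequal class sizes
    denominator = p + ratio * n
    if denominator == 0:
        raise ValueError("token does not occur in any tweet")
    return (p - ratio * n) / denominator


# Balanced corpus: 30 positive vs 10 negative occurrences -> (30 - 10) / 40.
balanced = total_sentiment_index(30, 10, 1000, 1000)

# Twice as many positive tweets overall: the negative count is weighted by 2,
# giving (30 - 20) / 50.
skewed = total_sentiment_index(30, 10, 2000, 1000)
```

Under this form the score lies between -1 and 1, with positive values marking tokens that lean positive after correcting for the class imbalance.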

Research design and methodology

Data collection.

The data used in this paper are a set of product reviews collected from amazon.com. From February to April 2014, we collected, in total, over 5.1 million product reviews (endnote b), in which the products belong to 4 major categories: beauty, book, electronic, and home (Figure 3 (a)). Those online reviews were posted by over 3.2 million reviewers (customers) on 20,062 products. Each review includes the following information: 1) reviewer ID; 2) product ID; 3) rating; 4) time of the review; 5) helpfulness; 6) review text. Every rating is based on a 5-star scale (Figure 3 (b)), so all ratings range from 1 star to 5 stars, with no half-star or quarter-star ratings.

Figure 3: Data collection. (a) Data based on product categories; (b) data based on review categories.

Sentiment sentences extraction and POS tagging

Pang and Lee [ 5 ] suggested that all objective content should be removed for sentiment analysis. Instead of removing objective content, in our study all subjective content was extracted for subsequent analysis. The subjective content consists of all sentiment sentences. A sentiment sentence is one that contains at least one positive or negative word. All of the sentences were first tokenized into separate English words.
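The extraction rule can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the two word sets are tiny stand-ins for the Hu and Liu lists [ 27 ], and the regular-expression sentence splitter and tokenizer are simplifications, not the tools used in the study.

```python
import re

# Tiny stand-in lexicons (assumption); the study uses the lists from [27].
POSITIVE_WORDS = {"good", "great", "love", "revolutionary"}
NEGATIVE_WORDS = {"bad", "poor", "hate", "broken"}


def split_sentences(review):
    """Naive sentence splitter: break after '.', '!' or '?'."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", review) if s.strip()]


def tokenize(sentence):
    """Lower-case the sentence and keep alphabetic tokens."""
    return re.findall(r"[a-z']+", sentence.lower())


def sentiment_sentences(review):
    """Keep sentences containing at least one positive or negative word."""
    lexicon = POSITIVE_WORDS | NEGATIVE_WORDS
    return [s for s in split_sentences(review) if lexicon.intersection(tokenize(s))]


review = "I bought this last week. The screen is great! Shipping took three days."
print(sentiment_sentences(review))  # ['The screen is great!']
```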

Every word of a sentence has a syntactic role that defines how the word is used. The syntactic roles are also known as the parts of speech. There are 8 parts of speech in English: the verb, the noun, the pronoun, the adjective, the adverb, the preposition, the conjunction, and the interjection. In natural language processing, part-of-speech (POS) taggers [ 29 - 31 ] have been developed to classify words based on their parts of speech. For sentiment analysis, a POS tagger is very useful for two reasons: 1) Words like nouns and pronouns usually do not contain any sentiment, and such words can be filtered out with the help of a POS tagger; 2) A POS tagger can also be used to distinguish words that can be used as different parts of speech. For instance, as a verb, “enhanced” may convey a different amount of sentiment than it does as an adjective. The POS tagger used for this research is a max-entropy POS tagger developed for the Penn Treebank Project [ 31 ]. The tagger is able to provide 46 different tags, indicating that it can identify more detailed syntactic roles than only 8. As an example, Table 1 is a list of all tags for verbs that are included in the POS tagger.

Each sentence was then tagged using the POS tagger. Given the enormous number of sentences, a Python program that is able to run in parallel was written in order to improve the speed of tagging. Because adjectives, adverbs, and verbs are the words that mainly convey sentiment, we focused on them: over 25 million adjectives, over 22 million adverbs, and over 56 million verbs were tagged out of all the sentiment sentences.

Negation phrases identification

Words such as adjectives and verbs are able to convey opposite sentiment with the help of negative prefixes. For instance, consider the following sentence that was found in an electronic device’s review: “The built in speaker also has its uses but so far nothing revolutionary.” The word “revolutionary” is a positive word according to the list in [ 27 ]. However, the phrase “nothing revolutionary” gives more or less negative feelings. Therefore, it is crucial to identify such phrases. In this work, two types of phrases have been identified, namely negation-of-adjective (NOA) and negation-of-verb (NOV).

The most common negative prefixes, such as not, no, or nothing, are treated as adverbs by the POS tagger. Hence, we propose Algorithm 1 for phrase identification. The algorithm was able to identify 21,586 different phrases with a total occurrence of over 0.68 million, each of which has a negative prefix. Table 2 lists the top 5 NOA and NOV phrases based on occurrence, respectively.
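Algorithm 1 itself is not reproduced in this text, so the Python sketch below is only one plausible reading of the idea: scan a POS-tagged sentence for a negative prefix and pair it with the first adjective or verb that follows within a small window. The window size, the function name, and the hand-written tags in the example are assumptions. The tag prefixes `JJ` and `VB` follow the Penn Treebank convention for adjectives and verbs.

```python
NEGATIVE_PREFIXES = {"not", "no", "nothing"}


def find_negation_phrases(tagged_sentence, window=2):
    """Return (NOA, NOV) phrase lists from a list of (word, tag) pairs.

    A negative prefix is paired with the first adjective (tag JJ*) or
    verb (tag VB*) found among the next `window` tokens. The window
    size is an assumption, not a value taken from the paper.
    """
    noa, nov = [], []
    for i, (word, _tag) in enumerate(tagged_sentence):
        if word.lower() not in NEGATIVE_PREFIXES:
            continue
        for next_word, next_tag in tagged_sentence[i + 1 : i + 1 + window]:
            if next_tag.startswith("JJ"):
                noa.append((word.lower(), next_word.lower()))
                break
            if next_tag.startswith("VB"):
                nov.append((word.lower(), next_word.lower()))
                break
    return noa, nov


# Hand-tagged fragments for illustration only.
fragment = [("so", "RB"), ("far", "RB"), ("nothing", "RB"), ("revolutionary", "JJ")]
print(find_negation_phrases(fragment))  # ([('nothing', 'revolutionary')], [])

other = [("does", "VBZ"), ("not", "RB"), ("work", "VB")]
print(find_negation_phrases(other))  # ([], [('not', 'work')])
```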

Sentiment score computation for sentiment tokens

A sentiment token is a word or a phrase that conveys sentiment. Given the sentiment words proposed in [ 27 ], a word token consists of a positive (negative) word and its part-of-speech tag. In total, we selected 11,478 word tokens, each of which occurs at least 30 times throughout the dataset. For phrase tokens, 3,023 phrases were selected from the 21,586 identified sentiment phrases, each of which also occurs no fewer than 30 times. Given a token \(t\), the formula for \(t\)’s sentiment score (SS) computation is given as:

\[SS(t) = \frac{\sum_{i=1}^{5} i \times \gamma_{5,i} \times Occurrence_{i}(t)}{\sum_{i=1}^{5} \gamma_{5,i} \times Occurrence_{i}(t)} \tag{2}\]

\(Occurrence_{i}(t)\) is \(t\)’s number of occurrences in \(i\)-star reviews, where \(i = 1, \ldots, 5\). According to Figure 3, our dataset is not balanced, meaning that a different number of reviews was collected for each star level. Since 5-star reviews make up the majority of the dataset, we introduce a ratio, \(\gamma_{5,i}\), which is defined as:

\[\gamma_{5,i} = \frac{|5\text{-star}|}{|i\text{-star}|} \tag{3}\]

In equation 3, the numerator is the number of 5-star reviews and the denominator is the number of \(i\)-star reviews, where \(i = 1, \ldots, 5\). Therefore, if the dataset were balanced, \(\gamma_{5,i}\) would be set to 1 for every \(i\). Consequently, every sentiment score should fall into the interval of [1,5]. For positive word tokens, we expect that the median of their sentiment scores should exceed 3, which is the point of being neutral according to Figure 1. For negative word tokens, we expect the median to be less than 3.
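A small worked example may help here. The Python sketch below assumes that the sentiment score is the γ-weighted average of the star levels at which a token occurs, which is consistent with the stated interval [1,5]. All counts are hypothetical.

```python
def gamma(review_counts, i):
    """Ratio of the number of 5-star reviews to the number of i-star reviews."""
    return review_counts[5] / review_counts[i]


def sentiment_score(occurrence, review_counts):
    """Gamma-weighted average star level of a token (assumed form).

    occurrence[i]    -- occurrences of the token in i-star reviews
    review_counts[i] -- number of i-star reviews in the dataset
    """
    numerator = 0.0
    denominator = 0.0
    for i in range(1, 6):
        weight = gamma(review_counts, i) * occurrence.get(i, 0)
        numerator += i * weight
        denominator += weight
    if denominator == 0:
        raise ValueError("token never occurs in the dataset")
    return numerator / denominator


# Hypothetical, unbalanced dataset dominated by 5-star reviews.
review_counts = {1: 100, 2: 100, 3: 200, 4: 400, 5: 800}
occurrence = {1: 5, 2: 5, 3: 20, 4: 80, 5: 320}
print(sentiment_score(occurrence, review_counts))  # 4.0625
```

In this example the weights are 8, 8, 4, 2 and 1 for 1-star to 5-star reviews, so occurrences in the rarer low-star reviews count for more than occurrences in the abundant 5-star reviews.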

The sentiment score information for positive word tokens is shown in Figure 4 (a). The histogram describes the distribution of scores, while the box plot shows that the median is above 3. Similarly, the box plot in Figure 4 (b) shows that the median of sentiment scores for negative word tokens is lower than 3. In fact, both the mean and the median exceed 3 for positive word tokens, and both values are lower than 3 for negative word tokens (Table 3 ).

Figure 4: Sentiment score information for word tokens. (a) Positive word tokens; (b) negative word tokens.

The ground truth labels

The process of sentiment polarity categorization is twofold: sentence-level categorization and review-level categorization. Given a sentence, the goal of sentence-level categorization is to classify it as positive or negative in terms of the sentiment that it conveys. Training data for this categorization process require ground truth tags, indicating the positiveness or negativeness of a given sentence. However, ground truth tagging becomes a challenging problem because of the amount of data that we have. Since manually tagging each sentence is infeasible, a machine tagging approach is adopted as a solution. The approach implements a bag-of-words model that simply counts the appearances of positive or negative (word) tokens in every sentence. If there are more positive tokens than negative ones, the sentence is tagged as positive, and vice versa. This approach is similar to the one used for tagging the Sentiment 140 Tweet Corpus. Training data for review-level categorization already have ground truth tags, which are the star-scaled ratings.
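The counting rule for machine tagging can be sketched as follows. The word sets are small stand-ins, and returning `None` on a tie is an assumption, since the text does not say how sentences with equal counts are handled.

```python
def machine_label(tokens, positive_words, negative_words):
    """Tag a tokenized sentence by counting positive and negative tokens."""
    positives = sum(1 for token in tokens if token in positive_words)
    negatives = sum(1 for token in tokens if token in negative_words)
    if positives > negatives:
        return "positive"
    if negatives > positives:
        return "negative"
    return None  # tie: handling is not specified in the text (assumption)


# Stand-in lexicons for illustration only.
POSITIVE_WORDS = {"great", "love", "excellent"}
NEGATIVE_WORDS = {"bad", "broken", "poor"}

sentence = ["great", "screen", "but", "poor", "battery", "love", "it"]
print(machine_label(sentence, POSITIVE_WORDS, NEGATIVE_WORDS))  # positive
```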

Feature vector formation

Sentiment tokens and sentiment scores are information extracted from the original dataset. They are also known as features, which will be used for sentiment categorization. In order to train the classifiers, each entry of training data needs to be transformed into a vector that contains those features, namely a feature vector. For the sentence-level (review-level) categorization, a feature vector is formed based on a sentence (review). One challenge is to control each vector’s dimensionality. The challenge is actually twofold: first, a vector should not contain an excessive number (hundreds or thousands) of features or values of a feature, because of the curse of dimensionality [ 32 ]; second, every vector should have the same number of dimensions in order to fit the classifiers. This challenge particularly applies to sentiment tokens. On the one hand, there are 11,478 word tokens as well as 3,023 phrase tokens. On the other hand, vectors cannot be formed by simply including the tokens that appear in a sentence (or a review), because different sentences (or reviews) tend to have different numbers of tokens, so the generated vectors would have different dimensions.

Since we are only concerned with whether each sentiment token appears in a sentence or a review, two binary strings are used to represent each token’s appearance and so overcome the challenge. One string with 11,478 bits is used for word tokens, while the other, with a bit-length of 3,023, is used for phrase tokens. For instance, if the \(i\)th word (phrase) token appears, the word (phrase) string’s \(i\)th bit is flipped from “0” to “1”. Finally, instead of directly saving the flipped strings into a feature vector, a hash value of each string is computed using Python’s built-in hash function and is saved. Hence, a sentence-level feature vector has four elements in total: two hash values computed from the flipped binary strings, an averaged sentiment score, and a ground truth label. By comparison, review-level vectors include one additional element. Given a review, if there are \(m\) positive sentences and \(n\) negative sentences, the value of the element is computed as \(-1 \times m + 1 \times n\).
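The construction of a sentence-level feature vector can be sketched as below. The vocabularies and scores are tiny hypothetical stand-ins for the 11,478 word tokens and 3,023 phrase tokens, tokens are simplified to plain strings rather than word and tag pairs, and the averaged sentiment score is assumed to be the mean score of the tokens that appear.

```python
def appearance_string(present_tokens, vocabulary):
    """Binary string whose i-th bit is '1' if the i-th vocabulary token appears."""
    present = set(present_tokens)
    return "".join("1" if token in present else "0" for token in vocabulary)


def sentence_feature_vector(word_tokens, phrase_tokens, scores, label,
                            word_vocabulary, phrase_vocabulary):
    """Four-element vector: two hashes, an averaged score, and the label."""
    word_bits = appearance_string(word_tokens, word_vocabulary)
    phrase_bits = appearance_string(phrase_tokens, phrase_vocabulary)
    found = [scores[t] for t in list(word_tokens) + list(phrase_tokens) if t in scores]
    average_score = sum(found) / len(found) if found else 0.0
    return [hash(word_bits), hash(phrase_bits), average_score, label]


def review_extra_element(m, n):
    """Extra review-level element for m positive and n negative sentences."""
    return -1 * m + 1 * n


# Hypothetical vocabularies and sentiment scores.
WORD_VOCABULARY = ["bad", "good", "great"]
PHRASE_VOCABULARY = ["not good", "nothing revolutionary"]
SCORES = {"good": 4.0, "great": 4.5, "not good": 2.0}

vector = sentence_feature_vector(["good", "great"], [], SCORES, "positive",
                                 WORD_VOCABULARY, PHRASE_VOCABULARY)
print(appearance_string(["good", "great"], WORD_VOCABULARY))  # 011
print(vector[2], vector[3])  # 4.25 positive
```

One practical caveat: in Python 3, `hash()` on strings is salted per process unless `PYTHONHASHSEED` is fixed, so hash values saved in one run are not comparable with those computed in another. A deterministic digest from `hashlib` is a possible substitute when vectors must be persisted.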

Results and discussion

Evaluation methods

The performance of each classification model is estimated based on its averaged F1-score ( 4 ):

\[F1_{avg} = \frac{\sum_{i=1}^{n} \frac{2 \times P_{i} \times R_{i}}{P_{i} + R_{i}}}{n} \tag{4}\]

where \(P_{i}\) is the precision of the \(i\)th class, \(R_{i}\) is the recall of the \(i\)th class, and \(n\) is the number of classes. \(P_{i}\) and \(R_{i}\) are evaluated using 10-fold cross validation. A 10-fold cross validation is applied as follows: A dataset is partitioned into 10 equal-size subsets, each of which consists of 10 positive class vectors and 10 negative class vectors. Of the 10 subsets, a single subset is retained as the validation data for testing the classification model, and the remaining 9 subsets are used as training data. The cross-validation process is then repeated 10 times, with each of the 10 subsets used exactly once as the validation data. The 10 results from the folds are then averaged to produce a single estimate. Since training data are labeled under two classes (positive and negative) for the sentence-level categorization, ROC (Receiver Operating Characteristic) curves are also plotted for a better performance comparison.
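The evaluation procedure can be sketched in plain Python. The averaged F1 is taken here to be the mean of the per-class F1 scores, and the fold construction is one simple way to keep the two classes balanced in every fold. Function names and the toy data are hypothetical, and the actual experiments use scikit-learn.

```python
def averaged_f1(precisions, recalls):
    """Mean of per-class F1 scores computed from per-class precision and recall."""
    per_class = []
    for p, r in zip(precisions, recalls):
        per_class.append(0.0 if p + r == 0 else 2 * p * r / (p + r))
    return sum(per_class) / len(per_class)


def ten_fold_splits(positives, negatives, k=10):
    """Yield (training, validation) lists; each fold keeps the classes balanced."""
    for fold in range(k):
        validation = positives[fold::k] + negatives[fold::k]
        training = [x for j in range(k) if j != fold
                    for x in positives[j::k] + negatives[j::k]]
        yield training, validation


# Toy data mirroring the 200-vector case: 100 positive and 100 negative items.
positives = [("positive", i) for i in range(100)]
negatives = [("negative", i) for i in range(100)]
folds = list(ten_fold_splits(positives, negatives))
print(len(folds), len(folds[0][0]), len(folds[0][1]))  # 10 180 20
```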

Sentence-level categorization

Result on manually-labeled sentences

200 feature vectors are formed based on the 200 manually-labeled sentences. The classification models show the same level of performance based on their F1-scores: all three scores equal 0.85. With the help of the ROC curves (Figure 5 ), it is clear that all three models performed quite well for testing data that have high posterior probability. (The posterior probability of a testing data point, \(A\), is estimated by the classification model as the probability that \(A\) will be classified as positive, denoted as \(P(+|A)\).) As the probability gets lower, the Naïve Bayesian classifier outperforms the SVM classifier, with a larger area under the curve. In general, the Random Forest model performs the best.

Figure 5: ROC curves based on the manually labeled set.

Result on machine-labeled sentences

Two million feature vectors (1 million with positive labels and 1 million with negative labels) are generated from 2 million machine-labeled sentences, known as the complete set. Four subsets are obtained from the complete set: subset A contains 200 vectors, subset B contains 2,000 vectors, subset C contains 20,000 vectors, and subset D contains 200,000 vectors. The number of vectors with positive labels equals the number of vectors with negative labels in every subset. The performance of the classification models is then evaluated based on five different vector sets (four subsets and one complete set, Figure 6 ).

Figure 6: F1 scores of sentence-level categorization.

As the models receive more training data, their F1 scores all increase. The SVM model shows the largest improvement, from 0.61 to 0.94, as its training data increased from 180 to 1.8 million vectors. The model outperforms the Naïve Bayesian model and becomes the 2nd best classifier on subset C and the full set. The Random Forest model again performs the best on datasets of all sizes. Figure 7 shows the ROC curves plotted based on the result of the full set.

Figure 7: ROC curves based on the complete set.

Review-level categorization

Three million feature vectors are formed for the categorization. Vectors generated from reviews that have at least 4-star ratings are labeled as positive, while vectors labeled as negative are generated from 1-star and 2-star reviews. 3-star reviews are used to prepare neutral class vectors. As a result, this complete set of vectors is uniformly labeled into three classes: positive, neutral, and negative. In addition, four subsets are obtained from the complete set: subset A contains 300 vectors, subset B contains 3,000 vectors, subset C contains 30,000 vectors, and subset D contains 300,000 vectors.
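The labeling rule and the drawing of a class-balanced subset can be sketched as follows. Interpreting "uniformly labeled" as an equal number of vectors per class is an assumption, and the pool of vectors is a hypothetical placeholder.

```python
import random


def review_label(stars):
    """Map a star rating to a review-level class."""
    if stars >= 4:
        return "positive"
    if stars == 3:
        return "neutral"
    return "negative"


def balanced_subset(vectors_by_class, size, rng):
    """Draw `size` vectors with the same number from every class (assumption)."""
    per_class = size // len(vectors_by_class)
    subset = []
    for label in sorted(vectors_by_class):
        subset.extend(rng.sample(vectors_by_class[label], per_class))
    return subset


rng = random.Random(0)
# Placeholder pool: 200 dummy vectors per class.
pool = {label: [(label, i) for i in range(200)]
        for label in ("positive", "neutral", "negative")}
subset_a = balanced_subset(pool, 300, rng)
print(len(subset_a))  # 300
```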

Figure 8 shows the F1 scores obtained on different sizes of vector sets. It can be clearly observed that the SVM model and the Naïve Bayesian model are identical in terms of their performance. Both models are generally superior to the Random Forest model on all vector sets. However, neither model reaches the level of performance that it achieves for sentence-level categorization, because of their relatively low performance on the neutral class.

Figure 8: F1 scores of review-level categorization.

The experimental result is promising, in terms of both the sentence-level categorization and the review-level categorization. It was observed that the averaged sentiment score is a strong feature by itself, since it is able to achieve an F1 score over 0.8 for the sentence-level categorization with the complete set. For the review-level categorization with the complete set, the feature is capable of producing an F1 score that is over 0.73. However, there are still a couple of limitations to this study. The first is that review-level categorization becomes difficult if we want to classify reviews according to their specific star-scaled ratings. In such experiments the F1 scores are fairly low, with values lower than 0.5. The second limitation is that, since the sentiment analysis scheme proposed in this study relies on the occurrence of sentiment tokens, the scheme may not work well for reviews that contain purely implicit sentiments. An implicit sentiment is usually conveyed through neutral words, making judgement of its sentiment polarity difficult. For example, a sentence like “Item as described.”, which frequently appears in positive reviews, consists of only neutral words.

With those limitations in mind, our future work is to focus on solving those issues. Specifically, more features will be extracted and grouped into feature vectors to improve review-level categorizations. For the issue of implicit sentiment analysis, our next step is to be able to detect the existence of such sentiment within the scope of a particular product. More future work includes testing our categorization scheme using other datasets.

Sentiment analysis or opinion mining is a field of study that analyzes people’s sentiments, attitudes, or emotions towards certain entities. This paper tackles a fundamental problem of sentiment analysis, sentiment polarity categorization. Online product reviews from Amazon.com are selected as data used for this study. A sentiment polarity categorization process (Figure 2 ) has been proposed along with detailed descriptions of each step. Experiments for both sentence-level categorization and review-level categorization have been performed.

Software used for this study is scikit-learn [ 33 ], an open source machine learning software package in Python. The classification models selected for categorization are: Naïve Bayesian, Random Forest, and Support Vector Machine [ 32 ].

Naïve Bayesian classifier

The Naïve Bayesian classifier works as follows: Suppose that there exists a set of training data, \(D\), in which each tuple is represented by an \(n\)-dimensional feature vector, \(X = x_{1}, x_{2}, \ldots, x_{n}\), indicating \(n\) measurements made on the tuple from \(n\) attributes or features. Assume that there are \(m\) classes, \(C_{1}, C_{2}, \ldots, C_{m}\). Given a tuple \(X\), the classifier will predict that \(X\) belongs to \(C_{i}\) if and only if \(P(C_{i}|X) > P(C_{j}|X)\), where \(i, j \in [1, m]\) and \(i \neq j\). \(P(C_{i}|X)\) is computed as:
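The decision rule can be illustrated with a minimal categorical naive Bayes in plain Python. This sketch assumes the standard textbook form, in which the score of a class is its prior multiplied by the per-attribute conditional probabilities, with \(P(X)\) dropped because it is the same for every class. It is not the scikit-learn implementation used in the experiments, it applies no smoothing, and the attributes and data are hypothetical.

```python
from collections import Counter, defaultdict


def train_naive_bayes(rows, labels):
    """Count class frequencies and per-class attribute-value frequencies."""
    class_counts = Counter(labels)
    # value_counts[label][k][value] = tuples of `label` whose k-th attribute is `value`
    value_counts = defaultdict(lambda: defaultdict(Counter))
    for row, label in zip(rows, labels):
        for k, value in enumerate(row):
            value_counts[label][k][value] += 1
    return class_counts, value_counts


def predict(model, x):
    """Return the class with the largest P(C) * prod_k P(x_k | C)."""
    class_counts, value_counts = model
    total = sum(class_counts.values())
    best_label, best_score = None, -1.0
    for label, count in class_counts.items():
        score = count / total  # prior P(C)
        for k, value in enumerate(x):
            score *= value_counts[label][k][value] / count  # P(x_k | C)
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Hypothetical attributes: (contains a positive token?, score bucket).
rows = [("yes", "high"), ("yes", "high"), ("yes", "low"),
        ("no", "low"), ("no", "low"), ("no", "high")]
labels = ["positive"] * 3 + ["negative"] * 3
model = train_naive_bayes(rows, labels)
print(predict(model, ("yes", "low")))  # positive
```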

Random forest

The random forest classifier was chosen due to its superior performance over a single decision tree with respect to accuracy. It is essentially an ensemble method based on bagging. The classifier works as follows: Given \(D\), the classifier first creates \(k\) bootstrap samples of \(D\), each denoted \(D_{i}\). Each \(D_{i}\) has the same number of tuples as \(D\), sampled with replacement from \(D\). Sampling with replacement means that some of the original tuples of \(D\) may not be included in \(D_{i}\), whereas others may occur more than once. The classifier then constructs a decision tree based on each \(D_{i}\). As a result, a “forest” that consists of \(k\) decision trees is formed. To classify an unknown tuple, \(X\), each tree returns its class prediction, which counts as one vote. \(X\) is assigned to the class that receives the most votes.
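The two ingredients described above, bootstrap sampling and majority voting, can be sketched in a few lines of Python. Tree construction is omitted, and the function names and data are hypothetical.

```python
import random
from collections import Counter


def bootstrap_sample(data, rng):
    """Draw len(data) tuples from data with replacement."""
    return [rng.choice(data) for _ in data]


def majority_vote(predictions):
    """Return the class predicted by the largest number of trees."""
    return Counter(predictions).most_common(1)[0][0]


rng = random.Random(0)
data = list(range(100))
sample = bootstrap_sample(data, rng)
# Same size as the original data, but some tuples repeat and others are missing.
print(len(sample), len(set(sample)) < len(data))

# Votes from three hypothetical trees for one unknown tuple.
print(majority_vote(["positive", "negative", "positive"]))  # positive
```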

The decision tree algorithm implemented in scikit-learn is CART (Classification and Regression Trees). CART uses the Gini index for its tree induction. For \(D\), the Gini index is computed as:

\[Gini(D) = 1 - \sum_{i=1}^{m} p_{i}^{2} \tag{6}\]

where \(p_{i}\) is the probability that a tuple in \(D\) belongs to class \(C_{i}\). The Gini index measures the impurity of \(D\). The lower the index value is, the better \(D\) was partitioned. For detailed descriptions of CART, please see [ 32 ].
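As a small numerical illustration, the sketch below computes the Gini index of a list of class labels, taking it as one minus the sum of squared class probabilities, and the size-weighted Gini index of a candidate split, which is the quantity CART compares across splits. The function names and label lists are hypothetical.

```python
from collections import Counter


def gini_index(labels):
    """Impurity of a set of class labels: 1 - sum of squared class probabilities."""
    total = len(labels)
    return 1.0 - sum((count / total) ** 2 for count in Counter(labels).values())


def gini_split(partitions):
    """Size-weighted Gini index of the partitions produced by a split."""
    total = sum(len(part) for part in partitions)
    return sum(len(part) / total * gini_index(part) for part in partitions)


print(gini_index(["positive"] * 3 + ["negative"]))  # 0.375
print(gini_split([["a", "a"], ["b", "b"]]))  # 0.0
print(gini_split([["a", "b"], ["a", "b"]]))  # 0.5
```

A pure partition scores 0, so the split that separates the two classes completely is preferred over the one that leaves both partitions evenly mixed.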

Support vector machine

Support vector machine (SVM) is a method for the classification of both linear and nonlinear data. If the data is linearly separable, the SVM searches for the linear optimal separating hyperplane (the linear kernel), which is a decision boundary that separates data of one class from another. Mathematically, a separating hyperplane can be written as \(W \cdot X + b = 0\), where \(W = w_{1}, w_{2}, \ldots, w_{n}\) is a weight vector, \(X\) is a training tuple, and \(b\) is a scalar. In order to optimize the hyperplane, the problem essentially transforms to the minimization of \(\|W\|\), which is eventually computed as \(\sum \limits _{i=1}^{n} \alpha _{i} y_{i} x_{i}\), where \(\alpha_{i}\) are numeric parameters and \(y_{i}\) are labels based on support vectors, \(X_{i}\). That is: if \(y_{i} = 1\) then \(\sum \limits _{i=1}^{n} w_{i}x_{i} \geq 1\); if \(y_{i} = -1\) then \(\sum \limits _{i=1}^{n} w_{i}x_{i} \leq -1\).
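For a hyperplane that has already been found, classification reduces to checking which side of \(W \cdot X + b = 0\) a tuple falls on. The sketch below illustrates this together with the usual margin condition \(y (W \cdot X + b) \geq 1\). The weights and points are hypothetical, and training is not shown.

```python
def decision_value(w, x, b):
    """Signed score W . X + b of a tuple relative to the hyperplane."""
    return sum(w_i * x_i for w_i, x_i in zip(w, x)) + b


def classify(w, x, b):
    """Assign class +1 or -1 according to the side of the hyperplane."""
    return 1 if decision_value(w, x, b) >= 0 else -1


def satisfies_margin(w, x, b, y):
    """Check the margin condition y * (W . X + b) >= 1 for a training tuple."""
    return y * decision_value(w, x, b) >= 1


# Hypothetical hyperplane x1 - x2 = 0 in two dimensions.
w, b = [1.0, -1.0], 0.0
print(classify(w, [3.0, 1.0], b))   # 1
print(classify(w, [1.0, 3.0], b))   # -1
print(satisfies_margin(w, [3.0, 1.0], b, 1))  # True
```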

If the data is linearly inseparable, the SVM uses a nonlinear mapping to transform the data into a higher dimension. It then solves the problem by finding a linear hyperplane. Functions that perform such transformations are called kernel functions. The kernel function selected for our experiment is the Gaussian Radial Basis Function (RBF):

where \(X_{i}\) are support vectors, \(X_{j}\) are testing tuples, and \(\gamma\) is a free parameter that uses the default value from scikit-learn in our experiment. Figure 9 shows a classification example of SVM based on the linear kernel and the RBF kernel.
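The following sketch evaluates a Gaussian RBF kernel for two vectors. It uses the convention documented by scikit-learn, \(\exp(-\gamma \|x - x'\|^{2})\). The exact scaling in the paper's own equation may differ, so the form, the function name, and the values of \(\gamma\) below are assumptions for illustration.

```python
import math


def rbf_kernel(x_i, x_j, gamma):
    """Gaussian RBF similarity exp(-gamma * ||x_i - x_j||^2) (assumed form)."""
    squared_distance = sum((a - b) ** 2 for a, b in zip(x_i, x_j))
    return math.exp(-gamma * squared_distance)


# Identical vectors have similarity 1; similarity decays with distance.
print(rbf_kernel([1.0, 2.0], [1.0, 2.0], gamma=0.5))  # 1.0
print(rbf_kernel([0.0, 0.0], [1.0, 1.0], gamma=0.5))  # exp(-1), about 0.368
```

A larger \(\gamma\) makes the similarity fall off faster with distance, which gives each support vector a more local influence.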

Figure 9: A classification example of SVM.

a Even though there are papers talking about spam on Amazon.com, we still contend that it is a relatively spam-free website in terms of reviews because of the enforcement of its review inspection process.

b The product review data used for this work can be downloaded at: http://www.itk.ilstu.edu/faculty/xfang13/amazon_data.htm .

[1] Kim S-M, Hovy E (2004) Determining the sentiment of opinions. In: Proceedings of the 20th international conference on Computational Linguistics, page 1367. Association for Computational Linguistics, Stroudsburg, PA, USA.

[2] Liu B (2010) Sentiment analysis and subjectivity. In: Handbook of Natural Language Processing, Second Edition. Taylor and Francis Group, Boca.

[3] Liu B, Hu M, Cheng J (2005) Opinion observer: Analyzing and comparing opinions on the web. In: Proceedings of the 14th International Conference on World Wide Web, WWW ’05, 342–351. ACM, New York, NY, USA.

[4] Pak A, Paroubek P (2010) Twitter as a corpus for sentiment analysis and opinion mining. In: Proceedings of the Seventh conference on International Language Resources and Evaluation. European Languages Resources Association, Valletta, Malta.

[5] Pang B, Lee L (2004) A sentimental education: Sentiment analysis using subjectivity summarization based on minimum cuts. In: Proceedings of the 42nd Annual Meeting on Association for Computational Linguistics, ACL ’04. Association for Computational Linguistics, Stroudsburg, PA, USA.

[6] Pang B, Lee L (2008) Opinion mining and sentiment analysis. Found Trends Inf Retr 2(1-2): 1–135.

[7] Turney PD (2002) Thumbs up or thumbs down?: Semantic orientation applied to unsupervised classification of reviews. In: Proceedings of the 40th Annual Meeting on Association for Computational Linguistics, ACL ’02, 417–424. Association for Computational Linguistics, Stroudsburg, PA, USA.

[8] Whitelaw C, Garg N, Argamon S (2005) Using appraisal groups for sentiment analysis. In: Proceedings of the 14th ACM International Conference on Information and Knowledge Management, CIKM ’05, 625–631. ACM, New York, NY, USA.

[9] Twitter (2014) Twitter apis. https://dev.twitter.com/start.

[10] Liu B (2014) The science of detecting fake reviews. http://content26.com/blog/bing-liu-the-science-of-detecting-fake-reviews/.

[11] Jindal N, Liu B (2008) Opinion spam and analysis. In: Proceedings of the 2008 International Conference on Web Search and Data Mining, WSDM ’08, 219–230. ACM, New York, NY, USA.

[12] Mukherjee A, Liu B, Glance N (2012) Spotting fake reviewer groups in consumer reviews. In: Proceedings of the 21st International Conference on World Wide Web, WWW ’12, 191–200. ACM, New York, NY, USA.

[13] Stanford (2014) Sentiment 140. http://www.sentiment140.com/.

[14] www.amazon.com.

[15] Go A, Bhayani R, Huang L (2009) Twitter sentiment classification using distant supervision, 1–12. CS224N Project Report, Stanford.

[16] Lin Y, Zhang J, Wang X, Zhou A (2012) An information theoretic approach to sentiment polarity classification. In: Proceedings of the 2nd Joint WICOW/AIRWeb Workshop on Web Quality, WebQuality ’12, 35–40. ACM, New York, NY, USA.

[17] Sarvabhotla K, Pingali P, Varma V (2011) Sentiment classification: a lexical similarity based approach for extracting subjectivity in documents. Inf Retrieval 14(3): 337–353.

[18] Wilson T, Wiebe J, Hoffmann P (2005) Recognizing contextual polarity in phrase-level sentiment analysis. In: Proceedings of the conference on human language technology and empirical methods in natural language processing, 347–354. Association for Computational Linguistics, Stroudsburg, PA, USA.

[19] Yu H, Hatzivassiloglou V (2003) Towards answering opinion questions: Separating facts from opinions and identifying the polarity of opinion sentences. In: Proceedings of the 2003 conference on Empirical methods in natural language processing, 129–136. Association for Computational Linguistics, Stroudsburg, PA, USA.

[20] Zhang Y, Xiang X, Yin C, Shang L (2013) Parallel sentiment polarity classification method with substring feature reduction. In: Trends and Applications in Knowledge Discovery and Data Mining, volume 7867 of Lecture Notes in Computer Science, 121–132. Springer Berlin Heidelberg, Heidelberg, Germany.

[21] Zhou S, Chen Q, Wang X (2013) Active deep learning method for semi-supervised sentiment classification. Neurocomputing 120(0): 536–546. Image Feature Detection and Description.

[22] Chesley P, Vincent B, Xu L, Srihari RK (2006) Using verbs and adjectives to automatically classify blog sentiment. Training 580(263): 233.

[23] Choi Y, Cardie C (2009) Adapting a polarity lexicon using integer linear programming for domain-specific sentiment classification. In: Proceedings of the 2009 Conference on Empirical Methods in Natural Language Processing: Volume 2 - Volume 2, EMNLP ’09, 590–598. Association for Computational Linguistics, Stroudsburg, PA, USA.

[24] Jiang L, Yu M, Zhou M, Liu X, Zhao T (2011) Target-dependent twitter sentiment classification. In: Proceedings of the 49th Annual Meeting of the Association for Computational Linguistics: Human Language Technologies-Volume 1, 151–160. Association for Computational Linguistics, Stroudsburg, PA, USA.

[25] Tan LK-W, Na J-C, Theng Y-L, Chang K (2011) Sentence-level sentiment polarity classification using a linguistic approach. In: Digital Libraries: For Cultural Heritage, Knowledge Dissemination, and Future Creation, 77–87. Springer, Heidelberg, Germany.

[26] Liu B (2012) Sentiment Analysis and Opinion Mining. Synthesis Lectures on Human Language Technologies. Morgan & Claypool Publishers.

[27] Hu M, Liu B (2004) Mining and summarizing customer reviews. In: Proceedings of the tenth ACM SIGKDD international conference on Knowledge discovery and data mining, 168–177. ACM, New York, NY, USA.

[28] Gann W-JK, Day J, Zhou S (2014) Twitter analytics for insider trading fraud detection system. In: Proceedings of the second ASE international conference on Big Data. ASE.

[29] Roth D, Zelenko D (1998) Part of speech tagging using a network of linear separators. In: Coling-Acl, The 17th International Conference on Computational Linguistics, 1136–1142.

[30] Kristina T (2003) Stanford log-linear part-of-speech tagger. http://nlp.stanford.edu/software/tagger.shtml.

[31] Marcus M (1996) Upenn part of speech tagger. http://www.cis.upenn.edu/~treebank/home.html.

[32] Han J, Kamber M, Pei J (2006) Data Mining: Concepts and Techniques, Second Edition (The Morgan Kaufmann Series in Data Management Systems), 2nd ed. Morgan Kaufmann, San Francisco, CA, USA.

[33] (2014) Scikit-learn. http://scikit-learn.org/stable/.

Acknowledgements

This research was partially supported by the following grants: NSF No. 1137443, NSF No. 1247663, NSF No. 1238767, DoD No. W911NF-13-0130, DoD No. W911NF-14-1-0119, and the Data Science Fellowship Award by the National Consortium for Data Science.

Author information

Authors and affiliations.

Department of Computer Science, North Carolina A&T State University, Greensboro, NC, USA

Xing Fang & Justin Zhan

Corresponding author

Correspondence to Xing Fang .

Additional information

Competing interests.

The authors declare that they have no competing interests.

Authors’ contributions

XF performed the primary literature review, data collection, and experiments, and also drafted the manuscript. JZ worked with XF to develop the article’s framework and focus. All authors read and approved the final manuscript.

Authors’ information

Xing Fang is a Ph.D. candidate at the Department of Computer Science, North Carolina A&T State University. His research interests include social computing, machine learning, and natural language processing. Mr. Fang holds one Master’s degree in computer science from North Carolina A&T State University, and one Baccalaureate degree in electronic engineering from Northwestern Polytechnical University, Xi’an, China.

Dr. Justin Zhan is an associate professor at the Department of Computer Science, North Carolina A&T State University. He has previously been a faculty member at Carnegie Mellon University and National Center for the Protection of Financial Infrastructure in Dakota State University. His research interests include Big Data, Information Assurance, Social Computing, and Health Science.

Rights and permissions

Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License ( https://creativecommons.org/licenses/by/4.0 ), which permits use, duplication, adaptation, distribution, and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.

About this article

Cite this article.

Fang, X., Zhan, J. Sentiment analysis using product review data. Journal of Big Data 2 , 5 (2015). https://doi.org/10.1186/s40537-015-0015-2

Received : 12 January 2015

Accepted : 20 April 2015

Published : 16 June 2015

DOI : https://doi.org/10.1186/s40537-015-0015-2

  • Sentiment analysis; Sentiment polarity categorization; Natural language processing; Product reviews

Survey on sentiment analysis: evolution of research methods and topics

Jingfeng Cui

1 Institute of High Performance Computing, A*STAR, 1 Fusionopolis Way, Singapore, 138632 Singapore

2 School of Information Management, Nanjing Agricultural University, 1 Weigang, Nanjing, 210095 China

Zhaoxia Wang

3 School of Computing and Information Systems, Singapore Management University, 80 Stamford Rd, Singapore, 178902 Singapore

Seng-Beng Ho

Erik Cambria

4 School of Computer Science and Engineering, Nanyang Technological University, 50 Nanyang Avenue, Singapore, 639798 Singapore

Associated Data

The data that support the findings of this study are available from the corresponding author upon reasonable request.

Sentiment analysis, one of the research hotspots in the natural language processing field, has attracted the attention of researchers, and research papers on the field are increasingly published. Many literature reviews on sentiment analysis involving techniques, methods, and applications have been produced using different survey methodologies and tools, but there has not been a survey dedicated to the evolution of research methods and topics of sentiment analysis. There have also been few survey works leveraging keyword co-occurrence on sentiment analysis. Therefore, this study presents a survey of sentiment analysis focusing on the evolution of research methods and topics. It incorporates keyword co-occurrence analysis with a community detection algorithm. This survey not only compares and analyzes the connections between research methods and topics over the past two decades but also uncovers the hotspots and trends over time, thus providing guidance for researchers. Furthermore, this paper presents broad practical insights into the methods and topics of sentiment analysis, while also identifying technical directions, limitations, and future work.

Introduction

Web 2.0 has driven the proliferation of user-generated content on the Internet. This content is closely related to the lives, emotions, and opinions of users. Therefore, analysis of this user-generated data is beneficial for monitoring public opinion and assisting in making decisions. Sentiment analysis, as one of the most popular applications of text-based analytics, can be used to mine people’s attitudes, emotions, appraisals, and opinions about issues, entities, topics, events, and products (Cambria et al. 2022a , b , c , d ; Injadat et al. 2016 ; Jiang et al. 2017 ; Liang et al. 2022 ; Oueslati et al. 2020 ; Piryani et al. 2017 ). Sentiment analysis can help us interpret emotions in unstructured texts as positive, negative, or neutral, and even calculate how strong or weak the emotions are. Today, sentiment analysis is widely used in various fields, such as business, finance, politics, education, and services. This analytical technique has gained broad acceptance not only among researchers but also among governments, institutions, and companies (Khatua et al. 2020 ; Liu et al. 2012 ; Sánchez-Rada and Iglesias 2019 ; Wang et al. 2020b ). It helps policy leaders, businessmen, and service people make better decisions.

The majority of user-generated content is unstructured text, which greatly increases the difficulty of sentiment analysis. Since 2000, researchers have been exploring techniques and methods to enhance the accuracy of such analysis. The popularity of social media platforms has brought people around the world closer together. With the continuous advancement of technology, the research topics, application fields, and core methods and technologies of sentiment analysis are also constantly changing.

Comparing and analyzing papers from specific disciplines can help researchers gain a comprehensive understanding of the field. There have been many surveys on sentiment analysis (Nair et al. 2019 ; Obiedat et al. 2021 ; Raghuvanshi and Patil 2016 ). However, there is a lack of adequate discussion on the connections between research methods and topics in the field, as well as on their evolution over time. In 1983, Callon et al. proposed co-word analysis (Callon et al. 1983 ). It can effectively reflect the correlation strength of information items in text data. Co-word analysis based on the frequency of co-occurrence of keywords used to describe papers can reveal the core contents of the research in specific fields. An evolutionary analysis of the associations between core contents is helpful for a comprehensive understanding of the research hotspots and frontiers in the field (Deng et al. 2021 ). It can provide guidance for researchers, especially those who are new to the field, and help them determine research directions, avoid repetitive research, and better discover and grasp the research trends in this field (Wang et al. 2012 ). To fill in the gap in existing research, we conduct keyword co-occurrence analysis and evolution analysis with informetric tools to explore the research hotspots and trends of sentiment analysis.

The main contributions of this survey are as follows:

  • Using keyword co-occurrence analysis and informetric tools, the paper presents a survey on sentiment analysis that explores and uncovers useful information.
  • A keyword co-occurrence network is constructed by combining the paper title, abstract, and author keywords. Through the keyword co-occurrence network and community detection algorithm, the research methods and topics in the field of sentiment analysis, along with their evolution in the past two decades, are discussed.
  • The paper summarizes the research hotspots and trends in sentiment analysis. It also highlights practical implications and technical directions.

The remainder of this paper is organized as follows: In Sect.  2 , we summarize and analyze the existing surveys on sentiment analysis and present the research purpose and methodologies of this paper. Section  3 details the survey methodology, including the collection and processing of scientific publications, visualization, and analysis using different methods and tools. In Sect.  4 , we analyze the results obtained from the keyword co-occurrence analysis and evolution analysis, along with the research hotspots and trends in sentiment analysis identified through the analysis results. Finally, in Sect.  5 , we summarize the research conclusions as well as the practical implications and technical directions of sentiment analysis. We also clarify the limitations of this paper and make suggestions for future work.

Existing surveys on sentiment analysis

Sentiment analysis is a concept encompassing many tasks, such as sentiment extraction, sentiment classification, opinion summarization, review analysis, sarcasm detection, and emotion detection. Since the 2000s, sentiment analysis has become a popular research field in natural language processing (Hussein 2018). In the existing surveys, researchers mainly conducted specific analyses of the tasks, technologies, methods, analysis granularity, and application fields involved in the sentiment analysis process.

Surveys on contents and topics of sentiment analysis

When research on sentiment analysis was still in its infancy, the contents and topics of surveys mainly focused on sentiment analysis tasks, analysis granularity, and application areas. Kumar et al. reviewed the basic terms, tasks, and levels of granularity related to sentiment analysis (Kumar and Sebastian 2012). They also discussed some key feature selection techniques and the applications of sentiment analysis in business, politics, recommender systems and other fields. Nassirtoussi et al. explored the application of sentiment analysis in market prediction (Nassirtoussi et al. 2014). Medhat et al. analyzed the improvement of the algorithms proposed in 2010–2013 and their application fields (Medhat et al. 2014). Ravi et al. analyzed the papers related to opinion mining and sentiment analysis from 2002 to 2015. Their study mainly discussed the necessary tasks, methods, applications, and unsolved problems in the field of sentiment analysis (Ravi and Ravi 2015).

Existing surveys of the applications of sentiment analysis have focused more on the domains of market research, medicine, and social media in recent years. Rambocas et al. examined the application of sentiment analysis in marketing research from three main perspectives, including the unit of analysis, sampling design, and methods used in sentiment detection and statistical analysis (Rambocas and Pacheco 2018 ). Cheng et al. summarized techniques based on semantic, sentiment, and event extraction, as well as hybrid methods employed in stock forecasting (Cheng et al. 2022 ). Yue et al. categorized and compared a large number of techniques and approaches in the social media domain. That study also introduced different types of data and advanced research tools, and discussed their limitations (Yue et al. 2019 ). In the context of the COVID-19 epidemic, Alamoodi et al. reviewed and analyzed articles on the occurrence of different types of infectious diseases in the past 10 years. They reviewed the applications of sentiment analysis from the identified 28 articles, summarizing the adopted techniques such as dictionary-based models, machine learning models, and mixed models (Alamoodi et al. 2021b ); Alamoodi et al. also conducted a review of the applications of sentiment analysis for vaccine hesitancy (Alamoodi et al. 2021a ). Researchers also reviewed the application of sentiment analysis in the fields of election prediction (Brito et al. 2021 ), education (Kastrati et al. 2021 ; Zhou and Ye 2020 ) and service industries (Adak et al. 2022 ).

A considerable number of surveys investigated sentiment analysis in non-English languages, including Chinese (Peng et al. 2017), Arabic (Al-Ayyoub et al. 2019; Boudad et al. 2018; Nassif et al. 2021; Oueslati et al. 2020), Urdu (Khattak et al. 2021), Spanish (Angel et al. 2021), and Portuguese (Pereira 2021). They mainly reviewed the classification frameworks of the sentiment analysis process, the supporting language resources (dictionaries, natural language processing tools, corpora, ontologies, etc.), and the deep learning models used (CNN, RNN, and transfer learning) for each of the languages involved.

Surveys on methods of sentiment analysis

Before machine learning technology became mature, researchers were particularly concerned about feature extraction methods. For example, Feldman summarized methods for extracting preferred entities from indirect opinions and methods for dictionary acquisition (Feldman 2013). Asghar et al. reviewed the natural language processing techniques for extracting features based on part of speech and term position; statistical techniques for extracting features based on word frequency and decision tree model; and techniques for combining part of speech tagging, syntactic feature analysis, and dictionaries (Asghar et al. 2014). Koto et al. discussed the best features for Twitter sentiment analysis prior to 2014 by comparing 9 feature sets (Koto and Adriani 2015). They found that the best features for sentiment analysis of Twitter texts at that time were AFINN (a list of English terms used for sentiment analysis manually rated by Finn Årup Nielsen) (Nielsen 2011) and Senti-Strength (Thelwall et al. 2012). Taboada sorted out the characteristics of words, phrases, and sentence patterns in sentiment analysis from the perspective of linguistics (Taboada 2016). In addition, Schouten and Frasincar conducted a comprehensive and in-depth critical evaluation of 15 sentiment analysis web tools (Schouten and Frasincar 2015). Medhat et al. (2014) and Ravi et al. (Ravi and Ravi 2015) also analyzed the early algorithms for sentiment analysis.

In the study by Schouten et al., the authors focused on aspect-level sentiment analysis, combing the techniques of aspect-level sentiment analysis before 2014, such as frequency-based, syntax-based, supervised machine learning, unsupervised machine learning, and hybrid approaches. They concluded that the latest technology was moving beyond the early stages (Schouten and Frasincar 2015 ). As research into sentiment analysis became more and more popular and there was important progress made in the development of deep learning technologies, researchers started to pay more attention to the techniques and methods of sentiment analysis. Deep learning methods in particular became the focus of discussions among researchers.

Prabha et al. analyzed various deep learning methods used in different applications at the level of sentence and aspect/object sentiment analysis, including Convolutional Neural Network (CNN), Recurrent Neural Network (RNN), and Long Short-term Memory (LSTM) (Prabha and Srikanth 2019 ). They discussed the advantages and disadvantages of these methods and their performance parameters. Ain et al. introduced deep learning techniques such as Deep Neural Network (DNN), CNN and Deep Belief Network (DBN) to solve sentiment analysis tasks like sentiment classification, cross-lingual problems, and product review analysis (Ain et al. 2017 ). Zhang et al. investigated deep learning and machine learning techniques for sentiment analysis in the contexts of aspect extraction and categorization, opinion expression extraction, opinion holder extraction, sarcasm analysis, multimodal data, etc. (Zhang et al. 2018 ). Habimana et al. compared the performance of deep learning methods on specific datasets and proposed that performance could be improved using models including Bidirectional Encoder Representations from Transformers (BERT), sentiment-specific word embedding models, cognitive-based attention models, and commonsense knowledge (Habimana et al. 2020 ). Wang et al. reviewed and discussed existing analytical models for sentiment classification and proposed a computational emotion-sensing model (Wang et al. 2020b ).

Some researchers also discussed web tools (Zucco et al. 2020 ), fuzzy logic algorithms (Serrano-Guerrero et al. 2021 ), transformer models (Acheampong et al. 2021 ), and sequential transfer learning (Chan et al. 2022 ) for sentiment analysis.

Overall survey methodology

With the increase in the popularity of sentiment analysis research, more related research results began to accumulate. Researchers needed to systematically organize and analyze results from a large number of publications to perform literature reviews. They used different survey methodologies to conduct surveys of a large number of papers.

Content analysis is a powerful approach to characterizing the contents of each study by carefully reading its content and manually identifying, coding, and organizing key information in it. A literature review is formed as a result of the repeated use of this approach (Elo and Kyngäs 2008 ; Stemler 2000 ). Content analysis has been used for different studies and systematic reviews (Qazi et al. 2015 , 2017 ). For example, Birjali et al. have studied the most commonly used classification techniques in sentiment analysis from a large amount of literature and introduced the application areas and sentiment classification processes, including preprocessing and feature selection (Birjali et al. 2021 ). They conducted a comprehensive analysis of the papers, discovering that supervised machine learning algorithms are the most commonly used techniques in the field. A complete review of methods and evaluation for sentiment analysis tasks and their applications was conducted by Wankhade et al. ( 2022 ). They compared the strengths and weaknesses of the methods, and discussed the future challenges of sentiment analysis in terms of both the methods and the forms of the data. Although this method can review the research contents and penetrate into the cores of the papers most systematically, it requires a considerable amount of manpower and time for in-depth literature reading.

The systematic literature review guideline proposed by Kitchenham and Charters has gradually attracted the attention of researchers (Kitchenham 2004 ; Kitchenham and Charters 2007 ; Sarsam et al. 2020 ). This review process is divided into six stages: research question definition, search strategy formulation, inclusion and exclusion criteria definition, quality assessment, data extraction, and data synthesis. Researchers can eliminate a large number of retrieved papers by using this standard process and finally conducting further analysis and research on a small number of papers. Kumar et al. reviewed context-based sentiment analysis in social multimedia between 2006 and 2018. From the 573 papers retrieved in the initial search, they finally selected 37 papers to use in discussing sentiment analysis techniques (Kumar and Garg 2020 ). This approach was also used by Kumar et al. in their research on sentiment analysis on Twitter using soft computing techniques. They selected 60 articles out of 502 for follow-up analysis (Kumar and Jaiswal 2020 ). Zunic et al. selected 86 papers from 299 papers retrieved in the period 2011–2019 to discuss the application of sentiment analysis techniques in the field of health and well-being (Zunic et al. 2020 ); Ligthart et al. followed Kitchenham’s guideline and identified 14 secondary studies. They provided an overview of specific sentiment analysis tasks and of the features and methods required for different tasks (Ligthart et al. 2021 ). Obiedat (Obiedat et al. 2021 ), Angel (Angel et al. 2021 ) and Lin (Lin et al. 2022 ) also all followed this guideline to select literature for further analysis. This method can reduce the amount of literature that requires in-depth reading, but in the case of a large amount of literature, more effort is still required to search and screen the material than in traditional literature review methods (Kitchenham and Charters 2007 ).

There are also a few authors who have used informetric methods to review papers. Piryani et al. conducted an informetric analysis of research on opinion mining and sentiment analysis from 2000 to 2015 (Piryani et al. 2017 ). The authors used social network analysis, literature co-citation analysis, and other methods in the paper. They analyzed publication growth rates; the most productive countries, institutions, journals, and authors; and topic density maps and keyword bursts, among other elements. To a certain extent, they interpreted core authors, core papers, areas of research focus in this field, and the current state of national cooperation. In order to explore the application of sentiment analysis in building smart societies, Verma collected 353 papers published between 2010 and 2021 (Verma 2022 ). Using a topic analysis perspective combined with the Louvain algorithm, the author identified four sub-topics in the research field. Similarly, Mantyla et al. employed LDA techniques and manual classification to explore the topic structures of sentiment analysis articles (Mäntylä et al. 2018 ). The informetric methods use natural language processing technologies to intuitively conduct topic mining and analysis of a large number of papers. Through topic clustering, the literature is organized and analyzed, which reduces the time researchers spend on reading the literature in depth. These methods are suitable for exploring research topics and trends in the field.

Summary of advantages and disadvantages of the existing surveys

In the following, we discuss the advantages and disadvantages of the existing surveys from a number of different points of view.

From the point of view of the contents and topics of sentiment analysis

As summarized in Table 1, the researchers organized the literature and conducted in-depth investigations of the contents and topics of sentiment analysis. They reviewed the tasks of sentiment analysis (e.g., different text granularity, opinion mining, spam review detection, and emotion detection), the application areas of sentiment analysis (e.g., market, medicine, social media, and election prediction), and different languages for sentiment analysis, such as Chinese, Spanish, and Arabic (Adak et al. 2022; Al-Ayyoub et al. 2019; Alamoodi et al. 2021a, b; Alonso et al. 2021; Angel et al. 2021; Boudad et al. 2018; Brito et al. 2021; Cheng et al. 2022; Hussain et al. 2019; Kastrati et al. 2021; Khattak et al. 2021; Koto and Adriani 2015; Kumar and Sebastian 2012; Ligthart et al. 2021; Medhat et al. 2014; Nassif et al. 2021; Nassirtoussi et al. 2014; Oueslati et al. 2020; Peng et al. 2017; Pereira 2021; Rambocas and Pacheco 2018; Ravi and Ravi 2015; Schouten and Frasincar 2015; Sharma and Jain 2020; Yue et al. 2019; Zhou and Ye 2020). They summarized the methods and application prospects of sentiment analysis under different contents and topics. As the field has grown, new topics have emerged, and knowledge from other fields has been gradually integrated into it. In recent years, the popularity of social media has aroused increasing interest in sentiment analysis research, and the number of papers published, especially those related to different topics of sentiment analysis, has grown rapidly. However, the existing surveys cover a short time range, and there has not been a survey dedicated to the evolution of research contents or topics of sentiment analysis. There have also been few survey works analyzing the connections between topics and methods, or their evolution (e.g., how the contents and topics of sentiment analysis have changed over time).

Advantages and disadvantages of the existing surveys

From the point of view of the methods of sentiment analysis

Some researchers reviewed different techniques and methods of sentiment analysis in different application areas and tasks. They analyzed and discussed sentiment analysis methods based on lexicons, rules, part of speech, term position, statistical techniques, supervised and unsupervised machine learning methods, as well as deep learning methods like LSTM, CNN, RNN, DNN, DBN, BERT, and other hybrid approaches (Acheampong et al. 2021; Ain et al. 2017; Alamoodi et al. 2021b; Asghar et al. 2014; Chan et al. 2022; Cheng et al. 2022; Feldman 2013; Habimana et al. 2020; Koto and Adriani 2015; Kumar and Sebastian 2012; Medhat et al. 2014; Prabha and Srikanth 2019; Ravi and Ravi 2015; Schouten and Frasincar 2015; Serrano-Guerrero et al. 2021; Taboada 2016; Wang et al. 2020b; Yue et al. 2019; Zhang et al. 2018; Zucco et al. 2020). These researchers also compared the advantages and disadvantages of each method. As summarized in Table 1, even though existing surveys analyze the techniques and methods of sentiment analysis and provide good insights, there has not been a survey that analyzes the evolution of research methods over time. There have also been few survey works that focus on the connections between topics and methods of sentiment analysis, and their evolution over time.

From the point of view of the overall survey methodology

The survey methods used have mainly been the content analysis method, Kitchenham and Charters' guideline, and the informetric methods. As summarized in Table 1, the content analysis method can effectively analyze the contents of research papers in depth, but it does not address the issue of the evolution of the research methods and topics (Bengtsson 2016; Birjali et al. 2021; Elo and Kyngäs 2008; Krippendorff 2018; Qazi et al. 2015, 2017; Wankhade et al. 2022). Although the number of papers that need to be read in depth can be reduced by following Kitchenham and Charters' guideline, more effort is needed to search and screen literature than in traditional literature review methods (Angel et al. 2021; Kitchenham 2004; Kitchenham and Charters 2007; Kumar and Garg 2020; Ligthart et al. 2021; Lin et al. 2022; Obiedat et al. 2021; Sarsam et al. 2020; Zunic et al. 2020). The informetric methods are best suited to investigating the research methods and topics of sentiment analysis (Bar-Ilan 2008; Mäntylä et al. 2018; Piryani et al. 2017; Santos et al. 2019; Verma 2022). There are three surveys using informetric techniques and tools that are well suited for analysis of a large number of papers over many years (Mäntylä et al. 2018; Piryani et al. 2017; Verma 2022). However, the evolution of research methods and topics of sentiment analysis over time has not been studied with informetric methods. There have also been few survey works that leverage keyword co-occurrence analysis and community detection to analyze the connections between research methods and topics, and their evolution over time.

Therefore, to address the gaps in the existing surveys, this study presents a survey on the research methods and topics, and their evolution over time. It combines keyword co-occurrence analysis and informetric analysis tools to reveal the methods and topics of sentiment analysis and their evolution in this field from 2002 to 2022.

The following section, Sect.  3 , describes our proposed survey methodology in detail.

The proposed survey methodology

This section describes our proposed survey methodology, including the collection of scientific publications, the processing of scientific publications, and visualization and analysis using different methods and tools. The overall scheme of this survey (Fig. 2) is also presented at the end of Sect. 3 to better visualize and summarize the proposed survey methodology.

Graphical representation of the overall scheme of this survey. Module A: Collection of scientific publications; Module B: Processing of scientific publications; Module C: Visualization and analysis using different methods and tools; Module D: Result analysis and discussions considering various aspects

Collection of scientific publications

We collected research data from the Web of Science platform. We used keywords such as "sentiment analysis," "sentiment mining," and "sentiment classification" to search for relevant papers as data samples. In examining the retrieved papers, we found that some paper topics, paper types, and publication journals were not related to sentiment analysis, so we excluded them. The papers we included were mainly related to the sentiment analysis of texts. We excluded papers on sentiment analysis related to image processing, video processing, speech processing, biological signal processing, etc. Therefore, the retrieval strategy was as follows:

Topic Search (TS) = ("sentiment analy*" or "sentiment mining" or "sentiment classification") AND Abstract (AB) = "sentiment" NOT TS = ("face image*" or "speech recognition" or "speech emotion" or "physiological signal*" or "music emotion*" or "facial feature extraction" or "video emotion" or "electroencephalography" or "biosignal*" or "image process*") NOT Title = ("facial" or "speech" or "sound*" or "face" or "dance" or "temperature" or "image*" or "spoken" or "electroencephalography" or "EEG" or "biosignal*" or "voice*") NOT AB = "facial".

Conference papers were given the same weight as journal papers. We chose four databases in the Web of Science: two conference citation databases (Conference Proceedings Citation Index—Social Sciences & Humanities [CPCI-SSH], and Conference Proceedings Citation Index—Science [CPCI-S]), and two journal citation databases (Science Citation Index Expanded [SCI-Expanded] and Social Sciences Citation Index [SSCI]). Given the various forms of words such as "analyzing" and "analysis," a truncated search technique (marked with an asterisk) was used to prevent the omission of relevant papers. The time frame of the retrieved papers was from January 2002 to January 2022, and the publication types of the papers included "article," "conference paper," "review," and "editorial material." A total of 9,714 papers were obtained from the four databases above. These included 3,809 articles, 5,633 proceedings papers, 267 reviews, and 5 pieces of editorial material from 2002 to 2022. Of these, 104 papers were from January 2022. The number of papers each year from 2002 to 2021 is shown in Fig. 1.

The number of papers each year from 2002 to 2021

Processing of scientific publications

In this process, our purpose was to extract the key contents of the papers, which are used to analyze the research methods and topics in the field of sentiment analysis. Because author keywords are few in number, they often cannot fully represent the key content of a paper. We found that combining them with the title and abstract better reflects the core information. Therefore, we combined the title, abstract, and author keywords of each paper and used KeyBERT 1 to extract keywords representing the main research method and topic of the paper. KeyBERT is a keyword extraction technique that uses BERT embeddings to find the keywords and key phrases that most closely resemble the document content (Grootendorst and Warmerdam 2021). The specific keyword extraction process was as follows:

First, we used KeyBERT to extract 8 keywords from each paper and eliminated keywords with a weight lower than 0.3. We then combined the extracted keywords with the author keywords and removed duplicates. After that, we standardized the whole collection of keywords and merged synonyms. Finally, we counted the frequency of the keywords and removed meaningless terms like "sentiment analysis," "sentiment classification," and "sentiment mining."
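The post-processing steps described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the extractor output is represented as plain (keyword, weight) pairs rather than produced by a real KeyBERT call, the synonym table is hypothetical because the paper does not list its merging rules, and normalization is applied before de-duplication for brevity. Only the limit of 8 keywords, the 0.3 threshold, and the three removed terms come from the text.

```python
from collections import Counter

MAX_EXTRACTED = 8   # keywords extracted per paper (from the text)
MIN_WEIGHT = 0.3    # keywords below this weight are dropped (from the text)

# Hypothetical synonym table; the actual merging rules are not given in the paper.
SYNONYMS = {"tweets": "twitter", "nlp": "natural language processing"}
# Terms removed as uninformative for a sentiment analysis corpus (from the text).
STOP_TERMS = {"sentiment analysis", "sentiment classification", "sentiment mining"}


def normalize(keyword):
    """Lower-case, collapse whitespace, and map a keyword to its canonical form."""
    cleaned = " ".join(keyword.lower().split())
    return SYNONYMS.get(cleaned, cleaned)


def paper_keywords(extracted, author_keywords):
    """Return the de-duplicated keyword set for one paper.

    extracted: list of (keyword, weight) pairs from a keyword extractor
    author_keywords: list of keywords supplied by the authors
    """
    top = sorted(extracted, key=lambda pair: pair[1], reverse=True)[:MAX_EXTRACTED]
    kept = [kw for kw, weight in top if weight >= MIN_WEIGHT]
    merged = {normalize(kw) for kw in kept + list(author_keywords)}
    return merged - STOP_TERMS


def corpus_frequencies(papers):
    """Count, for each keyword, the number of papers in which it appears."""
    counts = Counter()
    for extracted, author_keywords in papers:
        counts.update(paper_keywords(extracted, author_keywords))
    return counts
```

Because each paper contributes a set, a keyword is counted at most once per paper, which is one plausible reading of "word frequency" here.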

After statistical analysis, we obtained 41,827 keywords with a total word frequency of 88,104. Examining these keywords, which were drawn from 9,714 papers, we found that most of those with a word frequency below 10 were not representative of the research contents of sentiment analysis. As a result, a total of 685 representative keywords were retained for subsequent analysis. These keywords appeared a total of 30,801 times. Table 2 shows the 50 keywords with the highest word frequency.

Keywords with word frequency in the top 50

High-frequency keywords generally represent research hotspots. We therefore extracted high-frequency keywords to serve as the basis for the subsequent analysis. We found that most of the keywords with a word frequency of 18 or lower, such as "ranking," "mask," "experience," "affect," and "online forum," were not relevant to sentiment analysis. Therefore, the keywords with a word frequency higher than 18 were retained for analysis. These keywords appeared 25,429 times in the collected data, accounting for close to 83% of the 30,801 occurrences of the representative keywords. This yielded 275 keywords, which were used to analyze the main methods and topics of sentiment analysis.
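The frequency cut-off and the reported share can be checked with a short sketch. The helper names and the toy frequency table are hypothetical; only the threshold of 18 and the totals 25,429 and 30,801 are taken from the text.

```python
def retain_frequent(frequencies, threshold):
    """Keep keywords whose frequency is strictly higher than the threshold."""
    return {kw: n for kw, n in frequencies.items() if n > threshold}


def coverage(retained, frequencies):
    """Share of all keyword occurrences accounted for by the retained keywords."""
    return sum(retained.values()) / sum(frequencies.values())


# Toy frequency table (illustrative values, not data from the paper).
toy = {"twitter": 40, "lstm": 19, "mask": 18, "ranking": 3}
kept = retain_frequent(toy, threshold=18)   # "mask" and "ranking" are dropped

# Share implied by the totals reported in the text: about 0.826, i.e. close to 83%.
reported_share = 25_429 / 30_801
```

Under this reading, the 83% figure is relative to the 30,801 occurrences of the 685 representative keywords, not to the 88,104 occurrences of all extracted keywords.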

Visualization and analysis using different methods and tools

Analytical methods.

Keywords are the core natural language vocabulary used to express the subject, content, ideas, and research methods of the literature (You et al. 2021). Keywords represent the topics of the domain, and cluster analysis of these words can reflect the structure and association of topics. Keyword co-occurrence analysis counts how often keywords appear together in the same document. The strength and number of associations between research contents can be obtained through keyword co-occurrence analysis. Dividing research methods and topics into sub-communities helps researchers to analyze hotspots and trends in methods and topics, as well as to identify sub-fields of sentiment analysis research (Ding et al. 2001).
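As a minimal sketch of keyword co-occurrence counting, the function below counts, for every unordered pair of keywords, the number of papers in which both appear. The function name and the example keyword sets are hypothetical, and this is not the procedure implemented by any particular bibliometric tool.

```python
from collections import Counter
from itertools import combinations


def cooccurrence_counts(papers):
    """Count the papers in which each unordered keyword pair occurs together.

    papers: iterable of keyword collections, one per paper
    Returns a Counter keyed by alphabetically ordered (keyword_a, keyword_b).
    """
    pairs = Counter()
    for keywords in papers:
        # Sorting the de-duplicated keywords gives each pair one canonical key.
        for a, b in combinations(sorted(set(keywords)), 2):
            pairs[(a, b)] += 1
    return pairs


example = [
    {"twitter", "machine learning", "opinion mining"},
    {"twitter", "machine learning"},
    {"twitter"},
]
weights = cooccurrence_counts(example)
```

The resulting counts can serve as edge weights of a keyword co-occurrence network, with keywords as nodes.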

Visualization and analysis tools

BibExcel 2 is a software tool for analyzing bibliographic data or any text-based data formatted in a similar way (Persson 2017 ). The tool generates structured data files that can be read by Excel for subsequent processing (Persson et al. 2009 ). Our processing steps are as follows. First, we imported the standardized bibliographic data into BibExcel. This tool can help structure the data. Second, we checked and corrected the data and used BibExcel to count the number of co-occurrences of keywords.

We then used Pajek 3 software to visualize the keyword co-occurrence network and divide it into sub-communities. Pajek is a tool for analyzing large and complex networks (Batagelj and Andrej 2022; Batagelj and Mrvar 1998). It can calculate certain indicators to reveal the state and properties of the network involved. In addition, Pajek's Louvain community detection algorithm can help divide the keyword co-occurrence network into sub-communities, which represent sub-fields of sentiment analysis (Blondel et al. 2008; Leydesdorff et al. 2014; Rotta and Noack 2011). The Louvain community-detection algorithm unfolds a complete hierarchical community structure for the network. It has an advantage in subdividing different areas of study: multiple knowledge structures and details can be shown in one network (Deng et al. 2021).
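The Louvain method searches for a partition that maximizes modularity, a score comparing the edge weight inside communities with the weight expected if edges were placed at random. The sketch below only evaluates modularity for a given partition; it is not the Louvain search itself and not Pajek's implementation. The function name and the toy graph are hypothetical, and the graph is assumed to be undirected with no self-loops.

```python
from collections import defaultdict


def modularity(edges, community_of):
    """Modularity Q of a partition of a weighted, undirected graph.

    edges: dict mapping (u, v) -> weight, each undirected edge listed once
    community_of: dict mapping node -> community label
    """
    total_weight = sum(edges.values())
    internal = defaultdict(float)  # edge weight falling inside each community
    degree = defaultdict(float)    # summed weighted degree of each community
    for (u, v), weight in edges.items():
        degree[community_of[u]] += weight
        degree[community_of[v]] += weight
        if community_of[u] == community_of[v]:
            internal[community_of[u]] += weight
    return sum(
        internal[c] / total_weight - (degree[c] / (2 * total_weight)) ** 2
        for c in degree
    )


# Toy graph: two triangles joined by a single edge (illustrative only).
toy_edges = {
    ("a", "b"): 1, ("b", "c"): 1, ("a", "c"): 1,
    ("d", "e"): 1, ("e", "f"): 1, ("d", "f"): 1,
    ("c", "d"): 1,
}
two_groups = {"a": 0, "b": 0, "c": 0, "d": 1, "e": 1, "f": 1}
one_group = {node: 0 for node in two_groups}
```

Splitting the toy graph into its two triangles scores higher than keeping all nodes in one community, which is the kind of improvement a modularity-based method looks for when dividing a keyword network into sub-communities.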

After that, we applied VOSviewer 4 to optimize the visualization of sub-communities (Van Eck and Waltman 2010 ; VOSviewer 2021 ; Perianes-Rodriguez et al. 2016 ; Waltman and Van Eck 2013 ; Waltman et al. 2010 ). VOSviewer can help display the core keywords in each sub-community and the correlation between keywords. It can also reflect the closeness of the association between sub-communities. Finally, we used Excel to count the frequency of keywords for each year and to map the evolution of research methods and topics in the field of sentiment analysis.
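The yearly frequency counts used to map the evolution of methods and topics were produced in Excel; an equivalent tabulation can be sketched in Python as follows. The function names and the example records are hypothetical and serve only to show the shape of the computation.

```python
from collections import Counter, defaultdict


def yearly_keyword_counts(records):
    """Count, per publication year, the number of papers containing each keyword.

    records: iterable of (year, keywords) pairs, one per paper
    """
    by_year = defaultdict(Counter)
    for year, keywords in records:
        by_year[year].update(set(keywords))
    return by_year


def keyword_trend(by_year, keyword):
    """Return (year, count) pairs for one keyword in chronological order."""
    return [(year, by_year[year][keyword]) for year in sorted(by_year)]


# Illustrative records, not data from the paper.
records = [
    (2019, ["lstm", "twitter"]),
    (2019, ["twitter"]),
    (2021, ["bert", "twitter"]),
]
by_year = yearly_keyword_counts(records)
```

Plotting the output of `keyword_trend` for the keywords of one community gives a simple view of how that community's methods or topics rise and fall over time.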

Graphical representation of the overall scheme of this survey

This paper proposes and conducts a new research survey on sentiment analysis. The graphical representation of the overall scheme of this survey is shown in Fig.  2 . The main scheme includes four modules: Module A, Collection of scientific publications; Module B, Processing of scientific publications; Module C, Visualization and analysis through different methods and tools, and Module D, Result analysis and discussions based on various aspects.

In Module A, scientific publications are collected from the Web of Science (WOS) platform, as has been detailed in Sect.  3.1 Collection of scientific publications above. Module B, Processing of scientific publications, has been detailed in Sect.  3.2 above. It performs a data processing procedure to obtain key information, which includes all the representative keywords and high-frequency keywords. The title, abstract and keywords of the papers are used to extract such key information using KeyBERT (Grootendorst and Warmerdam 2021 ). Such key information is analyzed and visualized through different methods, including different visualization tools, as introduced in Sect.  3.3 (Module C), Visualization and analysis using different methods and tools, above.

In Module C, the number of co-occurrences of keywords is obtained using BibExcel (Persson 2017), and the co-occurrences of keywords are analyzed and visualized using Pajek (Blondel et al. 2008; Leydesdorff et al. 2014; Rotta and Noack 2011) and VOSviewer (Van Eck and Waltman 2010; VOSviewer 2021; Perianes-Rodriguez et al. 2016; Waltman and Van Eck 2013; Waltman et al. 2010). The keyword community network and the keyword community evolution are analyzed and visualized using these tools, as described in Sect. 3.3 (Module C), Visualization and analysis using different methods and tools. The results of the visualization and analysis in Module C are discussed in Module D, Result analysis and discussions, which is detailed in Sect. 4.

In the following section, Sect.  4 (Module D), results are analyzed and discussed considering various aspects, including the research methods and topics of sentiment analysis in each community, the evolution of research methods and topics along with the research hotspots and trends over time.

Results and analysis through various aspects

Research methods and topics of sentiment analysis

Overall characteristic analysis

The high-frequency keywords are presented in Table 2. These keywords can be regarded as the main research contents in the field of sentiment analysis. "Twitter" ranks at the top, followed by "opinion mining," "natural language processing," "machine learning," and so on. The high-frequency keywords cover the topics of the studies, the contents of the studies, and the techniques and methods used. Based on these keywords, we used Pajek’s Louvain method to construct a keyword co-occurrence network to represent the research methods and topics, as shown in Fig. 3. The keyword co-occurrence network is divided into six communities. The research methods and topics of the six communities include social media platforms (C1), machine learning methods (C2), natural language processing and deep learning methods (C3), opinion mining and text mining (C4), Arabic sentiment analysis (C5), and others, such as domain sentiment analysis and transfer learning (C6).
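The Louvain method groups nodes by greedily maximizing modularity, a score that compares the edge weight inside communities with what would be expected by chance. The sketch below does not implement Louvain and is not Pajek's code. It only evaluates the modularity of a given partition, using the common form Q = Σ_c [L_c / m − (D_c / 2m)²], where L_c is the edge weight inside community c, D_c is the total weighted degree of its nodes, and m is the total edge weight. The graph and the function name `modularity` are illustrative assumptions.

```python
from collections import defaultdict


def modularity(edges, community):
    """Modularity Q of a partition of an undirected weighted graph.

    edges: iterable of (u, v, weight) with u != v
    community: dict mapping each node to a community label
    """
    total = 0.0                    # m: total edge weight
    degree = defaultdict(float)    # weighted degree of each node
    internal = defaultdict(float)  # L_c: edge weight inside each community
    for u, v, w in edges:
        total += w
        degree[u] += w
        degree[v] += w
        if community[u] == community[v]:
            internal[community[u]] += w

    comm_degree = defaultdict(float)  # D_c: summed degree per community
    for node, k in degree.items():
        comm_degree[community[node]] += k

    return sum(
        internal[c] / total - (comm_degree[c] / (2 * total)) ** 2
        for c in comm_degree
    )


# Hypothetical graph: two triangles joined by a single edge.
edges = [
    ("a", "b", 1), ("b", "c", 1), ("a", "c", 1),
    ("d", "e", 1), ("e", "f", 1), ("d", "f", 1),
    ("c", "d", 1),
]
split = {"a": 0, "b": 0, "c": 0, "d": 1, "e": 1, "f": 1}
merged = {node: 0 for node in split}

q_split = modularity(edges, split)
q_merged = modularity(edges, merged)
print(q_split, q_merged)
```

Separating the two triangles scores higher than placing all nodes in one community, which is the kind of improvement a modularity-based method looks for when it decides whether to move a keyword between communities.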

Keyword community network

In Fig. 3, the size of a node represents the number of occurrences of the keyword, and the thickness of the line between two nodes represents the number of co-occurrences of the two keywords. The top 20 keywords in each community are sorted in descending order, as shown in Table 3. The keyword co-occurrence network features of the six sub-communities are described in Table 4. The number of nodes shows the number of keywords in each community, and the number of links shows the correlations between the keywords.

The top 20 keywords in each community

Global network characteristics of sub-communities

As shown in Table 4, we can see from the number of links between sub-communities that there is a strong correlation between them, especially the link between C3 and C4, which has 1306 lines. The reason may be that the research methods of C4 focus on "opinion mining" and "text mining," while those of C3 focus on "natural language processing" and "deep learning," and C3 provides more technical support for C4 research. In C5 and C6, the research methods and topics are scattered. Their internal links are also low, but the connections with C3 and C4 are relatively high. The contents of C5 and C6 may include some emerging research methods and topics. We will present a specific analysis on the methods and topics of each sub-community in the next subsection.
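Counts of links within and between sub-communities, such as those summarized in Table 4, can be derived from an edge list once every keyword has a community label. The following is a minimal sketch under that assumption. The keywords, their community assignments, and the function name `community_link_table` are hypothetical and do not reproduce the survey's data.

```python
from collections import Counter


def community_link_table(edges, community):
    """Count links inside each community and between each pair of communities."""
    counts = Counter()
    for u, v in edges:
        cu, cv = community[u], community[v]
        # Order the pair so (C3, C4) and (C4, C3) are counted together.
        key = (cu, cv) if cu <= cv else (cv, cu)
        counts[key] += 1
    return counts


# Hypothetical community assignment and keyword links.
community = {
    "natural language processing": "C3",
    "deep learning": "C3",
    "opinion mining": "C4",
    "text mining": "C4",
    "user review": "C4",
}
edges = [
    ("natural language processing", "deep learning"),
    ("natural language processing", "opinion mining"),
    ("deep learning", "text mining"),
    ("deep learning", "opinion mining"),
    ("opinion mining", "text mining"),
    ("opinion mining", "user review"),
]

links = community_link_table(edges, community)
for pair, n in sorted(links.items()):
    print(pair, n)
```

Entries whose two labels match are internal links, and the remaining entries are links between sub-communities.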

Analysis on research methods and topics of sub-communities

Analysis on research methods and topics of the C1 community

Figure  4 shows the keyword co-occurrence network of the C1 community. The research methods and topics of the C1 community focus on three areas: "social media," "topic models," and "covid-19." In the context of big data, web 2.0 technology provides users with a way to express reviews and opinions of services, events, and people. Various social media platforms, such as Twitter, YouTube, and Weibo, have a large amount of users’ emotional data (Momtazi 2012 ). Compared to traditional news media, information on social media spreads more quickly, and people are able to express their feelings more freely. It is important to analyze the emotions generated by the information shared and published on social media (Abdullah and Zolkepli 2017 ; Wang et al. 2014 ). Researchers have been extracting text data from social media platforms for years to detect unexpected events (Bai and Yu 2016 ; Preethi et al. 2015 ), improve the quality of products (Abrahams et al. 2012 ; Isah et al. 2014 ; Myslin et al. 2013 ), understand the direction of public opinion (Fink et al. 2013 ; Groshek and Al-Rawi 2013 ), and so on.

The keyword co-occurrence network for the C1 community

Users’ sentiments are often associated with the topics, and the accuracy of sentiment analysis can be improved through the introduction of topic models (Li et al. 2010 ). Among them, the Latent Dirichlet Allocation (LDA) method is cited most frequently. Previous studies found that the LDA method can be effective in subdividing topics and identifying the sentiments of the contents. This method is quite general, and there are also many improved models based on this one that can be applied to any type of web text, helping to enhance the accuracy of sentiment polarity calculation (Chen et al. 2019 ; Liu et al. 2020 ).

As the COVID-19 pandemic has unfolded, a large number of individuals, media and governments have been publishing news and opinions about the COVID-19 crisis on social media platforms. This has resulted in a lot of sentiment analysis studies focusing on COVID-19-related texts exploring the impact of the epidemic on people’s lives (Sari and Ruldeviyani 2020 ; Wang, T. et al. 2020a ), physical health (Berkovic et al. 2020 ; Binkheder et al. 2021 ) and mental health (Yin et al. 2020 ), and so on. Therefore, we can see many related keywords, such as "infodemiology," "healthcare," and "mental health."

Analysis on research methods and topics of the C2 community

The contents of the C2 community mainly focus on "machine learning," "text classification," "feature extraction," and "stock market" (see Fig.  5 ). Most keywords are related to the research methods of sentiment analysis. Machine learning approaches have expanded from topic recognition to more challenging tasks such as sentiment classification. It is very important to explore and compare machine learning methods applied to sentiment classification (Li and Sun 2007 ). Methods like Support Vector Machine (SVM) and Naive Bayes models are widely used (Altrabsheh et al. 2013 ; Dereli et al. 2021 ; Shofiya and Abidi 2021 ; Tan et al. 2009 ; Wang and Lin 2020 ) and are used as benchmarks for the comparisons of models proposed by many researchers (Kumar et al. 2021 ; Sadamitsu et al. 2008 ; Waila et al. 2012 ; Zhang et al. 2019 ). Many algorithms, such as random forest (Al Amrani et al. 2018 ; Fitri et al. 2019 ; Sutoyo et al. 2022 ), tf-idf (Arafin Mahtab et al. 2018 ; Awan et al. 2021 ; Dey et al. 2017 ), logistic regression (Prabhat and Khullar 2017 ; Qasem et al. 2015 ; Sutoyo et al. 2022 ), and n-gram (Ikram and Afzal 2019 ; Singh and Kumari 2016 ; Xiong et al. 2021 ) are used to enhance the accuracy of machine learning, as shown in Fig.  5 .
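As an illustration of the kind of baseline classifier mentioned above, the following is a minimal sketch of a multinomial Naive Bayes sentiment classifier. It is not taken from any of the cited studies. Whitespace tokenization, add-one (Laplace) smoothing, the class name `NaiveBayes`, and the tiny training set are all assumptions made for the example.

```python
import math
from collections import Counter, defaultdict


class NaiveBayes:
    """Multinomial Naive Bayes with add-alpha smoothing (illustrative only)."""

    def __init__(self, alpha=1.0):
        self.alpha = alpha

    def fit(self, docs, labels):
        self.class_counts = Counter(labels)
        self.word_counts = defaultdict(Counter)
        self.vocab = set()
        for doc, label in zip(docs, labels):
            tokens = doc.lower().split()
            self.word_counts[label].update(tokens)
            self.vocab.update(tokens)
        return self

    def predict(self, doc):
        # Words never seen in training are ignored.
        tokens = [t for t in doc.lower().split() if t in self.vocab]
        n_docs = sum(self.class_counts.values())
        best_label, best_score = None, -math.inf
        for label, n in self.class_counts.items():
            total = sum(self.word_counts[label].values())
            score = math.log(n / n_docs)  # log prior
            for t in tokens:
                # Smoothed log likelihood of the word given the class.
                num = self.word_counts[label][t] + self.alpha
                den = total + self.alpha * len(self.vocab)
                score += math.log(num / den)
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Hypothetical training data, not drawn from any dataset in the survey.
docs = [
    "great course and helpful teacher",
    "really great lectures",
    "boring lectures and poor feedback",
    "poor course",
]
labels = ["pos", "pos", "neg", "neg"]

model = NaiveBayes().fit(docs, labels)
print(model.predict("great teacher"))
print(model.predict("poor feedback"))
```

Features such as tf-idf weights or n-grams would replace the raw word counts used here. The sketch keeps raw counts to stay short.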

The keyword co-occurrence network for the C2 community

The trading volume and asset prices of financial commodities or financial instruments are influenced by a variety of factors in the online environment. Machine learning and sentiment analysis are powerful tools that can help gather vast amounts of useful information to predict financial risk effectively (Li et al. 2009 ). Research on the relationship between public sentiment and stock prices has always been the focus of many scholars (Smailović et al. 2014 ; Xing et al. 2018 ). They have used machine learning methods to explore the influence of sentiments on stock prices through sentiment analysis of news articles, and then predicted the trend changes in the stock market (Ahuja et al. 2015 ; Januário et al. 2022 ; Maqsood et al. 2020 ; Picasso et al. 2019 ).

Analysis on research methods and topics of the C3 community

The contents of the C3 community also mainly focus on the methods for sentiment analysis, like "natural language processing", "deep learning," "aspect-based sentiment analysis," and "task analysis" (Fig.  6 ). Sentiment analysis is a sub-field of natural language processing (Nicholls and Song 2010 ), and natural language processing techniques have been widely used in sentiment analysis. Using natural language processing technology can help to better parse text features, such as part-of-speech tagging, word sense disambiguation, keyword extraction, inter-word dependency recognition, semantic parsing, and dictionary construction (Abbasi et al. 2011 ; Syed et al. 2010 ; Trilla and Alías 2009 ). With the rise of deep learning technology, researchers began to introduce it to sentiment analysis. Neural network models like LSTM (Al-Dabet et al. 2021 ; Al-Smadi et al. 2019 ; Li and Qian 2016 ; Schuller et al. 2015 ; Tai et al. 2015 ), CNN (Cai and Xia 2015 ; Jia and Wang 2022 ; Ouyang et al. 2015 ), RNN (Hassan and Mahmood 2017 ; Tembhurne and Diwan 2021 ; You et al. 2016 ), and some combination of these, as well as other models (An and Moon 2022 ; Li et al. 2022 ; Liu et al. 2020a ; Salur and Aydin 2020 ; Zhao et al. 2021 ), have received significant attention.

The keyword co-occurrence network for the C3 community

Sentiment analysis granularity is subdivided into document level, sentence level, and aspect level. Document-level sentiment analysis takes the entire document as a unit, but the premise is that the document needs to have a clear attitude orientation—that is, the point of view needs to be clear (Shirsat et al. 2018 ; Wang and Wan 2011 ). Sentence-level sentiment analysis is intended to perform sentiment analysis of the sentences in the document alone (Arulmurugan et al. 2019 ; Liu et al. 2009 ; Nejat et al. 2017 ). Aspect-based analysis is a fundamental and significant task in sentiment analysis. The aim of aspect-level sentiment analysis is to separately summarize positive and negative views about different aspects of a product or entity, although overall sentiment toward a product or entity may tend to be positive or negative (Rao et al. 2021 ; Thet et al. 2010 ). Aspect-level sentiment analysis facilitates a more finely-grained analysis of sentiment than either document or sentence-level analysis (Liang et al. 2022 ; Wang et al. 2020c ). The traditional levels of analysis, such as sentence-level analysis can only calculate the comprehensive sentiment polarity of paragraphs or sentences (Wang et al. 2016 ; Zhang et al. 2021 ). In recent years, the aspect level has become more and more popular, and with the application of deep learning technology, it has become better at capturing the semantic relationship between aspect terms and words in a more quantifiable way (Huang et al. 2018 ). The process of sentiment analysis involves the coordination of multiple tasks, and the subtasks include feature extraction (Bouktif et al. 2020 ; Lin et al. 2020 ), context analysis (Yu et al. 2019 ; Zuo et al. 2020 ), and the application of some analytical models (Tan et al. 2020 ).
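The difference between the three granularities can be made concrete with a toy lexicon-based sketch. It is only an illustration under strong assumptions: a hand-made lexicon, a fixed aspect list, and clause splitting on punctuation and the word "but". None of these choices comes from the cited studies, and the names `LEXICON`, `ASPECTS`, and the three functions are hypothetical.

```python
import re

# Hypothetical toy lexicon and aspect list, for illustration only.
LEXICON = {"great": 1, "excellent": 1, "helpful": 1,
           "slow": -1, "poor": -1, "boring": -1}
ASPECTS = {"battery", "screen", "price"}


def tokenize(text):
    return re.findall(r"[a-z]+", text.lower())


def polarity(tokens):
    """Sum of lexicon scores; words outside the lexicon count as 0."""
    return sum(LEXICON.get(t, 0) for t in tokens)


def document_level(text):
    """One score for the whole document."""
    return polarity(tokenize(text))


def sentence_level(text):
    """One score per sentence."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    return [polarity(tokenize(s)) for s in sentences]


def aspect_level(text):
    """One score per aspect term, using the clause in which it appears."""
    scores = {}
    for clause in re.split(r"[.!?,;]+|\bbut\b", text.lower()):
        tokens = tokenize(clause)
        for aspect in ASPECTS.intersection(tokens):
            scores[aspect] = scores.get(aspect, 0) + polarity(tokens)
    return scores


review = "The screen is excellent, but the battery is poor. The price is great."
print(document_level(review))
print(sentence_level(review))
print(aspect_level(review))
```

In this example the first sentence receives a neutral sentence-level score because its positive and negative words cancel out, while the aspect-level view keeps the positive opinion on the screen separate from the negative opinion on the battery.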

Analysis on research methods and topics of the C4 community

The C4 community, the largest of the six sub-communities, mainly contains keywords related to the research methods and topics of "opinion mining" and "user review" (Fig. 7). With the popularity of platforms like online review sites and personal blogs on the Internet, opinions and user reviews are readily available on the web. Opinion mining has always been a hot field of research (Khan et al. 2009 ; Poria et al. 2016 ). From Table 4, we can see that the link between C3 and C4 has 1306 lines. In opinion mining, researchers use many text mining methods to discover users’ opinions on goods or services, and then help improve the quality of corresponding products or services (Da’u et al. 2020 ; Lo and Potdar 2009 ; Martinez-Camara et al. 2011 ). In addition, scholars have found that the consideration of user opinions can help improve the overall quality of recommender systems (Artemenko et al. 2020 ; Da’u et al. 2020 ; Garg 2021 ; Malandri et al. 2022 ). Therefore, "recommendation system" has a strong correlation with "opinion mining."

The keyword co-occurrence network for the C4 community

Evaluation metrics for quantifying the existing approaches are also a popular topic related to opinion mining. There is a keyword named "performance sentiment" in the C4 community. Precision, recall, accuracy and F1-score are the most commonly used evaluation metrics (Dangi et al. 2022 ; Jain et al. 2022 ; JayaLakshmi and Kishore 2022 ; Li et al. 2017 ; Wang et al. 2021 ; Yi and Niblack 2005 ). Some researchers have also used runtimes to calculate the model efficiency (Abo et al. 2021 ; Ferilli et al. 2015 ), p-value to statistically evaluate the relationship or difference between two samples of classification results (JayaLakshmi and Kishore 2022 ; Salur and Aydin 2020 ), paired sample t-tests to verify that the results are not obtained by chance (Nhlabano and Lutu 2018 ), and standard deviation to measure the stability of the model (Chang et al. 2020 ). There have also been researchers who have used G-mean (Wang et al. 2021 ), Pearson Correlation Coefficient (Corr) (Yang et al. 2022 ), Mean Absolute Error (MAE) (Yang et al. 2022 ), Normalized Information Transfer (NIT) and Entropy-Modified Accuracy (EMA) (Valverde-Albacete et al. 2013 ), Mean Squared Error (MSE) (Mao et al. 2022 ), Hamming loss (Liu and Chen 2015 ), Area Under the Curve (AUC) (Abo et al. 2021 ), sensitivity and specificity (Thakur and Deshpande 2019 ), etc.
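The most common of these metrics follow directly from the confusion matrix of a binary classifier. The sketch below computes them from lists of true and predicted labels. It assumes a binary task with label 1 as the positive class, takes G-mean in its usual sense as the geometric mean of sensitivity and specificity, and adds MAE and MSE for the case where sentiment is predicted as a numeric score. The function names and sample labels are hypothetical.

```python
import math


def confusion_counts(y_true, y_pred, positive=1):
    """Return (tp, fp, fn, tn) for a binary task."""
    tp = fp = fn = tn = 0
    for t, p in zip(y_true, y_pred):
        if p == positive:
            if t == positive:
                tp += 1
            else:
                fp += 1
        else:
            if t == positive:
                fn += 1
            else:
                tn += 1
    return tp, fp, fn, tn


def classification_metrics(y_true, y_pred, positive=1):
    tp, fp, fn, tn = confusion_counts(y_true, y_pred, positive)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0        # also called sensitivity
    specificity = tn / (tn + fp) if tn + fp else 0.0
    accuracy = (tp + tn) / len(y_true)
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    g_mean = math.sqrt(recall * specificity)
    return {"precision": precision, "recall": recall,
            "specificity": specificity, "accuracy": accuracy,
            "f1": f1, "g_mean": g_mean}


def mean_absolute_error(y_true, y_pred):
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def mean_squared_error(y_true, y_pred):
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


# Hypothetical labels: 1 = positive sentiment, 0 = negative sentiment.
y_true = [1, 1, 1, 1, 0, 0, 0, 0]
y_pred = [1, 1, 0, 0, 1, 0, 0, 0]

metrics = classification_metrics(y_true, y_pred)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

Reporting precision and recall alongside accuracy matters most when the classes are imbalanced, since accuracy alone can look high for a model that rarely predicts the minority class.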

Analysis on research methods and topics of the C5 & C6 communities

Both sub-communities C5 (Fig.  8 ) and C6 (Fig.  9 ) are small in size. The C5 community has 25 nodes and the C6 community has 41 nodes. The core content of the C5 community is "Arabic sentiment analysis." Before 2011, most resources and systems built in the field of sentiment analysis were tailored to English and other Indo-European languages. It is increasingly necessary to design sentiment analysis systems for other languages (Korayem et al. 2012 ), and researchers are increasingly interested in the study of tweets and texts in the Arabic language (Heikal et al. 2018 ; Khasawneh et al. 2013 ; Oueslati et al. 2020 ). They use technologies such as named entity recognition (Al-Laith and Shahbaz 2021 ), deep learning (Al-Ayyoub et al. 2018 ; Heikal et al. 2018 ), and corpus construction (Alayba et al. 2018 ) to enhance the accuracy of sentiment analysis.

The keyword co-occurrence network for the C5 community

The keyword co-occurrence network for the C6 community

The contents of the C6 community are not very concentrated. From the size of the circle, we can see that the keywords "domain adaptation"(Blitzer et al. 2007 ; Glorot et al. 2011 ), "domain sentiment," and "cross-domain" appear more frequently. Cross-domain sentiment classification is intended to address the lack of mass labeling data (Du et al. 2020a ). It has attracted much attention (Du et al. 2020b ; Hao et al. 2019 ; Yang et al. 2020b ). Advances in communication technology have provided valuable interactive resources for people in different regions, and the processing of multilingual user comments has gradually become a key challenge in natural language processing (Martinez-Garcia et al. 2021 ). Therefore, some keywords related to "lingual" have appeared. Other keywords, such as "transfer learning," "active learning," and "semi-supervised learning," are mainly related to sentiment analysis technologies.

Evolution of research methods and topics of sentiment analysis

Overall evolution analysis

Annual changes in keyword frequency in sentiment analysis research can reflect the evolution of research methods and topics in this field. Based on the keyword community network (Fig.  3 ), we counted the frequency of keywords in each sub-community for each year. The keyword community evolution diagram is shown in Fig.  10 . Since there were fewer papers published before 2006, we combined the occurrences of keywords from 2002 to 2006. We can see that the C1 community and the C3 community have shown a significant growth trend. The C2 community was in a state of growth until 2019, and the frequency of keywords decreased year by year after 2019. The frequency of C4 community keywords continued to increase until 2018 and declined after 2018. The number of keywords in the C5 community and in the C6 community both had a slow growth trend, but the trend was not obvious.
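The yearly counting described above can be sketched as follows. The example assumes that each paper is represented by its publication year and keyword list, that every keyword of interest has already been assigned to a sub-community, and that years up to 2006 are merged into a single "2002-2006" period as in the text. The records, the keyword-to-community mapping, and the function name `yearly_community_counts` are hypothetical.

```python
from collections import Counter


def yearly_community_counts(records, community, merge_until=2006):
    """Count keyword occurrences per (period, community).

    records: iterable of (year, keywords) pairs, one per paper
    community: dict mapping keyword -> community label
    Years up to and including merge_until share one period label.
    """
    counts = Counter()
    for year, keywords in records:
        period = f"2002-{merge_until}" if year <= merge_until else str(year)
        for kw in keywords:
            label = community.get(kw)
            if label is not None:  # skip keywords outside the network
                counts[(period, label)] += 1
    return counts


# Hypothetical mapping and records, not data from the survey.
community = {
    "twitter": "C1", "covid-19": "C1",
    "machine learning": "C2",
    "deep learning": "C3",
    "opinion mining": "C4",
}
records = [
    (2004, ["opinion mining"]),
    (2006, ["machine learning", "opinion mining"]),
    (2019, ["machine learning", "deep learning"]),
    (2021, ["deep learning", "twitter", "covid-19"]),
]

evolution = yearly_community_counts(records, community)
for key, n in sorted(evolution.items()):
    print(key, n)
```

Plotting these counts with the period on the horizontal axis and one series per community gives the kind of evolution diagram described for Fig. 10.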

Keyword community evolution diagram

Evolution analysis of sub-communities

We selected the high-frequency keywords under each category and plotted the change of word frequency in each year, as shown in Figs. 11 and 12. In the C1 community, "social medium," "Twitter," "social network," "covid-19," "Latent Dirichlet Allocation," "topic model," and "text analysis" all had significant increases in word frequency, and the growth trend in 2021 was obvious. "Covid-19" appears in 2020, and the word frequency increased rapidly in 2021. Social media platforms have always been the focus of researchers’ attention. Under the influence of COVID-19, more people express their emotions, stress, and thoughts through social media platforms. Sentiment analysis on data from social media platforms related to COVID-19 has become a hot topic (Boon-Itt and Skunkan 2020 ). We believe that due to the impact of COVID-19, the widespread use of social platforms in 2020–2021 has led to a surge in the number of C1-related keywords.

C1, C2, C5, C6 communities: High-frequency keyword evolution diagram

C3, C4 communities: High-frequency keyword evolution diagram

The C2 community focuses on the method of "machine learning," and the C3 community focuses on the methods of "deep learning" and "natural language processing." The keywords in the two communities are mainly related to the techniques and methods of sentiment analysis. We found that before 2016 (Fig. 10), the frequency of keywords in the C2 community was higher than that in the C3 community, and that from 2016 onward, the keywords in the C3 community gradually accounted for a larger proportion of the total. This reflects the fact that deep learning-related technologies and methods have become a research hotspot, and the attention given to SVM, Naive Bayes, supervised learning, and other technologies in machine learning has declined. In addition to deep learning models such as Bi-LSTM, Long Short-term Memory, and recurrent neural network in the C3 community, the number of "aspect based" and "feature extraction" keywords has also been growing, which shows that researchers now pay more attention to the aspect level of text granularity in the field of sentiment analysis.

Among the keywords found in the C4 community, the word frequency of the "opinion mining" keyword has decreased since 2018. This shows that in the field of sentiment analysis, researchers have begun to reduce the attention they give to sentiment analysis of opinions on product or service quality, while still maintaining a certain degree of attention to "user review" and "online review." In addition, the number of keywords for "sentiment lexicon" and "lexicon-based" has declined. It may be because, in the context of the widespread application of deep learning technology in recent years, the lexicon-based method requires more time and higher labor costs (Kaity and Balakrishnan 2020 ). However, its accuracy still attracts attention due to the high involvement of experts, especially in non-English languages (Bakar et al. 2019 ; Kydros et al. 2021 ; Piryani et al. 2020 ; Tammina 2020 ; Xing et al. 2019 ; Yurtalan et al. 2019 ).

The high-frequency keywords in the C5 and C6 communities are "Arabic language," "Arabic sentiment analysis," and "transfer learning." Arabic has 30 variants, including the official Modern Standard Arabic (MSA) (ISO 639–3 2017). Arabic dialects are becoming increasingly popular as the language of informal communication on blogs, forums, and social media networks (Lulu and Elnagar 2018 ). This makes them challenging languages for natural language processing and sentiment analysis (Alali et al. 2019 ; Elshakankery and Ahmed 2019 ; Sayed et al. 2020 ). Transfer learning can solve the problem by leveraging knowledge obtained from a large-scale source domain to enhance the classification performance of target domains (Heaton 2018 ). In recent years, based on the success of deep learning technology, this method has gradually attracted attention.

Research hotspots and trends

Through the analysis in Sects.  4.1 and 4.2 , we found that the research methods and topics of sentiment analysis are constantly changing. The keyword topic heat map is shown in Fig.  13 . From this map, we can see that in the past two decades, research hotspots have included social media platforms (such as "social medium," "social network," and "Twitter"); sentiment analysis techniques and methods (such as "machine learning," "svm," "natural language processing," "deep learning," "aspect-based," "text mining," and "sentiment lexicon"), mining of user comments or opinions (e.g., "opinion mining," "user review," and "online review"), and sentiment analysis for non-English languages (e.g., "Arabic sentiment analysis" and "Arabic language").

Keyword topic heat map

With the popularity of digitization, a large amount of user-generated content has appeared on the Internet, where users express their opinions and comments on different topics such as the news, events, activities, products, services, etc. through social media. This is especially so in the case of the Twitter mobile platform, launched in 2006, which has become the most popular social channel (Kumar and Jaiswal 2020 ). However, online text data is mostly unstructured. In order to accurately analyze users’ sentiments, the research methods for sentiment analysis, such as natural language processing technology, and automatic sentiment analysis models have become the focus of researchers’ works. From Fig.  11 , we can see that early technologies and methods are dominated by machine learning and that SVM and Naive Bayes have always been favored by researchers. This has also been confirmed in studies by Neha Raghuvanshi (Raghuvanshi and Patil 2016 ), Harpreet Kaur (Kaur et al. 2017 ), and Marouane Birjali (Birjali et al. 2021 ). With the improvement of neural network and artificial intelligence technology, deep learning technology has been widely used in sentiment analysis, and has resulted in good outcomes (Basiri et al. 2021 ; Ma et al. 2018 ; Prabha and Srikanth 2019 ; Yuan et al. 2020 ). However, deep learning technology still has room for improvement, and the hybrid methods combining sentiment dictionary and semantic analysis are gradually becoming a trend (Prabha and Srikanth 2019 ; Yang et al. 2020a ).

The granularity of sentiment analysis ranges from the early text level to the sentence level and finally to the aspect level, which is currently gaining strong attention. The granularity of sentiment analysis is gradually being refined, but the method is immature at present, and further research work in the future is needed (Agüero-Torales et al. 2021 ; Li et al. 2020 ; Trisna and Jie 2022 ).

Early sentiment analysis was mainly in the English language. In recent years, non-English languages such as Chinese (Lai et al. 2020 ; Peng et al. 2018 ), French (Apidianaki et al. 2016 ; Pecore and Villaneau 2019 ), Spanish (Chaturvedi et al. 2016 ; Plaza-del-Arco et al. 2020 ), Russian (Smetanin 2020 ), and Arabic (Alhumoud and Al Wazrah 2022 ; Ombabi et al. 2020 ) have attracted more and more attention. Furthermore, cross-domain sentiment analysis technology is in urgent need of research and discussion by researchers (Liu et al. 2019 ; Singh et al. 2021 ).

Conclusion and future work

Judging from the increasing number of papers related to sentiment analysis research every year, sentiment analysis has been on the rise. Although there are many surveys on sentiment analysis research, there has not been a survey dedicated to the evolution of research methods and topics of sentiment analysis. This paper has used keyword co-occurrence analysis and the informetric tools to enrich the perspectives and methods of previous studies. Its aims have been to outline the evolution of the research methods and tools, research hotspots and trends and to provide research guidance for researchers.

By adopting keyword co-occurrence analysis and community detection methods, we analyzed the research methods and topics of sentiment analysis, as well as their connections and evolution trends, and summarized the research hotspots and trends in sentiment analysis. We found that research hotspots include social media platforms, sentiment analysis techniques and methods, mining of user comments or opinions, and sentiment analysis for non-English languages. Moreover, deep learning technology, hybrid methods that combine sentiment dictionaries with semantic analysis, fine-grained sentiment analysis methods, non-English language analysis methods, and cross-domain sentiment analysis techniques have gradually become research trends.

Practical implications and technical directions of sentiment analysis

Sentiment analysis has a wide range of application targets, such as e-commerce platforms, social platforms, public opinion platforms, and customer service platforms. Years of development have led to many related tasks in sentiment analysis, such as sentiment analysis at different text granularities, sentiment recognition, opinion mining, dialogue sentiment analysis, irony recognition, and false information detection. Such analysis can help structure user reviews, support product improvement decisions, discover public opinion hotspots, identify public positions, investigate user satisfaction with products, and so on. As long as user-generated content is involved, sentiment analysis technology can be used to mine the emotions of human actors associated with the content. The improvement of sentiment analysis technology can help machines better understand the thoughts and opinions of users, make machines more intelligent, and support better decisions by policy leaders, businesspeople, and service providers. However, most current sentiment analysis methods are based on sentiment dictionaries, sentiment rules, statistics-based machine learning models, neural network-based deep learning models, and pre-trained models. They have yet to achieve true language understanding in the sense of comprehension at the deep semantic level, though this does not prevent them from being useful in certain practical applications.

As an important task in natural language understanding, sentiment analysis has received extensive attention from academia and industry. Coarse-grained sentiment analysis is increasingly unable to meet people's decision-making needs, and for aspect-level sentiment analysis and complex tasks, pure machine learning is still unable to flexibly achieve true language understanding. Once the scene or domain changes, problems such as the domain incompatibility of the sentiment dictionary and the low transfer effect of the model involved keep appearing. At present, the accuracy of sentiment analysis provided by machines is far less than that of humans. To achieve human-like performance for machines, we believe that it is necessary to incorporate human commonsense knowledge and domain knowledge, as well as grounded definitions of concepts, in order for machines to understand natural language at a deeper level. These, combined with rules for affective reasoning to supplement interpretable information, will be effective in improving the performance of sentiment analysis. Future research in this direction can be strengthened to achieve true language understanding in machines.

Limitations and future work

There are some research limitations in this paper. First, we only studied papers written in English and searched from the Web of Science platform. We believe there are papers in other languages or other databases (e.g., Scopus, PubMed, Sci-hub, etc.) that also involve sentiment analysis but that were not included in our study. In addition, the keywords we chose to search in the Web of Science were mainly "sentiment analysis," "sentiment mining," and "sentiment classification." There may be papers related to our research topic that do not have these keywords. To track developments in sentiment analysis research, future studies could replicate this work by employing more precise keywords and using different literature databases.

Second, we selected the main high-frequency keywords for analysis, and some important low-frequency keywords may have been ignored. In future work, we can analyze the changes in each keyword in detail from the perspective of time and obtain more comprehensive analysis results.

Third, the results show that the themes of sentiment analysis cover many fields, such as computer science, linguistics, and electrical engineering, which indicates the trend of interdisciplinary research. Therefore, future work should apply co-citation and diversity measures to explore the interdisciplinary nature of sentiment analysis research.

Acknowledgements

The authors would like to thank the China Scholarship Council (CSC No. 202106850069) for its support for the visiting study.

This work has not received any funding.

Data availability

Declarations.

The authors declare that they have no conflict of interest or competing interest in this article.

This article does not contain any studies with human participants or animals performed by any of the authors.

1 https://github.com/MaartenGr/KeyBERT .

2 https://homepage.univie.ac.at/juan.gorraiz/bibexcel/ .

3 http://mrvar.fdv.uni-lj.si/pajek/ .

4 https://www.vosviewer.com/ .

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Contributor Information

Jingfeng Cui, Email: 2019214005@njau.edu.cn .

Zhaoxia Wang, Email: zxwang@smu.edu.sg .

Seng-Beng Ho, Email: hosb@ihpc.a-star.edu.sg .

Erik Cambria, Email: cambria@ntu.edu.sg .

  • Abbasi A, France S, Zhang Z, Chen H. Selecting attributes for sentiment classification using feature relation networks. IEEE Trans Knowl Data Eng. 2011;23(3):447–462. doi: 10.1109/TKDE.2010.110.
  • Abdullah NSD, Zolkepli IA (2017) Sentiment analysis of online crowd input towards Brand Provocation in Facebook, Twitter, and Instagram. In: Proceedings of the international conference on big data and internet of thing, association for computing machinery, pp 67–74. doi: 10.1145/3175684.3175689.
  • Abo MEM, Idris N, Mahmud R, Qazi A, Hashem IAT, Maitama JZ, et al. A multi-criteria approach for Arabic dialect sentiment analysis for online reviews: exploiting optimal machine learning algorithm selection. Sustainability. 2021;13(18):10018. doi: 10.3390/su131810018.
  • Abrahams AS, Jiao J, Wang GA, Fan W. Vehicle defect discovery from social media. Decis Support Syst. 2012;54(1):87–97. doi: 10.1016/j.dss.2012.04.005.
  • Acheampong FA, Nunoo-Mensah H, Chen W. Transformer models for text-based emotion detection: a review of BERT-based approaches. Artif Intell Rev. 2021;54(8):5789–5829. doi: 10.1007/s10462-021-09958-2.
  • Adak A, Pradhan B, Shukla N. Sentiment analysis of customer reviews of food delivery services using deep learning and explainable artificial intelligence: systematic review. Foods. 2022;11(10):1500. doi: 10.3390/foods11101500.
  • Agüero-Torales MM, Salas JIA, López-Herrera AG. Deep learning and multilingual sentiment analysis on social media data: an overview. Appl Soft Comput. 2021;107:107373. doi: 10.1016/j.asoc.2021.107373.
  • Ahuja R, Rastogi H, Choudhuri A, Garg B (2015) Stock market forecast using sentiment analysis. In: 2015 2nd International conference on computing for sustainable global development, INDIACom 2015, Bharati Vidyapeeth, New Delhi, pp 1008–1010. doi: 10.48550/arXiv.2204.05783.
  • Ain QT, Ali M, Riaz A, Noureen A, Kamranz M, Hayat B, et al. Sentiment analysis using deep learning techniques: a review. Int J Adv Comput Sci Appl. 2017;8(6):424–433. doi: 10.14569/ijacsa.2017.080657.
  • Al-Ayyoub M, Nuseir A, Alsmearat K, Jararweh Y, Gupta B. Deep learning for Arabic NLP: a survey. J Comput Sci. 2018;26:522–531. doi: 10.1016/j.jocs.2017.11.011.
  • Al-Ayyoub M, Khamaiseh AA, Jararweh Y, Al-Kabi MN. A comprehensive survey of Arabic sentiment analysis. Inf Process Manag. 2019;56(2):320–342. doi: 10.1016/j.ipm.2018.07.006.
  • Al-Dabet S, Tedmori S, AL-Smadi M. Enhancing Arabic aspect-based sentiment analysis using deep learning models. Comput Speech Lang. 2021;69:1224. doi: 10.1016/j.csl.2021.101224.
  • Al-Laith A, Shahbaz M. Tracking sentiment towards news entities from Arabic news on social media. Futur Gener Comput Syst. 2021;118:467–484. doi: 10.1016/j.future.2021.01.015.
  • Al-Smadi M, Talafha B, Al-Ayyoub M, Jararweh Y. Using long short-term memory deep neural networks for aspect-based sentiment analysis of Arabic reviews. Int J Mach Learn Cybern. 2019; 10 (8):2163–2175. doi: 10.1007/s13042-018-0799-4. [ CrossRef ] [ Google Scholar ]
  • Alali M, Sharef NM, Murad MAA, Hamdan H, Husin NA. Narrow convolutional neural network for Arabic dialects polarity classification. IEEE Access. 2019; 7 :96272–96283. doi: 10.1109/ACCESS.2019.2929208. [ CrossRef ] [ Google Scholar ]
  • Alamoodi AH, Zaidan BB, Al-Masawa M, Taresh SM, Noman S, Ahmaro IYY, et al. Multi-perspectives systematic review on the applications of sentiment analysis for vaccine hesitancy. Comput Biol Med. 2021; 139 :4957. doi: 10.1016/j.compbiomed.2021.104957. [ PMC free article ] [ PubMed ] [ CrossRef ] [ Google Scholar ]
  • Alamoodi AH, Zaidan BB, Zaidan AA, Albahri OS, Mohammed KI, Malik RQ, et al. Sentiment analysis and its applications in fighting COVID-19 and infectious diseases: a systematic review. Expert Syst Appl. 2021; 167 :114155. doi: 10.1016/j.eswa.2020.114155. [ PMC free article ] [ PubMed ] [ CrossRef ] [ Google Scholar ]
  • Alayba AM, Palade V, England M, Iqbal R (2018) Improving sentiment analysis in arabic using word representation. In: 2018 IEEE 2nd International Workshop on Arabic and Derived Script Analysis and Recognition (ASAR), IEEE, pp 13–18. 10.1109/ASAR.2018.8480191
  • Alhumoud SO, Al Wazrah AA. Arabic sentiment analysis using recurrent neural networks: a review. Artif Intell Rev. 2022; 55 (1):707–748. doi: 10.1007/s10462-021-09989-9. [ CrossRef ] [ Google Scholar ]
  • Alonso MA, Vilares D, Gómez-Rodríguez C, Vilares J. Sentiment analysis for fake news detection. Electronics. 2021; 10 (11):1348. doi: 10.3390/electronics10111348. [ CrossRef ] [ Google Scholar ]
  • Altrabsheh N, Gaber MM, Cocea M (2013) SA-E: sentiment analysis for education. In: The 5th KES International Conference on Intelligent Decision Technologies (KES-IDT), Sesimbra, Portugal, pp 353–362. 10.3233/978-1-61499-264-6-353
  • Al Amrani Y, Lazaar M, El Kadirp KE. Random forest and support vector machine based hybrid approach to sentiment analysis. Procedia Comput Sci. 2018; 127 :511–520. doi: 10.1016/j.procs.2018.01.150. [ CrossRef ] [ Google Scholar ]
  • An H, Moon N. Design of recommendation system for tourist spot using sentiment analysis based on CNN-LSTM. J Ambient Intell Hum Comput. 2022; 13 :1653–1663. doi: 10.1007/s12652-019-01521-w. [ CrossRef ] [ Google Scholar ]
  • Angel SO, Negron APP, Espinoza-Valdez A. Systematic literature review of sentiment analysis in the spanish language. Data Technol Appl. 2021; 55 (4):461–479. doi: 10.1108/DTA-09-2020-0200. [ CrossRef ] [ Google Scholar ]
  • Apidianaki M, Tannier X, Richart C (2016) Datasets for aspect-based sentiment analysis in French. In: Proceedings of the tenth international conference on language resources and evaluation (LREC’16), Portorož, Slovenia: European Language Resources Association (ELRA), pp 1122–1126. https://aclanthology.org/L16-1179
  • Arafin Mahtab S, Islam N, Mahfuzur Rahaman M (2018) Sentiment analysis on Bangladesh cricket with support vector machine. In: 2018 International conference on Bangla Speech and language processing (ICBSLP), IEEE, pp 1–4. 10.1109/ICBSLP.2018.8554585
  • Artemenko O, Pasichnyk V, Kunanets N, Shunevych K (2020) Using sentiment text analysis of user reviews in social media for E-Tourism mobile recommender systems. In: COLINS, CEUR-WS, Aachen, pp 259–271. http://ceur-ws.org/Vol-2604/paper20.pdf
  • Arulmurugan R, Sabarmathi KR, Anandakumar H. Classification of sentence level sentiment analysis using cloud machine learning techniques. Clust Comput. 2019; 22 (1):1199–1209. doi: 10.1007/s10586-017-1200-1. [ CrossRef ] [ Google Scholar ]
  • Asghar MZ, Khan A, Ahmad S, Kundi FM. A review of feature selection techniques in sentiment analysis. J Basic Appl Sci Res. 2014; 4 (3):181–186. doi: 10.3233/IDA-173763. [ CrossRef ] [ Google Scholar ]
  • Awan MJ, Yasin A, Nobanee H, Ali AA, Shahzad Z, Nabeel M, et al. Fake news data exploration and analytics. Electronics. 2021; 10 (19):2326. doi: 10.3390/electronics10192326. [ CrossRef ] [ Google Scholar ]
  • Bai H, Yu G. A Weibo-based approach to disaster informatics: incidents monitor in post-disaster situation via weibo text negative sentiment analysis. Nat Hazards. 2016; 83 (2):1177–1196. doi: 10.1007/s11069-016-2370-5. [ CrossRef ] [ Google Scholar ]
  • Bakar MFRA, Idris N, Shuib L (2019) An enhancement of Malay social media text normalization for Lexicon-based sentiment analysis. In: 2019 International conference on Asian language processing (IALP), IEEE, pp 211–215. 10.1109/IALP48816.2019.9037700
  • Bar-Ilan J. Informetrics at the beginning of the 21st century—a review. J Informet. 2008; 2 (1):1–52. doi: 10.1016/j.joi.2007.11.001. [ CrossRef ] [ Google Scholar ]
  • Basiri ME, Nemati S, Abdar M, Cambria E, Acharya UR. ABCDM: an attention-based bidirectional CNN-RNN deep model for sentiment analysis. Futur Gener Comput Syst. 2021; 115 :279–294. doi: 10.1016/j.future.2020.08.005. [ CrossRef ] [ Google Scholar ]
  • Batagelj V, Andrej M (2022) Pajek [Software]. http://mrvar.fdv.uni-lj.si/pajek/
  • Batagelj V, Mrvar A (1998) Pajek-program for large network analysis eds. M. Jünger and P Mutzel. Connections 21(2): 47–57. http://vlado.fmf.uni-lj.si/pub/networks/doc/pajek.pdf
  • Bengtsson M. How to plan and perform a qualitative study using content analysis. NursingPlus Open. 2016; 2 :8–14. doi: 10.1016/j.npls.2016.01.001. [ CrossRef ] [ Google Scholar ]
  • Berkovic D, Ackerman IN, Briggs AM, Ayton D. Tweets by people with arthritis during the COVID-19 pandemic: content and sentiment analysis. J Med Internet Res. 2020; 22 (12):e24550. doi: 10.2196/24550. [ PMC free article ] [ PubMed ] [ CrossRef ] [ Google Scholar ]
  • Binkheder S, Aldekhyyel RN, Almogbel A, Al-Twairesh N, Alhumaid N, Aldekhyyel SN, et al. Public perceptions around Mhealth applications during Covid-19 pandemic: a network and sentiment analysis of tweets in Saudi Arabia. Int J Environ Res Public Health. 2021; 18 (24):1–22. doi: 10.3390/ijerph182413388. [ PMC free article ] [ PubMed ] [ CrossRef ] [ Google Scholar ]
  • Birjali M, Kasri M, Beni-Hssane A. A comprehensive survey on sentiment analysis: approaches, challenges and trends. Knowl Based Syst. 2021; 226 :107134. doi: 10.1016/j.knosys.2021.107134. [ CrossRef ] [ Google Scholar ]
  • Blitzer J, Dredze M, Pereira F (2007) Biographies, bollywood, boom-boxes and blenders: domain adaptation for sentiment classification. In: 45th Annual Meeting of the association of computational linguistics, association for computational linguistics, pp 440–447. 10.1287/ijoc.2013.0585
  • Blondel VD, Guillaume JL, Lambiotte R, Lefebvre E. Fast unfolding of communities in large networks. J Stat Mech Theory Exp. 2008; 2008 (10):P10008. doi: 10.1088/1742-5468/2008/10/P10008. [ CrossRef ] [ Google Scholar ]
  • Boon-Itt S, Skunkan Y. Public perception of the COVID-19 pandemic on Twitter: sentiment analysis and topic modeling study. JMIR Public Health Surv. 2020; 6 (4):1978. doi: 10.2196/21978. [ PMC free article ] [ PubMed ] [ CrossRef ] [ Google Scholar ]
  • Boudad N, Faizi R, Thami ROH, Chiheb R. Sentiment analysis in Arabic: a review of the literature. Ain Shams Eng J. 2018; 9 (4):2479–2490. doi: 10.1016/j.asej.2017.04.007. [ CrossRef ] [ Google Scholar ]
  • Bouktif S, Fiaz A, Awad M. Augmented textual features-based stock market prediction. IEEE Access. 2020; 8 :40269–40282. doi: 10.1109/ACCESS.2020.2976725. [ CrossRef ] [ Google Scholar ]
  • Brito KDS, Filho RLCS, Adeodato PJL. A systematic review of predicting elections based on social media data: research challenges and future directions. IEEE Trans Comput Soc Syst. 2021; 8 (4):819–843. doi: 10.1109/TCSS.2021.3063660. [ CrossRef ] [ Google Scholar ]
  • Cai G, Xia B (2015) Convolutional neural networks for multimedia sentiment analysis. In: Natural Language Processing and Chinese Computing, Springer, Cham, p 159–167. 10.1007/978-3-319-25207-0_14
  • Callon M, Courtial J-P, Turner WA, Bauin S. From translations to problematic networks: an introduction to co-word analysis. Soc Sci Inf. 1983; 22 (2):191–235. doi: 10.1177/053901883022002003. [ CrossRef ] [ Google Scholar ]
  • Cambria E, Liu Q, Decherchi S, Xing F, Kwok K (2022a) SenticNet 7: a commonsense-based neurosymbolic AI Framework for Explainable Sentiment Analysis. In: LREC, Marseille: European Language Resources Association (ELRA), pp 3829–3839. https://sentic.net/senticnet-7.pdf
  • Cambria E, Dragoni M, Kessler B, Donadello I. Ontosenticnet 2: enhancing reasoning within sentiment analysis. IEEE Intell Syst. 2022; 37 (2):103–110. doi: 10.1109/MIS.2021.3093659. [ CrossRef ] [ Google Scholar ]
  • Cambria E, Kumar A, Al-Ayyoub M, Howard N. Guest editorial: explainable artificial intelligence for sentiment analysis. Knowl Based Syst. 2022; 238 (3):107920. doi: 10.1016/j.knosys.2021.107920. [ CrossRef ] [ Google Scholar ]
  • Cambria E, Xing F, Thelwall M, Welsch R. Sentiment analysis as a multidisciplinary research area. IEEE Trans Artif Intell. 2022; 3 (2):1–4. [ Google Scholar ]
  • Chan JYL, Bea KT, Leow SMH, Phoong SW, Cheng WK. State of the art: a review of sentiment analysis based on sequential transfer learning. Artif Intell Rev. 2022 doi: 10.1007/s10462-022-10183-8. [ CrossRef ] [ Google Scholar ]
  • Chang J-R, Liang H-Y, Chen L-S, Chang C-W. Novel feature selection approaches for improving the performance of sentiment classification. J Ambient Intell Hum Comput. 2020 doi: 10.1007/s12652-020-02468-z. [ CrossRef ] [ Google Scholar ]
  • Chaturvedi I, Cambria E, Vilares D (2016) Lyapunov filtering of objectivity for Spanish sentiment model. In: Proceedings of the International Joint Conference on Neural Networks (IJCNN), IEEE, pp 4474–4481. 10.1109/IJCNN.2016.7727785
  • Chen Z, Teng S, Zhang W, Tang H, Zhang Z, He J, et al (2019) LSTM sentiment polarity analysis based on LDA clustering. In: Communications in Computer and Information Science, Springer, Singapore, pp 342–355. 10.1007/978-981-13-3044-5_25
  • Cheng WK, Bea KT, Leow SMH, Chan JY-L, Hong Z-W, Chen Y-L. A review of sentiment, semantic and event-extraction-based approaches in stock forecasting. Mathematics. 2022; 10 (14):2437. doi: 10.3390/math10142437. [ CrossRef ] [ Google Scholar ]
  • Da’u A, Salim N, Rabiu I, Osman A. Recommendation System Exploiting Aspect-Based Opinion Mining with Deep Learning Method. Inf Sci. 2020; 512 :1279–1292. doi: 10.1016/j.ins.2019.10.038. [ CrossRef ] [ Google Scholar ]
  • Dangi D, Bhagat A, Dixit DK. Sentiment analysis of social media data based on chaotic coyote optimization algorithm based time weight-adaboost support vector machine approach. Concurr Comput. 2022; 34 (3):6581. doi: 10.1002/cpe.6581. [ CrossRef ] [ Google Scholar ]
  • Deng S, Xia S, Hu J, Li H, Liu Y. Exploring the topic structure and evolution of associations in information behavior research through co-word analysis. J Librariansh Inf Sci. 2021; 53 (2):280–297. doi: 10.1177/0961000620938120. [ CrossRef ] [ Google Scholar ]
  • Dereli T, Eligüzel N, Çetinkaya C. Content analyses of the international federation of Red Cross and Red Crescent Societies (Ifrc) based on machine learning techniques through Twitter. Nat Hazards. 2021; 106 (3):2025–2045. doi: 10.1007/s11069-021-04527-w. [ CrossRef ] [ Google Scholar ]
  • Dey A, Jenamani M, Thakkar JJ (2017) Lexical Tf-Idf: An n-Gram Feature Space for Cross-Domain Classification of Sentiment Reviews. In: International Conference on Pattern Recognition and Machine Intelligence, Springer, Cham, pp 380–386. 10.1007/978-3-319-69900-4_48
  • Ding Y, Chowdhury GG, Foo S. Bibliometric cartography of information retrieval research by using co-word analysis. Inf Process Manag. 2001; 37 (6):817–842. doi: 10.1016/S0306-4573(00)00051-0. [ CrossRef ] [ Google Scholar ]
  • Du C, Sun H, Wang J, Qi Q, Liao J (2020a) Adversarial and domain-aware BERT for cross-domain sentiment analysis. In: Proceedings of the 58th Annual meeting of the association for computational linguistics, association for computational linguistics, p 4019–4028. 10.18653/v1/2020a.acl-main.370
  • Du Y, He M, Wang L, Zhang H. Wasserstein based transfer network for cross-domain sentiment classification. Knowl Based Syst. 2020; 204 :6162. doi: 10.1016/j.knosys.2020.106162. [ CrossRef ] [ Google Scholar ]
  • Elo S, Kyngäs H. The qualitative content analysis process. J Adv Nurs. 2008; 62 (1):107–115. doi: 10.1111/j.1365-2648.2007.04569.x. [ PubMed ] [ CrossRef ] [ Google Scholar ]
  • Elshakankery K, Ahmed MF. HILATSA: a hybrid incremental learning approach for arabic tweets sentiment analysis. Egypt Inform J. 2019; 20 (3):163–171. doi: 10.1016/j.eij.2019.03.002. [ CrossRef ] [ Google Scholar ]
  • Feldman R. Techniques and applications for sentiment analysis. Commun ACM. 2013; 56 (4):82–89. doi: 10.1145/2436256.2436274. [ CrossRef ] [ Google Scholar ]
  • Ferilli S, De Carolis B, Esposito F, Redavid D (2015) Sentiment analysis as a text categorization task: a study on feature and algorithm selection for Italian language. In: 2015 IEEE International Conference on Data Science and Advanced Analytics (DSAA), IEEE, pp 1–10. 10.1109/DSAA.2015.7344882
  • Fink C, Bos N, Perrone A, Liu E, Kopecky J (2013) Twitter, public opinion, and the 2011 Nigerian Presidential Election. In: 2013 International conference on social computing, IEEE, pp 311–320. 10.1109/SocialCom.2013.50
  • Fitri VA, Andreswari R, Hasibuan MA. Sentiment analysis of social media Twitter with case of anti-LGBT campaign in Indonesia using Naïve Bayes, Decision Tree, and Random Forest Algorithm. Procedia Comput Sci. 2019; 161 :765–772. doi: 10.1016/j.procs.2019.11.181. [ CrossRef ] [ Google Scholar ]
  • Garg S (2021) Drug recommendation system based on sentiment analysis of drug reviews using machine learning. In: 2021 11th International conference on cloud computing, data science & engineering (confluence), IEEE, pp 175–181. 10.1109/Confluence51648.2021.9377188
  • Glorot X, Bordes A, Bengio Y (2011) Domain adaptation for large-scale sentiment classification: a deep learning approach. In: 28th International Conference on Machine Learning, International Machine Learning Society (IMLS), pp 513–520. https://dl.acm.org/doi/10.5555/3104482.3104547
  • Grootendorst M, Warmerdam VD (2021) MaartenGr/KeyBERT (Version 0.5) [Computer program]. 10.5281/ZENODO.5534341.
  • Groshek J, Al-Rawi A. Public sentiment and critical framing in social media content during the 2012 US Presidential Campaign. Soc Sci Comput Rev. 2013; 31 (5):563–576. doi: 10.1177/0894439313490401. [ CrossRef ] [ Google Scholar ]
  • Habimana O, Li Y, Li R, Gu X, Yu G. Sentiment analysis using deep learning approaches: an overview. Sci China Inf Sci. 2020; 63 (1):1–36. doi: 10.1007/s11432-018-9941-6. [ CrossRef ] [ Google Scholar ]
  • Hao Y, Mu T, Hong R, Wang M, Liu X, Goulermas JY. Cross-domain sentiment encoding through stochastic word embedding. IEEE Trans Knowl Data Eng. 2019; 32 (10):1909–1922. doi: 10.1109/TKDE.2019.2913379. [ CrossRef ] [ Google Scholar ]
  • Hassan A, Mahmood A (2017) Efficient deep learning model for text classification based on recurrent and convolutional layers. In: 2017 16th IEEE International Conference on Machine Learning and Applications (ICMLA), IEEE, pp 1108–1113. 10.1109/ICMLA.2017.00009
  • Heaton J (2018). Ian Goodfellow, Yoshua Bengio, and Aaron Courville: Deep Learning. Genetic Programming and Evolvable Machines 19: 305–307. 10.1007/s10710-017-9314-z
  • Heikal M, Torki M, El-Makky N. Sentiment analysis of Arabic tweets using deep learning. Procedia Comput Sci. 2018; 142 :114–122. doi: 10.1016/j.procs.2018.10.466. [ CrossRef ] [ Google Scholar ]
  • Huang B, Ou Y, Carley KM (2018) Aspect level sentiment classification with attention-over-attention neural networks. In: International Conference on Social Computing, Behavioral-Cultural Modeling and Prediction and Behavior Representation in Modeling and Simulation, Springer, Cham, pp 197–206. 10.1007/978-3-319-93372-6_22
  • Hussain N, Mirza HT, Rasool G, Hussain I, Kaleem M. Spam review detection techniques: a systematic literature review. Appl Sci. 2019; 9 (5):987. doi: 10.3390/app9050987. [ CrossRef ] [ Google Scholar ]
  • Hussein DMEDM. A survey on sentiment analysis challenges. J King Saud Univ. 2018; 30 (4):330–338. doi: 10.1016/j.jksues.2016.04.002. [ CrossRef ] [ Google Scholar ]
  • Ikram MT, Afzal MT. Aspect based citation sentiment analysis using linguistic patterns for better comprehension of scientific knowledge. Scientometrics. 2019; 119 (1):73–95. doi: 10.1007/s11192-019-03028-9. [ CrossRef ] [ Google Scholar ]
  • Injadat MN, Salo F, Nassif AB. Data mining techniques in social media: a survey. Neurocomputing. 2016; 214 :654–670. doi: 10.1016/j.neucom.2016.06.045. [ CrossRef ] [ Google Scholar ]
  • Isah H, Trundle P, Neagu D (2014) Social media analysis for product safety using text mining and sentiment analysis. In: 2014 14th UK Workshop on Computational Intelligence (UKCI), IEEE, pp 1–7. 10.1109/UKCI.2014.6930158
  • ISO 639-3 (2017) Registration Authority. https://iso639-3.sil.org/
  • Jain DK, Boyapati P, Venkatesh J, Prakash M. An intelligent cognitive-inspired computing with big data analytics framework for sentiment analysis and classification. Inf Process Manag. 2022; 59 (1):2758. doi: 10.1016/j.ipm.2021.102758. [ CrossRef ] [ Google Scholar ]
  • Januário BA, de Carosia AEO, da Silva AEA, Coelho GP. Sentiment analysis applied to news from the Brazilian stock market. IEEE Latin Am Trans. 2022; 20 (3):512–518. doi: 10.1109/TLA.2022.9667151. [ CrossRef ] [ Google Scholar ]
  • JayaLakshmi ANM, Kishore KVK. Performance evaluation of DNN with other machine learning techniques in a cluster using apache spark and MLlib. J King Saud Univ. 2022; 34 (1):1311–1319. doi: 10.1016/j.jksuci.2018.09.022. [ CrossRef ] [ Google Scholar ]
  • Jia X, Wang L. Attention enhanced capsule network for text classification by encoding syntactic dependency trees with graph convolutional neural network. PeerJ Comput Sci. 2022; 7 :e831. doi: 10.7717/PEERJ-CS.831. [ PMC free article ] [ PubMed ] [ CrossRef ] [ Google Scholar ]
  • Jiang D, Luo X, Xuan J, Xu Z. Sentiment computing for the news event based on the social media big data. IEEE Access. 2017; 5 :2373–2382. doi: 10.1109/ACCESS.2016.2607218. [ CrossRef ] [ Google Scholar ]
  • Kaity M, Balakrishnan V. Sentiment Lexicons and non-English languages: a survey. Knowl Inf Syst. 2020; 62 (12):4445–4480. doi: 10.1007/s10115-020-01497-6. [ CrossRef ] [ Google Scholar ]
  • Kastrati Z, Dalipi F, Imran AS, Nuci KP, Wani MA. Sentiment analysis of students’ feedback with Nlp and deep learning: a systematic mapping study. Appl Sci. 2021; 11 (9):3986. doi: 10.3390/app11093986. [ CrossRef ] [ Google Scholar ]
  • Kaur H, Mangat V, Nidhi (2017) A survey of sentiment analysis techniques. In: 2017 International Conference on I-SMAC (IoT in Social, Mobile, Analytics and Cloud)(I-SMAC), IEEE, pp 921–925. 10.1109/I-SMAC.2017.8058315
  • Khan K, Baharudin BB, Khan A (2009) Mining opinion from text documents: a survey. In: 2009 3rd IEEE International conference on digital ecosystems and technologies, IEEE, pp 217–222. 10.4304/jetwi.5.4.343-353
  • Khasawneh RT, Wahsheh HA, Al-Kabi MN, Alsmadi IM (2013) Sentiment analysis of Arabic social media content: a comparative study. In: 8th International Conference for Internet Technology and Secured Transactions (ICITST-2013), IEEE, pp 101–106. 10.1109/ICITST.2013.6750171
  • Khattak A, Asghar MZ, Saeed A, Hameed IA, Asif Hassan S, Ahmad S. A survey on sentiment analysis in Urdu: a resource-poor language. Egypt Inform J. 2021; 22 (1):53–74. doi: 10.1016/j.eij.2020.04.003. [ CrossRef ] [ Google Scholar ]
  • Khatua A, Khatua A, Cambria E. Predicting political sentiments of voters from Twitter in multi-party contexts. Appl Soft Comput J. 2020; 97 :106743. doi: 10.1016/j.asoc.2020.106743. [ CrossRef ] [ Google Scholar ]
  • Kitchenham B. Procedures for performing systematic reviews, version 1.0. Empir Softw Eng. 2004; 33 (2004):1–26. [ Google Scholar ]
  • Kitchenham B, Charters SM. Guidelines for performing systematic literature reviews in software engineering. Tech Rep. 2007; 5 :1–57. [ Google Scholar ]
  • Korayem M, Crandall D, Abdul-Mageed M (2012) Subjectivity and sentiment analysis of Arabic: a survey. In: International conference on advanced machine learning technologies and applications, Springer, Berlin, Heidelberg, p 128–139. 10.1007/978-3-642-35326-0_14
  • Koto F, Adriani M (2015) A comparative study on Twitter sentiment analysis: Which Features Are Good? In: International conference on applications of natural language to information systems, Springer, Cham, p 453–457. 10.1007/978-3-319-19581-0_46
  • Krippendorff K (2018) Content analysis: an introduction to its methodology. Sage publications.
  • Kumar A, Garg G. Systematic literature review on context-based sentiment analysis in social multimedia. Multimed Tools Appl. 2020; 79 (21):15349–15380. doi: 10.1007/s11042-019-7346-5. [ CrossRef ] [ Google Scholar ]
  • Kumar A, Jaiswal A. Systematic literature review of sentiment analysis on twitter using soft computing techniques. Concurr Comput. 2020; 32 (1):e5107. doi: 10.1002/cpe.5107. [ CrossRef ] [ Google Scholar ]
  • Kumar A, Sebastian TM. Sentiment analysis: a perspective on its past, present and future. Int J Intell Syst Appl. 2012; 4 (10):1–14. doi: 10.5815/ijisa.2012.10.01. [ CrossRef ] [ Google Scholar ]
  • Kumar A, Narapareddy VT, Gupta P, Srikanth VA, Neti LB, Malapati A (2021) Adversarial and auxiliary features-aware BERT for sarcasm detection. In: 8th ACM IKDD CODS and 26th COMAD, association for computing machinery, p 163–170. 10.1145/3430984.3431024
  • Kydros D, Argyropoulou M, Vrana V. A content and sentiment analysis of Greek tweets during the pandemic. Sustainability (switzerland) 2021; 13 (11):6150. doi: 10.3390/su13116150. [ CrossRef ] [ Google Scholar ]
  • Lai Y, Zhang L, Han D, Zhou R, Wang G. Fine-grained emotion classification of chinese microblogs based on graph convolution networks. World Wide Web. 2020; 23 (5):2771–2787. doi: 10.1007/s11280-020-00803-0. [ CrossRef ] [ Google Scholar ]
  • Leiden University's Centre for Science and Technology Studies (CWTS) (2021) VOSviewer (Version 1.6.17)[Software]. https://www.vosviewer.com/
  • Leydesdorff L, Park HW, Wagner C. International co-authorship relations in the social science citation index: is internationalization leading the network? J Assoc Inf Sci Technol. 2014; 65 (10):2111–2126. doi: 10.48550/arXiv.1305.4242. [ CrossRef ] [ Google Scholar ]
  • Li D, Qian J (2016) Text sentiment analysis based on long short-term memory. In: 2016 First IEEE International Conference on Computer Communication and the Internet (ICCCI), IEEE, pp 471–475. 10.1109/CCI.2016.7778967
  • Li F, Huang M, Zhu X (2010) Sentiment analysis with global topics and local dependency. In: Proceedings of the AAAI Conference on Artificial Intelligence, Atlanta, Georgia, USA: AAAI Press, Palo Alto, California USA, pp 1371–1376. 10.1609/aaai.v24i1.7523
  • Li J, Sun M (2007) Experimental study on sentiment classification of chinese review using machine learning techniques. In: 2007 International Conference on Natural Language Processing and Knowledge Engineering, IEEE, pp 393–400. 10.1109/NLPKE.2007.4368061
  • Li N, Liang X, Li X, Wang C, Wu DD. Network environment and financial risk using machine learning and sentiment analysis. Hum Ecol Risk Assess. 2009; 15 (2):227–252. doi: 10.1080/10807030902761056. [ CrossRef ] [ Google Scholar ]
  • Li W, Zhu L, Shi Y, Guo K, Cambria E. User reviews: sentiment analysis using Lexicon integrated two-channel CNN–LSTM family models. Appl Soft Comput J. 2020; 94 :6435. doi: 10.1016/j.asoc.2020.106435. [ CrossRef ] [ Google Scholar ]
  • Li W, Shao W, Ji S, Cambria E. BiERU: bidirectional emotional recurrent unit for conversational sentiment analysis. Neurocomputing. 2022; 467 :73–82. doi: 10.1016/j.neucom.2021.09.057. [ CrossRef ] [ Google Scholar ]
  • Li Y, Pan Q, Yang T, Wang S, Tang J, Cambria E. Learning word representations for sentiment analysis. Cogn Comput. 2017; 9 (6):843–851. doi: 10.1007/s12559-017-9492-2. [ CrossRef ] [ Google Scholar ]
  • Liang B, Su H, Gui L, Cambria E, Xu R. Aspect-based sentiment analysis via affective knowledge enhanced graph convolutional networks. Knowl Based Syst. 2022; 235 :107643. doi: 10.1016/j.knosys.2021.107643. [ CrossRef ] [ Google Scholar ]
  • Ligthart A, Catal C, Tekinerdogan B. Systematic reviews in sentiment analysis: a tertiary study. Artif Intell Rev. 2021; 54 (7):4997–5053. doi: 10.1007/s10462-021-09973-3. [ CrossRef ] [ Google Scholar ]
  • Lin B, Cassee N, Serebrenik A, Bavota G, Novielli N, Lanza M. Opinion mining for software development: a systematic literature review. ACM Trans Softw Eng Methodol. 2022; 31 (3):1–41. doi: 10.1145/3490388. [ CrossRef ] [ Google Scholar ]
  • Lin Y, Li J, Yang L, Xu K, Lin H. Sentiment analysis with comparison enhanced deep neural network. IEEE Access. 2020; 8 :78378–78384. doi: 10.1109/ACCESS.2020.2989424. [ CrossRef ] [ Google Scholar ]
  • Liu F, Zheng J, Zheng L, Chen C. Combining attention-based bidirectional gated recurrent neural network and two-dimensional convolutional neural network for document-level sentiment classification. Neurocomputing. 2020; 371 :39–50. doi: 10.1016/j.neucom.2019.09.012. [ CrossRef ] [ Google Scholar ]
  • Liu L, Nie X, Wang H (2012) Toward a fuzzy domain sentiment ontology tree for sentiment analysis. In: 2012 5th International congress on image and signal processing, IEEE, pp 1620–1624. 10.1109/CISP.2012.6469930
  • Liu R, Shi Y, Ji C, Jia M. A survey of sentiment analysis based on transfer learning. IEEE Access. 2019; 7 :85401–85412. doi: 10.1109/ACCESS.2019.2925059. [ CrossRef ] [ Google Scholar ]
  • Liu S, Lee K, Lee I. Document-level multi-topic sentiment classification of email data with BiLSTM and data augmentation. Knowl Based Syst. 2020; 197 :105918. doi: 10.1016/j.knosys.2020.105918. [ CrossRef ] [ Google Scholar ]
  • Liu SM, Chen JH. A multi-label classification based approach for sentiment classification. Expert Syst Appl. 2015; 42 (3):1083–1093. doi: 10.1016/j.eswa.2014.08.036. [ CrossRef ] [ Google Scholar ]
  • Liu X, Zeng D, Li J, Wang F-Y, Zuo W. Sentiment analysis of Chinese documents: from sentence to document level. J Am Soc Inform Sci Technol. 2009; 60 (12):2474–2487. doi: 10.1002/asi.21206. [ CrossRef ] [ Google Scholar ]
  • Lo YW, Potdar V (2009) A review of opinion mining and sentiment classification framework in social networks. In: 2009 3rd IEEE International conference on digital ecosystems and technologies, IEEE, pp 396–401. 10.1109/DEST.2009.5276705
  • Lulu L, Elnagar A. Automatic arabic dialect classification using deep learning models. Procedia Comput Sci. 2018; 142 :262–269. doi: 10.1016/j.procs.2018.10.489. [ CrossRef ] [ Google Scholar ]
  • Ma Y, Peng H, Cambria E (2018) Targeted aspect-based sentiment analysis via embedding commonsense knowledge into an attentive LSTM. In: 32nd AAAI conference on artificial intelligence, New Orleans, Louisiana, USA: AAAI Press, Palo Alto, California USA, pp 5876–5883. 10.1609/aaai.v32i1.12048
  • Malandri L, Porcel C, Xing F, Serrano-Guerrero J, Cambria E. Soft computing for recommender systems and sentiment analysis. Appl Soft Comput. 2022 doi: 10.1016/j.asoc.2021.108246. [ CrossRef ] [ Google Scholar ]
  • Mäntylä MV, Graziotin D, Kuutila M. The evolution of sentiment analysis-a review of research topics, venues and top cited papers. Comput Sci Rev. 2018; 27 :16–32. doi: 10.1016/j.cosrev.2017.10.002. [ CrossRef ] [ Google Scholar ]
  • Mao Y, Zhang Y, Jiao L, Zhang H. Document-level sentiment analysis using attention-based bi-directional long short-term memory network and two-dimensional convolutional neural network. Electronics. 2022; 11 (12):1906. doi: 10.3390/electronics11121906. [ CrossRef ] [ Google Scholar ]
  • Maqsood H, Mehmood I, Maqsood M, Yasir M, Afzal S, Aadil F, et al. A local and global event sentiment based efficient stock exchange forecasting using deep learning. Int J Inf Manag. 2020; 50 :432–451. doi: 10.1016/j.ijinfomgt.2019.07.011. [ CrossRef ] [ Google Scholar ]
  • Martinez-Camara E, Martin-Valdivia MT, Urena-Lopez LA (2011) Opinion classification techniques applied to a Spanish Corpus. In: International conference on application of natural language to information systems, Springer, Berlin, Heidelberg, pp 169–176. 10.1007/978-3-642-22327-3_17
  • Martinez-Garcia A, Badia T, Barnes J (2021) Evaluating morphological typology in zero-shot cross-lingual transfer. In: Proceedings of the 59th annual meeting of the association for computational linguistics and the 11th international joint conference on natural language processing, association for computational linguistics, pp 3136–3153. 10.18653/v1/2021.acl-long.244
  • Medhat W, Hassan A, Korashy H. Sentiment analysis algorithms and applications: a survey. Ain Shams Eng J. 2014; 5 (4):1093–1113. doi: 10.1016/j.asej.2014.04.011. [ CrossRef ] [ Google Scholar ]
  • Momtazi S (2012) Fine-grained German sentiment analysis on social media. In: Proceedings of the 8th International conference on language resources and evaluation (LREC’12), European Language Resources Association (ELRA), pp 1215–1220. http://www.lrec-conf.org/proceedings/lrec2012/pdf/999_Paper.pdf
  • Myslin M, Zhu SH, Chapman W, Conway M. Using Twitter to examine smoking behavior and perceptions of emerging tobacco products. J Med Int Res. 2013; 15 (8):174. doi: 10.2196/jmir.2534. [ PMC free article ] [ PubMed ] [ CrossRef ] [ Google Scholar ]
  • Nair RR, Mathew J, Muraleedharan V, Deepa Kanmani S (2019) Study of machine learning techniques for sentiment analysis. In: 2019 3rd International Conference on Computing Methodologies and Communication (ICCMC), IEEE, pp 978–984. 10.1109/ICCMC.2019.8819763
  • Nassif AB, Elnagar A, Shahin I, Henno S. Deep learning for Arabic subjective sentiment analysis: challenges and research opportunities. Appl Soft Comput. 2021; 98 :6836. doi: 10.1016/j.asoc.2020.106836. [ CrossRef ] [ Google Scholar ]
  • Nassirtoussi AK, Aghabozorgi S, Wah TY, Ngo DCL. Text mining for market prediction: a systematic review. Expert Syst Appl. 2014; 41 (16):7653–7670. doi: 10.1016/j.eswa.2014.06.009. [ CrossRef ] [ Google Scholar ]
  • Nejat B, Carenini G, Ng R (2017) Exploring joint neural model for sentence level discourse parsing and sentiment analysis. In: Proceedings of the 18th annual sigdial meeting on discourse and dialogue, association for computational linguistics, pp 289–298. 10.18653/v1/w17-5535
  • Nhlabano VV, Lutu PEN (2018). Impact of text pre-processing on the performance of sentiment analysis models for social media data. In: 2018 International Conference on Advances in Big Data, Computing and Data Communication Systems (IcABCD), IEEE, pp 1–6. 10.1109/ICABCD.2018.8465135
  • Nicholls C, Song F (2010) Comparison of feature selection methods for sentiment analysis. In: Canadian conference on artificial intelligence, Springer, Berlin, Heidelberg, pp 286–289. 10.1007/978-3-319-96292-4_21
  • Nielsen FA (2011) A New ANEW: Evaluation of a Word List for Sentiment Analysis in Microblogs. In: Proceedings of the ESWC2011 workshop on “Making Sense of Microposts”: big things come in small packages, Heraklion, Crete, Greece: CEUR-WS, Aachen, pp 93–98. 10.48550/arXiv.1103.2903
  • Obiedat R, Al-Darras D, Alzaghoul E, Harfoushi O. Arabic aspect-based sentiment analysis: a systematic literature review. IEEE Access. 2021; 9 :152628–152645. doi: 10.1109/ACCESS.2021.3127140. [ CrossRef ] [ Google Scholar ]
  • Ombabi AH, Ouarda W, Alimi AM. Deep learning CNN–LSTM framework for Arabic sentiment analysis using textual information shared in social networks. Soc Netw Anal Min. 2020; 10 (1):1–13. doi: 10.1007/s13278-020-00668-1. [ CrossRef ] [ Google Scholar ]
  • Oueslati O, Cambria E, Ben HM, Ounelli H. A review of sentiment analysis research in Arabic language. Futur Gener Comput Syst. 2020; 112 :408–430. doi: 10.1016/j.future.2020.05.034. [ CrossRef ] [ Google Scholar ]
  • Ouyang X, Zhou P, Li CH, Liu L (2015) Sentiment Analysis Using Convolutional Neural Network. In: 2015 IEEE International conference on computer and information technology; ubiquitous computing and communications; dependable, autonomic and secure computing; pervasive intelligence and computing, IEEE, p 2359–2364. 10.1109/CIT/IUCC/DASC/PICOM.2015.349
  • Pecore S, Villaneau J (2019) Complex and Precise Movie and Book Annotations in French Language for Aspect Based Sentiment Analysis. In: LREC 2018—11th International conference on language resources and evaluation, European Language Resources Association (ELRA), p 2647–2652. https://aclanthology.org/L18-1419
  • Peng H, Cambria E, Hussain A. A review of sentiment analysis research in Chinese language. Cogn Comput. 2017; 9 (4):423–435. doi: 10.1007/s12559-017-9470-8. [ CrossRef ] [ Google Scholar ]
  • Peng H, Ma Y, Li Y, Cambria E. Learning multi-grained aspect target sequence for Chinese sentiment analysis. Knowl Based Syst. 2018; 148 :167–176. doi: 10.1016/j.knosys.2018.02.034. [ CrossRef ] [ Google Scholar ]
  • Pereira DA. A survey of sentiment analysis in the Portuguese language. Artif Intell Rev. 2021; 54 (2):1087–1115. doi: 10.1007/s10462-020-09870-1. [ CrossRef ] [ Google Scholar ]
  • Perianes-Rodriguez A, Waltman L, van Eck NJ. Constructing bibliometric networks: a comparison between full and fractional counting. J Informetr. 2016; 10 (4):1178–1195. doi: 10.1016/j.joi.2016.10.006. [ CrossRef ] [ Google Scholar ]
  • Persson O (2017) BibExcel [Software]. Available from https://homepage.univie.ac.at/juan.gorraiz/bibexcel/
  • Persson O, Danell R, Schneider JW (2009) How to Use Bibexcel for Various Types of Bibliometric Analysis. In: Celebrating scholarly communication studies: a festschrift for Olle Persson at his 60th birthday, ed. J. Schneider F. Åström, R. Danell, B. Larsen. Leuven, Belgium: International Society for Scientometrics and Informetrics, pp 9–24
  • Picasso A, Merello S, Ma Y, Oneto L, Cambria E. Technical analysis and sentiment embeddings for market trend prediction. Expert Syst Appl. 2019; 135 :60–70. doi: 10.1016/j.eswa.2019.06.014. [ CrossRef ] [ Google Scholar ]
  • Piryani R, Madhavi D, Singh VK. Analytical mapping of opinion mining and sentiment analysis research during 2000–2015. Inf Process Manag. 2017; 53 (1):122–150. doi: 10.1016/j.ipm.2016.07.001. [ CrossRef ] [ Google Scholar ]
  • Piryani R, Piryani B, Singh VK, Pinto D. Sentiment analysis in Nepali: exploring machine learning and lexicon-based approaches. J Intell Fuzzy Syst. 2020; 39 (2):2201–2212. doi: 10.3233/JIFS-179884. [ CrossRef ] [ Google Scholar ]
  • Plaza-del-Arco FM, Martín-Valdivia MT, Ureña-López LA, Mitkov R. Improved emotion recognition in spanish social media through incorporation of lexical knowledge. Futur Gener Comput Syst. 2020; 110 :1000–1008. doi: 10.1016/j.future.2019.09.034. [ CrossRef ] [ Google Scholar ]
  • Poria S, Cambria E, Gelbukh A. Aspect extraction for opinion mining with a deep convolutional neural network. Knowl Based Syst. 2016; 108 :42–49. doi: 10.1016/j.knosys.2016.06.009. [ CrossRef ] [ Google Scholar ]
  • Prabha MI, Srikanth GU (2019). Survey of Sentiment Analysis Using Deep Learning Techniques. In: 2019 1st International Conference on Innovations in Information and Communication Technology (ICIICT), IEEE, p 1–9. 10.1109/ICIICT1.2019.8741438
  • Prabhat A, Khullar V (2017). Sentiment Classification on Big Data Using Naïve Bayes and Logistic Regression. In: 2017 International Conference on Computer Communication and Informatics (ICCCI), IEEE, p 1–5. 10.1109/ICCCI.2017.8117734
  • Preethi PG, Uma V, Kumar A. Temporal sentiment analysis and causal rules extraction from Tweets for event prediction. Procedia Comput Sci. 2015; 48 :84–89. doi: 10.1016/j.procs.2015.04.154. [ CrossRef ] [ Google Scholar ]
  • Qasem M, Thulasiram R, Thulasiram P (2015) Twitter Sentiment Classification Using Machine Learning Techniques for Stock Markets. In: 2015 International Conference on Advances in Computing, Communications and Informatics (ICACCI), IEEE, p 834–840. 10.1109/ICACCI.2015.7275714
  • Qazi A, Fayaz H, Wadi A, Raj RG, Rahim NA, Khan WA. The artificial neural network for solar radiation prediction and designing solar systems: a systematic literature review. J Clean Prod. 2015; 104 :1–12. doi: 10.1016/j.jclepro.2015.04.041. [ CrossRef ] [ Google Scholar ]
  • Qazi A, Raj RG, Hardaker G, Standing C. A systematic literature review on opinion types and sentiment analysis techniques: tasks and challenges. Internet Res. 2017; 27 (3):608–630. doi: 10.1108/IntR-04-2016-0086. [ CrossRef ] [ Google Scholar ]
  • Raghuvanshi N, Patil JM (2016) A Brief Review on Sentiment Analysis. In: 2016 International Conference on Electrical, Electronics, and Optimization Techniques (ICEEOT), IEEE, p 2827–2831. 10.1109/ICEEOT.2016.7755213
  • Rambocas M, Pacheco BG. Online sentiment analysis in marketing research: a review. J Res Interact Mark. 2018; 12 (2):146–163. doi: 10.1108/JRIM-05-2017-0030. [ CrossRef ] [ Google Scholar ]
  • Rao G, Gu X, Feng Z, Cong Q, Zhang L (2021) A Novel Joint Model with Second-Order Features and Matching Attention for Aspect-Based Sentiment Analysis. In: 2021 International Joint Conference on Neural Networks (IJCNN), IEEE, p 1–8. 10.1109/IJCNN52387.2021.9534321
  • Ravi K, Ravi V. A survey on opinion mining and sentiment analysis: tasks, approaches and applications. Knowl Based Syst. 2015; 89 :14–46. doi: 10.1016/j.knosys.2015.06.015. [ CrossRef ] [ Google Scholar ]
  • Rotta R, Noack A. Multilevel local search algorithms for modularity clustering. ACM J Exp Algorithmics. 2011; 16 (2):1–27. doi: 10.1145/1963190.1970376. [ CrossRef ] [ Google Scholar ]
  • Sadamitsu K, Sekine S, Yamamoto M (2008) Sentiment Analysis Based on Probabilistic Models Using Inter-Sentence Information. In: Proceedings of the sixth international conference on language resources and evaluation (LREC’08), European Language Resources Association (ELRA), p 2892–2896. http://www.lrec-conf.org/proceedings/lrec2008/pdf/736_paper.pdf
  • Salur MU, Aydin I. A novel hybrid deep learning model for sentiment classification. IEEE Access. 2020; 8 :58080–58093. doi: 10.1109/ACCESS.2020.2982538. [ CrossRef ] [ Google Scholar ]
  • Sánchez-Rada JF, Iglesias CA. Social context in sentiment analysis: formal definition, overview of current trends and framework for comparison. Inf Fusion. 2019; 52 :344–356. doi: 10.1016/j.inffus.2019.05.003. [ CrossRef ] [ Google Scholar ]
  • Santos R, Costa AA, Silvestre JD, Pyl L. Informetric analysis and review of literature on the role of BIM in sustainable construction. Autom Constr. 2019; 103 :221–234. doi: 10.1016/j.autcon.2019.02.022. [ CrossRef ] [ Google Scholar ]
  • Sari IC, Ruldeviyani Y (2020) Sentiment Analysis of the Covid-19 Virus Infection in Indonesian Public Transportation on Twitter Data: A Case Study of Commuter Line Passengers. In: 2020 International Workshop on Big Data and Information Security (IWBIS), IEEE, pp 23–28. 10.1109/IWBIS50925.2020.9255531
  • Sarsam SM, Al-Samarraie H, Alzahrani AI, Wright B. Sarcasm detection using machine learning algorithms in Twitter: a systematic review. Int J Mark Res. 2020; 62 (5):578–598. doi: 10.1177/1470785320921779. [ CrossRef ] [ Google Scholar ]
  • Sayed AA, Elgeldawi E, Zaki AM, Galal AR (2020) Sentiment Analysis for Arabic Reviews Using Machine Learning Classification Algorithms. In: 2020 International Conference on Innovative Trends in Communication and Computer Engineering (ITCE), IEEE, p 56–63. 10.1109/ITCE48509.2020.9047822
  • Schouten K, Frasincar F. Survey on aspect-level sentiment analysis. IEEE Trans Knowl Data Eng. 2015; 28 (3):813–830. doi: 10.1109/TKDE.2015.2485209. [ CrossRef ] [ Google Scholar ]
  • Schuller B, Mousa AED, Vryniotis V. Sentiment analysis and opinion mining: on optimal parameters and performances. Wiley Interdiscip Rev. 2015; 5 (5):255–263. doi: 10.1002/widm.1159. [ CrossRef ] [ Google Scholar ]
  • Serrano-Guerrero J, Romero FP, Olivas JA. Fuzzy logic applied to opinion mining: a review. Knowl Based Syst. 2021; 222 :107018. doi: 10.1016/j.knosys.2021.107018. [ CrossRef ] [ Google Scholar ]
  • Sharma S, Jain A. Role of sentiment analysis in social media security and analytics. Wiley Interdiscip Rev. 2020; 10 (5):e1366. doi: 10.1002/widm.1366. [ CrossRef ] [ Google Scholar ]
  • Shirsat VS, Jagdale RS, Deshmukh SN (2018) Document Level Sentiment Analysis from News Articles. In: 2017 International Conference on Computing, Communication, Control and Automation (ICCUBEA), IEEE, pp 1–4. 10.1109/ICCUBEA.2017.8463638
  • Shofiya C, Abidi S. Sentiment analysis on Covid-19-related social distancing in Canada using Twitter data. Int J Environ Res Public Health. 2021; 18 (11):5993. doi: 10.3390/ijerph18115993. [ PMC free article ] [ PubMed ] [ CrossRef ] [ Google Scholar ]
  • Singh RK, Sachan MK, Patel RB. 360 Degree view of cross-domain opinion classification: a survey. Artif Intell Rev. 2021; 54 (2):1385–1506. doi: 10.1007/s10462-020-09884-9. [ CrossRef ] [ Google Scholar ]
  • Singh T, Kumari M. Role of text pre-processing in twitter sentiment analysis. Procedia Comput Sci. 2016; 89 :549–554. doi: 10.1016/j.procs.2016.06.095. [ CrossRef ] [ Google Scholar ]
  • Smailović J, Grčar M, Lavrač N, Žnidaršič M. Stream-based active learning for sentiment analysis in the financial domain. Inf Sci. 2014; 285 (1):181–203. doi: 10.1016/j.ins.2014.04.034. [ CrossRef ] [ Google Scholar ]
  • Smetanin S. The applications of sentiment analysis for Russian language texts: current challenges and future perspectives. IEEE Access. 2020; 8 :110693–110719. doi: 10.1109/ACCESS.2020.3002215. [ CrossRef ] [ Google Scholar ]
  • Stemler S. An overview of content analysis. Pract Assess Res Eval. 2000; 7 (1):1–16. doi: 10.1362/146934703771910080. [ CrossRef ] [ Google Scholar ]
  • Sutoyo E, Rifai AP, Risnumawan A, Saputra M. A comparison of text weighting schemes on sentiment analysis of government policies: a case study of replacement of national examinations. Multimed Tools Appl. 2022; 81 (5):6413–6431. doi: 10.1007/s11042-022-11900-9. [ CrossRef ] [ Google Scholar ]
  • Syed AZ, Aslam M, Martinez-Enriquez AM (2010) Lexicon Based Sentiment Analysis of Urdu Text Using SentiUnits. In: Mexican international conference on artificial intelligence, Springer, Berlin, Heidelberg, pp 32–43. 10.1007/978-3-642-16761-4_4
  • Taboada M. Sentiment analysis: an overview from linguistics. Annu Rev Linguist. 2016; 2 :325–347. doi: 10.1146/annurev-linguistics-011415-040518. [ CrossRef ] [ Google Scholar ]
  • Tai KS, Socher R, Manning CD (2015) Improved Semantic Representations from Tree-Structured Long Short-Term Memory Networks. In: Proceedings of the 53rd Annual meeting of the association for computational linguistics and the 7th international joint conference on natural language processing, association for computational linguistics, pp 1556–1566. 10.3115/v1/p15-1150
  • Tammina S (2020) A Hybrid Learning Approach for Sentiment Classification in Telugu Language. In: 2020 International conference on Artificial Intelligence and Signal Processing (AISP), IEEE, p 1–6. 10.1109/AISP48273.2020.9073109
  • Tan S, Cheng X, Wang Y, Xu H (2009) Adapting Naive Bayes to Domain Adaptation for Sentiment Analysis. In: European Conference on Information Retrieval, Springer, Berlin, Heidelberg, p 337–349. 10.1007/978-3-642-00958-7_31
  • Tan X, Cai Y, Xu J, Leung H-F, Chen W, Li Q. Improving aspect-based sentiment analysis via aligning aspect embedding. Neurocomputing. 2020; 383 :336–347. doi: 10.1016/j.neucom.2019.12.035. [ CrossRef ] [ Google Scholar ]
  • Tembhurne JV, Diwan T. Sentiment analysis in textual, visual and multimodal inputs using recurrent neural networks. Multimed Tools Appl. 2021; 80 (5):6871–6910. doi: 10.1007/s11042-020-10037-x. [ CrossRef ] [ Google Scholar ]
  • Thakur RK, Deshpande MV. Kernel optimized-support vector machine and mapreduce framework for sentiment classification of train reviews. Int J Uncertain Fuzziness Knowl Based Syst. 2019; 27 (6):1025–1050. doi: 10.1142/S0218488519500454. [ CrossRef ] [ Google Scholar ]
  • Thelwall M, Buckley K, Paltoglou G. Sentiment strength detection for the social web. J Am Soc Inform Sci Technol. 2012; 63 (1):163–173. doi: 10.1002/asi.21662. [ CrossRef ] [ Google Scholar ]
  • Thet TT, Na JC, Khoo CSG. Aspect-based sentiment analysis of movie reviews on discussion boards. J Inf Sci. 2010; 36 (6):823–848. doi: 10.1177/0165551510388123. [ CrossRef ] [ Google Scholar ]
  • Trilla A, Alías F (2009) Sentiment Classification in English from Sentence-Level Annotations of Emotions Regarding Models of Affect. In: 10th Annual Conference of the International Speech Communication Association, International Speech Communication Association (ISCA), p 516–519. 10.21437/interspeech.2009-189
  • Trisna KW, Jie HJ. Deep learning approach for aspect-based sentiment classification: a comparative review. Appl Artif Intell. 2022 doi: 10.1080/08839514.2021.2014186. [ CrossRef ] [ Google Scholar ]
  • Valverde-Albacete FJ, Carrillo-de-Albornoz J, Peláez-Moreno C (2013) A Proposal for New Evaluation Metrics and Result Visualization Technique for Sentiment Analysis Tasks. In: International conference of the cross-language evaluation forum for European languages, Springer, Berlin, Heidelberg, p 41–52. 10.1007/978-3-642-40802-1_5
  • Van Eck NJ, Waltman L. Software survey: VOSviewer, a computer program for bibliometric mapping. Scientometrics. 2010; 84 (2):523–538. doi: 10.1007/s11192-009-0146-3. [ PMC free article ] [ PubMed ] [ CrossRef ] [ Google Scholar ]
  • Verma S. Sentiment analysis of public services for smart society: literature review and future research directions. Gov Inf Quart. 2022; 39 (3):101708. doi: 10.1016/j.giq.2022.101708. [ CrossRef ] [ Google Scholar ]
  • Waila P, Marisha S, Singh VK, Singh MK (2012) Evaluating Machine Learning and Unsupervised Semantic Orientation Approaches for Sentiment Analysis of Textual Reviews. In: 2012 IEEE International conference on computational intelligence and computing research, IEEE, pp 1–6. 10.1109/ICCIC.2012.6510235
  • Waltman L, Van Eck NJ. A smart local moving algorithm for large-scale modularity-based community detection. Eur Phys J B. 2013; 86 (11):1–33. doi: 10.1140/epjb/e2013-40829-0. [ CrossRef ] [ Google Scholar ]
  • Waltman L, Van Eck NJ, Noyons ECM. A unified approach to mapping and clustering of bibliometric networks. J Inform. 2010; 4 (4):629–635. doi: 10.1016/j.joi.2010.07.002. [ CrossRef ] [ Google Scholar ]
  • Wang C, Yang X, Ding L. Deep learning sentiment classification based on weak tagging information. IEEE Access. 2021; 9 :66509–66518. doi: 10.1109/ACCESS.2021.3077059. [ CrossRef ] [ Google Scholar ]
  • Wang L, Wan Y (2011) Sentiment Classification of Documents Based on Latent Semantic Analysis. In: International conference on computer education, simulation and modeling, Springer, Berlin, Heidelberg, p 356–361. 10.1007/978-3-642-21802-6_57
  • Wang T, Lu K, Chow KP, Zhu Q. COVID-19 sensing: negative sentiment analysis on social media in China via BERT model. IEEE Access. 2020; 8 :138162–138169. doi: 10.1109/ACCESS.2020.3012595. [ PMC free article ] [ PubMed ] [ CrossRef ] [ Google Scholar ]
  • Wang Z, Chong CS, Lan L, Yang Y, Ho S-B, Tong JC (2016) Fine-Grained Sentiment Analysis of Social Media with Emotion Sensing. In: 2016 Future Technologies Conference (FTC), IEEE, pp 1361–1364. 10.1109/FTC.2016.7821783
  • Wang Z, Ho S-B, Cambria E. A review of emotion sensing: categorization models and algorithms. Multimed Tools Appl. 2020; 79 (47):35553–35582. doi: 10.1007/s11042-019-08328-z. [ CrossRef ] [ Google Scholar ]
  • Wang Z, Ho S-B, Cambria E. Multi-level fine-scaled sentiment sensing with ambivalence handling. Int J Uncertain Fuzziness Knowl-Based Syst. 2020; 28 (4):683–697. doi: 10.1142/S0218488520500294. [ CrossRef ] [ Google Scholar ]
  • Wang Z, Lin Z. Optimal feature selection for learning-based algorithms for sentiment classification. Cogn Comput. 2020; 12 (1):238–248. doi: 10.1007/s12559-019-09669-5. [ CrossRef ] [ Google Scholar ]
  • Wang Z, Tong VJC, Chan D (2014) Issues of Social Data Analytics with a New Method for Sentiment Analysis of Social Media Data. In: 2014 IEEE 6th International conference on cloud computing technology and science, IEEE, pp 899–904. 10.1109/CloudCom.2014.40
  • Wang ZY, Li G, Li CY, Li A. Research on the semantic-based co-word analysis. Scientometrics. 2012; 90 (3):855–875. doi: 10.1007/s11192-011-0563-y. [ CrossRef ] [ Google Scholar ]
  • Wankhade M, Rao ACS, Kulkarni C. A survey on sentiment analysis methods, applications, and challenges. Artif Intell Rev. 2022; 55 :5731–5780. doi: 10.1007/s10462-022-10144-1. [ CrossRef ] [ Google Scholar ]
  • Xing FZ, Cambria E, Welsch RE. Natural language based financial forecasting: a survey. Artif Intell Rev. 2018; 50 (1):49–73. doi: 10.1007/s10462-017-9588-9. [ CrossRef ] [ Google Scholar ]
  • Xing FZ, Pallucchini F, Cambria E. Cognitive-inspired domain adaptation of sentiment lexicons. Inf Process Manage. 2019; 56 (3):554–564. doi: 10.1016/j.ipm.2018.11.002. [ CrossRef ] [ Google Scholar ]
  • Xiong Z, Qin K, Yang H, Luo G. Learning Chinese word representation better by cascade morphological N-Gram. Neural Comput Appl. 2021; 33 (8):3757–3768. doi: 10.1007/s00521-020-05198-7. [ CrossRef ] [ Google Scholar ]
  • Yang B, Shao B, Wu L, Lin X. Multimodal sentiment analysis with unidirectional modality translation. Neurocomputing. 2022; 467 :130–137. doi: 10.1016/j.neucom.2021.09.041. [ CrossRef ] [ Google Scholar ]
  • Yang L, Li Y, Wang J, Sherratt RS. Sentiment analysis for E-commerce product reviews in Chinese based on sentiment lexicon and deep learning. IEEE Access. 2020; 8 :23522–23530. doi: 10.1109/ACCESS.2020.2969854. [ CrossRef ] [ Google Scholar ]
  • Yang M, Qu Q, Shen Y, Lei K, Zhu J. Cross-domain aspect/sentiment-aware abstractive review summarization by combining topic modeling and deep reinforcement learning. Neural Comput Appl. 2020; 32 (11):6421–6433. doi: 10.1007/s00521-018-3825-2. [ CrossRef ] [ Google Scholar ]
  • Yi J, Niblack W (2005) Sentiment Mining in WebFountain. In: 21st International Conference on Data Engineering (ICDE’05), IEEE, p 1073–1083. 10.1109/ICDE.2005.132
  • Yin H, Yang S, Li J (2020) Detecting Topic and Sentiment Dynamics Due to COVID-19 Pandemic Using Social Media. In: International conference on advanced data mining and applications, Springer, Cham, p 610–623. 10.1007/978-3-030-65390-3_46
  • You L, Li Y, Wang Y, Zhang J, Yang Y (2016) A deep learning-based RNNs model for automatic security audit of short messages. In: 2016 16th International Symposium on Communications and Information Technologies (ISCIT), IEEE, p 225–229. 10.1109/ISCIT.2016.7751626
  • You T, Yoon J, Kwon O-H, Jung W-S. Tracing the evolution of physics with a keyword co-occurrence network. J Korean Phys Soc. 2021; 78 (3):236–243. doi: 10.1007/s40042-020-00051-5. [ CrossRef ] [ Google Scholar ]
  • Yu J, Jiang J, Xia R. Entity-sensitive attention and fusion network for entity-level multimodal sentiment classification. IEEE/ACM Trans Audio Speech Lang Process. 2019; 28 :429–439. doi: 10.1109/TASLP.2019.2957872. [ CrossRef ] [ Google Scholar ]
  • Yuan JH, Wu Y, Lu X, Zhao YY, Qin B, Liu T. Recent advances in deep learning based sentiment analysis. Sci China Technol Sci. 2020; 63 (10):1947–1970. doi: 10.1007/s11431-020-1634-3. [ CrossRef ] [ Google Scholar ]
  • Yue L, Chen W, Li X, Zuo W, Yin M. A survey of sentiment analysis in social media. Knowl Inf Syst. 2019; 60 (2):617–663. doi: 10.1007/s10115-018-1236-4. [ CrossRef ] [ Google Scholar ]
  • Yurtalan G, Koyuncu M, Turhan Ç. A polarity calculation approach for lexicon-based Turkish sentiment analysis. Turk J Electr Eng Comput Sci. 2019; 27 (2):1325–1339. doi: 10.3906/elk-1803-92. [ CrossRef ] [ Google Scholar ]
  • Zhang L, Wang S, Liu B. Deep learning for sentiment analysis: a survey. Wiley Interdiscip Rev. 2018; 8 (4):e1253. doi: 10.1002/widm.1253. [ CrossRef ] [ Google Scholar ]
  • Zhang Yin, Du J, Ma X, Wen H, Fortino G. Aspect-based sentiment analysis for user reviews. Cogn Comput. 2021; 13 (5):1114–1127. doi: 10.1007/s12559-021-09855-4. [ CrossRef ] [ Google Scholar ]
  • Zhang Y, Zhang Z, Miao D, Wang J. Three-way enhanced convolutional neural networks for sentence-level sentiment classification. Inf Sci. 2019; 477 :55–64. doi: 10.1016/j.ins.2018.10.030. [ CrossRef ] [ Google Scholar ]
  • Zhao N, Gao H, Wen X, Li H. Combination of convolutional neural network and gated recurrent unit for aspect-based sentiment analysis. IEEE Access. 2021; 9 :15561–15569. doi: 10.1109/ACCESS.2021.3052937. [ CrossRef ] [ Google Scholar ]
  • Zhou J, Ye J. Sentiment analysis in education research: a review of journal publications. Interact Learn Environ. 2020 doi: 10.1080/10494820.2020.1826985. [ CrossRef ] [ Google Scholar ]
  • Zucco C, Calabrese B, Agapito G, Guzzi PH, Cannataro M. Sentiment analysis for mining texts and social networks data: methods and tools. Wiley Interdiscip Rev. 2020; 10 (1):e1333. doi: 10.1002/widm.1333. [ CrossRef ] [ Google Scholar ]
  • Zunic A, Corcoran P, Spasic I. Sentiment analysis in health and well-being: systematic review. JMIR Med Inform. 2020; 8 (1):e16023. doi: 10.2196/16023. [ PMC free article ] [ PubMed ] [ CrossRef ] [ Google Scholar ]
  • Zuo E, Zhao H, Chen B, Chen Q. Context-specific heterogeneous graph convolutional network for implicit sentiment analysis. IEEE Access. 2020; 8 :37967–37975. doi: 10.1109/ACCESS.2020.2975244. [ CrossRef ] [ Google Scholar ]
