Research Output

NgramPOS: A Bigram-based Linguistic and Statistical Feature Process Model for Unstructured Text Classification

  Research in the financial domain has shown that the sentiment of stock news has a profound impact on trading volume, volatility, stock prices and firm earnings. With the ever-growing number of social networking and online marketing sites, the reviews obtained from them act as an important source for further analysis and improved decision making. These reviews are mostly unstructured by nature and thus need processing, such as clustering or classification into polarity categories (for example, positive and negative), in order to extract meaningful information for future use. Accordingly, in this study we investigate the use of Natural Language Processing (NLP) to improve sentiment classification performance, in order to evaluate the information content of financial news as an instrument for use in investment decision systems.
Since the feature extraction approaches proposed to date are based on the occurrence frequency of words, low-frequency linguistic features that could be critical in sentiment classification are typically ignored. In this research, therefore, we attempt to improve current sentiment analysis approaches for financial news classification by taking low-frequency, informative linguistic expressions into consideration. Our proposed combination of low- and high-frequency linguistic expressions contributes a novel set of features for text sentiment analysis and classification. The experimental results show that an optimal Ngram feature selection (a combination of optimal unigram and bigram features) yields higher sentiment classification accuracy than other types of feature sets.
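
To make the frequency issue concrete, the following is a minimal sketch, not the authors' NgramPOS implementation. It counts unigram and bigram features in a toy headline and shows how a plain frequency cut-off would discard a bigram that may carry sentiment. The function names, the `min_count` threshold and the example headline are all assumptions made for illustration.

```python
from collections import Counter


def ngram_features(tokens, n_values=(1, 2)):
    """Count every n-gram of the requested sizes in a token list."""
    counts = Counter()
    for n in n_values:
        for i in range(len(tokens) - n + 1):
            counts[" ".join(tokens[i:i + n])] += 1
    return counts


def split_by_frequency(counts, min_count=2):
    """Separate features into high- and low-frequency groups.

    min_count is a hypothetical threshold, not a value from the paper.
    """
    high = {f: c for f, c in counts.items() if c >= min_count}
    low = {f: c for f, c in counts.items() if c < min_count}
    return high, low


# Toy headline, invented for illustration only.
tokens = "profit rises as profit warning fades".split()
counts = ngram_features(tokens)
high, low = split_by_frequency(counts)

print("high-frequency:", high)
print("low-frequency:", sorted(low))
```

In this toy case only the unigram "profit" survives the cut-off, while the bigram "profit warning" falls into the low-frequency group, which is the kind of feature the abstract argues should not be ignored.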

  • Type:

    Conference Paper

  • Date:

    03 July 2018

  • Publication Status:

    Accepted

  • ISSN:

    1022-0038

  • Library of Congress:

    QA75 Electronic computers. Computer science

  • Dewey Decimal Classification:

    005 Computer programming, programs & data

  • Funders:

    Edinburgh Napier Funded

Citation

Yazdania, S., Tan, Z., Kakavand, M., & Lau, S. (in press). NgramPOS: A Bigram-based Linguistic and Statistical Feature Process Model for Unstructured Text Classification. Wireless Networks.

Keywords

Unstructured Text Classification, Bigram-based Linguistic and Statistical Feature
