Sentiment Analysis: Transformer Encoder, Neural Network Classifier, and Rich Feature Representation
Author: Gebriel Gidey
Faculty Supervisor: Matt Pico
Department: Computer Science
Sentiment analysis has become a crucial task in understanding customer opinions and preferences, particularly in the domain of product reviews. This project presents a novel approach to sentiment analysis by leveraging a custom-built Transformer encoder and a neural network classifier. The proposed method aims to capture the rich semantic information and contextual dependencies within the text data to improve sentiment classification accuracy.
The project preprocesses the product review dataset by tokenizing the text and mapping each token to a dense vector representation using word embeddings. Positional encoding is then applied so that the model has access to the sequential order of the tokens. The custom-built Transformer encoder uses self-attention to capture the relationships and dependencies between tokens, generating a rich feature representation. This representation is then fed into a fully connected feed-forward network, which serves as the sentiment classifier.
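The pipeline described above can be sketched in a few lines of plain Python. This is a minimal illustration under stated assumptions, not the project's implementation: the sinusoidal form of the positional encoding, the single attention head with no learned projections, the mean pooling step, and the names `vocab`, `embedding_table`, `positional_encoding`, and `self_attention` are all hypothetical choices made for the example, and the embedding values are toy numbers.

```python
import math

def positional_encoding(seq_len, d_model):
    """Sinusoidal position vectors (assumed variant): sin on even dims, cos on odd dims."""
    pe = [[0.0] * d_model for _ in range(seq_len)]
    for pos in range(seq_len):
        for i in range(0, d_model, 2):
            angle = pos / (10000 ** (i / d_model))
            pe[pos][i] = math.sin(angle)
            if i + 1 < d_model:
                pe[pos][i + 1] = math.cos(angle)
    return pe

def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def self_attention(x):
    """Scaled dot-product self-attention with Q = K = V = x.

    A real encoder would apply learned projection matrices first;
    they are omitted here to keep the sketch short.
    """
    d = len(x[0])
    weights = []
    for q in x:
        scores = [sum(a * b for a, b in zip(q, k)) / math.sqrt(d) for k in x]
        weights.append(softmax(scores))
    out = [[sum(w * v[j] for w, v in zip(row, x)) for j in range(d)]
           for row in weights]
    return out, weights

# Toy vocabulary and embedding table (hypothetical values, d_model = 4).
vocab = {"great": 0, "phone": 1, "bad": 2}
embedding_table = [
    [0.1, 0.3, -0.2, 0.5],
    [0.0, -0.1, 0.4, 0.2],
    [-0.3, 0.2, 0.1, -0.4],
]

tokens = ["great", "phone"]                            # tokenization
embedded = [embedding_table[vocab[t]] for t in tokens]  # embedding lookup
pe = positional_encoding(len(tokens), 4)
x = [[e + p for e, p in zip(e_row, p_row)]              # add position information
     for e_row, p_row in zip(embedded, pe)]

features, attn = self_attention(x)

# Mean-pool token features into one vector for the classifier (assumed pooling).
pooled = [sum(col) / len(features) for col in zip(*features)]
```

In this sketch, `pooled` stands in for the feature representation that the feed-forward classifier would consume, and `attn` holds one row of attention weights per token.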
To gain insight into the Transformer encoder, several techniques are employed. Visualization methods are used to illustrate the attention weights and the relationships captured by the encoder. Ablation studies are conducted to evaluate the encoder's impact on model performance. These efforts aim to provide a deeper understanding of how the Transformer encoder contributes to the sentiment classification task and serve as a stepping stone toward building a robust, general-purpose Transformer encoder.
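As a rough illustration of what inspecting attention weights can look like, the sketch below renders an attention matrix as a text heatmap, with one row per query token and denser characters for larger weights. The function name `render_attention`, the character ramp, and the weight values are hypothetical and chosen only for the example; they are not results from this project, which may well use a graphical plotting tool instead.

```python
def render_attention(tokens, weights, shades=" .:-=+*#%@"):
    """Render an attention matrix as text; denser characters mean larger weights."""
    lines = []
    for tok, row in zip(tokens, weights):
        cells = "".join(
            shades[min(int(w * len(shades)), len(shades) - 1)] for w in row
        )
        lines.append(f"{tok:>8} |{cells}|")
    return "\n".join(lines)

# Hypothetical weights for a two-token input; each row sums to 1.
tokens = ["not", "good"]
weights = [
    [0.25, 0.75],  # how much "not" attends to "not" and to "good"
    [0.50, 0.50],  # how much "good" attends to "not" and to "good"
]

heatmap = render_attention(tokens, weights)
print(heatmap)
```

Each column corresponds to the token being attended to, in the same order as `tokens`.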