Emotion Classification Dataset

The emotion dataset is curated from the work Bangla Text Classification using Transformers. Experimental scripts for that paper can be found at https://github.com/xashru/bangla-text-classification. The original dataset is hosted on Kaggle: https://www.kaggle.com/nit003/bangla-youtube-sentiment-and-emotion-datasets.

Dataset

The dataset contains 2,890 YouTube comments in Bangla, English, and romanized Bangla. The annotation consists of five emotion labels: (i) anger/disgust, (ii) joy, (iii) sadness, (iv) fear/surprise, and (v) none.
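
As a minimal sketch of how the five labels might be handled in a classification pipeline, the snippet below maps each label to an integer id and counts the label distribution. The label strings, the assumption of one label per comment, and the helper names `encode_labels` and `label_distribution` are illustrative assumptions; the spellings used in the data files may differ and should be checked before use.

```python
from collections import Counter

# Assumed label strings, taken from the description above.
# The exact spellings in the data files may differ.
EMOTION_LABELS = ["anger/disgust", "joy", "sadness", "fear/surprise", "none"]
LABEL_TO_ID = {label: idx for idx, label in enumerate(EMOTION_LABELS)}


def encode_labels(labels):
    """Map label strings to integer ids, rejecting anything outside the five classes."""
    unknown = sorted(set(labels) - set(LABEL_TO_ID))
    if unknown:
        raise ValueError(f"Unexpected labels: {unknown}")
    return [LABEL_TO_ID[label] for label in labels]


def label_distribution(labels):
    """Count examples per label, reporting zero for labels that never occur."""
    counts = Counter(labels)
    return {label: counts.get(label, 0) for label in EMOTION_LABELS}
```

For example, `encode_labels(["joy", "none"])` yields `[1, 4]` under this ordering. Checking `label_distribution` on each split is a quick way to spot class imbalance before training.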

Directory Structure:

Licensing

The original dataset does not include any license information. The data split version is licensed under the MIT License.

Citation

Please cite the following papers if you are using the data:

@article{alam2021review,
  title={A Review of Bangla Natural Language Processing Tasks and the Utility of Transformer Models},
  author={Alam, Firoj and Hasan, Md Arid and Alam, Tanvir and Khan, Akib and Tajrin, Janntatul and Khan, Naira and Chowdhury, Shammur Absar},
  journal={arXiv preprint arXiv:2107.03844},
  year={2021}
}
@article{alam2020bangla,
  title={Bangla Text Classification using Transformers},
  author={Alam, Tanvirul and Khan, Akib and Alam, Firoj},
  journal={arXiv preprint arXiv:2011.04446},
  year={2020}
}

@inproceedings{tripto2018detecting,
  title={Detecting multilabel sentiment and emotions from {B}angla youtube comments},
  author={Tripto, Nafis Irtiza and Ali, Mohammed Eunus},
  booktitle={2018 International Conference on Bangla Speech and Language Processing (ICBSLP)},
  pages={1--6},
  year={2018},
  organization={IEEE}
}