Filter by type:

Sort by year:
Elsevier NLP '23

Leveraging Contextual Representations with BiLSTM-based Regressor for Lexical Complexity Prediction.

Abdul Aziz*, Md. Akram Hossain*, Abu Nowshed Chy, Md Zia Ullah, and Masaki Aono.
Journal PaperElsevier Natural Language Processing Journal, Vol. 05, No. 100039, 2023.

Abstract

Lexical complexity prediction (LCP) determines the complexity level of words or phrases in a sentence. LCP has a significant impact on the enhancement of language translations, readability assessment, and text generation. However, the domain-specific technical word, the complex grammatical structure, the polysemy problem, the inter-word relationship, and dependencies make it challenging to determine the complexity of words or phrases. In this paper, we propose an integrated transformer regressor model named ITRM-LCP to estimate the lexical complexity of words and phrases where diverse contextual features are extracted from various transformer models. The transformer models are fine-tuned using the text-pair data. Then, a bidirectional LSTM-based regressor module is plugged on top of each transformer to learn the long-term dependencies and estimate the complexity scores. The predicted scores of each module are then aggregated to determine the final complexity score. We assess our proposed model using two benchmark datasets from shared tasks. Experimental findings demonstrate that our ITRM-LCP model obtains 10.2% and 8.2% improvement on the news and Wikipedia corpus of the CWI-2018 dataset, compared to the top-performing systems (DAT, CAMB, and TMU). Additionally, our ITRM-LCP model surpasses state-of-the-art LCP systems (DeepBlueAI, JUST-BLUE) by 1.5% and 1.34% for single and multi-word LCP tasks defined in the SemEval LCP-2021 task.

Neurocomputing '25

MMTF-DES: A Fusion of Multimodal Transformer Models for Desire, Emotion, and Sentiment Analysis of Social Media Data [Undergrad Thesis Journal: Accepted]

Abdul Aziz, Nihad Karim Chowdhury, Muhammad Ashad Kabir, Abu Nowshed Chy, and Md. Jawad Siddique
Journal PaperElsevier Neurocomputing Journal, Vol. -, No. -, pp. -, 2025.

Abstract

Desire is a set of human aspirations and wishes that comprise verbal and cognitive aspects that drive human feelings and behaviors, distinguishing humans from other animals. Understanding human desire has the potential to be one of the most fascinating and challenging research domains. It is tightly coupled with sentiment analysis and emotion recognition tasks. It is beneficial for increasing human-computer interactions, recognizing human emotional intelligence, understanding interpersonal relationships, and making decisions. However, understanding human desire is challenging and under-explored because ways of eliciting desire might be different among humans. The task gets more difficult due to the diverse cultures, countries, and languages. Prior studies overlooked the use of image-text pairwise feature representation, which is crucial for the task of human desire understanding. In this research, we have proposed a unified multimodal transformer-based framework with image-text pair settings to identify human desire, sentiment, and emotion. The core of our proposed method lies in the encoder module, which is built using two state-of-the-art multimodal transformer models. These models allow us to extract diverse features. To effectively extract visual and contextualized embedding features from social media image and text pairs, we conducted joint fine-tuning of two pre-trained multimodal transformer models: Vision-and-Language Transformer (ViLT) and Vision-and-Augmented-Language Transformer (VAuLT). Subsequently, we use an early fusion strategy on these embedding features to obtain combined diverse feature representations of the image-text pair. This consolidation incorporates diverse information about this task, enabling us to robustly perceive the context and image pair from multiple perspectives. Moreover, we leverage a multi-sample dropout mechanism to enhance the generalization ability and expedite the training process of our proposed method. To evaluate our proposed approach, we used the multimodal dataset MSED for the human desire understanding task. Through our experimental evaluation, we demonstrate that our method excels in capturing both visual and contextual information, resulting in superior performance compared to other state-of-the-art techniques. Specifically, our method outperforms existing approaches by 3% for sentiment analysis, 2.2% for emotion analysis, and approximately 1% for desire analysis.

ACL '25 Long

Multitask Contrastive Learning for Multimodal Complaint Detection: A Sentiment and Emotion-Driven Approach [Submitted]

Abdul Aziz, Md. Akram Hossain, Abu Nowshed Chy, and, Md Zia Ullah.
Conference PaperSubmitted to ACL Rolling Review, 2025.

Abstract

ACL '25 Short

Can Contrastive Learning Enhance Code-mixed Text Understanding? A study on Multitask Sentiment and Emotion Analysis [Submitted]

Abdul Aziz, Md. Akram Hossain, and Abu Nowshed Chy.
Conference PaperSubmitted to ACL Rolling Review, 2025.

Abstract

ACL '25 Long

BiGCAT: A Hybrid Graph Neural Network-Based Approach to Named Entity Recognition for Financial Text [Submitted]

Md. Akram Hossain, Abdul Aziz, Muhammad Anwarul Azim, Abu Nowshed Chy, Md Zia Ullah, and Mohammad Khairul Islam
Conference PaperSubmitted to ACL Rolling Review '25

Abstract

LREC-COLING '24 Long

BengaliLCP: A Dataset for Lexical Complexity Prediction in the Bengali Texts.

Nabila Ayman, Md. Akram Hossain*, Abdul Aziz*, Rokan Uddin Faruqui, and Abu Nowshed Chy.
Conference PaperThe 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING '24), 2024.

Abstract

Encountering intricate or ambiguous terms within a sentence produces distress for the reader during comprehension. Lexical Complexity Prediction (LCP) deals with predicting the complexity score of a word or a phrase considering its context. This task poses a number of challenges including ambiguity, context sensitivity, and subjectivity in perceiving complexity. Despite having 300 million native speakers and ranking as the seventh most spoken language in the world, Bengali falls behind in the research on lexical complexity when compared to other languages. To bridge this gap, we introduce the first annotated Bengali dataset, that assists in performing the task of Lexical Complexity Prediction in this language. Besides, we propose a transformer-based deep neural approach with a pairwise multi-head attention mechanism and LSTM model to predict the lexical complexity of tokens. The outcomes demonstrate that the proposed neural approach outperformed the existing state-of-the-art models for the Bengali language.

CheckThat!@CLEF '23

CSECU-DSG at CheckThat! 2023: Transformer-based Fusion Approach for Multimodal and Multigenre Check-Worthiness. [1st place]

Abdul Aziz, Md. Akram Hossain, and Abu Nowshed Chy.
Workshop PaperCLEF workshop on Check-Worthiness, Subjectivity, Political Bias, Factuality, and Authority of News Articles and Their Sources (CheckThat!@CLEF), 2023.

Abstract

Check-worthiness is identifying verifiable factual claims present or not in content. It might be beneficial to automatically verify the political discourses, social media posts, and newspaper content. However, the multifaceted nature and hidden meaning of the content make it difficult to automatically identify the factual claims. To address these challenges, CheckThat! 2023 introduced a task to build automatic Check-worthiness classifiers in tweets with multimodal and multigenre settings. This paper presented our participation in CheckThat! 2023 Task 1. We perform fine-tuning on language-specific and vision pretrained transformer models to extract the visual-contextualized or contextualized features representation for the multimodal and multigenre check-worthiness task. We add a BiLSTM layer on top of the contextual features and concatenate it with the other visual or contextualized features to get an enrich unified representation. Later, we employ a multi-sample dropout strategy to predict a more accurate class label. Experimental results show that our proposed method achieved competitive performance among the participants and obtained 1st place in the multimodal Arabic check-worthiness task.

CASE@RANLP '23

CSECU-DSG@ Multimodal Hate Speech Event Detection 2023: Transformer-based Multimodal Hierarchical Fusion Model For Multimodal Hate Speech Detection. [5th place]

Abdul Aziz, Md. Akram Hossain, and Abu Nowshed Chy.
Workshop PaperChallenges and Applications of Automated Extraction of Socio-political Events from Text (CASE), CASE@RANLP 2023.

Abstract

The emergence of social media and ecommerce platforms enabled the perpetrator to spread negativity and abuse individuals or organisations worldwide rapidly. It is critical to detect hate speech in both visual and textual content so that it may be moderated or excluded from online platforms to keep it sound and safe for users. However, multimodal hate speech detection is a complex and challenging task as people sarcastically present hate speech and different modalities i.e., image and text are involved in their content. This paper describes our participation in the CASE 2023 multimodal hate speech event detection task. In this task, the objective is to automatically detect hate speech and its target from the given textembedded image. We proposed a transformerbased multimodal hierarchical fusion model to detect hate speech present in the visual content. We jointly fine-tune a language and a vision pre-trained transformer models to extract the visual-contextualized features representation of the text-embedded image. We concatenate these features and fed them to the multisample dropout strategy. Moreover, the contextual feature vector is fed into the BiLSTM module and the output of the BiLSTM module also passes into the multi-sample dropout. We employed arithmetic mean fusion to fuse all sample dropout outputs that predict the final label of our proposed method. Experimental results demonstrate that our proposed method obtains competitive performance and ranked 5th among the participants.

CASE@RANLP '23

CSECU-DSG @ Causal News Corpus 2023: Leveraging RoBERTa and DeBERTa Transformer Model with Contrastive Learning for Causal Event Classification. [4th place]

Md. Akram Hossain, Abdul Aziz, and Abu Nowshed Chy.
Workshop PaperChallenges and Applications of Automated Extraction of Socio-political Events from Text (CASE), CASE@RANLP 2023.

Abstract

Cause-effect relationships play a crucial role in human cognition, and distilling cause-effect relations from text helps in ameliorating causal networks for predictive tasks including natural language-based financial forecasting, text summarization, and question-answering. However, the lack of syntactic clues, the ambivalent semantic meaning of words, and complex sentence structures make it one of the challenging tasks in NLP. To address these challenges, CASE-2023 introduced a shared task 3 with two subtasks focusing on event causality identification with causal news corpus. In this paper, we demonstrate our participant systems for this task. We leverage two transformers models including DeBERTa and Twitter-RoBERTa along with the weighted average fusion technique to tackle the challenges of subtask 1 where we need to identify whether a text belongs to either causal or not. For subtask 2 where we need to identify the cause, effect, and signal tokens from the text, we proposed a unified neural network of DeBERTa and DistilRoBERTa transformer variants with contrastive learning techniques. The experimental results showed that our proposed method achieved competitive performance among the participants’ systems and achieved 4th and 3rd rank in subtasks 1 and 2 respectively.

SemEval@ACL '23

CSECU-DSG at SemEval-2023 Task 4: Fine-tuning DeBERTa Transformer Model with Cross-fold Training and Multi-sample Dropout for Human Values Identification.

Abdul Aziz*, Md. Akram Hossain*, and Abu Nowshed Chy.
Workshop PaperSemEval workshop on Multi-lingual Human Value Detection in Texts (ValueEval), SemEval@ACL 2023.

Abstract

Human values identification from a set of argument is becoming a prominent area of research in argument mining. Among some options, values convey what may be the most desirable and widely accepted answer. The diversity of human beliefs, random texture and implicit meaning within the arguments makes it more difficult to identify human values from the arguments. To address these challenges, SemEval-2023 Task 4 introduced a shared task ValueEval focusing on identifying human values categories based on given arguments. This paper presents our participation in this task where we propose a finetuned DeBERTa transformers-based classification approach to identify the desire human value category. We utilize different training strategy with the finetuned DeBERTa model to enhance contextual representation on this downstream task. Our proposed method achieved competitive performance among the participants’ methods.

ALTA '22

Enhancing the DeBERTa Transformers Model for Classifying Sentences from Biomedical Abstracts. [2nd place]

Abdul Aziz*, Md. Akram Hossain*, and Abu Nowshed Chy.
Workshop PaperALTA workshop on automatic sentence classifiers that can map the content of biomedical abstracts into a set of pre-defined categories, which are used for Evidence-Based Medicine (EBM), ALTA 2022.

Abstract

Evidence-based medicine (EBM) is defined as making clinical decisions about individual patients based on the best available evidence. It is beneficial for making better clinical decisions, caring for patients and providing information about the therapy, prognosis, diagnosis, and other health care and clinical issues. However, it is a challenging task to build an automatic sentence classifier for EBM owing to a lack of clinical context, uncertainty in medical knowledge, difficulty in finding the best evidence, and domain-specific words in the abstract of medical articles. To address these challenges, ALTA 2022 introduced a task to build automatic sentence classifiers for EBM that can map the content of biomedical abstracts into a set of pre-defined categories. This paper presents our participation in this task where we propose a transformers based classification approach to identify the category of the content from biomedical abstracts. We perform fine-tuning on DeBERTa pre-trained transformers model to extract the contextualized features representation. Later, we employ a multi-sample dropout strategy and 5-fold cross fold training to predict more accurate class label. Experimental results show that our proposed method achieved the competitive performance among the participants.

CASE@EMNLP '22

CSECU-DSG @ Causal News Corpus 2022: Fusion of RoBERTa Transformers Variants for Causal Event Classification. [1st place]

Abdul Aziz*, Md. Akram Hossain*, and Abu Nowshed Chy.
Workshop PaperCASE@EMNLP | Challenges and Applications of Automated Extraction of Socio-political Events from Text, CASE@EMNLP 2022.

Abstract

Identifying cause-effect relationships in sentences is one of the formidable tasks to tackle the challenges of inference and understanding of natural language. However, the diversity of word semantics and sentence structure makes it challenging to determine the causal relationship effectively. To address these challenges, CASE-2022 shared task 3 introduced a task focusing on event causality identification with causal news corpus. This paper presents our participation in this task, especially in subtask 1 which is the causal event classification task. To tackle the task challenge, we propose a unified neural model through exploiting two fine-tuned transformer models including RoBERTa and Twitter-RoBERTa. For the score fusion, we combine the prediction scores of each component model using weighted arithmetic mean to generate the probability score for class label identification. The experimental results showed that our proposed method achieved the top performance (ranked 1st) among the participants.

SemEval@NAACL '22

CSECU-DSG at SemEval-2022 Task 3: Investigating the Taxonomic Relationship Between Two Arguments using Fusion of Multilingual Transformer Models. [4th place]

Abdul Aziz*, Md. Akram Hossain*, and Abu Nowshed Chy.
Workshop PaperSemEval 2022 Task 3: PreTENS | identifying presupposed taxonomies, SemEval@NAACL-2022

Abstract

Recognizing lexical relationships between words is one of the formidable tasks in computational linguistics. It plays a vital role in the improvement of various NLP tasks. However, the diversity of word semantics, sentence structure as well as word order information make it challenging to distill the relationship effectively. To address these challenges, SemEval-2022 Task 3 introduced a shared task PreTENS focusing on semantic competence to determine the taxonomic relations between two nominal arguments. This paper presents our participation in this task where we proposed an approach through exploiting an ensemble of multilingual transformer methods. We employed two fine-tuned multilingual transformer models including XLM-RoBERTa and mBERT to train our model. To enhance the performance of individual models, we fuse the predicted probability score of these two models using weighted arithmetic mean to generate a unified probability score. The experimental results showed that our proposed method achieved competitive performance among the participants’ methods.

SemEval@NAACL '22

CSECU-DSG at SemEval-2022 Task 11: Identifying the Multilingual Complex Named Entity in Text Using Stacked Embeddings and Transformer based Approach.

Abdul Aziz*, Md. Akram Hossain*, and Abu Nowshed Chy.
Workshop PaperSemEval 2022 Task 11: MultiCoNER | Multilingual Complex Named Entity Recognition, SemEval@NAACL-2022

Abstract

Recognizing complex and ambiguous named entities (NEs) is one of the formidable tasks in the NLP domain. However, the diversity of linguistic constituents, syntactic structure, semantic ambiguity as well as differences from traditional NEs make it challenging to identify the complex NEs. To address these challenges, SemEval-2022 Task 11 introduced a shared task MultiCoNER focusing on complex named entity recognition in multilingual settings. This paper presents our participation in this task where we propose two different approaches including a BiLSTM-CRF model with stacked-embedding strategy and a transformer-based approach. Our proposed method achieved competitive performance among the participants’ methods in a few languages.

SemEval@ACL-IJCNLP '21

CSECU-DSG at SemEval-2021 Task 1: Fusion of Transformer Models for Lexical Complexity Prediction.

Abdul Aziz*, Md. Akram Hossain*, and Abu Nowshed Chy.
Workshop PaperSemEval 2021 Task 1: LCP | Lexical Complexity Prediction, SemEval@ACL-IJCNLP-2021

Abstract

Lexical complexity prediction (LCP) conveys the anticipation of the complexity level of a token or a set of tokens in a sentence. It plays a vital role in the improvement of various NLP tasks including lexical simplification, translations, and text generation. However, multiple meaning of a word in multiple circumstances, grammatical complex structure, and the mutual dependency of words in a sentence make it difficult to estimate the lexical complexity. To address these challenges, SemEval-2021 Task 1 introduced a shared task focusing on LCP and this paper presents our participation in this task. We proposed a transformer-based approach with sentence pair regression. We employed two fine-tuned transformer models. Including BERT and RoBERTa to train our model and fuse their predicted score to the complexity estimation. Experimental results demonstrate that our proposed method achieved competitive performance compared to the participants’ systems.

CEREX@FIRE '20

Feature Fusion with Hand-crafted and Transfer Learning Embeddings for Cause-Effect Relation Extraction. [1st place]

Abdul Aziz*, Afrin Sultana*, Md. Akram Hossain*, Nabila Ayman*, and Abu Nowshed Chy.
Workshop Paper Workshop on Cause Effect Relation Extraction, CEREX@FIRE2020.

Abstract

Cause-effect relation extraction is the problem of detecting causal relations expressed in a text. The extraction of causal-relations from texts might be beneficial for the improvement of various natural language processing (NLP) tasks including Q/A, text-summarization, opinion mining, and event analysis. However, cause-effect relation in the text is sparse, ambiguous, sometimes implicit, and has a linguistically complex construct. To address these challenges FIRE-2020 introduced a shared task focusing on cause-effect relation extraction (CEREX). We propose a feature based supervised classification model with a naive rule-based classifier. We define a set of rules based on a causal connective dictionary and stop-words. Besides, we use a fusion of hand-crafted features and transfer learning embeddings to train our SVM based supervised classification model. Experimental results exhibit that our proposed method achieved the topnotch performance for cause-effect relation extraction and causal word annotation.

Multimodal Desire Understanding on Social Media Data

Abdul Aziz and Nihad Karim Chowdhury
PosterB.Sc. Engineering thesis predefence poster presentation in Department of CSE, University of Chittagong, Bangladesh, 2023.