Natural Language Processing: Trends and Challenges

A special issue of Applied Sciences (ISSN 2076-3417). This special issue belongs to the section "Computing and Artificial Intelligence".

Deadline for manuscript submissions: closed (20 February 2024) | Viewed by 8578

Special Issue Editors


Dr. Affan Yasin
Guest Editor
School of Software, Northwestern Polytechnical University (NPU), Xi'an, China
Interests: software engineering; empirical software engineering; education; game design; requirements engineering; agent-based modeling; human factors; NLP

Dr. Javed Ali Khan
Guest Editor
Foundation of Software Engineering (FSE) Group, Department of Software Engineering, Faculty of Physics, Engineering, and Computer Science, University of Hertfordshire, Hatfield, UK
Interests: NLP; software engineering; requirements engineering; text mining; opinion and sentiment analysis; argumentation mining; empirical software engineering

Dr. Lijie Wen
Guest Editor
School of Software, Tsinghua University, Beijing, China
Interests: named entity recognition; relation extraction; natural language inference; abstract meaning representation; text to SQL; robustness and watermarking of LLMs

Special Issue Information

Dear Colleagues,

Recently, natural language processing (NLP) has seen pivotal advances that are reshaping various fields and transforming how we communicate and interact with computers through the understanding of human languages and dialects. However, many challenges still need to be addressed to improve performance for users. For example, mining software repositories still poses many open challenges, such as developing efficient techniques to handle and process massive research datasets, including source code, commit histories, and bug reports. Similarly, researchers must develop state-of-the-art approaches to improve the performance of existing supervised and unsupervised learning approaches in classifying, clustering, and summarizing social media content. This Special Issue aims to provide a comprehensive overview of the current trends, emerging technologies, and persistent challenges in NLP. It seeks to highlight cutting-edge developments and address the hurdles the NLP community faces in this dynamic field.

Scope and Topics: This Special Issue will encompass a wide range of topics related to NLP, including but not limited to:

  1. Deep Learning in NLP: Advances in deep learning architectures, such as transformers, and their applications in various NLP tasks.
  2. Multimodal NLP: Integrating text with other modalities like images, audio, and video for more comprehensive language understanding.
  3. Conversational AI: Innovations in chatbots, virtual assistants, and dialogue systems for natural and engaging human–computer interactions.
  4. Cross-lingual and Multilingual NLP: Techniques and resources for NLP tasks across multiple languages and diverse linguistic settings.
  5. Ethical and Fair NLP: Addressing bias, fairness, and ethical concerns in NLP models and applications.
  6. Low-resource NLP: Strategies for NLP tasks in resource-scarce languages and domains.
  7. Semantic Understanding: Techniques for extracting, representing, and reasoning about meaning in natural language text.
  8. NLP for Healthcare: Applications of NLP in medical record analysis, clinical decision support, and biomedical text mining.
  9. NLP for Social Good: Using NLP for societal challenges like disaster response, fake news detection, and mental health support.
  10. Challenges and Benchmarking: Identifying and discussing persistent challenges in NLP and proposing benchmark datasets and evaluation metrics.
  11. Explainability and Interpretability: Methods for making NLP models more transparent and interpretable.
  12. Transfer Learning: Strategies for transferring knowledge from pre-trained models to specific NLP tasks.
  13. NLP for Software Engineering: Approaches to efficiently extract requirements, design, and maintenance-related information by mining software repositories and social media platforms to support software evolution.

Dr. Affan Yasin
Dr. Javed Ali Khan
Dr. Lijie Wen
Guest Editors

Manuscript Submission Information

Manuscripts should be submitted online at www.mdpi.com by registering and logging in to this website. Once you are registered, click here to go to the submission form. Manuscripts can be submitted until the deadline. All submissions that pass pre-check are peer-reviewed. Accepted papers will be published continuously in the journal (as soon as accepted) and will be listed together on the special issue website. Research articles, review articles as well as short communications are invited. For planned papers, a title and short abstract (about 100 words) can be sent to the Editorial Office for announcement on this website.

Submitted manuscripts should not have been published previously, nor be under consideration for publication elsewhere (except conference proceedings papers). All manuscripts are thoroughly refereed through a single-blind peer-review process. A guide for authors and other relevant information for submission of manuscripts is available on the Instructions for Authors page. Applied Sciences is an international peer-reviewed open access semimonthly journal published by MDPI.

Please visit the Instructions for Authors page before submitting a manuscript. The Article Processing Charge (APC) for publication in this open access journal is 2400 CHF (Swiss Francs). Submitted papers should be well formatted and use good English. Authors may use MDPI's English editing service prior to publication or during author revisions.

Published Papers (6 papers)

Research

21 pages, 4151 KiB  
Article
Formation Control of Multiple Autonomous Mobile Robots Using Turkish Natural Language Processing
by Kadir Aram, Gokhan Erdemir and Burhanettin Can
Appl. Sci. 2024, 14(9), 3722; https://0-doi-org.brum.beds.ac.uk/10.3390/app14093722 - 27 Apr 2024
Viewed by 291
Abstract
People use natural language to express their thoughts and wishes. As robots reside in various human environments, such as homes, offices, and hospitals, the need for human–robot communication is increasing. One of the best ways to achieve this communication is the use of natural languages. Natural language processing (NLP) is the most important approach enabling robots to understand natural languages and improve human–robot interaction. Also, due to this need, the amount of research on NLP has increased considerably in recent years. In this study, commands were given to a multiple-mobile-robot system using the Turkish natural language, and the robots were required to fulfill these orders. Turkish is classified as an agglutinative language. In agglutinative languages, words combine different morphemes, each carrying a specific meaning, to create complex words. Turkish exhibits this characteristic by adding various suffixes to a root or base form to convey grammatical relationships, tense, aspect, mood, and other semantic nuances. Since the Turkish language has an agglutinative structure, it is very difficult to decode its sentence structure in a way that robots can understand. Parsing of a given command, path planning, path tracking, and formation control were carried out. In the path-planning phase, the A* algorithm was used to find the optimal path, and a PID controller was used to follow the generated path with minimum error. A leader–follower approach was used to control multiple robots. A platoon formation was chosen as the multi-robot formation. The proposed method was validated on a known map containing obstacles, demonstrating the system’s ability to navigate the robots to the desired locations while maintaining the specified formation. This study used Turtlebot3 robots within the Gazebo simulation environment, providing a controlled and replicable setting for comprehensive experimentation. The results affirm the feasibility and effectiveness of employing NLP techniques for the formation control of multiple mobile robots, offering a robust and effective method for further research and development on human–robot interaction.
(This article belongs to the Special Issue Natural Language Processing: Trends and Challenges)
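
The abstract above mentions A* path planning on a known map with obstacles. As a rough illustration only, and not the authors' implementation, a minimal grid-based A* search might look like the following Python sketch. The occupancy-grid representation, 4-connected moves, unit step cost, Manhattan heuristic, and the names `astar`, `GRID`, and `path` are all assumptions introduced here for the example.

```python
import heapq


def astar(grid, start, goal):
    """Return a shortest 4-connected path from start to goal, or None.

    grid is a list of rows; 0 marks a free cell and 1 an obstacle
    (an assumed representation, not taken from the paper).
    """
    rows, cols = len(grid), len(grid[0])

    def heuristic(cell):
        # Manhattan distance: admissible for unit-cost 4-connected moves.
        return abs(cell[0] - goal[0]) + abs(cell[1] - goal[1])

    # Heap entries are (estimated total cost, cost so far, cell).
    open_heap = [(heuristic(start), 0, start)]
    came_from = {}
    best_cost = {start: 0}

    while open_heap:
        _, cost, cell = heapq.heappop(open_heap)
        if cell == goal:
            # Walk the parent links back to the start.
            path = [cell]
            while cell in came_from:
                cell = came_from[cell]
                path.append(cell)
            return path[::-1]
        if cost > best_cost[cell]:
            continue  # stale heap entry; a cheaper route was found already
        r, c = cell
        for nxt in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            nr, nc = nxt
            if 0 <= nr < rows and 0 <= nc < cols and grid[nr][nc] == 0:
                new_cost = cost + 1
                if new_cost < best_cost.get(nxt, float("inf")):
                    best_cost[nxt] = new_cost
                    came_from[nxt] = cell
                    heapq.heappush(
                        open_heap, (new_cost + heuristic(nxt), new_cost, nxt)
                    )
    return None  # goal unreachable


# Hypothetical 4x4 map: 1 = obstacle, 0 = free.
GRID = [
    [0, 0, 0, 0],
    [1, 1, 0, 1],
    [0, 0, 0, 0],
    [0, 1, 1, 0],
]
path = astar(GRID, (0, 0), (3, 3))
```

In a setup like the one the abstract describes, the resulting list of cells would then be handed to a path-tracking controller; that step is outside this sketch.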

15 pages, 935 KiB  
Article
Using Natural Language Processing for a Computer-Aided Rapid Assessment of the Human Condition in Terms of Anorexia Nervosa
by Stella Maćkowska, Bartosz Koścień, Michał Wójcik, Katarzyna Rojewska and Dominik Spinczyk
Appl. Sci. 2024, 14(8), 3367; https://0-doi-org.brum.beds.ac.uk/10.3390/app14083367 - 16 Apr 2024
Viewed by 334
Abstract
This paper demonstrates how natural language processing methods can support the computer-aided rapid assessment of young adults suffering from anorexia nervosa. We applied natural language processing and machine learning techniques to develop methods that classified body image notes into four categories (sick/healthy, past tense, irony, and sentiment) and analyzed personal vocabulary. The datasets consisted of notes from 115 anorexic patients, 85 healthy participants, and 50 participants with head and neck cancer. To evaluate the usefulness of the proposed approach, we interviewed ten professional psychologists who were experts in eating disorders, eight direct (first contact) staff, and fourteen school counselors and school psychologists. The developed tools correctly differentiated the individuals suffering from anorexia nervosa, which was reflected in the linguistic profile and the results of the machine learning classification of the body image notes. The developed tool also received a positive evaluation from the psychologists specializing in treating eating disorders, school psychologists, and nurses. The obtained results indicate the potential of using natural language processing techniques for the computer-aided rapid assessment of a person’s condition in terms of anorexia nervosa. This method could be applied both as a screening tool and for the regular monitoring of people at risk of eating disorders.

22 pages, 8296 KiB  
Article
Alignment of Unsupervised Machine Learning with Human Understanding: A Case Study of Connected Vehicle Patents
by Raj Bridgelall
Appl. Sci. 2024, 14(2), 474; https://0-doi-org.brum.beds.ac.uk/10.3390/app14020474 - 5 Jan 2024
Viewed by 874
Abstract
As official public records of inventions, patents provide an understanding of technological trends across the competitive landscape of various industries. However, traditional manual analysis methods have become increasingly inadequate due to the rapid expansion of patent information and its unstructured nature. This paper contributes an original approach to enhance the understanding of patent data, with connected vehicle (CV) patents serving as the case study. Using free, open-source natural language processing (NLP) libraries, the author introduces a novel metric to quantify the alignment between classifications made by a subject matter expert (SME) and those produced by machine learning (ML) methods. The metric is a composite index that includes a purity factor, evaluating the average ML conformity across SME classifications, and a dispersion factor, assessing the distribution of ML-assigned topics across these classifications. This dual-factor approach, labeled the H-index, quantifies the alignment of ML models with SME understanding in the range of zero to unity. The workflow utilizes an exhaustive combination of state-of-the-art tokenizers, normalizers, vectorizers, and topic modelers to identify the best NLP pipeline for ML model optimization. The study offers manifold visualizations to provide an intuitive understanding of the areas where ML models align with or diverge from SME classifications. The H-indices reveal that although ML models demonstrate considerable promise in patent analysis, the need for further advancements remains.

20 pages, 2249 KiB  
Article
Real-Time Machine Learning for Human Activities Recognition Based on Wrist-Worn Wearable Devices
by Alexandru Iulian Alexan, Anca Roxana Alexan and Stefan Oniga
Appl. Sci. 2024, 14(1), 329; https://0-doi-org.brum.beds.ac.uk/10.3390/app14010329 - 29 Dec 2023
Viewed by 731
Abstract
Wearable technologies have slowly invaded our lives and can easily help with our day-to-day tasks. One area where wearable devices can shine is in human activity recognition, as they can gather sensor data in a non-intrusive way. We describe a real-time activity recognition system based on a common wearable device: a smartwatch. This is one of the most inconspicuous devices suitable for activity recognition as it is very common and worn for extensive periods of time. We propose a human activity recognition system that is extensible, due to the wide range of sensing devices that can be integrated, and that provides a flexible deployment system. The machine learning component recognizes activity based on plot images generated from raw sensor data. This service is exposed as a web API that can be deployed locally or directly in the cloud. The proposed system aims to simplify the human activity recognition process by exposing such capabilities via a web API. This web API can be consumed by small, network-enabled wearable devices, even with basic processing capabilities, by leveraging a simple data contract interface and using raw data. The system replaces extensive pre-processing by leveraging high-performance image recognition based on plot images generated from raw sensor data. We have managed to obtain an activity recognition rate of 94.89% and to implement a fully functional real-time human activity recognition system.

Review

42 pages, 533 KiB  
Review
A Review of Current Trends, Techniques, and Challenges in Large Language Models (LLMs)
by Rajvardhan Patil and Venkat Gudivada
Appl. Sci. 2024, 14(5), 2074; https://0-doi-org.brum.beds.ac.uk/10.3390/app14052074 - 1 Mar 2024
Viewed by 3268
Abstract
Natural language processing (NLP) has significantly transformed in the last decade, especially in the field of language modeling. Large language models (LLMs) have achieved state-of-the-art (SOTA) performance on natural language understanding (NLU) and natural language generation (NLG) tasks by learning language representation in self-supervised ways. This paper provides a comprehensive survey to capture the progression of advances in language models. In this paper, we examine the different aspects of language models, which started with a few million parameters but have reached the size of a trillion in a very short time. We also look at how these LLMs transitioned from task-specific to task-independent to task-and-language-independent architectures. This paper extensively discusses different pretraining objectives, benchmarks, and transfer learning methods used in LLMs. It also examines different finetuning and in-context learning techniques used in downstream tasks. Moreover, it explores how LLMs can perform well across many domains and datasets if sufficiently trained on a large and diverse dataset. Next, it discusses how, over time, the availability of cheap computational power and large datasets has improved LLMs’ capabilities and raised new challenges. As part of our study, we also inspect LLMs from the perspective of scalability to see how their performance is affected by the model’s depth, width, and data size. Lastly, we provide an empirical comparison of existing trends and techniques and a comprehensive analysis of where the field of LLMs currently stands.

30 pages, 2234 KiB  
Review
Contemporary Approaches in Evolving Language Models
by Dina Oralbekova, Orken Mamyrbayev, Mohamed Othman, Dinara Kassymova and Kuralai Mukhsina
Appl. Sci. 2023, 13(23), 12901; https://0-doi-org.brum.beds.ac.uk/10.3390/app132312901 - 1 Dec 2023
Cited by 1 | Viewed by 1363
Abstract
This article provides a comprehensive survey of contemporary language modeling approaches within the realm of natural language processing (NLP) tasks. This paper conducts an analytical exploration of diverse methodologies employed in the creation of language models. This exploration encompasses the architecture, training processes, and optimization strategies inherent in these models. The detailed discussion covers various models ranging from traditional n-gram and hidden Markov models to state-of-the-art neural network approaches such as BERT, GPT, LLAMA, and Bard. This article delves into different modifications and enhancements applied to both standard and neural network architectures for constructing language models. Special attention is given to addressing challenges specific to agglutinative languages within the context of developing language models for various NLP tasks, particularly for Arabic and Turkish. The research highlights that contemporary transformer-based methods demonstrate results comparable to those achieved by traditional methods employing Hidden Markov Models. These transformer-based approaches boast simpler configurations and exhibit faster performance during both training and analysis. An integral component of the article is the examination of popular and actively evolving libraries and tools essential for constructing language models. Notable tools such as NLTK, TensorFlow, PyTorch, and Gensim are reviewed, with a comparative analysis considering their simplicity and accessibility for implementing diverse language models. The aim is to provide readers with insights into the landscape of contemporary language modeling methodologies and the tools available for their implementation.
