Customers and buyers have benefited greatly from advances in Internet connectivity in recent years. As a result of these developments, rapidly growing e-commerce companies have generated very large volumes of data. The popularity of social media lets buyers express their opinions and views on a wide range of topics, such as the economic situation, and voice their dissatisfaction with certain products or services or their satisfaction with their purchases.
A large number of consumer and product reviews provide a wealth of useful information and have recently emerged as an important resource for consumers and businesses alike. Consumers frequently seek quality information from online reviews before purchasing a product, and many businesses use online reviews as crucial input for their products, marketing, and customer relationship management. Hence, understanding the psychology of online consumer behavior has become key to competing in today’s markets, which are characterized by increased competition and globalization.
Sentiment analysis and text analysis are applications of big data analysis that aim to aggregate and extract emotions and sentiments from many types of reviews and ratings. This exponentially growing data is mostly unstructured and cannot be interpreted by human effort alone. It is therefore crucial to use machine learning with natural language processing (NLP), which focuses on gathering facts and opinions from the vast amount of information on the internet. This study applies a machine learning NLP model to predict sentiment from consumer product reviews collected from social media and e-commerce websites. The NLP process consists of several steps:
1. Data preprocessing and feature extraction, which converts the text into a predictable, analyzable format for the task and helps extract features that describe the structure of the review text. Part-of-speech tagging is one of the steps involved in data preprocessing and feature extraction.
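The preprocessing and feature extraction step can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the pipeline used in the study: the stop-word list, the function names `preprocess` and `bag_of_words`, and the two sample reviews are all hypothetical, and part-of-speech tagging is left out because it would need an external NLP library.

```python
import re
from collections import Counter

# Hypothetical stop-word list, kept tiny for illustration only.
STOP_WORDS = {"the", "a", "an", "is", "it", "and", "this", "was", "i", "to", "of"}


def preprocess(review: str) -> list[str]:
    """Lowercase a review, split it into word tokens, and drop stop words."""
    tokens = re.findall(r"[a-z']+", review.lower())
    return [t for t in tokens if t not in STOP_WORDS]


def bag_of_words(tokens: list[str]) -> Counter:
    """Count how often each remaining token occurs (a simple feature vector)."""
    return Counter(tokens)


# Made-up sample reviews, not data from the study.
reviews = [
    "The battery life is great and the screen is great.",
    "This was a terrible purchase, the battery died.",
]
features = [bag_of_words(preprocess(r)) for r in reviews]

for review, feats in zip(reviews, features):
    print(review, "->", dict(feats))
```

Each review ends up as a table of token counts, which is the kind of fixed, predictable representation a downstream sentiment or topic model can consume.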
Performance Benchmarking on Workstation and HPC Cloud
The NLP machine learning workflow for e-commerce is computationally very intensive, especially the LDA algorithm, as already mentioned above. To complete the study, we first carried out a performance analysis on a high-performance desktop workstation with 16 CPU cores and 32 GB of RAM. The analysis examined the computer system requirements for processing up to 20 million review records, with the following benchmark results:
The computational effort of topic modeling grows rapidly because of the LDA algorithm. To overcome this drawback, we identified parallel LDA topic modeling methods based on the MapReduce architecture, which use a distributed programming model to implement the LDA topic model in parallel on the Hadoop parallel computing platform. The results show that, with a large number of batches, this parallel approach can achieve near-linear speedup and is well suited to both on-premises and cloud HPC resources. The HPC environment provides the Python-based Anaconda platform, which supports data analysis and predictive modeling. As we have shown, handling such large amounts of data is a real challenge for this NLP project and requires significant computing power. Ideally, processing such a huge amount of data becomes feasible by scaling the algorithm to HPC in the cloud.
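The MapReduce pattern behind this approach can be illustrated with a small Python sketch. It is not an LDA implementation and does not use Hadoop: it only shows the map step (count tokens per batch) and the reduce step (merge the partial counts), on the assumption that a parallel topic model combines per-partition count tables in a broadly similar way. The function names and the sample batches are hypothetical, and local threads stand in for cluster workers.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor
from functools import reduce


def map_count(batch):
    """Map step: count tokens within one batch of tokenized documents."""
    counts = Counter()
    for doc in batch:
        counts.update(doc)
    return counts


def reduce_counts(left, right):
    """Reduce step: merge two partial count tables into one."""
    return left + right


def parallel_word_counts(batches, workers=4):
    """Run the map step over all batches, then fold the partial results."""
    # Threads only illustrate the structure; real speedup would need
    # separate processes or a cluster such as Hadoop.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        partials = list(pool.map(map_count, batches))
    return reduce(reduce_counts, partials, Counter())


# Made-up batches of already tokenized reviews.
batches = [
    [["battery", "great"], ["screen", "great"]],
    [["battery", "died"]],
]
totals = parallel_word_counts(batches)
print(dict(totals))
```

Because each batch is counted independently and the merge is associative, adding more workers and batches does not change the result, which is what makes this style of computation a candidate for near-linear scaling.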
Further experiments, carried out in the HPC cloud environment, will demonstrate the ability to remotely set up and run big data analytics and build AI models in the cloud. The configuration requirements for the AI machine learning model are pre-installed in HPC application containers on the UberCloud Engineering Simulation Platform, which allows the user to access and run the NLP workflow without any prior installation or configuration.