Abstract:
The dataset is the IMDB dataset which comprise of the reviews of 50k movies and it is available on the Kaggle, which has 2 columns in it. The two columns in the dataset contains the reviews and the sentiment which will help us to identify the reviews is a positive review, or a negative review about the movie. Machine Learning is a modern-day concept in which the machine can be trained in the same manner as an infant baby learns the things in an environment, with the use of machine learning certain patterns are used from the data to give the machine a learning, reasoning and predicting ability. The aim is to train and build a machine learning model that will be able to predict whether a movie review provided is a positive or negative review, this is also known as binary text Classification which will be done by using Scikit-learn library, which is a very common machine learning library used in python programming language.