Machine Learning: Logistic Regression
Logistic Regression
Let’s first understand what regression analysis is.
Regression analysis is a technique in which we predict the value of a dependent variable given the value of one or more independent variables. It is a predictive modelling technique that estimates the relationship between the dependent variable we need to predict and the independent variables.
Regression is commonly classified into three types:
1. Logistic Regression
2. Linear Regression
3. Polynomial Regression
Logistic regression produces a result in binary format and is used to predict the outcome of a categorical dependent variable. So, the outcome should be discrete/categorical, such as:
· 0 or 1
· Yes or No
· True or False
· High or Low
We use logistic regression when we need the output to be a logical value, i.e. 0 or 1, True or False. We do not use linear regression here because its output is not limited to a particular range, whereas we only need values between 0 and 1. If we simply cut the straight line off at 0 and 1, the resulting curve cannot be expressed as a single formula. Hence, we came up with logistic regression!
When we deal with logistic regression, we deal with the sigmoid curve, which converts any value between negative infinity and positive infinity into a value between 0 and 1. A threshold value (here 0.5) then converts that value into a discrete one: when a data point is above 0.5 it is assigned 1, otherwise 0. The logistic regression equation is derived from the straight-line equation.
Sigmoid Curve:-
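To make the sigmoid and the threshold idea concrete, here is a minimal sketch in plain Python. The function names sigmoid and predict_class are my own choices for illustration, and treating a value of exactly 0.5 as class 1 is an assumption, since the text only says "above 0.5".

```python
import math


def sigmoid(z: float) -> float:
    """Map any real number z to a value between 0 and 1."""
    # Two branches so that math.exp never receives a large positive argument.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def predict_class(z: float, threshold: float = 0.5) -> int:
    """Turn the sigmoid output into a discrete label (assumption: ties go to 1)."""
    return 1 if sigmoid(z) >= threshold else 0


# Large negative inputs approach 0, large positive inputs approach 1.
for z in (-5.0, -1.0, 0.0, 1.0, 5.0):
    print(z, round(sigmoid(z), 4), predict_class(z))
```

The threshold is kept as a parameter because 0.5 is only the usual default; a different cut-off can be passed in when one kind of error is more costly than the other.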
Equation of a straight line:
Y = C + B1X1 + B2X2 + ...
Here Y can range from -∞ to +∞, but in the logistic equation Y can only lie between 0 and 1.
To get a range between 0 and infinity, let’s transform Y into Y÷(1-Y):
Y÷(1-Y) = 0 when Y = 0
Y÷(1-Y) → infinity when Y = 1
Now let’s transform it further, by taking the logarithm, to get a range between -∞ and +∞:
Log[Y÷(1-Y)] = C + B1X1 + B2X2 + ...
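Solving Log[Y÷(1-Y)] = z for Y gives Y = 1÷(1+e^(-z)), which is exactly the sigmoid. The short check below walks through that round trip numerically. The coefficient and input values are hypothetical numbers picked only for illustration, and the helper names are my own.

```python
import math


def log_odds(y: float) -> float:
    """Log[Y / (1 - Y)] for a probability strictly between 0 and 1."""
    return math.log(y / (1.0 - y))


def probability(z: float) -> float:
    """Inverse of log_odds: the sigmoid of z."""
    return 1.0 / (1.0 + math.exp(-z))


# Hypothetical coefficients and inputs, not taken from any fitted model.
c, b1, b2 = -1.0, 0.8, 0.5
x1, x2 = 2.0, 1.0

z = c + b1 * x1 + b2 * x2  # the straight-line part, free to take any real value
y = probability(z)         # squeezed into the range between 0 and 1

print("z =", z, "y =", y)
print("log-odds of y =", log_odds(y))  # recovers z up to rounding error
```

So the straight-line equation is still there; it simply predicts the log-odds rather than Y itself.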
Difference between Linear Regression and Logistic Regression
Linear Regression
1) Continuous variable output
2) Solves regression problems
Here the y variable takes a value within a continuous range
3) Straight line
Logistic Regression
1) Categorical variable output
2) Solves classification problems
We can use this for classification problems
3) S-curve
Logistic Regression popular use cases
1. Weather prediction
2. Whether it is raining
3. Whether it is cloudy or not
4. Whether it will snow
5. Whether it’s a bird or not, which is a classification problem
6. Determining whether a patient is ill or not
Now we will deal with the coding part, analyze a dataset, and use logistic regression on it to predict the outcome.
We need to do the following steps in our coding section:
1. Collecting Data
2. Analyzing Data
3. Data Wrangling-Cleaning Your Data
4. Train & Test
5. Accuracy Check
Open your Jupyter notebook, download the Titanic dataset from the link mentioned below and, with the help of my blog, perform the prediction on the dataset.
We have to train our model on the train.csv provided with the dataset, perform the prediction on test.csv, check the accuracy of the result, and make the final submission of the predictions to Kaggle.
We will use the pandas library to import the data into the Jupyter notebook and then perform further analysis on the data.
Since we have to use pandas several times, we will import it under the short form pd. To read the CSV file we will use read_csv, which imports the dataset as a DataFrame in the notebook.
To display the data we will use head(), which displays the first 5 rows of the DataFrame.
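The loading step can be sketched as follows. To keep the snippet runnable without the download, it reads a few made-up rows from an in-memory string; these are illustrative values, not real Kaggle records, and only a handful of the dataset's columns are shown. In the notebook you would pass the path to train.csv instead.

```python
import io

import pandas as pd

# Stand-in for train.csv: made-up rows with a few Titanic-style columns.
csv_text = """PassengerId,Survived,Pclass,Sex,Age
1,0,3,male,24
2,1,1,female,41
3,1,2,female,30
4,0,3,male,
5,1,1,female,52
6,0,2,male,45
7,0,3,male,19
"""

# In the notebook this would be: titanic_data = pd.read_csv("train.csv")
titanic_data = pd.read_csv(io.StringIO(csv_text))

first_rows = titanic_data.head()  # first 5 rows by default
print(first_rows)
print("Total rows:", len(titanic_data))
```

head() also accepts a number, for example head(10), if you want to see more than five rows.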
Then we use countplot from seaborn (imported as sns) to check the number of passengers who survived, split by sex:
sns.countplot(x="Survived", hue="Sex", data=titanic_data)
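The remaining steps from the list above (data wrangling, train and test, accuracy check) could look roughly like the sketch below. It is not my Kaggle notebook. It assumes scikit-learn as the modelling library, uses made-up rows instead of the real train.csv, and keeps only a few columns. Because Kaggle's test.csv has no Survived column, the accuracy here is checked on a held-out part of the training rows.

```python
import pandas as pd
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

nan = float("nan")

# Made-up rows that mimic a few Titanic columns (NOT real Kaggle data).
rows = [
    # Survived, Pclass, Sex, Age, Fare
    (1, 1, "female", 30.0, 90.0), (1, 1, "female", 41.0, 75.0),
    (1, 2, "female", 25.0, 30.0), (1, 2, "female", nan, 26.0),
    (1, 3, "female", 19.0, 8.0), (1, 1, "male", 36.0, 60.0),
    (1, 2, "male", 8.0, 29.0), (1, 1, "female", 52.0, 110.0),
    (0, 3, "male", 24.0, 7.5), (0, 3, "male", 33.0, 8.0),
    (0, 3, "male", nan, 7.0), (0, 2, "male", 45.0, 13.0),
    (0, 3, "female", 40.0, 9.0), (0, 1, "male", 60.0, 50.0),
    (0, 3, "male", 21.0, 7.9), (0, 2, "male", 29.0, 12.0),
]
titanic_data = pd.DataFrame(
    rows, columns=["Survived", "Pclass", "Sex", "Age", "Fare"]
)

# Data wrangling: fill missing ages with the median and encode Sex as a number.
titanic_data["Age"] = titanic_data["Age"].fillna(titanic_data["Age"].median())
titanic_data["Sex"] = titanic_data["Sex"].map({"male": 0, "female": 1})

X = titanic_data[["Pclass", "Sex", "Age", "Fare"]]
y = titanic_data["Survived"]

# Train & test: hold out a quarter of the rows, keeping both classes in each part.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0, stratify=y
)

model = LogisticRegression(max_iter=1000)
model.fit(X_train, y_train)

# Accuracy check on the held-out rows.
predictions = model.predict(X_test)
accuracy = accuracy_score(y_test, predictions)
print("Accuracy on held-out rows:", accuracy)
```

With so few rows the accuracy figure means very little. The point is only the order of the steps, which stays the same on the full dataset.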
To see how I approached the Titanic dataset, go to my Kaggle link, where I have mentioned all the things you need to do:



