Machine Learning Algorithms Part 7: Linear Support Vector Machine In Python
Linear Support Vector Machine (or LSVM) is a supervised learning method that looks at data and sorts it into one of two categories. LSVM works by drawing a line between the two classes: all the data points that fall on one side of the line are labeled as one class, and all the points on the other side as the second. Sounds simple enough, but there are infinitely many lines to choose from. How do we know which line will do the best job of classifying new incoming data? This is where LSVM shines.
LSVM chooses the line that maximizes the distance between the support vectors and the hyperplane (in two dimensions, the hyperplane is simply a line). Support vectors are the data points from each class that lie closest to the decision boundary. They’re called vectors because the data points are represented in terms of position vectors.
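To make this concrete, the standard formulation (not specific to this article) writes the hyperplane as w · x + b = 0, where w is the vector of coefficients and b is the intercept. The support vectors sit on the two parallel margin boundaries w · x + b = ±1, so the margin width is 2/‖w‖, and maximizing the margin is equivalent to minimizing ‖w‖.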
Let’s take a look at how we can classify data using the linear support vector machine algorithm in Python. As always, we start by importing the required libraries.
from sklearn.datasets import make_blobs
from matplotlib import pyplot as plt
from sklearn.svm import SVC
from sklearn.model_selection import train_test_split
from sklearn.metrics import confusion_matrix
import numpy as np
In this tutorial, we’ll generate our own data using the make_blobs function from the sklearn.datasets module. We then separate the data into training and test sets so we can verify the accuracy of our model.
# Generate two well-separated clusters of 125 points in total
X, y = make_blobs(n_samples=125, centers=2,
                  random_state=0, cluster_std=0.60)
# Hold out 20% of the data for testing
train_X, test_X, train_y, test_y = train_test_split(X, y, test_size=0.20, random_state=0)
# Visualize the training data, colored by class label
plt.scatter(train_X[:, 0], train_X[:, 1], c=train_y, cmap='winter')
When creating a support vector classifier, we’ll explicitly tell it to use the linear kernel.
svc = SVC(kernel='linear')
svc.fit(train_X, train_y)
coef_ is a read-only property derived from dual_coef_ and support_vectors_, where dual_coef_ holds the coefficients of the support vectors in the decision function.
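As a quick sanity check (a minimal sketch, assuming the svc model fitted above), we can verify this relationship ourselves:
# For a linear kernel, coef_ equals the product of dual_coef_ and support_vectors_
np.allclose(svc.coef_, svc.dual_coef_ @ svc.support_vectors_)  # expected: True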
We can get the slope of our line by taking the negative of the first element of coef_[0] and dividing it by the second. Similarly, we can get the y-intercept by taking the negative of the first (and only) element of intercept_ and dividing it by the second element of the coefficients list.
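This follows from rearranging the equation of the decision boundary: w0·x + w1·y + b = 0 gives y = -(w0/w1)·x - b/w1, so the slope is -w0/w1 and the y-intercept is -b/w1, where b is intercept_[0].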
# Plot the training points (circles) and the test points (squares)
plt.scatter(train_X[:, 0], train_X[:, 1], c=train_y, cmap='winter')
ax = plt.gca()
xlim = ax.get_xlim()
ax.scatter(test_X[:, 0], test_X[:, 1], c=test_y, cmap='winter', marker="s")
# Recover the decision boundary y = -(w0/w1)x - b/w1 from the fitted model
w = svc.coef_[0]
a = -w[0] / w[1]
xx = np.linspace(xlim[0], xlim[1])
yy = a * xx - (svc.intercept_[0]) / w[1]
plt.plot(xx, yy)
plt.show()
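If we also want to see the margin and the support vectors themselves, a minimal extension of the plot above (assuming xx, yy, and w are still in scope) might look like this:
# The margin boundaries sit at a vertical offset of 1/w[1] from the decision line
margin = 1 / w[1]
plt.scatter(train_X[:, 0], train_X[:, 1], c=train_y, cmap='winter')
plt.plot(xx, yy, 'k-')
plt.plot(xx, yy - margin, 'k--')
plt.plot(xx, yy + margin, 'k--')
# Circle the support vectors reported by the fitted model
plt.scatter(svc.support_vectors_[:, 0], svc.support_vectors_[:, 1],
            s=150, facecolors='none', edgecolors='k')
plt.show()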
From the preceding image, we can visually see how our model classified the test set (represented as squares).
We can also use a confusion matrix to inspect the predictions made by our model. The numbers on the diagonal of the confusion matrix correspond to correct predictions, whereas the off-diagonal entries are the misclassifications (false positives and false negatives).
pred_y = svc.predict(test_X)
confusion_matrix(test_y, pred_y)
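The diagonal sums to the number of correct predictions, and dividing by the total number of test samples gives the accuracy; scikit-learn’s accuracy_score computes the same quantity directly:
from sklearn.metrics import accuracy_score
# Fraction of test points classified correctly
accuracy_score(test_y, pred_y)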
Given our confusion matrix, our model has an accuracy of 100%. In the real world, however, the data we work with will have many more features and, in all likelihood, won’t be so nicely separated.
Cory Maklin