A Support Vector Machine (SVM) is a supervised learning algorithm used for classification and regression tasks. It finds the optimal decision boundary (hyperplane), the one that separates the classes with the widest possible margin. SVM is particularly effective in high-dimensional spaces and works well on small datasets.
Examples of SVM Applications
Spam Detection (Spam / Not Spam)
Face Recognition
Medical Diagnosis (Cancer Detection)
Stock Market Prediction
1. How Does SVM Work?
Finding the Optimal Hyperplane
A hyperplane is a decision boundary that separates different classes.
The best hyperplane is the one that maximizes the margin (distance between the nearest data points of different classes).
The nearest data points to the hyperplane are called Support Vectors.
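To make support vectors concrete, scikit-learn exposes them on a fitted model; the sketch below uses toy data purely for illustration:
# A minimal sketch: after fitting a linear SVC, the support vectors
# (the points that define the margin) are available as an attribute.
from sklearn.svm import SVC
from sklearn.datasets import make_blobs

# Two linearly separable blobs (toy data)
X, y = make_blobs(n_samples=40, centers=2, random_state=6)
clf = SVC(kernel='linear', C=1.0)
clf.fit(X, y)
print(clf.support_vectors_)  # only these points determine the hyperplane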
Handling Non-Linearly Separable Data
If data is not linearly separable, SVM uses kernel functions to map data into a higher-dimensional space where it becomes linearly separable.
Common kernel functions:
Linear Kernel: Used when data is linearly separable.
Polynomial Kernel: Maps data to a higher degree polynomial space.
Radial Basis Function (RBF) Kernel: Commonly used for complex datasets.
Sigmoid Kernel: Similar to a neural network activation function.
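As a quick illustration of why the kernel choice matters, the sketch below (toy data, illustrative parameters) compares a linear and an RBF SVM on data that is not linearly separable:
from sklearn.svm import SVC
from sklearn.datasets import make_moons
from sklearn.model_selection import train_test_split

# Two interleaving half-moons: not separable by a straight line
X, y = make_moons(n_samples=200, noise=0.15, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

for kernel in ['linear', 'rbf']:
    clf = SVC(kernel=kernel).fit(X_train, y_train)
    print(kernel, clf.score(X_test, y_test))  # the RBF kernel typically scores higher here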
2. Mathematical Formulation of SVM
1. Hard Margin SVM (Linearly Separable Case)
1.1. Given a Dataset
Consider a binary classification problem with a dataset
$D = \{(x_i, y_i)\}_{i=1}^{n}$, where $x_i \in \mathbb{R}^d$ is a feature vector and $y_i \in \{-1, +1\}$ is its class label.
1.2. Hyperplane Equation
A hyperplane is defined as
$w^T x + b = 0$,
where $w$ is the weight (normal) vector and $b$ is the bias term.
1.3. Margin Calculation
For a given data point $(x_i, y_i)$, the functional margin is
$\hat{\gamma}_i = y_i (w^T x_i + b)$.
To ensure correct classification, we require:
$y_i (w^T x_i + b) \geq 1 \quad \forall i$.
The geometric margin is
$\gamma_i = \frac{y_i (w^T x_i + b)}{\|w\|}$,
and the total width of the margin between the two classes is $\frac{2}{\|w\|}$.
1.4. Optimization Problem
To maximize the margin, we minimize $\|w\|$ (equivalently, $\frac{1}{2}\|w\|^2$) while ensuring all data points are correctly classified:
$\min_{w, b} \frac{1}{2}\|w\|^2$
subject to:
$y_i (w^T x_i + b) \geq 1, \quad i = 1, \dots, n$.
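As a quick numerical check, if the optimal weight vector were, say, $w = (3, 4)^T$, then $\|w\| = 5$ and the margin width would be $2/\|w\| = 0.4$ (an illustrative value, not from any dataset).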
2. Soft Margin SVM (Linearly Non-Separable Case)
When data is not perfectly separable, we introduce slack variables $\xi_i \geq 0$ to allow some misclassification:
$\min_{w, b, \xi} \frac{1}{2}\|w\|^2 + C \sum_{i=1}^{n} \xi_i$
subject to:
$y_i (w^T x_i + b) \geq 1 - \xi_i, \quad \xi_i \geq 0$,
where the hyperparameter $C$ controls the trade-off between a wide margin and few margin violations.
3. Dual Formulation (Using Lagrange Multipliers)
We derive the dual problem:
$\max_{\alpha} \sum_{i=1}^{n} \alpha_i - \frac{1}{2} \sum_{i=1}^{n} \sum_{j=1}^{n} \alpha_i \alpha_j y_i y_j \, x_i^T x_j$
subject to:
$\sum_{i=1}^{n} \alpha_i y_i = 0, \quad 0 \leq \alpha_i \leq C$.
Note that the data enter only through inner products $x_i^T x_j$, which is what makes the kernel trick possible.
4. Kernel Trick for Non-Linear SVM
For non-linearly separable data, we replace every inner product $x_i^T x_j$ in the dual problem with a kernel function $K(x_i, x_j) = \phi(x_i)^T \phi(x_j)$, which implicitly maps the data into a higher-dimensional space where it becomes linearly separable.
Common Kernel Functions
Linear Kernel: $K(x_i, x_j) = x_i^T x_j$
Polynomial Kernel: $K(x_i, x_j) = (x_i^T x_j + c)^d$
Radial Basis Function (RBF) Kernel: $K(x_i, x_j) = \exp(-\gamma \|x_i - x_j\|^2)$
Sigmoid Kernel: $K(x_i, x_j) = \tanh(\kappa \, x_i^T x_j + c)$
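To make the kernel trick concrete, the sketch below computes an RBF kernel matrix explicitly and checks it against scikit-learn's rbf_kernel (the data and gamma value are arbitrary illustrations):
import numpy as np
from sklearn.metrics.pairwise import rbf_kernel

X = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 2.0]])
gamma = 0.5  # illustrative value

# K[i, j] = exp(-gamma * ||x_i - x_j||^2), computed by hand
sq_dists = ((X[:, None, :] - X[None, :, :]) ** 2).sum(axis=-1)
K_manual = np.exp(-gamma * sq_dists)

print(np.allclose(K_manual, rbf_kernel(X, gamma=gamma)))  # True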
3. Implementing SVM in Python
Step 1: Import Required Libraries
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC, SVR
from sklearn.metrics import accuracy_score, classification_report
from sklearn.datasets import load_iris
4. SVM for Classification (Iris Dataset Example)
Step 1: Load Dataset
# Load dataset
iris = load_iris()
X = iris.data  # Features
y = iris.target  # Target classes

# Convert to DataFrame
df = pd.DataFrame(X, columns=iris.feature_names)
df['Target'] = y

# Display first 5 rows
print(df.head())
Step 2: Split Data into Training and Testing Sets
# Split into training (80%) and testing (20%) sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)
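Step 3: Train the SVM Classifier
A sketch of the remaining steps, mirroring the regression example below; the hyperparameters (RBF kernel, C=1.0, gamma='scale') are illustrative defaults, not values tuned for the Iris data:
# Create SVM Classifier model with RBF kernel
svm_classifier = SVC(kernel='rbf', C=1.0, gamma='scale')

# Train the model
svm_classifier.fit(X_train, y_train)
Step 4: Make Predictions
# Predict on test data
y_pred = svm_classifier.predict(X_test)
Step 5: Evaluate Model Performance
# Accuracy plus per-class precision, recall, and F1-score
print("Accuracy:", accuracy_score(y_test, y_pred))
print(classification_report(y_test, y_pred))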
5. SVM for Regression (California Housing Dataset Example)
Step 1: Load Dataset
from sklearn.datasets import fetch_california_housing

# Load dataset
data = fetch_california_housing()
X = data.data
y = data.target

# Convert to DataFrame
df = pd.DataFrame(X, columns=data.feature_names)
df['Target'] = y

# Display first 5 rows
print(df.head())
Step 2: Split Data into Training and Testing Sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)
Step 3: Train the SVM Regressor
# Create SVM Regressor model with RBF kernel
svm_regressor = SVR(kernel='rbf', C=1.0, gamma='scale')

# Train the model
svm_regressor.fit(X_train, y_train)
Step 4: Make Predictions
# Predict on test data
y_pred = svm_regressor.predict(X_test)
Step 5: Evaluate Model Performance
from sklearn.metrics import mean_squared_error

# Calculate Mean Squared Error
mse = mean_squared_error(y_test, y_pred)
print("Mean Squared Error:", mse)
6. Understanding the Output
Accuracy Score → Percentage of correctly classified instances.
Classification Report → Precision, Recall, and F1-score for each class.
Mean Squared Error (Regression) → Measures how well the model predicts continuous values.
7. Choosing the Right Kernel
Linear Kernel: When data is linearly separable.
Polynomial Kernel: When data has polynomial relationships.
RBF Kernel: When data is not linearly separable.
Sigmoid Kernel: When data has similarity-based relationships.
8. Advantages & Disadvantages of SVM
Advantages
Works well for small datasets.
Effective in high-dimensional spaces.
Robust to outliers (with soft-margin tuning).
Disadvantages
Slow to train on large datasets.
Choosing the right kernel can be tricky.
Results are harder to interpret than those of Decision Trees.
Summary
SVM finds the hyperplane that separates classes with the maximum margin.
It uses kernel functions (Linear, Polynomial, RBF, Sigmoid) to handle non-linear data.
SVM can be used for both classification (SVC) and regression (SVR).
It works best for small-to-medium-sized datasets.