Data Science: Support Vector Machine (SVM)

🔹 SVM is a supervised learning algorithm used for classification and regression tasks.
🔹 It finds the optimal decision boundary (hyperplane) that separates classes in the best possible way.
🔹 SVM is particularly effective in high-dimensional spaces and works well for small datasets.

Examples of SVM Applications

✅ Spam Detection (Spam / Not Spam)
✅ Face Recognition
✅ Medical Diagnosis (Cancer Detection)
✅ Stock Market Prediction

1. How Does SVM Work?

1️⃣ Finding the Optimal Hyperplane

  • A hyperplane is a decision boundary that separates different classes.
  • The best hyperplane is the one that maximizes the margin (the distance between the hyperplane and the nearest data points of each class).
  • The nearest data points to the hyperplane are called Support Vectors.

2️⃣ Handling Non-Linearly Separable Data

  • If data is not linearly separable, SVM uses kernel functions to map data into a higher-dimensional space where it becomes linearly separable.
  • Common kernel functions:
    • Linear Kernel: Used when data is linearly separable.
    • Polynomial Kernel: Maps data to a higher degree polynomial space.
    • Radial Basis Function (RBF) Kernel: Commonly used for complex datasets.
    • Sigmoid Kernel: Similar to a neural network activation function.
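
To make the idea of mapping into a higher-dimensional space concrete, here is a minimal sketch on toy 1-D data. The explicit map phi(x) = (x, x^2) and the helper names are assumptions chosen for illustration; a kernel SVM achieves a similar effect implicitly, without building the mapped features.

```python
# 1-D points: the positive class sits on both sides of the negative class,
# so no single threshold on x separates them.
points = [-3.0, -2.0, -1.0, 0.0, 1.0, 2.0, 3.0]
labels = [+1, +1, -1, -1, -1, +1, +1]


def feature_map(x):
    """Hypothetical explicit map phi(x) = (x, x^2) into 2-D."""
    return (x, x * x)


def separable_by_threshold(values, labels):
    """True if a single threshold puts all +1 on one side and all -1 on the other."""
    pos = [v for v, y in zip(values, labels) if y == +1]
    neg = [v for v, y in zip(values, labels) if y == -1]
    return max(neg) < min(pos) or max(pos) < min(neg)


# In the original 1-D space no threshold works.
print(separable_by_threshold(points, labels))  # False

# After the map, the second coordinate (x^2) alone separates the classes,
# so a horizontal line such as x^2 = 2.5 is a separating hyperplane in 2-D.
squared = [feature_map(x)[1] for x in points]
print(separable_by_threshold(squared, labels))  # True
```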

2. Mathematical Formulation of SVM

1. Hard Margin SVM (Linearly Separable Case)

1.1. Given a Dataset

Consider a binary classification problem with a dataset:

$$D = \{(x_i, y_i)\}_{i=1}^{n}, \quad y_i \in \{-1, +1\}, \quad x_i \in \mathbb{R}^d$$

1.2. Hyperplane Equation

A hyperplane is defined as:

$$w \cdot x + b = 0$$

1.3. Margin Calculation

For a given data point $(x_i, y_i)$, the functional margin is:

$$\gamma_i = y_i (w \cdot x_i + b)$$

To ensure correct classification, we require:

$$y_i (w \cdot x_i + b) \geq 1, \quad \forall i$$

The geometric margin is:

$$\frac{1}{\|w\|}$$
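
The margin quantities above can be checked numerically. The sketch below uses a made-up 2-D dataset and a hand-picked hyperplane (both are assumptions, nothing is fitted) to compute the functional margins, test the constraint, and find the points that sit exactly on the margin.

```python
import numpy as np

# Toy 2-D dataset with labels in {-1, +1} (illustrative values).
X = np.array([[2.0, 2.0], [3.0, 3.0], [0.0, 0.0], [-1.0, 0.0]])
y = np.array([+1, +1, -1, -1])

# A hand-picked hyperplane w.x + b = 0 (assumed, not learned from data).
w = np.array([0.5, 0.5])
b = -1.0

# Functional margin of each point: y_i * (w.x_i + b).
functional_margins = y * (X @ w + b)

# The hard-margin constraint requires every functional margin to be >= 1.
feasible = bool(np.all(functional_margins >= 1))

# Geometric margin: distance from the hyperplane to the closest points.
geometric_margin = 1.0 / np.linalg.norm(w)

# Points whose functional margin equals 1 lie on the margin (support vectors).
support_idx = np.flatnonzero(np.isclose(functional_margins, 1.0))

print("functional margins:", functional_margins)
print("constraint satisfied:", feasible)
print("geometric margin:", geometric_margin)
print("support vector indices:", support_idx)
```

Here the points (2, 2) and (0, 0) are the ones that touch the margin, and the geometric margin equals half the distance between them.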

1.4. Optimization Problem

To maximize the margin, we minimize $\frac{1}{2}\|w\|^2$ while ensuring all data points are correctly classified:

$$\min_{w,b} \frac{1}{2}\|w\|^2$$

subject to:

$$y_i (w \cdot x_i + b) \geq 1, \quad \forall i$$

2. Soft Margin SVM (Linearly Non-Separable Case)

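When the data is not perfectly separable, we introduce slack variables $\xi_i$ that allow some points to fall inside the margin or be misclassified:

$$y_i (w \cdot x_i + b) \geq 1 - \xi_i, \quad \xi_i \geq 0$$

The objective then adds a penalty on the total slack, weighted by a regularization parameter $C$:

$$\min_{w,b,\xi} \frac{1}{2}\|w\|^2 + C \sum_{i=1}^{n} \xi_i$$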
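
As a rough illustration of how slack behaves, the sketch below computes the smallest slack each point needs for a fixed, hand-picked hyperplane, and then evaluates the standard soft-margin objective (one half of the squared norm of w plus C times the total slack). The data, the hyperplane, and C = 1.0 are all assumptions for the example.

```python
import numpy as np

# Toy data: the last two -1 points violate the margin (illustrative values).
X = np.array([[2.0, 2.0], [3.0, 3.0], [0.0, 0.0], [0.5, 0.5], [1.5, 1.5]])
y = np.array([+1, +1, -1, -1, -1])

# Hand-picked hyperplane and penalty weight (assumed, not fitted).
w = np.array([0.5, 0.5])
b = -1.0
C = 1.0

margins = y * (X @ w + b)

# Smallest slack satisfying y_i (w.x_i + b) >= 1 - xi_i with xi_i >= 0.
slack = np.maximum(0.0, 1.0 - margins)

# Soft-margin objective: 0.5 * ||w||^2 + C * sum(xi_i).
objective = 0.5 * (w @ w) + C * slack.sum()

for m, s in zip(margins, slack):
    if s == 0:
        status = "outside or on the margin"
    elif s <= 1:
        status = "inside the margin, still correctly classified"
    else:
        status = "misclassified"
    print(f"margin={m:+.2f}  slack={s:.2f}  {status}")
print("objective:", objective)
```

A slack between 0 and 1 means the point is on the correct side but inside the margin; a slack above 1 means it is on the wrong side of the hyperplane.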

3. Dual Formulation (Using Lagrange Multipliers)

We derive the dual problem:

$$\max_{\alpha} \sum_{i=1}^{n} \alpha_i - \frac{1}{2} \sum_{i=1}^{n} \sum_{j=1}^{n} \alpha_i \alpha_j y_i y_j K(x_i, x_j)$$

subject to:

$$\sum_{i=1}^{n} \alpha_i y_i = 0, \quad 0 \leq \alpha_i \leq C$$
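
A two-point problem is small enough to solve the dual by inspection. The sketch below assumes a linear kernel, two hand-picked points, and a value of C large enough (at least 0.5) that the upper bound never binds. The equality constraint forces both multipliers to be equal, so a simple grid search over one value is enough. Recovering w as the sum of alpha_i * y_i * x_i is the standard relation for the linear kernel.

```python
import numpy as np

# Two training points, one per class (illustrative values).
X = np.array([[1.0, 0.0], [-1.0, 0.0]])
y = np.array([+1.0, -1.0])

# Gram matrix for the linear kernel K(x_i, x_j) = x_i . x_j.
K = X @ X.T


def dual_objective(alpha):
    """sum(alpha) - 0.5 * sum_ij alpha_i alpha_j y_i y_j K_ij."""
    alpha = np.asarray(alpha, dtype=float)
    ay = alpha * y
    return float(alpha.sum() - 0.5 * ay @ K @ ay)


# sum(alpha_i * y_i) = 0 forces alpha_1 == alpha_2 == a here,
# so the dual reduces to maximizing 2a - 2a^2 over a >= 0.
grid = np.linspace(0.0, 1.0, 101)
values = [dual_objective([a, a]) for a in grid]
best_a = grid[int(np.argmax(values))]

alpha = np.array([best_a, best_a])
w = (alpha * y) @ X          # w = sum_i alpha_i y_i x_i
b = y[0] - w @ X[0]          # from y_i (w.x_i + b) = 1 at a support vector

print("alpha:", alpha)
print("w:", w, "b:", b)
print("dual value:", dual_objective(alpha), "primal value:", 0.5 * (w @ w))
```

The dual and primal objective values coincide at the optimum, which is what the Lagrangian derivation predicts for this convex problem.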

4. Kernel Trick for Non-Linear SVM

For non-linearly separable data, we use a kernel function $K(x_i, x_j)$, which computes inner products in a higher-dimensional feature space without constructing that space explicitly. In that space the data may become linearly separable.

Common Kernel Functions

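  • Linear Kernel: $K(x_i, x_j) = x_i \cdot x_j$
  • Polynomial Kernel: $K(x_i, x_j) = (x_i \cdot x_j + c)^d$
  • Radial Basis Function (RBF) Kernel: $K(x_i, x_j) = \exp\left(-\frac{\|x_i - x_j\|^2}{2\sigma^2}\right)$
  • Sigmoid Kernel: $K(x_i, x_j) = \tanh(\beta \, x_i \cdot x_j + c)$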
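
The four kernels translate almost directly into NumPy. The function names and default parameter values below are assumptions for illustration, not a library API.

```python
import numpy as np


def linear_kernel(a, b):
    """K(a, b) = a . b"""
    return float(np.dot(a, b))


def polynomial_kernel(a, b, c=1.0, d=2):
    """K(a, b) = (a . b + c)^d"""
    return float((np.dot(a, b) + c) ** d)


def rbf_kernel(a, b, sigma=1.0):
    """K(a, b) = exp(-||a - b||^2 / (2 * sigma^2))"""
    diff = np.asarray(a, dtype=float) - np.asarray(b, dtype=float)
    return float(np.exp(-np.dot(diff, diff) / (2.0 * sigma ** 2)))


def sigmoid_kernel(a, b, beta=0.5, c=0.0):
    """K(a, b) = tanh(beta * a . b + c)"""
    return float(np.tanh(beta * np.dot(a, b) + c))


u = np.array([1.0, 2.0])
v = np.array([3.0, 0.0])

print("linear:    ", linear_kernel(u, v))       # 3.0
print("polynomial:", polynomial_kernel(u, v))   # 16.0
print("rbf:       ", rbf_kernel(u, v))
print("sigmoid:   ", sigmoid_kernel(u, v))
```

Note that scikit-learn parameterizes the RBF kernel with gamma rather than sigma; the two are related by gamma = 1 / (2 * sigma^2).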

3. Implementing SVM in Python

Step 1: Import Required Libraries

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC, SVR
from sklearn.metrics import accuracy_score, classification_report
from sklearn.datasets import load_iris

4. SVM for Classification (Iris Dataset Example)

Step 1: Load Dataset

# Load dataset
iris = load_iris()
X = iris.data  # Features
y = iris.target  # Target classes

# Convert to DataFrame
df = pd.DataFrame(X, columns=iris.feature_names)
df['Target'] = y

# Display first 5 rows
print(df.head())

Step 2: Split Data into Training and Testing Sets

# Split into training (80%) and testing (20%) sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

Step 3: Train the SVM Classifier

# Create SVM model with RBF kernel
svm_model = SVC(kernel='rbf', C=1.0, gamma='scale')

# Train the model
svm_model.fit(X_train, y_train)
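
SVMs rely on distances and dot products, so features on very different scales can dominate the result. One common addition, which is not part of the walkthrough above, is to standardize the features inside a pipeline. The sketch below repeats the same split and model settings and also inspects the fitted support vectors; the variable names `scaled_svm`, `svc`, and `test_accuracy` are assumptions for the example.

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

# Standardize features using training-set statistics, then fit the RBF SVM.
scaled_svm = make_pipeline(
    StandardScaler(),
    SVC(kernel="rbf", C=1.0, gamma="scale"),
)
scaled_svm.fit(X_train, y_train)

# make_pipeline names each step after its lowercased class name.
svc = scaled_svm.named_steps["svc"]
print("Support vectors per class:", svc.n_support_)
print("Support vector matrix shape:", svc.support_vectors_.shape)

test_accuracy = scaled_svm.score(X_test, y_test)
print("Test accuracy:", test_accuracy)
```

Putting the scaler inside the pipeline means the test set is transformed with statistics learned from the training set only, which avoids leaking test information into the model.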

Step 4: Make Predictions

# Predict on test data
y_pred = svm_model.predict(X_test)

Step 5: Evaluate Model Performance

# Accuracy Score
accuracy = accuracy_score(y_test, y_pred)
print("Accuracy:", accuracy)

# Classification Report
print("Classification Report:\n", classification_report(y_test, y_pred))

5. SVM for Regression (California Housing Dataset Example)

Step 1: Load Dataset

from sklearn.datasets import fetch_california_housing

# Load dataset
data = fetch_california_housing()
X = data.data
y = data.target

# Convert to DataFrame
df = pd.DataFrame(X, columns=data.feature_names)
df['Target'] = y

# Display first 5 rows
print(df.head())

Step 2: Split Data into Training and Testing Sets

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

Step 3: Train the SVM Regressor

# Create SVM Regressor model with RBF kernel
svm_regressor = SVR(kernel='rbf', C=1.0, gamma='scale')

# Train the model
svm_regressor.fit(X_train, y_train)

Step 4: Make Predictions

# Predict on test data
y_pred = svm_regressor.predict(X_test)

Step 5: Evaluate Model Performance

from sklearn.metrics import mean_squared_error

# Calculate Mean Squared Error
mse = mean_squared_error(y_test, y_pred)
print("Mean Squared Error:", mse)

6. Understanding the Output

🔹 Accuracy Score → Fraction of correctly classified instances (a value between 0 and 1).
🔹 Classification Report → Precision, recall, and F1-score for each class.
🔹 Mean Squared Error (Regression) → Average squared difference between predicted and actual values; lower is better.
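
To show what these numbers mean, the sketch below recomputes them by hand on small made-up label and value lists. The helper names are assumptions for illustration; in practice the scikit-learn functions used earlier do the same job.

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def precision_recall_f1(y_true, y_pred, positive):
    """Precision, recall, and F1 for one class treated as the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


def mse(y_true, y_pred):
    """Mean of squared differences between true and predicted values."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


# Made-up classification labels for three classes.
labels_true = [0, 0, 1, 1, 2, 2]
labels_pred = [0, 1, 1, 1, 2, 0]
print("accuracy:", accuracy(labels_true, labels_pred))
print("class 1 (precision, recall, f1):",
      precision_recall_f1(labels_true, labels_pred, positive=1))

# Made-up regression targets.
values_true = [3.0, 5.0, 2.5]
values_pred = [2.5, 5.0, 4.0]
print("mse:", mse(values_true, values_pred))
```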

7. Choosing the Right Kernel

Kernel | When to Use
Linear Kernel | When data is linearly separable
Polynomial Kernel | When data has polynomial relationships
RBF Kernel | When data is not linearly separable
Sigmoid Kernel | When data has similarity-based relationships
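
These rules of thumb are only a starting point; in practice the kernel is often chosen empirically. A minimal sketch, assuming the Iris data and 5-fold cross-validation with default hyperparameters, compares the four kernels side by side. The scaling step and the `cv_scores` dictionary are additions for this example.

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

X, y = load_iris(return_X_y=True)

cv_scores = {}
for kernel in ("linear", "poly", "rbf", "sigmoid"):
    # Same preprocessing for every kernel so the comparison is fair.
    model = make_pipeline(StandardScaler(), SVC(kernel=kernel))
    cv_scores[kernel] = cross_val_score(model, X, y, cv=5).mean()

# Report kernels from best to worst mean cross-validated accuracy.
for kernel, score in sorted(cv_scores.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{kernel:8s} mean CV accuracy = {score:.3f}")
```

The ranking depends on the dataset and on hyperparameters such as C and gamma, so a result on Iris should not be read as a general ordering of the kernels.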

8. Advantages & Disadvantages of SVM

✅ Advantages

✔ Works well for small datasets.
✔ Effective in high-dimensional spaces.
✔ Robust to outliers (with soft margin tuning).

❌ Disadvantages

❌ Slow for large datasets.
❌ Choosing the right kernel is tricky.
❌ Difficult to interpret results compared to Decision Trees.

Summary

✔ SVM finds the best hyperplane that separates classes with the maximum margin.
✔ It uses kernel functions (Linear, RBF, Polynomial) to handle non-linear data.
✔ SVM can be used for both classification (SVC) and regression (SVR).
✔ It works best for small-to-medium-sized datasets.