Data visualization is a key component of data science: it helps you understand patterns and relationships in data and draw insights from them. Two of the most popular Python libraries for visualization are Matplotlib and Seaborn.
1. Matplotlib: The Foundation of Data Visualization
Matplotlib is a low-level, highly customizable library for creating static, animated, and interactive plots in Python.
1.1. Installing Matplotlib
If you haven’t installed Matplotlib yet, you can do so using:
pip install matplotlib
1.2. Basic Line Plot Using Matplotlib
    import matplotlib.pyplot as plt

    # Sample data
    x = [1, 2, 3, 4, 5]
    y = [10, 20, 25, 30, 40]

    # Create a line plot
    plt.plot(x, y, marker='o', linestyle='-', color='b', label="Data Line")
    plt.xlabel("X-axis")
    plt.ylabel("Y-axis")
    plt.title("Basic Line Plot")
    plt.legend()
    plt.show()
1.3. Bar Chart Using Matplotlib
    # Sample data
    categories = ['A', 'B', 'C', 'D']
    values = [5, 7, 3, 8]

    # Create a bar chart
    plt.bar(categories, values, color=['blue', 'red', 'green', 'orange'])
    plt.title("Basic Bar Chart")
    plt.xlabel("Categories")
    plt.ylabel("Values")
    plt.show()
1.4. Histogram Using Matplotlib
    import numpy as np

    # Generate random data
    data = np.random.randn(1000)

    # Create a histogram
    plt.hist(data, bins=30, edgecolor="black", color="blue")
    plt.title("Histogram Example")
    plt.xlabel("Value")
    plt.ylabel("Frequency")
    plt.show()
1.5. Scatter Plot Using Matplotlib
    # Sample data
    x = np.random.rand(50)
    y = np.random.rand(50)

    # Create scatter plot
    plt.scatter(x, y, color='purple', alpha=0.6)
    plt.title("Scatter Plot Example")
    plt.xlabel("X-axis")
    plt.ylabel("Y-axis")
    plt.show()
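The examples above all use the `pyplot` state-based interface. Matplotlib's fine-grained control is usually reached through explicit `Figure` and `Axes` objects instead. The following is a minimal sketch of that style, not the only way to structure it; the function name `make_two_panel_figure` and the sample data are illustrative assumptions, not part of the original examples.

```python
import matplotlib.pyplot as plt
import numpy as np


def make_two_panel_figure():
    """Build a figure with two independently styled Axes (illustrative helper)."""
    rng = np.random.default_rng(seed=0)  # fixed seed so the figure is reproducible
    x = np.linspace(0, 10, 50)
    noisy = np.sin(x) + rng.normal(scale=0.2, size=x.size)

    # One row, two columns; each Axes is configured through its own methods
    fig, (ax_line, ax_hist) = plt.subplots(1, 2, figsize=(10, 4))

    ax_line.plot(x, noisy, marker="o", linestyle="--", color="tab:blue",
                 label="noisy sin(x)")
    ax_line.set_xlabel("x")
    ax_line.set_ylabel("value")
    ax_line.set_title("Line plot")
    ax_line.grid(True, alpha=0.3)
    ax_line.legend()

    ax_hist.hist(noisy, bins=15, edgecolor="black", color="tab:orange")
    ax_hist.set_xlabel("value")
    ax_hist.set_ylabel("Frequency")
    ax_hist.set_title("Histogram")

    fig.tight_layout()
    return fig, ax_line, ax_hist


fig, ax_line, ax_hist = make_two_panel_figure()

if __name__ == "__main__":
    plt.show()
```

Because each panel is a separate `Axes`, labels, grids, and legends can be set per panel without affecting the other, which is harder to keep track of with bare `plt.*` calls.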
2. Seaborn: High-Level Statistical Visualization
Seaborn is built on top of Matplotlib and provides a higher-level interface for statistical graphics, with attractive default styles and direct support for pandas DataFrames.
2.1. Installing Seaborn
pip install seaborn
2.2. Basic Line Plot Using Seaborn
    import seaborn as sns

    # Sample data
    x = np.linspace(0, 10, 100)
    y = np.sin(x)

    # Seaborn line plot
    sns.lineplot(x=x, y=y)
    plt.title("Seaborn Line Plot")
    plt.show()
2.3. Bar Chart Using Seaborn
    import pandas as pd

    # Sample data
    data = {'Category': ['A', 'B', 'C', 'D'], 'Values': [5, 7, 3, 8]}
    df = pd.DataFrame(data)

    # Seaborn bar plot
    # Note: Seaborn 0.13+ warns when palette is passed without hue
    sns.barplot(x='Category', y='Values', data=df, palette='coolwarm')
    plt.title("Seaborn Bar Chart")
    plt.show()
2.4. Histogram Using Seaborn
    # Generate random data
    data = np.random.randn(1000)

    # Seaborn histogram with a kernel density estimate (KDE) overlaid
    sns.histplot(data, bins=30, kde=True, color='green')
    plt.title("Seaborn Histogram with KDE")
    plt.show()
2.5. Scatter Plot Using Seaborn
    # Generate sample data
    df = pd.DataFrame({'X': np.random.rand(50), 'Y': np.random.rand(50)})

    # Seaborn scatter plot
    sns.scatterplot(x='X', y='Y', data=df, color='red')
    plt.title("Seaborn Scatter Plot")
    plt.show()
2.6. Box Plot Using Seaborn
    # Generate sample data
    df = pd.DataFrame({'Category': ['A', 'B', 'A', 'B', 'A', 'B'],
                       'Values': np.random.randn(6)})

    # Seaborn box plot
    sns.boxplot(x='Category', y='Values', data=df)
    plt.title("Seaborn Box Plot")
    plt.show()
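Since Seaborn is built on top of Matplotlib, its axes-level functions such as `sns.boxplot` draw onto a Matplotlib `Axes` and return it. The sketch below, with made-up sample data, shows one way to pass in your own `Axes` and then keep customizing it with ordinary Matplotlib methods.

```python
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
import seaborn as sns

# Illustrative data: three categories with different means
rng = np.random.default_rng(seed=1)
df = pd.DataFrame({
    "Category": np.repeat(["A", "B", "C"], 30),
    "Values": np.concatenate([
        rng.normal(loc=0.0, size=30),
        rng.normal(loc=1.0, size=30),
        rng.normal(loc=2.0, size=30),
    ]),
})

fig, ax = plt.subplots(figsize=(6, 4))

# Seaborn draws on the Axes we pass in and returns that same object
returned_ax = sns.boxplot(x="Category", y="Values", data=df, ax=ax)

# From here on, plain Matplotlib methods apply
ax.axhline(0, color="gray", linestyle="--", linewidth=1)
ax.set_ylabel("Measured value")
ax.set_title("Seaborn box plot, customized with Matplotlib")
fig.tight_layout()

if __name__ == "__main__":
    plt.show()
```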
3. Matplotlib vs. Seaborn: When to Use Which?
Feature | Matplotlib | Seaborn |
---|---|---|
Customization | High | Moderate |
Ease of Use | Medium | High |
Default Styling | Simple | Beautiful |
Statistical Visualization | Limited | Built-in |
Best For | Custom, fine-tuned plots | Quick statistical plots |
- Use Matplotlib when you need fine-grained control over visualizations.
- Use Seaborn for quick and beautiful statistical graphics with minimal effort.
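To make the trade-off concrete, here is a rough side-by-side sketch of the same grouped box plot in both libraries. The data are invented for illustration. With Matplotlib the values are split per category by hand, while Seaborn infers the grouping from the DataFrame column names.

```python
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
import seaborn as sns

# Illustrative data: two categories with different spreads
rng = np.random.default_rng(seed=2)
df = pd.DataFrame({
    "Category": np.repeat(["A", "B"], 50),
    "Values": np.concatenate([rng.normal(0, 1, 50), rng.normal(1, 1.5, 50)]),
})

fig, (ax_mpl, ax_sns) = plt.subplots(1, 2, figsize=(10, 4), sharey=True)

# Matplotlib: split the values per category manually, then label the boxes
labels = sorted(df["Category"].unique())
groups = [df.loc[df["Category"] == label, "Values"].to_numpy() for label in labels]
ax_mpl.boxplot(groups)
ax_mpl.set_xticks(list(range(1, len(labels) + 1)))  # boxes sit at positions 1..N
ax_mpl.set_xticklabels(labels)
ax_mpl.set_title("Matplotlib: manual grouping")

# Seaborn: grouping is inferred from the column names
sns.boxplot(x="Category", y="Values", data=df, ax=ax_sns)
ax_sns.set_title("Seaborn: grouping from DataFrame")

fig.tight_layout()

if __name__ == "__main__":
    plt.show()
```

The Matplotlib half takes more lines, but every step (grouping, positions, tick labels) is explicit and can be changed individually.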
Summary
- Matplotlib provides flexible and powerful plotting but usually needs more code to reach a polished result.
- Seaborn simplifies statistical visualization with built-in themes and color palettes.
- Both libraries work well together: choose whichever fits the visualization at hand, or combine them in the same figure.
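As a small illustration of the last two points, Seaborn's themes and palettes can also style a plot drawn entirely with Matplotlib. This is a sketch under the assumption that a global theme change is acceptable in your session, since `sns.set_theme` modifies Matplotlib's default settings for all later plots.

```python
import matplotlib.pyplot as plt
import seaborn as sns

# Apply a built-in Seaborn theme; this also restyles plain Matplotlib plots
sns.set_theme(style="whitegrid")

# color_palette returns a list-like of RGB tuples
palette = sns.color_palette("coolwarm", n_colors=4)

categories = ["A", "B", "C", "D"]
values = [5, 7, 3, 8]

fig, ax = plt.subplots()
ax.bar(categories, values, color=palette)  # Matplotlib call, Seaborn colors
ax.set_xlabel("Categories")
ax.set_ylabel("Values")
ax.set_title("Matplotlib bar chart with a Seaborn theme and palette")

if __name__ == "__main__":
    plt.show()
```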