Module 2: Data Visualization
This module transforms you from someone who makes charts to someone who reveals insight through visual design.
You’ll learn:
- How perception psychology shapes what we see in visualizations and why color, Gestalt principles, and preattentive attributes matter.
- The golden rule: show all the data whenever possible, and why summary statistics alone mislead.
- How to choose the right visualization for one-dimensional, two-dimensional, and high-dimensional datasets.
- Specialized techniques for networks and time-series data that reveal structure and patterns.
The Journey
Let’s talk about where this module takes you. We begin with the surprising ways your visual system deceives you, then build up practical skills for every major data type.
Each part answers a crucial question about effective visualization.
Principles of Effective Visualization
Before we can make good visualizations, we need to understand how humans see. Your visual perception is contextual, subjective, and driven by unconscious processes.
You’ll learn how color fools us, how Gestalt principles organize information, and how preattentive attributes guide attention.
Visualizing One-Dimensional Data
The golden rule of visualization: show all the data, whenever possible. Why do dynamite plots (bar charts with error bars) persist despite being misleading?
You’ll discover swarm plots, violin plots, and kernel density estimation. The goal is honesty.
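To make the golden rule concrete, here is a minimal sketch of why a mean and an error bar can hide the shape of the data. The two samples are synthetic and invented for illustration, and `kde_gaussian` is a hypothetical helper written for this example, not a library function. Both samples have nearly the same mean and standard deviation, yet a kernel density estimate shows one peak for the first and two for the second.

```python
import numpy as np

rng = np.random.default_rng(0)

# Two synthetic samples (assumed for illustration) with similar mean and spread.
unimodal = rng.normal(loc=0.0, scale=2.0, size=500)
bimodal = np.concatenate([
    rng.normal(loc=-1.9, scale=0.6, size=250),
    rng.normal(loc=1.9, scale=0.6, size=250),
])


def kde_gaussian(samples, grid, bandwidth):
    """Kernel density estimate: average of Gaussian bumps centered on each sample."""
    z = (grid[:, None] - samples[None, :]) / bandwidth
    kernels = np.exp(-0.5 * z**2) / np.sqrt(2.0 * np.pi)
    return kernels.mean(axis=1) / bandwidth


grid = np.linspace(-8.0, 8.0, 321)
density_uni = kde_gaussian(unimodal, grid, bandwidth=0.4)
density_bi = kde_gaussian(bimodal, grid, bandwidth=0.4)

# A dynamite plot would draw these two samples almost identically.
print("means:", unimodal.mean(), bimodal.mean())
print("stds: ", unimodal.std(), bimodal.std())

# The density at the center tells a different story: the bimodal sample has a gap there.
center = np.argmin(np.abs(grid))
print("density at 0:", density_uni[center], density_bi[center])
```

The bandwidth of 0.4 is an arbitrary choice for this sketch. A much larger bandwidth would smooth the two peaks into one, which is why the bandwidth is itself a design decision.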
Visualizing Two-Dimensional Data
Shift your attention to relationships between two variables. Scatter plots are just the beginning.
You’ll learn when to use hexbin plots for dense data, how to add marginal distributions, and why correlation doesn’t imply causation.
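As a rough sketch of the hexbin idea, the example below bins 20,000 synthetic points into hexagons and adds a histogram of each variable along the margins. The data, the function name `plot_hexbin_with_marginals`, and the layout values are assumptions made for illustration. matplotlib is imported inside the function so that the data-generation part runs even where matplotlib is not installed.

```python
import numpy as np

rng = np.random.default_rng(1)

# Synthetic, correlated data (assumed for illustration); too dense for a plain scatter plot.
x = rng.normal(size=20_000)
y = 0.6 * x + rng.normal(scale=0.8, size=20_000)


def plot_hexbin_with_marginals(x, y, gridsize=40):
    """Draw a hexbin plot of (x, y) with marginal histograms on the top and right."""
    import matplotlib.pyplot as plt  # local import: only needed when plotting

    fig = plt.figure(figsize=(6, 6))
    gs = fig.add_gridspec(
        2, 2, width_ratios=(4, 1), height_ratios=(1, 4), wspace=0.05, hspace=0.05
    )
    ax = fig.add_subplot(gs[1, 0])
    ax_top = fig.add_subplot(gs[0, 0], sharex=ax)
    ax_right = fig.add_subplot(gs[1, 1], sharey=ax)

    # Color encodes the number of points per hexagon; empty hexagons stay blank.
    ax.hexbin(x, y, gridsize=gridsize, cmap="viridis", mincnt=1)
    ax.set_xlabel("x")
    ax.set_ylabel("y")

    # Marginal distributions of each variable on its own.
    ax_top.hist(x, bins=60)
    ax_right.hist(y, bins=60, orientation="horizontal")
    ax_top.tick_params(labelbottom=False)
    ax_right.tick_params(labelleft=False)
    return fig
```

A possible usage is `fig = plot_hexbin_with_marginals(x, y)` followed by `fig.savefig("hexbin.png")`. seaborn offers a similar figure in one call through `jointplot` with `kind="hex"`.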
Visualizing High-Dimensional Data
What happens when you have dozens or hundreds of variables? Parallel coordinates, radar charts, and dimensionality reduction techniques like PCA, t-SNE, and UMAP reveal structure in high-dimensional spaces.
You’ll understand when each method shines and when it misleads.
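Of the methods named above, PCA is the simplest to sketch directly. The example below is a minimal illustration using NumPy’s SVD, not a replacement for a library implementation. The dataset is synthetic (a hidden 2D structure embedded in 10 dimensions with a little noise), and `pca_project` is a hypothetical helper name.

```python
import numpy as np

rng = np.random.default_rng(2)

# Synthetic data (assumed for illustration): 2D structure embedded in 10 dimensions.
latent = rng.normal(size=(200, 2))
mixing = rng.normal(size=(2, 10))
X = latent @ mixing + 0.05 * rng.normal(size=(200, 10))


def pca_project(X, n_components=2):
    """Project X onto its top principal components using the SVD."""
    X_centered = X - X.mean(axis=0)
    # Rows of Vt are the principal directions, ordered by decreasing variance.
    _, singular_values, Vt = np.linalg.svd(X_centered, full_matrices=False)
    explained = singular_values**2 / np.sum(singular_values**2)
    scores = X_centered @ Vt[:n_components].T
    return scores, explained[:n_components]


projection, explained = pca_project(X)
print("projection shape:", projection.shape)
print("fraction of variance kept:", explained.sum())
```

Because this data really is close to a 2D plane, two components keep almost all of the variance. On real data, checking the fraction of variance kept is a first test of whether a 2D projection can be trusted.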
Visualizing Networks
Networks capture relationships: social connections, citations, biological interactions. Node-link diagrams are intuitive but quickly become hairballs.
You’ll explore force-directed layouts, matrix visualizations, and when to use network embeddings instead.
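The matrix view mentioned above can be sketched in a few lines. The edge list here is a made-up toy network of two triangles joined by a single edge. Each edge becomes a pair of filled cells in an adjacency matrix, so dense groups appear as blocks instead of tangled lines.

```python
import numpy as np

# Toy undirected network (assumed for illustration): two triangles joined by one edge.
edges = [(0, 1), (0, 2), (1, 2), (3, 4), (3, 5), (4, 5), (2, 3)]
n_nodes = 6

adjacency = np.zeros((n_nodes, n_nodes), dtype=int)
for u, v in edges:
    adjacency[u, v] = 1
    adjacency[v, u] = 1  # undirected: mirror every edge

# Row sums give each node's degree (number of neighbors).
degrees = adjacency.sum(axis=1)

print(adjacency)
print("degrees:", degrees)
```

Passing `adjacency` to matplotlib’s `imshow` would turn this into a matrix visualization. The order of rows and columns matters: the two blocks are visible here only because nodes in the same group sit next to each other.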
Visualizing Time-Series Data
Time-series data has special structure. Trends, seasonality, and anomalies emerge through clever visual design.
You’ll learn about scale choices (linear vs. log), smoothing techniques, and how to handle multiple time-series simultaneously.
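To illustrate the scale and smoothing choices, here is a small sketch on a synthetic series with exponential growth and multiplicative noise. The series, the window length of 15, and the helper `moving_average` are assumptions made for this example. On a log scale the growth becomes a straight line, and a moving average removes most of the point-to-point noise.

```python
import numpy as np

rng = np.random.default_rng(3)

# Synthetic series (assumed for illustration): 3% growth per step with multiplicative noise.
t = np.arange(200)
series = 10.0 * np.exp(0.03 * t) * np.exp(rng.normal(scale=0.1, size=t.size))


def moving_average(values, window):
    """Simple unweighted moving average; the output is shorter by window - 1 points."""
    kernel = np.ones(window) / window
    return np.convolve(values, kernel, mode="valid")


# On a log scale, exponential growth becomes a straight line.
log_series = np.log10(series)
slope, intercept = np.polyfit(t, log_series, deg=1)

# Smoothing in log space averages out the multiplicative noise.
smoothed = moving_average(log_series, window=15)

print("fitted slope in log10 units per step:", slope)
print("points before and after smoothing:", log_series.size, smoothed.size)
```

In a plot, the same effect comes from setting a logarithmic y-axis, for example with matplotlib’s `set_yscale("log")`. A wider window gives a smoother line but blurs short-lived anomalies, so the window length is a trade-off rather than a fixed rule.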
Why This Matters
Here’s something remarkable. Visualization is not decoration. It’s exploration, communication, and decision-making rolled into one. When you visualize data well, patterns jump out immediately, outliers reveal themselves, and relationships become obvious.
Questions you didn’t know to ask suddenly demand answers. Exploratory visualization drives discovery. But visualization also communicates: a well-designed figure conveys your finding instantly, while a poorly designed figure obscures truth (or worse, misleads).
Understanding perception psychology and design principles separates effective communication from noise. These skills matter well beyond academia: data-driven organizations make decisions based on dashboards and reports, and the ability to transform raw numbers into clear visual stories is increasingly valuable. Whether you’re analyzing experimental results, monitoring systems, or presenting to stakeholders, visualization is your interface to insight.
Prerequisites
You should be comfortable with basic Python programming and familiar with NumPy arrays and Pandas DataFrames. Experience with matplotlib or seaborn helps but isn’t required. We’ll teach you the visualization libraries as we go, focusing on principles and design choices rather than syntax. Understanding when to use a particular chart type matters more than memorizing function parameters.
What You’ll Build
By the end of this module, you’ll have a diverse visualization portfolio:
- Honest one-dimensional visualizations that show all data points.
- Effective scatter plots with marginal distributions.
- High-dimensional datasets reduced to interpretable 2D projections.
- Networks visualized using multiple complementary approaches.
- Time-series data handled with appropriate scale and smoothing choices.
Most importantly, you’ll develop intuition for chart selection. Given a dataset and a question, you’ll know which visualization reveals the answer.
Let’s begin by understanding how your visual system really works.