Module 7: Representations & Structuralism

What you’ll learn in this module

This module explores how machine learning sees the world through a structuralist lens.

You’ll learn:

  • Why we draw boundaries where none exist in continuous reality and how observation shapes categories.
  • How meaning emerges from relationships rather than fixed definitions in Saussurean linguistics and structural anthropology.
  • What vector representations reveal about the continuous nature of concepts and why embeddings restore lost gradation.
  • How embeddings dissolve artificial divisions while preserving relational structure across text, images, networks, and time-series.

The Journey

Have you ever wondered where one category ends and another begins? The world is continuous, but we carve it into categories. This module traces a philosophical thread from linguistic structuralism to modern representation learning.

Part 1: When Boundaries Blur

We start by questioning the categories we take for granted. Regional dialects, generational labels, and even physical objects have fuzzy boundaries that we artificially sharpen. Is a 17-year-old fundamentally different from an 18-year-old? Where exactly does “blue” end and “green” begin? These divisions feel natural because language demands them, but they’re artifacts of how we choose to see.

Part 2: Meaning as Difference

Here’s a radical idea: language doesn’t label pre-existing categories but creates them through contrast. A word means what it means because of what it is not. We explore how Saussure, Buddhist logic, and structural anthropology all converge on this insight that meaning is relational, not absolute.

Part 3: From Categories to Coordinates

What happens when we take structuralism seriously? Word2Vec and modern embedding techniques operationalize this philosophical insight by mapping concepts into continuous vector space, restoring the gradation that categorical thinking destroys. The famous “king” − “man” + “woman” ≈ “queen” result isn’t magic; it’s geometry encoding relational meaning.
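As a concrete illustration, here is a minimal sketch of that analogy, assuming gensim is installed; the specific pretrained model, “glove-wiki-gigaword-50”, is one of gensim’s downloadable vector sets and is an illustrative choice, not part of this module’s materials.

```python
# A minimal sketch of the analogy above, assuming gensim is installed.
# gensim.downloader fetches pretrained 50-dimensional GloVe vectors
# (a one-time download of roughly 66 MB).
import gensim.downloader as api

vectors = api.load("glove-wiki-gigaword-50")

# most_similar() adds the "positive" vectors, subtracts the "negative"
# ones, and ranks the vocabulary by cosine similarity to the result.
for word, score in vectors.most_similar(positive=["king", "woman"],
                                        negative=["man"], topn=3):
    print(word, round(score, 3))
# "queen" typically appears at or near the top of the ranking.
```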

Part 4: Universal Representations

The representational insight extends far beyond language to networks (embedded in continuous space), images (mapped to semantic coordinates), and time-series (trajectories through latent dimensions). You’ll see how this structuralist perspective unifies disparate areas of machine learning.
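To make the network case concrete, here is a hedged sketch using networkx; the graph and the layout method are illustrative assumptions. spectral_layout places each node at coordinates derived from the graph Laplacian, so structurally related nodes land near one another in continuous space.

```python
# A sketch of a graph embedded in continuous space, assuming networkx
# and numpy are installed. spectral_layout positions nodes using
# eigenvectors of the graph Laplacian.
import networkx as nx
import numpy as np

G = nx.karate_club_graph()            # classic 34-node social network
pos = nx.spectral_layout(G, dim=2)    # dict mapping node -> 2-d coordinates

def distance(u, v):
    return float(np.linalg.norm(pos[u] - pos[v]))

# Nodes from the same faction tend to sit closer together than nodes
# from opposing factions.
print("0 vs 1: ", round(distance(0, 1), 3))
print("0 vs 33:", round(distance(0, 33), 3))
```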

Why This Matters

Traditional machine learning forces continuous phenomena into discrete boxes through classification boundaries and hard category edges, but a new generation of techniques embraces continuity. Understanding representation learning changes how you think about data by shifting the question from “What category does this belong to?” to “What relationships does this participate in?”
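As a toy illustration of that shift, compare forcing an item into one box with ranking its relationships; the 2-d vectors below are made up for the example, not real embeddings.

```python
# A toy contrast between the two questions, using made-up 2-d vectors.
import numpy as np

items = {
    "teal":  np.array([0.45, 0.55]),
    "blue":  np.array([0.10, 0.90]),
    "green": np.array([0.90, 0.10]),
}

def cosine(a, b):
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

# Discrete question: which category does "teal" belong to?
label = max(["blue", "green"], key=lambda c: cosine(items["teal"], items[c]))
print("hard label:", label)

# Relational question: what relationships does "teal" participate in?
for c in ["blue", "green"]:
    print(c, round(cosine(items["teal"], items[c]), 3))
# Both similarities are high: "teal" genuinely sits between the two.
```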

This perspective matters for fairness. When we recognize that categories are observer-dependent rather than natural, we become more careful about how we define them; and when we see bias in embeddings, we understand that it reflects the relational structure of our training data, not objective truth. These ideas connect to broader themes in science and philosophy: machine learning provides computational answers to questions philosophers have asked for centuries about how to divide continuous reality into discrete concepts.
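For instance, a hedged probe of that relational structure might look like the sketch below, reusing the pretrained GloVe vectors from the earlier example; the particular word pairs are illustrative choices.

```python
# A hedged probe of relational bias, reusing the pretrained GloVe
# vectors from the earlier sketch. The scores mirror co-occurrence
# patterns in the training corpus, not facts about the occupations.
import gensim.downloader as api

vectors = api.load("glove-wiki-gigaword-50")
for job in ["nurse", "engineer"]:
    print(job,
          "| she:", round(float(vectors.similarity(job, "she")), 3),
          "| he:",  round(float(vectors.similarity(job, "he")), 3))
```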

Prerequisites

You should have completed Module 4 on text and word embeddings and understand how Word2Vec works and what embeddings represent; we build on those concrete techniques to explore their philosophical implications. Basic familiarity with dimensionality reduction (PCA, t-SNE, UMAP) from Module 2 also helps, since we discuss how these techniques reveal structure by projecting high-dimensional representations into interpretable spaces. No formal philosophy background is required, since structuralist concepts are introduced as needed, but curiosity about the conceptual foundations of machine learning is essential.
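If you want to refresh that projection intuition, here is a minimal sketch, assuming scikit-learn and numpy; the random matrix stands in for real embeddings.

```python
# A minimal refresher sketch, assuming scikit-learn and numpy.
# Random vectors stand in for real 50-dimensional word embeddings.
import numpy as np
from sklearn.decomposition import PCA

rng = np.random.default_rng(0)
embeddings = rng.normal(size=(100, 50))   # 100 "words", 50 dimensions

coords = PCA(n_components=2).fit_transform(embeddings)
print(coords.shape)  # (100, 2): each vector is now a plottable point
```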

What You’ll Build

By the end of this module, you’ll have a conceptual framework for understanding representation learning: you’ll see the connections between linguistic structuralism and modern embeddings and understand why continuous representations often outperform discrete categories. You’ll analyze how different embedding techniques preserve different aspects of relational structure, recognize when discrete boundaries help and when they hurt, and think critically about how categorization shapes both human cognition and machine learning systems. Most importantly, you’ll gain philosophical depth that complements your technical skills, transforming you from someone who uses tools into someone who understands their foundations.

Let’s begin by examining the boundaries we take for granted.