Big Idea 5 – Computing Bias

Mar 17, 2025 • Avika, Gabi, Zoe

What is Computing Bias?

Bias: A prejudice in favor of or against a person or group in a way that is usually unfair.

Computing Bias occurs when algorithms or systems produce results that disadvantage certain groups. It often arises from:

Biased or incomplete data
Flawed design
Unintended consequences of programming choices

Example: Netflix Recommendation Bias

Netflix uses algorithms to recommend content, but those algorithms can introduce bias by:

Majority Preference Bias

Prioritizes popular shows, which pushes niche or diverse options out of view.

Filtering Bias

Filters out content based on limited viewing history.
If you mostly watch rom-coms, you may never see documentaries or foreign films.
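Both effects can be sketched in a few lines of Python. This is only an illustration, not how Netflix actually works: the catalog, the view counts, and the function names `recommend_by_popularity` and `recommend_by_history` are made up for this example.

```python
# Hypothetical catalog: title -> (genre, total views across all users).
# All titles and numbers are invented for illustration.
CATALOG = {
    "Big Hit Drama": ("drama", 9_000_000),
    "Popular Rom-Com": ("rom-com", 7_500_000),
    "Another Rom-Com": ("rom-com", 4_000_000),
    "Indie Documentary": ("documentary", 120_000),
    "Foreign Film": ("foreign", 80_000),
}


def recommend_by_popularity(catalog, k=3):
    """Majority preference bias: rank purely by total views."""
    ranked = sorted(catalog, key=lambda title: catalog[title][1], reverse=True)
    return ranked[:k]


def recommend_by_history(catalog, history, k=3):
    """Filtering bias: only suggest genres the user has already watched."""
    watched_genres = {catalog[title][0] for title in history}
    candidates = [
        title
        for title in catalog
        if catalog[title][0] in watched_genres and title not in history
    ]
    return sorted(candidates, key=lambda t: catalog[t][1], reverse=True)[:k]


print(recommend_by_popularity(CATALOG))
print(recommend_by_history(CATALOG, ["Popular Rom-Com"]))
```

In this toy catalog, neither function ever surfaces the documentary or the foreign film: the first because they have few views, the second because the user has not watched those genres yet.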

How Does Computing Bias Happen?

Unrepresentative or Incomplete Data
- Models trained on limited datasets don’t reflect real-world diversity.
Flawed or Biased Data
- If existing data includes prejudice (e.g., historical hiring patterns), the system learns and repeats those biases.
Biased Data Labeling
- Human annotators may unconsciously inject cultural or personal bias during labeling.
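
One way to spot unrepresentative data is to compare each group's share of the dataset with its share of the population the system is meant to serve. The sketch below assumes you already know those reference shares; the function name `representation_gaps` and the group labels are hypothetical.

```python
from collections import Counter


def representation_gaps(samples, reference_shares):
    """Return dataset share minus reference share for each group.

    A negative value means the group is underrepresented in the dataset.
    """
    counts = Counter(samples)
    total = len(samples)
    return {
        group: counts[group] / total - share
        for group, share in reference_shares.items()
    }


# Made-up example: group "B" is half the population but 20% of the data.
samples = ["A"] * 80 + ["B"] * 20
gaps = representation_gaps(samples, {"A": 0.5, "B": 0.5})
for group, gap in gaps.items():
    print(group, round(gap, 2))
```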

Explicit vs. Implicit Data

Type	Definition	Netflix Example
Explicit Data	Data directly provided by users	Entering your name, age, or rating a movie
Implicit Data	Data inferred from user behavior	Viewing history, time spent watching, click patterns

Why It Matters:

Implicit data can reinforce user habits, creating feedback loops that limit discovery.
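Explicit data can still be biased when the input options are limited by design (for example, a form that offers too few choices) or when users misunderstand what they are being asked.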
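
The sketch below shows how the two kinds of data might sit side by side in one user record, and how a feedback loop can form. It rests on strong simplifying assumptions: the `UserProfile` class is hypothetical, the "recommender" always suggests the most-watched genre, and the user always watches the suggestion.

```python
from dataclasses import dataclass, field


@dataclass
class UserProfile:
    # Explicit data: entered or chosen directly by the user.
    name: str
    age: int
    ratings: dict = field(default_factory=dict)
    # Implicit data: inferred from behavior (here, views per genre).
    watch_counts: dict = field(default_factory=dict)


def simulate_feedback_loop(profile, rounds):
    """Recommend the most-watched genre each round; assume the user watches it."""
    for _ in range(rounds):
        top_genre = max(profile.watch_counts, key=profile.watch_counts.get)
        profile.watch_counts[top_genre] += 1
    return profile.watch_counts


profile = UserProfile(
    name="Sam",
    age=16,
    ratings={"Popular Rom-Com": 5},
    watch_counts={"rom-com": 3, "documentary": 2},
)
print(simulate_feedback_loop(profile, rounds=10))
```

A small initial lead for rom-coms (3 views to 2) grows every round, while the documentary count never changes. That is the feedback loop in miniature.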

Popcorn Hack #1

Question: What is an example of Explicit Data?
Options:
A) Netflix recommends shows based on your viewing history
B) You provide your name, age, and preferences when creating a Netflix account
C) Netflix tracks the time you spend watching certain genres

Answer: B – This is explicit data, because it’s provided directly by the user.

Types of Bias

Algorithmic Bias

Comes from faulty system logic that repeats discrimination.
Example: Amazon’s experimental hiring tool favored men because it was trained on past resumes, most of which came from men.

Data Bias

Arises when training data is incomplete or unbalanced.
Example: A health AI system underestimates disease risk for underrepresented groups.
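
A common way to reveal data bias is to measure accuracy separately for each group instead of reporting one overall number. The following is a minimal sketch with invented records; `accuracy_by_group` is a hypothetical helper, not part of any library.

```python
def accuracy_by_group(records):
    """records: list of (group, true_label, predicted_label) tuples."""
    totals, correct = {}, {}
    for group, truth, predicted in records:
        totals[group] = totals.get(group, 0) + 1
        correct[group] = correct.get(group, 0) + (truth == predicted)
    return {group: correct[group] / totals[group] for group in totals}


# Made-up predictions: the model is right 9/10 times for group "X"
# but only 6/10 times for the underrepresented group "Y".
records = (
    [("X", 1, 1)] * 9 + [("X", 1, 0)] * 1
    + [("Y", 1, 1)] * 6 + [("Y", 1, 0)] * 4
)
print(accuracy_by_group(records))
```

Overall accuracy here is 75%, which hides the gap between the two groups.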

Cognitive Bias

Introduced by researchers or developers due to personal assumptions.
Example: A researcher only selects data supporting their belief about screen time affecting grades.

Popcorn Hack #2

Question: What is an example of Data Bias?
Options:
A) A hiring algorithm favors men due to biased past resumes
B) A dataset underrepresents people with darker skin tones
C) A researcher selects data that supports their screen time theory

Answer: B – Underrepresentation in data leads to performance issues for certain groups.

Intentional vs. Unintentional Bias

Intentional Bias

Purposefully embedding prejudice to favor one group.
Example: A hiring algorithm is designed to rank resumes from certain schools or companies higher, favoring specific demographics.

Unintentional Bias

Occurs accidentally, typically because of flawed or unrepresentative datasets.
Example: A facial recognition tool trained on mostly light-skinned faces struggles to recognize people with darker skin tones. The cause is poor data variety, not intent.

Popcorn Hack #3

Activity: Describe a biased scenario. Have classmates guess: was it intentional or unintentional?

Mitigation Strategies

To reduce bias in algorithms, apply these techniques at every phase:

1. Pre-processing (Planning & Data Collection)

Check for data diversity and completeness
Remove irrelevant or biased variables

Goal: Prepare balanced data to avoid bias in training.
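
As a rough sketch of this phase, the code below counts missing values per column and drops a column that should not influence the outcome. The column names and the helpers `missing_counts` and `drop_columns` are assumptions made for this example.

```python
def missing_counts(rows, columns):
    """Completeness check: count rows where each column is missing (None)."""
    return {
        column: sum(1 for row in rows if row.get(column) is None)
        for column in columns
    }


def drop_columns(rows, columns_to_drop):
    """Return copies of the rows without the listed columns."""
    return [
        {key: value for key, value in row.items() if key not in columns_to_drop}
        for row in rows
    ]


# Invented applicant records.
rows = [
    {"years_experience": 4, "gender": "F", "school": "State U"},
    {"years_experience": None, "gender": "M", "school": "City College"},
]
print(missing_counts(rows, ["years_experience", "school"]))
print(drop_columns(rows, {"gender"}))
```

Dropping a column does not guarantee fairness on its own, because other columns can act as stand-ins for the one that was removed.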

2. In-processing (Training & Validation)

Use cross-validation
Add synthetic data to represent minorities

Goal: Ensure fairness during model development.
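
The sketch below balances group sizes by random oversampling, which duplicates existing rows from smaller groups. Note that this is a simpler stand-in for true synthetic data generation, which would create new samples rather than copies. The function name `oversample_to_balance` and the `group` key are hypothetical.

```python
import random
from collections import defaultdict


def oversample_to_balance(rows, group_key, seed=0):
    """Duplicate random rows from smaller groups until all groups match the largest."""
    rng = random.Random(seed)  # fixed seed so the result is repeatable
    by_group = defaultdict(list)
    for row in rows:
        by_group[row[group_key]].append(row)

    target = max(len(members) for members in by_group.values())
    balanced = []
    for members in by_group.values():
        balanced.extend(members)
        # Top up the group with randomly chosen copies of its own rows.
        balanced.extend(rng.choice(members) for _ in range(target - len(members)))
    return balanced


# Made-up training set: 8 rows from group "A", 2 from group "B".
rows = [{"group": "A", "id": i} for i in range(8)]
rows += [{"group": "B", "id": i} for i in range(2)]
balanced = oversample_to_balance(rows, "group")
print(len(balanced))
```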

3. Post-processing (Deployment & Real-World Use)

Monitor system performance
Adjust output if unfair results appear

Goal: Maintain equity as the model operates in real settings.
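
Monitoring could look something like the sketch below, which compares how often each group receives a positive outcome. The 0.8 cutoff is an illustrative assumption, and the helpers `selection_rates` and `flag_disparity` are hypothetical names.

```python
def selection_rates(decisions):
    """decisions: list of (group, selected) pairs, where selected is a bool."""
    totals, selected_counts = {}, {}
    for group, selected in decisions:
        totals[group] = totals.get(group, 0) + 1
        selected_counts[group] = selected_counts.get(group, 0) + int(selected)
    return {group: selected_counts[group] / totals[group] for group in totals}


def flag_disparity(rates, min_ratio=0.8):
    """Flag groups whose rate falls below min_ratio times the highest rate."""
    highest = max(rates.values())
    if highest == 0:
        return {group: False for group in rates}  # nobody selected, nothing to compare
    return {group: rate / highest < min_ratio for group, rate in rates.items()}


# Invented deployment log: group "A" selected 6/10 times, group "B" 3/10 times.
decisions = (
    [("A", True)] * 6 + [("A", False)] * 4
    + [("B", True)] * 3 + [("B", False)] * 7
)
rates = selection_rates(decisions)
print(rates)
print(flag_disparity(rates))
```

A flagged group is a signal to investigate and possibly adjust the output. It is not proof of unfairness by itself.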

Homework Questions

Multiple Choice

(Each worth 0.1 point)

Which phase includes inserting synthetic samples?
What is an example of cognitive bias?
What’s the key difference between implicit and explicit data?
Which type of bias occurs due to flawed system logic?

(More questions provided in class or online)

Short-Answer

Prompt:
Explain the difference between implicit and explicit data. Give an example of each.

Scoring Rubric (Total: 1.0 point):

Criteria	Description	Points
Multiple-Choice (7 total)	0.1 point each	0.7
Short-Answer - Clarity	Clear explanation	0.15
Short-Answer - Examples	Two accurate examples provided	0.15
