Computing Bias
Computing Bias popcorn hacks and homework
Big Idea 5 – Computing Bias
Mar 17, 2025 • Avika, Gabi, Zoe
What is Computing Bias?
Bias: A prejudice in favor of or against a person or group in a way that is usually unfair.
Computing Bias occurs when algorithms or systems produce results that systematically and unfairly disadvantage certain groups. It often arises from:
- Biased or incomplete data
- Flawed design
- Unintended consequences of programming choices
Example: Netflix Recommendation Bias
Netflix uses algorithms to recommend content, but those algorithms can introduce bias by:
Majority Preference Bias
- Recommends only popular shows, hiding niche or diverse options.
Filtering Bias
- Filters out content based on limited viewing history.
- If you mostly watch rom-coms, you may never see documentaries or foreign films.
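To make Majority Preference Bias concrete, here is a minimal sketch of a recommender that ranks titles purely by how often they were watched. The function name, users, and titles are all hypothetical, and this is not how Netflix actually works; it only shows how a popularity-only rule keeps niche titles out of view.

```python
from collections import Counter

def recommend_by_popularity(watch_log, top_n=2):
    """Return the top_n most-watched titles across all users."""
    counts = Counter(title for _, title in watch_log)
    return [title for title, _ in counts.most_common(top_n)]

# Hypothetical watch log: (user, title) pairs.
watch_log = [
    ("ana", "Hit Show A"), ("ben", "Hit Show A"), ("cy", "Hit Show A"),
    ("ana", "Hit Show B"), ("ben", "Hit Show B"),
    ("cy", "Indie Documentary"),
]

recommendations = recommend_by_popularity(watch_log)
print(recommendations)  # ['Hit Show A', 'Hit Show B']
```

The documentary was watched, but it never reaches the list because only raw popularity counts.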
How Does Computing Bias Happen?
- Unrepresentative or Incomplete Data: Models trained on limited datasets don’t reflect real-world diversity.
- Flawed or Biased Data: If existing data includes prejudice (e.g., historical hiring patterns), the system learns and repeats those biases.
- Biased Data Labeling: Human annotators may unconsciously inject cultural or personal bias during labeling.
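The "learns and repeats" idea can be sketched with a toy model that simply memorizes past hire rates per group. Everything here (the groups "A" and "B", the data, the 0.5 threshold) is made up for illustration; real systems are far more complex, but the pattern is the same.

```python
from collections import defaultdict

def learn_hire_rates(history):
    """'Train' by memorizing the past hire rate for each group."""
    totals = defaultdict(int)
    hires = defaultdict(int)
    for group, hired in history:
        totals[group] += 1
        hires[group] += int(hired)
    return {group: hires[group] / totals[group] for group in totals}

def predict(rates, group, threshold=0.5):
    """Recommend hiring only if the group's past hire rate meets the threshold."""
    return rates[group] >= threshold

# Hypothetical history: group A was hired 8 times out of 10, group B 2 out of 10.
history = ([("A", True)] * 8 + [("A", False)] * 2
           + [("B", True)] * 2 + [("B", False)] * 8)

rates = learn_hire_rates(history)
print(rates)                 # {'A': 0.8, 'B': 0.2}
print(predict(rates, "A"))   # True
print(predict(rates, "B"))   # False
```

Nothing in the code mentions prejudice, yet the output copies the skew in the historical data.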
Explicit vs. Implicit Data
| Type | Definition | Netflix Example |
|---|---|---|
| Explicit Data | Data directly provided by users | Entering your name, age, or rating a movie |
| Implicit Data | Data inferred from user behavior | Viewing history, time spent watching, click patterns |
Why It Matters:
- Implicit data can reinforce user habits, creating feedback loops that limit discovery.
- Explicit data may still be biased if the choices offered are limited by design (for example, a form with too few options) or if users misunderstand what they are being asked.
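A small sketch of the difference, and of the feedback loop that implicit data can create. The profile fields, genres, and the assumption that the user always watches whatever is recommended are hypothetical simplifications.

```python
from collections import Counter

# Explicit data: provided directly by the user (hypothetical profile fields).
explicit = {"name": "Sam", "age": 16, "ratings": {"Some Rom-Com": 5}}

# Implicit data: inferred from behavior, here the genres the user has watched.
implicit_history = ["rom-com", "rom-com", "documentary"]

def recommend_genre(history):
    """Recommend whichever genre appears most often in the viewing history."""
    return Counter(history).most_common(1)[0][0]

# Feedback loop: assume the user watches whatever gets recommended.
for _ in range(5):
    genre = recommend_genre(implicit_history)
    implicit_history.append(genre)

share = implicit_history.count("rom-com") / len(implicit_history)
print(share)  # 0.875
```

Rom-coms grow from two of three views to seven of eight, while the single documentary view never leads to another one.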
Popcorn Hack #1
Question: What is an example of Explicit Data?
Options:
A) Netflix recommends shows based on your viewing history
B) You provide your name, age, and preferences when creating a Netflix account
C) Netflix tracks the time you spend watching certain genres
Answer: B – This is explicit data, because it’s provided directly by the user.
Types of Bias
Algorithmic Bias
- Comes from faulty system logic that repeats discrimination.
Example: Amazon’s hiring tool favored men because it was trained on past hiring data that was male-dominated.
Data Bias
- Arises when training data is incomplete or unbalanced.
Example: A health AI system underestimates disease risk for underrepresented groups.
Cognitive Bias
- Introduced by researchers or developers due to personal assumptions.
Example: A researcher only selects data supporting their belief about screen time affecting grades.
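One way to spot Data Bias like the health AI example above is to measure accuracy separately for each group instead of only overall. The sketch below uses made-up group names and predictions; it is an illustration of the check, not a real evaluation.

```python
def accuracy_by_group(records):
    """records: iterable of (group, true_label, predicted_label) tuples."""
    correct, total = {}, {}
    for group, truth, pred in records:
        total[group] = total.get(group, 0) + 1
        correct[group] = correct.get(group, 0) + int(truth == pred)
    return {group: correct[group] / total[group] for group in total}

# Hypothetical predictions from a model trained mostly on one group.
records = [
    ("well_represented", 1, 1), ("well_represented", 0, 0),
    ("well_represented", 1, 1), ("well_represented", 0, 0),
    ("underrepresented", 1, 0), ("underrepresented", 1, 0),
    ("underrepresented", 0, 0), ("underrepresented", 1, 1),
]

overall = sum(truth == pred for _, truth, pred in records) / len(records)
per_group = accuracy_by_group(records)
print(overall)    # 0.75
print(per_group)  # {'well_represented': 1.0, 'underrepresented': 0.5}
```

An overall accuracy of 0.75 looks acceptable, but it hides that the model misses two of the three true cases in the underrepresented group.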
Popcorn Hack #2
Question: What is an example of Data Bias?
Options:
A) A hiring algorithm favors men due to biased past resumes
B) A dataset underrepresents people with darker skin tones
C) A researcher selects data that supports their screen time theory
Answer: B – Underrepresentation in the data leads to worse performance for the underrepresented groups.
Intentional vs. Unintentional Bias
Intentional Bias
- Purposefully embedding prejudice to favor one group.
Example: A hiring algorithm is designed to rank resumes from certain schools or companies higher, favoring specific demographics.
Unintentional Bias
- Occurs accidentally due to flawed datasets.
Example: A facial recognition tool trained on mostly light-skinned faces struggles to recognize faces with darker skin tones, not because of intent but because the training data lacked variety.
Popcorn Hack #3
Activity: Describe a biased scenario. Have classmates guess: was it intentional or unintentional?
Mitigation Strategies
To reduce bias in algorithms, apply these techniques at every phase:
1. Pre-processing (Planning & Data Collection)
- Check for data diversity and completeness
- Remove irrelevant or biased variables
Goal: Prepare balanced data to avoid bias in training.
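A minimal sketch of both pre-processing checks, assuming records are stored as plain dictionaries. The column names, the `zip_code` field (standing in for a variable judged irrelevant or biased), and the 0.3 cutoff are hypothetical choices for illustration.

```python
from collections import Counter

def group_shares(rows, key):
    """Return each group's share of the dataset (shares sum to 1)."""
    counts = Counter(row[key] for row in rows)
    return {group: n / len(rows) for group, n in counts.items()}

def drop_columns(rows, columns):
    """Return copies of the rows without the named columns."""
    return [{k: v for k, v in row.items() if k not in columns} for row in rows]

# Hypothetical records.
rows = [
    {"group": "A", "zip_code": "Z1", "score": 0.9},
    {"group": "A", "zip_code": "Z1", "score": 0.7},
    {"group": "A", "zip_code": "Z2", "score": 0.8},
    {"group": "B", "zip_code": "Z3", "score": 0.6},
]

shares = group_shares(rows, "group")
flagged = [g for g, s in shares.items() if s < 0.3]  # 0.3 is an arbitrary cutoff
cleaned = drop_columns(rows, {"zip_code"})

print(shares)   # {'A': 0.75, 'B': 0.25}
print(flagged)  # ['B']
```

A flagged group is a signal to collect more data before training, not something the code can fix on its own.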
2. In-processing (Training & Validation)
- Use cross-validation
- Add synthetic data to better represent underrepresented groups
Goal: Ensure fairness during model development.
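As a rough stand-in for adding synthetic data, the sketch below balances groups by random oversampling, meaning it duplicates existing rows from smaller groups rather than generating genuinely new samples. The function name, data, and seed are hypothetical.

```python
import random

def oversample(rows, key, seed=0):
    """Duplicate rows from smaller groups until every group matches the largest."""
    rng = random.Random(seed)
    by_group = {}
    for row in rows:
        by_group.setdefault(row[key], []).append(row)
    target = max(len(members) for members in by_group.values())
    balanced = []
    for members in by_group.values():
        balanced.extend(members)
        # Sample with replacement to fill the gap (k may be 0 for the largest group).
        balanced.extend(rng.choices(members, k=target - len(members)))
    return balanced

# Hypothetical training rows: six from group A, two from group B.
rows = [{"group": "A", "x": i} for i in range(6)] + [{"group": "B", "x": i} for i in range(2)]

balanced = oversample(rows, "group")
counts = {g: sum(r["group"] == g for r in balanced) for g in ("A", "B")}
print(counts)  # {'A': 6, 'B': 6}
```

When this is combined with cross-validation, oversampling is usually applied only to the training folds so that duplicated rows do not leak into the validation data.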
3. Post-processing (Deployment & Real-World Use)
- Monitor system performance
- Adjust output if unfair results appear
Goal: Maintain equity as the model operates in real settings.
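Monitoring can be as simple as comparing how often each group receives a positive outcome. The sketch below assumes decisions are logged as (group, approved) pairs; the data and the 0.8 alert threshold are hypothetical, and a real deployment would choose its own fairness metric.

```python
def selection_rates(decisions):
    """decisions: iterable of (group, approved) pairs. Returns approval rate per group."""
    approved, total = {}, {}
    for group, ok in decisions:
        total[group] = total.get(group, 0) + 1
        approved[group] = approved.get(group, 0) + int(ok)
    return {group: approved[group] / total[group] for group in total}

def rate_ratio(rates):
    """Lowest group rate divided by the highest (1.0 means equal rates)."""
    highest = max(rates.values())
    return 1.0 if highest == 0 else min(rates.values()) / highest

# Hypothetical logged decisions: A approved 6 of 10, B approved 3 of 10.
decisions = ([("A", True)] * 6 + [("A", False)] * 4
             + [("B", True)] * 3 + [("B", False)] * 7)

rates = selection_rates(decisions)
ratio = rate_ratio(rates)
needs_review = ratio < 0.8  # assumed alert threshold

print(rates)         # {'A': 0.6, 'B': 0.3}
print(needs_review)  # True
```

A flagged ratio is a prompt for people to investigate and, if needed, adjust the output; it does not by itself prove the system is unfair.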
Homework Questions
Multiple Choice
(Each worth 0.1 point)
- Which phase includes inserting synthetic samples?
- What is an example of cognitive bias?
- What’s the key difference between implicit and explicit data?
- Which type of bias occurs due to flawed system logic?
(More questions provided in-class or online)
Short-Answer
Prompt:
Explain the difference between implicit and explicit data. Give an example of each.
Scoring Rubric (Total: 1.0 point):
| Criteria | Description | Points |
|---|---|---|
| Multiple-Choice (7 total) | 0.1 point each | 0.7 |
| Short-Answer - Clarity | Clear explanation | 0.15 |
| Short-Answer - Examples | Two accurate examples provided | 0.15 |