Lesson 3: Quality Attributes & Tradeoffs

Redis (Paper-Server Edition)

Your fast-lookup reference for architectural thinking

Why "Redis (Paper-Server Edition)"?

Redis is an in-memory data store - lightning-fast lookups for frequently accessed data.

This handout is your paper-server cache - lightning-fast lookups for frequently needed architecture knowledge.

Just like Redis:

Fast retrieval (flip to page, find answer)
Frequently accessed (use during every assignment)
Volatile storage (you don't memorize, you query)
Key-value lookups (problem → solution)

Unlike Redis:

Doesn't expire or get evicted
Doesn't need server infrastructure
Can't handle millions of requests/sec (but you only need ~5/week)
Works offline
No network latency
100% availability (assuming you don't lose it)

How to use this cache:

Keep it nearby during assignments
Query it when you need to remember tactics or concepts
Reference it during case studies
Don't try to memorize - just know it exists and where to find it

Cache hit rate goal: 90%+ of your architecture questions answered here

Before You Begin: Nobody Knows Everything

This is important to internalize:

Nobody knows everything about software architecture.

Not your instructor
Not the authors of famous books
Not the architects at FAANG companies
Not the people who seem confident on Twitter

Everyone is figuring it out as they go.

The difference between junior and senior developers isn't that seniors know everything. It's that seniors:

Know what they don't know
Know how to figure things out
Have made more mistakes and learned from them
Are comfortable with ambiguity

You're supposed to feel uncertain sometimes. That's what architecture feels like.

If you finish this week thinking "I still don't know THE right answer," that's correct.

Architecture rarely has one right answer. It has:

Answers that work in this context
Answers that optimize for these priorities
Answers with acceptable tradeoffs

Your job is to make well-reasoned choices, not perfect ones.

The Big Five Quality Attributes

Quality attributes are characteristics that make a system valuable beyond just functional requirements ("it works").

1. Performance

What it is: How fast the system responds and how much work it can do

Two sub-types:

Response time: How long until I get an answer? (measured in milliseconds)
Throughput: How many requests can you handle? (measured in requests/second)
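
To make the difference concrete, here is a minimal sketch that computes both measures from the same data. It assumes a hypothetical list of request durations (in milliseconds) collected over a known time window; the function names and sample numbers are made up for illustration and don't come from any particular monitoring tool.

```python
import math


def percentile(samples_ms, pct):
    """Return the pct-th percentile using the nearest-rank method."""
    if not samples_ms:
        raise ValueError("need at least one sample")
    ordered = sorted(samples_ms)
    # Nearest rank: smallest value with at least pct% of samples at or below it.
    rank = max(1, math.ceil(pct / 100 * len(ordered)))
    return ordered[rank - 1]


def throughput(request_count, window_seconds):
    """Requests handled per second over the measurement window."""
    return request_count / window_seconds


# Hypothetical sample: 10 requests observed during a 2-second window.
durations_ms = [12, 15, 11, 14, 13, 250, 12, 16, 13, 14]

p50 = percentile(durations_ms, 50)  # typical response time
p95 = percentile(durations_ms, 95)  # slow-tail response time
rps = throughput(len(durations_ms), 2.0)
```

With this sample, one slow request barely moves the median but dominates the 95th percentile, and the throughput figure tells you nothing about either. That is why response time and throughput are tracked separately.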

Real-world example: Amazon reportedly found that every 100 ms of added latency cost them about 1% in sales. For them, performance directly impacts revenue.

When to prioritize:

E-commerce checkout
Real-time chat/gaming
High-frequency trading
User-facing APIs

When NOT to prioritize:

Overnight batch processing
Internal admin tools
Analytics dashboards
Report generation

Context matters: A billing system that runs overnight doesn't need sub-second response times.

Common tradeoffs:

Performance vs Simplicity (optimized code is often complex)
Performance vs Security (encryption adds latency)
Performance vs Cost (faster servers cost more)

2. Scalability

What it is: Can the system handle increased load?

Different from performance:

Performance = fast for one user
Scalability = fast for many users

Two types:

Vertical scaling: Bigger server (more RAM, faster CPU)
- Easier to implement
- Has a ceiling (can't infinitely upgrade one machine)
Horizontal scaling: More servers (add machines)
- More complex (requires load balancing, stateless design)
- No theoretical ceiling
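
Why does horizontal scaling need stateless design? A rough sketch, assuming a simple round-robin load balancer (the class name and server names are hypothetical, and real load balancers do much more, such as health checks):

```python
from itertools import cycle


class RoundRobinBalancer:
    """Hands each incoming request to the next server in a fixed rotation."""

    def __init__(self, servers):
        if not servers:
            raise ValueError("need at least one server")
        self._rotation = cycle(list(servers))

    def route(self, request_id):
        # Any server may receive any request, so servers must not keep
        # per-user state in local memory (the "stateless design" requirement).
        return next(self._rotation)


balancer = RoundRobinBalancer(["app-1", "app-2", "app-3"])
assignments = [balancer.route(i) for i in range(6)]
```

Two consecutive requests from the same user land on different machines. Anything the first server kept in local memory is invisible to the second, so session data has to live in a shared store.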

The Mexican market reality: Everyone expects to scale to millions of users on day one. The reality? Most startups take 3 years to reach 100K users. You have time to refactor.

When to prioritize:

Proven rapid growth
Viral products
Seasonal spikes (e-commerce at holidays)

When NOT to prioritize:

New startups with 0 users
Internal tools with known user count
Niche B2B products

The trap: Don't optimize for scale you don't have. Build for your actual current needs.

Common tradeoffs:

Scalability vs Simplicity (horizontal scaling is complex)
Scalability vs Cost (more servers = more money)
Scalability vs Development time (takes longer to build)

3. Availability

What it is: Is the system up when users need it?

Measured in "nines":

Nines	Downtime/Year	Typical Use
99%	3.65 days	Unacceptable for most
99.9%	8.76 hours	Acceptable for many
99.99%	52 minutes	E-commerce, SaaS
99.999%	5 minutes	Financial, medical
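
The downtime column is plain arithmetic. A minimal sketch, assuming a 365-day year (the function name is illustrative):

```python
HOURS_PER_YEAR = 365 * 24  # 8,760 hours, ignoring leap years


def downtime_hours_per_year(availability_percent):
    """Hours of allowed downtime per year for a given availability target."""
    return (1 - availability_percent / 100) * HOURS_PER_YEAR


for target in (99.0, 99.9, 99.99, 99.999):
    hours = downtime_hours_per_year(target)
    print(f"{target}% -> {hours:.2f} hours ({hours * 60:.1f} minutes) per year")
```

Each extra nine cuts the allowed downtime by a factor of ten, which is why the table drops from days to hours to minutes.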

Cost increases exponentially per nine:

Going from 99.9% → 99.99% requires:

Redundant servers
Automatic failover
Multi-region deployment
Load balancers
24/7 on-call team
Disaster recovery procedures
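
A back-of-the-envelope sketch of why redundancy buys nines, and why every extra component in the request path can take some back. It assumes failures are independent, which real outages often are not, and the 99.95% load balancer figure is a made-up number for illustration.

```python
def parallel_availability(single, copies):
    """Availability of N redundant copies where any one copy is enough.

    Assumes failures are independent, which real outages often violate
    (shared power, shared network, the same bad deploy).
    """
    return 1 - (1 - single) ** copies


def serial_availability(*components):
    """Availability of a chain where every component must be up."""
    result = 1.0
    for availability in components:
        result *= availability
    return result


one_server = 0.99
two_servers = parallel_availability(one_server, 2)  # about 0.9999
# Hypothetical load balancer at 99.95% sitting in front of both servers.
behind_balancer = serial_availability(0.9995, two_servers)  # about 0.9994
```

Two 99% servers in parallel get you to roughly four nines on paper. Put a 99.95% load balancer in front of them and the whole chain falls back below that. Redundancy adds moving parts, and each moving part needs its own redundancy.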

When to prioritize:

Revenue-generating systems
Medical/safety-critical systems
Financial transactions
Customer-facing SaaS

When NOT to prioritize:

Internal tools
Development environments
Analytics dashboards
Batch processing systems

Reality check: Most systems don't need five nines. Three nines (99.9%) is often plenty.

Common tradeoffs:

Availability vs Cost (high availability is expensive)
Availability vs Complexity (redundancy adds moving parts)
Availability vs Development speed (takes longer to build)

4. Security

What it is: Protection from unauthorized access, data breaches, and attacks

Why it matters: One breach can destroy a company. Example: The 2017 Equifax breach exposed the data of about 147 million people, cost the company roughly $1.4 billion, and led several executives to step down.

Security has costs:

Development time (implementing auth, encryption)
Performance (encryption adds latency)
User experience (multi-factor auth is friction)
Operational complexity (key management, compliance)

When to prioritize:

Payment information
Medical records
Personally identifiable information (PII)
Financial data
Authentication systems

When less critical:

Public marketing content
Already-public data
Read-only product catalogs

The question isn't "should we be secure?"

The question is "how secure do we need to be for THIS data?"

Social media post? Different security needs than medical records.

Common tradeoffs:

Security vs Performance (encryption adds latency)
Security vs User experience (MFA adds friction)
Security vs Development speed (security takes time)
Security vs Cost (security tools and audits cost money)

Note: Week 8 covers security in depth. For now, just know it's always a quality attribute to consider.

5. Maintainability

What it is: How easy is it to change, fix, and extend the system?

This is the quality attribute that kills you slowly.

Bad architecture doesn't crash immediately. It makes:

Every change take longer
Every bug harder to fix
Every new feature a nightmare

Six months in: What took 2 days now takes 2 weeks. That's poor maintainability.

Signs of poor maintainability:

"I don't dare touch that code"
"Only Juan knows how this works"
"Changing X always breaks Y"
"We have to test everything manually"
"The original developer left, now we're stuck"

Poor maintainability is like technical debt - it compounds.

When to prioritize:

Long-lived systems
Small teams
High turnover
Frequent changes expected

This is why "boring technology" wins: Simple, well-understood patterns are maintainable:

Team already knows them
Lots of documentation
StackOverflow answers exist
Easy to hire for
Debugging is easier

Common tradeoffs:

Maintainability vs Performance (simple code often slower than optimized)
Maintainability vs Flexibility (generic solutions are complex)
Maintainability vs Innovation (boring tech is more maintainable)

Common Quality Attribute Tradeoffs

Tradeoff	What This Means	Example
Performance vs Security	Encryption adds overhead. Faster paths may skip some security layers.	Use TLS and encrypt data at rest (DB), but skip extra per-field encryption for non-sensitive data
Scalability vs Simplicity	Horizontal scaling requires complex orchestration.	Start with monolith, scale when needed
Availability vs Cost	High availability requires redundancy, which is expensive.	Three nines probably enough, not five
Performance vs Maintainability	Optimized code is often complex and hard to maintain.	Use ORM until performance is actually a problem
Security vs User Experience	Strong security adds friction (passwords, 2FA, etc.)	Balance security needs with user convenience
Development Speed vs Everything	Building fast often means sacrificing other qualities.	Ship quickly, refactor later

Key insight: You cannot optimize for everything. Every decision improves some qualities and hurts others.

Architecture is the art of conscious tradeoffs.
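
One lightweight way to make a tradeoff conscious is to write the weights down. The sketch below is a simple weighted decision matrix. Every weight and score in it is a hypothetical judgment call for an imagined small team on a tight deadline. It is not a measurement, and it is not a method this handout prescribes.

```python
def weighted_score(option_scores, weights):
    """Sum of (score x weight) across quality attributes."""
    return sum(option_scores[attr] * weight for attr, weight in weights.items())


# Hypothetical weights for a small team with a tight deadline (sum to 1.0).
weights = {
    "performance": 0.1,
    "scalability": 0.1,
    "availability": 0.2,
    "security": 0.2,
    "maintainability": 0.4,
}

# Hypothetical 1-5 scores; these are judgment calls, not measurements.
options = {
    "monolith": {"performance": 3, "scalability": 2, "availability": 3,
                 "security": 3, "maintainability": 5},
    "microservices": {"performance": 3, "scalability": 5, "availability": 4,
                      "security": 3, "maintainability": 2},
}

ranking = sorted(options, key=lambda name: weighted_score(options[name], weights),
                 reverse=True)
```

With these weights the monolith ranks first. Shift the weight from maintainability to scalability and the ranking flips. The number itself matters less than being forced to state what you are giving up.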

The Context Questions Framework

How do you decide which quality attributes to prioritize? Ask six context questions:

1. What's the actual business priority?

Ask:

Is uptime life-or-death (medical) or just annoying (social media)?
Does every millisecond count (trading) or is 1 second fine (reports)?
Is this a core product or an internal tool?
What happens if this fails?

Why it matters: Business priority determines which quality attribute wins. If the client says "we need it fast," performance wins. If they say "it can never go down," availability wins.

Example:

Medical device monitoring: Availability is critical (lives at stake)
Internal admin dashboard: Performance less critical (used occasionally)

2. What's your actual scale?

Ask:

Do you have 10 users or 10 million?
What's realistic growth in 6 months? 1 year?
Is this proven demand or hypothetical future?

Why it matters: Don't optimize for scale you don't have. Most startups take years to reach significant scale. You'll know when you hit scaling problems - servers slow down, costs spike. Then refactor.

Example:

Current: 100 users
Growth: Maybe 1,000 in a year
Don't build for: 1 million users you don't have
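
A quick worked calculation using the numbers above. It assumes growth compounds at a steady 10x per year, which is an optimistic simplification, and the function name is illustrative.

```python
import math


def years_to_reach(current_users, target_users, yearly_growth_factor):
    """Years of compounding growth needed to go from current to target."""
    if current_users <= 0 or target_users <= 0 or yearly_growth_factor <= 1:
        raise ValueError("users must be positive and growth factor above 1")
    return math.log(target_users / current_users) / math.log(yearly_growth_factor)


# From the example: 100 users today, roughly 10x growth per year.
years = years_to_reach(100, 1_000_000, 10)
```

Even if 10x growth held every single year, a million users is about four years away. That leaves plenty of time to refactor once the scaling problems are real.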

Mexican market reality: Everyone expects millions of users on day one. The reality is 3 years of slow growth. You have time to refactor.

3. What's your budget?

Three types of budget:

Money budget:

How much can you spend on infrastructure?
$50/month? Different choices than $5,000/month
Serverless vs dedicated servers

Time budget:

How long do you have to build?
2 weeks? Different than 6 months
Tight timeline = simpler solutions

People budget:

How many developers?
1 person? Different than 10-person team
Junior team? Different than senior team

All three constrain your architectural choices.

Example:

Small budget → Heroku monolith, PostgreSQL
Large budget → Kubernetes, microservices, multiple regions

4. What's your timeline?

Ask:

Need it in 2 months or 2 years?
Fast iteration or perfect first time?
Is there a hard deadline (regulatory, market window)?

Why it matters: Tight deadlines favor simpler solutions. You can refactor later once you're generating revenue or meeting the deadline.

Example:

2-month deadline → Simple monolith, ship fast, refactor later
1-year timeline → Can afford more architectural planning

Principle: Ship something that works, then improve it.

5. What can your team actually build and maintain?

This is a quality attribute nobody talks about: Team capability.

Ask:

Team of 3 juniors? Different than 10 seniors.
Does the team know this technology?
Can they debug it when it breaks?
Who's on-call at 2 AM when something goes wrong?
Can we hire for this technology?

Why it matters: The best architecture on paper is useless if your team can't build or maintain it.

Real example: A 3-person team shouldn't use Kubernetes because:

Learning curve is months
Debugging is hard
Operations are complex
When something breaks at 2 AM, who fixes it?

The best solution for YOUR team ≠ the best solution for Google.

Google has:

100-person teams
Dedicated SREs
Custom tools
24/7 support

You don't.

Principle: Choose technology your team can actually handle.

6. What's your risk tolerance?

Ask:

Is 99% uptime acceptable or unacceptable?
Can you afford to be down for an hour?
What's the cost of downtime?
What's the cost of a security breach?

Why it matters: Risk tolerance determines how much you invest in availability and security.

Example:

E-commerce during Black Friday: Very low risk tolerance (every minute costs money)
Internal reporting tool: Higher risk tolerance (downtime is annoying, not catastrophic)

Cost of risk mitigation: Lower risk tolerance = higher costs (redundancy, monitoring, on-call team)

Team Skills as a Quality Attribute

This deserves its own section because it's crucial:

What your team can actually build and maintain IS a quality attribute.

Consider:

Team composition:

3 junior developers with 6 months experience
vs 10 senior developers with 10+ years
These are different constraints

Technology familiarity:

Team knows Ruby/Rails well
Team has never touched Kubernetes
Learning curve matters

Operational capability:

Who's on-call at 2 AM?
Who can debug distributed systems?
Who understands networking?
Operations is part of the architecture

Hiring constraints:

Can you hire for exotic technology?
In the Mexican market, it's easier to hire Rails devs than Kubernetes experts
Recruitment affects sustainability

The best solution on paper is useless if your team can't build it.

Example scenarios:

Scenario 1:

Team: 3 juniors, know Node.js
Proposal: Microservices with Kubernetes
Problem: Team can't maintain this
Better: Node.js monolith, deploy to Heroku

Scenario 2:

Team: 10 seniors, strong ops background
Proposal: Monolith
May be fine, but team has capability for more if needed
Consider: Does business actually need more complexity?

Principle: Match architecture complexity to team capability.

The Boring Technology Principle

Choose boring technology.

Not because boring is always best, but because boring is usually appropriate.

What is "Boring Technology"?

Characteristics:

Been around for years
Well-documented
Lots of StackOverflow answers
Battle-tested in production
Team already knows it (or can learn easily)
Proven in production at scale
Mature ecosystem

Examples:

PostgreSQL (not MongoDB unless you have a specific reason)
Monolithic architecture (not microservices unless you need it)
REST APIs (not GraphQL unless it solves a problem)
Server-side rendering (not complex SPA unless necessary)
Heroku/Railway (not Kubernetes unless you have the team)
Ruby on Rails, Django, Laravel (not new frameworks)

Boring ≠ Bad. Boring = Reliable.

Why Boring Technology Wins

Reliability:

Known failure modes
Proven at scale
Stable API

Maintainability:

Team likely knows it
Easy to hire for
Lots of documentation
Large community

Debuggability:

Many people have hit your problem before
StackOverflow answers exist
Monitoring tools mature

Productivity:

Less time learning
More time building
Faster debugging
Easier onboarding

Example: PostgreSQL has been around since 1996. It's boring. It's also:

Used by millions of applications
Handles petabytes of data
Well-understood by developers
Extensively documented
Easy to hire for

MongoDB is exciting! It's also:

Easier to misuse (schema-less can be dangerous)
Fewer people understand it well
Harder to hire for in some markets
Different failure modes

Use PostgreSQL unless you have a specific reason for MongoDB.

Innovation Tokens

Concept from Dan McKinley:

You have a limited budget of innovation. Spend it wisely.

Why limited?

New tech has unknown problems
Team has to learn it
Fewer StackOverflow answers
Harder to hire for
Debugging is slower
Operations are harder

Rule of thumb: a company gets only a handful of innovation tokens (McKinley suggests about three) for its entire stack.

Using Tokens Wisely

Good use of innovation tokens:

"We're using boring PostgreSQL, boring monolith, boring REST APIs, boring Heroku...

But we're innovating on our ML recommendation engine

Because that's our competitive advantage and core business value"

Bad use of innovation tokens:

"We're using brand new database (token 1), experimental architecture pattern (token 2), cutting-edge deployment tool (token 3), new frontend framework (token 4)...

Oh and also trying to build our core product

And our team is 3 people"

You'll die from a thousand cuts.
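
The token idea can be treated as a simple budget check. In this sketch the "boring" list, the budget of 3, and the stack entries are all assumptions chosen to mirror the two examples above. What counts as boring depends on what your own team already runs.

```python
def tokens_spent(stack, boring):
    """Return the stack choices that are not on the team's 'boring' list."""
    return [choice for choice in stack if choice not in boring]


# Hypothetical: technologies this particular team already runs in production.
boring = {"PostgreSQL", "Rails monolith", "REST", "Heroku"}
TOKEN_BUDGET = 3  # an assumed budget; pick your own number

good_stack = ["PostgreSQL", "Rails monolith", "REST", "Heroku",
              "ML recommendation engine"]
bad_stack = ["new database", "experimental architecture",
             "cutting-edge deploy tool", "new frontend framework"]

good_spend = tokens_spent(good_stack, boring)
bad_spend = tokens_spent(bad_stack, boring)
over_budget = len(bad_spend) > TOKEN_BUDGET
```

The first stack spends one token, on the thing that differentiates the business. The second spends four before any product work has started.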

When to Spend Innovation Tokens

Spend when:

It's your core competitive advantage
Boring tech genuinely can't solve the problem
Team has capacity to learn and maintain
Benefits clearly outweigh costs
You have tokens available

Don't spend when:

"It's on my resume wishlist"
"It's what Google uses"
"It's trending on Twitter"
"I'm bored with current stack"
"It's 'best practice'" (according to who?)

Be honest about your motivations.

The Golden Hammer Anti-Pattern

"When you have a hammer, everything looks like a nail."

In software: When you know one technology, you use it for everything.

The Problem

Mexican development context:

Everyone learns MERN stack (MongoDB, Express, React, Node)
So every project becomes MERN
Even when relational database would be better (e-commerce, billing)
Even when server-side rendering would be simpler (marketing site)
Even when Python would be more appropriate (data science)

This is unconscious over-engineering.

You're not choosing the best tool for the job. You're choosing the tool you know.

Consequences

Inappropriate technology:

Square peg, round hole
Fighting the framework
Workarounds and hacks

Unnecessary complexity:

Using microservices when monolith would work
Using NoSQL when relational is better fit
Using SPA when SSR is simpler

Poor long-term fit:

Hard to maintain
Doesn't scale the way you need
Team struggles

How to Fight It

1. Always ask: "Is this the right tool for this job?"

Not: "Can I use the tool I know?"
But: "What does this problem actually need?"

2. Separate familiarity from appropriateness

Familiar ≠ Appropriate
Just because you know Kubernetes doesn't mean every project needs it

3. Consider team context

Can the team maintain this?
Is learning curve worth it?
Do we have time to learn?

4. Document why you chose it

If you can't justify beyond "I know this tech," reconsider
Force yourself to articulate the reasoning
Write an ADR (Week 9)

5. Be okay with boring

PostgreSQL again? Maybe that's fine.
Monolith again? Maybe that's appropriate.
Same stack as last project? Could be the right choice.

The goal isn't variety. The goal is appropriateness.

ATAM = UML Analogy

My take on formal methods:

"I've never known anyone who actually used UML in practice. C4 diagrams are basically UML drawn from memory: we kept the useful parts and lost the ceremony.

Same with ATAM (Architecture Tradeoff Analysis Method). It's a formal method from Carnegie Mellon with workshops, stakeholders, scenarios, evaluation trees.

Very thorough. Also very heavy.

We're doing ATAM drawn from memory. We're keeping:

Quality attribute thinking
Tradeoff analysis
Context awareness
Stakeholder needs

We're losing:

Multi-day workshops
Formal documentation
Complete stakeholder participation
Utility trees and sensitivity analysis

For most projects, ATAM-lite is enough.

If you're building nuclear reactor control software or airplane flight systems, use full ATAM.

If you're building a customer portal for 3 users, use the principles from this handout.

Extract value, lose ceremony."

How to Use This Handout for Assignments

When working on case studies:

Start with context questions (see "The Context Questions Framework")
- What's the business priority?
- What's the actual scale?
- What's the budget?
- What's the timeline?
- What can the team handle?
- What's the risk tolerance?
Identify which quality attributes matter most (see "The Big Five Quality Attributes")
- Performance?
- Scalability?
- Availability?
- Security?
- Maintainability?
Acknowledge the tradeoffs
- What are you optimizing for?
- What are you sacrificing?
- Is that tradeoff appropriate for this context?
Consider team capability
- Can this team actually build this?
- Can they maintain it?
- Who's on-call when it breaks?
Default to boring technology
- Is there a simpler solution?
- Do you need the complexity?
- Can you justify the innovation?

Your assignment answers should reference these frameworks.

Example answer structure:

"For this e-commerce scenario, I'd prioritize availability and performance (context question #1: business priority is revenue).

The tradeoff is complexity - a simple monolith would be easier to build, but the high traffic requires horizontal scaling.

However, given the team is 3 junior developers (context question #5: team capability), I'd start with a monolith and scale when proven necessary.

This optimizes for development speed and maintainability over premature scalability."

Notice: Referenced context, quality attributes, tradeoffs, team capability. This is well-reasoned.

Quick Reference Tables

Quality Attributes Cheat Sheet

Attribute	Question	When Critical	When Less Critical
Performance	How fast?	E-commerce, gaming, trading	Batch processing, reports
Scalability	How many users?	Viral products, proven growth	New startups, internal tools
Availability	How much uptime?	Revenue systems, medical	Dev environments, analytics
Security	How protected?	Payment, PII, medical	Public content, catalogs
Maintainability	How easy to change?	Long-lived systems, small teams	Prototypes, short-term projects

Common Tradeoffs Quick Reference

If you optimize for...	You likely sacrifice...	Consider...
Performance	Simplicity, security, cost	Do you actually need <100ms?
Scalability	Simplicity, cost, development time	Do you have the scale yet?
Availability	Cost, complexity	Do you need five nines?
Security	Performance, UX, development time	How sensitive is the data?
Development Speed	Everything else	Are you shipping to learn?

Context Questions Quick Check

Business priority?
Actual scale?
Budget (money, time, people)?
Timeline?
Team capability?
Risk tolerance?

If you can't answer all six, you don't have enough context yet.
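
If you like checklists in code, here is a minimal sketch of the same rule. The class, field names, and sample answers are hypothetical; the point is only that an unanswered question should be visible rather than silently skipped.

```python
from dataclasses import dataclass, fields
from typing import Optional


@dataclass
class ProjectContext:
    """Answers to the six context questions; None means 'not answered yet'."""
    business_priority: Optional[str] = None
    actual_scale: Optional[str] = None
    budget: Optional[str] = None
    timeline: Optional[str] = None
    team_capability: Optional[str] = None
    risk_tolerance: Optional[str] = None


def unanswered(context):
    """Names of the context questions that still have no answer."""
    return [f.name for f in fields(context) if getattr(context, f.name) is None]


# Hypothetical project where nobody has asked about budget yet.
project = ProjectContext(
    business_priority="ship fast",
    actual_scale="100 users",
    timeline="2 months",
    team_capability="3 juniors, know Node.js",
    risk_tolerance="an hour of downtime is fine",
)
missing = unanswered(project)
```

Here `missing` contains only `budget`, so the next step is a conversation with the client, not an architecture diagram.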

War Story: Project Mercury

Learn from your instructor's mistakes:

Project: Customer portal for logistics company
Users: 3 customers
Usage: ~10 requests per month

What we built:

7 microservices
Kubernetes cluster
Event-driven architecture (RabbitMQ)
Separate databases per service
Complex CI/CD pipeline
Full monitoring suite

Time: 3 months
Cost: $500/month
Team: 3 junior developers struggling to maintain it

What we should have built:

Rails/Django monolith
PostgreSQL
Heroku deployment
Time: 2 weeks
Cost: $50/month
Team: Could easily maintain
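
The gap is easier to feel as arithmetic. This sketch uses only the figures from the story. The one assumption is treating "3 months" as roughly 13 weeks, and the helper name is illustrative.

```python
def cost_per_request(monthly_cost_usd, requests_per_month):
    """Infrastructure dollars spent per request served."""
    return monthly_cost_usd / requests_per_month


REQUESTS_PER_MONTH = 10  # "~10 requests per month" from the story

built = {"monthly_cost_usd": 500, "build_weeks": 13}  # 3 months, taken as ~13 weeks
should_have = {"monthly_cost_usd": 50, "build_weeks": 2}

built_per_request = cost_per_request(built["monthly_cost_usd"], REQUESTS_PER_MONTH)
simple_per_request = cost_per_request(should_have["monthly_cost_usd"], REQUESTS_PER_MONTH)
yearly_overspend = (built["monthly_cost_usd"] - should_have["monthly_cost_usd"]) * 12
```

That works out to about $50 of infrastructure per request instead of $5, and $5,400 a year in extra hosting, before counting the additional weeks of build time.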

What we optimized for:

Scalability for millions of users (we had 3)
Team independence (we had 1 team of 3)
Technology diversity (more complexity)

What we actually needed:

Ship quickly
Simple to maintain
Low cost

The lesson: Context matters more than "best practices."

Apply context questions to Mercury:

Business priority? Ship fast → We took 3 months
Scale? 3 customers → Built for millions
Budget? Limited → Spent $500/month
Timeline? 1 month → Took 3 months
Team capability? 3 juniors → Complex distributed system
Risk tolerance? Can tolerate downtime → High-availability overkill

We ignored every single context question.

When would microservices have made sense?

50+ customers (not 3)
Multiple teams (not 1)
Proven scaling issues (not hypothetical)
Budget for complexity (not shoestring)

None of which we had.

Don't make this mistake. Build for actual needs, not imaginary future scale.

Remember

Nobody knows everything. You're not supposed to have all the answers. You're supposed to ask good questions and reason through tradeoffs.

Context determines appropriateness. There's no universal "best architecture." Only "best for this situation."

Boring technology usually wins. Choose reliability over novelty. Save innovation tokens for where they matter.

Team capability is a quality attribute. The best architecture on paper is useless if your team can't build it.

You'll make mistakes. Everyone does. Learn from them. That's how you get better.

Redis (Paper-Server Edition) | Week 3 | No expiration policy | Your personal architecture cache