SQL 101
SQL (Structured Query Language) is the standard language for working with relational databases. Every major database system—PostgreSQL, MySQL, SQLite, Supabase, BigQuery, Snowflake—speaks SQL. This course teaches you everything you need to go from zero to writing real, production-quality queries.
What you will learn
- Databases & Tables — how data is organized
- SELECT Basics — reading data
- Filtering with WHERE — narrowing results
- Sorting & Limiting — ordering and paging
- Aggregations & GROUP BY — summarizing data
- JOINs — combining multiple tables
- Subqueries & CTEs — nested and reusable queries
- Modifying Data — INSERT, UPDATE, DELETE
- Creating & Altering Tables — DDL basics
- Window Functions — advanced analytics
Running SQL
- Online (free, instant): sqliteonline.com — paste and run SQL in your browser
- Supabase: project → SQL editor (PostgreSQL)
- VS Code: SQLite extension + any
.dbfile - PostgreSQL locally:
psqlin terminal
Sample tables used throughout this course
All examples below use these three tables. You can create them yourself using Lesson 9.
1. Databases & Tables
1.1 What is a relational database?
A relational database stores data in tables — like spreadsheets, but with rules. Each table has:
- Columns — the fields (e.g.,
name,email) - Rows — one record per row
- Primary key — a column that uniquely identifies each row (usually
id) - Foreign key — a column in one table that references the primary key of another
1.2 Data types you need to know
| Type | Description | Example |
|---|---|---|
INTEGER / INT | Whole numbers | 42 |
NUMERIC / DECIMAL | Precise decimal numbers | 9.99 |
TEXT / VARCHAR(n) | Text strings | 'hello' |
BOOLEAN | True / false | true |
DATE | Calendar date | '2025-01-15' |
TIMESTAMP | Date + time | '2025-01-15 09:30:00' |
NULL | Missing / unknown value | NULL |
1.3 Key concepts
- Schema — the structure/blueprint of a database (its tables, columns, types, and relationships)
- Query — a question you ask the database, written in SQL
- Result set — the rows returned by a query
- NULL — represents a missing or unknown value.
NULL ≠ 0andNULL ≠ '' - Case sensitivity — SQL keywords are case-insensitive (
SELECT=select); string values are case-sensitive in most databases
2. SELECT Basics
SELECT is how you read data. It is the most-used SQL statement by far.
2.1 Select all columns
SELECT *FROM users;
Returns every column and every row in users. The * means "all columns".
2.2 Select specific columns
SELECT name, emailFROM users;
2.3 Aliases with AS
Rename a column in the result set without changing the database.
SELECT name AS full_name,email AS contact_emailFROM users;
2.4 Calculated columns
SELECT name,price,price * 1.1 AS price_with_taxFROM products;
2.5 Removing duplicates with DISTINCT
SELECT DISTINCT cityFROM users;-- returns each city once
2.6 String functions
SELECT UPPER(name) AS name_upper FROM users;SELECT LOWER(email) AS email_lower FROM users;SELECT LENGTH(name) AS name_length FROM users;SELECT name || ' — ' || city AS label FROM users;-- concatenate (PostgreSQL)SELECT CONCAT(name, ' — ', city) AS label FROM users;-- concatenate (MySQL)SELECT TRIM(' hello ') AS trimmed;-- removes leading/trailing spaces
2.7 Number functions
SELECT ROUND(price, 2) FROM products;-- 2 decimal placesSELECT ABS(-5);-- absolute value → 5SELECT CEIL(4.1);-- round up → 5SELECT FLOOR(4.9);-- round down → 4
2.8 Handling NULL
SELECT COALESCE(phone, 'no phone') AS phone FROM users;-- returns first non-NULLSELECT name FROM users WHERE phone IS NULL;SELECT name FROM users WHERE phone IS NOT NULL;
3. Filtering with WHERE
WHERE limits which rows are returned. Think of it as "only include rows where this condition is true."
3.1 Comparison operators
| Operator | Meaning | Example |
|---|---|---|
= | Equal to | age = 30 |
!= or <> | Not equal to | status != 'cancelled' |
> | Greater than | amount > 100 |
< | Less than | age < 25 |
>= | Greater than or equal | price >= 10 |
<= | Less than or equal | stock <= 5 |
3.2 AND / OR / NOT
SELECT * FROM usersWHERE age >= 21 AND city = 'New York';SELECT * FROM ordersWHERE status = 'pending' OR status = 'processing';SELECT * FROM productsWHERE NOT category = 'electronics';
Tip: use parentheses when mixing AND and OR to make precedence explicit:
WHERE (status = 'pending' OR status = 'processing')AND amount > 50;
3.3 IN
Match any value in a list — cleaner than multiple OR conditions.
SELECT * FROM usersWHERE city IN ('New York', 'Los Angeles', 'Chicago');SELECT * FROM ordersWHERE status NOT IN ('cancelled', 'refunded');
3.4 BETWEEN
Inclusive range filter.
SELECT * FROM productsWHERE price BETWEEN 10 AND 50;SELECT * FROM ordersWHERE created_at BETWEEN '2024-01-01' AND '2024-12-31';
3.5 LIKE — pattern matching
%— matches any number of characters_— matches exactly one character
SELECT * FROM users WHERE email LIKE '%@gmail.com';-- ends with @gmail.comSELECT * FROM users WHERE name LIKE 'J%';-- starts with JSELECT * FROM users WHERE name LIKE '%son%';-- contains "son" anywhereSELECT * FROM users WHERE name LIKE 'J___';-- J + exactly 3 more charsSELECT * FROM users WHERE name ILIKE 'john%';-- PostgreSQL case-insensitive LIKE
4. Sorting & Limiting
4.1 ORDER BY
SELECT name, age FROM usersORDER BY age;-- ascending (lowest first) by defaultSELECT name, age FROM usersORDER BY age DESC;-- descending (highest first)SELECT name, city, age FROM usersORDER BY city ASC, age DESC;-- city A→Z, then age high→low within each city
4.2 LIMIT and OFFSET
SELECT * FROM ordersORDER BY created_at DESCLIMIT 10;-- only return the 10 most recent ordersSELECT * FROM orders-- pagination: page 3, 10 rows per pageORDER BY idLIMIT 10 OFFSET 20;-- skip first 20, return next 10
4.3 NULL ordering
NULLs sort to the end by default in ascending order in most databases. To control this explicitly:
ORDER BY phone NULLS LAST;-- PostgreSQLORDER BY phone IS NULL, phone;-- MySQL trick
5. Aggregations & GROUP BY
Aggregate functions collapse many rows into one summary value.
5.1 Aggregate functions
| Function | What it does | Example |
|---|---|---|
COUNT(*) | Count all rows | SELECT COUNT(*) FROM users; |
COUNT(col) | Count non-NULL values in a column | SELECT COUNT(phone) FROM users; |
COUNT(DISTINCT col) | Count unique non-NULL values | SELECT COUNT(DISTINCT city) FROM users; |
SUM(col) | Total of a numeric column | SELECT SUM(amount) FROM orders; |
AVG(col) | Average value | SELECT AVG(price) FROM products; |
MIN(col) | Smallest value | SELECT MIN(amount) FROM orders; |
MAX(col) | Largest value | SELECT MAX(amount) FROM orders; |
5.2 GROUP BY — aggregate per group
Group rows that share a value, then aggregate each group.
SELECT city, COUNT(*) AS user_countFROM usersGROUP BY city;SELECT status, SUM(amount) AS total_revenueFROM ordersGROUP BY status;SELECT category, MIN(price) AS cheapest, MAX(price) AS most_expensiveFROM productsGROUP BY categoryORDER BY category;
Rule: every column in SELECT must either be in GROUP BY or inside an aggregate function.
-- ✅ correctSELECT city, COUNT(*) FROM users GROUP BY city;-- ❌ incorrect — name is not in GROUP BY and not aggregatedSELECT city, name, COUNT(*) FROM users GROUP BY city;
5.3 HAVING — filter groups
WHERE filters rows before grouping. HAVING filters groups after grouping.
SELECT city, COUNT(*) AS user_countFROM usersGROUP BY cityHAVING COUNT(*) >= 5-- only cities with 5 or more usersORDER BY user_count DESC;SELECT user_id, SUM(amount) AS total_spentFROM ordersWHERE status = 'completed'-- filter rows firstGROUP BY user_idHAVING SUM(amount) > 500-- then filter groupsORDER BY total_spent DESC;
5.4 Order of clauses (important!)
SQL clauses must appear in this order:
SELECTFROMJOINWHEREGROUP BYHAVINGORDER BYLIMIT
6. JOINs
JOINs combine rows from two tables based on a related column.
6.1 INNER JOIN — only matching rows
Returns only rows where the join condition is true in both tables.
SELECT u.name, u.email, o.product, o.amountFROM orders AS oINNER JOIN users AS u ON o.user_id = u.id;
Any order without a matching user, or any user with no orders, is excluded.
6.2 LEFT JOIN — all rows from the left table
Returns all rows from the left table. Rows from the right table that don't match get NULL.
SELECT u.name, o.product, o.amountFROM users AS uLEFT JOIN orders AS o ON u.id = o.user_id;-- users with no orders appear with NULL in product and amount
Common use: find users who have never placed an order
SELECT u.nameFROM users AS uLEFT JOIN orders AS o ON u.id = o.user_idWHERE o.id IS NULL;
6.3 RIGHT JOIN — all rows from the right table
Mirror of LEFT JOIN. Less common — most developers rewrite as LEFT JOIN with tables swapped.
SELECT u.name, o.productFROM users AS uRIGHT JOIN orders AS o ON u.id = o.user_id;-- all orders appear; users without orders get NULL for name
6.4 FULL OUTER JOIN — all rows from both tables
SELECT u.name, o.productFROM users AS uFULL OUTER JOIN orders AS o ON u.id = o.user_id;-- all users and all orders; NULLs where no match exists
6.5 Joining three tables
SELECT u.name, o.amount, p.name AS product_name, p.categoryFROM orders AS oJOIN users AS u ON o.user_id = u.idJOIN products AS p ON o.product_id = p.id;
6.6 CROSS JOIN — every combination
SELECT u.name, p.name AS productFROM users AS uCROSS JOIN products AS p;-- returns every user paired with every product (m × n rows)
6.7 Self JOIN — a table joined to itself
-- Example: employees table with manager_id referencing idSELECT e.name AS employee, m.name AS managerFROM employees AS eLEFT JOIN employees AS m ON e.manager_id = m.id;
JOIN visual summary
| JOIN type | Rows returned |
|---|---|
INNER JOIN | Only rows matching in both tables |
LEFT JOIN | All rows from left + matched right (NULLs otherwise) |
RIGHT JOIN | All rows from right + matched left (NULLs otherwise) |
FULL OUTER JOIN | All rows from both tables |
CROSS JOIN | Every combination (Cartesian product) |
7. Subqueries & CTEs
7.1 Subquery in WHERE
A subquery is a query nested inside another query.
-- Find users who have placed at least one orderSELECT name FROM usersWHERE id IN (SELECT DISTINCT user_id FROM orders);-- Find products priced above averageSELECT name, price FROM productsWHERE price > (SELECT AVG(price) FROM products);
7.2 Subquery in FROM (derived table)
SELECT city, avg_ageFROM (SELECT city, AVG(age) AS avg_ageFROM usersGROUP BY city) AS city_statsWHERE avg_age > 30;
7.3 Subquery in SELECT (scalar subquery)
SELECT name,(SELECT COUNT(*) FROM orders WHERE orders.user_id = users.id) AS order_countFROM users;
7.4 EXISTS and NOT EXISTS
-- Users who have placed at least one orderSELECT name FROM users uWHERE EXISTS (SELECT 1 FROM orders o WHERE o.user_id = u.id);-- Users who have never orderedSELECT name FROM users uWHERE NOT EXISTS (SELECT 1 FROM orders o WHERE o.user_id = u.id);
7.5 CTE — Common Table Expression (WITH clause)
A CTE is like a named subquery defined at the top of a query. It makes complex queries much more readable.
WITH high_spenders AS (SELECT user_id, SUM(amount) AS total_spentFROM ordersWHERE status = 'completed'GROUP BY user_idHAVING SUM(amount) > 1000)SELECT u.name, u.email, h.total_spentFROM high_spenders hJOIN users u ON h.user_id = u.idORDER BY h.total_spent DESC;
Multiple CTEs
WITHrevenue AS (SELECT user_id, SUM(amount) AS total FROM orders GROUP BY user_id),order_counts AS (SELECT user_id, COUNT(*) AS num_orders FROM orders GROUP BY user_id)SELECT u.name, r.total, oc.num_ordersFROM users uJOIN revenue r ON u.id = r.user_idJOIN order_counts oc ON u.id = oc.user_id;
8. Modifying Data
Warning: always run a matching SELECT with the same WHERE clause first to confirm which rows you will affect before running UPDATE or DELETE.
8.1 INSERT — add rows
-- Single rowINSERT INTO users (name, email, age, city)VALUES ('Alice Smith', 'alice@email.com', 28, 'New York');-- Multiple rowsINSERT INTO users (name, email, age, city)VALUES('Bob Jones', 'bob@email.com', 34, 'Chicago'),('Carol Wu', 'carol@email.com', 22, 'Seattle');-- Insert from a SELECTINSERT INTO vip_users (user_id, total_spent)SELECT user_id, SUM(amount)FROM orders WHERE status = 'completed'GROUP BY user_id HAVING SUM(amount) > 1000;
8.2 UPDATE — modify existing rows
-- Update a single rowUPDATE usersSET city = 'Los Angeles'WHERE id = 42;-- Update multiple columnsUPDATE usersSET city = 'Boston', age = 29WHERE email = 'alice@email.com';-- Update many rows at onceUPDATE ordersSET status = 'archived'WHERE created_at < '2023-01-01';
8.3 DELETE — remove rows
-- Delete a single rowDELETE FROM users WHERE id = 42;-- Delete by conditionDELETE FROM orders WHERE status = 'cancelled'AND created_at < '2022-01-01';-- ⚠️ Delete ALL rows (dangerous!)DELETE FROM temp_staging;
8.4 UPSERT (INSERT … ON CONFLICT)
Insert a row, or update it if the primary key already exists. Supported in PostgreSQL and SQLite.
INSERT INTO products (id, name, price)VALUES (5, 'Widget', 9.99)ON CONFLICT (id) DO UPDATESET price = EXCLUDED.price;
8.5 TRUNCATE — delete all rows fast
TRUNCATE TABLE temp_staging;-- much faster than DELETE for full clears
9. Creating & Altering Tables
9.1 CREATE TABLE
CREATE TABLE users (id SERIAL PRIMARY KEY,-- auto-incrementing integer PK (PostgreSQL)name TEXT NOT NULL,email TEXT UNIQUE NOT NULL,age INTEGER,city TEXT,joined_at TIMESTAMP DEFAULT NOW());CREATE TABLE orders (id SERIAL PRIMARY KEY,user_id INTEGER NOT NULL REFERENCES users(id),-- foreign keyproduct TEXT NOT NULL,amount NUMERIC(10, 2) NOT NULL,status TEXT DEFAULT 'pending',created_at TIMESTAMP DEFAULT NOW());CREATE TABLE products (id SERIAL PRIMARY KEY,name TEXT NOT NULL,category TEXT,price NUMERIC(10, 2) NOT NULL,stock INTEGER DEFAULT 0);
9.2 Constraints
| Constraint | Effect |
|---|---|
PRIMARY KEY | Unique + NOT NULL; identifies each row |
NOT NULL | Column cannot be NULL |
UNIQUE | All values in the column must be different |
DEFAULT value | Used when INSERT omits the column |
REFERENCES table(col) | Foreign key — value must exist in the referenced table |
CHECK (expr) | Row is only inserted if expression is true |
9.3 ALTER TABLE
-- Add a columnALTER TABLE users ADD COLUMN phone TEXT;-- Drop a columnALTER TABLE users DROP COLUMN phone;-- Rename a column (PostgreSQL)ALTER TABLE users RENAME COLUMN city TO location;-- Change column type (PostgreSQL)ALTER TABLE products ALTER COLUMN price TYPE NUMERIC(12, 2);-- Add a NOT NULL constraintALTER TABLE users ALTER COLUMN name SET NOT NULL;
9.4 DROP TABLE
DROP TABLE temp_staging;-- ⚠️ permanently deletes the table and all dataDROP TABLE IF EXISTS temp_staging;-- no error if table doesn't exist
9.5 CREATE INDEX
Indexes speed up queries that filter or sort by specific columns. The database uses them automatically.
CREATE INDEX idx_orders_user_id ON orders(user_id);CREATE INDEX idx_users_city ON users(city);CREATE UNIQUE INDEX idx_users_email ON users(email);
Rule of thumb: index columns you frequently use in WHERE, JOIN ON, or ORDER BY. Don't over-index — each index slows down writes.
10. Window Functions
Window functions perform calculations across a set of rows related to the current row — without collapsing rows like GROUP BY does. They use the OVER() clause.
10.1 Syntax
function_name() OVER (PARTITION BY partition_column -- optional: reset the window per groupORDER BY sort_column -- optional: order within window)
10.2 ROW_NUMBER — rank each row
SELECT name, city, age,ROW_NUMBER() OVER (PARTITION BY city ORDER BY age DESC) AS rank_in_cityFROM users;-- rank 1 = oldest user in each city
Top-N per group (using ROW_NUMBER in a CTE)
WITH ranked AS (SELECT user_id, amount,ROW_NUMBER() OVER (PARTITION BY user_id ORDER BY amount DESC) AS rnFROM orders)SELECT * FROM ranked WHERE rn = 1;-- largest order per user
10.3 RANK and DENSE_RANK
SELECT name, amount,RANK() OVER (ORDER BY amount DESC) AS rank,DENSE_RANK() OVER (ORDER BY amount DESC) AS dense_rankFROM orders;
| Function | Ties behavior | Example (amounts: 100, 100, 80) |
|---|---|---|
ROW_NUMBER | Always unique — ties broken arbitrarily | 1, 2, 3 |
RANK | Ties get same rank; next rank skips | 1, 1, 3 |
DENSE_RANK | Ties get same rank; next rank does not skip | 1, 1, 2 |
10.4 Running totals and moving averages
SELECT created_at, amount,SUM(amount) OVER (ORDER BY created_at) AS running_totalFROM ordersORDER BY created_at;-- 7-day moving averageSELECT created_at, amount,AVG(amount) OVER (ORDER BY created_at ROWS BETWEEN 6 PRECEDING AND CURRENT ROW) AS ma_7dFROM orders;
10.5 LAG and LEAD — access adjacent rows
SELECT created_at, amount,LAG(amount) OVER (ORDER BY created_at) AS prev_amount,LEAD(amount) OVER (ORDER BY created_at) AS next_amount,amount - LAG(amount) OVER (ORDER BY created_at) AS change_from_prevFROM orders;
10.6 NTILE — split into buckets
SELECT name, amount,NTILE(4) OVER (ORDER BY amount DESC) AS quartileFROM orders;-- quartile 1 = top 25%, quartile 4 = bottom 25%
10.7 FIRST_VALUE / LAST_VALUE
SELECT user_id, amount,FIRST_VALUE(amount) OVER (PARTITION BY user_id ORDER BY created_at) AS first_order_amt,LAST_VALUE(amount) OVER (PARTITION BY user_id ORDER BY created_atROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING) AS last_order_amtFROM orders;
10.8 Window function quick reference
| Function | Use case |
|---|---|
ROW_NUMBER() | Unique sequential number per row |
RANK() | Rank with gaps on ties |
DENSE_RANK() | Rank without gaps on ties |
NTILE(n) | Divide rows into n equal buckets |
SUM() OVER | Running or partitioned total |
AVG() OVER | Running or moving average |
COUNT() OVER | Running or partitioned count |
LAG(col, n) | Value from n rows before current row |
LEAD(col, n) | Value from n rows after current row |
FIRST_VALUE(col) | First value in the window |
LAST_VALUE(col) | Last value in the window |
Quick Reference Card
Full SELECT anatomy
SELECT col1, col2, AGG(col3) AS aliasFROM table_a AS aJOIN table_b AS b ON a.id = b.a_idWHERE a.status = 'active' AND b.amount > 50GROUP BY col1, col2HAVING AGG(col3) > 100ORDER BY alias DESCLIMIT 25 OFFSET 0;
Common interview patterns
-- Nth highest valueSELECT amount FROM orders ORDER BY amount DESC LIMIT 1 OFFSET (n-1);-- Duplicate rowsSELECT email, COUNT(*) FROM users GROUP BY email HAVING COUNT(*) > 1;-- Percentage of totalSELECT category, SUM(amount) AS cat_total,ROUND(100.0 * SUM(amount) / SUM(SUM(amount)) OVER (), 2) AS pctFROM orders GROUP BY category;-- First event per userSELECT DISTINCT ON (user_id) user_id, created_at, productFROM orders ORDER BY user_id, created_at;-- PostgreSQL-- Cohort by signup monthSELECT DATE_TRUNC('month', joined_at) AS cohort, COUNT(*) AS signupsFROM users GROUP BY 1 ORDER BY 1;
11. Google SQL Interview — Product Strategy & Operations (StratOps)
Congratulations on landing an interview for the Product Strategy & Operations (StratOps) role! This is a highly sought-after position that sits at the intersection of product management and business analytics.
For StratOps, the SQL test is designed to see if you can translate business questions into data queries. Google doesn't expect you to be a database engineer, but they do need you to be "data-fluent."
Test Format
While individual teams vary, the SQL assessment typically follows one of two formats:
- Live Coding (Most Common): A 30–45 minute video call where you share your screen (using a tool like Google Docs or a simple code editor). You will be given a set of tables and asked to write queries in real-time while explaining your logic.
- Take-Home / Online Assessment: A timed platform test (like HackerRank or CoderPad) featuring 3–5 problems ranging from basic filtering to complex data manipulation.
What's on the Test?
The questions will almost always be framed around Google products (e.g., YouTube watch time, Google Ads performance, or Play Store transactions).
1. Core SQL Fundamentals
You must be comfortable with the "bread and butter" of SQL. Expect to use:
- Aggregations:
SUM,AVG,COUNT(DISTINCT ...), and the essentialGROUP BY. - Joins: Knowing when to use
LEFT JOINvs.INNER JOINis critical. You might be asked to join a "User" table with an "Activity" table to find inactive users. - Filtering:
WHEREvs.HAVING(this is a classic trap).
2. Intermediate Concepts
This is where StratOps candidates are usually differentiated:
- Window Functions: Be ready to use
RANK(),DENSE_RANK(), orROW_NUMBER(). For example: "Find the top 3 most-viewed YouTube categories per country." - Date/Time Manipulation: Calculating Month-over-Month (MoM) growth or identifying users who signed up in the last 30 days.
- CTEs: Use
WITHclauses to keep your code organized. Google interviewers love clean, readable code.
3. Business Logic & Case Integration
Expect "Case-style" SQL questions. Instead of being told "Join Table A and B," you'll be told:
"Our Ads revenue dropped by 5% in EMEA last month. Write a query to identify which ad formats (Search, Display, Video) contributed most to this decline."
Preparation Checklist
| Topic | Focus Area |
|---|---|
| Joins | Understand Self-Joins (e.g., comparing a user's behavior today vs. yesterday) |
| Case Logic | Use CASE WHEN to bucket data (e.g., "Small," "Medium," "Large" advertisers) |
| Windowing | Practice calculating running totals or period-over-period changes |
| Syntax | Google typically uses Standard SQL (BigQuery syntax), which is very similar to Postgres |
50 Practice Questions & Answers
These are categorized by difficulty and focus on business-centric problems you'll likely encounter. Questions assume BigQuery / Standard SQL syntax.
I. Basic: Filtering, Aggregation, and Joins (Q1–15)
-
Find the total number of unique users who were active on YouTube yesterday.
SELECT COUNT(DISTINCT user_id)FROM youtube_activityWHERE date = DATE_SUB(CURRENT_DATE(), INTERVAL 1 DAY);
-
List all Google Cloud customers who spent more than $10,000 last month.
SELECT customer_nameFROM cloud_billingWHERE billing_month = '2023-10'GROUP BY customer_nameHAVING SUM(spend) > 10000;
-
Calculate the average rating for every app category in the Play Store.
SELECT category, AVG(rating) AS avg_ratingFROM play_store_appsGROUP BY category;
-
Find users who have never made a purchase (Users + Orders tables).
SELECT u.user_idFROM Users uLEFT JOIN Orders o ON u.user_id = o.user_idWHERE o.order_id IS NULL;
-
Count how many videos were uploaded to YouTube per country today.
SELECT country, COUNT(video_id) AS uploadsFROM uploadsWHERE upload_date = CURRENT_DATE()GROUP BY country;
-
Find the top 5 most expensive products in the Google Store.
SELECT product_name, priceFROM google_store_productsORDER BY price DESCLIMIT 5;
-
Show all Gmail users whose accounts were created in 2022.
SELECT emailFROM gmail_usersWHERE EXTRACT(YEAR FROM created_at) = 2022;
-
Get total revenue generated by "Search" ads vs. "Display" ads.
SELECT ad_type, SUM(revenue) AS total_revenueFROM ads_performanceGROUP BY ad_type;
-
Find the number of Android users on a version higher than 12.
SELECT COUNT(user_id)FROM devicesWHERE android_version > 12;
-
Select all active Google Workspace subscriptions set to "Annual" billing.
SELECT sub_idFROM subscriptionsWHERE status = 'Active' AND billing_cycle = 'Annual';
-
What is the maximum storage used by a single Google Drive user?
SELECT MAX(storage_gb) AS max_storageFROM drive_usage;
-
List all YouTube creators with more than 1 million subscribers.
SELECT channel_nameFROM creatorsWHERE subscriber_count > 1000000;
-
Find the total number of support tickets closed by each agent today.
SELECT agent_id, COUNT(ticket_id) AS closed_todayFROM ticketsWHERE status = 'Closed' AND closed_date = CURRENT_DATE()GROUP BY agent_id;
-
Combine First_Name and Last_Name into a Full_Name column.
SELECT CONCAT(first_name, ' ', last_name) AS full_nameFROM employees;
-
Find all search queries that contain the word "Pixel".
SELECT queryFROM search_logsWHERE query LIKE '%Pixel%';
II. Intermediate: CTEs, Case Logic, and Dates (Q16–35)
-
Bucket advertisers into 'Small' (<$1k), 'Medium' ($1k–$10k), and 'Large' (>$10k).
SELECT advertiser_id,CASEWHEN SUM(spend) < 1000 THEN 'Small'WHEN SUM(spend) BETWEEN 1000 AND 10000 THEN 'Medium'ELSE 'Large'END AS tierFROM adsGROUP BY advertiser_id;
-
Calculate Month-over-Month growth in YouTube Premium subscribers.
WITH monthly AS (SELECT DATE_TRUNC(signup_date, MONTH) AS month,COUNT(*) AS subsFROM premium_signupsGROUP BY 1)SELECT month, subs,LAG(subs) OVER (ORDER BY month) AS prev_subs,ROUND(100.0 * (subs - LAG(subs) OVER (ORDER BY month))/ LAG(subs) OVER (ORDER BY month), 2) AS mom_pctFROM monthly;
-
Find users who logged in on both Monday and Tuesday of this week.
SELECT user_id FROM logins WHERE date = '2023-11-06'INTERSECT DISTINCTSELECT user_id FROM logins WHERE date = '2023-11-07';
-
Identify "Power Users" who used Google Search at least 5 days in the last week.
SELECT user_idFROM search_logsWHERE date >= DATE_SUB(CURRENT_DATE(), INTERVAL 7 DAY)GROUP BY user_idHAVING COUNT(DISTINCT date) >= 5;
-
For each country, find the app with the highest number of downloads.
WITH ranked AS (SELECT country, app_name, downloads,RANK() OVER (PARTITION BY country ORDER BY downloads DESC) AS rkFROM app_downloads)SELECT country, app_name, downloadsFROM ranked WHERE rk = 1;
-
Calculate the 7-day rolling average of Google Maps searches.
SELECT date,AVG(search_count) OVER (ORDER BY dateROWS BETWEEN 6 PRECEDING AND CURRENT ROW) AS rolling_7d_avgFROM maps_stats;
-
Find the time difference (in hours) between a user signing up and their first search.
SELECT user_id,TIMESTAMP_DIFF(first_search_at, signup_at, HOUR) AS hours_to_first_searchFROM user_journey;
-
List the second most popular video in each category.
WITH ranked AS (SELECT category, video_id, views,DENSE_RANK() OVER (PARTITION BY category ORDER BY views DESC) AS rkFROM videos)SELECT * FROM ranked WHERE rk = 2;
-
Convert a Unix timestamp to a readable date (BigQuery).
SELECT TIMESTAMP_SECONDS(unix_column) AS readable_tsFROM your_table;
-
Find users who upgraded from Free to Paid within 24 hours.
SELECT t1.user_idFROM plan_changes t1JOIN plan_changes t2ON t1.user_id = t2.user_idAND t1.plan = 'Free'AND t2.plan = 'Paid'AND TIMESTAMP_DIFF(t2.changed_at, t1.changed_at, HOUR) <= 24;
-
Retrieve the last search query performed by each user today.
WITH ranked AS (SELECT user_id, query,ROW_NUMBER() OVER (PARTITION BY user_id ORDER BY timestamp DESC) AS rnFROM search_logsWHERE DATE(timestamp) = CURRENT_DATE())SELECT user_id, query FROM ranked WHERE rn = 1;
-
Calculate the "Take Rate" (Revenue / GMV) for the Play Store.
SELECT ROUND(SUM(revenue) / SUM(gmv), 4) AS take_rateFROM transactions;
-
Which YouTube channels have zero uploads in the last 90 days?
SELECT channel_idFROM channelsWHERE channel_id NOT IN (SELECT channel_id FROM uploadsWHERE date > DATE_SUB(CURRENT_DATE(), INTERVAL 90 DAY));
-
Find the median spend of all Google Cloud customers.
SELECT PERCENTILE_CONT(spend, 0.5) OVER() AS median_spendFROM cloud_billingLIMIT 1;
-
List all pairs of users who share the same IP address.
SELECT a.user_id AS user_a, b.user_id AS user_bFROM logins aJOIN logins b ON a.ip = b.ipWHERE a.user_id < b.user_id;-- avoids duplicate pairs
-
Calculate the click-through rate (CTR) of Search results.
SELECTROUND(100.0 * SUM(CASE WHEN clicked = 1 THEN 1 ELSE 0 END) / COUNT(*), 2) AS ctr_pctFROM search_results;
-
Find total revenue per quarter for the last two years.
SELECT EXTRACT(YEAR FROM date) AS yr,EXTRACT(QUARTER FROM date) AS qtr,SUM(revenue) AS totalFROM salesWHERE date >= DATE_SUB(CURRENT_DATE(), INTERVAL 2 YEAR)GROUP BY 1, 2ORDER BY 1, 2;
-
Replace all NULL values in a 'Comments' column with 'No Comment'.
SELECT COALESCE(comments, 'No Comment') AS commentsFROM feedback;
-
Find the top 10% of users by data usage on Google Photos.
WITH buckets AS (SELECT user_id, data_gb,NTILE(10) OVER (ORDER BY data_gb DESC) AS decileFROM photos_usage)SELECT user_id, data_gb FROM buckets WHERE decile = 1;
-
Determine if a string is a valid email (basic @ check, BigQuery).
SELECT emailFROM usersWHERE REGEXP_CONTAINS(email, r'@');
III. Advanced: Business Logic & Complex Analytics (Q36–50)
-
Retention: What % of Day-0 users returned on Day 7?
WITH d0 AS (SELECT DISTINCT user_id FROM activity WHERE date = '2023-11-01'),d7 AS (SELECT DISTINCT user_id FROM activity WHERE date = '2023-11-08')SELECT ROUND(100.0 * COUNT(d7.user_id) / COUNT(d0.user_id), 2) AS day7_retention_pctFROM d0LEFT JOIN d7 USING (user_id);
-
Cannibalization: Did YouTube Shorts reduce long-form watch time?
WITH first_shorts AS (SELECT user_id, MIN(date) AS first_shorts_date FROM shorts_views GROUP BY user_id)SELECTAVG(CASE WHEN lv.date < fs.first_shorts_date THEN lv.watch_minutes END) AS before_shorts,AVG(CASE WHEN lv.date >= fs.first_shorts_date THEN lv.watch_minutes END) AS after_shortsFROM longform_views lvJOIN first_shorts fs ON lv.user_id = fs.user_idWHERE lv.date BETWEEN DATE_SUB(fs.first_shorts_date, INTERVAL 30 DAY)AND DATE_ADD(fs.first_shorts_date, INTERVAL 30 DAY);
-
Anomaly Detection: Find any day where Ads revenue dropped >20% vs. prior day.
WITH daily AS (SELECT date, SUM(revenue) AS rev,LAG(SUM(revenue)) OVER (ORDER BY date) AS prev_revFROM ads_revenue GROUP BY date)SELECT date, rev, prev_rev,ROUND((rev - prev_rev) / prev_rev * 100, 2) AS pct_changeFROM dailyWHERE (rev - prev_rev) / prev_rev < -0.20;
-
Attribution: Find the first marketing channel each user touched before purchase.
SELECT DISTINCT user_id,FIRST_VALUE(channel) OVER (PARTITION BY user_id ORDER BY touchpoint_time) AS first_channelFROM touchpointsWHERE touchpoint_time < purchase_time;
-
Sessionization: Assign session IDs where actions >30 min apart start a new session.
WITH gaps AS (SELECT user_id, timestamp,CASE WHEN TIMESTAMP_DIFF(timestamp,LAG(timestamp) OVER (PARTITION BY user_id ORDER BY timestamp), MINUTE) > 30OR LAG(timestamp) OVER (PARTITION BY user_id ORDER BY timestamp) IS NULLTHEN 1 ELSE 0 END AS new_sessionFROM events)SELECT user_id, timestamp,SUM(new_session) OVER (PARTITION BY user_id ORDER BY timestamp) AS session_idFROM gaps;
-
Market Share: Percentage of total search volume by browser.
SELECT browser,ROUND(100.0 * COUNT(*) / SUM(COUNT(*)) OVER (), 2) AS share_pctFROM search_logsGROUP BY browserORDER BY share_pct DESC;
-
Churn: Users active every month for 6 months but not in the last 30 days.
WITH regulars AS (SELECT user_idFROM activityWHERE date BETWEEN DATE_SUB(CURRENT_DATE(), INTERVAL 6 MONTH)AND DATE_SUB(CURRENT_DATE(), INTERVAL 30 DAY)GROUP BY user_idHAVING COUNT(DISTINCT DATE_TRUNC(date, MONTH)) = 6)SELECT r.user_idFROM regulars rWHERE r.user_id NOT IN (SELECT user_id FROM activityWHERE date >= DATE_SUB(CURRENT_DATE(), INTERVAL 30 DAY));
-
Upsell: Identify users who reached 90% of their Google One storage limit.
SELECT user_id,ROUND(100.0 * used_bytes / limit_bytes, 1) AS pct_usedFROM storageWHERE (used_bytes / limit_bytes) >= 0.9;
-
ABC Analysis: Classify products as A (top 80% revenue), B (next 15%), C (bottom 5%).
WITH totals AS (SELECT product_id, SUM(revenue) AS rev FROM sales GROUP BY 1),running AS (SELECT product_id, rev,SUM(rev) OVER (ORDER BY rev DESC) AS running_total,SUM(rev) OVER () AS grand_totalFROM totals)SELECT product_id,CASEWHEN running_total / grand_total <= 0.80 THEN 'A'WHEN running_total / grand_total <= 0.95 THEN 'B'ELSE 'C'END AS abc_classFROM running;
-
Cross-Sell: Most common app bought by users who also bought "Minecraft".
SELECT p.product_name, COUNT(*) AS co_purchasesFROM purchases mcJOIN purchases p ON mc.user_id = p.user_idAND p.product_name != 'Minecraft'WHERE mc.product_name = 'Minecraft'GROUP BY p.product_nameORDER BY co_purchases DESCLIMIT 10;
-
Funnel Analysis: Drop-off rate between "Add to Cart" and "Checkout Complete".
WITH steps AS (SELECTCOUNT(DISTINCT CASE WHEN event = 'add_to_cart' THEN user_id END) AS add_to_cart,COUNT(DISTINCT CASE WHEN event = 'checkout_complete' THEN user_id END) AS completedFROM funnel_events)SELECT add_to_cart, completed,ROUND(100.0 * (add_to_cart - completed) / add_to_cart, 2) AS dropoff_pctFROM steps;
-
Seasonality: Which day of the week has the highest Google Maps traffic for "Restaurants"?
SELECT FORMAT_DATE('%A', date) AS day_of_week,SUM(searches) AS total_searchesFROM maps_trafficWHERE category = 'Restaurants'GROUP BY day_of_weekORDER BY total_searches DESCLIMIT 1;
-
LTV: Calculate total revenue generated by a user since their join date.
SELECT u.user_id, u.joined_at,COALESCE(SUM(o.amount), 0) AS lifetime_valueFROM users uLEFT JOIN orders o ON u.user_id = o.user_idAND o.created_at >= u.joined_atGROUP BY u.user_id, u.joined_atORDER BY lifetime_value DESC;
-
Data Integrity: Find transactions where tax_amount is not exactly 8% of subtotal.
SELECT transaction_id, subtotal, tax_amountFROM transactionsWHERE ABS(tax_amount - ROUND(subtotal * 0.08, 2)) > 0.01;
-
Efficiency: Average number of searches before a user clicks a "Help" article.
WITH pre_help AS (SELECT s.user_id, s.session_id, COUNT(*) AS searches_before_helpFROM searches sJOIN help_clicks hON s.user_id = h.user_idAND s.timestamp < h.clicked_atGROUP BY s.user_id, s.session_id)SELECT AVG(searches_before_help) AS avg_searchesFROM pre_help;
users-- id, name, email, age, city, joined_atorders-- id, user_id, product, amount, status, created_atproducts-- id, name, category, price, stock