Coding Tasks

12 practical problems for the Synthesized.io interview, with full solutions

Task 1
Topological Sort of FK Graph
Graph / BFS | Database | 25 min

Problem: Given tables and foreign key relationships, return tables in loading order (parents first). Detect cycles.

data class FK(val child: String, val parent: String)

// Input:
//   tables = ["order_items", "orders", "users", "products", "departments"]
//   fks = [FK("users","departments"), FK("orders","users"),
//          FK("order_items","orders"), FK("order_items","products")]
// Output: ["departments", "products", "users", "orders", "order_items"]

fun topologicalSort(tables: List<String>, fks: List<FK>): List<String> {
    TODO()
}
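
One possible approach is Kahn's algorithm: count the unloaded parents of each table and emit a table once that count reaches zero. The sketch below assumes every FK endpoint is listed in `tables` and signals a cycle (including a self-reference) by throwing. Tables that become ready at the same time may come out in a different but equally valid order than the sample output.

```kotlin
data class FK(val child: String, val parent: String)

// Kahn's algorithm: emit tables whose parents have all been emitted already.
fun topologicalSort(tables: List<String>, fks: List<FK>): List<String> {
    val children = tables.associateWith { mutableListOf<String>() }
    val inDegree = tables.associateWith { 0 }.toMutableMap()
    for (fk in fks) {
        children.getValue(fk.parent).add(fk.child)
        inDegree[fk.child] = inDegree.getValue(fk.child) + 1
    }

    // Start with tables that reference nothing.
    val queue = ArrayDeque(tables.filter { inDegree.getValue(it) == 0 })
    val order = mutableListOf<String>()
    while (queue.isNotEmpty()) {
        val table = queue.removeFirst()
        order.add(table)
        for (child in children.getValue(table)) {
            val remaining = inDegree.getValue(child) - 1
            inDegree[child] = remaining
            if (remaining == 0) queue.addLast(child)
        }
    }

    // Any table never emitted is part of a cycle or depends on one.
    check(order.size == tables.size) { "Cycle detected among: ${tables - order.toSet()}" }
    return order
}
```

In an interview it is worth saying out loud that the cycle check falls out for free: if the queue drains before every table is emitted, the leftovers are stuck behind a cycle.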

Task 2
Deterministic Email Masking
Masking | Hashing | 15 min

Problem: Mask emails deterministically. Same input = same output (for cross-table consistency). Irreversible. Output must look like a valid email.

// maskEmail("alex@gmail.com")  → "a7f3b2c1e9d04f28@masked.io"
// maskEmail("alex@gmail.com")  → "a7f3b2c1e9d04f28@masked.io"  (same!)
// maskEmail("maria@yahoo.com") → "<different hash>@masked.io"
// maskEmail(null)              → null

fun maskEmail(email: String?, salt: String = "s3cret"): String? {
    TODO()
}
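
A minimal sketch of one way to meet these requirements: hash the salted address with SHA-256 and use the first 8 bytes, hex-encoded, as the local part. Lower-casing and trimming the input before hashing is an assumption (it makes "Alex@Gmail.com" and "alex@gmail.com" mask identically); drop it if the data treats those as distinct. The result is only hard to reverse while the salt stays secret.

```kotlin
import java.security.MessageDigest

fun maskEmail(email: String?, salt: String = "s3cret"): String? {
    if (email == null) return null
    // Assumption: addresses are compared case-insensitively.
    val normalized = email.trim().lowercase()
    val digest = MessageDigest.getInstance("SHA-256")
        .digest("$salt:$normalized".toByteArray(Charsets.UTF_8))
    // 8 bytes -> 16 hex characters, always a valid local part.
    val local = digest.take(8).joinToString("") { "%02x".format(it) }
    return "$local@masked.io"
}
```

A keyed construction such as HMAC-SHA256 is a common alternative to prefixing the salt; the overall shape of the function stays the same.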

Task 3
Consistent Database Subset (BFS through FK graph)
Graph / BFS | Database | 35 min

Problem: Given seed rows from one table, find ALL rows across ALL tables needed for referential integrity. Follow FKs in both directions: children that reference seed rows, and parents that seed rows reference.

data class FK(val child: String, val childCol: String,
              val parent: String, val parentCol: String)

// Seed: users ids {1, 2}
// Expected: users:{1,2}, orders:{10,11,12}, order_items:{100..108},
//           products:{5,7,12}, departments:{1}

fun collectSubset(
    seedTable: String,
    seedIds: Set<Long>,
    fks: List<FK>,
    // Simulates DB query: "SELECT col FROM table WHERE filterCol IN (filterIds)"
    fetchIds: (table: String, col: String, filterCol: String, filterIds: Set<Long>) -> Set<Long>
): Map<String, Set<Long>> {
    TODO()
}
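
A sketch of one reading of the problem, done in two phases: first walk down from the seed rows to every descendant, then walk up from everything collected to the parents those rows reference. Parents pulled in during the second phase do not pull in their other children, which is what keeps `users` at {1, 2} in the expected output. The sketch assumes every table has a primary key column named `id` and that every `parentCol` is that primary key; both are assumptions, not part of the task statement.

```kotlin
data class FK(val child: String, val childCol: String,
              val parent: String, val parentCol: String)

fun collectSubset(
    seedTable: String,
    seedIds: Set<Long>,
    fks: List<FK>,
    fetchIds: (table: String, col: String, filterCol: String, filterIds: Set<Long>) -> Set<Long>
): Map<String, Set<Long>> {
    val pk = "id" // assumption: uniform primary key column name
    val collected = mutableMapOf<String, MutableSet<Long>>()
    val queue = ArrayDeque<Pair<String, Set<Long>>>()

    // Records ids not seen before and schedules only those for expansion.
    fun visit(table: String, ids: Set<Long>) {
        val seen = collected.getOrPut(table) { mutableSetOf() }
        val fresh = ids - seen
        if (fresh.isNotEmpty()) {
            seen.addAll(fresh)
            queue.addLast(table to fresh)
        }
    }

    // Phase 1: children (and their children) that reference the seed rows.
    visit(seedTable, seedIds)
    while (queue.isNotEmpty()) {
        val (table, ids) = queue.removeFirst()
        for (fk in fks.filter { it.parent == table }) {
            visit(fk.child, fetchIds(fk.child, pk, fk.childCol, ids))
        }
    }

    // Phase 2: parents referenced by any row collected so far.
    collected.forEach { (table, ids) -> queue.addLast(table to ids.toSet()) }
    while (queue.isNotEmpty()) {
        val (table, ids) = queue.removeFirst()
        for (fk in fks.filter { it.child == table }) {
            visit(fk.parent, fetchIds(fk.child, fk.childCol, pk, ids))
        }
    }
    return collected
}
```

Tracking only fresh ids per step is what guarantees termination on self-referencing or cyclic FKs.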

Task 4
Cycle Detection in FK Graph
Graph / DFS | Database | 20 min

Problem: Find ALL cycles in a foreign key graph. Return the tables involved in each cycle. Self-references count as cycles.

// Input: employees.manager_id -> employees.id (self-ref)
//        table_a.b_id -> table_b.id, table_b.a_id -> table_a.id (mutual)
// Output: [["employees"], ["table_a", "table_b"]]

fun findCycles(tables: List<String>, fks: List<FK>): List<List<String>> {
    TODO()
}
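
One way to get "the tables involved in each cycle" is Tarjan's strongly connected components algorithm: every component with more than one table, or a single table with a self-reference, is reported. This is a sketch under that interpretation; it groups tables that are mutually reachable rather than enumerating every elementary cycle. It assumes the two-field `FK(child, parent)` from Task 1.

```kotlin
data class FK(val child: String, val parent: String)

fun findCycles(tables: List<String>, fks: List<FK>): List<List<String>> {
    val edges = fks.groupBy({ it.child }, { it.parent })
    val index = mutableMapOf<String, Int>()     // DFS discovery order
    val lowLink = mutableMapOf<String, Int>()   // smallest index reachable via the stack
    val stack = ArrayDeque<String>()
    val onStack = mutableSetOf<String>()
    val cycles = mutableListOf<List<String>>()
    var counter = 0

    fun visit(table: String) {
        index[table] = counter
        lowLink[table] = counter
        counter++
        stack.addLast(table)
        onStack.add(table)
        for (parent in edges[table].orEmpty()) {
            if (parent !in index) {
                visit(parent)
                lowLink[table] = minOf(lowLink.getValue(table), lowLink.getValue(parent))
            } else if (parent in onStack) {
                lowLink[table] = minOf(lowLink.getValue(table), index.getValue(parent))
            }
        }
        // table is the root of a component: pop the whole component off the stack.
        if (lowLink.getValue(table) == index.getValue(table)) {
            val component = mutableListOf<String>()
            do {
                val member = stack.removeLast()
                onStack.remove(member)
                component.add(member)
            } while (member != table)
            val selfReference = table in edges[table].orEmpty()
            if (component.size > 1 || selfReference) cycles.add(component.sorted())
        }
    }

    for (table in tables) if (table !in index) visit(table)
    return cycles
}
```

The recursion depth equals the longest FK chain, which is fine for schemas of a few hundred tables; an explicit stack would be needed for much deeper graphs.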

Task 5
Chunked IN-clause Query Builder
Database | Performance | 15 min

Problem: The PostgreSQL query planner degrades with very large IN clauses. Split 50,000 IDs into chunks of 500, execute each chunk, and merge the results. Handle the edge case where the last chunk holds fewer IDs than the chunk size.

// fetchByIds("users", "id", setOf(1,2,...,50000), chunkSize=500)
// → executes 100 queries, each with IN(500 ids), merges results

fun <T> fetchByIds(
    table: String,
    column: String,
    ids: Set<Long>,
    chunkSize: Int = 500,
    conn: Connection,
    rowMapper: (ResultSet) -> T
): List<T> {
    TODO()
}
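
A sketch of the chunking logic, with the SQL generation split into a pure helper so it can be checked without a database. `ChunkQuery` and `buildChunkQueries` are hypothetical names introduced here. `table` and `column` are interpolated into the SQL text, so the sketch assumes they come from trusted schema metadata, never from user input. The IDs themselves are bound as parameters.

```kotlin
import java.sql.Connection
import java.sql.ResultSet

// Hypothetical helper type: one parameterised statement per chunk.
data class ChunkQuery(val sql: String, val params: List<Long>)

fun buildChunkQueries(table: String, column: String, ids: Set<Long>, chunkSize: Int = 500): List<ChunkQuery> {
    require(chunkSize > 0) { "chunkSize must be positive" }
    // chunked() makes the last chunk shorter when ids.size is not a multiple of chunkSize.
    return ids.chunked(chunkSize).map { chunk ->
        val placeholders = chunk.joinToString(",") { "?" }
        ChunkQuery("SELECT * FROM $table WHERE $column IN ($placeholders)", chunk)
    }
}

fun <T> fetchByIds(
    table: String,
    column: String,
    ids: Set<Long>,
    chunkSize: Int = 500,
    conn: Connection,
    rowMapper: (ResultSet) -> T
): List<T> {
    val results = mutableListOf<T>()
    for (query in buildChunkQueries(table, column, ids, chunkSize)) {
        conn.prepareStatement(query.sql).use { stmt ->
            query.params.forEachIndexed { i, id -> stmt.setLong(i + 1, id) } // JDBC parameters are 1-based
            stmt.executeQuery().use { rs ->
                while (rs.next()) results.add(rowMapper(rs))
            }
        }
    }
    return results
}
```

With full chunks the SQL text is identical from one chunk to the next, so only the final, shorter chunk produces a second statement shape.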

Task 6
Format-Preserving Masking (Phone Numbers)
Masking | Hashing | 20 min

Problem: Mask phone numbers preserving format. "+7 702 365 6813" → "+7 XXX XXX XXXX" where X are deterministic digits. Same input = same output. Preserve country code. Output must pass regex validation for phone numbers.

// maskPhone("+7 702 365 6813")  → "+7 481 927 3054"
// maskPhone("+7 702 365 6813")  → "+7 481 927 3054"  (same!)
// maskPhone("+44 20 7946 0958") → "+44 20 XXXX XXXX" (preserve format!)
// maskPhone(null)               → null

fun maskPhone(phone: String?, salt: String = "s3cret"): String? {
    TODO()
}
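
A minimal sketch of one approach: hash the salted input once, then replace each digit after the country code with a digit derived from the hash bytes, leaving spaces and punctuation where they are. The country code is assumed to be everything from a leading "+" up to the first space; numbers without that shape have all digits masked. Under this assumption the "20" in the UK example is masked as well, so keeping area codes would need an extra rule.

```kotlin
import java.security.MessageDigest

fun maskPhone(phone: String?, salt: String = "s3cret"): String? {
    if (phone == null) return null
    val digest = MessageDigest.getInstance("SHA-256")
        .digest("$salt:$phone".toByteArray(Charsets.UTF_8))
    // Assumption: "+<country code><space>" prefix is kept verbatim.
    val keepUntil = if (phone.startsWith("+")) phone.indexOf(' ').coerceAtLeast(0) else 0
    var next = 0
    return buildString {
        phone.forEachIndexed { i, ch ->
            if (i >= keepUntil && ch in '0'..'9') {
                // Map one hash byte (0..255) to a digit; cycles if the number is very long.
                val byte = digest[next++ % digest.size].toInt() and 0xFF
                append('0' + byte % 10)
            } else {
                append(ch)
            }
        }
    }
}
```

Because the raw string is hashed, "+7 702 365 6813" and "+77023656813" mask to unrelated digits; hashing only the digits would be one way to make them agree.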

Task 7
Producer-Consumer with Backpressure
Concurrency | Performance | 25 min

Problem: One reader reads rows from source. N workers transform rows. One writer batches writes to target. Backpressure: if workers are slow, reader pauses. If writer is slow, workers pause. Use BlockingQueue (Java) or Channel (Kotlin).

// BlockingQueue variant (Kotlin signature, using java.util.concurrent.BlockingQueue):
fun processTable(
    source: Connection,
    target: Connection,
    table: String,
    workers: Int = 4,
    batchSize: Int = 5000,
    transform: (Map<String, Any?>) -> Map<String, Any?>
) {
    TODO()
}
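
The backpressure mechanics can be sketched without a database by replacing the JDBC source and target with an iterator and a batch callback. `runPipeline` and its parameters are hypothetical names. Two bounded `ArrayBlockingQueue`s do the work: `put` blocks the reader when workers fall behind, and blocks the workers when the writer falls behind. Error handling is deliberately left out; a transform that throws would stall this sketch.

```kotlin
import java.util.concurrent.ArrayBlockingQueue
import kotlin.concurrent.thread

fun <I : Any, O : Any> runPipeline(
    rows: Iterator<I>,
    workers: Int = 4,
    batchSize: Int = 5000,
    queueCapacity: Int = 1000,
    transform: (I) -> O,
    writeBatch: (List<O>) -> Unit
) {
    val poison = Any() // sentinel meaning "no more items"
    val input = ArrayBlockingQueue<Any>(queueCapacity)
    val output = ArrayBlockingQueue<Any>(queueCapacity)

    // Reader: put() blocks while the input queue is full.
    val reader = thread {
        for (row in rows) input.put(row)
        repeat(workers) { input.put(poison) } // one sentinel per worker
    }

    // Workers: put() blocks while the output queue is full.
    val pool = List(workers) {
        thread {
            while (true) {
                val item = input.take()
                if (item === poison) break
                output.put(transform(item as I))
            }
            output.put(poison)
        }
    }

    // Writer runs on the calling thread and flushes full batches.
    val batch = ArrayList<O>(batchSize)
    var finishedWorkers = 0
    while (finishedWorkers < workers) {
        val item = output.take()
        if (item === poison) {
            finishedWorkers++
            continue
        }
        batch.add(item as O)
        if (batch.size >= batchSize) {
            writeBatch(batch.toList())
            batch.clear()
        }
    }
    if (batch.isNotEmpty()) writeBatch(batch.toList())

    reader.join()
    pool.forEach { it.join() }
}
```

Row order is not preserved across workers, which is usually acceptable for bulk loads but worth stating to the interviewer.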

Task 8
Transformer Registry (Strategy Pattern)
Design | Masking | 20 min

Problem: User defines masking rules in YAML config. Build a registry that maps column → transformer. Apply the right transformer to each row. Must be thread-safe (called from multiple workers).

// Config: users.email → EmailMasker, users.phone → PhoneMasker
// registry.transform("users", "email", "alex@test.com", salt) → "a7f3@masked.io"
// registry.transform("users", "name", "Alex", salt) → "Alex" (no rule = passthrough)

interface Transformer {
    fun transform(value: Any?, salt: String): Any?
}

class TransformerRegistry {
    fun register(table: String, column: String, transformer: Transformer) { TODO() }
    fun transform(table: String, column: String, value: Any?, salt: String): Any? { TODO() }
}
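
A minimal sketch backed by `ConcurrentHashMap`, which makes concurrent `register` and `transform` calls safe without explicit locking. It assumes the transformers themselves are stateless or otherwise thread-safe. YAML loading is out of scope here.

```kotlin
import java.util.concurrent.ConcurrentHashMap

interface Transformer {
    fun transform(value: Any?, salt: String): Any?
}

class TransformerRegistry {
    // Key is (table, column); ConcurrentHashMap allows lock-free reads from many workers.
    private val rules = ConcurrentHashMap<Pair<String, String>, Transformer>()

    fun register(table: String, column: String, transformer: Transformer) {
        rules[table to column] = transformer
    }

    fun transform(table: String, column: String, value: Any?, salt: String): Any? {
        // No rule registered: pass the value through unchanged.
        val transformer = rules[table to column] ?: return value
        return transformer.transform(value, salt)
    }
}
```

Note the early return instead of `rules[...]?.transform(value, salt) ?: value`: with the elvis form, a transformer that returns null would silently fall back to the unmasked value.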

Task 9
Parallel Table Processing with Level-Based Execution
Concurrency | Graph | 25 min

Problem: Combine topo sort with parallel execution. Tables at the same topological level are independent and can be processed in parallel. Respect a max parallelism limit (connection budget).

// Level 0: [departments, products]     → parallel
// Level 1: [users]                     → after level 0
// Level 2: [orders]                    → after level 1
// Level 3: [order_items]               → after level 2

fun processInOrder(
    tables: List<String>, fks: List<FK>,
    maxParallel: Int = 4,
    process: (String) -> Unit
) {
    TODO()
}
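
A sketch of the level-based variant: group tables into levels where every parent sits in an earlier level, then run each level on a fixed-size thread pool and wait for it to finish before starting the next. `topoLevels` is a hypothetical helper, and `FK` is assumed to be the two-field version from Task 1. The level computation is quadratic in the number of tables, which keeps the sketch short.

```kotlin
import java.util.concurrent.Callable
import java.util.concurrent.Executors

data class FK(val child: String, val parent: String)

// Level n holds the tables whose parents all live in levels < n.
fun topoLevels(tables: List<String>, fks: List<FK>): List<List<String>> {
    val all = tables.distinct()
    val parents = all.associateWith { t -> fks.filter { it.child == t }.map { it.parent }.toSet() }
    val done = mutableSetOf<String>()
    val levels = mutableListOf<List<String>>()
    while (done.size < all.size) {
        val level = all.filter { it !in done && done.containsAll(parents.getValue(it)) }
        check(level.isNotEmpty()) { "Cycle detected among: ${all - done}" }
        levels.add(level)
        done.addAll(level)
    }
    return levels
}

fun processInOrder(
    tables: List<String>, fks: List<FK>,
    maxParallel: Int = 4,
    process: (String) -> Unit
) {
    // The pool size is the connection budget: at most maxParallel tables run at once.
    val pool = Executors.newFixedThreadPool(maxParallel)
    try {
        for (level in topoLevels(tables, fks)) {
            val futures = level.map { table -> pool.submit(Callable { process(table) }) }
            // Barrier: get() rethrows a failure and blocks until the level is complete.
            futures.forEach { it.get() }
        }
    } finally {
        pool.shutdown()
    }
}
```

The barrier between levels is simple but can leave threads idle while one slow table finishes; scheduling each table as soon as its own parents complete is the usual refinement.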

Task 10
Checkpoint & Resume for Long-Running Processing
Reliability | Database | 20 min

Problem: Processing 500 tables takes 6 hours. If it crashes at table #347, it should resume from #347, not start over. Track progress. Make each table's processing idempotent.

enum class Status { PENDING, IN_PROGRESS, COMPLETED, FAILED }
data class Progress(val table: String, val status: Status, val error: String? = null)

class CheckpointManager(private val conn: Connection) {
    fun initProgress(tables: List<String>) { TODO() }
    fun getStatus(table: String): Status { TODO() }
    fun markInProgress(table: String) { TODO() }
    fun markCompleted(table: String) { TODO() }
    fun markFailed(table: String, error: String) { TODO() }
    fun getPendingTables(): List<String> { TODO() }
}

fun processWithCheckpoint(tables: List<String>, mgr: CheckpointManager) {
    TODO()
}
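
The resume logic can be sketched with an in-memory stand-in for the progress table. `InMemoryCheckpoints`, its `mark` method, and the extra `process` parameter are hypothetical; a real `CheckpointManager` would persist the same state through its `Connection`. Anything not `COMPLETED` is retried on the next run, including tables left `IN_PROGRESS` by a crash, which is why per-table processing has to be idempotent.

```kotlin
enum class Status { PENDING, IN_PROGRESS, COMPLETED, FAILED }

// Hypothetical in-memory replacement for a persisted progress table.
class InMemoryCheckpoints {
    private val statuses = LinkedHashMap<String, Status>()

    // putIfAbsent keeps existing statuses, so calling this again on resume is harmless.
    fun initProgress(tables: List<String>) {
        tables.forEach { statuses.putIfAbsent(it, Status.PENDING) }
    }

    fun mark(table: String, status: Status) {
        statuses[table] = status
    }

    // Everything that is not COMPLETED still needs work, in the original order.
    fun getPendingTables(): List<String> =
        statuses.filterValues { it != Status.COMPLETED }.keys.toList()
}

fun processWithCheckpoint(tables: List<String>, mgr: InMemoryCheckpoints, process: (String) -> Unit) {
    mgr.initProgress(tables)
    for (table in mgr.getPendingTables()) {
        mgr.mark(table, Status.IN_PROGRESS)
        try {
            process(table)
            mgr.mark(table, Status.COMPLETED)
        } catch (e: Exception) {
            mgr.mark(table, Status.FAILED)
            throw e // stop the run; the next run resumes from this table
        }
    }
}
```

A common way to make `process` idempotent is to truncate or delete the target table's rows before reloading it.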

Task 11
JSON/JSONB Deep Masking
Masking | Tree Traversal | 25 min

Problem: A JSONB column contains nested objects with PII fields. Mask specified paths inside the JSON without destroying the structure. Paths defined as dot-notation: "user.email", "contacts[*].phone".

// Input JSON:  {"user":{"email":"alex@test.com","age":30},"notes":"call me"}
// Mask paths:  ["user.email"]
// Output JSON: {"user":{"email":"a7f3b2@masked.io","age":30},"notes":"call me"}

fun maskJson(json: String, paths: List<String>, salt: String): String {
    TODO()
}
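
The traversal itself can be sketched on an already-parsed tree of `Map`, `List`, and scalar values; turning the JSON string into that tree and back is left to whichever JSON library is in use and is not shown. `parsePath` and `maskTree` are hypothetical helpers. A path that does not match the document leaves it unchanged, so the structure is never destroyed.

```kotlin
// "contacts[*].phone" -> ["contacts", "*", "phone"]
fun parsePath(path: String): List<String> =
    path.replace("[*]", ".*").split('.').filter { it.isNotEmpty() }

// Returns a copy of node with the value(s) at the given path replaced by mask(value).
fun maskTree(node: Any?, segments: List<String>, mask: (Any?) -> Any?): Any? {
    if (segments.isEmpty()) return mask(node)
    val head = segments.first()
    val rest = segments.drop(1)
    return when {
        // Wildcard: apply the rest of the path to every array element.
        head == "*" && node is List<*> -> node.map { maskTree(it, rest, mask) }
        // Object: descend only into the named key, copy the other keys as they are.
        node is Map<*, *> && node.containsKey(head) ->
            node.mapValues { (key, value) -> if (key == head) maskTree(value, rest, mask) else value }
        // Path does not apply at this node: keep it untouched.
        else -> node
    }
}
```

For the sample input, `maskTree(tree, parsePath("user.email"), ::someMasker)` would replace only the `email` value and leave `age` and `notes` alone, where `someMasker` stands for any masking function such as the email masker from Task 2.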

Task 12
Database Dialect Abstraction
Design | Database | 25 min

Problem: Design a DatabaseDialect interface that abstracts differences between PostgreSQL and MySQL. Implement both dialects. Include: identifier quoting, streaming configuration, bulk load strategy, type mapping.

// dialect.quoteIdentifier("users") → PostgreSQL: "users", MySQL: `users`
// dialect.configureStreaming(conn, stmt) → PostgreSQL: fetchSize=10000, MySQL: fetchSize=Integer.MIN_VALUE
// dialect.mapType("int4", size) → CanonicalType.INTEGER

interface DatabaseDialect {
    fun quoteIdentifier(name: String): String
    fun configureStreaming(conn: Connection, stmt: Statement)
    fun mapType(typeName: String, size: Int): CanonicalType
    fun bulkLoadSql(table: String, columns: List<String>): String
}

sealed class CanonicalType { /* define types */ }
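
A partial sketch covering two of the four methods, quoting and streaming; `mapType` and `bulkLoadSql` are omitted, so the interface here is a reduced version of the one above. The object names are hypothetical. The streaming settings reflect how the two JDBC drivers behave: the PostgreSQL driver only fetches through a cursor when autocommit is off and the fetch size is positive, while MySQL Connector/J streams row by row when the fetch size is `Integer.MIN_VALUE` on a forward-only, read-only statement.

```kotlin
import java.sql.Connection
import java.sql.Statement

// Reduced interface for this sketch: quoting and streaming only.
interface DatabaseDialect {
    fun quoteIdentifier(name: String): String
    fun configureStreaming(conn: Connection, stmt: Statement)
}

object PostgresDialect : DatabaseDialect {
    // Double quotes, with embedded double quotes doubled.
    override fun quoteIdentifier(name: String): String =
        "\"" + name.replace("\"", "\"\"") + "\""

    override fun configureStreaming(conn: Connection, stmt: Statement) {
        conn.autoCommit = false // cursor-based fetching needs an open transaction
        stmt.fetchSize = 10_000
    }
}

object MySqlDialect : DatabaseDialect {
    // Backticks, with embedded backticks doubled.
    override fun quoteIdentifier(name: String): String =
        "`" + name.replace("`", "``") + "`"

    override fun configureStreaming(conn: Connection, stmt: Statement) {
        // Assumes stmt was created as TYPE_FORWARD_ONLY / CONCUR_READ_ONLY (the JDBC default).
        stmt.fetchSize = Int.MIN_VALUE
    }
}
```

Escaping the quote character inside the identifier matters as soon as table or column names come from schema metadata rather than from constants.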

Practice order

Day 1 (basics): Tasks 1, 2, 6 — topo sort, email masking, phone masking. Core algorithms, small scope, build confidence.
Day 2 (hard): Tasks 3, 4, 5 — FK graph BFS, cycle detection, chunked queries. Graph problems + database knowledge.
Day 3 (system): Tasks 7, 8, 9 — producer-consumer, transformer registry, parallel levels. Concurrency + design patterns.
Day 4 (polish): Tasks 10, 11, 12 — checkpoint, JSON masking, dialect abstraction. Reliability + real Synthesized patterns.

For each task, first try to solve it WITHOUT looking at the solution (set a timer), then compare. Focus on explaining your approach out loud in English: this is what the interviewer evaluates.