20 deep questions — threads, locks, atomics, coroutines, race conditions
Two threads both do balance += 100. Balance starts at 0. Expected: 200. Actual: might be 100. Why?
The classic example:
var balance = 0
// Thread 1: balance += 100
// Thread 2: balance += 100
// Expected: balance = 200
But balance += 100 is THREE operations: read balance, add 100, write balance. Two threads interleave:
Thread 1: read balance = 0
Thread 2: read balance = 0               (reads SAME old value!)
Thread 1: write balance = 0 + 100 = 100
Thread 2: write balance = 0 + 100 = 100  (overwrites Thread 1's result!)
// balance = 100, not 200. Thread 1's update LOST.
This is called a lost update — one of the most common race conditions. It happens because read-modify-write is not atomic.
Why it's hard to catch: the bug depends on timing. In testing (single thread, low load), it works perfectly. In production (100 concurrent users), it fails intermittently. You can't reproduce it reliably.
Three categories of race conditions:
1. Lost update (above): two writes, one overwrites the other.
2. Dirty read: reading partially-written data. Thread 1 updates two fields (name + email) — Thread 2 reads between the two updates, sees new name with old email.
3. Check-then-act: if (map.containsKey(k)) map.get(k) — between check and get, another thread removes the key. You get null despite the check passing.
Fix: make the critical section atomic. Options: synchronized/Mutex (lock the entire section), AtomicInteger (hardware-level atomic operations), ConcurrentHashMap (thread-safe collections), immutable data (no shared mutable state).
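The lost update is easy to reproduce on the JVM with plain threads. A minimal sketch (the function names `unsafeCount` and `safeCount` are invented for this demo): the first version races on a captured var, the second uses AtomicInteger.

```kotlin
import java.util.concurrent.atomic.AtomicInteger

// Unsafe: read-modify-write on a shared var from many threads.
// Increments get lost, so the result is usually LESS than expected.
fun unsafeCount(threads: Int, perThread: Int): Int {
    var counter = 0
    val workers = List(threads) {
        Thread { repeat(perThread) { counter++ } } // not atomic!
    }
    workers.forEach { it.start() }
    workers.forEach { it.join() }
    return counter
}

// Safe: AtomicInteger makes each increment one atomic CAS operation.
fun safeCount(threads: Int, perThread: Int): Int {
    val counter = AtomicInteger(0)
    val workers = List(threads) {
        Thread { repeat(perThread) { counter.incrementAndGet() } }
    }
    workers.forEach { it.start() }
    workers.forEach { it.join() }
    return counter.get()
}

fun main() {
    println("unsafe: ${unsafeCount(8, 100_000)}") // usually < 800000
    println("safe:   ${safeCount(8, 100_000)}")   // always 800000
}
```

Running `unsafeCount` a few times shows the intermittent nature of the bug: on a lightly loaded machine it can even return the correct value by luck, which is exactly why it slips through testing.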
What's the difference between synchronized, Mutex, and AtomicInteger?
Choose the lightest tool for the job: Atomic > Mutex > synchronized.
AtomicInteger — lock-free, fastest, for single variables:
val counter = AtomicInteger(0)
counter.incrementAndGet()    // atomic: read + add + write as one atomic operation
counter.compareAndSet(5, 6)  // atomic: if value is 5, set to 6
// No lock. No waiting. Hardware CAS (Compare-And-Swap).
// Use for: counters, flags, simple accumulators
Mutex — suspends coroutine, for coroutine code:
val mutex = Mutex()
suspend fun debit(amount: Money) = mutex.withLock {
check(balance >= amount) // two operations
balance -= amount // protected together
}
// Thread is FREED while waiting for lock.
// Other coroutines can use the thread.
// Use for: protecting multiple operations in coroutines
synchronized — blocks thread, for non-coroutine code:
synchronized(lock) {
check(balance >= amount)
balance -= amount
}
// Thread is BLOCKED. Does nothing. Wastes resources.
// Use for: non-coroutine Java/Kotlin code
Decision matrix:
Scenario                            Tool
Single counter/flag                 AtomicInteger / AtomicBoolean
Multiple operations in coroutine    Mutex
Multiple operations, no coroutine   synchronized or ReentrantLock
Producer-consumer queue             Channel (Kotlin) / BlockingQueue (Java)
Limit parallelism                   Semaphore
Thread-safe collection              ConcurrentHashMap
C# equivalents: synchronized = lock(obj). Mutex = SemaphoreSlim(1,1) + await. AtomicInteger = Interlocked.Increment().
Why is i++ not thread-safe?
Use AtomicInteger.incrementAndGet() instead. i++ looks like one operation but compiles to three instructions: LOAD i, ADD 1, STORE i.
var i = 0
// i++ compiles to bytecode:
// 1. ILOAD  — read i from memory into register (value: 0)
// 2. IADD 1 — add 1 in register (value: 1)
// 3. ISTORE — write register back to memory (i = 1)
// Two threads doing i++ simultaneously:
Thread A: ILOAD (reads 0)
Thread B: ILOAD (reads 0)         // reads SAME old value!
Thread A: IADD 1, ISTORE (i = 1)
Thread B: IADD 1, ISTORE (i = 1)  // overwrites with same value!
// Result: i = 1, not 2. One increment lost.
The fix — AtomicInteger:
val counter = AtomicInteger(0)
// incrementAndGet() retries a single atomic CPU instruction (CAS):
// Compare-And-Swap: "if value is still 0, set to 1"
// If another thread changed it between read and write,
// CAS detects the change and retries automatically.
// Thread A: CAS(expected=0, new=1) -> success, returns 1
// Thread B: CAS(expected=0, new=1) -> FAILS (value is now 1, not 0)
//           CAS retries: CAS(expected=1, new=2) -> success, returns 2
// Result: counter = 2. Correct!
This also applies to: i--, i += 5, map.put(k, map.get(k) + 1) — any read-modify-write is unsafe without synchronization.
Local variables are safe because each thread has its own stack — local variables are not shared. The problem only occurs with shared mutable state (class fields, global variables, captured vars in closures).
What is a deadlock?
Thread 1 locks A, waits for B. Thread 2 locks B, waits for A. Both stuck forever.
Four conditions for deadlock (ALL must be true):
1. Mutual exclusion: resource can only be held by one thread at a time (lock).
2. Hold and wait: thread holds one resource while waiting for another.
3. No preemption: locks can't be forcibly taken from a thread.
4. Circular wait: A waits for B, B waits for A (cycle).
Break ANY one condition to prevent deadlock. Easiest: break circular wait with consistent ordering.
// DEADLOCK — no ordering:
// transfer(A, B): lock A -> lock B
// transfer(B, A): lock B -> lock A <- opposite order!
suspend fun transfer(from: Account, to: Account, amount: Money) {
from.mutex.withLock { // locks in arbitrary order
to.mutex.withLock { // DEADLOCK if concurrent reverse transfer!
from.debit(amount)
to.credit(amount)
}
}
}
// FIX — lock ordering by ID:
suspend fun transfer(from: Account, to: Account, amount: Money) {
val (first, second) = if (from.id < to.id) from to to else to to from
first.mutex.withLock { // ALWAYS smaller ID first
second.mutex.withLock { // ALWAYS larger ID second
from.debit(amount)
to.credit(amount)
}
}
}
// transfer(A, B): lock A -> lock B
// transfer(B, A): lock A -> lock B <- SAME order! No deadlock.
Other prevention strategies:
Lock timeout: if (lock.tryLock(5, SECONDS)) — give up after 5 seconds. Prevents infinite wait but doesn't solve the root cause. ReentrantLock supports this, synchronized does not.
Lock-free algorithms: use AtomicInteger, ConcurrentHashMap — no locks, no deadlocks. Not always possible for complex operations.
Single lock: use one global lock for all accounts. Eliminates deadlock but kills concurrency — all transfers become sequential. Bad for throughput.
Avoid nested locks: redesign so you never need two locks simultaneously. Sometimes possible with atomic UPDATE WHERE in the database.
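The lock-timeout strategy can be sketched as a small helper around ReentrantLock.tryLock (the name `withTimeoutLock` is made up for this example, not a library function): give up after the timeout instead of waiting forever, turning a potential deadlock into a recoverable error.

```kotlin
import java.util.concurrent.TimeUnit
import java.util.concurrent.locks.ReentrantLock

// Try to acquire the lock within timeoutMs; run the action if acquired,
// return null if not. The caller decides whether to retry or abort.
fun <T> withTimeoutLock(lock: ReentrantLock, timeoutMs: Long, action: () -> T): T? {
    if (!lock.tryLock(timeoutMs, TimeUnit.MILLISECONDS)) {
        return null // could not acquire in time
    }
    try {
        return action()
    } finally {
        lock.unlock()
    }
}

fun main() {
    val lock = ReentrantLock()
    // Free lock: acquired immediately.
    println(withTimeoutLock(lock, 100) { "got it" })
    // Lock held by another owner: the attempt times out and returns null.
    lock.lock()
    val other = Thread { println(withTimeoutLock(lock, 50) { "never runs" }) }
    other.start()
    other.join()
    lock.unlock()
}
```

Note that a timed-out transfer must roll back any lock it already holds before retrying; otherwise two retrying threads can livelock instead of deadlock.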
What is volatile and when is it NOT enough?
Not enough for i++ (read-modify-write) — use AtomicInteger for that. Volatile fixes: "Thread 1 writes flag=true but Thread 2 still sees false." It doesn't fix: "Two threads increment counter simultaneously."
The problem volatile solves — CPU cache visibility:
Each CPU core has a local cache. Without volatile, Thread 1 writes running = false to its cache. Thread 2 on another core still reads running = true from its own stale cache. The change is invisible.
Volatile forces: write goes to main memory immediately, reads always come from main memory.
// GOOD — volatile for a flag:
@Volatile var running = true
// Thread 1 (producer):
running = false // written to main memory immediately
// Thread 2 (consumer loop):
while (running) { // reads from main memory, sees false
doWork()
}
// Works correctly. Thread 2 stops.
The problem volatile does NOT solve — compound operations:
// BAD — volatile for counter:
@Volatile var counter = 0
// Thread 1: counter++ (read 0, add 1, write 1)
// Thread 2: counter++ (read 0, add 1, write 1) // RACE CONDITION!
// Result: 1 instead of 2. Volatile didn't help.
Volatile is sufficient for: boolean flags, single assignment (write once, read many), double-checked locking patterns.
Volatile is NOT sufficient for: counters, accumulators, check-then-act, any operation that reads the old value to compute the new value.
Kotlin syntax: @Volatile var flag = false. C# equivalent: volatile bool flag or Volatile.Read()/Volatile.Write().
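The stop-flag case can be sketched end to end (the class name `Worker` is invented for the demo): the main thread flips a @Volatile flag and the busy-looping worker thread is guaranteed to see the write and exit.

```kotlin
// @Volatile stop flag: the write `running = false` in the main thread is
// guaranteed to become visible to the worker thread. Without @Volatile the
// worker could, in principle, keep reading a stale cached `true` forever.
class Worker {
    @Volatile var running = true

    fun runUntilStopped() {
        var spins = 0L
        while (running) { spins++ } // busy-loop until the flag flips
        println("worker stopped after $spins spins")
    }
}

fun main() {
    val worker = Worker()
    val thread = Thread { worker.runUntilStopped() }
    thread.start()
    Thread.sleep(50)        // let the worker spin for a bit
    worker.running = false  // write goes to main memory; worker sees it
    thread.join(2000)
    println("worker alive: ${thread.isAlive}") // false — it observed the flag
}
```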
Why shouldn't you use synchronized inside a suspend function?
synchronized holds the OS thread. But the whole point of coroutines is that threads are shared and freed during suspension.
// BAD: synchronized in suspend function
suspend fun debit(amount: Money) {
synchronized(lock) { // blocks OS thread!
delay(100) // tries to suspend... but thread is held!
balance -= amount
}
}
// Thread pool has 8 threads. 8 coroutines call debit().
// All 8 threads blocked in synchronized. No threads left.
// 9th coroutine can't even start. System is frozen.
The fundamental conflict: synchronized says "hold this thread until I'm done." Coroutines say "free this thread while I'm suspended." These are contradictory. Synchronized + suspend = thread held unnecessarily.
// GOOD: Mutex in suspend function
val mutex = Mutex()
suspend fun debit(amount: Money) = mutex.withLock {
delay(100) // suspends coroutine, thread FREED
balance -= amount
}
// Thread pool has 8 threads. 8 coroutines call debit().
// First coroutine acquires Mutex, suspends during delay.
// Thread freed! Handles other work.
// Other 7 coroutines suspended waiting for Mutex.
// Threads free! No threads wasted.
Important: synchronized in Kotlin compiles fine — it's not a compile error. IntelliJ shows a warning: "Possibly blocking call in non-blocking context." But the code runs. It just wastes threads and can cause deadlocks under load.
Rule: inside suspend functions, use Mutex. Outside suspend functions (regular code), synchronized is fine.
What is CAS (Compare-And-Swap)?
Optimistic concurrency at the hardware level. "I think the value is 5. Set to 6. If someone changed it, try again."
CAS is a single CPU instruction that does three things atomically:
CAS(memory_location, expected_value, new_value):
if memory[location] == expected:
memory[location] = new_value
return true // success
else:
return false // someone changed it, retry
How AtomicInteger uses CAS for incrementAndGet():
// AtomicInteger.incrementAndGet() pseudocode:
fun incrementAndGet(): Int {
while (true) { // retry loop
val current = value // read current value
val next = current + 1 // compute new value
if (CAS(value, current, next)) { // atomic: if still current, set next
return next // success!
}
// CAS failed — someone else changed value
// Loop back, read again, try again
}
}
// Thread A: read 5, CAS(5, 6) -> success, returns 6
// Thread B: read 5, CAS(5, 6) -> FAILS (value is now 6)
// read 6, CAS(6, 7) -> success, returns 7
// Both increments counted. No lock needed!
Why CAS is faster than locks:
Lock (synchronized): thread must acquire the lock; under contention it enters kernel mode, sleeps, and wakes up. Even uncontended lock acquisition takes ~20ns.
CAS: single CPU instruction, ~1-5ns. No kernel mode, no sleeping. Under low contention, CAS rarely retries. Much faster.
CAS downside — ABA problem: value was A, changed to B, changed back to A. CAS sees A and thinks "nothing changed!" But something DID happen. Fix: AtomicStampedReference (adds a version counter alongside the value).
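The ABA fix can be illustrated with AtomicStampedReference. A minimal sketch, with the "other thread" simulated sequentially for determinism: the value goes A -> B -> A, so a plain CAS on the value alone would succeed, but the stamp has moved on and the stale CAS fails.

```kotlin
import java.util.concurrent.atomic.AtomicStampedReference

fun main() {
    // Value "A" paired with version stamp 0.
    val ref = AtomicStampedReference("A", 0)
    val stampSeenByThread1 = ref.stamp // thread 1 reads ("A", stamp 0)

    // Meanwhile "another thread" does A -> B -> A, bumping the stamp each time:
    ref.compareAndSet("A", "B", 0, 1)
    ref.compareAndSet("B", "A", 1, 2)

    // Thread 1's CAS with the old stamp fails even though the value is "A" again:
    val succeeded = ref.compareAndSet("A", "C", stampSeenByThread1, stampSeenByThread1 + 1)
    println("stale CAS succeeded: $succeeded") // false — ABA detected

    // A CAS with the CURRENT stamp works:
    println(ref.compareAndSet("A", "C", 2, 3)) // true
}
```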
CAS downside — high contention: if 100 threads all CAS simultaneously, 99 fail and retry. Spinning wastes CPU. Under very high contention, a lock might actually be better (threads sleep instead of spinning). LongAdder solves this by using multiple cells.
Built on CAS: AtomicInteger, AtomicReference, ConcurrentHashMap, ConcurrentLinkedQueue, Java's ReentrantLock (internally uses CAS). It's the foundation of all lock-free data structures.
What's the difference between coroutineScope and supervisorScope?
coroutineScope = "all or nothing" (debit + credit). supervisorScope = "each on their own" (email + SMS + push).
// coroutineScope — fail-fast, cancel everything:
suspend fun transferMoney() = coroutineScope {
val debit = async { debitAccount(from, amount) }
val credit = async { creditAccount(to, amount) }
// If debit fails -> credit is CANCELLED automatically
// If credit fails -> debit result is useless anyway
// Makes sense: partial transfer = inconsistent state
}
// supervisorScope — independent, each on their own:
suspend fun notify() = supervisorScope {
launch { sendEmail(user) } // fails!
launch { sendPushNotification() } // continues!
launch { sendSMS() } // continues!
// Email failed? Fine. Push and SMS still sent.
// User still gets notified via other channels.
}
How failure propagation differs:
coroutineScope: child exception propagates UP to the scope. Scope cancels all other children. Scope re-throws the exception. Caller's try/catch catches it.
supervisorScope: child exception does NOT propagate up. Other children are unaffected. But you MUST handle errors in each child (try/catch or CoroutineExceptionHandler). Unhandled exception in launch goes to CoroutineExceptionHandler.
// supervisorScope without error handling — BAD:
supervisorScope {
launch { throw Exception("boom") } // where does this go?
// -> CoroutineExceptionHandler if set, otherwise stderr
// -> caller NEVER knows about the failure!
}
// supervisorScope with error handling — GOOD:
supervisorScope {
launch {
try { sendEmail() }
catch (e: Exception) { logger.error("Email failed", e) }
}
}
Rule of thumb: use coroutineScope by default (safer — failures always propagated). Use supervisorScope only when children are explicitly independent AND you handle errors in each child.
What happens when a Semaphore has 10 permits and an 11th coroutine tries to acquire one?
Kotlin's Semaphore is coroutine-aware: waiting suspends the coroutine instead of blocking the thread.
Semaphore limits concurrent access to a resource. Like a parking lot with N spaces.
val dbSemaphore = Semaphore(permits = 10) // 10 "parking spaces"
suspend fun query(sql: String): Result =
    dbSemaphore.withPermit { // take a permit (or suspend if none available)
        db.execute(sql)      // only 10 concurrent DB queries
    }                        // permit returned automatically
}
// Coroutines 1-10: acquire permits, run queries.
// Coroutine 11: all permits taken -> SUSPENDED
// -> thread freed for other work!
// -> when coroutine 3 finishes, permit returned
// -> coroutine 11 resumes automatically
Key difference from Java's Semaphore:
Java Semaphore.acquire(): BLOCKS the OS thread. Thread sits idle consuming 1MB of stack while waiting. 1000 waiting threads = 1GB wasted.
Kotlin Semaphore.withPermit: SUSPENDS the coroutine. Thread freed immediately. Coroutine state stored in ~200 bytes on heap. 1000 waiting coroutines = ~200KB.
Use cases for Semaphore:
Limit DB connection usage (match connection pool size), limit concurrent API calls to external service (respect rate limits), limit parallel file I/O (prevent disk saturation).
Semaphore vs Mutex: Mutex = Semaphore(1). Only one coroutine can enter. Semaphore allows N concurrent coroutines.
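The "parking lot" behaviour is observable even with Java's blocking Semaphore, which makes it easy to test without the coroutines library. A sketch (the function `maxObservedConcurrency` is invented for the demo) that records the peak number of threads inside the guarded section:

```kotlin
import java.util.concurrent.Semaphore
import java.util.concurrent.atomic.AtomicInteger

// Start `workers` threads; at most `permits` may be inside the guarded
// section at once. Returns the peak concurrency actually observed.
fun maxObservedConcurrency(permits: Int, workers: Int): Int {
    val semaphore = Semaphore(permits)
    val current = AtomicInteger(0)
    val peak = AtomicInteger(0)
    val threads = List(workers) {
        Thread {
            semaphore.acquire() // blocks this thread if no permit available
            try {
                val now = current.incrementAndGet()
                peak.updateAndGet { p -> maxOf(p, now) }
                Thread.sleep(10) // simulate work while holding a permit
                current.decrementAndGet()
            } finally {
                semaphore.release()
            }
        }
    }
    threads.forEach { it.start() }
    threads.forEach { it.join() }
    return peak.get()
}

fun main() {
    // 20 workers, 3 permits: peak never exceeds 3.
    println("peak concurrency: ${maxObservedConcurrency(permits = 3, workers = 20)}")
}
```

The Kotlin coroutine Semaphore gives the same guarantee; the difference is only in how waiters wait (suspension vs a blocked OS thread).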
What is structured concurrency?
Like structured programming (no goto) but for concurrency. Every concurrent task has a clear owner and lifetime.
The problem without structured concurrency:
// Unstructured — fire and forget:
fun handleRequest() {
GlobalScope.launch { fetchUser() } // who owns this?
GlobalScope.launch { fetchOrders() } // what if this fails?
// Function returns immediately.
// Coroutines running somewhere. No one waits for them.
// If they fail -> error goes to stderr. Caller never knows.
// If server shuts down -> coroutines might be mid-operation. Data loss.
// These are "orphaned" tasks — no parent, no lifecycle.
}
With structured concurrency:
suspend fun handleRequest() = coroutineScope {
val user = async { fetchUser() }
val orders = async { fetchOrders() }
// Three guarantees:
// 1. coroutineScope WAITS for both children to complete
// 2. If fetchUser fails -> fetchOrders is CANCELLED
// 3. If someone cancels handleRequest -> both children cancelled
// No orphaned tasks. No silent failures. No resource leaks.
respond(user.await(), orders.await())
}
The three guarantees:
1. Completion: parent scope doesn't complete until ALL children complete. No orphaned tasks running in the background.
2. Cancellation propagation: cancelling a parent automatically cancels all children. SIGTERM -> scope.cancel() -> all children get CancellationException -> graceful shutdown.
3. Error propagation: child failure propagates to parent. Parent can handle it or propagate further. No silent failures.
This is Kotlin coroutines' biggest advantage over C# async/await. In C# you must manually pass CancellationToken through every method. Forget once -> task runs forever. In Kotlin, cancellation is automatic through the scope hierarchy.
Also the biggest advantage over Java Virtual Threads: VT has no structured concurrency. Fire a virtual thread -> it runs independently. No parent-child relationship. No auto-cancellation. Java is exploring structured concurrency as a preview feature (JEP 453) but it's not stable yet.
What is a Channel in Kotlin coroutines?
send() suspends when full, receive() suspends when empty — like Java's BlockingQueue but non-blocking. A producer sends payments into the channel; a consumer processes them. The channel provides backpressure: the producer suspends when the consumer is slow.
val channel = Channel<Payment>(capacity = 100) // buffer of 100
// Producer coroutine:
launch {
for (payment in payments) {
channel.send(payment) // suspends if buffer full (backpressure!)
}
channel.close() // signal: no more items
}
// Consumer coroutine:
launch {
for (payment in channel) { // suspends if buffer empty
process(payment) // processes at its own pace
}
// loop ends when channel is closed AND empty
}
Channel types by capacity:
Rendezvous (Channel(0)): no buffer. send() suspends until someone calls receive(). Direct hand-off. Tightest synchronization.
Buffered (Channel(100)): send() succeeds immediately if buffer not full. Suspends when buffer full. Decouples producer and consumer speed.
Unlimited (Channel(UNLIMITED)): never suspends on send. Buffer grows forever. Risk: OOM if producer is much faster than consumer. Use with caution.
Conflated (Channel(CONFLATED)): keeps only the latest value. If consumer is slow, old values are dropped. Good for "latest state" patterns (UI updates, sensor readings).
Comparison across languages:
Java: BlockingQueue — blocks OS thread on put/take. Same concept, but wastes threads.
Go: Go channels — almost identical concept. Kotlin channels are inspired by Go.
C#: System.Threading.Channels.Channel — async producer-consumer. Similar to Kotlin but less integrated.
What is Flow in Kotlin?
Like C#'s IAsyncEnumerable or RxJava's Observable, but simpler. Flow is to async streams what Sequence is to synchronous streams. Cold = nothing happens until you call .collect().
// Creating a Flow:
fun priceUpdates(): Flow<Price> = flow {
while (true) {
val price = fetchLatestPrice()
emit(price) // sends value downstream
delay(1000) // wait 1 second
}
}
// NOTHING runs yet. Flow is just a description.
// Collecting (starts execution):
priceUpdates()
.filter { it.amount > 100 }
.map { it.format() }
.collect { formatted -> // THIS triggers execution
updateUI(formatted)
}
Cold vs hot:
Flow (cold): each collector gets its own independent execution. Two collectors = two separate streams running independently. Like reading a file — each reader starts from the beginning.
SharedFlow / StateFlow (hot): one source, multiple collectors share the same emissions. Like a radio broadcast — tune in anytime, get whatever's playing now.
// SharedFlow — hot, multiple collectors:
val events = MutableSharedFlow<Event>()
// Collector A: events.collect { ... } // sees all events from now
// Collector B: events.collect { ... } // sees same events
// events.emit(event) // both collectors receive it
// StateFlow — hot, always has a current value:
val balance = MutableStateFlow(Money(0, "EUR"))
// balance.value = newBalance // update
// balance.collect { ... } // immediately gets current value + future updates
Backpressure is built-in: if collector is slow, the flow producer suspends automatically. No buffer overflow. No manual backpressure handling. This is because flow uses coroutines — emit() is a suspend function that waits if collector isn't ready.
Comparison: Flow = C#'s IAsyncEnumerable<T> = RxJava Observable (but simpler, no callback hell, uses suspend instead of subscribe).
What is CancellationException and why must you never swallow it?
SIGTERM -> scope.cancel() -> CancellationException thrown at suspension points. If you catch and swallow it, the coroutine can't stop.
How coroutine cancellation works:
val job = launch {
while (true) {
doWork()
delay(1000) // suspension point
}
}
job.cancel() // request cancellation
// At the next suspension point (delay), Kotlin throws CancellationException
// The coroutine's while loop is interrupted
// Coroutine completes with "cancelled" status
The danger of catching Exception:
// BAD — swallows CancellationException:
launch {
while (true) {
try {
delay(1000)
doWork()
} catch (e: Exception) { // catches EVERYTHING
logger.error("Error", e) // logs CancellationException as "error"
// continues looping!
// Coroutine CANNOT be cancelled!
// SIGTERM -> scope.cancel() -> nothing happens!
}
}
}
// GOOD — catch specific exceptions:
launch {
while (true) {
try {
delay(1000)
doWork()
} catch (e: CancellationException) {
throw e // re-throw! Let cancellation happen.
} catch (e: Exception) {
logger.error("Error", e) // handle business errors
}
}
}
// BEST — catch only what you expect:
launch {
while (isActive) { // check cancellation explicitly
try {
delay(1000)
doWork()
} catch (e: HttpException) { // only catch specific errors
logger.error("HTTP error", e)
}
// CancellationException flies through — coroutine stops
}
}
Rule: never catch (e: Exception) in coroutines without re-throwing CancellationException. Either catch specific exceptions, or add catch (e: CancellationException) { throw e } before the general catch.
C# equivalent: OperationCanceledException — same concept. Must not be swallowed. Indicates intentional cancellation via CancellationToken.
What's the difference between launch and async?
C# analogy: launch = Task.Run() ignoring the result. async = storing Task<T> and awaiting it.
// launch — fire and forget, no return value:
val job: Job = launch {
sendNotification(user) // we don't need the result
}
// job.join() — wait for completion (optional)
// job.cancel() — cancel the coroutine
// async — returns a future value:
val deferred: Deferred<User> = async {
fetchUser(id) // we NEED the result
}
val user: User = deferred.await() // suspends until result ready
Exception handling difference — critical!
// launch: exception propagates to parent IMMEDIATELY
coroutineScope {
launch {
throw RuntimeException("boom")
// -> exception goes to parent scope RIGHT NOW
// -> scope cancels other children
// -> scope re-throws
}
}
// async: exception stored in Deferred, thrown at .await()
coroutineScope {
val d = async {
throw RuntimeException("boom")
// -> exception stored inside Deferred
// -> nothing happens yet!
}
println("still running") // this executes!
d.await() // -> NOW exception is thrown
}
// BUT: in coroutineScope, the async exception ALSO propagates to the parent
// (structured concurrency: the scope is cancelled even if you never call
// .await()). In supervisorScope, it only surfaces at .await().
Common pattern — parallel decomposition:
suspend fun loadDashboard() = coroutineScope {
val user = async { fetchUser() } // starts immediately
val orders = async { fetchOrders() } // starts immediately, in parallel
val balance = async { fetchBalance() } // starts immediately, in parallel
// All three run concurrently
Dashboard(user.await(), orders.await(), balance.await())
// Waits for all three, returns combined result
}
Why use a thread pool instead of a thread per request?
10,000 requests arrive. Create 10,000 threads? That's 10GB RAM just for stacks. A pool reuses a fixed number.
// BAD — new thread per request:
fun handleRequest(req: Request) {
Thread {
process(req)
}.start()
}
// 10,000 concurrent requests = 10,000 threads = 10GB stack memory
// OS scheduler overwhelmed with 10K threads
// Thread creation overhead: 1ms each = 10 seconds of CPU time
// GOOD — thread pool:
val executor = Executors.newFixedThreadPool(200)
fun handleRequest(req: Request) {
executor.submit {
process(req)
}
}
// 200 threads reused. 200MB stack memory.
// Extra requests wait in queue until a thread is free.
// Thread creation: once at startup, not per request.
Kotlin Dispatchers ARE thread pools:
Dispatchers.Default = pool of CPU-core threads (for CPU work)
Dispatchers.IO = pool of ~64 threads (for blocking I/O)
Dispatchers.Main = single UI thread (Android)
// When you write:
launch(Dispatchers.IO) { db.query("SELECT ...") }
// A thread from the IO pool executes your code.
// When done, thread returns to pool for the next coroutine.
Sizing a thread pool:
CPU-bound work (calculations, parsing): threads = number of CPU cores. More threads = more context switching, no benefit since CPU is the bottleneck.
I/O-bound work (DB, HTTP, file): threads = cores * (1 + wait_time/compute_time). If each request waits 50ms and computes 5ms, optimal = cores * 11. Kotlin's Dispatchers.IO defaults to max(64, cores) threads.
With coroutines, thread pools still exist — they're just hidden behind Dispatchers. Coroutines are scheduled ON threads from the pool. The difference: 10,000 coroutines can run on 64 threads, because coroutines suspend during I/O and the thread handles another coroutine.
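The I/O sizing formula above can be written as a small function (a rough heuristic, not an exact science; the name `optimalPoolSize` is invented here):

```kotlin
// threads = cores * (1 + waitTime / computeTime)
// For pure CPU work waitMs = 0, so the formula degenerates to just `cores`.
fun optimalPoolSize(cores: Int, waitMs: Double, computeMs: Double): Int =
    (cores * (1 + waitMs / computeMs)).toInt()

fun main() {
    // The example from the text: 50ms waiting, 5ms computing, 8 cores.
    println(optimalPoolSize(cores = 8, waitMs = 50.0, computeMs = 5.0)) // 88
    // CPU-bound: no waiting, pool size = core count.
    println(optimalPoolSize(cores = 8, waitMs = 0.0, computeMs = 5.0))  // 8
}
```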
What's the difference between Dispatchers.IO and Dispatchers.Default?
Default is for CPU work: all threads are busy computing. IO is for blocking I/O: threads are mostly idle, waiting for responses.
// Dispatchers.Default — for CPU-intensive work:
withContext(Dispatchers.Default) {
val hash = computeExpensiveHash(data) // CPU busy 100% of the time
val parsed = parseHugeJson(input) // CPU busy
}
// Thread count = CPU cores (e.g., 8 on 8-core machine)
// More threads = more context switching, no speedup (CPU is bottleneck)
// Dispatchers.IO — for blocking I/O:
withContext(Dispatchers.IO) {
val result = db.query("SELECT ...") // thread waits 50ms for DB
val response = httpClient.get(url) // thread waits 200ms for HTTP
val content = file.readText() // thread waits for disk
}
// Thread count = max(64, cores)
// Many threads OK because they're mostly WAITING, not computing
// While thread 1 waits for DB, thread 2 handles another coroutine
What happens if you use the wrong Dispatcher:
// BAD — CPU work on IO dispatcher:
withContext(Dispatchers.IO) {
computeExpensiveHash(data)
}
// 64 threads all doing CPU work. CPU only has 8 cores.
// 56 threads constantly context-switching. Overhead, not speedup.
// BAD — blocking IO on Default dispatcher:
withContext(Dispatchers.Default) {
db.query("SELECT ...") // blocks one of 8 threads
}
// 8 threads, all waiting for DB. No threads left for CPU work.
// Application appears frozen.
C# equivalent: Task.Run() uses ThreadPool (similar to Default). There's no direct equivalent of Dispatchers.IO — C# async/await doesn't block threads for I/O at all (truly asynchronous I/O at OS level). Kotlin's Dispatchers.IO exists because many Java libraries have blocking I/O (JDBC, file I/O) that needs a separate large thread pool.
What is ThreadLocal and why is it problematic with coroutines?
ThreadLocal stores request context (user ID, trace ID). A coroutine suspends on Thread 1 and resumes on Thread 3. Where's the context?
ThreadLocal — thread-scoped storage:
val currentUser = ThreadLocal<User>()
// Thread 1: currentUser.set(User("Alex"))
// Thread 1: currentUser.get() // User("Alex")
// Thread 2: currentUser.get() // null! Thread 2 has its own copy
Common use: request context in web frameworks. Servlet receives request on Thread 5. Sets ThreadLocal with user info. All code on Thread 5 can access user info without passing it through every method.
The coroutine problem:
val requestId = ThreadLocal<String>()
suspend fun handleRequest(id: String) {
requestId.set(id) // set on Thread 1
val user = fetchUser() // suspends... resumes on Thread 3!
logger.info("${requestId.get()}") // null! Thread 3 has no ThreadLocal
}
Coroutine suspends on Thread 1, resumes on Thread 3. ThreadLocal belongs to Thread 1. Thread 3 has its own (empty) ThreadLocal.
Kotlin's solution — CoroutineContext elements:
// Option 1: ThreadLocal.asContextElement()
val requestId = ThreadLocal<String>()
launch(requestId.asContextElement("req-123")) {
// Kotlin automatically copies ThreadLocal value when switching threads
requestId.get() // "req-123" — works even after thread switch!
}
// Option 2: Custom CoroutineContext element (cleaner):
data class RequestContext(val id: String, val userId: String) :
    AbstractCoroutineContextElement(RequestContext) {
    companion object Key : CoroutineContext.Key<RequestContext>
}
launch(RequestContext("req-123", "user-456")) {
val ctx = coroutineContext[RequestContext]
logger.info("Request ${ctx?.id} by ${ctx?.userId}")
}
Also: always call threadLocal.remove() when done. Thread pools reuse threads — old values leak into next request. This is a common memory leak source.
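The reuse leak is easy to demonstrate with a single-threaded pool (so thread reuse is guaranteed); `handle` is a made-up request handler for the demo:

```kotlin
import java.util.concurrent.Callable
import java.util.concurrent.Executors

// ThreadLocal is scoped per thread. Pools reuse threads, so a value set while
// handling one task leaks into the next task on the same thread unless removed.
val requestUser = ThreadLocal<String>()

fun handle(user: String?): String? {
    if (user != null) requestUser.set(user)
    try {
        return requestUser.get() // whatever this thread currently holds
    } finally {
        requestUser.remove()     // without this, the next task would see "alice"
    }
}

fun main() {
    val pool = Executors.newFixedThreadPool(1) // one thread: guaranteed reuse
    val first = pool.submit(Callable { handle("alice") }).get()
    val second = pool.submit(Callable { handle(null) }).get() // no user set
    println("first=$first second=$second") // first=alice second=null
    pool.shutdown()
}
```

Comment out the remove() call and the second task reports "alice" — a stale user identity attached to an unrelated request, which is exactly the leak described above.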
What's the difference between withContext and launch?
withContext switches dispatcher and suspends until complete (sequential); launch starts a new concurrent coroutine (parallel). withContext returns a result; launch returns a Job. withContext(IO) { query() } = run on the IO pool, wait for the result, continue. launch { query() } = fire and forget.
// withContext — switch context, wait for result:
suspend fun getUser(id: String): User {
return withContext(Dispatchers.IO) { // switch to IO thread
db.query("SELECT * FROM users WHERE id = ?", id)
} // switch back, return result
// Sequential: next line runs AFTER query completes
}
// launch — start concurrent coroutine:
fun startBackgroundJob() {
scope.launch(Dispatchers.IO) { // start new coroutine
processQueue() // runs in background
}
// launch returns IMMEDIATELY
// processQueue runs concurrently
println("This runs right away!")
}
Key differences:
withContext: suspends current coroutine, runs block on specified dispatcher, returns result. The calling code WAITS. Like calling a function that happens to run on a different thread. Use for: "run this blocking code on IO pool and give me the result."
launch: starts a NEW coroutine. Returns Job immediately. The calling code continues in parallel. Use for: "start this background task, I don't need the result right now."
// Common pattern — blocking Java library in coroutine:
suspend fun readFile(path: String): String {
return withContext(Dispatchers.IO) { // move to IO pool
File(path).readText() // blocking call — OK on IO
} // back to original dispatcher
}
// Caller doesn't know or care that it ran on IO internally.
// Just looks like a regular suspend function.
C# equivalent of withContext: there's no direct equivalent. C#'s async I/O is truly non-blocking at the OS level (IOCP). ConfigureAwait(false) is the closest concept — "don't come back to the original context." Kotlin's withContext explicitly moves TO a specific context.
What's the difference between concurrency and parallelism?
Rob Pike: "Concurrency is about dealing with lots of things at once. Parallelism is about doing lots of things at once."
Concurrency — juggling multiple tasks:
One cook preparing three dishes. Chops vegetables for dish 1, puts dish 2 in oven, stirs dish 3. Only doing one thing at a time, but managing all three. Tasks are interleaved. Can happen on a single CPU core.
Parallelism — doing multiple tasks simultaneously:
Three cooks, each preparing one dish. Actually doing three things at the same time. Requires multiple CPU cores.
You can have one without the other:
Concurrent but not parallel: single-core machine running async/await. Tasks interleave on one core. JavaScript is always concurrent (event loop) but never parallel (single-threaded, unless Web Workers).
Parallel but not concurrent: GPU processing 1000 pixels simultaneously. Same operation on different data. No task management needed. Pure parallelism.
Both concurrent and parallel: Kotlin coroutines on Dispatchers.Default (multi-core). Multiple coroutines managed concurrently, executing in parallel on different cores.
In Kotlin:
// Concurrent (one thread, interleaving):
// Dispatchers.Main (Android) — one UI thread
// But coroutines take turns at suspension points
// Parallel (multiple threads):
// Dispatchers.Default — CPU cores threads
coroutineScope {
val a = async(Dispatchers.Default) { cpuWork1() } // core 1
val b = async(Dispatchers.Default) { cpuWork2() } // core 2
a.await() + b.await() // truly simultaneous execution
}
For the interview: "Concurrency is about structuring your program to handle multiple things — like async/await. Parallelism is about executing things simultaneously — needs multiple cores. Kotlin coroutines give you concurrency by default, and parallelism when you use multi-threaded dispatchers."
How do the concurrency tools compare across Java, C#, and Kotlin?
Each language has thread-blocking tools. Kotlin adds coroutine-suspending equivalents.
Full comparison:
Purpose              Java                    C#                        Kotlin (coroutines)
Mutual exclusion     synchronized,           lock(obj),                Mutex,
(one at a time)      ReentrantLock           Monitor.Enter/Exit        mutex.withLock { }
Atomic counter       AtomicInteger,          Interlocked.Increment(),  AtomicInteger (same),
                     AtomicReference         Interlocked.Exchange      AtomicReference
Limit parallelism    Semaphore               SemaphoreSlim             Semaphore
(N at a time)        (.acquire() BLOCKS)     (.WaitAsync() is async)   (.withPermit suspends)
Producer-consumer    BlockingQueue           Channel<T>                Channel<T>
                     (.put() BLOCKS)         (.WriteAsync() is async)  (.send() suspends)
Thread-safe map      ConcurrentHashMap       ConcurrentDictionary      ConcurrentHashMap (uses Java's)
Thread-safe list     CopyOnWriteArrayList,   ConcurrentBag,            N/A (use Mutex + mutableList)
                     Collections.sync        ImmutableList
The pattern:
Java: thread-oriented tools. All block the OS thread when waiting.
C#: has both blocking (Monitor) and async (SemaphoreSlim.WaitAsync()) variants.
Kotlin: adds coroutine-aware tools that SUSPEND instead of blocking. These are the preferred choice in coroutine code. The Java blocking tools still work but waste threads.
Decision rule for Kotlin:
In suspend functions / coroutine context: use Mutex, Semaphore, Channel (suspend, don't block).
In regular (non-suspend) code: use synchronized, ReentrantLock (blocking is OK).
For single atomic variables (any context): use AtomicInteger, AtomicBoolean (lock-free, always fine).
For thread-safe maps (any context): use ConcurrentHashMap (lock-free reads, fine everywhere).