Notes for Parallel Computing and Distributed Systems
Forms of Parallel Execution
SIMD 遇到 branchs?一部分 ALU 需要等待
Interleaved (temporal) multi-threading: 这个任务在 stall 了(比如在 fetch memory),先执行别的任务
Overcoming bandwidth limits is often the most important challenge facing software developers targeting modern throughput-optimized systems.
忽然发现 CS244B 的 instructor 是
You should try everything else before you try distributed systems.
Why distributed systems?
需要比普通的 procedure 多一个 “I don’t know” 的选项。
Interface Definition Languages (IDL): Specify RPC call and return types
External Data Representation Standard (XDR)
在讨论 asynchronous systems 的时候,我们会保守地认为网络是很慢的。也就是说,we can’t distinguish failed agent from slow network.
有 \(n\) 个 agents,每个 agent input 一个数字,现在要各个 agent 通过互相交流达成一致,使得每个 agent 的 output 都相同,并且是其中一个 agent 的 input。
Safety: 所有 agent 的 output 都相同(agreement)并且 output 为其中某个 agent 的 input(validity)
Liveness: 所有 non-failed 的 agents 都有输出
Fault tolerance:
FLP impossibility result
Bivalent: An execution of a consensus protocol is in a bivalent state when the network can affect which value agents choose.
Univalent, Valent: An execution of a consensus protocol is in a univalent state when only one output value is possible. If that value is \(i\), call the state \(i\)-valent.
Stuck: An execution of a [broken] consensus protocol is in a stuck state when one or more non-faulty nodes can never output a value.