Malte Schwarzkopf

Date: 13:45, Wednesday, December 21, 2016
Speaker: Malte Schwarzkopf
Venue: IST Austria

CS Talk

Scheduling tasks on clusters in large-scale datacenters is challenging:
thousands of tasks must be cleverly and quickly placed to achieve good
application-level performance and high resource utilization. Centralized
datacenter schedulers can make high-quality placement decisions, but
today these high-quality placements come at the cost of high latency at
scale, which degrades response time for interactive tasks and reduces
resource utilization.

In this talk, I present Firmament, a centralized scheduler that scales to
over ten thousand machines, even though it performs a computationally
expensive min-cost max-flow (MCMF) optimization that continuously
reschedules all tasks. To achieve this, Firmament automatically chooses
between different MCMF algorithms, solves the optimization problem
incrementally when possible, and applies problem-specific optimizations.
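
To make the MCMF formulation concrete, here is a minimal sketch of task placement posed as a min-cost max-flow problem. It is not Firmament's implementation: it uses a plain successive-shortest-path solver rather than the faster, incrementally solved algorithms the talk describes, and every name in it (`MinCostFlow`, `place_tasks`, the `"S"`, `"T"` and `"U"` nodes, the cost values) is a hypothetical choice made for illustration. Each task sends one unit of flow from the source to the sink, either through a machine with a free slot or through an "unscheduled" node that carries a penalty.

```python
from collections import defaultdict


class MinCostFlow:
    """Successive-shortest-path min-cost max-flow (illustrative, not optimized)."""

    def __init__(self):
        self.graph = defaultdict(list)  # node -> indices into self.edges
        self.edges = []                 # each edge: [to, residual capacity, cost]

    def add_edge(self, u, v, cap, cost):
        # Forward edges get even indices; the paired reverse edge is index ^ 1.
        self.graph[u].append(len(self.edges))
        self.edges.append([v, cap, cost])
        self.graph[v].append(len(self.edges))
        self.edges.append([u, 0, -cost])

    def solve(self, s, t):
        flow = cost = 0
        while True:
            # Bellman-Ford on the residual graph (reverse edges have negative cost).
            dist, prev = {s: 0}, {}
            changed = True
            while changed:
                changed = False
                for u in list(dist):
                    for ei in self.graph[u]:
                        v, cap, c = self.edges[ei]
                        if cap > 0 and dist[u] + c < dist.get(v, float("inf")):
                            dist[v] = dist[u] + c
                            prev[v] = ei
                            changed = True
            if t not in dist:
                return flow, cost  # no augmenting path left
            # Find the bottleneck capacity along the shortest path.
            push, v = float("inf"), t
            while v != s:
                ei = prev[v]
                push = min(push, self.edges[ei][1])
                v = self.edges[ei ^ 1][0]
            # Augment flow along that path.
            v = t
            while v != s:
                ei = prev[v]
                self.edges[ei][1] -= push
                self.edges[ei ^ 1][1] += push
                v = self.edges[ei ^ 1][0]
            flow += push
            cost += push * dist[t]


def place_tasks(costs, slots, unscheduled_cost):
    """costs[task][machine] is a placement cost; slots[machine] is free slots."""
    mcf = MinCostFlow()
    for task, per_machine in costs.items():
        mcf.add_edge("S", ("task", task), 1, 0)
        mcf.add_edge(("task", task), "U", 1, unscheduled_cost)
        for machine, c in per_machine.items():
            mcf.add_edge(("task", task), ("machine", machine), 1, c)
    for machine, n in slots.items():
        mcf.add_edge(("machine", machine), "T", n, 0)
    mcf.add_edge("U", "T", len(costs), 0)
    _, total_cost = mcf.solve("S", "T")

    placement = {}
    for task in costs:
        for ei in mcf.graph[("task", task)]:
            v, cap, _ = mcf.edges[ei]
            if ei % 2 == 0 and cap == 0:  # saturated forward edge carries the flow
                placement[task] = None if v == "U" else v[1]
    return placement, total_cost


# Hypothetical workload: three tasks, two machines with one free slot each.
costs = {"t1": {"m1": 1, "m2": 4}, "t2": {"m1": 2, "m2": 6}, "t3": {"m1": 3}}
placement, total = place_tasks(costs, {"m1": 1, "m2": 1}, unscheduled_cost=10)
print(placement, total)
```

In this toy instance the optimizer places t1 on m2 and t2 on m1 and leaves t3 unscheduled, for a total cost of 16. A greedy scheduler that gave t1 its cheapest machine (m1) could do no better than 17, which illustrates why considering all tasks jointly can yield better placements.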

Experiments with a Google workload trace from a 12,500-machine cluster
show that Firmament places tasks in hundreds of milliseconds, and that
Firmament improves placement latency by 20x over Quincy, a prior
centralized scheduler using the same MCMF optimization. Moreover, even
though Firmament is centralized, it matches the placement latency of
distributed schedulers for workloads of short tasks. Finally, Firmament
exceeds the placement quality of four widely used centralized and
distributed schedulers on a real-world cluster, improving batch task
response time by 6x.

I. Gog, M. Schwarzkopf, A. Gleave, R. N. M. Watson, S. Hand. “Firmament:
Fast, Centralized Cluster Scheduling at Scale”. In Proceedings of the
12th USENIX Symposium on Operating Systems Design and Implementation
(OSDI), Savannah, GA, USA, November 2016, pp. 99–115.

Posted in RiSE Seminar