
Scheduling

By default, Hadoop schedules jobs FIFO. Two alternative schedulers are available: the Capacity Scheduler and the Fair Scheduler.

Capacity Scheduler

  • Jobs are submitted to queues
  • Jobs can be prioritized
  • Queues are allocated a fraction of the total resource capacity
  • Free resources can be allocated to queues beyond their configured capacity
  • No preemption once a job is running
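As a concrete illustration, queue capacities and per-queue priority support were set in the Capacity Scheduler's configuration file. The sketch below uses MRv1-era property names and hypothetical queue names (`prod`, `dev`); the exact property names vary across Hadoop versions, so treat this as an example of the shape of the configuration rather than a drop-in file.

```xml
<!-- capacity-scheduler.xml (sketch; MRv1-era property names, hypothetical queues) -->
<configuration>
  <!-- prod queue gets 70% of cluster capacity -->
  <property>
    <name>mapred.capacity-scheduler.queue.prod.capacity</name>
    <value>70</value>
  </property>
  <!-- dev queue gets the remaining 30% -->
  <property>
    <name>mapred.capacity-scheduler.queue.dev.capacity</name>
    <value>30</value>
  </property>
  <!-- allow jobs in prod to be prioritized -->
  <property>
    <name>mapred.capacity-scheduler.queue.prod.supports-priority</name>
    <value>true</value>
  </property>
</configuration>
```

When the dev queue is idle, prod jobs can use its free slots; but because there is no preemption, dev jobs arriving later must wait for those borrowed slots to be released.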

Fair Scheduler

  • Provides fast response times for small jobs
  • Jobs are grouped into Pools
  • Each pool is assigned a guaranteed minimum share
  • Excess capacity split between jobs
  • Jobs not assigned to a pool go into a default pool
  • Pools can specify a minimum number of map slots, a minimum number of reduce slots, and a limit on the number of running jobs
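The pool properties above map directly onto the Fair Scheduler's allocation file. The sketch below shows an MRv1-style allocation file with a hypothetical pool name (`analytics`); element names and defaults differ across Hadoop releases, so this is an illustrative sketch, not a definitive configuration.

```xml
<!-- fair-scheduler.xml allocation file (sketch; hypothetical pool) -->
<allocations>
  <pool name="analytics">
    <!-- guaranteed minimum share of map and reduce slots -->
    <minMaps>10</minMaps>
    <minReduces>5</minReduces>
    <!-- cap on concurrently running jobs in this pool -->
    <maxRunningJobs>3</maxRunningJobs>
  </pool>
</allocations>
```

Slots beyond each pool's guaranteed minimum are the "excess capacity" that the scheduler splits fairly between running jobs, which is what keeps response times short for small jobs even when large jobs are active.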