Jobs and Scheduling
This documentation is for an unreleased version of Apache Flink. We recommend you use the latest stable version.

ジョブとスケジューリング #

This document briefly describes how Flink schedules jobs and how it represents and tracks job status on the JobManager.

スケジューリング #

Execution resources in Flink are defined through Task Slots. Each TaskManager will have one or more task slots, each of which can run one pipeline of parallel tasks. A pipeline consists of multiple successive tasks, such as the n-th parallel instance of a MapFunction together with the n-th parallel instance of a ReduceFunction. Note that Flink often executes successive tasks concurrently: For Streaming programs, that happens in any case, but also for batch programs, it happens frequently.

以下の図はそれを示しています。Consider a program with a data source, a MapFunction, and a ReduceFunction. The source and MapFunction are executed with a parallelism of 4, while the ReduceFunction is executed with a parallelism of 3. パイプラインは連続するソース - Map - Reduceから成ります。On a cluster with 2 TaskManagers with 3 slots each, the program will be executed as described below.

Assigning Pipelines of Tasks to Slots

Internally, Flink defines through SlotSharingGroup and CoLocationGroup which tasks may share a slot (permissive), respectively which tasks must be strictly placed into the same slot.

ジョブマネージャー データ構造 #

During job execution, the JobManager keeps track of distributed tasks, decides when to schedule the next task (or set of tasks), and reacts to finished tasks or execution failures.

The JobManager receives the JobGraph , which is a representation of the data flow consisting of operators ( JobVertex ) and intermediate results ( IntermediateDataSet ). 各オペレータは、並行度および実行するコードのようなプロパティを持ちます。 更に、ジョブグラフはアタッチされたライブラリのセットを持ちます。それはオペレータのコードを実行するために必要なものです。

The JobManager transforms the JobGraph into an ExecutionGraph . The ExecutionGraph is a parallel version of the JobGraph: For each JobVertex, it contains an ExecutionVertex per parallel subtask. 並行度100のオペレータは1つのJobVertexと100のExecutionVerticesを持つでしょう。 ExecutionVertexは特定のサブタスクの実行状態を追跡します。All ExecutionVertices from one JobVertex are held in an ExecutionJobVertex , which tracks the status of the operator as a whole. Besides the vertices, the ExecutionGraph also contains the IntermediateResult and the IntermediateResultPartition . The former tracks the state of the IntermediateDataSet, the latter the state of each of its partitions.

JobGraph and ExecutionGraph

各 ExecutionGraph はそれに関するジョブの状態を持ちます。 子のジョブの状態はジョブの実行の現在の状態を示します。

A Flink job is first in the created state, then switches to running and upon completion of all work it switches to finished. In case of failures, a job switches first to failing where it cancels all running tasks. If all job vertices have reached a final state and the job is not restartable, then the job transitions to failed. If the job can be restarted, then it will enter the restarting state. Once the job has been completely restarted, it will reach the created state.

In case that the user cancels the job, it will go into the cancelling state. これは必然的に現在の全ての実行中のタスクの中止を伴います。 Once all running tasks have reached a final state, the job transitions to the state cancelled.

Unlike the states finished, canceled and failed which denote a globally terminal state and, thus, trigger the clean up of the job, the suspended state is only locally terminal. ローカルで終端はジョブの実行がそれぞれのJobManagerで終了したがFlinkクラスタの他のJobManagerが永続的なHAソトアからジョブを取り出し再開することができることを意味します。 Consequently, a job which reaches the suspended state won’t be completely cleaned up.

States and Transitions of Flink job

During the execution of the ExecutionGraph, each parallel task goes through multiple stages, from created to finished or failed. The diagram below illustrates the states and possible transitions between them. タスクは複数回実行されるかもしれません(タオ手羽、障害の復旧の道筋)。 For that reason, the execution of an ExecutionVertex is tracked in an Execution . 各 ExecutionVertex は現在の Execution と以前の Executions を持ちます。

States and Transitions of Task Executions

Back to top

inserted by FC2 system