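Standalone #

Getting Started #

This Getting Started section guides you through the local setup (on one machine, but in separate processes) of a Flink cluster. This setup can easily be expanded into a distributed standalone cluster, which is described in the reference section.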

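Introduction #

Standalone mode is the most barebones way of deploying Flink: the Flink services described in the deployment overview are simply launched as processes on the operating system. Unlike deploying Flink with a resource provider such as Kubernetes or YARN, you have to take care of restarting failed processes and of allocating and de-allocating resources during operation yourself.

The additional subpages of the standalone mode resource provider describe further deployment methods that are based on standalone mode: deployment in Docker containers and on Kubernetes.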

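Preparation #

Flink runs on all UNIX-like environments, e.g. Linux, Mac OS X, and Cygwin (for Windows). Before you start to set up the system, make sure it fulfills the following requirements.

Java 1.8.x or higher is installed,
A recent Flink distribution has been downloaded from the download page and unpacked.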
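
As a rough way to check the Java requirement above, the following sketch maps a Java version string to its major version. The helper name `java_major` is made up for this example, and the parsing assumes the usual `1.8.0_292` or `11.0.2` shapes of version strings; other formats may need adjusting.

```shell
# Hypothetical helper (not part of Flink): print the major version for a
# Java version string such as "1.8.0_292" or "11.0.2".
java_major() {
  local v="$1"
  case "$v" in
    1.*) v="${v#1.}"; echo "${v%%.*}" ;;   # legacy scheme: 1.8.x means Java 8
    *)   echo "${v%%[.+-]*}" ;;            # modern scheme: 11.0.2, 17, 21-ea
  esac
}

# Report the installed Java, if any. `java -version` writes to stderr.
if command -v java >/dev/null 2>&1; then
  installed="$(java -version 2>&1 | awk -F'"' '/version/ {print $2; exit}')"
  if [ -n "$installed" ]; then
    echo "Java major version: $(java_major "$installed")"
  fi
fi
```

A reported major version of 8 or above satisfies the requirement stated in the list.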

Starting a Standalone Cluster (Session Mode) #

These steps show how to launch a Flink standalone cluster and submit an example job:

# we assume to be in the root directory of the unpacked Flink distribution

# (1) Start the cluster
$ ./bin/start-cluster.sh

# (2) You can now access the Flink web interface at http://localhost:8081

# (3) Submit an example job
$ ./bin/flink run ./examples/streaming/TopSpeedWindowing.jar

# (4) Stop the cluster again
$ ./bin/stop-cluster.sh

In step (1), we started two processes: a JVM for the JobManager and a JVM for the TaskManager. The JobManager serves the web interface, which is accessible at localhost:8081. In step (3), we start a Flink client (a short-lived JVM process) that submits an application to the JobManager.
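
Right after step (1), the JobManager may need a moment before the web interface answers. A minimal sketch for waiting until it does, assuming `curl` is installed and the REST endpoint is at its default address; the function name `wait_for_flink` and its retry parameters are illustrative and not part of the Flink distribution:

```shell
# Hypothetical helper: poll the REST endpoint until it responds or we give up.
# Usage: wait_for_flink [base-url] [attempts] [delay-seconds]
wait_for_flink() {
  local url="${1:-http://localhost:8081}" attempts="${2:-30}" delay="${3:-1}"
  local i=0
  while [ "$i" -lt "$attempts" ]; do
    # /overview is the REST API's cluster overview; -f makes curl fail on HTTP errors.
    if curl -fsS --max-time 2 "$url/overview" >/dev/null 2>&1; then
      return 0
    fi
    i=$((i + 1))
    sleep "$delay"
  done
  return 1
}
```

In a script, the job submission from step (3) could then be guarded with `wait_for_flink && ./bin/flink run ./examples/streaming/TopSpeedWindowing.jar`.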

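Deployment Modes #

Application Mode #

For high-level intuition behind the application mode, please refer to the deployment mode overview.

To start a Flink JobManager with an embedded application, we use the bin/standalone-job.sh script. We demonstrate this mode by locally starting the TopSpeedWindowing.jar example, running on a single TaskManager.

The application jar file needs to be available on the classpath. The easiest way to achieve that is to put the jar into the lib/ folder:

$ cp ./examples/streaming/TopSpeedWindowing.jar lib/

Then, we can launch the JobManager:

$ ./bin/standalone-job.sh start --job-classname org.apache.flink.streaming.examples.windowing.TopSpeedWindowing

The web interface is now available at localhost:8081. However, the application cannot start yet, because no TaskManager is running. Start one:

$ ./bin/taskmanager.sh start

Note: You can start multiple TaskManagers if your application needs more resources.

Stopping the services is also supported via the scripts. Call them multiple times if you want to stop multiple instances, or use stop-all: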

$ ./bin/taskmanager.sh stop
$ ./bin/standalone-job.sh stop
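
If the application needs more than one TaskManager, the start command can simply be repeated. A small sketch, assuming an environment variable `FLINK_HOME` (an assumption of this example; it defaults to the current directory) that points at the unpacked distribution; `start_taskmanagers` is a hypothetical helper:

```shell
# Hypothetical helper: start the given number of TaskManagers, one call each.
# Assumption: FLINK_HOME is the root of the unpacked Flink distribution.
start_taskmanagers() {
  local count="$1" i=0
  while [ "$i" -lt "$count" ]; do
    # Stop at the first failure so a broken setup is noticed early.
    "${FLINK_HOME:-.}/bin/taskmanager.sh" start || return 1
    i=$((i + 1))
  done
}
```

Calling `start_taskmanagers 3` from the distribution root would start three TaskManagers; `./bin/taskmanager.sh stop-all` stops them again.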

Session Mode #

For high-level intuition behind the session mode, please refer to the deployment mode overview.

Local deployment in session mode has already been described in the introduction above.

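Standalone Cluster Reference #

Configuration #

All available configuration options are listed on the configuration page(/docs/deployment/config/); in particular, the Basic Setup(/docs/deployment/config/#basic-setup) section contains good advice on configuring ports, memory, parallelism, etc.

The following scripts also allow configuration parameters to be set via dynamic properties:

jobmanager.sh
standalone-job.sh
taskmanager.sh
historyserver.sh

Example:

$ ./bin/jobmanager.sh start -D jobmanager.rpc.address=localhost -D rest.port=8081

Options set via dynamic properties override the options from flink-conf.yaml.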

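Debugging #

If Flink is behaving unexpectedly, we recommend looking at Flink's log files as a starting point for further investigation.

The log files are located in the logs/ directory. There is a .log file for each Flink service running on this machine. In the default configuration, log files are rotated on each start of a Flink service; older runs of a service have a number suffixed to the log file.

Alternatively, logs are available from the Flink web frontend (both for the JobManager and each TaskManager).

By default, Flink logs at the "INFO" log level, which provides basic information for all obvious issues. For cases where Flink seems to behave wrongly, lowering the log level to "DEBUG" is advised. The logging level is controlled via the conf/log4j.properties file. Setting rootLogger.level = DEBUG starts Flink at the DEBUG log level.

There is a dedicated page on logging(/docs/deployment/advanced/logging/) in Flink.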
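
Switching the log level amounts to editing one line of the logging configuration. The sketch below does this with `sed`, assuming the file contains a line starting with `rootLogger.level`, as described above; the function name `set_root_log_level` is made up for this example.

```shell
# Hypothetical helper: rewrite the rootLogger.level line of a properties file.
# Usage: set_root_log_level <properties-file> <LEVEL>
set_root_log_level() {
  local file="$1" level="$2" scratch
  scratch="$(mktemp)" || return 1
  # Write to a scratch file first: in-place sed flags differ between platforms.
  if sed "s/^rootLogger\.level[[:space:]]*=.*/rootLogger.level = $level/" "$file" > "$scratch"; then
    cat "$scratch" > "$file"   # keeps the original file's permissions
  fi
  rm -f "$scratch"
}
```

Passing the path of the logging properties file and `DEBUG` raises the verbosity; passing `INFO` restores the default level.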

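Component Management Scripts #

Starting and Stopping a Cluster #

bin/start-cluster.sh and bin/stop-cluster.sh rely on conf/masters and conf/workers to determine the number of cluster component instances.

If password-less SSH access to the listed machines is configured, and they share the same directory structure, the scripts also support starting and stopping instances remotely.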
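
Before starting instances remotely, it can help to confirm that password-less SSH really works for every listed host. The following is a sketch under assumptions: `list_hosts` and `check_ssh` are hypothetical helpers, entries may carry a `:port` suffix, and blank lines and `#` comments are skipped.

```shell
# Hypothetical helper: print the host names from a masters/workers style file,
# dropping comments, blank lines, and any ":port" suffix.
list_hosts() {
  sed -e 's/#.*//' -e 's/:.*//' -e 's/[[:space:]]//g' -e '/^$/d' "$1"
}

# Hypothetical helper: try a non-interactive login on each host.
check_ssh() {
  local host failed=0
  for host in $(list_hosts "$1"); do
    # BatchMode=yes makes ssh fail instead of prompting for a password.
    if ssh -o BatchMode=yes -o ConnectTimeout=5 "$host" true; then
      echo "ok:     $host"
    else
      echo "FAILED: $host"
      failed=1
    fi
  done
  return "$failed"
}
```

Running `check_ssh conf/workers` prints one line per host and returns a non-zero status if any login failed.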

Example 1: Start a cluster with 2 TaskManagers locally #

conf/masters contents:

localhost

conf/workers contents:

localhost
localhost

Example 2: Start a distributed cluster #

This assumes a cluster of 4 machines (master1, worker1, worker2, worker3) that can all reach each other over the network.

conf/masters contents:

master1

conf/workers contents:

worker1
worker2
worker3

Note that the configuration key jobmanager.rpc.address needs to be set to master1 for this to work.
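
The two files for such a setup can also be generated from a script. A minimal sketch, where `write_cluster_files` is a hypothetical helper that takes the conf directory, the master host, and then one or more worker hosts:

```shell
# Hypothetical helper: write the masters and workers files, one host per line.
# Usage: write_cluster_files <conf-dir> <master> <worker> [<worker> ...]
write_cluster_files() {
  if [ "$#" -lt 3 ]; then
    echo "usage: write_cluster_files <conf-dir> <master> <worker>..." >&2
    return 2
  fi
  local conf_dir="$1" master="$2"
  shift 2
  printf '%s\n' "$master" > "$conf_dir/masters"
  # The remaining arguments are the worker hosts.
  printf '%s\n' "$@" > "$conf_dir/workers"
}
```

For the four-machine example, the call would be `write_cluster_files conf master1 worker1 worker2 worker3`. The helper does not touch jobmanager.rpc.address, which still has to be configured separately.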

A third example, with a standby JobManager, is shown in the high-availability section.

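Starting and Stopping Flink Components #

The bin/jobmanager.sh and bin/taskmanager.sh scripts support starting the respective daemon in the background (using the start argument) or in the foreground (using start-foreground). In foreground mode, the logs are printed to standard output. This mode is useful for deployment scenarios where another process controls the Flink daemon (e.g. Docker).

The scripts can be called multiple times, for example if multiple TaskManagers are needed. The instances are tracked by the scripts and can be stopped one by one (using stop) or all together (using stop-all).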
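
When another process supervises Flink, a thin wrapper can select the daemon and run it in the foreground. The following is only a sketch of that idea: `run_component` and the `FLINK_HOME` variable are assumptions of this example and not part of the Flink distribution.

```shell
# Hypothetical wrapper: run one Flink daemon in the foreground.
# Assumption: FLINK_HOME is the root of the unpacked distribution (default: ".").
run_component() {
  local flink_home="${FLINK_HOME:-.}"
  case "$1" in
    jobmanager|taskmanager)
      # exec replaces this shell, so the supervisor manages the Flink script
      # itself rather than this wrapper.
      exec "$flink_home/bin/$1.sh" start-foreground
      ;;
    *)
      echo "usage: run_component jobmanager|taskmanager" >&2
      return 2
      ;;
  esac
}
```

A supervisor would then invoke `run_component taskmanager` (or `jobmanager`) as its command and read the logs from standard output.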

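Windows Cygwin Users #

If you are installing Flink from the git repository and you are using the Windows git shell, Cygwin can produce a failure similar to this one:

c:/flink/bin/start-cluster.sh: line 30: $'\r': command not found

This error occurs because git automatically transforms UNIX line endings to Windows-style line endings when running on Windows. The problem is that Cygwin can only deal with UNIX-style line endings. The solution is to adjust the Cygwin settings to deal with the correct line endings by following these steps: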

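1. Start a Cygwin shell.
2. Determine your home directory by entering
```
cd; pwd
```
This will return a path under the Cygwin root path.
3. Using NotePad, WordPad, or a different text editor, open the file .bash_profile in the home directory and append the following (if the file does not exist, you will have to create it):
```
export SHELLOPTS
set -o igncr
```
4. Save the file and open a new bash shell.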
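
To see whether a script is affected before changing any settings, one can look for carriage returns at the end of lines. A small sketch; `has_crlf` is a hypothetical helper, not something shipped with Flink or Cygwin:

```shell
# Hypothetical helper: succeed if the file contains Windows (CRLF) line endings.
has_crlf() {
  local cr
  cr="$(printf '\r')"
  # A carriage return directly before the end of a line indicates CRLF.
  grep -q "${cr}\$" "$1"
}

# Report affected scripts when run from the root of the distribution.
for script in bin/*.sh; do
  [ -f "$script" ] || continue
  if has_crlf "$script"; then
    echo "Windows line endings: $script"
  fi
done
```

If the loop prints nothing, the scripts already use UNIX line endings and the error above has a different cause.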

### High-Availability Setup

To enable HA for a standalone cluster, you have to use the [ZooKeeper HA services](/docs/deployment/ha/zookeeper_ha/).

Additionally, you have to configure your cluster to start multiple JobManagers.

To start an HA cluster, configure the *masters* file in `conf/masters`:

- **masters file**: The *masters file* contains all hosts on which JobManagers are started, together with the ports to which the web user interface binds.

```bash
master1:webUIPort1
[...]
masterX:webUIPortX
```

By default, the JobManager picks a random port for inter-process communication. You can change this via the high-availability.jobmanager.port key. This key accepts single ports (e.g. 50010), ranges (50000-50025), or a combination of both (50010,50011,50020-50025,50050-50075).

Example: Standalone HA Cluster with 2 JobManagers #

Configure the high availability mode and the ZooKeeper quorum in conf/flink-conf.yaml:

high-availability.type: zookeeper
high-availability.zookeeper.quorum: localhost:2181
high-availability.zookeeper.path.root: /flink
high-availability.cluster-id: /cluster_one # important: customize per cluster
high-availability.storageDir: hdfs:///flink/recovery

Configure the masters in conf/masters:

localhost:8081
localhost:8082

Configure the ZooKeeper server in conf/zoo.cfg (currently it is only possible to run a single ZooKeeper server per machine):

server.0=localhost:2888:3888

Start the ZooKeeper quorum:

$ ./bin/start-zookeeper-quorum.sh
Starting zookeeper daemon on host localhost.

Start the HA cluster:

$ ./bin/start-cluster.sh
Starting HA cluster with 2 masters and 1 peers in ZooKeeper quorum.
Starting standalonesession daemon on host localhost.
Starting standalonesession daemon on host localhost.
Starting taskexecutor daemon on host localhost.

Stop the cluster and the ZooKeeper quorum:

$ ./bin/stop-cluster.sh
Stopping taskexecutor daemon (pid: 7647) on localhost.
Stopping standalonesession daemon (pid: 7495) on host localhost.
Stopping standalonesession daemon (pid: 7349) on host localhost.
$ ./bin/stop-zookeeper-quorum.sh
Stopping zookeeper daemon (pid: 7101) on host localhost.

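User Jars and Classpath #

In standalone mode, the following jars are recognized as user jars and included in the user classpath:

Session Mode: the JAR file specified in the startup command.
Application Mode: the JAR file specified in the startup command and all JAR files in Flink's usrlib folder.

Please refer to the Debugging Classloading docs for details.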