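Rebalance is not a required operation for mogile. However, it is good practice to keep an eye on your device storage and to decide where files should live. If you add three empty hosts, it is a good idea to move some existing files onto them, so that new files (which may be accessed more often than old ones) can spread across the free space on all hosts.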

MogileFS replication works best when it has many options for where to replicate files to. So keep it happy :)

Rebalance policy

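The new rebalance/drain system is driven by a policy. You define a string of options, and the system evaluates those options and decides what to do. As of this writing rebalance is still under development, so the options listed here may not be the complete set.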

To view the rebalance settings:

$ mogadm rebalance settings
             rebal_policy = from_percent_used=95 to_percent_free=50 limit_type=device limit_by=size limit=5g fid_age=old

To set a policy:

$ mogadm rebalance policy --options="from_hosts=3 to_percent_free=50"
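The policy is just a space-separated string of key=value pairs, so it can be assembled and checked programmatically before it is handed to mogadm. The following is a minimal sketch; the helper names build_policy and parse_policy are hypothetical and not part of MogileFS.

```python
def build_policy(options):
    """Join an options dict into a space-separated key=value policy string.

    List values (e.g. host or device ids) are joined with commas.
    Hypothetical helper, not part of MogileFS.
    """
    parts = []
    for key, value in options.items():
        if isinstance(value, (list, tuple)):
            value = ",".join(str(v) for v in value)
        parts.append(f"{key}={value}")
    return " ".join(parts)


def parse_policy(policy):
    """Split a policy string back into a dict of raw string values."""
    return dict(item.split("=", 1) for item in policy.split())


policy = build_policy({"from_hosts": [3], "to_percent_free": 50})
# Print the command line rather than running it, so it can be reviewed first.
print(f'mogadm rebalance policy --options="{policy}"')
```

Parsing the string shown by mogadm rebalance settings with parse_policy gives a dict that is easier to compare against the policy you intended to set.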

When you first start a rebalance, mogile works out the list of source devices and saves it. The list of destination devices, however, is re-evaluated every few seconds. This is both to avoid a ping-pong state and to always find the best possible candidates for a destination.

Policy options

(and their defaults)

    # source
    from_hosts => [],           # host ids (not names).
    from_devices => [],         # device ids.
    from_percent_used => undef, # 0.nn * 100
    from_percent_free => undef,
    from_space_used => undef,
    from_space_free => undef,
    fid_age => 'old',           # old|new
    limit_type => 'device',     # global|device
    limit_by => 'none',         # size|count|percent|none
    limit => undef,             # 100g|10%|5000

    # target
    to_hosts => [],
    to_devices => [],
    to_percent_used => undef,
    to_percent_free => undef,
    to_space_used => undef,
    to_space_free => undef,
    not_to_hosts => [],
    not_to_devices => [],
    use_dest_devs => 'all',     # all|N (list up to N devices to rep pol)
    leave_in_drain_mode => 0,

from_(percent|space)_*: Pull fids from devices with "at least this much" of the named quantity, for example at least this much space used or at least this percent free. Space options are expressed in megabytes.

to_(percent|space)_*: Same as above, except used in limiting possible destination devices.

(from|to)_hosts: Takes a comma-separated list of host IDs and selects all devices on those hosts.

(from|to)_devices: Directly specifies which devices you want to pull files from or place files on. Other options (percent_used, etc.) may narrow this list further. Note: this is a comma-separated list of device IDs, for example from_devices=199,201,233

not_to_*: Excludes specific devices or hosts from the destinations.

fid_age: Defines whether rebalance will choose "old" (ascending numbered) fids or "new" (descending numbered) fids from the device first. Because MogileFS uses an incrementing fid counter, files are naturally ordered by age. In some setups "old" files may be accessed less often than "new" files, which may influence your rebalancing decisions.
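Since fids are handed out from an incrementing counter, "old first" and "new first" reduce to an ascending or descending sort on the fid number. A small sketch of that ordering follows; order_fids is a hypothetical name used only for illustration.

```python
def order_fids(fids, fid_age="old"):
    """Return fids in the order a rebalance would pick them.

    "old" means ascending fid numbers, "new" means descending.
    Hypothetical helper for illustration only.
    """
    if fid_age not in ("old", "new"):
        raise ValueError(f"fid_age must be 'old' or 'new', got {fid_age!r}")
    return sorted(fids, reverse=(fid_age == "new"))
```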

limit_type: (global|device) Whether the specified limit is applied globally (drain 5000g in total from any of these devices) or per device (pick 12 devices and drain 10g from each).

limit_by: (size|count|percent|none) What the limit is measured in. Note: as of this writing, "percent" is not implemented. 'count' limits the number of files to move, 'size' limits the number of bytes to copy, and 'none' means remove all files from the device.

limit: The limit as defined above. 'size' limits by bytes by default, but accepts human-readable suffixes (500m, 10g, 13t, etc.)
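To make the suffix handling concrete, here is a minimal sketch of turning a size limit such as 500m or 10g into bytes. It assumes 1024-based multipliers and a single-letter suffix; the text above does not say which base MogileFS uses, so treat that as an assumption. parse_size_limit is a hypothetical name.

```python
import re

# Assumption: suffixes are powers of 1024. MogileFS may use a different base.
_UNITS = {"": 1, "b": 1, "k": 1024, "m": 1024**2, "g": 1024**3, "t": 1024**4}


def parse_size_limit(text):
    """Convert a limit such as '5000', '500m' or '10g' into a byte count."""
    match = re.fullmatch(r"(\d+)([bkmgt]?)", text.strip().lower())
    if match is None:
        raise ValueError(f"unrecognised size limit: {text!r}")
    number, unit = match.groups()
    return int(number) * _UNITS[unit]
```

A value without a suffix is taken as plain bytes, matching the default described above.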

use_dest_devs: (all|N) After applying all filters, you may have any number of destination devices: a handful, dozens, hundreds. This limits the number of devices that replication will later consider. It is mainly an optimization; if you have a large number of devices, you may want to set this to a sensible number.

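leave_in_drain_mode: (0|1) In previous versions, setting a device to 'drain' meant that mogile would automatically hammer the device repeatedly and remove its files. Now it simply means "do not put new files here". While a rebalance is working on a device, the device is switched from alive to drain mode. If you want to emulate the old drain behavior, set this value to 1. Enabling it without setting a limit will remove all files from the device and never allow new files to be added to it again.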

Running a rebalance

Create a policy as described above. You can review it with mogadm rebalance settings.

Testing

$ mogadm rebalance test
Tested rebalance policy...
Policy: etc

Source devices:
 - 100
 - 102
 - 103
 - 104
Destination devices:
 - 156
 - 157
 - 158
 - 159

Before starting a rebalance, you should review what devices the policy would match.

Hopefully future versions will display more information about the devices, but for now you may match the lists up against the output of mogadm check mentally :)

Starting

$ mogadm rebalance start
$ mogadm rebalance stop
$ mogadm rebalance reset

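Stopping a rebalance pauses it, but entries already in REBAL_QUEUE will continue to be processed. You can watch the queue with mogstats --stats="general-queues". It may be preferable to wait for this queue to finish before restarting.

To restart a rebalance from scratch, either change the policy or run mogadm rebalance reset while it is stopped.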

Monitoring

$ mogadm rebalance status
Rebalance is running
Rebalance status:
             bytes_queued = 126008251219
           completed_devs = ,102,125,151,148
              fids_queued = 519021
             sdev_current = 119
             sdev_lastfid = 54646960511250969
               sdev_limit = 2840763873
              source_devs = 108,115,103,113,152,142,107,141,100

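The rebalance status output is simply a dump of internal state plus a few counters. The status is updated every few minutes, after the job master runs.

sdev_current is the device currently being worked on.

sdev_limit is the number of bytes (or files) that rebalance still intends to move.

fids_queued is a global counter of how many fids have been queued since the start.

bytes_queued is the number of bytes moved since the start.

The state is kept after a rebalance finishes, so you can still inspect the final state.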
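Because the status output is a plain list of key = value lines, it is easy to turn into a dict for monitoring scripts. The sketch below parses text in the shape shown above; parse_status and split_dev_list are hypothetical helpers, and the sample text is copied from the example output rather than fetched from a live tracker.

```python
def parse_status(output):
    """Parse 'key = value' lines from rebalance status output into a dict.

    Lines without '=' (such as 'Rebalance is running') are skipped.
    """
    status = {}
    for line in output.splitlines():
        key, sep, value = line.partition("=")
        if not sep:
            continue
        status[key.strip()] = value.strip()
    return status


def split_dev_list(value):
    """Turn '108,115,103' or ',102,125' into a list of device ids."""
    return [int(part) for part in value.split(",") if part]


sample = """Rebalance is running
Rebalance status:
             bytes_queued = 126008251219
           completed_devs = ,102,125,151,148
              fids_queued = 519021
             sdev_current = 119
"""
status = parse_status(sample)
done = split_dev_list(status["completed_devs"])
```

Note that completed_devs can start with a comma, as in the example output, so empty items are dropped when splitting.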

It is a good idea to watch mogile's syslog output during a rebalance. If it runs into fids it cannot rebalance for some reason, the information is sent to syslog (or !watch if you telnet to a tracker).

Tuning

Most of the tuning for FSCK also applies to rebalance, so see http://code.google.com/p/mogilefs/wiki/FSCK#Tuning_FSCK for tips on speeding things up.

Basically, increase the number of replicate jobs across your trackers.

telnet trackername trackerport
!want 5 replicate

There is a configurable setting, queue_rate_for_rebal, which defaults to 60 (in MogileFS 2.45). Most people can leave it alone. If adding more replicate processes stops helping and the load across trackers and the database is still low, you may want to look into tweaking it.

Rebalance examples

Making everything roughly even

As of this writing there are some missing shortcuts for evening out your file distribution. There are two ways to approach it:

One is to calculate how much disk space to move off each of the "fuller" devices, and where to put it, e.g.:

from_percent_used=90 to_percent_used=10 limit_type=device limit_by=size limit=5g

Given two hosts with four full disks each, and two new hosts with empty disks, the settings above will move 40g of data to the new hosts.
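The 40g figure is just the per-device limit multiplied by the number of source devices: 2 hosts x 4 disks x 5g. A short sketch of that arithmetic, also covering limit_type=global, is shown here; planned_move_gb is a hypothetical helper and the result is an upper bound, since a device may hold less than the limit.

```python
def planned_move_gb(source_devices, limit_gb, limit_type="device"):
    """Upper bound on data moved by one rebalance run, in gigabytes.

    limit_type='device' applies the limit to every source device,
    limit_type='global' applies it once across all of them.
    """
    if limit_type == "device":
        return source_devices * limit_gb
    if limit_type == "global":
        return limit_gb
    raise ValueError(f"unknown limit_type: {limit_type!r}")


full_devices = 2 * 4  # two hosts with four full disks each
total = planned_move_gb(full_devices, 5)
```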

The other approach is to run rebalance repeatedly until mogadm rebalance test no longer matches the new devices.

from_percent_used=51 to_percent_free=51 limit_type=device limit_by=size limit=1g

The above will take files off anything more than 51% full, and move 1g from each such device toward anything at least 51% empty. The theory is that if you run this enough times, it should trend towards even. Be careful: if the whole cluster is more than 50% full, you will end up re-running rebalance forever.
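The repeated-run approach can be illustrated with a toy simulation. This is only a sketch under simplifying assumptions: sizes are whole gigabytes, each pass moves exactly 1g from every source device, and the emptiest eligible device is always chosen as the destination. Real MogileFS picks destinations through its replication policy, so the function names and the selection rule here are hypothetical.

```python
def rebalance_pass(used, capacity, step=1):
    """Simulate one run: move `step` GB off every device that is >= 51% used.

    `used` and `capacity` map device id -> whole gigabytes.
    Returns the number of gigabytes moved in this pass.
    """
    sources = [d for d in used if used[d] * 100 >= 51 * capacity[d]]
    moved = 0
    for src in sources:
        # Destinations are re-evaluated per source because free space changes.
        dests = [d for d in used
                 if (capacity[d] - used[d]) * 100 >= 51 * capacity[d]]
        if not dests:
            break  # nowhere left to put data
        dst = min(dests, key=lambda d: used[d] / capacity[d])
        used[src] -= step
        used[dst] += step
        moved += step
    return moved


def run_until_settled(used, capacity, max_passes=1000):
    """Repeat passes until nothing moves; return how many passes moved data."""
    passes = 0
    while passes < max_passes and rebalance_pass(used, capacity) > 0:
        passes += 1
    return passes


capacity = {1: 100, 2: 100, 3: 100, 4: 100}
under_half = {1: 90, 2: 90, 3: 0, 4: 0}    # cluster is 45% full overall
over_half = {1: 90, 2: 90, 3: 30, 4: 30}   # cluster is 60% full overall
passes_under = run_until_settled(under_half, capacity)
passes_over = run_until_settled(over_half, capacity)
```

In the first case the source devices eventually drop below the 51% threshold and the policy stops matching them. In the second case the destinations fill up first, so the full devices keep matching the policy with nowhere to move data, which is the situation the warning above describes.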

How rebalance works

Filtering devices

Simply put, the above document tells you how to build a string which filters the list of devices and selects which ones should or shouldn't be drained from, or should or shouldn't be replicated to.

Once the devices are selected, rebalance will run them through one at a time and queue up fids for the replication workers to actually rebalance.
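The filtering step can be pictured as two passes over the device list, one for sources and one for destinations. The sketch below covers only a few of the options (from_hosts, from_devices, from_percent_used, to_percent_free, not_to_hosts, not_to_devices). The Device class, its field names, and the rule that a source device is never also a destination are assumptions made for illustration, not MogileFS internals.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Device:
    # Hypothetical device record; field names are illustrative only.
    devid: int
    hostid: int
    mb_used: int
    mb_total: int

    @property
    def percent_used(self):
        return 100.0 * self.mb_used / self.mb_total


def select_sources(devices, from_hosts=(), from_devices=(),
                   from_percent_used=None):
    """Return ids of devices that pass every source filter that is set."""
    chosen = []
    for dev in devices:
        if from_hosts and dev.hostid not in from_hosts:
            continue
        if from_devices and dev.devid not in from_devices:
            continue
        if from_percent_used is not None and dev.percent_used < from_percent_used:
            continue
        chosen.append(dev.devid)
    return chosen


def select_destinations(devices, sources, to_percent_free=None,
                        not_to_hosts=(), not_to_devices=()):
    """Return ids of devices that may receive files."""
    chosen = []
    for dev in devices:
        if dev.devid in sources:
            continue  # assumption: never rebalance onto a source device
        if dev.hostid in not_to_hosts or dev.devid in not_to_devices:
            continue
        percent_free = 100.0 - dev.percent_used
        if to_percent_free is not None and percent_free < to_percent_free:
            continue
        chosen.append(dev.devid)
    return chosen


devices = [
    Device(100, 1, 950, 1000),
    Device(102, 1, 960, 1000),
    Device(156, 2, 100, 1000),
    Device(157, 3, 200, 1000),
]
sources = select_sources(devices, from_percent_used=90)
dests = select_destinations(devices, sources, to_percent_free=50,
                            not_to_hosts=(3,))
```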

Rebalance internals

Below is a high level example of the flow a replicate worker has for rebalance. This is useful to know in case of seeing "would_worsen" errors on small clusters.

fid 5 has three copies, one each on dev 31, 32, and 33.
Rebalance decides to pull the fid off dev33.

Rebalance tells the replication code: "Replicate FID 5, but you do not have this copy on dev33, and also do not put this FID on dev33."

This is a constraint so that rebalance can never accidentally leave a fid worse off than it was before; if anything goes wrong, the fid should end up "too_happy" and not sad. It also makes it possible to move fids that have only one valid copy; otherwise the drain code would destroy the fid.

Now there are four copies, and the rebalance code removes the one on dev33.

You are left with three copies again, balanced away from dev33.
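The flow above can be sketched as "copy first, delete second". In the toy model below, replication is asked to reach the wanted copy count while pretending the copy on the drained device does not exist, and only after the extra copy exists is the old one removed. The function name, the status strings, and the first-come choice of destination are hypothetical; MogileFS delegates the choice to its replication policy.

```python
def rebalance_fid(copies, drain_dev, candidates, wanted_copies):
    """Move one fid off drain_dev without ever dropping below its copy count.

    copies: device ids currently holding the fid.
    candidates: device ids replication is allowed to consider.
    Returns (new list of device ids, status string).
    """
    # Replication is told to ignore the copy on drain_dev ...
    visible = [d for d in copies if d != drain_dev]
    # ... and never to place a new copy there.
    allowed = [d for d in candidates if d != drain_dev and d not in copies]
    needed = max(0, wanted_copies - len(visible))
    if needed > len(allowed):
        # No safe destination: leave the fid exactly as it was.
        return list(copies), "unchanged"
    with_extra = list(copies) + allowed[:needed]  # temporarily "too happy"
    final = [d for d in with_extra if d != drain_dev]
    return final, "moved"


moved = rebalance_fid([31, 32, 33], 33, [31, 32, 33, 34], wanted_copies=3)
stuck = rebalance_fid([31, 32, 33], 33, [31, 32, 33], wanted_copies=3)
```

In the second call there is no device other than dev33 left to take a copy, so the fid is left untouched. A small cluster can easily end up in this position.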
