グラフ ジェネレータ

Gelly はスケーラブルなグラフのジェネレータのコレクションを提供します。各ジェネレータは

  • 大きなデータセットを生成するため、並行可能
  • 並行度に関係なく同じグラフを生成する、スケールフリー
  • できる限り少ないオペレータを使う、倹約

グラフのジェネレータはビルダーパターンを使って設定されます。ジェネレータのオペレータの並行度はsetParallelism(parallelism)を呼ぶことで明示的に設定することができます。並行度を少なくするとメモリとネットワークバッファの割り当てを減らすでしょう。

最初にグラフ固有の設定、次に全てのジェネレータで一般的な設定、そして最後にgenerate()の呼び出しが呼ばれるべきです。以下の例は2次元の格子グラフを設定し、並行度を設定し、そしてグラフを生成します。

ExecutionEnvironment env = ExecutionEnvironment.getExecutionEnvironment();

boolean wrapEndpoints = false;

int parallelism = 4;

Graph<LongValue, NullValue, NullValue> graph = new GridGraph(env)
    .addDimension(2, wrapEndpoints)
    .addDimension(4, wrapEndpoints)
    .setParallelism(parallelism)
    .generate();
import org.apache.flink.api.scala._
import org.apache.flink.graph.generator.GridGraph

val env: ExecutionEnvironment = ExecutionEnvironment.getExecutionEnvironment

wrapEndpoints = false

val parallelism = 4

val graph = new GridGraph(env.getJavaEnv).addDimension(2, wrapEndpoints).addDimension(4, wrapEndpoints).setParallelism(parallelism).generate()

循環グラフ

円グラフ はオフセットの1つ以上の隣接する領域で設定された有向グラフです。Edges connect integer vertex IDs whose difference equals a configured offset. オフセットを持たない円グラフは空のグラフで、最大の領域を持つグラフは完全グラフです。

ExecutionEnvironment env = ExecutionEnvironment.getExecutionEnvironment();

long vertexCount = 5;

Graph<LongValue, NullValue, NullValue> graph = new CirculantGraph(env, vertexCount)
    .addRange(1, 2)
    .generate();
import org.apache.flink.api.scala._
import org.apache.flink.graph.generator.CirculantGraph

val env: ExecutionEnvironment = ExecutionEnvironment.getExecutionEnvironment

val vertexCount = 5

val graph = new CirculantGraph(env.getJavaEnv, vertexCount).addRange(1, 2).generate()

完全グラフ

それぞれ異なる頂点のペアに接続する無効グラフ。

ExecutionEnvironment env = ExecutionEnvironment.getExecutionEnvironment();

long vertexCount = 5;

Graph<LongValue, NullValue, NullValue> graph = new CompleteGraph(env, vertexCount)
    .generate();
import org.apache.flink.api.scala._
import org.apache.flink.graph.generator.CompleteGraph

val env: ExecutionEnvironment = ExecutionEnvironment.getExecutionEnvironment

val vertexCount = 5

val graph = new CompleteGraph(env.getJavaEnv, vertexCount).generate()
0 1 2 3 4

循環グラフ

An undirected graph where the set of edges form a single cycle by connecting each vertex to two adjacent vertices in a chained loop.

ExecutionEnvironment env = ExecutionEnvironment.getExecutionEnvironment();

long vertexCount = 5;

Graph<LongValue, NullValue, NullValue> graph = new CycleGraph(env, vertexCount)
    .generate();
import org.apache.flink.api.scala._
import org.apache.flink.graph.generator.CycleGraph

val env: ExecutionEnvironment = ExecutionEnvironment.getExecutionEnvironment

val vertexCount = 5

val graph = new CycleGraph(env.getJavaEnv, vertexCount).generate()
0 1 2 3 4

Echo グラフ

An echo graph is a circulant graph with n vertices defined by the width of a single range of offsets centered at n/2. 頂点は ‘far’ 頂点に接続され、それは ‘near’ 頂点に接続し、それは ‘far’ 頂点に接続し …。

ExecutionEnvironment env = ExecutionEnvironment.getExecutionEnvironment();

long vertexCount = 5;
long vertexDegree = 2;

Graph<LongValue, NullValue, NullValue> graph = new EchoGraph(env, vertexCount, vertexDegree)
    .generate();
import org.apache.flink.api.scala._
import org.apache.flink.graph.generator.EchoGraph

val env: ExecutionEnvironment = ExecutionEnvironment.getExecutionEnvironment

val vertexCount = 5
val vertexDegree = 2

val graph = new EchoGraph(env.getJavaEnv, vertexCount, vertexDegree).generate()

空グラフ

辺を含まないグラフ

ExecutionEnvironment env = ExecutionEnvironment.getExecutionEnvironment();

long vertexCount = 5;

Graph<LongValue, NullValue, NullValue> graph = new EmptyGraph(env, vertexCount)
    .generate();
import org.apache.flink.api.scala._
import org.apache.flink.graph.generator.EmptyGraph

val env: ExecutionEnvironment = ExecutionEnvironment.getExecutionEnvironment

val vertexCount = 5

val graph = new EmptyGraph(env.getJavaEnv, vertexCount).generate()
0 1 2 3 4

格子グラフ

An undirected graph connecting vertices in a regular tiling in one or more dimensions. 各次元は個別に設定されます。次元数が少なくとも3の時、エンドポイントはwrapEndpointsを設定することで任意に接続されます。以下の例をaddDimension(4, true)に変更すると、0 から 3へ、そして 47接続するでしょう。

ExecutionEnvironment env = ExecutionEnvironment.getExecutionEnvironment();

boolean wrapEndpoints = false;

Graph<LongValue, NullValue, NullValue> graph = new GridGraph(env)
    .addDimension(2, wrapEndpoints)
    .addDimension(4, wrapEndpoints)
    .generate();
import org.apache.flink.api.scala._
import org.apache.flink.graph.generator.GridGraph

val env: ExecutionEnvironment = ExecutionEnvironment.getExecutionEnvironment

val wrapEndpoints = false

val graph = new GridGraph(env.getJavaEnv).addDimension(2, wrapEndpoints).addDimension(4, wrapEndpoints).generate()
0 1 2 3 4 5 6 7

ハイパーキューブ グラフ

辺がn-次元のハイパーキューブを構成する無向グラフ。ハイパーキューブ内の各頂点は各次元のもう一つの頂点に接続します。

ExecutionEnvironment env = ExecutionEnvironment.getExecutionEnvironment();

long dimensions = 3;

Graph<LongValue, NullValue, NullValue> graph = new HypercubeGraph(env, dimensions)
    .generate();
import org.apache.flink.api.scala._
import org.apache.flink.graph.generator.HypercubeGraph

val env: ExecutionEnvironment = ExecutionEnvironment.getExecutionEnvironment

val dimensions = 3

val graph = new HypercubeGraph(env.getJavaEnv, dimensions).generate()
0 1 2 3 4 5 6 7

パス グラフ

An undirected graph where the set of edges form a single path by connecting two endpoint vertices with degree 1 and all midpoint vertices with degree 2. パスグラフはサイクルグラフから1つの辺を削除することで形成することができます。

ExecutionEnvironment env = ExecutionEnvironment.getExecutionEnvironment();

long vertexCount = 5

Graph<LongValue, NullValue, NullValue> graph = new PathGraph(env, vertexCount)
    .generate();
import org.apache.flink.api.scala._
import org.apache.flink.graph.generator.PathGraph

val env: ExecutionEnvironment = ExecutionEnvironment.getExecutionEnvironment

val vertexCount = 5

val graph = new PathGraph(env.getJavaEnv, vertexCount).generate()
0 1 2 3 4

RMat グラフ

A directed power-law multigraph generated using the Recursive Matrix (R-Mat) model.

RMat is a stochastic generator configured with a source of randomness implementing the RandomGenerableFactory interface. Provided implementations are JDKRandomGeneratorFactory and MersenneTwisterFactory. These generate an initial sequence of random values which are then used as seeds for generating the edges.

ExecutionEnvironment env = ExecutionEnvironment.getExecutionEnvironment();

RandomGenerableFactory<JDKRandomGenerator> rnd = new JDKRandomGeneratorFactory();

int vertexCount = 1 << scale;
int edgeCount = edgeFactor * vertexCount;

Graph<LongValue, NullValue, NullValue> graph = new RMatGraph<>(env, rnd, vertexCount, edgeCount)
    .generate();
import org.apache.flink.api.scala._
import org.apache.flink.graph.generator.RMatGraph

val env = ExecutionEnvironment.getExecutionEnvironment

val vertexCount = 1 << scale
val edgeCount = edgeFactor * vertexCount

val graph = new RMatGraph(env.getJavaEnv, rnd, vertexCount, edgeCount).generate()

The default RMat contants can be overridden as shown in the following example. The contants define the interdependence of bits from each generated edge’s source and target labels. The RMat noise can be enabled and progressively perturbs the contants while generating each edge.

The RMat generator can be configured to produce a simple graph by removing self-loops and duplicate edges. Symmetrization is performed either by a “clip-and-flip” throwing away the half matrix above the diagonal or a full “flip” preserving and mirroring all edges.

ExecutionEnvironment env = ExecutionEnvironment.getExecutionEnvironment();

RandomGenerableFactory<JDKRandomGenerator> rnd = new JDKRandomGeneratorFactory();

int vertexCount = 1 << scale;
int edgeCount = edgeFactor * vertexCount;

boolean clipAndFlip = false;

Graph<LongValue, NullValue, NullValue> graph = new RMatGraph<>(env, rnd, vertexCount, edgeCount)
    .setConstants(0.57f, 0.19f, 0.19f)
    .setNoise(true, 0.10f)
    .generate();
import org.apache.flink.api.scala._
import org.apache.flink.graph.generator.RMatGraph

val env = ExecutionEnvironment.getExecutionEnvironment

val vertexCount = 1 << scale
val edgeCount = edgeFactor * vertexCount

clipAndFlip = false

val graph = new RMatGraph(env.getJavaEnv, rnd, vertexCount, edgeCount).setConstants(0.57f, 0.19f, 0.19f).setNoise(true, 0.10f).generate()

シングルトン エッジ グラフ

An undirected graph containing isolated two-paths where every vertex has degree 1.

ExecutionEnvironment env = ExecutionEnvironment.getExecutionEnvironment();

long vertexPairCount = 4

// note: configured with the number of vertex pairs
Graph<LongValue, NullValue, NullValue> graph = new SingletonEdgeGraph(env, vertexPairCount)
    .generate();
import org.apache.flink.api.scala._
import org.apache.flink.graph.generator.SingletonEdgeGraph

val env: ExecutionEnvironment = ExecutionEnvironment.getExecutionEnvironment

val vertexPairCount = 4

// note: configured with the number of vertex pairs
val graph = new SingletonEdgeGraph(env.getJavaEnv, vertexPairCount).generate()
0 1 2 3 4 5 6 7

スター グラフ

An undirected graph containing a single central vertex connected to all other leaf vertices.

ExecutionEnvironment env = ExecutionEnvironment.getExecutionEnvironment();

long vertexCount = 6;

Graph<LongValue, NullValue, NullValue> graph = new StarGraph(env, vertexCount)
    .generate();
import org.apache.flink.api.scala._
import org.apache.flink.graph.generator.StarGraph

val env: ExecutionEnvironment = ExecutionEnvironment.getExecutionEnvironment

val vertexCount = 6

val graph = new StarGraph(env.getJavaEnv, vertexCount).generate()
0 1 2 3 4 5

上に戻る

TOP
inserted by FC2 system