This is a high-level glossary of terms. For more detail on a particular term, click one of the related links. You can also search this site to find the information you are looking for. If a term you would like to see defined is missing, please email docs@clustrix.com.

Jump to:

A | B | C | D | E | F | G | H | I | J | K | L | M | N | P | Q | R | S | T | U | V | W | X | Y | Z

A

ACID

ACID (Atomicity, Consistency, Isolation, Durability) refers to the characteristics of a database that guarantee that transactions are processed reliably.

allnodes

refers to the ability to set replicas = allnodes.

Atomic

refers to the characteristic of a transaction that is all or nothing.

B

B-Tree

is a standard computer science data structure used for fast access. See B-Tree at Wikipedia.

Base Representation

aka baserep. The representation of a relation that contains all columns and is indexed by the primary key. If no primary key is defined, ClustrixDB assigns a unique rowid key.

BigC

is ClustrixDB's garbage collection process that cleans up the undo logs needed to roll back running transactions. ClustrixDB uses multi-version concurrency control (MVCC), and as part of this it must keep the system state of a transaction while that transaction is open, so it cannot clean up past the oldest running transaction. Once the oldest running transaction is committed, "bigc" cleans up the various undo logs. See also "pinning BigC".
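The cleanup boundary described above can be sketched in a few lines of Python. This is an illustrative model only, not ClustrixDB's implementation: `undo_log`, `open_xids`, and `collectable` are hypothetical names, and transaction ids are assumed to be increasing integers.

```python
def collectable(undo_log, open_xids):
    """Return the undo entries that are older than the oldest open transaction.

    undo_log:  list of (xid, payload) tuples (hypothetical layout).
    open_xids: ids of the transactions that are still running.
    """
    if not open_xids:
        # No transaction is open, so nothing pins the undo log.
        return list(undo_log)
    horizon = min(open_xids)  # the oldest running transaction
    return [entry for entry in undo_log if entry[0] < horizon]


undo_log = [(1, "a"), (2, "b"), (5, "c"), (7, "d")]
print(collectable(undo_log, {5, 9}))        # [(1, 'a'), (2, 'b')]
print(len(collectable(undo_log, set())))    # 4
```

In this model a single long-running transaction (id 5 here) holds back cleanup of everything written after it started, which is the situation the "pinning BigC" entry refers to.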

Broadcast

Broadcasting refers to a method of transferring a message to all recipients simultaneously. ClustrixDB leverages distributed computing to avoid broadcasts. See how other databases do joins with broadcast.

C

cid

Commit Identifier that marks when transactional changes become visible to other transactions.

Cluster

A group of ClustrixDB nodes connected to provide a redundant, scalable RDBMS.

Consistent

A consistent transaction does not violate any referential integrity during its execution.

Cost

ClustrixDB uses a cost-based model for the query optimizer (Sierra) that uses a cost factor based on I/O, CPU usage, and latency.

D

Data Distribution

ClustrixDB leverages fine-grained data distribution and a shared-nothing architecture to provide scalability.

Database

A collection of tables, or relations. This is also sometimes referred to as a schema. (The ClustrixDB term for tables is "relations.")

Distributed Aggregates

Refers to the ability for ClustrixDB to perform aggregate queries (e.g. OLAP) in a distributed manner.

Distribution Key

The distribution key is some prefix of a representation's key columns and is used to distribute data across the cluster.

Durable

A database is durable if it provides a guarantee that transactions that have committed will survive permanently, including in the event of unexpected power loss or other hardware failure.

E

Evaluation Model

This describes how a query is evaluated in ClustrixDB.

F

Flow Control

The GTM and other subsystems use flow control to prevent message senders from outpacing receivers, and to prevent receiving nodes' memory from filling up with unprocessed messages.

Forward

Sending a row or rows to another node for further processing.

Fragment

A pre-compiled part of a query, usually sent to another node for processing.

G

Global Transaction Manager (GTM)

is a subsystem that manages the atomic commitment of transactions across the cluster and ensures that all nodes involved come to the same decision every time.

Group Change

A Group Change is the event that occurs when a cluster forms a new group. This occurs when a node joins or leaves the cluster group. This can also manifest in Clustrix Insight as 503 Errors while the Group Change occurs.

H

I

iid

Invocation Id that marks the beginning of a statement within a transaction.

Index Distribution

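ClustrixDB uses independent index distribution: each index is distributed across the cluster on its own, rather than being stored alongside the rows of the base representation.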

Isolation

is a property that defines how and when changes made by one transaction are visible to other concurrent transactions.

J

K

L

Lumpy

Generally refers to a poor data distribution.

M

Massively Parallel Processing

refers to the ability to leverage a large number of processors to perform a set of coordinated computations in parallel.

Multi-Version Concurrency Control (MVCC)

is a method used to implement concurrency and consistency in a distributed database environment. One of the original papers on this topic is Concurrency Control in Distributed Database Systems. ClustrixDB implements a modified version of this algorithm that provides optimizations for modern database workloads.

N

Node

A single server running the ClustrixDB software. Multiple nodes connect to form a cluster.

O

OID

is an internal Object Identifier used by ClustrixDB to describe a "thing": types, relations, and rows are all examples of things that have OIDs.

OLAP

Online Analytical Processing.

OLTP

Online Transaction Processing.

P

PD (Probability Distribution)

Probability Distributions are tracked for values in each relation to predict the size of the result set for a given query.

Protected

refers to the status of the cluster when at least two replicas of every slice are available. See also Reprotect.

Q

Query Optimizer

The job of the optimizer is to determine which execution plan uses the least amount of resources. Typically this is done by assigning a cost to each candidate plan and choosing the lowest-cost plan.
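A minimal sketch of cost-based plan selection, written in Python for illustration. The cost factors (I/O, CPU, latency) come from the Cost entry in this glossary; the `Plan` class, the equal weights, and the sample numbers are assumptions made for the example and do not describe how Sierra actually computes costs.

```python
from dataclasses import dataclass


@dataclass
class Plan:
    """A candidate execution plan with hypothetical resource estimates."""
    name: str
    io: float
    cpu: float
    latency: float


def cost(plan, w_io=1.0, w_cpu=1.0, w_latency=1.0):
    # Weighted sum of the three cost factors; the weights are assumed.
    return w_io * plan.io + w_cpu * plan.cpu + w_latency * plan.latency


def choose_plan(plans):
    # Pick the candidate with the lowest estimated cost.
    return min(plans, key=cost)


plans = [
    Plan("full_scan", io=100.0, cpu=20.0, latency=5.0),
    Plan("index_lookup", io=4.0, cpu=2.0, latency=5.0),
]
print(choose_plan(plans).name)  # index_lookup
```

The quality of the choice depends entirely on the estimates fed into `cost`, which is why the optimizer relies on statistics such as the probability distributions described under PD.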

Queue (Recovery)

Queues are used to track changes to data that may have occurred for a given node while it was unavailable to the cluster. See also Queue Replay and Queue Flip.

R

Rebalancer

The ClustrixDB Rebalancer automatically moves, copies, redistributes, and re-ranks data across the cluster. For more on this topic, see ClustrixDB Rebalancer.

Relation

is a table.

Replica

ClustrixDB maintains multiple copies of data for fault tolerance and availability. By rule, copies are stored on different nodes.

Representation

refers to a collection of indexes for a table. Each representation is made up of a series of slices.

Reprotect

When a slice has fewer replicas than desired, the Rebalancer will create a new copy of the slice on a different node.

Reslicing

As the data set grows, ClustrixDB will automatically and incrementally redistribute the dataset one or more slices at a time.

RIGR

is the Relational Intermediate Language, a version of SQL used in the ClustrixDB internals.

S

Sierra

Sierra is the name given to the ClustrixDB Query Optimizer.

Slice

ClustrixDB breaks up each representation into a collection of logical slices. Rows are assigned to slices according to the results of a hashing function. See also Reslicing.
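The idea of hashing a row to a slice can be sketched as follows. This is a simplified illustration, not ClustrixDB's algorithm: `slice_for` is a hypothetical helper, `zlib.crc32` stands in for the hash function (which this glossary does not specify), and taking the hash modulo the slice count is an assumption. The distribution key is modeled as a prefix of the key columns, as described under Distribution Key.

```python
import zlib


def slice_for(row_key, distribution_width, num_slices):
    """Pick a slice index for a row.

    row_key:            tuple of the representation's key column values.
    distribution_width: number of leading key columns in the distribution key.
    num_slices:         number of slices in the representation.
    """
    dist_key = row_key[:distribution_width]
    # crc32 is a stand-in hash; the modulo mapping is assumed for simplicity.
    digest = zlib.crc32(repr(dist_key).encode("utf-8"))
    return digest % num_slices


a = slice_for((42, "alice"), 1, 4)
b = slice_for((42, "bob"), 1, 4)
print(a == b)  # True: both rows share the distribution key (42,)
```

Rows that agree on the distribution key always land in the same slice, while rows with different keys are spread across slices by the hash.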

Soft Fail

is an operation that removes a node from a cluster.

T

U

Under-Protected

Refers to the state of the cluster when it does not have at least two copies (replicas) of each slice.
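The Protected and Under-Protected states can be expressed as a simple check over the replica layout. The sketch below is illustrative Python: the `layout` mapping, the node names, and both helper functions are hypothetical, and the threshold of two replicas is taken from the definitions in this glossary.

```python
def is_protected(replicas_by_slice, minimum=2):
    """True when every slice has at least `minimum` replicas."""
    return all(len(nodes) >= minimum for nodes in replicas_by_slice.values())


def under_protected_slices(replicas_by_slice, minimum=2):
    """Names of the slices that have fewer than `minimum` replicas."""
    return sorted(
        name for name, nodes in replicas_by_slice.items() if len(nodes) < minimum
    )


# Each slice maps to the set of nodes holding a replica (distinct nodes by rule).
layout = {"s1": {"node1", "node2"}, "s2": {"node3"}}
print(is_protected(layout))            # False
print(under_protected_slices(layout))  # ['s2']
```

In this model, reprotecting means adding a replica of `s2` on a node other than `node3`, after which `is_protected` returns True.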

V

vrel

Virtual relation, often used to represent system information.

W

WAL

The Write Ahead Log is used to log every command that the user executes.

X

xid

is an identifier used by ClustrixDB internals to denote the logical start of a Transaction.
