This is a high-level glossary of terms. To get more detail on a particular term, click on one of the related links. To find the information you are looking for, you can also try searching this site. If there is a term that is not here that you would like to see defined, please email docs@clustrix.com.
A
ACID
ACID (Atomicity, Consistency, Isolation, Durability) refers to the characteristics of a database that guarantee that transactions are processed reliably.
allnodes
refers to the ability to set replicas = allnodes.
Atomic
refers to the characteristic of a transaction that is all or nothing.
B
B-Tree
is a standard computer science data structure used for fast access. See B-Tree at Wikipedia.
Base Representation
aka baserep. The relation representation that contains all columns and is indexed by the primary key. If no primary key is defined, ClustrixDB assigns a unique rowid key.
BigC
is ClustrixDB's garbage collection process that cleans up the undo logs needed to roll back running transactions. ClustrixDB uses multi-version concurrency control (MVCC), so it must keep the system state a transaction sees for as long as that transaction is open, and it cannot clean up past the oldest running transaction. Once the oldest running transaction is committed, "bigc" cleans up the various undo logs. See also "pinning BigC".
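The rule that cleanup cannot pass the oldest running transaction can be sketched as follows. This is only an illustration, not ClustrixDB's implementation: the function names and the use of plain integers as logical timestamps are assumptions made for the example.

```python
def cleanup_horizon(running_txn_starts, newest_commit):
    """Return the point before which undo entries are no longer needed.

    With no running transactions, everything up to the newest commit can
    be cleaned; otherwise cleanup stops at the oldest running transaction.
    """
    return min(running_txn_starts, default=newest_commit)


def collect(undo_log, running_txn_starts, newest_commit):
    """Split undo entries into (kept, removed) around the cleanup horizon."""
    horizon = cleanup_horizon(running_txn_starts, newest_commit)
    kept = [entry for entry in undo_log if entry >= horizon]
    removed = [entry for entry in undo_log if entry < horizon]
    return kept, removed


# Hypothetical data: each undo entry is just the logical time it was written.
undo_log = [3, 7, 12, 18]
kept, removed = collect(undo_log, running_txn_starts=[10, 15], newest_commit=20)
```

In this sketch a single long-running transaction holds the horizon back for everyone, so undo entries accumulate until it finishes.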
Broadcast
Broadcasting refers to a method of transferring a message to all recipients simultaneously. ClustrixDB leverages distributed computing to avoid broadcasts. See how other databases do joins with broadcast.
C
cid
Commit Identifier that marks when transactional changes become visible to other transactions.
Cluster
A group of ClustrixDB nodes connected to provide a redundant, scalable RDBMS.
A consistent transaction does not violate any referential integrity during its execution.
ClustrixDB uses a cost-based model for the query optimizer (Sierra) that uses a cost factor based on I/O, CPU usage, and latency.
D
ClustrixDB leverages fine-grained data distribution and a shared-nothing architecture to provide scalability.
Database
A collection of tables, or relations. This is also sometimes referred to as a schema. (The ClustrixDB term for tables is "relations.")
Refers to the ability for ClustrixDB to perform aggregate queries (e.g. OLAP) in a distributed manner.
Distribution Key
The distribution key is some prefix of a representation's key columns and is used to distribute data across the cluster.
A database is durable if it provides a guarantee that transactions that have committed will survive permanently, including in the event of unexpected power loss or other hardware failure.
E
This describes how a query is evaluated in ClustrixDB.
F
Flow Control
The GTM and other subsystems use flow control to prevent message senders from outpacing receivers, and to prevent receiving nodes' memory from filling up with unprocessed messages.
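One common way to keep a sender from outpacing a receiver is credit-based flow control, sketched below. The source does not say which mechanism ClustrixDB uses, so the credit scheme and the `CreditedChannel` class are assumptions chosen purely for illustration.

```python
from collections import deque


class CreditedChannel:
    """Toy channel: the sender may only send while it holds credits."""

    def __init__(self, credits):
        self.credits = credits   # how many unprocessed messages are allowed
        self.inbox = deque()     # messages waiting on the receiving side

    def try_send(self, message):
        """Queue a message if a credit is available; otherwise refuse."""
        if self.credits == 0:
            return False         # receiver is saturated, sender must wait
        self.credits -= 1
        self.inbox.append(message)
        return True

    def process_one(self):
        """Receiver handles one message and hands a credit back."""
        message = self.inbox.popleft()
        self.credits += 1
        return message


channel = CreditedChannel(credits=2)
sent = [channel.try_send(m) for m in ("a", "b", "c")]  # third send is refused
```

Because the inbox can never hold more messages than the initial credit count, the receiver's memory use stays bounded.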
Forward
Sending a row or rows to another node for further processing.
Fragment
A pre-compiled part of a query, usually sent to another node for processing.
G
Global Transaction Manager (GTM)
is a subsystem that manages the atomic commitment of transactions across the cluster, and ensures that all nodes involved come to the same decision every time.
Group Change
A Group Change is the event that occurs when a cluster forms a new group. This occurs when a node joins or leaves the cluster group. This can also manifest in Clustrix Insight as 503 Errors while the Group Change occurs.
H
I
iid
Invocation Id that marks the beginning of a statement within a transaction.
ClustrixDB uses an independent index distribution.
Isolation
is a property that defines how and when changes made by one transaction are visible to other concurrent transactions.
J
K
L
Lumpy
Generally refers to a poor data distribution.
M
Massively Parallel Processing (MPP)
refers to the ability to leverage a large number of processors to perform a set of coordinated computations in parallel.
Multi-Version Concurrency Control (MVCC)
is a method used to implement concurrency and consistency in a distributed database environment. One of the original papers on this topic is Concurrency Control in Distributed Database Systems. ClustrixDB implements a modified version of this algorithm that provides optimizations for modern database workloads.
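The core idea of MVCC, that a reader sees the newest version committed at or before its snapshot, can be sketched as below. This is a generic textbook illustration, not ClustrixDB's modified algorithm; the `visible_version` function and the use of integers as commit identifiers are assumptions for the example.

```python
def visible_version(versions, snapshot_cid):
    """Return the value a reader at `snapshot_cid` should see.

    `versions` is a list of (commit_id, value) pairs for one row. A version
    is visible only if it was committed at or before the reader's snapshot.
    """
    candidates = [(cid, value) for cid, value in versions if cid <= snapshot_cid]
    if not candidates:
        return None  # the row did not exist yet for this reader
    # The newest visible version wins.
    return max(candidates, key=lambda pair: pair[0])[1]


# Hypothetical history of one row: three committed versions.
history = [(5, "v1"), (9, "v2"), (14, "v3")]
seen_by_reader = visible_version(history, snapshot_cid=10)
```

Writers add new versions instead of overwriting old ones, which is why readers at older snapshots are never blocked and why old versions must eventually be garbage collected.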
N
Node
A single server running the ClustrixDB software. Multiple nodes connect to form a cluster.
O
OID
is an internal Object Identifier used by ClustrixDB to describe a "thing" – types, relations, and rows are all examples of things that have OIDs.
OLAP
On-Line Analytical Processing
OLTP
On-Line Transaction Processing
P
PD (Probability Distribution)
Probability Distributions are tracked for values in each relation to predict the size of the result set for a given query.
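As a rough illustration of how a value distribution can predict result set size, the sketch below builds a frequency table from a sample and scales it to the table's row count. The function names and the sampling approach are assumptions for the example and do not describe how ClustrixDB stores or maintains its distributions.

```python
from collections import Counter


def build_distribution(sample_values):
    """Map each observed value to the fraction of sampled rows holding it."""
    counts = Counter(sample_values)
    total = len(sample_values)
    return {value: count / total for value, count in counts.items()}


def estimate_rows(distribution, value, table_rows):
    """Estimate how many rows an equality predicate on `value` returns."""
    return distribution.get(value, 0.0) * table_rows


# Hypothetical sample of a "status" column.
distribution = build_distribution(["open", "open", "closed", "pending"])
estimated = estimate_rows(distribution, "open", table_rows=1000)
```

An estimate like this lets an optimizer compare plans before touching any data, for example preferring an index lookup when few rows are expected.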
Protected
refers to the status of the cluster when at least two replicas of every slice are available. See also Reprotect.
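The definition above amounts to a simple check over the slice layout, sketched here. The `is_protected` function and the dictionary layout are hypothetical and exist only to make the rule concrete.

```python
def is_protected(slice_replicas, available_nodes, minimum=2):
    """True if every slice has at least `minimum` replicas on available nodes.

    `slice_replicas` maps a slice name to the nodes holding a replica of it.
    """
    available = set(available_nodes)
    return all(
        len(set(holders) & available) >= minimum
        for holders in slice_replicas.values()
    )


# Hypothetical layout: two slices, each with two replicas.
layout = {"slice_1": ["node_1", "node_2"], "slice_2": ["node_2", "node_3"]}
all_up = is_protected(layout, ["node_1", "node_2", "node_3"])
one_down = is_protected(layout, ["node_1", "node_2"])  # node_3 unavailable
```

In this example, losing `node_3` leaves `slice_2` with a single available replica, so the check fails until another copy exists.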
Q
The job of the optimizer is to determine which execution plan uses the least amount of resources. Typically this is done by assigning a cost to each plan and choosing the lowest-cost plan.
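Choosing the lowest-cost plan can be sketched in a few lines. The `Plan` class, the weighted-sum cost function, and the example numbers are all assumptions for illustration; only the cost factors themselves (I/O, CPU usage, and latency) come from this glossary.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Plan:
    name: str
    io: float       # estimated I/O work
    cpu: float      # estimated CPU work
    latency: float  # estimated network latency


def cost(plan, w_io=1.0, w_cpu=1.0, w_latency=1.0):
    """Combine the cost factors into one number (a simple weighted sum)."""
    return w_io * plan.io + w_cpu * plan.cpu + w_latency * plan.latency


def choose_plan(plans):
    """Pick the candidate plan with the lowest total cost."""
    return min(plans, key=cost)


# Hypothetical candidates for the same query.
candidates = [
    Plan("full_scan", io=100.0, cpu=20.0, latency=5.0),
    Plan("index_lookup", io=4.0, cpu=2.0, latency=5.0),
]
best = choose_plan(candidates)
```

A real optimizer also has to generate the candidate plans and estimate each factor, which is where statistics such as probability distributions come in.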
Queue (Recovery)
Queues are used to track changes to data that may have occurred for a given node while it was unavailable to the cluster. See also Queue Replay and Queue Flip.
R
The ClustrixDB Rebalancer automatically moves, copies, redistributes, and re-ranks data across the cluster. For more on this topic, see ClustrixDB Rebalancer.
Relation
is a table.
ClustrixDB maintains multiple copies of data for fault tolerance and availability. By rule, copies are stored on different nodes.
Representation
refers to a collection of indexes for a table. Each representation is made up of a series of slices.
When a slice has fewer replicas than desired, the Rebalancer will create a new copy of the slice on a different node.
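The behavior described above can be sketched as a planning step that finds under-replicated slices and picks target nodes that do not already hold a copy. The `reprotect_plan` function and its first-available node choice are assumptions for illustration; the actual Rebalancer's placement logic is not described here.

```python
def reprotect_plan(slice_replicas, nodes, desired=2):
    """Return (slice, target_node) copies needed to reach `desired` replicas.

    A target is always a node that does not already hold the slice, so
    replicas of the same slice end up on different nodes.
    """
    plan = []
    for slice_id, holders in sorted(slice_replicas.items()):
        current = set(holders)
        missing = max(desired - len(current), 0)
        candidates = [node for node in nodes if node not in current]
        for target in candidates[:missing]:
            plan.append((slice_id, target))
    return plan


# Hypothetical layout: slice_2 has lost one of its replicas.
layout = {"slice_1": ["node_1", "node_2"], "slice_2": ["node_3"]}
plan = reprotect_plan(layout, nodes=["node_1", "node_2", "node_3"])
```

A production system would also weigh node load and free space when choosing targets; this sketch simply takes the first eligible node.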
As the data set grows, ClustrixDB will automatically and incrementally redistribute the dataset one or more slices at a time.
RIGR
is the Relational Intermediate Language, a version of SQL used in the ClustrixDB internals.
S
Sierra
Sierra is the name given to the ClustrixDB Query Optimizer.
ClustrixDB breaks up each representation into a collection of logical slices. Rows are assigned to slices according to the results of a hashing function. See also Re-slicing.
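Hash-based slice assignment can be sketched as follows, using a prefix of the key columns as the distribution key. The `slice_for_row` function, the SHA-256 hash, and the modulo mapping are assumptions chosen for the example; the hashing function ClustrixDB actually uses is not specified in this glossary.

```python
import hashlib


def slice_for_row(row, key_columns, dist_key_len, num_slices):
    """Map a row to a slice number by hashing its distribution key.

    The distribution key is the first `dist_key_len` columns of the
    representation's key columns.
    """
    prefix = tuple(row[column] for column in key_columns[:dist_key_len])
    digest = hashlib.sha256(repr(prefix).encode("utf-8")).digest()
    # Use the first 8 bytes of the digest as an unsigned integer.
    return int.from_bytes(digest[:8], "big") % num_slices


# Hypothetical rows of a representation keyed on (customer_id, order_id).
key_columns = ["customer_id", "order_id"]
row_a = {"customer_id": 42, "order_id": 1, "total": 9.99}
row_b = {"customer_id": 42, "order_id": 2, "total": 5.00}

# With a one-column distribution key, both rows hash on customer_id alone.
slice_a = slice_for_row(row_a, key_columns, dist_key_len=1, num_slices=8)
slice_b = slice_for_row(row_b, key_columns, dist_key_len=1, num_slices=8)
```

Rows that share a distribution key value always land in the same slice, which is why a poorly chosen key can produce a lumpy distribution.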
Soft Fail
is an operation that removes a node from a cluster.
T
U
Refers to the state of the cluster when it does not have at least two copies (replicas) of each slice.
V
vrel
Virtual relation, often used to represent system information.
W
WAL
The Write Ahead Log is used to log every command that the user executes.
X
xid
is an identifier used by ClustrixDB internals to denote the logical start of a Transaction.
Y
Z