Rebalance Failure Impact and Cleanup
Rebalancing failures affect a SingleStore cluster and may require cleanup procedures for each of the following phases:
- COPY PARTITION
- PROMOTE PARTITION (WITH or WITHOUT DROP)
- DROP PARTITION
The details below are for recovery from rebalance failures during typical planned maintenance scenarios (scale‑out, node replacement, etc.).
Rebalance Phases: Conceptual Overview
Rebalance reorganizes database partitions to balance the number of partitions on leaf nodes and to ensure redundancy.
The copy, promote, and drop phases are performed for all partitions in a database:

- Copy all partitions of the database that need to be copied.
- Promote all partitions of the database that need promotion.
- Drop all partitions of the database that are no longer needed.
The rebalance plan can be seen with the EXPLAIN REBALANCE PARTITIONS command.
A full rebalance (REBALANCE ALL DATABASES / REBALANCE PARTITIONS) conceptually proceeds in three high‑level steps for each affected partition:
- COPY PARTITION
  - Create an additional copy of a partition on a target leaf.
  - Source and target both exist; metadata is prepared, but primary/replica roles are not yet switched.
- PROMOTE PARTITION (WITH or WITHOUT DROP)
  - Switch roles; for example, make the new copy the master and adjust replication.
  - If a partition that was a master before the rebalance is being dropped by the rebalance, PROMOTE PARTITION is called using WITH DROP, thereby dropping the old master immediately instead of in step 3.
- DROP PARTITION
  - If the partition was a slave before the rebalance, it is dropped in this step by removing the old, now‑redundant copy once the new placement is fully active and in sync.
  - Dropping a partition reduces extra storage usage and finalizes the new topology.
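The three phases above can be sketched as role transitions on a partition's placements. This is an illustrative model only (the function and leaf names are hypothetical, not SingleStore internals), assuming a partition moving from one leaf to another:

```python
# Hypothetical sketch of the three rebalance phases for one partition.
# A placement map {leaf: role} stands in for the real cluster metadata.

def copy_partition(placements, target_leaf):
    """Phase 1: create an extra replica on the target leaf."""
    placements[target_leaf] = "replica"
    return placements

def promote_partition(placements, new_master, with_drop=False):
    """Phase 2: switch roles; WITH DROP removes the old master immediately."""
    old_master = next(l for l, role in placements.items() if role == "master")
    placements[new_master] = "master"
    if with_drop:
        del placements[old_master]      # old master dropped in this phase
    else:
        placements[old_master] = "replica"
    return placements

def drop_partition(placements, leaf):
    """Phase 3: remove a now-redundant copy."""
    del placements[leaf]
    return placements

# One partition moving from leaf1 to leaf2:
p = {"leaf1": "master"}
p = copy_partition(p, "leaf2")     # both copies exist, roles unchanged
p = promote_partition(p, "leaf2")  # leaf2 becomes master, leaf1 demoted
p = drop_partition(p, "leaf1")     # leftover copy removed
print(p)  # {'leaf2': 'master'}
```

Note that with `with_drop=True` the third phase is unnecessary for that partition, matching the WITH DROP behavior described above.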
Failures can occur in any of these steps.
Failure during COPY PARTITION
Typical causes include:

- External interruption (e.g., KILL CONNECTION/KILL QUERY, kill script).
- Resource exhaustion (e.g., out of memory, I/O saturation).
- Network or storage errors during data transfer.
- Lock timeouts from concurrent DDL statements (such as OPTIMIZE TABLE).
Impact on the cluster:

- The source (original) partition remains intact as the authoritative copy.
- The target copy may be incomplete or discarded. REBALANCE may report failure, or in some edge cases the high‑level command may have failed while some internal actions (such as creating new copies) succeeded.
- Metadata may reflect no new copy (copy discarded), or more replicas than expected for a given partition.
Critically, user data remains safe on the original master; the risk is mainly wasted time and temporary extra storage, not corruption.
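One quick sanity check after a failed copy phase is to count placements per partition and flag any partition with more copies than the configured redundancy. A minimal sketch over hypothetical metadata rows (the tuple shape and redundancy value are illustrative, not the actual information_schema layout):

```python
from collections import Counter

# Hypothetical metadata rows: (partition_id, role).
rows = [
    (0, "Master"), (0, "Slave"),
    (1, "Master"), (1, "Slave"), (1, "Slave"),  # one extra copy left behind
    (2, "Master"), (2, "Slave"),
]

REDUNDANCY = 2  # assumed: one master + one replica per partition

copies = Counter(pid for pid, _ in rows)
extra = [pid for pid, n in copies.items() if n > REDUNDANCY]
print(extra)  # [1]
```

Partitions listed in `extra` have leftover copies; they waste disk space but, as noted above, do not threaten the data on the original master.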
Failure during PROMOTE PARTITION (WITH/WITHOUT DROP)
Typical causes include:

- Timeouts in internal synchronization (for example, SYNC_PARTITIONS, WaitForLSNWithTimeout).
- Locks or internal dependencies that prevent fast role transition.
- Resource pressure (CPU, memory, or merger activity) slowing replication or state updates.
Impact on the cluster:

- The copy phase may have succeeded, so both old and new copies exist.
- The old master remains the authoritative partition; the new copy remains a secondary/unpromoted replica or is left unused.
- Data integrity is preserved; the cluster continues using the old placement.
- In some cases there may be extra replicas or unused copies, but never fewer than what is required for redundancy.
Failure during DROP PARTITION
Typical causes include:

- Errors when cleaning up old copies (for example, filesystem issues, transient node problems).
- The operation is interrupted after the new placement is live but before old copies are fully dropped.
Impact on the cluster:

- The new partition placement is already active and serving queries.
- Old copies remain on disk (and may still be visible as extra replicas or orphans).
- This primarily affects storage usage (more disk consumed than necessary) and operational clarity (extra copies may appear in low‑level views).
- Database availability and data correctness are not impacted.
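Leftover copies from a failed drop phase can be identified by comparing the actual placements against the intended post-rebalance plan. A sketch under assumed inputs (both sets are hypothetical; in practice the desired plan would come from EXPLAIN REBALANCE output and the actual placements from cluster metadata):

```python
# Hypothetical (partition_id, leaf) placement sets.
actual = {
    (0, "leaf1"), (0, "leaf2"),
    (1, "leaf1"), (1, "leaf2"),
    (1, "leaf3"),  # old copy the drop phase failed to remove
}
desired = {
    (0, "leaf1"), (0, "leaf2"),
    (1, "leaf1"), (1, "leaf2"),
}

# Anything present in actual but not in the desired plan is a leftover copy.
leftovers = sorted(actual - desired)
print(leftovers)  # [(1, 'leaf3')]
```

Since the new placement is already live, such leftovers only cost disk space until they are cleaned up.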
General Post‑Failure Recovery Flow
When a rebalance fails (regardless of the phase), the following standard flow is recommended:
- Record the failure time and error:
  - From the client, use SHOW REBALANCE STATUS, and check memsql.log.
- Verify data safety and redundancy:
  SELECT * FROM information_schema.distributed_partitions_on_leaves
  WHERE database_name = 'your_db';
  Ensure each partition has at least one master.
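Given the rows returned by that query, the "at least one master per partition" check can be sketched as follows (the tuple shape and role values are illustrative; the real column names may differ):

```python
# Hypothetical result rows: (partition_id, host, role).
rows = [
    (0, "leaf1", "Master"), (0, "leaf2", "Slave"),
    (1, "leaf2", "Master"), (1, "leaf1", "Slave"),
    (2, "leaf1", "Slave"),  # no master: redundancy problem
]

all_partitions = {pid for pid, _, _ in rows}
with_master = {pid for pid, _, role in rows if role == "Master"}
missing_master = sorted(all_partitions - with_master)

if missing_master:
    print("Partitions without a master:", missing_master)
```

Any partition reported here needs attention before retrying the rebalance.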
- Plan a safe retry:
  - Address underlying issues:
    - Resource limits (CPU, memory, I/O).
    - Heavy merger load.
    - Backup/maintenance overlap.
  - Preferably use a maintenance window with no user workload.
- Retry the rebalance:
  EXPLAIN REBALANCE ALL DATABASES;
  REBALANCE ALL DATABASES;
- Final cleanup (if needed):
  - Once satisfied that the cluster is balanced and redundancy is correct:
    EXPLAIN CLEAR ORPHAN DATABASES;
    CLEAR ORPHAN DATABASES;
Last modified: March 13, 2026