Google Cloud Spanner

From Luis Gallego Hurtado - Not Another IT guy

Google Cloud Spanner is a mission-critical, relational database service with transactional consistency, global scale and high availability.

It suits mission-critical applications that combine high transaction volumes with requirements for both scale and consistency.

Features

  • Fully managed relational database service (schemas, SQL queries, and ACID transactions) with a global scale.
  • 99.999% availability for multi-regional instances, with transparent, synchronous replication across regional and multi-region configurations.
  • Optimized performance by automatically sharding the data based on request load and size of the data.
  • Strong transactional consistency
  • Online schema changes with no downtime
  • Built on Google's dedicated network.
  • Enterprise-grade security: data-layer encryption, IAM integration, and audit logging.
  • On-demand backup and restore.
  • Multi-language support for client libraries.

Schema and data model

Relational database schema: tables, rows, columns, values, primary keys.

Data is strongly typed.

Sharded data: data is divided into "splits" that can be moved to different nodes/servers.
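As a rough illustration of the idea of splits, the sketch below models each split as a contiguous range of primary keys and keeps a separate map from split to server. The boundary values, server names and function names are hypothetical, and this is only a toy model, not how Cloud Spanner is implemented internally.

```python
from bisect import bisect_right

# Hypothetical split boundaries: split i holds keys in
# [SPLIT_BOUNDARIES[i-1], SPLIT_BOUNDARIES[i]), so 3 boundaries give 4 splits.
SPLIT_BOUNDARIES = [1000, 5000, 20000]

# Hypothetical placement of splits on servers.
# Moving a split to another server only changes this map, not the key ranges.
split_to_server = {0: "server-a", 1: "server-b", 2: "server-a", 3: "server-c"}


def split_for_key(key: int) -> int:
    """Return the index of the split whose key range contains `key`."""
    return bisect_right(SPLIT_BOUNDARIES, key)


def server_for_key(key: int) -> str:
    """Return the server currently hosting the split that contains `key`."""
    return split_to_server[split_for_key(key)]
```

Reassigning `split_to_server[3]` to another server stands in for moving a split: lookups for keys in that range follow the split without any change to the key ranges themselves.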

Interleaved Data

An interleaved table is declared as a child of another table so that related rows are physically stored together: each child row is placed with the parent row whose primary key it shares as a prefix.
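A minimal sketch of what this looks like in practice, held in Python strings so the statements can be inspected or passed to a client library. The table and column names (Singers, Albums and so on) are illustrative only, and `storage_order` is a hypothetical helper that merely imitates the resulting row order; it is not part of any Spanner API.

```python
# Spanner DDL for a parent table (illustrative names).
CREATE_SINGERS = """
CREATE TABLE Singers (
  SingerId  INT64 NOT NULL,
  Name      STRING(1024)
) PRIMARY KEY (SingerId)
"""

# The child table's primary key starts with the parent's primary key,
# and the INTERLEAVE clause declares the parent-child relationship.
CREATE_ALBUMS = """
CREATE TABLE Albums (
  SingerId  INT64 NOT NULL,
  AlbumId   INT64 NOT NULL,
  Title     STRING(MAX)
) PRIMARY KEY (SingerId, AlbumId),
  INTERLEAVE IN PARENT Singers ON DELETE CASCADE
"""


def storage_order(singer_ids, album_keys):
    """Toy model of the layout: rows sorted by primary key.

    Parent keys are 1-tuples and child keys are (SingerId, AlbumId) tuples,
    so every album sorts directly after the singer it belongs to.
    """
    keys = [(singer_id,) for singer_id in singer_ids] + list(album_keys)
    return sorted(keys)


order = storage_order([2, 1], [(2, 1), (1, 2), (1, 1)])
# order == [(1,), (1, 1), (1, 2), (2,), (2, 1)]
```

Because a singer and its albums end up next to each other, reading a parent together with its children tends to touch one place in storage rather than several.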

Replication

Replicas are created for each database split.

There are 3 types of replicas: read-write, read-only and witness.

Witness replicas do not serve reads, but they vote on whether to commit writes. They do not keep a full copy of the data, yet they make it easier to achieve a quorum. They also participate in leader election, but they cannot become leaders.

Single-region instances use only read-write replicas, whereas multi-region instances use a combination of the three types.

Different nodes own different parts of the data held in the replicas; a custom algorithm manages this assignment.

In a multi-region configuration, two regions each contain two read-write replicas.
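The role of the witness in reaching a quorum can be sketched with a simple majority vote. The layout below is an assumption made for illustration: two regions with two read-write replicas each, one witness in a third region and one read-only replica in a fourth. The region names, the `Replica` class and `write_commits` are hypothetical, and the model ignores everything except vote counting.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Replica:
    region: str
    kind: str  # "read-write", "read-only" or "witness"

    @property
    def votes(self) -> bool:
        # Read-write and witness replicas vote on commits; read-only replicas do not.
        return self.kind in ("read-write", "witness")

    @property
    def can_lead(self) -> bool:
        # Only read-write replicas are eligible to become leader.
        return self.kind == "read-write"


def write_commits(replicas, reachable_regions) -> bool:
    """A write commits when a strict majority of voting replicas is reachable."""
    voters = [r for r in replicas if r.votes]
    acks = sum(1 for r in voters if r.region in reachable_regions)
    return acks > len(voters) // 2


# Assumed layout (illustrative region names).
MULTI_REGION = [
    Replica("region-a", "read-write"), Replica("region-a", "read-write"),
    Replica("region-b", "read-write"), Replica("region-b", "read-write"),
    Replica("region-c", "witness"),
    Replica("region-d", "read-only"),
]
```

With five voters, losing all of `region-b` still leaves three votes (two read-write plus the witness), so writes continue. Without the witness there would be four voters, and the two remaining votes would not form a majority.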

Horizontal Scalability

Adding a node does not increase the number of replicas; it increases the resources (CPU and memory) available to the replicas.

Each node provides up to 2 TB of storage and, approximately, the following throughput:

  • Regional: up to 10k queries per second and 2k update queries (single row at 1KB) per second.
  • Multi-regional: up to 7k queries per second and 1.8k update queries (single row at 1KB) per second.
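The per-node figures above lend themselves to a back-of-the-envelope sizing calculation. The sketch below takes the largest of the storage, read and write requirements; `nodes_needed` and the `PER_NODE` table are hypothetical names, and treating reads and writes as independent budgets is a simplifying assumption, so the result is a starting estimate rather than a guarantee.

```python
import math

# Per-node figures quoted above; rough guides only.
PER_NODE = {
    "regional": {"storage_tb": 2.0, "read_qps": 10_000, "write_qps": 2_000},
    "multi-regional": {"storage_tb": 2.0, "read_qps": 7_000, "write_qps": 1_800},
}


def nodes_needed(config: str, storage_tb: float, read_qps: float, write_qps: float) -> int:
    """Estimate the node count as the largest of the three per-resource needs."""
    limits = PER_NODE[config]
    return max(
        1,  # an instance needs at least one node
        math.ceil(storage_tb / limits["storage_tb"]),
        math.ceil(read_qps / limits["read_qps"]),
        math.ceil(write_qps / limits["write_qps"]),
    )


regional = nodes_needed("regional", storage_tb=5, read_qps=25_000, write_qps=3_000)
multi = nodes_needed("multi-regional", storage_tb=1, read_qps=20_000, write_qps=9_000)
# regional == 3 (storage and reads both need 3 nodes)
# multi == 5 (writes dominate: 9000 / 1800)
```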

Nodes can be added to or removed from the instance.

After the number of nodes is increased, the splits are automatically rebalanced across the replicas in each region.

TrueTime and External Consistency

TrueTime is a distributed clock that enables applications to generate increasing timestamps.

Cloud Spanner uses it to timestamp transactions, which allows consistent reads across the entire database and across multiple regions without blocking writes.

With external consistency, Cloud Spanner behaves as if all transactions were executed sequentially, yet strong reads do not block data the way a conventional strongly consistent system would.
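The way an uncertain clock can still produce ordered timestamps can be sketched with a toy model. It loosely follows the publicly described TrueTime idea: the clock returns an interval rather than a single instant, and a commit waits until its timestamp is certainly in the past. `ToyTrueTime`, `commit`, the `tick` parameter and the manually advanced `real_time` are all hypothetical stand-ins chosen to keep the example deterministic; this is not Cloud Spanner's implementation.

```python
from dataclasses import dataclass


@dataclass
class TTInterval:
    earliest: float
    latest: float


class ToyTrueTime:
    """Simulated clock: real time is known only to within +/- epsilon."""

    def __init__(self, epsilon: float):
        self.epsilon = epsilon
        self.real_time = 0.0  # advanced manually instead of reading a real clock

    def now(self) -> TTInterval:
        return TTInterval(self.real_time - self.epsilon, self.real_time + self.epsilon)

    def after(self, t: float) -> bool:
        # True only once t has definitely passed, whatever the clock error is.
        return self.now().earliest > t


def commit(tt: ToyTrueTime, tick: float = 1.0) -> float:
    """Assign a commit timestamp, then wait until it is guaranteed to be in the past."""
    ts = tt.now().latest           # no earlier than any possible current time
    while not tt.after(ts):        # "commit wait"
        tt.real_time += tick       # stand-in for sleeping
    return ts


tt = ToyTrueTime(epsilon=4.0)
t1 = commit(tt)  # timestamp 4.0; returns once real_time has reached 9.0
t2 = commit(tt)  # starts after t1 finished, so it gets 13.0
```

Because a commit does not return until its timestamp has certainly passed, any transaction that starts afterwards receives a larger timestamp, which is what lets the timestamps reflect the real order of transactions.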

IAM

Roles: admin, database admin, database reader, database user (read and write), viewer

Use Cases

  • Adtech
  • Financial services
  • Global supply chain
  • Retail