Databases on vnykmshr

Primary PII

Tue, 05 Nov 2024 00:00:00 +0000

A regulation arrives. Or an auditor. Or a new market with stricter rules. PII is a thing the application was always sloppy about, and now it is a thing the application has to be careful with. This is how PII externalization begins: as someone else’s deadline, landing on the engineering team as an initiative.

The work looks like encryption at first. It is not.

Identify

The first question is not how to encrypt. The first question is what to encrypt.

Redis caching patterns

Thu, 20 Jun 2024 00:00:00 +0000

Put Redis in front of a database and reads get fast. The cost is a cache layer that’s now load-bearing, and a set of failure modes that come with that.

Three write patterns, three hard problems. The patterns determine consistency. The problems determine whether your cache layer is a net positive or a source of outages.

Write patterns

Cache-aside (lazy loading). The application checks cache on read. On miss, it reads from the database and populates cache. Writes go directly to the database; cache entries are either invalidated or left to expire.
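The read and write paths described above can be sketched as follows. This is a minimal illustration, not the post's implementation: a plain dict stands in for Redis, and the `CacheAside` class, the `load_from_db` / `save_to_db` callables, and the 60-second TTL are all assumptions made up for the example.

```python
import time
from typing import Callable, Dict, Optional, Tuple


class CacheAside:
    """Cache-aside sketch. A dict stands in for Redis (assumption)."""

    def __init__(
        self,
        load_from_db: Callable[[str], Optional[str]],  # hypothetical DB read
        save_to_db: Callable[[str, str], None],        # hypothetical DB write
        ttl_seconds: float = 60.0,                     # illustrative TTL
    ) -> None:
        self._load = load_from_db
        self._save = save_to_db
        self._ttl = ttl_seconds
        # key -> (value, expiry time on the monotonic clock)
        self._cache: Dict[str, Tuple[str, float]] = {}

    def get(self, key: str) -> Optional[str]:
        entry = self._cache.get(key)
        if entry is not None and entry[1] > time.monotonic():
            return entry[0]  # cache hit
        # Miss or expired entry: read from the database, then populate cache.
        value = self._load(key)
        if value is not None:
            self._cache[key] = (value, time.monotonic() + self._ttl)
        return value

    def put(self, key: str, value: str) -> None:
        self._save(key, value)      # the write goes directly to the database
        self._cache.pop(key, None)  # invalidate; the next read repopulates
```

Here `put` invalidates rather than updates the cached entry, which is one of the two options the paragraph names; dropping the `pop` call gives the other one, where stale entries simply live until the TTL runs out.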

PostgreSQL HA

Mon, 15 Mar 2021 00:00:00 +0000

PostgreSQL’s streaming replication is straightforward to set up. The documentation is clear, the configuration is well understood, and base backups with pg_basebackup work reliably.

The operational problems are the hard part. They show up when the primary goes down and the automated failover does the wrong thing. Or when you promote a replica that’s silently been two hours behind. Or when you discover that backups you’ve been taking for months don’t actually restore.

The week pgbouncer stopped being news

Thu, 12 Jul 2018 00:00:00 +0000

The connection count climbs faster than our instance classes can keep up. Ops is hot. Every few weeks the same thread resurfaces: we need a pool in front of Postgres before the next scale event.

We move on pgbouncer.

The choice

Two modes on the table. Session pooling hands a connection to a client and gives it back when the client disconnects. Transaction pooling hands one out per transaction. Transaction is tighter – the pool stretches further, the math gets better – but the client loses everything a session holds. Server-side prepared statements. Advisory locks. Temp tables. SET commands that expect to persist.
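A toy model makes the trade-off concrete. Nothing here is pgbouncer's code or API: `ServerConnection`, `SessionClient`, `TransactionClient`, and the round-robin choice of server are assumptions for illustration only. The point is just that session state lives on the server connection, so it survives only if the client keeps getting the same one.

```python
from typing import Callable, Dict, List, TypeVar

T = TypeVar("T")


class ServerConnection:
    """Stand-in for one Postgres backend; session state lives here."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.settings: Dict[str, str] = {}


class SessionClient:
    """Session pooling: one server connection for the client's lifetime."""

    def __init__(self, servers: List[ServerConnection]) -> None:
        self._server = servers[0]  # pinned until the client disconnects

    def transaction(self, work: Callable[[ServerConnection], T]) -> T:
        return work(self._server)


class TransactionClient:
    """Transaction pooling: a server connection is borrowed per transaction."""

    def __init__(self, servers: List[ServerConnection]) -> None:
        self._servers = servers
        self._turn = 0

    def transaction(self, work: Callable[[ServerConnection], T]) -> T:
        # Round-robin is an assumption; a real pool picks whatever is free.
        server = self._servers[self._turn % len(self._servers)]
        self._turn += 1
        return work(server)


def set_timeout(server: ServerConnection) -> None:
    # Models a session-level `SET statement_timeout = '5s'`.
    server.settings["statement_timeout"] = "5s"


def read_timeout(server: ServerConnection) -> str:
    return server.settings.get("statement_timeout", "default")
```

With two server connections, a `SessionClient` that runs `set_timeout` and then `read_timeout` sees its own setting. A `TransactionClient` doing the same lands the second transaction on a different server and reads the default, and the setting it left behind is waiting on the first server for whichever client borrows it next.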

MySQL on XFS

Thu, 11 Apr 2013 00:00:00 +0000

XFS handles database workloads better than ext4 – better concurrent I/O, more efficient metadata operations for table-heavy schemas, and delayed allocation that improves write throughput. The obvious approach is to change MySQL’s datadir in the config. The less obvious approach is bind mounts, which keep every path where the system expects it.

Setup

Install XFS utilities alongside MySQL:

sudo apt-get install -y xfsprogs mysql-server

Create the filesystem on the dedicated volume:

Rolling your own search

Wed, 14 Mar 2012 00:00:00 +0000

The shop needs catalog search. Users type something, products come back. Sounds trivial until you start building it.

Our stack is Node.js, MySQL for the primary data store, MongoDB for everything else we need to go fast. The catalog lives in MySQL – products, categories, attributes, prices. Normalized, relational, correct. But relational isn’t searchable. Try finding “blue cotton kurta” across five joined tables with MySQL FULLTEXT on MyISAM. It sort of works. The relevance is terrible.