In the first two articles we examined vacuums, clusters and freeze. In this third and final piece we look at indexes and backups.
What are indexes and backups?
An index, like in a book, is a separate set of information that tells us what is where and saves having to re-read the whole book to find something. PostgreSQL uses indexes for a similar reason and broadly in a similar way.
Indexes have their downsides. Every time the name of a chapter in a book is changed, the corresponding entry in the index must be updated. Any new entry in the book must be added to the index. If the indexed information is moved to another page, the pointer in the index must be changed to the new page number. Generally, the more indexes you have, the more trouble it is to keep your information consistent.
Indexes serve to transfer work away from read operations and onto write operations. If your application reads little and writes a lot, you should probably keep your indexes to a bare minimum. In the opposite scenario, you can afford a lot of indexes at very little cost. There is no general rule: it depends on how you are using PostgreSQL.
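As a minimal sketch of this trade-off (the table and column names are hypothetical), an index is created, carries its write overhead for as long as it exists, and can be dropped if it turns out not to pay its way:

```sql
-- Each index on a table speeds up reads on the indexed column,
-- but adds work to every INSERT, UPDATE and DELETE on that table.
CREATE INDEX orders_customer_id_idx ON orders (customer_id);

-- If monitoring later shows the index is never used,
-- dropping it removes the write overhead.
DROP INDEX orders_customer_id_idx;
```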
What sorts of indexes can we use?
There are four kinds of index in the basic version of PostgreSQL: B-tree (the default), hash, GiST and GIN.
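As a sketch, one index of each of the four kinds can be created like this (the table and column names are hypothetical; each kind suits different data and operators):

```sql
CREATE INDEX t_btree_idx ON t (a);               -- B-tree: the default, ordered data
CREATE INDEX t_hash_idx  ON t USING hash (a);    -- hash: equality comparisons only
CREATE INDEX t_gist_idx  ON t USING gist (geom); -- GiST: e.g. geometric data
CREATE INDEX t_gin_idx   ON t USING gin (tsv);   -- GIN: e.g. full-text search vectors
```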
How does an index get bloated?
Thinking back to the book example, entries in the index need to be updated to match what happens to the live data forming the book’s content. If we update or delete most entries in the index, but not all, we end up with lots of pages that contain mostly dead entries. These pages cannot be reused because they still contain at least one live entry. This is called bloat and it needs to be dealt with.
Versions before 9.0 bloat their indexes when performing a VACUUM FULL, because they move tuples around via transient copies and each move must be reflected in the indexes. To be clear, this does not apply to autovacuum, which never issues a FULL.
Bloat can be measured. Generally speaking, it shows up as an increase in the ratio between index size and table size. In really bad situations a bloated index, although it contains only a subset of the table's information, can become larger than the table itself.
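A rough way to watch that ratio is to compare each index's on-disk size with its table's, using the catalogue and the size functions; this is a sketch, and a ratio that keeps growing over time is the signal, not any single absolute value:

```sql
-- Index size vs. table size; run on the live box and track over time.
SELECT c.relname AS table_name,
       i.relname AS index_name,
       pg_size_pretty(pg_relation_size(c.oid)) AS table_size,
       pg_size_pretty(pg_relation_size(i.oid)) AS index_size,
       round(pg_relation_size(i.oid)::numeric
             / nullif(pg_relation_size(c.oid), 0), 2) AS ratio
FROM pg_index x
JOIN pg_class c ON c.oid = x.indrelid
JOIN pg_class i ON i.oid = x.indexrelid
WHERE c.relkind = 'r'
ORDER BY ratio DESC NULLS LAST;
```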
Best practice with indexes
make sure that existing indexes are all used and that using them does in fact produce an improvement
keep indexes in a different physical storage from that used for data in order to reduce I/O contention
A staging box that restores a clone of an existing archive and runs the application's queries against it can serve as a generator of structural metrics for indexes, as long as its storage layout is comparable to production. Index bloat, however, can only be measured directly on the live box.
Indexes are generally created by developers in the expectation that their queries will use them. However, nothing gathers data regularly about how indexes are actually used or how bloated they are, and there is no automated maintenance. Constant monitoring is needed to check which indexes are actually used and how much bloat each one carries. The observed values should trigger periodic re-indexing.
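The statistics collector already records index usage, so a monitoring query along these lines (a sketch) will flag indexes that have never been scanned since statistics were last reset; such indexes are removal candidates, after checking that they do not back a primary key or unique constraint:

```sql
-- Indexes never used since the statistics were last reset.
SELECT schemaname,
       relname      AS table_name,
       indexrelname AS index_name,
       idx_scan
FROM pg_stat_user_indexes
WHERE idx_scan = 0
ORDER BY schemaname, relname;

-- When bloat is detected, a rebuild reclaims the dead entries
-- (the index name here is hypothetical):
REINDEX INDEX orders_customer_id_idx;
```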
There are two distinct backup strategies available in PostgreSQL.
Dumps. A dump is a text file made of a sequence of DDL and DML commands that can be executed again to reproduce the state of the database at the moment the dump was taken. It is important to understand that such a restore is ‘lossy’ by design: no data about transactions that follow the dump generation survives such a backup/restore cycle.
Continuous archiving. This couples a starting base backup with a set of WAL files, which allow the database activity over the timespan following the backup to be replayed. In what is usually called a Point in Time Recovery (PITR), WAL files can be replayed (even partially) to regenerate a copy of the transaction-related data.
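On the 9.0-era versions this series discusses, continuous archiving is switched on with a few postgresql.conf settings; this is a minimal sketch, and the archive directory is hypothetical:

```
wal_level = archive        # emit enough WAL detail for archiving
archive_mode = on          # hand completed WAL segments to archive_command
archive_command = 'cp %p /mnt/wal_archive/%f'  # %p = segment path, %f = file name
```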
For large databases, PITR can prove lighter in terms of I/O than traditional dumps. WAL file generation happens quite seamlessly, and the files can just as seamlessly be copied to a target destination. This is, indeed, how PostgreSQL replication works, and it can also be used to build a high-availability setup.
Tuning RAM consumption
RAM consumption needs to be tuned as it has an important impact on the I/O rate. PostgreSQL consumes RAM according to two kinds of parameters:
shared (this amount is consumed once per cluster and is not multiplied by other factors like users, connections or databases)
session related (these amounts are consumed once for each connection)
shared_buffers. This is an I/O buffer of fixed size that is common to all sessions. PostgreSQL tends to make good use of the OS cache as well, so it is generally considered good practice not to raise the value above 8GB (see p. 118, PostgreSQL 9.0 High Performance). It is not clear whether this applies to older versions, and it would make sense to experiment.
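In postgresql.conf this is a single line; the value below is illustrative only, chosen to stay under the 8GB ceiling mentioned above:

```
shared_buffers = 4GB   # allocated once per cluster, shared by all sessions
```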
Session related consumers
work_mem specifies the amount of memory to be used by internal sort operations and hash tables before switching to temporary disk files. The value defaults to one megabyte (1MB). For a complex query, several sort or hash operations may be running in parallel and each one will be allowed to use as much memory as this value specifies before it starts to put data into temporary files. In addition, several running sessions could be doing such operations concurrently. As a result, the total memory used could be many times the value of work_mem and this should be borne in mind when choosing the value.
Sort operations are used for ORDER BY, DISTINCT, and merge joins. Hash tables are used in hash joins, hash-based aggregation, and hash-based processing of IN subqueries.
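Because work_mem can be set per session, EXPLAIN (ANALYZE) can show directly whether a given sort fits in memory or spills to disk; this is a sketch with a hypothetical table, and the exact plan output depends on the data size:

```sql
-- With a small work_mem, a large sort spills: the plan reports
-- something like "Sort Method: external merge  Disk: ...".
SET work_mem = '1MB';
EXPLAIN (ANALYZE) SELECT * FROM events ORDER BY created_at;

-- Raised for this session only, the same sort may stay in RAM:
-- "Sort Method: quicksort  Memory: ...".
SET work_mem = '64MB';
EXPLAIN (ANALYZE) SELECT * FROM events ORDER BY created_at;
```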
max_connections sets the maximum number of concurrent connections that PostgreSQL will accept. Implicitly, it also sets an upper bound on the RAM the server may allocate to sessions. Setting this value too high has important downsides.
First, either the system reserves RAM for connections that never actually take place, so that the reserved RAM is wasted; or high per-session RAM consumption is allowed anyway, and the server is exposed to an out-of-memory error once the real connections claim more RAM than is actually available.
Second, even if RAM allocation is well planned and all connections are present so that nothing is wasted, a Linux box does not like to have more than around 100 active sessions. The default value of 100 for max_connections should therefore be left in place, and may even be reduced by means of a connection pooler. It is impossible to predict how much work_mem a session will actually use, so the larger the number of possible sessions, the more conservative you should be about it. In fact, having a smaller number of connections to manage makes it possible to stress-test RAM consumption and tune it more accurately.
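When planning this, the current values of the settings that bound session RAM can be read straight from the server:

```sql
-- The knobs that together bound worst-case RAM consumption:
-- roughly max_connections * (a few multiples of work_mem) + shared_buffers.
SELECT name, setting, unit
FROM pg_settings
WHERE name IN ('max_connections', 'work_mem', 'shared_buffers');
```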
Maintenance session related consumers
As we have seen, operations like VACUUM, CREATE INDEX and ALTER TABLE ADD FOREIGN KEY can consume up to maintenance_work_mem before spilling to disk. So the more RAM we can safely allocate here, the quicker they will work, and the less they will impact normal database operations.
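Since only one maintenance operation usually runs at a time per session, the setting can safely be raised just for that session; a sketch, with hypothetical table and index names:

```sql
-- Raise the allowance for this session only, build the index,
-- then return to the server-wide default.
SET maintenance_work_mem = '512MB';
CREATE INDEX orders_created_at_idx ON orders (created_at);
RESET maintenance_work_mem;
```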