There are a lot of articles about working with PostgreSQL and its advantages, but it is not always clear from them how to monitor the state of the database and metrics that affect its optimal work. In the article, we will take a detailed look at SQL queries that will help you track these indicators and can be useful for regular users.
Why do you need to monitor the state of PostgreSQL?
Database monitoring is important as monitoring your applications. It is necessary to monitor processes in more detail than at the system level. To do this, you can track the following metrics:
- How efficient is the database cache?
- What is the size of the tables in your database?
- Are your indexes used?
- Etc.
Monitoring the size of the database and its elements
1. Size of tablespaces
SELECT spcname, pg_size_pretty(pg_tablespace_size(spcname))
FROM pg_tablespace
WHERE spcname<>'pg_global';
After running the query, you will receive information about the size of all tablespaces created in your database. The pg_tablespace_size function provides information about the size of the tablespace in bytes, so we also use the pg_size_pretty function to make it readable. The pg_global space is excluded, since it is used for shared system directories.
2. Size of databases
SELECT pg_database.datname,
pg_size_pretty(pg_database_size(pg_database.datname)) AS size
FROM pg_database
ORDER BY pg_database_size(pg_database.datname) DESC;
After running the query, you will receive information about the size of each database created within your PostgreSQL instance.
3. Size of schemas in the database
SELECT A.schemaname,
pg_size_pretty (SUM(pg_relation_size(C.oid))) as table,
pg_size_pretty (SUM(pg_total_relation_size(C.oid)-pg_relation_size(C.oid))) as index,
pg_size_pretty (SUM(pg_total_relation_size(C.oid))) as table_index,
SUM(n_live_tup)
FROM pg_class C
LEFT JOIN pg_namespace N ON (N.oid = C .relnamespace)
INNER JOIN pg_stat_user_tables A ON C.relname = A.relname
WHERE nspname NOT IN ('pg_catalog', 'information_schema')
AND C .relkind <> 'i'
AND nspname !~ '^pg_toast'
GROUP BY A.schemaname;
After running the query, you will receive detailed information about each schema in your database: the total size of all tables, the total size of all indexes, the total size of the schema and the total number of rows in all schema tables.
4. Size of tables
SELECT schemaname,
C.relname AS "relation",
pg_size_pretty (pg_relation_size(C.oid)) as table,
pg_size_pretty (pg_total_relation_size (C.oid)-pg_relation_size(C.oid)) as index,
pg_size_pretty (pg_total_relation_size (C.oid)) as table_index,
n_live_tup
FROM pg_class C
LEFT JOIN pg_namespace N ON (N.oid = C .relnamespace)
LEFT JOIN pg_stat_user_tables A ON C.relname = A.relname
WHERE nspname NOT IN ('pg_catalog', 'information_schema')
AND C.relkind <> 'i'
AND nspname !~ '^pg_toast'
ORDER BY pg_total_relation_size (C.oid) DESC
After running the query, you will receive detailed information about each table indicating its schema, size without indexes, size of indexes, total size of the table and indexes, as well as the number of rows in the table.
Control of deadlocks
If more than one user uses your database, there is always a risk of deadlocks of queries and the appearance of a queue with numerous of queries that will be pending. Most often, this can occur when processing many queries using the same tables. They will prevent each other from completing and prevent other queries from starting. You can read more about this in the official documentation. We will consider ways to find locks and remove them.
1. Monitoring of deadlocks
SELECT COALESCE(blockingl.relation::regclass::text, blockingl.locktype) AS locked_item,
now() - blockeda.query_start AS waiting_duration,
blockeda.pid AS blocked_pid,
blockeda.query AS blocked_query,
blockedl.mode AS blocked_mode,
blockinga.pid AS blocking_pid,
blockinga.query AS blocking_query,
blockingl.mode AS blocking_mode
FROM pg_locks blockedl
JOIN pg_stat_activity blockeda ON blockedl.pid = blockeda.pid
JOIN pg_locks blockingl ON (blockingl.transactionid = blockedl.transactionid OR
blockingl.relation = blockedl.relation AND
blockingl.locktype = blockedl.locktype) AND blockedl.pid <> blockingl.pid
JOIN pg_stat_activity blockinga ON blockingl.pid = blockinga.pid AND blockinga.datid = blockeda.datid
WHERE NOT blockedl.granted AND blockinga.datname = current_database();
This query shows all information about blocked queries as well as information about who blocked them.
2. Removing of deadlocks
SELECT pg_cancel_backend(PID_ID);
OR
SELECT pg_terminate_backend(PID_ID);
PID_ID is the ID of the query that blocks other queries. Most often, it is enough to cancel one blocking query to remove the deadlocks and start the entire accumulated queue. The difference between pg_cancel_backend and pg_terminate_backend is that pg_cancel_backend cancels the query, and pg_terminate_backend ends the session and closes the database connection. The pg_cancel_backend command is more gentle and in most cases it will be enough for you. If not, use pg_terminate_backend.
Indicators of optimal performance of your database
1. Cache Hit Ratio
SELECT sum(heap_blks_read) as heap_read,
sum(heap_blks_hit) as heap_hit,
sum(heap_blks_hit) / (sum(heap_blks_hit) + sum(heap_blks_read)) as ratio
FROM
pg_statio_user_tables;
The caching coefficient is a measure of read efficiency, measured by the proportion of cache reads compared to the total number of reads from both disk and cache. Except when using data storage, the ideal caching ratio is 99% or higher, which means that at least 99% of reads are performed from the cache and no more than 1% from disk.
2. Using of indexes
SELECT relname,
100 * idx_scan / (seq_scan + idx_scan) percent_of_times_index_used,
n_live_tup rows_in_table
FROM pg_stat_user_tables
WHERE seq_scan + idx_scan > 0
ORDER BY n_live_tup DESC;
Adding indexes to your database makes a big difference to query performance. Indexes are especially important for large tables. This query shows the number of rows in tables and the percentage of index usage time compared to reading without indexes. Ideal candidates for adding an index are tables larger than 10,000 rows with zero or low index usage.
3. Index Cache Hit Rate
SELECT sum(idx_blks_read) as idx_read,
sum(idx_blks_hit) as idx_hit,
(sum(idx_blks_hit) - sum(idx_blks_read)) / sum(idx_blks_hit) as ratio
FROM pg_statio_user_indexes;
This coefficient is similar to the usual caching coefficient, but is calculated based on index usage data.
4. Unused indexes
SELECT schemaname, relname, indexrelname
FROM pg_stat_all_indexes
WHERE idx_scan = 0 and schemaname <> 'pg_toast' and schemaname <> 'pg_catalog'
This query finds indexes that were created but not used in SQL queries.
5. Database bloat
SELECT
current_database(), schemaname, tablename, /*reltuples::bigint, relpages::bigint, otta,*/
ROUND((CASE WHEN otta=0 THEN 0.0 ELSE sml.relpages::float/otta END)::numeric,1) AS tbloat,
CASE WHEN relpages < otta THEN 0 ELSE bs*(sml.relpages-otta)::BIGINT END AS wastedbytes,
iname, /*ituples::bigint, ipages::bigint, iotta,*/
ROUND((CASE WHEN iotta=0 OR ipages=0 THEN 0.0 ELSE ipages::float/iotta END)::numeric,1) AS ibloat,
CASE WHEN ipages < iotta THEN 0 ELSE bs*(ipages-iotta) END AS wastedibytes
FROM (
SELECT
schemaname, tablename, cc.reltuples, cc.relpages, bs,
CEIL((cc.reltuples*((datahdr+ma-
(CASE WHEN datahdr%ma=0 THEN ma ELSE datahdr%ma END))+nullhdr2+4))/(bs-20::float)) AS otta,
COALESCE(c2.relname,'?') AS iname, COALESCE(c2.reltuples,0) AS ituples, COALESCE(c2.relpages,0) AS ipages,
COALESCE(CEIL((c2.reltuples*(datahdr-12))/(bs-20::float)),0) AS iotta /* very rough approximation, assumes all cols */
FROM (
SELECT
ma,bs,schemaname,tablename,
(datawidth+(hdr+ma-(case when hdr%ma=0 THEN ma ELSE hdr%ma END)))::numeric AS datahdr,
(maxfracsum*(nullhdr+ma-(case when nullhdr%ma=0 THEN ma ELSE nullhdr%ma END))) AS nullhdr2
FROM (
SELECT
schemaname, tablename, hdr, ma, bs,
SUM((1-null_frac)*avg_width) AS datawidth,
MAX(null_frac) AS maxfracsum,
hdr+(
SELECT 1+count(*)/8
FROM pg_stats s2
WHERE null_frac<>0 AND s2.schemaname = s.schemaname AND s2.tablename = s.tablename
) AS nullhdr
FROM pg_stats s, (
SELECT
(SELECT current_setting('block_size')::numeric) AS bs,
CASE WHEN substring(v,12,3) IN ('8.0','8.1','8.2') THEN 27 ELSE 23 END AS hdr,
CASE WHEN v ~ 'mingw32' THEN 8 ELSE 4 END AS ma
FROM (SELECT version() AS v) AS foo
) AS constants
GROUP BY 1,2,3,4,5
) AS foo
) AS rs
JOIN pg_class cc ON cc.relname = rs.tablename
JOIN pg_namespace nn ON cc.relnamespace = nn.oid AND nn.nspname = rs.schemaname AND nn.nspname <> 'information_schema'
LEFT JOIN pg_index i ON indrelid = cc.oid
LEFT JOIN pg_class c2 ON c2.oid = i.indexrelid
) AS sml
ORDER BY wastedbytes DESC;
Database bloat is disk space that has been used by a table or index and is available for reuse by the database, but has not been freed up. Bloating occurs when tables or indexes are updated. If you have a loaded database with a lot of deletion operations, bloating can leave a lot of unused space in your database and affect performance if it is not removed. The metrics of wastedbytes for tables and wastedibytes for indexes will show you if you have any serious bloat issues. To combat with bloat there is a VACUUM command.
6. Checking VACUUM starts
SELECT relname,
last_vacuum,
last_autovacuum
FROM pg_stat_user_tables;
The bloat can be reduced using the VACUUM command, but PostgreSQL also supports AUTOVACUUM. You can read about its setup here.
A few more queries that may be useful to you
1. Shows the number of open connections
SELECT COUNT(*) as connections,
backend_type
FROM pg_stat_activity
where state = 'active' OR state = 'idle'
GROUP BY backend_type
ORDER BY connections DESC;
Shows open connections to all databases in your PostgreSQL instance. If you have several databases in one PostgreSQL, then add datname = ‘Your Database’ to the WHERE condition.
2. Shows running queries
SELECT pid, age(clock_timestamp(), query_start), usename, query, state
FROM pg_stat_activity
WHERE state != 'idle' AND query NOT ILIKE '%pg_stat_activity%'
ORDER BY query_start desc;
Shows running queries and their duration.