There are a lot of articles about working with PostgreSQL and its advantages, but it is not always clear from them how to monitor the state of the database and metrics that affect its optimal work. In the article, we will take a detailed look at SQL queries that will help you track these indicators and can be useful for regular users.

Why do you need to monitor the state of PostgreSQL?

Database monitoring is important as monitoring your applications. It is necessary to monitor processes in more detail than at the system level. To do this, you can track the following metrics:

How efficient is the database cache?
What is the size of the tables in your database?
Are your indexes used?
Etc.

Monitoring the size of the database and its elements

1. Size of tablespaces

SELECT spcname, pg_size_pretty(pg_tablespace_size(spcname)) 
FROM pg_tablespace
WHERE spcname<>'pg_global';

After running the query, you will receive information about the size of all tablespaces created in your database. The pg_tablespace_size function provides information about the size of the tablespace in bytes, so we also use the pg_size_pretty function to make it readable. The pg_global space is excluded, since it is used for shared system directories.

2. Size of databases

SELECT pg_database.datname,
       pg_size_pretty(pg_database_size(pg_database.datname)) AS size
FROM pg_database
ORDER BY pg_database_size(pg_database.datname) DESC;

After running the query, you will receive information about the size of each database created within your PostgreSQL instance.

3. Size of schemas in the database

SELECT A.schemaname,
       pg_size_pretty (SUM(pg_relation_size(C.oid))) as table, 
       pg_size_pretty (SUM(pg_total_relation_size(C.oid)-pg_relation_size(C.oid))) as index, 
       pg_size_pretty (SUM(pg_total_relation_size(C.oid))) as table_index,
       SUM(n_live_tup)
FROM pg_class C
LEFT JOIN pg_namespace N ON (N.oid = C .relnamespace)
INNER JOIN pg_stat_user_tables A ON C.relname = A.relname
WHERE nspname NOT IN ('pg_catalog', 'information_schema')
AND C .relkind <> 'i'
AND nspname !~ '^pg_toast'
GROUP BY A.schemaname;

After running the query, you will receive detailed information about each schema in your database: the total size of all tables, the total size of all indexes, the total size of the schema and the total number of rows in all schema tables.

4. Size of tables

SELECT schemaname,
       C.relname AS "relation",
       pg_size_pretty (pg_relation_size(C.oid)) as table,
       pg_size_pretty (pg_total_relation_size (C.oid)-pg_relation_size(C.oid)) as index,
       pg_size_pretty (pg_total_relation_size (C.oid)) as table_index,
       n_live_tup
FROM pg_class C
LEFT JOIN pg_namespace N ON (N.oid = C .relnamespace)
LEFT JOIN pg_stat_user_tables A ON C.relname = A.relname
WHERE nspname NOT IN ('pg_catalog', 'information_schema')
AND C.relkind <> 'i'
AND nspname !~ '^pg_toast'
ORDER BY pg_total_relation_size (C.oid) DESC

After running the query, you will receive detailed information about each table indicating its schema, size without indexes, size of indexes, total size of the table and indexes, as well as the number of rows in the table.

Control of `deadlocks`

If more than one user uses your database, there is always a risk of deadlocks of queries and the appearance of a queue with numerous of queries that will be pending. Most often, this can occur when processing many queries using the same tables. They will prevent each other from completing and prevent other queries from starting. You can read more about this in the official documentation. We will consider ways to find locks and remove them.

1. Monitoring of deadlocks

SELECT COALESCE(blockingl.relation::regclass::text, blockingl.locktype) AS locked_item,
       now() - blockeda.query_start                                     AS waiting_duration,
       blockeda.pid                                                     AS blocked_pid,
       blockeda.query                                                   AS blocked_query,
       blockedl.mode                                                    AS blocked_mode,
       blockinga.pid                                                    AS blocking_pid,
       blockinga.query                                                  AS blocking_query,
       blockingl.mode                                                   AS blocking_mode
FROM pg_locks blockedl
JOIN pg_stat_activity blockeda ON blockedl.pid = blockeda.pid
JOIN pg_locks blockingl ON (blockingl.transactionid = blockedl.transactionid OR
                            blockingl.relation = blockedl.relation AND
                            blockingl.locktype = blockedl.locktype) AND blockedl.pid <> blockingl.pid
JOIN pg_stat_activity blockinga ON blockingl.pid = blockinga.pid AND blockinga.datid = blockeda.datid
WHERE NOT blockedl.granted AND blockinga.datname = current_database();

This query shows all information about blocked queries as well as information about who blocked them.

2. Removing of deadlocks

SELECT pg_cancel_backend(PID_ID);
OR
SELECT pg_terminate_backend(PID_ID);

PID_ID is the ID of the query that blocks other queries. Most often, it is enough to cancel one blocking query to remove the deadlocks and start the entire accumulated queue. The difference between pg_cancel_backend and pg_terminate_backend is that pg_cancel_backend cancels the query, and pg_terminate_backend ends the session and closes the database connection. The pg_cancel_backend command is more gentle and in most cases it will be enough for you. If not, use pg_terminate_backend.

Indicators of optimal performance of your database

1. Cache Hit Ratio

SELECT sum(heap_blks_read) as heap_read,
       sum(heap_blks_hit)  as heap_hit,
       sum(heap_blks_hit) / (sum(heap_blks_hit) + sum(heap_blks_read)) as ratio
FROM 
  pg_statio_user_tables;

The caching coefficient is a measure of read efficiency, measured by the proportion of cache reads compared to the total number of reads from both disk and cache. Except when using data storage, the ideal caching ratio is 99% or higher, which means that at least 99% of reads are performed from the cache and no more than 1% from disk.

2. Using of indexes

SELECT relname,   
       100 * idx_scan / (seq_scan + idx_scan) percent_of_times_index_used,   
       n_live_tup rows_in_table 
FROM pg_stat_user_tables 
WHERE seq_scan + idx_scan > 0 
ORDER BY n_live_tup DESC;

Adding indexes to your database makes a big difference to query performance. Indexes are especially important for large tables. This query shows the number of rows in tables and the percentage of index usage time compared to reading without indexes. Ideal candidates for adding an index are tables larger than 10,000 rows with zero or low index usage.

3. Index Cache Hit Rate

SELECT sum(idx_blks_read) as idx_read,
       sum(idx_blks_hit)  as idx_hit,
       (sum(idx_blks_hit) - sum(idx_blks_read)) / sum(idx_blks_hit) as ratio
FROM pg_statio_user_indexes;

This coefficient is similar to the usual caching coefficient, but is calculated based on index usage data.

4. Unused indexes

SELECT schemaname, relname, indexrelname
FROM pg_stat_all_indexes
WHERE idx_scan = 0 and schemaname <> 'pg_toast' and schemaname <> 'pg_catalog'

This query finds indexes that were created but not used in SQL queries.

5. Database bloat

SELECT
  current_database(), schemaname, tablename, /*reltuples::bigint, relpages::bigint, otta,*/
  ROUND((CASE WHEN otta=0 THEN 0.0 ELSE sml.relpages::float/otta END)::numeric,1) AS tbloat,
  CASE WHEN relpages < otta THEN 0 ELSE bs*(sml.relpages-otta)::BIGINT END AS wastedbytes,
  iname, /*ituples::bigint, ipages::bigint, iotta,*/
  ROUND((CASE WHEN iotta=0 OR ipages=0 THEN 0.0 ELSE ipages::float/iotta END)::numeric,1) AS ibloat,
  CASE WHEN ipages < iotta THEN 0 ELSE bs*(ipages-iotta) END AS wastedibytes
FROM (
  SELECT
    schemaname, tablename, cc.reltuples, cc.relpages, bs,
    CEIL((cc.reltuples*((datahdr+ma-
      (CASE WHEN datahdr%ma=0 THEN ma ELSE datahdr%ma END))+nullhdr2+4))/(bs-20::float)) AS otta,
    COALESCE(c2.relname,'?') AS iname, COALESCE(c2.reltuples,0) AS ituples, COALESCE(c2.relpages,0) AS ipages,
    COALESCE(CEIL((c2.reltuples*(datahdr-12))/(bs-20::float)),0) AS iotta /* very rough approximation, assumes all cols */
  FROM (
    SELECT
      ma,bs,schemaname,tablename,
      (datawidth+(hdr+ma-(case when hdr%ma=0 THEN ma ELSE hdr%ma END)))::numeric AS datahdr,
      (maxfracsum*(nullhdr+ma-(case when nullhdr%ma=0 THEN ma ELSE nullhdr%ma END))) AS nullhdr2
    FROM (
      SELECT
        schemaname, tablename, hdr, ma, bs,
        SUM((1-null_frac)*avg_width) AS datawidth,
        MAX(null_frac) AS maxfracsum,
        hdr+(
          SELECT 1+count(*)/8
          FROM pg_stats s2
          WHERE null_frac<>0 AND s2.schemaname = s.schemaname AND s2.tablename = s.tablename
        ) AS nullhdr
      FROM pg_stats s, (
        SELECT
          (SELECT current_setting('block_size')::numeric) AS bs,
          CASE WHEN substring(v,12,3) IN ('8.0','8.1','8.2') THEN 27 ELSE 23 END AS hdr,
          CASE WHEN v ~ 'mingw32' THEN 8 ELSE 4 END AS ma
        FROM (SELECT version() AS v) AS foo
      ) AS constants
      GROUP BY 1,2,3,4,5
    ) AS foo
  ) AS rs
  JOIN pg_class cc ON cc.relname = rs.tablename
  JOIN pg_namespace nn ON cc.relnamespace = nn.oid AND nn.nspname = rs.schemaname AND nn.nspname <> 'information_schema'
  LEFT JOIN pg_index i ON indrelid = cc.oid
  LEFT JOIN pg_class c2 ON c2.oid = i.indexrelid
) AS sml
ORDER BY wastedbytes DESC;

Database bloat is disk space that has been used by a table or index and is available for reuse by the database, but has not been freed up. Bloating occurs when tables or indexes are updated. If you have a loaded database with a lot of deletion operations, bloating can leave a lot of unused space in your database and affect performance if it is not removed. The metrics of wastedbytes for tables and wastedibytes for indexes will show you if you have any serious bloat issues. To combat with bloat there is a VACUUM command.

6. Checking VACUUM starts

SELECT relname, 
       last_vacuum, 
       last_autovacuum 
FROM pg_stat_user_tables;

The bloat can be reduced using the VACUUM command, but PostgreSQL also supports AUTOVACUUM. You can read about its setup here.

A few more queries that may be useful to you

1. Shows the number of open connections

SELECT COUNT(*) as connections,
       backend_type
FROM pg_stat_activity
where state = 'active' OR state = 'idle'
GROUP BY backend_type
ORDER BY connections DESC;

Shows open connections to all databases in your PostgreSQL instance. If you have several databases in one PostgreSQL, then add datname = ‘Your Database’ to the WHERE condition.

2. Shows running queries

SELECT pid, age(clock_timestamp(), query_start), usename, query, state
FROM pg_stat_activity
WHERE state != 'idle' AND query NOT ILIKE '%pg_stat_activity%'
ORDER BY query_start desc;

Shows running queries and their duration.

Database and Middleware Knowledge Sharing

Thursday, March 14, 2024

Top useful SQL queries for PostgreSQL

Why do you need to monitor the state of PostgreSQL?

Monitoring the size of the database and its elements

1. Size of tablespaces

2. Size of databases

3. Size of schemas in the database

4. Size of tables

Control of `deadlocks`

1. Monitoring of deadlocks

2. Removing of deadlocks

Indicators of optimal performance of your database

1. Cache Hit Ratio

2. Using of indexes

3. Index Cache Hit Rate

4. Unused indexes

5. Database bloat

6. Checking VACUUM starts

A few more queries that may be useful to you

1. Shows the number of open connections

2. Shows running queries

No comments:

Post a Comment

In Oracle How to shrink Securefile LOB tables

Report Abuse

Thursday, March 14, 2024

Top useful SQL queries for PostgreSQL

Why do you need to monitor the state of PostgreSQL?

Monitoring the size of the database and its elements

1. Size of tablespaces

2. Size of databases

3. Size of schemas in the database

4. Size of tables

Control of deadlocks

1. Monitoring of deadlocks

2. Removing of deadlocks

Indicators of optimal performance of your database

1. Cache Hit Ratio

2. Using of indexes

3. Index Cache Hit Rate

4. Unused indexes

5. Database bloat

6. Checking VACUUM starts

A few more queries that may be useful to you

1. Shows the number of open connections

2. Shows running queries

No comments:

Post a Comment

In Oracle How to shrink Securefile LOB tables

Control of `deadlocks`