/ Lean Analytics, Engineering, Business Intelligence

Useful PostgreSQL Queries For Monitoring Purpose

by Huy Nguyen

Below are some useful Postgres queries for both troubleshooting, monitoring and analytics purpose.

Note: if you're looking for a simple tool to run queries & visualize PostgreSQL results, check out Holistics.io

Get Running Queries (And Lock statuses) in PostgreSQL

The below display currently running queries, and a granted column indicate whether the query has actually started running (or still waiting to acquire locks).

SELECT
  S.pid,
  age(clock_timestamp(), query_start),
  usename,
  query,
  L.mode,
  L.locktype,
  L.granted
FROM pg_stat_activity S
inner join pg_locks L on S.pid = L.pid 
order by L.granted, L.pid DESC

Do note that this query will return multiple records for the same process, since it will list down all the corresponding locks that goes with the query.

Cancel Running Queries

To kill a particular query, simply get its pid (using the above query) and run:

SELECT pg_cancel_backend(pid);

This will send a SIGINT to the current process. Learn more.

Show Biggest PostgreSQL Tables/Indexes And Their Size

This query returns list of the largest (in file sizes) tables and indexes in your database

SELECT
  nspname || '.' || relname AS "relation",
  pg_size_pretty(pg_relation_size(C.oid)) AS "size"
FROM pg_class C
LEFT JOIN pg_namespace N ON (N.oid = C.relnamespace)
WHERE nspname NOT IN ('pg_catalog', 'information_schema')
ORDER BY pg_relation_size(C.oid) DESC
LIMIT 20;

Show All PostgreSQL Databases And Their Size

This query returns list of the largest databases in your cluster.

select
  datname as db,
  pg_size_pretty(pg_database_size(datname)) as size
from pg_database
order by pg_database_size(datname) desc;

Sample Output:

  db       | size    
-----------+---------
 prod      | 9852 MB
 postgres  | 7581 kB
 template1 | 7449 kB
 template0 | 7449 kB
(4 rows)

Show Table Bloats

Overtime due to MVCC, your table will grow in size (called table bloat) - this is why regular VACUUM is needed. This query will show you list of tables and indexes with the most bloats.

The value represents the number of "wasted bytes", or the difference between what is actually used by the table and index, and what we compute that it should be.

The way it works is it estimates the optimized size of the table/index by a calculation from each row sizes times total rows, and compare that against the actual table size. Do note that this is an estimate, not an actual figure.

with foo as (
  SELECT
    schemaname, tablename, hdr, ma, bs,
    SUM((1-null_frac)*avg_width) AS datawidth,
    MAX(null_frac) AS maxfracsum,
    hdr+(
      SELECT 1+COUNT(*)/8
      FROM pg_stats s2
      WHERE null_frac<>0 AND s2.schemaname = s.schemaname AND s2.tablename = s.tablename
    ) AS nullhdr
  FROM pg_stats s, (
    SELECT
      (SELECT current_setting('block_size')::NUMERIC) AS bs,
      CASE WHEN SUBSTRING(v,12,3) IN ('8.0','8.1','8.2') THEN 27 ELSE 23 END AS hdr,
      CASE WHEN v ~ 'mingw32' THEN 8 ELSE 4 END AS ma
    FROM (SELECT version() AS v) AS foo
  ) AS constants
  GROUP BY 1,2,3,4,5  
), rs as (
  SELECT
    ma,bs,schemaname,tablename,
    (datawidth+(hdr+ma-(CASE WHEN hdr%ma=0 THEN ma ELSE hdr%ma END)))::NUMERIC AS datahdr,
    (maxfracsum*(nullhdr+ma-(CASE WHEN nullhdr%ma=0 THEN ma ELSE nullhdr%ma END))) AS nullhdr2
  FROM foo  
), sml as (
  SELECT
    schemaname, tablename, cc.reltuples, cc.relpages, bs,
    CEIL((cc.reltuples*((datahdr+ma-
      (CASE WHEN datahdr%ma=0 THEN ma ELSE datahdr%ma END))+nullhdr2+4))/(bs-20::FLOAT)) AS otta,
    COALESCE(c2.relname,'?') AS iname, COALESCE(c2.reltuples,0) AS ituples, COALESCE(c2.relpages,0) AS ipages,
    COALESCE(CEIL((c2.reltuples*(datahdr-12))/(bs-20::FLOAT)),0) AS iotta -- very rough approximation, assumes all cols
  FROM rs
  JOIN pg_class cc ON cc.relname = rs.tablename
  JOIN pg_namespace nn ON cc.relnamespace = nn.oid AND nn.nspname = rs.schemaname AND nn.nspname <> 'information_schema'
  LEFT JOIN pg_index i ON indrelid = cc.oid
  LEFT JOIN pg_class c2 ON c2.oid = i.indexrelid
)

SELECT
  current_database(), schemaname, tablename, /*reltuples::bigint, relpages::bigint, otta,*/
  ROUND((CASE WHEN otta=0 THEN 0.0 ELSE sml.relpages::FLOAT/otta END)::NUMERIC,1) AS tbloat,
  CASE WHEN relpages < otta THEN 0 ELSE bs*(sml.relpages-otta)::BIGINT END AS wastedbytes,
  iname, /*ituples::bigint, ipages::bigint, iotta,*/
  ROUND((CASE WHEN iotta=0 OR ipages=0 THEN 0.0 ELSE ipages::FLOAT/iotta END)::NUMERIC,1) AS ibloat,
  CASE WHEN ipages < iotta THEN 0 ELSE bs*(ipages-iotta) END AS wastedibytes
FROM sml
ORDER BY wastedbytes DESC

Query extracted but rewrote from checkpostgres

Sample Output:

current_database | schemaname |            tablename            | tbloat | wastedbytes |                             iname                             | ibloat | wastedibytes
------------------+------------+---------------------------------+--------+-------------+---------------------------------------------------------------+--------+--------------
 dashboard        | public     | job_logs                        |    1.1 |  4139507712 | job_logs_pkey                                                 |    0.2 |            0
 dashboard        | public     | job_logs                        |    1.1 |  4139507712 | index_job_logs_on_job_id_and_created_at                       |    0.4 |            0
 dashboard        | public     | events                          |    1.1 |  3571736576 | events_pkey                                                   |    0.1 |            0
 dashboard        | public     | events                          |    1.1 |  3571736576 | index_events_on_tenant_id                                     |    0.1 |            0
 dashboard        | public     | events                          |    1.1 |  3571736576 | index_events_on_event_type                                    |    0.2 |            0
 dashboard        | public     | jobs                            |    1.1 |  2013282304 | index_jobs_on_status                                          |    0.0 |            0
 dashboard        | public     | jobs                            |    1.1 |  2013282304 | index_jobs_on_tag                                             |    0.3 |            0
 dashboard        | public     | jobs                            |    1.1 |  2013282304 | index_jobs_on_tenant_id                                       |    0.2 |            0
 dashboard        | public     | jobs                            |    1.1 |  2013282304 | index_jobs_on_created_at                                      |    0.2 |            0
 dashboard        | public     | jobs                            |    1.1 |  2013282304 | index_jobs_on_created_at_queued_or_running                    |    0.0 |     21086208

You should be looking at:

  • tbloat: table bloat, ratio between what it current is, and what it can be optimized to.
  • wastedbytes: number of bytes wasted
  • ibloat & wastedibytes: same as above, but for indexes.

When you see a table with high bloats, then consider running VACUUM ANALYZE on it.

Conclusion

I hope the above queries are useful for PostgreSQL monitoring purpose.

One last note, if you're looking for a simple tool to run queries & visualize PostgreSQL results, check out our product Holistics.io.

Huy Nguyen

Huy Nguyen

Original creator and cofounder of Holistics, a data platform for tech companies. Holistics’ customers are tech startups like Grab, Traveloka, ShopBack, 99co, Tech In Asia and alike.

Read More

From SQL Queries To Beautiful Charts

Connect to your database and build beautiful charts with Holistics BI

Learn More
Grab Logo

"Holistics is the solution to the increasingly many and complex data requests from the operational teams"


Tang Yee Jie - Senior Data Analyst, Grab

Tang Yee Jie

Senior Data Analyst, Grab