Browsing articles from "February, 2011"
Feb 28, 2011
Nathan Thom

PostgreSQL – Impact of Cancelling a Vacuum Full


A vacuum full (in 8.3) goes through the table row by row and packs them close together filling in all unused parts of the data blocks. But what happens if you cancel a long running vacuum full? I’ve often had to do this because the vacuum full runs too slow and blocks any other accesses to the table. Things I figure might happen:

1. The vacuum is partially done – the parts it processed are packed together and next time it runs it can get the remaining parts.
2. Rolls back everything it did and actually increases the wasted space as a result.
3. No changes.

I’m hoping #1 but fear #2 so decided to run a test.

I need to create a table with about 50% real rows vs empty rows, so the vacuum full runs for a while and does some real work. Also, I don’t want it finishing too fast before I can cancel it so want a fair amount of data.

nt_test=# create table beer as select a.* from pg_class a, pg_class b, pg_class c where 1=0;
SELECT

Now, add this table to pg_autovacuum to exclude it from being picked up by the autovacuum process.

nt_test=# select oid from pg_class where relname='beer';
oid
----------
17170397
nt_test=# insert into pg_autovacuum (vacrelid, enabled, vac_base_thresh, vac_scale_factor, anl_base_thresh, anl_scale_factor, vac_cost_delay, vac_cost_limit, freeze_min_age, freeze_max_age)
values (17170397,false,-1,-1,-1,-1,-1,-1,-1,-1);
INSERT 0 1
nt_test=# select last_autovacuum from pg_stat_all_tables where relname='beer';
last_autovacuum
-----------------

(1 row)

Create some rows.

nt_test=# insert into beer select a.* from pg_class a, pg_class b, pg_class c;
INSERT 0 12008989

Create some bloat.

nt_test=# begin;
BEGIN
nt_test=# insert into beer select a.* from pg_class a, pg_class b, pg_class c;
INSERT 0 12008989
nt_test=# rollback;
ROLLBACK

Create some more rows after the empty space.

nt_test=# insert into beer select a.* from pg_class a, pg_class b, pg_class c;
INSERT 0 12008989

I wrote a little script to help map out the location of rows within the table. It extracts the block and row numbers from the ctid virtual column for each row. Using these, it shows how many rows exist within a range of blocks to reveal row density and any large gaps.

nt_test=# analyze beer;
ANALYZE
nt_test=# select relpages from pg_class where relname='beer';
relpages
----------
830733
nt_test=# \i map
chunk | numrows
--------+---------
0 | 4603159
100000 | 4328102
200000 | 3077728
500000 | 1965895
600000 | 4861646
700000 | 3950850
800000 | 1230598
(7 rows)

So the table is about 6GB in size. I’m not sure how long it will take to vacuum, so I’ll give it progressively longer runs and check the results after each time. Note that the vacuum full process will grab an Access Exclusive lock on the table and block all reads and writes, so don’t run on production!

nt_test=# vacuum full beer;
Cancel request sent
ERROR: canceling statement due to user request

I ran it several times in increasing time frames.

Timeframe relpages
1 min 1374220
2 mins 1374220
5 mins 1374220
15 mins 1374220

After each run, I checked the size and map of rows. It was the same every time:

nt_test=# analyze beer;
ANALYZE
nt_test=# select relpages from pg_class where relname='beer';
relpages
----------
830733
(1 row)
nt_test=# \i map
chunk | numrows
--------+---------
0 | 4603159
100000 | 4328102
200000 | 3077728
500000 | 1965895
600000 | 4861646
700000 | 3950850
800000 | 1230598
(7 rows)

So cancelling a vacuum full has not caused the table to grow even more. Moreover, it did not keep the partially completed rows to reduce the effort required next time. It seems to have done nothing!

For it to not have used any extra space while running, I’m guessing it has identified the empty rows for reuse – in effect performing a normal vacuum – and started using them. When it is cancelled, the inserted rows are abandoned. Does it modify the alive/dead status of each row afterwards, or has everything truly gone back to the original state? I tested this by inserting more rows and seeing if they went into the gap or appended to the end of the.

nt_test=# insert into beer select a.* from pg_class a, pg_class b, pg_class c;
INSERT 0 12008989
nt_test=# \i map
chunk | numrows
---------+---------
0 | 4603159
100000 | 4328102
200000 | 3077728
500000 | 1965895
600000 | 4861646
700000 | 3950850
800000 | 4327858
900000 | 4632300
1000000 | 3988977
1100000 | 290452
(10 rows)

It has not reused the space near block 300,000 but added them to the end of the table, indicating it did not save the alive/dead state of the rows it identified. To make sure I’m not going crazy I checked that a manual vacuum does indeed allow it to reclaim that space:

nt_test=# vacuum beer;
VACUUM
nt_test=# insert into beer select a.* from pg_class a, pg_class b, pg_class c;
INSERT 0 12008989
nt_test=# \i map
chunk | numrows
---------+---------
0 | 4603159
100000 | 4328102
200000 | 3996576
300000 | 4815680
400000 | 4096102
500000 | 4121268
600000 | 4861646
700000 | 3950850
800000 | 4327858
900000 | 4632300
1000000 | 3988977
1100000 | 313438
(12 rows)

It does.

So what did I learn here? Cancelling a vacuum full seems to have no effect on the table, but means all the work done is wasted and needs to be repeated the next time you run it. And it’s SLOW!

Pages:12345»

Article Series

Kick Ass PostgreSQL Books

Kick Ass Oracle Books

I've read lots of Oracle books, but these are by far the best I've encountered:

Categories