Something happened earlier this week (or maybe it was last week, I forget), and I was quite perplexed for a good 30 seconds, so I thought I'd try and make this problem a bit more well-known. Consider the following schema:
CREATE TABLE blacklist ( person text ); INSERT INTO blacklist VALUES ('badguy'); CREATE TABLE orders ( orderid serial PRIMARY KEY, person text, amount numeric ); CREATE FUNCTION is_blacklisted(_person text) RETURNS boolean STABLE AS $$ BEGIN RETURN EXISTS (SELECT 1 FROM blacklist WHERE person = _person); END $$ LANGUAGE plpgsql; CREATE FUNCTION new_order(_person text, _amount numeric) RETURNS bigint AS $$ DECLARE _orderid bigint; BEGIN IF is_blacklisted(_person) THEN RAISE EXCEPTION 'person has been blacklisted'; END IF; INSERT INTO orders (person, amount) VALUES (_person, _amount) RETURNING orderid INTO _orderid; RETURN _orderid; END $$ LANGUAGE plpgsql;
Now everything is running smoothly:
=# select new_order('goodguy', 100.0); new_order ----------- 1 (1 row) =# select new_order('badguy', 100.0); ERROR: person has been blacklisted
But something's wrong with the blacklist table: it doesn't tell you who blacklisted the person or why. You also happen to know that you added all of the existing people on the blacklist, so you want to have them reflect that. You run the following ALTER TABLEs:
ALTER TABLE blacklist ADD COLUMN reason text, -- use DEFAULT to assign values to existing rows ADD COLUMN blacklisted_by text NOT NULL DEFAULT 'you'; -- and then drop it ALTER TABLE blacklist ALTER COLUMN blacklisted_by DROP DEFAULT;
And everything seems to be working nicely. However, a few days later you realize that "badguy" stole all your money by getting through a couple of fake orders. How did that happen?
The answer is a bit complicated, but the gist of it is that if a STABLE function accesses the table for the first time in a transaction while a table-rewriting operation (such as the ALTER TABLE we used) is going on concurrently, it will see an empty table, instead of either the old or the new version of the data. is_blacklisted() returned FALSE because of this, and "badguy" happened to be able to create an order. This caveat applies to ALL SERIALIZATION MODES; not even READ COMMITTED users are safe. However, SQL language functions don't seem to have this problem in more recent versions of postgres.
This really questions the safety of table-rewriting operations in an application heavy on PL/PgSQL (or some other procedural languages). I know we're banning them internally.