Today an interesting problem came up on the IRC channel: Given a table of integer IDs, find the first non-existing ID starting from zero. A straightforward solution using generate_series() and a left join is in the comments, but we're trying to do better than that.
For this solution, there are four cases we need to worry about (and I only thought of one of these on my own, I'll blame the lack of caffeine):
- There are no rows in the table.
- There are rows in the table, but their IDs are all greater than zero.
- There are rows between zero and N, but there are gaps.
- There are rows between zero and N and no gaps.
-- Use zero if available (SELECT 0 AS id WHERE NOT EXISTS (SELECT 1 FROM foo WHERE id = 0) ) UNION ALL -- Find the smallest available ID inside a gap (SELECT id + 1 FROM ( SELECT id, lead(id) OVER (ORDER BY id) FROM foo ) ss WHERE lead - id > 1 ORDER BY id LIMIT 1) UNION ALL -- Or if there were no gaps and zero wasn't -- available, use max + 1. But don't return -- a row if the table is empty; the first -- query took care of that. (SELECT max(id) + 1 FROM foo HAVING max(id) IS NOT NULL) ORDER BY id LIMIT 1 ;
The last ORDER BY id LIMIT 1 affects the result of the whole UNION so we always get the smallest available ID if more than one of these queries return a row. For that same reason we also need to use parentheses around the second query; we only want to return one ID if there are several gaps or the only gap is wide. I used parentheses for all three queries because I thought having them only around one query in a UNION looked funny.
For the record, finding the gap on a version older than 8.4 is still possible, but I'm not going to cover it here. Consider upgrading ;-)
Edit: It occurred to me that don't need to separately query for the max value; lead(id) will be NULL for the last row. This gives us a bit nicer query:
-- Use zero if available (SELECT 0 AS id WHERE NOT EXISTS (SELECT 1 FROM foo WHERE id = 0) ) UNION ALL -- Find the smallest available ID inside a gap, or max + 1 -- if there are no gaps. (SELECT id + 1 FROM ( SELECT id, lead(id) OVER (ORDER BY id) FROM foo ) ss WHERE lead - id > 1 OR lead IS NULL ORDER BY id LIMIT 1) ORDER BY id LIMIT 1 ;
Unfortunately, you still need the first query of the UNION, but there's a way around that, too: insert a dummy row with id = -1 to the table. That gives us an even nicer query:
SELECT id + 1 FROM ( SELECT id, lead(id) OVER (ORDER BY id) FROM foo ) ss WHERE lead - id > 1 OR lead IS NULL ORDER BY id LIMIT 1 ;
You could also add the dummy row in a subquery instead of selecting straight from the table, but then the query can't use an index at all.