02 August 2006

Series generating functions in PostgreSQL

I'm not sure how much the following constitutes Functional Programming in SQL, but various aspects of SQL reminds me much of the list and array processing functionalities in perl, python and Unix pipes.


Anyway, consider the following clone of python's range() built-in function in PostgreSQL:
Aw, crud. I'm an idiot. PostgreSQL has already a built-in generate_series() to cover this. Preserving my original code here though:

-- range() to emulate python's range(start, stop, step)
-- Note: "immutable" kind of makes this a pure function with no side effects
CREATE OR REPLACE FUNCTION range(int, int, int) RETURNS SETOF INT
LANGUAGE plpgsql IMMUTABLE AS '
DECLARE
i integer;
start alias for $1;
stop alias for $2;
step alias for $3;
BEGIN
i := start;

IF step = 0 THEN
EXIT;
END IF;

IF step > 0 THEN
LOOP
IF i >= stop THEN
EXIT;
END IF;
RETURN NEXT i;
i := i + step;
END LOOP;
ELSE
LOOP
IF i < stop THEN
EXIT;
END IF;
RETURN NEXT i;
i := i + step;
END LOOP;
END IF;
END;
';


-- Overloaded range() to emulate python's range(start, stop)
-- Written in the "sql" language because pgsql inlines SQL functions (faster).
CREATE OR REPLACE FUNCTION range(int, int) RETURNS SETOF INT
IMMUTABLE LANGUAGE SQL AS 'select * from range($1, $2, 1);';

-- Overloaded range() to emulate python's range(stop)
-- Written in the "sql" language because pgsql inlines SQL functions (faster).
CREATE OR REPLACE FUNCTION range(int) RETURNS SETOF INT
IMMUTABLE LANGUAGE SQL AS 'select * from range(0, $1, 1);';


I'm losing my train of thought here, but point is that with generate_series(), and postgresql'S CREATE AGGREGATE as a starting point, you'd be able to construct some pretty functional style SQL programming.

One quick, practical use of generate_series() would be to quickly fill in your database with test values:


-- Insert 1000 random users, 20-40 years of age,
INSERT INTO users (uid, username, date_of_birth)
SELECT
nextval('uid_seq'),
generate_random_username(),
now() - ('1 year'::interval * random() * 20) - '20 years'::interval
FROM
generate_series(1, 1000);

which the method we often use at work. More later.