I have a PostgreSQL database (v9.5.3) that's hosting "jobs" for workers to pull, run, and commit back.
When a worker wants a job, it runs something to the effect of:
SELECT MIN(sim_id) FROM job WHERE job_status = 0;
-- status 0 indicates it's ready to be run
job is a table with this schema:
CREATE TABLE commit_schema.job (
sim_id serial NOT NULL,
controller_id smallint NOT NULL,
controller_parameters smallint NOT NULL,
model_id smallint NOT NULL,
model_parameters smallint NOT NULL,
client_id smallint,
job_status smallint DEFAULT 0,
initial_glucose_id smallint NOT NULL
);
Afterwards, it uses this sim_id to piece together a bunch of parameters in a JOIN:
SELECT a.par1, b.par2 FROM
a INNER JOIN b ON a.sim_id = b.sim_id;
These parameters are then return to the worker, along with the sim_id, and the job is run. The sim_id is locked by setting job.job_status to 1, using an UPDATE:
UPDATE job SET job_status = 1 WHERE sim_id = $1;
The results are then committed using that same sim_id.
Ideally,
Workers wouldn't under any circumstances be able to get the same
sim_id.Two workers requesting a job won't error out, one will just have to wait to receive a job.
I think that using a serializable isolation level will ensure that the MIN() always returns unique sim_id's, but I believe this may also be achievable using a read committed isolation level. Then again, MIN() may not be able to concurrently and deterministically give unique sim_id's to two concurrent workers?