Context
I am looking for a way to programmatically introduce LIMIT and OFFSET in any SQL query (provided in a conf file). To pad the initial query with f"{query} LIMIT {params.offset} OFFSET {batch * params.offset} typically works for simple SQL queries...
SELECT * FROM some_table LIMIT 100000 OFFSET 3400000
... but is incorrect for JOIN based queries (and probably many other cases)
SELECT d.DiagnosticIDUnique, d.LogementID, d.DateValidite, TypeEnergieLibelle_list, td.TypeDiagnostic, c.NomClient, c.SiretClient
FROM DIAGNOSTIC as d LEFT JOIN TYPE_DIAGNOSTIC as td ON d.TypeDiagnosticID = td.TypeDiagnosticID
LIMIT 100000 OFFSET 3400000
Moreover, LIMIT and OFFSET won't work for some DBMS such as MSSQL, which, from what I understood, relies on an ORDER BY and expects something like
SELECT * FROM some_table
ORDER BY Id
OFFSET 3400000 ROWS
FETCH NEXT 10000 ROWS ONLY
Need
Managing this complexity and providing an high level API is typically the job of an ORM. My project is based on pandas and comes with SQLAlchemy as a dependance. I wondered if I could
- load a query string to create a
Queryobject - easily introduce a pagination and
- convert it back to a query string for the target DBMS
I am a complete noob on SQLAlchemy (only used Django's in previous projects). Here are some of the questions I am asking myself before digging into it
- Should I necessarily go through the creation of data models?
- Is this an overkill ? (some simpler, almost generic way to handle things)