DB Upsertion¶

Helper CRUD methods to interact with db

Version¶

In order to keep previous versions of a particular (table, column, row) triplet, we use a Version table.

class Version(SQLModel, table=True):  # type: ignore
    """
    Table handling versioning.

    Attributes are:
        - table: the table to version
        - column: the column of table to version
        - row_id: the the row id of the column of table to version
        - col_type: the column type
        - value: the value to store (previous row/column/table version)
    """
    __tablename__ = 'version'
    id: Optional[int] = Field(default=None, primary_key=True)
    created_at: datetime = Field(default_factory=datetime.utcnow)
    table: str = Field(index=True)
    column: str = Field(index=True)
    row_id: int = Field(index=True)
    col_type: ColType
    value: str | None = Field(index=True)

    __table_args__ = (
        Index(
            'version_filter_id',
            'table',
            'row_id',
            'column'
        ),
    )

To be noted ,the col_type attribute encodes what column type we are able to handle (6 as of 2025/03/18)

@unique
class ColType(str, Enum):
    """
    Enum listing all col types allowed to be stored in version class
    """
    STR = 'str'
    BOOL = 'bool'
    INT = 'int'
    FLOAT = 'float'
    DATE = 'datetime'
    ENUM = 'enum'


COL_TYPES = str | int | bool | float | datetime | Enum | None

To retrieve previous Version, use the following two methods

`get_versions`¶

Retrieve all previous versions of a given (table: str, column: str, row_id: int) triple

`get_row_versions`¶

Retrieve all previous versions all of versioned columns of a given (table: str, row_id: int) couple

`upsert_data` and `upsert_df_data`¶

Core methods of these CRUD helpers. upsert_df_data works on a pandas dataframe, and upsert_data on a list of dict | SQLModel. upsert_df_data converts its input and calls upsert_data under the hood.

Both do the upsertion of the passed data and the (optional in the case of upsert_data) passed db_schema: SQLModelMetaclass.

NB: db_schema is optional for upsert_data, as it can be retrieved by reflection in the case of a list of SQLModel objects.

Upsertion means the following:

all new dict data are inserted into the db. New data are identified by their sfield and the upsert_selector method (more on that below)
all dict already present in db are upserted, only editing field (more on that below), and only storing in Version the previous version if the value is indeed modified.

Example:

class UpFoo(SQLModel, table=True):  # type: ignore
    """
    Test class to test db upsertion
    """
    __tablename__ = 'up_foo'
    id: Optional[int] = Field(default=None, primary_key=True)
    bar1: str = sfield()
    bar2: bool = sfield()
    bar3: str = field()
    bar4: bool = field()
    bar5: float = field()
    bar6: int = field()
    bar7: datetime = field()
    bar8: Permission = field()
    bar9: Optional[str] = field(default=None)

foo = UpFoo(bar1='bar', bar2=True, bar3='bar', bar4=False, bar5=42.42, bar6=42,
            bar7=datetime(2025, 3, 17), bar8=Permission.ADMIN)
foo2 = UpFoo(bar1='bar', bar2=True, bar3='babar', bar4=False, bar5=42.42, bar6=42,
             bar7=datetime(2025, 3, 17), bar8=Permission.ADMIN)

Calling upsert_data on first [foo] and then [foo2] will only create one row in up_foo, and one in version (storing the old bar value of column bar4). You can go on the test suite to learn more.

`upsert_deletor`¶

Delete one row of the db corresponding to the passed values: SQLModel.

db_schema is retrieved by reflection.

`field` and `sfield`¶

simple wrapper methods around SQLModel Field. Used to specify whether an attribute is required for uniquely selecting a row (this is sfield) or need to be versioned (this is field). The interface is in both cases identical to Field, with this additional behaviour. A column no used to select and not to be versioned can use the regular Field (as most presumably every id primary key column).

`upsert_selector`¶

Select one row of the db corresponding to the passed values: SQLModel.

db_schema is retrieved by reflection.

`db_to_value`¶

Version store all types as str (or None) type. To convert back into the original column col_type, call this method.

def db_to_value(db_value: str | None, col_type: type | EnumType) -> COL_TYPES:
    """
    Convert back a str version value stored to a real value (types hancled listed in ColType)
    NB: assumption that if the type is not known, this is an enum type.
    """
    if db_value is None:
        return None
    if col_type in [int, str, float]:
        return col_type(db_value)
    if col_type == bool:
        return True if db_value == 'True' else False
    if col_type == datetime:
        return datetime.strptime(db_value, '%Y-%m-%d %H:%M:%S.%f')
    return col_type[db_value]   # type: ignore[index]

DB Upsertion¶

Version¶

get_versions¶

get_row_versions¶

upsert_data and upsert_df_data¶

upsert_deletor¶

field and sfield¶

upsert_selector¶

db_to_value¶