在 SQLAlchemy/Elixir 中选择不同的列值

在使用 SQLAlchemy 和 Elixir 编写的一个小脚本中,我需要获取特定列的所有不同值。在普通的 SQL 中,这只是一个简单的问题

SELECT DISTINCT `column` FROM `table`;

我知道我可以“手动”运行这个查询,但是如果可以的话,我宁愿坚持使用 SQLAlchemy 声明语法(和/或 Elixir)。我确信这一定是可能的,我甚至在 SQLAlchemy 文档中看到过类似的暗示,但是我已经花了几个小时查找这些文档(以及 Elixir 的文档) ,我似乎无法真正弄清楚它是如何完成的。我错过了什么?

116386 次浏览

You can query column properties of mapped classes and the Query class has a generative distinct() method:

for value in Session.query(Table.column).distinct():
pass
for user in session.query(users_table).distinct():
print user.posting_id

For this class:

class Assurance(db.Model):
name = Column(String)

you can do this:

assurances = []
for assurance in Assurance.query.distinct(Assurance.name):
assurances.append(assurance.name)

and you will have the list of distinct values

I wanted to count the distinct values, and using .distinct() and .count() would count first, resulting in a single value, then do the distinct. I had to do the following

from sqlalchemy.sql import func
Session.query(func.count(func.distinct(Table.column))

For class,

class User(Base):
name = Column(Text)
id = Column(Integer, primary_key=True)

Method 1: Using load_only

from sqlalchemy.orm import load_only
records= (db_session.query(User).options(load_only(name)).distinct().all())
values = [record[0] if len(record) == 1 else record for record in records]  # list of distinct values

Method2: without any imports

records = db_session.query(User.name).distinct().all()
l_values = [record.__dict__[l_columns[0]] for record in records]

SQL Alchemy version 2 encourages the use of the select() function. You can use an SQL Alchemy table to build a select statement that extracts unique values:

select(distinct(table.c.column_name))

SQL Alchemy 2.0 migration ORM usage:

"The biggest visible change in SQLAlchemy 2.0 is the use of Session.execute() in conjunction with select() to run ORM queries, instead of using Session.query()."

Reproducible example using pandas to collect the unique values.

Define and insert the iris dataset

Define an ORM structure for the iris dataset, then use pandas to insert the data into an SQLite database. Pandas inserts with if_exists="append" argument so that it keeps the structure defined in SQL Alchemy.

import seaborn
import pandas
from sqlalchemy import create_engine
from sqlalchemy import MetaData, Table, Column, Text, Float
from sqlalchemy.orm import Session

Define metadata and create the table

engine = create_engine('sqlite://')
meta = MetaData()
meta.bind = engine
iris_table = Table('iris',
meta,
Column("sepal_length", Float),
Column("sepal_width", Float),
Column("petal_length", Float),
Column("petal_width", Float),
Column("species", Text))
iris_table.create()

Load data into the table

iris = seaborn.load_dataset("iris")
iris.to_sql(name="iris",
con=engine,
if_exists="append",
index=False,
chunksize=10 ** 6,
)

Select unique values

Re using the iris_table from above.

from sqlalchemy import distinct, select
stmt = select(distinct(iris_table.c.species))
df = pandas.read_sql_query(stmt, engine)
df
#       species
# 0      setosa
# 1  versicolor
# 2   virginica