Django dynamic model fields

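I'm working on a multi-tenanted application in which some users can define their own data fields (via the admin) to collect additional data in forms and report on that data. The latter bit makes JSONField not a great option, so instead I have the following solution: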

    from django.conf import settings
    from django.contrib.sites.models import Site
    from django.db import models


    class CustomDataField(models.Model):
        """
        Abstract specification for arbitrary data fields.
        Not used for holding data itself, but metadata about the fields.
        """
        # on_delete is required since Django 2.0
        site = models.ForeignKey(Site, default=settings.SITE_ID, on_delete=models.CASCADE)
        name = models.CharField(max_length=64)

        class Meta:
            abstract = True


    class CustomDataValue(models.Model):
        """
        Abstract specification for arbitrary data.
        """
        value = models.CharField(max_length=1024)

        class Meta:
            abstract = True

Note how CustomDataField has a ForeignKey to Site. Each Site will have a different set of custom data fields, but use the same database. Then the various concrete data fields can be defined as:

    from django.contrib.auth.models import User


    class UserCustomDataField(CustomDataField):
        pass


    class UserCustomDataValue(CustomDataValue):
        custom_field = models.ForeignKey(UserCustomDataField, on_delete=models.CASCADE)
        user = models.ForeignKey(User, related_name='custom_data', on_delete=models.CASCADE)

        class Meta:
            unique_together = (('user', 'custom_field'),)

This leads to the following use:

    custom_field = UserCustomDataField.objects.create(name='zodiac', site=my_site)  # probably created in the admin
    user = User.objects.create(username='foo')
    user_sign = UserCustomDataValue(custom_field=custom_field, user=user, value='Libra')  # the model field is named `value`
    user.custom_data.add(user_sign)  # actually, what does this even do?

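But this feels very clunky, particularly with the need to manually create the related data and associate it with the concrete model. Is there a better approach?

Options that have been pre-emptively discarded:

  • Custom SQL to modify tables on the fly. Partly because this won't scale and partly because it's too much of a hack.
  • Schema-less solutions like NoSQL. I have nothing against them, but they're still not a good fit. Ultimately this data is typed, and the possibility exists of using a third-party reporting application.
  • JSONField, as listed above, as it's not going to work well with queries.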

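As of today, there are four available approaches, two of them requiring a certain storage backend:

  1. Django-eav (the original package is no longer maintained but has some thriving forks: http://github.com/mvpdev/django-eav/network)

    This solution is based on the Entity Attribute Value data model; essentially, it uses several tables to store dynamic attributes of objects. The great parts about this solution are that it:

    • Uses several pure and simple Django models to represent dynamic fields, which makes it simple to understand and database-agnostic;
    • Allows you to effectively attach/detach dynamic attribute storage to a Django model with simple commands like: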

      eav.unregister(Encounter)
      eav.register(Patient)
      
    • Nicely integrates with Django admin;

    • At the same time being really powerful.

    Downsides:

    • Not very efficient. This is more of a criticism of the EAV pattern itself, which requires manually merging the data from a column format to a set of key-value pairs in the model.
    • Harder to maintain. Maintaining data integrity requires a multi-column unique key constraint, which may be inefficient on some databases.
    • You will need to select one of the forks, since the official package is no longer maintained and there is no clear leader.
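
To make the first two downsides concrete, here is a minimal sketch of the EAV layout using only the standard library's sqlite3 module rather than django-eav itself. The table and helper names (`attr_value`, `set_attr`, `get_attrs`) are made up for illustration, and every value is stored as text for simplicity. It shows the multi-column unique key and the step that merges rows back into key/value pairs:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE patient (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
    CREATE TABLE attribute (id INTEGER PRIMARY KEY, name TEXT NOT NULL UNIQUE);
    CREATE TABLE attr_value (
        entity_id    INTEGER NOT NULL REFERENCES patient(id),
        attribute_id INTEGER NOT NULL REFERENCES attribute(id),
        value_text   TEXT,
        -- The multi-column unique key: one value per (entity, attribute).
        UNIQUE (entity_id, attribute_id)
    );
""")


def set_attr(conn, entity_id, name, value):
    """Create the attribute if needed, then insert or overwrite its value."""
    conn.execute("INSERT OR IGNORE INTO attribute (name) VALUES (?)", (name,))
    (attr_id,) = conn.execute(
        "SELECT id FROM attribute WHERE name = ?", (name,)
    ).fetchone()
    conn.execute(
        "INSERT OR REPLACE INTO attr_value (entity_id, attribute_id, value_text) "
        "VALUES (?, ?, ?)",
        (entity_id, attr_id, str(value)),
    )


def get_attrs(conn, entity_id):
    """Merge the per-attribute rows of one entity back into a dict."""
    rows = conn.execute(
        "SELECT a.name, v.value_text FROM attr_value v "
        "JOIN attribute a ON a.id = v.attribute_id WHERE v.entity_id = ?",
        (entity_id,),
    )
    return dict(rows.fetchall())


conn.execute("INSERT INTO patient (id, name) VALUES (1, 'Bob')")
set_attr(conn, 1, "city", "New York")
set_attr(conn, 1, "age", 12)
set_attr(conn, 1, "age", 13)  # same (entity, attribute): replaces the old row
bob = get_attrs(conn, 1)
```

Every read of Bob's attributes costs a join plus a merge in Python, which is the inefficiency the list above refers to.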

    The usage is pretty straightforward:

    import eav
    from django.db.models import Q
    from eav.models import Attribute, EnumGroup, EnumValue
    from app.models import Patient, Encounter


    eav.register(Encounter)
    eav.register(Patient)
    Attribute.objects.create(name='age', datatype=Attribute.TYPE_INT)
    Attribute.objects.create(name='height', datatype=Attribute.TYPE_FLOAT)
    Attribute.objects.create(name='weight', datatype=Attribute.TYPE_FLOAT)
    Attribute.objects.create(name='city', datatype=Attribute.TYPE_TEXT)
    Attribute.objects.create(name='country', datatype=Attribute.TYPE_TEXT)


    # Enum choices for the 'fever' attribute
    yes = EnumValue.objects.create(value='yes')
    no = EnumValue.objects.create(value='no')
    unknown = EnumValue.objects.create(value='unknown')
    ynu = EnumGroup.objects.create(name='Yes / No / Unknown')
    ynu.enums.add(yes)
    ynu.enums.add(no)
    ynu.enums.add(unknown)


    Attribute.objects.create(name='fever', datatype=Attribute.TYPE_ENUM,
                             enum_group=ynu)


    # When you register a model within EAV,
    # you can access all of EAV attributes:
    Patient.objects.create(name='Bob', eav__age=12,
                           eav__fever=no, eav__city='New York',
                           eav__country='USA')


    # You can filter queries based on their EAV fields:
    query1 = Patient.objects.filter(Q(eav__city__contains='Y'))
    # A Q expression that can be passed to filter() the same way:
    query2 = Q(eav__city__contains='Y') | Q(eav__fever=no)
    
  2. Hstore, JSON or JSONB fields in PostgreSQL

    PostgreSQL supports several more complex data types. Most are supported via third-party packages, but in recent years Django has adopted them into django.contrib.postgres.fields.

    HStoreField:

    Django-hstore was originally a third-party package, but Django 1.8 added HStoreField as a built-in, along with several other PostgreSQL-supported field types.

    This approach is good in a sense that it lets you have the best of both worlds: dynamic fields and relational database. However, hstore is not ideal performance-wise, especially if you are going to end up storing thousands of items in one field. It also only supports strings for values.

    # app/models.py
    from django.contrib.postgres.fields import HStoreField
    from django.db import models


    class Something(models.Model):
        name = models.CharField(max_length=32)
        # default=dict lets rows be created without passing any data
        data = HStoreField(db_index=True, default=dict)
    

    In Django's shell you can use it like this:

    >>> instance = Something.objects.create(
    ...     name='something',
    ...     data={'a': '1', 'b': '2'}
    ... )
    >>> instance.data['a']
    '1'
    >>> empty = Something.objects.create(name='empty')
    >>> empty.data
    {}
    >>> empty.data['a'] = '1'
    >>> empty.save()
    >>> Something.objects.get(name='something').data['a']
    '1'
    

    You can issue indexed queries against hstore fields:

    # equivalence
    Something.objects.filter(data={'a': '1', 'b': '2'})
    
    
    # subset by key/value mapping
    Something.objects.filter(data__a='1')
    
    
    # subset by list of keys
    Something.objects.filter(data__has_keys=['a', 'b'])
    
    
    # subset by single key
    Something.objects.filter(data__has_key='a')
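
Since hstore only stores strings for values, typed data has to be encoded somehow before it goes in. One possible workaround, which is not part of the original answer, is to JSON-encode each value on the way in and decode it on the way out. A minimal sketch, with `encode_values` and `decode_values` as hypothetical helper names:

```python
import json


def encode_values(data):
    """Turn every value into a JSON string so a string-only store accepts it."""
    return {key: json.dumps(value) for key, value in data.items()}


def decode_values(stored):
    """Reverse of encode_values: parse each stored string back into its type."""
    return {key: json.loads(text) for key, text in stored.items()}


original = {"age": 12, "fever": False, "city": "New York"}
stored = encode_values(original)    # what would be assigned to the hstore field
restored = decode_values(stored)    # what the application works with
```

The catch is that lookups then have to compare against the encoded form (for example `'12'` rather than `12`), which is one reason a JSON field is usually the more comfortable choice for typed data.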
    

    JSONField:

    JSON/JSONB fields support any JSON-encodable data type, not just key/value pairs, and also tend to be faster and (for JSONB) more compact than Hstore. Several packages implement JSON/JSONB fields, including django-pgfields (https://django-pgfields.readthedocs.org/en/latest/fields.html), but as of Django 1.9, JSONField (https://docs.djangoproject.com/en/1.9/ref/contrib/postgres/fields/#jsonfield) is built in and uses JSONB for storage. JSONField is similar to HStoreField, and may perform better with large dictionaries. It also supports types other than strings, such as integers, booleans and nested dictionaries.

    # app/models.py
    from django.contrib.postgres.fields import JSONField
    from django.db import models


    class Something(models.Model):
        name = models.CharField(max_length=32)
        data = JSONField(db_index=True)
    

    Creating in the shell:

    >>> instance = Something.objects.create(
    ...     name='something',
    ...     data={'a': 1, 'b': 2, 'nested': {'c': 3}}
    ... )
    

    Indexed queries are nearly identical to HStoreField, except nesting is possible. Complex indexes may require manual creation (or a scripted migration).

    >>> Something.objects.filter(data__a=1)
    >>> Something.objects.filter(data__nested__c=3)
    >>> Something.objects.filter(data__has_key='a')
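
To illustrate what the nested lookups above mean, here is a rough pure-Python model of how a path such as `nested__c` walks into the stored document. This is only a sketch of the semantics, not Django's implementation (the real filtering happens in PostgreSQL), and it ignores array indexes. The names `lookup` and `matches` are made up for the example:

```python
def lookup(data, path):
    """Walk a '__'-separated path through nested dicts; return (found, value)."""
    current = data
    for key in path.split("__"):
        if not isinstance(current, dict) or key not in current:
            return False, None
        current = current[key]
    return True, current


def matches(data, path, expected):
    """True if the path exists in data and its value equals expected."""
    found, value = lookup(data, path)
    return found and value == expected


rows = [
    {"a": 1, "b": 2, "nested": {"c": 3}},
    {"a": 5},
    {"nested": {"c": 4}},
]

a_is_1 = [row for row in rows if matches(row, "a", 1)]                  # data__a=1
nested_c_is_3 = [row for row in rows if matches(row, "nested__c", 3)]   # data__nested__c=3
has_key_a = [row for row in rows if "a" in row]                         # data__has_key='a'
```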
    
  3. Django MongoDB

    Or other NoSQL Django adaptations -- with them you can have fully dynamic models.

    NoSQL Django libraries are great, but keep in mind that they are not 100% Django-compatible. For example, to migrate to Django-nonrel from standard Django you will need to replace ManyToMany with ListField, among other things.

    Check out this Django MongoDB example:

    from django.db import models
    from djangotoolbox.fields import DictField


    class Image(models.Model):
        exif = DictField()
        ...


    >>> image = Image.objects.create(exif=get_exif_data(...))
    >>> image.exif
    {u'camera_model' : 'Spamcams 4242', 'exposure_time' : 0.3, ...}
    

    You can even create embedded lists of any Django models:

    from django.db import models
    from djangotoolbox.fields import EmbeddedModelField, ListField


    class Container(models.Model):
        stuff = ListField(EmbeddedModelField())


    class FooModel(models.Model):
        foo = models.IntegerField()


    class BarModel(models.Model):
        bar = models.CharField()
        ...


    >>> Container.objects.create(
    ...     stuff=[FooModel(foo=42), BarModel(bar='spam')]
    ... )
    
  4. Django-mutant: Dynamic models based on syncdb and South-hooks

    Django-mutant implements fully dynamic Foreign Key and m2m fields. It is inspired by the incredible but somewhat hackish solutions by Will Hardy and Michael Hall.

    All of these are based on Django South hooks, which, according to Will Hardy's talk at DjangoCon 2011 (watch it!) are nevertheless robust and tested in production (relevant source code).

    First to implement this was Michael Hall.

    Yes, this is magic, with these approaches you can achieve fully dynamic Django apps, models and fields with any relational database backend. But at what cost? Will stability of application suffer upon heavy use? These are the questions to be considered. You need to be sure to maintain a proper lock in order to allow simultaneous database altering requests.

    If you are using Michael Hall's lib, your code will look like this:

    from dynamo import models


    test_app, created = models.DynamicApp.objects.get_or_create(
        name='dynamo'
    )
    test, created = models.DynamicModel.objects.get_or_create(
        name='Test',
        verbose_name='Test Model',
        app=test_app
    )
    foo, created = models.DynamicModelField.objects.get_or_create(
        name='foo',
        verbose_name='Foo Field',
        model=test,
        field_type='dynamiccharfield',
        null=True,
        blank=True,
        unique=False,
        help_text='Test field for Foo',
    )
    bar, created = models.DynamicModelField.objects.get_or_create(
        name='bar',
        verbose_name='Bar Field',
        model=test,
        field_type='dynamicintegerfield',
        null=True,
        blank=True,
        unique=False,
        help_text='Test field for Bar',
    )
    

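Further research reveals that this is a somewhat special case of the Entity Attribute Value design pattern, which has been implemented for Django by a couple of packages.

First, there's the original eav-django project, which is on PyPi.

Second, there's a more recent fork of the first project, django-eav, which is primarily a refactor to allow use of EAV with django's own models or models in third-party apps.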

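I've been working on pushing the django-dynamo idea further. The project is still undocumented but you can read the code at https://github.com/charettes/django-mutant.

Actually FK and M2M fields (see contrib.related) also work, and it's even possible to define wrappers for your own custom fields.

There's also support for model options such as unique_together and ordering, plus Model bases, so you can subclass model proxies, abstract models or mixins.

I'm actually working on a non-in-memory lock mechanism to make sure model definitions can be shared across multiple running django instances while preventing them from using an obsolete definition.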
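
To give an idea of what "preventing them from using an obsolete definition" could look like, here is a toy sketch. It is not django-mutant's actual mechanism, and `SharedStore` and `Instance` are hypothetical names. Each definition carries a version number in storage that every instance can reach, and an instance reloads its cached copy whenever the version has moved on:

```python
class SharedStore:
    """Stand-in for storage every instance can reach (a database row, a cache...)."""

    def __init__(self):
        self.definitions = {}  # model name -> (version, field names)

    def save(self, name, fields):
        """Store a new definition and bump its version number."""
        version = self.definitions.get(name, (0, ()))[0] + 1
        self.definitions[name] = (version, tuple(fields))
        return version

    def version_of(self, name):
        return self.definitions[name][0]

    def load(self, name):
        return self.definitions[name]


class Instance:
    """One running process holding a local cache of model definitions."""

    def __init__(self, store):
        self.store = store
        self.cache = {}

    def get_fields(self, name):
        cached = self.cache.get(name)
        # Reload when nothing is cached or the shared version has changed.
        if cached is None or cached[0] != self.store.version_of(name):
            cached = self.store.load(name)
            self.cache[name] = cached
        return cached[1]


store = SharedStore()
store.save("Test", ["foo"])
a, b = Instance(store), Instance(store)
first = a.get_fields("Test")          # cached by instance a
store.save("Test", ["foo", "bar"])    # definition changed elsewhere
second = a.get_fields("Test")         # stale cache detected and reloaded
```

A real version would also need the check and the schema change to be atomic across processes, which is where the lock comes in.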

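The project is still very alpha, but it's a cornerstone technology for one of my projects so I'll have to take it to production-ready. The big plan is supporting django-nonrel as well, so we can leverage the mongodb driver.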