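When is it better to store flags as a bitmask rather than using an associative table?

I'm working on an application where users have different permissions to use different features (e.g. Read, Create, Download, Print, Approve, etc.). The list of permissions is not expected to change often. I have a couple of options for how to store these permissions in the database.

In what cases would Option 2 be better?

Option 1

Use an associative table.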

User
----
UserId (PK)
Name
Department

Permission
----
PermissionId (PK)
Name

User_Permission
----
UserId (FK)
PermissionId (FK)

Option 2

Store a bitmask for each user.

User
----
UserId (PK)
Name
Department
Permissions

[Flags]
enum Permissions {
    Read = 1,
    Create = 2,
    Download = 4,
    Print = 8,
    Approve = 16
}

Personally, I would use an associative table.

A bitmask field is very difficult to query and join on.

You can always map this to your C# flags enum and, if performance becomes an issue, refactor the database.

Readability over premature optimization ;)
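
As a rough illustration of that mapping, here is a minimal sketch in Python rather than C# (`enum.IntFlag` plays the role of a `[Flags]` enum), run against an in-memory SQLite database. The lower-case table and column names and the `load_permissions` helper are assumptions made for the example; only the enum values come from the question.

```python
import sqlite3
from enum import IntFlag


class Permissions(IntFlag):
    # Same values as the [Flags] enum in the question.
    READ = 1
    CREATE = 2
    DOWNLOAD = 4
    PRINT = 8
    APPROVE = 16


def load_permissions(conn, user_id):
    """Combine a user's rows from the associative table into one flags value."""
    rows = conn.execute(
        "SELECT p.name FROM user_permission AS up "
        "JOIN permission AS p ON p.permission_id = up.permission_id "
        "WHERE up.user_id = ?",
        (user_id,),
    )
    result = Permissions(0)
    for (name,) in rows:
        # Look the enum member up by name; assumes names match the enum.
        result |= Permissions[name.upper()]
    return result


conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE permission (
        permission_id INTEGER PRIMARY KEY,
        name TEXT NOT NULL UNIQUE
    );
    CREATE TABLE user_permission (
        user_id INTEGER NOT NULL,
        permission_id INTEGER NOT NULL REFERENCES permission(permission_id),
        PRIMARY KEY (user_id, permission_id)
    );
    INSERT INTO permission VALUES
        (1, 'Read'), (2, 'Create'), (3, 'Download'), (4, 'Print'), (5, 'Approve');
    -- Hypothetical user 42 may read and print.
    INSERT INTO user_permission VALUES (42, 1), (42, 4);
""")

perms = load_permissions(conn, 42)
```

The database stays normalized, while application code still gets a compact flags value for checks such as `Permissions.PRINT in perms`.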

Your queries will run faster using a flags enumeration (bitmask), because you won't need to include a join to the associated table in order to make sense of the value.

Store the permissions normalized (i.e. not in a bitmask). While it's obviously not a requirement for your scenario (especially if the permissions won't often change), it will make querying much easier and more obvious.

There is no definitive answer, so do what works for you. But here is my take:

Use option 1 if

  • You expect the number of permissions to grow
  • You might need to check permissions inside database stored procedures
  • You do not expect millions of users, so the associative table will not grow massively

Use option 2 if

  • Permissions are going to be limited to a handful
  • You expect millions of users

The only time I can think of when I would use a bitmask field to store permissions is when you are severely constrained in how much physical memory you have, like maybe on an old mobile device. In truth, the amount of memory you save isn't worth it. Even at millions of users, hard drive space is cheap, and you can expand permissions etc. a lot more easily with the non-bitmask approach (think about reporting on who has what permissions, etc.).

One of the biggest headaches I've run into with it is assigning users permissions directly in the database. I know you should try to use the application to administer itself and not muck with application data in general, but sometimes it's just necessary. Unless the bitmask is actually a character field, so that you can easily see what permissions someone has instead of an integer, try explaining to an analyst how to give write access to someone by updating the field... and pray your arithmetic is correct.
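
To make the "pray your arithmetic is correct" point concrete, here is a minimal sketch using Python's built-in `sqlite3` module. The table names are assumptions, and the associative table is simplified to store the permission name directly instead of a `PermissionId`. It grants Print and revokes Create in both designs.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Option 2: one integer bitmask per user (Read = 1, Create = 2, Print = 8).
    CREATE TABLE users_mask (
        user_id INTEGER PRIMARY KEY,
        permissions INTEGER NOT NULL
    );
    INSERT INTO users_mask VALUES (1, 3);  -- Read + Create

    -- Option 1 (simplified): one row per granted permission.
    CREATE TABLE user_permission (
        user_id INTEGER NOT NULL,
        permission TEXT NOT NULL,
        PRIMARY KEY (user_id, permission)
    );
    INSERT INTO user_permission VALUES (1, 'Read'), (1, 'Create');
""")

# Bitmask: whoever runs this must know that Print is 8 and must use OR
# rather than +, so that granting it twice leaves the value unchanged.
conn.execute("UPDATE users_mask SET permissions = permissions | 8 WHERE user_id = 1")
conn.execute("UPDATE users_mask SET permissions = permissions | 8 WHERE user_id = 1")
# Revoking Create (2) means clearing a bit with AND NOT.
conn.execute("UPDATE users_mask SET permissions = permissions & ~2 WHERE user_id = 1")
mask = conn.execute(
    "SELECT permissions FROM users_mask WHERE user_id = 1"
).fetchone()[0]

# Associative table: grant and revoke are an ordinary INSERT and DELETE.
conn.execute("INSERT INTO user_permission VALUES (1, 'Print')")
conn.execute("DELETE FROM user_permission WHERE user_id = 1 AND permission = 'Create'")
granted = [
    row[0]
    for row in conn.execute(
        "SELECT permission FROM user_permission WHERE user_id = 1 ORDER BY permission"
    )
]
```

After both sets of statements the user holds Read and Print in either design, but only the second one can be read back without decoding an integer.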

It'll be useful when they won't change in their structure and will always be used together. That way, you have few round trips to the server. They are also good performance-wise because you can set all the rights in a single variable assignment.

I personally don't like them... They're still used in some performance-intensive applications. I remember implementing a chess AI using these because you could evaluate a board in a single comparison. It's a pain to work with.

I would always store it normalized unless the database is merely holding the record for you, and you will never do anything with this besides retrieving and saving. A scenario for this is if upon login, your user's permission string is fetched, and in server code it is processed and cached. In that case it really doesn't matter too much that it's denormalized.

If you're storing it in a string and trying to do work on it at the DB level, you'd have to do some gymnastics to get the permissions for page X out, which can be painful.

Splendid question!

Firstly, let's make some assumptions about "better".

I'm assuming you don't much care about disk space. A bitmask is efficient from a space point of view, but I'm not sure that matters much if you're using SQL Server.

I'm assuming you do care about speed. A bitmask can be very fast when using calculations - but you won't be able to use an index when querying the bitmask. This shouldn't matter all that much, but if you want to know which users have create access, your query would be something like

SELECT * FROM [User] WHERE (Permissions & 2) <> 0  -- 2 = Create

(haven't got access to SQL Server today, on the road). That query would not be able to use an index because of the mathematical operation - so if you have a huge number of users, this would be quite painful.
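
The index claim can be checked on a small scale. The sketch below uses SQLite through Python's `sqlite3` module, not SQL Server, so it only illustrates the general behaviour; the table names, index names and the `plan` helper are assumptions, and the exact plan wording differs between SQLite versions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Bitmask design, with an index on the mask column.
    CREATE TABLE users (
        user_id INTEGER PRIMARY KEY,
        name TEXT,
        permissions INTEGER NOT NULL
    );
    CREATE INDEX ix_users_permissions ON users (permissions);

    -- Associative-table design, indexed by permission.
    CREATE TABLE user_permission (
        user_id INTEGER NOT NULL,
        permission_id INTEGER NOT NULL,
        PRIMARY KEY (user_id, permission_id)
    );
    CREATE INDEX ix_up_permission ON user_permission (permission_id, user_id);
""")

# Assumed value: the Create bit in the mask, and also Create's PermissionId.
CREATE = 2


def plan(sql, params=()):
    """Return the 'detail' column of SQLite's EXPLAIN QUERY PLAN output."""
    return [row[3] for row in conn.execute("EXPLAIN QUERY PLAN " + sql, params)]


# The bitwise test has to be evaluated for every row: the plan is a SCAN.
mask_plan = plan(
    "SELECT user_id FROM users WHERE (permissions & ?) <> 0", (CREATE,)
)
# The equality test can seek into the index: the plan is a SEARCH.
table_plan = plan(
    "SELECT user_id FROM user_permission WHERE permission_id = ?", (CREATE,)
)
```

Printing `mask_plan` and `table_plan` shows a scan for the bitmask query and an index search for the associative table, which matches the point made above.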

I'm assuming you care about maintainability. From a maintainability point of view, the bitmask does not express the underlying problem domain as well as storing explicit permissions does. You'd almost certainly have to synchronize the value of the bitmask flags across multiple components, including the database. Not impossible, but a pain in the backside.

So, unless there's another way of assessing "better", I'd say the bitmask route is not as good as storing the permissions in a normalized database structure. I don't agree that it would be "slower because you have to do a join": unless you have a totally dysfunctional database, you won't be able to measure this (whereas querying without the benefit of an active index can become noticeably slower with even a few thousand records).

I advise against using a bitmask for the following reasons:

  • Index cannot be used efficiently
  • Querying is harder
  • Readability / Maintenance is severely impacted
  • The average developer out there doesn't know what a bitmask is
  • Flexibility is reduced (there is an upper limit to the number of bits in a number)

Depending on your query patterns, planned feature set and data distribution, I would go with your option 1, or even something as simple as:

user_permissions(
user_id
,read
,create
,download
,print
,approve
,primary key(user_id)
);

Adding a column is a schema modification, but my guess is that adding a privilege "Purge" will require some code to go along with it, so the privileges may not have to be as dynamic as you think.
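
A runnable version of that single-table layout might look like the sketch below, again using SQLite through Python's `sqlite3` module. The `can_` prefix on the column names and the integer 0/1 encoding are assumptions (a bare `create` column would collide with the SQL keyword), and the two sample users are made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE user_permissions (
        user_id      INTEGER PRIMARY KEY,
        can_read     INTEGER NOT NULL DEFAULT 0,
        can_create   INTEGER NOT NULL DEFAULT 0,
        can_download INTEGER NOT NULL DEFAULT 0,
        can_print    INTEGER NOT NULL DEFAULT 0,
        can_approve  INTEGER NOT NULL DEFAULT 0
    );
    INSERT INTO user_permissions (user_id, can_read, can_print)
        VALUES (1, 1, 1);
    INSERT INTO user_permissions (user_id, can_read, can_create, can_approve)
        VALUES (2, 1, 1, 1);
""")

# "Who may approve?" is an ordinary predicate on a named column.
approvers = [
    row[0]
    for row in conn.execute(
        "SELECT user_id FROM user_permissions WHERE can_approve = 1 ORDER BY user_id"
    )
]

# A later privilege such as Purge is a schema change; existing users
# start without it because of the default.
conn.execute(
    "ALTER TABLE user_permissions ADD COLUMN can_purge INTEGER NOT NULL DEFAULT 0"
)
purgers = conn.execute(
    "SELECT COUNT(*) FROM user_permissions WHERE can_purge = 1"
).fetchone()[0]
```

Each permission stays individually queryable and indexable, at the cost of an `ALTER TABLE` whenever a privilege is added.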

If you have a heavily skewed distribution of data, such as 90% of the user base not having a single permission, the following model also works fine, but it falls apart when doing larger scans (one 5-way join vs. a single full table scan).

user_permission_read(
user_id
,primary key(user_id)
,foreign key(user_id) references user(user_id)
)


user_permission_write(
user_id
,primary key(user_id)
,foreign key(user_id) references user(user_id)
)


user_permission_etcetera(
user_id
,primary key(user_id)
,foreign key(user_id) references user(user_id)
)