主键/外键变数命名原则

在我们的开发小组中,关于主键和外键的变数命名原则,我们有一个激烈的争论。基本上我们这群人有两种思想:

1:

Primary Table (Employee)
Primary Key is called ID


Foreign table (Event)
Foreign key is called EmployeeID

或者

2:

Primary Table (Employee)
Primary Key is called EmployeeID


Foreign table (Event)
Foreign key is called EmployeeID

我不喜欢在任何列中重复表的名称(所以我更喜欢上面的选项1)。从概念上讲,它与其他语言中的许多推荐做法是一致的,即在属性名中不使用对象的名称。我认为命名外键 EmployeeID(或者 Employee_ID可能更好)告诉读者它是 Employee表的 ID列。

另一些人更喜欢选项2,其中将主键命名为前缀为表名,这样列名在整个数据库中都是相同的。我明白这一点,但是您现在无法在视觉上区分主键和外键。

另外,我认为在列名中使用表名是多余的,因为如果您将表看作一个实体,将列看作该实体的属性或属性,那么您将它看作 Employee的 ID 属性,而不是员工的 EmployeeID属性。我不会去问我的同事他的 PersonAge或者 PersonGender是什么。我问他的年龄是多少。

所以就像我说的,这是一场激烈的辩论,我们一直在讨论这个问题。我有兴趣得到一些新的观点。

70378 次浏览

It doesn't really matter. I've never run into a system where there is a real difference between choice 1 and choice 2.

Jeff Atwood had a great article a while back on this topic. Basically people debate and argue the most furiously those topics which they cannot be proven wrong on. Or from a different angle, those topics which can only be won through filibuster style endurance based last-man-standing arguments.

Pick one and tell them to focus on issues that actually impact your code.

EDIT: If you want to have fun, have them specify at length why their method is superior for recursive table references.

The convention we use where I work is pretty close to A, with the exception that we name tables in the plural form (ie, "employees") and use underscores between the table and column name. The benefit of it is that to refer to a column, it's either "employees _ id" or "employees.id", depending on how you want to access it. If you need to specify what table the column is coming from, "employees.employees _ id" is definitely redundant.

I use convention #2. I'm working with a legacy data model now where I don't know what stands for in a given table. Where's the harm in being verbose?

Neither convention works in all cases, so why have one at all? Use Common sense...

e.g., for self-referencing table, when there are more than one FK column that self-references the same table's PK, you HAVE to violate both "standards", since the two FK columns can't be named the same... e.g., EmployeeTable with EmployeeId PK, SupervisorId FK, MentorId Fk, PartnerId FK, ...

I think it depends on your how you application is put together. If you use ORM or design your tables to represent objects then option 1 may be for you.

I like to code the database as its own layer. I control everything and the app just calls stored procedures. It is nice to have result sets with complete column names, especially when there are many tables joined and many columns returned. With this stype of application, I like option 2. I really like to see column names match on joins. I've worked on old systems where they didn't match and it was a nightmare,

If the two columns have the same name in both tables (convention #2), you can use the USING syntax in SQL to save some typing and some boilerplate noise:

SELECT name, address, amount
FROM employees JOIN payroll USING (employee_id)

Another argument in favor of convention #2 is that it's the way the relational model was designed.

The significance of each column is partially conveyed by labeling it with the name of the corresponding domain.

I agree that there is little to choose between them. To me a much more significant thing about either standard is the "standard" part.

If people start 'doing their own thing' they should be strung up by their nethers. IMHO :)

How about naming the foreign key

role_id

where role is the role the referenced entity has relativ to the table at hand. This solves the issue of recursive reference and multiple fks to the same table.

In many cases will be identical to the referenced table name. In this cases it becomes identically to one of your proposals.

In any case havin long arguments is a bad idea

"Where in "employee INNER JOIN order ON order.employee_id = employee.id" is there a need for additional qualification?".

There is no need for additional qualification because the qualification I talked of is already there.

"the reason that a business user refers to Order ID or Employee ID is to provide context, but at a dabase level you already have context because you are refereing to the table".

Pray, tell me, if the column is named 'ID', then how is that "refereing [sic] to the table" done exactly, unless by qualifying this reference to the ID column exactly in the way I talked of ?

If you are looking at application code, not just database queries, some things seem clear to me:

  1. Table definitions usually directly map to a class that describes one object, so they should be singular. To describe a collection of an object, I usually append "Array" or "List" or "Collection" to the singular name, as it more clearly than use of plurals indicates not only that it is a collection, but what kind of a collection it is. In that view, I see a table name as not the name of the collection, but the name of the type of object of which it is a collection. A DBA who doesn't write application code might miss this point.

  2. The data I deal with often uses "ID" for non-key identification purposes. To eliminate confusion between key "ID"s and non-key "ID"s, for the primary key name, we use "Key" (that's what it is, isn't it?) prefixed with the table name or an abbreviation of the table name. This prefixing (and I reserve this only for the primary key) makes the key name unique, which is especially important because we use variable names that are the same as the database column names, and most classes have a parent, identified by the name of the parent key. This also is needed to make sure that it is not a reserved keyword, which "Key" alone is. To facilitate keeping key variable names consistent, and to provide for programs that do natural joins, foreign keys have the same name as is used in the table in which they are the primary key. I have more than once encountered programs which work much better this way using natural joins. On this last point, I admit a problem with self-referencing tables, which I have used. In this case, I would make an exception to the foreign key naming rule. For example, I would use ManagerKey as a foreign key in the Employee table to point to another record in that table.

I like convention #2 - in researching this topic, and finding this question before posting my own, I ran into the issue where:

I am selecting * from a table with a large number of columns and joining it to a second table that similarly has a large number of columns. Both tables have an "id" column as the primary key, and that means I have to specifically pick out every column (as far as I know) in order to make those two values unique in the result, i.e.:

SELECT table1.id AS parent_id, table2.id AS child_id

Though using convention #2 means I will still have some columns in the result with the same name, I can now specify which id I need (parent or child) and, as Steven Huwig suggested, the USING statement simplifies things further.

I've always used userId as a PK on one table and userId on another table as a FK. 'm seriously thinking about using userIdPK and userIdFK as names to identify one from the other. It will help me to identify PK and FK quickly when looking at the tables and it seems like it will clear up code when using PHP/SQL to access data making it easier to understand. Especially when someone else looks at my code.

Have you considered the following?

Primary Table (Employee)
Primary Key is PK_Employee


Foreign table (Event)
Foreign key is called FK_Employee