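I remember hearing Joel Spolsky mention in podcast 014 that he had barely ever used a foreign key (if I remember correctly). To me, however, they seem vital for avoiding duplication and the data integrity problems that follow from it throughout a database.
Do people have solid reasons why (to keep the discussion in line with Stack Overflow principles)?
Edit: "I've yet to have a reason to create a foreign key, so this might be my first reason to actually set one up."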
Current answer
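"Before adding a record, check that a corresponding record exists in another table" is business logic.
Here are some reasons you don't want it in the database: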
- If the business rules change, you have to change the database. In many cases the database will need to rebuild the index, and this is slow on large tables. (Rule changes include: allowing guests to post messages, or allowing users to delete their account despite having posted comments, etc.) Changing the database is not as easy as deploying a software fix by pushing the changes to the production repository. We want to avoid changing the database structure as much as possible. The more business logic there is in the database, the greater the chance that you will need to change the database (and trigger re-indexing).
- TDD. In unit tests you can substitute mocks for the database and test the functionality. If you have any business logic in your database, you are not doing complete tests: you would need to either test against the database or replicate the business logic in code for testing purposes, which duplicates the logic and increases the likelihood that the two copies do not behave the same way.
- Reusing your logic with different data sources. If there is no logic in the database, my application can create objects from database records, from a web service, from a JSON file, or from any other source. I just need to swap out the data mapper implementation and can use all my business logic with any source. If there is logic in the database, this isn't possible, and you have to implement the logic at the data mapper layer or in the business logic. Either way, you need those checks in your code. If there's no logic in the database, I can deploy the application in different locations using different database or flat-file implementations.
Other answers
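I have to second most of the comments here: foreign keys are necessary to ensure that your data has integrity. The different options for ON DELETE and ON UPDATE let you get around some of the "downfalls" that people mention here regarding their use.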
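As an illustration of those options, here is a minimal sketch using Python's built-in sqlite3 module. The `authors` and `posts` tables are hypothetical and not from the thread. It assumes SQLite, where enforcement has to be switched on per connection; other databases use similar clauses but may differ in detail.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite only enforces foreign keys when this pragma is on.
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
    CREATE TABLE authors (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
    CREATE TABLE posts (
        id INTEGER PRIMARY KEY,
        author_id INTEGER REFERENCES authors(id)
            ON DELETE SET NULL   -- keep the post when its author is deleted
            ON UPDATE CASCADE,   -- follow the author if its key changes
        body TEXT NOT NULL
    );
""")
conn.execute("INSERT INTO authors (id, name) VALUES (1, 'alice')")
conn.execute("INSERT INTO posts (id, author_id, body) VALUES (10, 1, 'hello')")

# Changing the parent key propagates to the child row.
conn.execute("UPDATE authors SET id = 2 WHERE id = 1")
after_update = conn.execute(
    "SELECT author_id FROM posts WHERE id = 10").fetchone()[0]

# Deleting the parent keeps the post but clears the reference.
conn.execute("DELETE FROM authors WHERE id = 2")
after_delete = conn.execute(
    "SELECT author_id FROM posts WHERE id = 10").fetchone()[0]
```

With ON DELETE SET NULL, a rule such as "users may delete their account even if they have posted" no longer conflicts with the constraint.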
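I find that in 99% of my projects I use FKs to enforce the integrity of the data. There are rare occasions, however, where a client MUST keep their old data, regardless of how bad it is... but then I spend a lot of time writing code that fetches only the valid data anyway, so it becomes pointless.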
Reasons to use foreign keys:
- you won't get orphaned rows
- you can get nice "on delete cascade" behavior, automatically cleaning up tables
- knowing about the relationships between tables in the database helps the optimizer plan your queries for the most efficient execution, since it is able to get better estimates of join cardinality
- FKs give a pretty big hint about which statistics are most important to collect on the database, which in turn leads to better performance
- they enable all kinds of auto-generated support: ORMs can generate themselves, visualization tools can create nice schema layouts for you, etc.
- someone new to the project will get into the flow of things faster, since otherwise implicit relationships are explicitly documented
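The first two points can be sketched with Python's sqlite3 module. This is only an illustration under assumed table names (`customers`, `orders`), not the answerer's schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # enforcement is opt-in in SQLite
conn.executescript("""
    CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
    CREATE TABLE orders (
        id INTEGER PRIMARY KEY,
        customer_id INTEGER NOT NULL
            REFERENCES customers(id) ON DELETE CASCADE
    );
""")
conn.execute("INSERT INTO customers (id, name) VALUES (1, 'alice')")
conn.execute("INSERT INTO orders (id, customer_id) VALUES (100, 1)")

# An order pointing at a customer that does not exist is rejected.
try:
    conn.execute("INSERT INTO orders (id, customer_id) VALUES (101, 999)")
    orphan_rejected = False
except sqlite3.IntegrityError:
    orphan_rejected = True

# Deleting the customer removes its orders automatically.
conn.execute("DELETE FROM customers WHERE id = 1")
remaining_orders = conn.execute("SELECT COUNT(*) FROM orders").fetchone()[0]
```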
Reasons not to use foreign keys:
- you are making the DB do extra work on every CRUD operation because it has to check FK consistency. This can be a big cost if you have a lot of churn
- by enforcing relationships, FKs specify an order in which you have to add/delete things, which can lead to the DB refusing to do what you want. (Granted, in such cases what you are trying to do is create an orphaned row, and that's not usually a good thing.) This is especially painful when you are doing large batch updates and you load one table before another, with the second table creating a consistent state (but should you be doing that sort of thing if there is a possibility that the second load fails and your database is now inconsistent?)
- sometimes you know beforehand that your data is going to be dirty, you accept that, and you want the DB to accept it
- you are just being lazy :-)
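For the load-order point, some databases let a constraint be checked at commit time rather than per statement. The sketch below assumes SQLite's `DEFERRABLE INITIALLY DEFERRED` clause and hypothetical `customers` and `orders` tables; whether your database offers an equivalent is something to verify.

```python
import sqlite3

# isolation_level=None puts the connection in autocommit mode so that
# the transactions below can be managed explicitly.
conn = sqlite3.connect(":memory:", isolation_level=None)
conn.execute("PRAGMA foreign_keys = ON")
conn.execute("CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT NOT NULL)")
conn.execute("""
    CREATE TABLE orders (
        id INTEGER PRIMARY KEY,
        customer_id INTEGER NOT NULL
            REFERENCES customers(id) DEFERRABLE INITIALLY DEFERRED
    )
""")

# Batch 1: child rows are loaded before their parent, which is fine
# because the constraint is only checked at COMMIT.
conn.execute("BEGIN")
conn.execute("INSERT INTO orders (id, customer_id) VALUES (100, 1)")
conn.execute("INSERT INTO customers (id, name) VALUES (1, 'alice')")
conn.execute("COMMIT")

# Batch 2: the parent never arrives, so the commit is refused and the
# whole batch is rolled back instead of leaving orphans behind.
conn.execute("BEGIN")
conn.execute("INSERT INTO orders (id, customer_id) VALUES (101, 999)")
try:
    conn.execute("COMMIT")
    second_batch_committed = True
except sqlite3.IntegrityError:
    conn.execute("ROLLBACK")
    second_batch_committed = False

final_order_count = conn.execute("SELECT COUNT(*) FROM orders").fetchone()[0]
```

This keeps the freedom to load tables in any order while still answering the parenthetical question above: a failed second load cannot leave the database inconsistent.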
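I think (I'm not certain!) that most established databases provide a way to specify a foreign key that is not enforced and is simply a bit of metadata. Since non-enforcement wipes out every reason not to use FKs, you should probably go that route if any of the reasons in the second section apply.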
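SQLite is one concrete case of this: a declared foreign key is recorded in the schema but only enforced when `PRAGMA foreign_keys` is on. A small sketch with hypothetical tables, showing that the relationship stays visible as metadata and that violations can still be reported on demand:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = OFF")  # declared, but not enforced
conn.executescript("""
    CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
    CREATE TABLE orders (
        id INTEGER PRIMARY KEY,
        customer_id INTEGER REFERENCES customers(id)
    );
""")

# Dirty data is accepted because enforcement is off.
conn.execute("INSERT INTO orders (id, customer_id) VALUES (100, 999)")

# The relationship is still available to tools as metadata.
# Row layout: (id, seq, table, from, to, on_update, on_delete, match)
declared = conn.execute("PRAGMA foreign_key_list(orders)").fetchall()
referenced_table = declared[0][2]

# Violations can be listed when you choose to look for them.
violations = conn.execute("PRAGMA foreign_key_check").fetchall()
```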
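Update: I always use foreign keys now. My answer to the objection "they complicate testing" is: "Write your unit tests so that they don't need the database at all. Any test that does use the database should use it properly, and that includes foreign keys. If the setup is painful, find a less painful way to do the setup."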
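A minimal sketch of a unit test that needs no database, using the account and transaction scenario this answer describes. `AccountService` and `InMemoryTransactionLog` are hypothetical names invented for illustration, and the design (passing the storage object in) is one assumption among several possible ones.

```python
class InMemoryTransactionLog:
    """Stand-in for the database-backed transaction store."""

    def __init__(self):
        self.records = []

    def save(self, account_id, amount):
        self.records.append((account_id, amount))


class AccountService:
    """Business logic that talks to storage only through its collaborators."""

    def __init__(self, balances, transaction_log):
        self.balances = balances
        self.transaction_log = transaction_log

    def deposit(self, account_id, amount):
        if amount <= 0:
            raise ValueError("amount must be positive")
        self.balances[account_id] = self.balances.get(account_id, 0) + amount
        self.transaction_log.save(account_id, amount)


def test_deposit_saves_transaction_record():
    log = InMemoryTransactionLog()
    service = AccountService(balances={"acct-1": 100}, transaction_log=log)

    service.deposit("acct-1", 25)

    assert service.balances["acct-1"] == 125
    assert log.records == [("acct-1", 25)]


test_deposit_saves_transaction_record()
```

No contracts, clients, cities, or states are needed here, because no table is involved at all.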
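Foreign keys complicate automated testing
Suppose you're using foreign keys. You're writing an automated test that says "when I update a financial account, it should save a record of the transaction." In this test, you're only concerned with two tables: accounts and transactions.
However, accounts has a foreign key to contracts, contracts has an FK to clients, clients has an FK to cities, and cities has an FK to states.
Now the database will not let you run your test without first setting up data in four tables that have nothing to do with the test.
There are at least two possible perspectives on this: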
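- "That's a good thing: your tests should be realistic, and those data constraints will exist in production."
- "That's a bad thing: you should be able to unit test pieces of the system without involving other pieces. You can add integration tests for the system as a whole."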
It may also be possible to temporarily turn off foreign key checks while running tests. MySQL, at least, supports this.
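In MySQL this is done through the `foreign_key_checks` session variable (`SET FOREIGN_KEY_CHECKS = 0`). To keep the example self-contained, the sketch below uses SQLite's analogous pragma instead; the helper `foreign_keys_disabled` and the trimmed-down schema are assumptions made for illustration.

```python
import sqlite3
from contextlib import contextmanager


@contextmanager
def foreign_keys_disabled(conn):
    """Switch FK enforcement off for the duration of a test, then back on.

    SQLite ignores this pragma inside an open transaction, so the
    connection below is opened in autocommit mode.
    """
    conn.execute("PRAGMA foreign_keys = OFF")
    try:
        yield conn
    finally:
        conn.execute("PRAGMA foreign_keys = ON")


conn = sqlite3.connect(":memory:", isolation_level=None)
conn.execute("PRAGMA foreign_keys = ON")
conn.execute("CREATE TABLE contracts (id INTEGER PRIMARY KEY)")
conn.execute("""
    CREATE TABLE accounts (
        id INTEGER PRIMARY KEY,
        contract_id INTEGER NOT NULL REFERENCES contracts(id),
        balance INTEGER NOT NULL
    )
""")
conn.execute("""
    CREATE TABLE transactions (
        id INTEGER PRIMARY KEY,
        account_id INTEGER NOT NULL REFERENCES accounts(id),
        amount INTEGER NOT NULL
    )
""")

with foreign_keys_disabled(conn):
    # Fixture: an account whose contract row was never created.
    conn.execute(
        "INSERT INTO accounts (id, contract_id, balance) VALUES (1, 42, 100)")
    # The behavior under test: update the account, record the transaction.
    conn.execute("UPDATE accounts SET balance = balance + 25 WHERE id = 1")
    conn.execute("INSERT INTO transactions (account_id, amount) VALUES (1, 25)")

saved_transactions = conn.execute(
    "SELECT account_id, amount FROM transactions").fetchall()
enforcement_restored = conn.execute("PRAGMA foreign_keys").fetchone()[0]
```

The trade-off is the one described above: the test is easier to set up, but it no longer runs under the constraints that production will apply.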
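They can make deleting records more cumbersome: you can't delete the "master" record while records in other tables still reference it, because that would violate the foreign key constraint. You can use triggers to perform cascading deletes.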
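A rough sketch of a trigger-based cascading delete, written against SQLite through Python's sqlite3 module. The table and trigger names are hypothetical, and trigger syntax varies between databases, so treat this as one possible shape rather than a portable recipe.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
    CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
    CREATE TABLE orders (
        id INTEGER PRIMARY KEY,
        customer_id INTEGER NOT NULL REFERENCES customers(id)
    );

    -- Remove dependent orders just before the customer row is deleted.
    CREATE TRIGGER customers_delete_cascade
    BEFORE DELETE ON customers
    FOR EACH ROW
    BEGIN
        DELETE FROM orders WHERE customer_id = OLD.id;
    END;
""")
conn.execute("INSERT INTO customers (id, name) VALUES (1, 'alice')")
conn.execute("INSERT INTO orders (id, customer_id) VALUES (100, 1)")
conn.execute("INSERT INTO orders (id, customer_id) VALUES (101, 1)")

# Without the trigger this delete would be refused by the foreign key.
conn.execute("DELETE FROM customers WHERE id = 1")
orders_left = conn.execute("SELECT COUNT(*) FROM orders").fetchone()[0]
customers_left = conn.execute("SELECT COUNT(*) FROM customers").fetchone()[0]
```

Where the database supports it, declaring the foreign key with ON DELETE CASCADE achieves the same effect without a trigger.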
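If you choose your primary key unwisely, changing its value becomes even more complex. For example, if the PK of my "customers" table is the person's name, and I make that key an FK in the "orders" table, then it is a royal pain when a customer wants to change his name... but that is just shoddy database design.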
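One common way to avoid that pain is a surrogate key: the foreign key points at a meaningless id, and the name is an ordinary column. A short sketch with hypothetical tables and sample names:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
    CREATE TABLE customers (
        id INTEGER PRIMARY KEY,   -- surrogate key, never shown to the user
        name TEXT NOT NULL        -- free to change at any time
    );
    CREATE TABLE orders (
        id INTEGER PRIMARY KEY,
        customer_id INTEGER NOT NULL REFERENCES customers(id)
    );
""")
conn.execute("INSERT INTO customers (id, name) VALUES (1, 'Alice Smith')")
conn.execute("INSERT INTO orders (id, customer_id) VALUES (100, 1)")
conn.execute("INSERT INTO orders (id, customer_id) VALUES (101, 1)")

# The rename touches exactly one row; no order has to be rewritten.
conn.execute("UPDATE customers SET name = 'Alice Jones' WHERE id = 1")
names_on_orders = conn.execute("""
    SELECT c.name
    FROM orders AS o
    JOIN customers AS c ON c.id = o.customer_id
    ORDER BY o.id
""").fetchall()
```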
I believe the advantages of using foreign keys outweigh any supposed disadvantages.