Should I commit .tfstate files to Git?

I'm a little confused about whether .tfstate files should be committed to Git. The Terraform documentation states:

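Terraform also puts some state into the terraform.tfstate file by default. This state file is extremely important; it maps various resource metadata to actual resource IDs so that Terraform knows what it is managing. This file must be saved and distributed to anyone who might run Terraform. We recommend simply putting it into version control, since it generally isn't too large.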

Now, on the other hand, the accepted and upvoted answer to Best practices when using Terraform states:

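Terraform config can be used to provision many boxes on different infrastructure, each of which could have a different state. As it can also be run by multiple people, this state should be in a centralised location (like S3) but *not* git.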

(Emphasis the original author's, not mine.)

Who's right, and why?


This is probably going to come down to preference, but I would say Git (or any other source control) is not a particularly good option for storing state files, as they are an output of the code you are writing, much like a compiled binary, minified JS, or LESS compiled to CSS.

On top of that, state files may change quite rapidly as a result of Terraform being run rather than of anything actually being changed in the code, which makes the whole thing rather awkward.

However, you do need some way of sharing these state files with remote team members, or even with your own other devices if you develop on different laptops/machines. You will also want some way to store them and back them up, because you're going to have some real pain if you lose a state file: Terraform uses the state file to work out which things it is managing, so as not to step on the toes of other tooling.

I'd say S3 is probably the best place you can put them right now. It's pretty much free, durability and availability are both excellent, and there's very good native support for it in Terraform's remote state feature. Probably most importantly, you only have to create an S3 bucket to get started. Having to build a Consul or etcd cluster first without Terraform (otherwise you have a chicken-and-egg problem: where do you store the state for creating those?) is a bit of a pain, even if you intend to use either of those products.

Obviously, if you're using OpenStack then Swift should make a good alternative (although I've not used it). I've also not used HashiCorp's Atlas, but if you're happy to pay for that service it might be equally useful.

There are a few reasons not to store your .tfstate files in Git:

  1. You are likely to forget to commit and push your changes after running terraform apply, so your teammates will have out-of-date .tfstate files. Also, without any locking on these state files, if two team members run Terraform at the same time on the same .tfstate files, you may overwrite each other's changes. You can solve both problems by a) storing .tfstate files in an S3 bucket using Terraform remote state, which will push/pull the .tfstate files automatically every time you run terraform apply, and b) using a tool like Terragrunt to provide locking for your .tfstate files.
  2. The .tfstate files may contain secrets. For example, if you use the aws_db_instance resource, you have to specify a database password, and Terraform will store that, in plaintext, in the .tfstate file. This is a bad practice on Terraform's part to begin with, and storing unencrypted secrets in version control only makes it worse. At least if you store .tfstate files in S3, you can enable encryption at rest (SSL provides encryption in transit) and configure IAM policies to limit who has access. It's very far from ideal, and we'll have to see whether the open issue discussing this problem ever gets fixed.
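
As a rough sketch of point 1a and the encryption at rest mentioned in point 2, a remote state configuration might look like the following in later Terraform versions, which use a backend block (a syntax that postdates this answer). The bucket name, key, region, and DynamoDB table name are all hypothetical placeholders, and the bucket is assumed to exist already.

```hcl
terraform {
  backend "s3" {
    # Hypothetical bucket, created ahead of time outside this configuration.
    bucket = "example-terraform-state"

    # Path of the state object inside the bucket (placeholder).
    key    = "prod/terraform.tfstate"
    region = "us-east-1"

    # Ask S3 to encrypt the state object at rest.
    encrypt = true

    # Optional: a DynamoDB table (hypothetical name) used for state locking.
    # Locking options differ between Terraform releases, so check the
    # S3 backend docs for the version you run.
    dynamodb_table = "example-terraform-locks"
  }
}
```

After adding or changing a backend block, terraform init has to be run again so that Terraform can set up the backend. Access to the bucket would still need to be restricted with IAM policies, which this sketch does not cover.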

For more info, check out How to manage Terraform state and Terraform: Up & Running, both of which I wrote.

TL;DR:

Important! Storing state in source control could expose potentially sensitive data and risks running Terraform against an old version of state. Don't do it.

Terraform no longer recommends storing state in source control. Your 'good' options are remote or local.

Remote state offers significant benefits over both local state and state stored in source control. Details are below.


Original answer:

Yevgeniy's answer is a good one. The issue is somewhat less controversial now as Terraform have updated their docs to state:

Terraform also puts some state into the terraform.tfstate file by default. This state file is extremely important; it maps various resource metadata to actual resource IDs so that Terraform knows what it is managing. This file must be saved and distributed to anyone who might run Terraform. It is generally recommended to setup remote state when working with Terraform. This will mean that any potential secrets stored in the state file, will not be checked into version control

So there is no longer a disagreement between established best practice and official recommendations.


Update 2019-05-17

In the most recent version of the docs this has been changed to say:

... This state is stored by default in a local file named "terraform.tfstate", but it can also be stored remotely, which works better in a team environment. ...

I don't expect the advice will ever revert to source control being the preferred method of storing state.

Although the docs quote above focuses on team environments, remote state is still beneficial for a solo developer.

Remote state allows the solo developer to:

  • Work on/run their Terraform code from several devices
  • Easily back up and protect against losing the state file, depending on backend chosen
  • Segregate sections of their architecture via outputs
  • Automatically encrypt state file at rest, depending on backend chosen
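
To illustrate the point about segregating sections of an architecture via outputs, here is a minimal sketch using the terraform_remote_state data source with Terraform 0.12+ syntax. It assumes a separate, hypothetical "network" configuration that stores its state in S3 and declares an output named subnet_id; the bucket, key, region, and AMI ID are placeholders.

```hcl
# Read the outputs of another configuration's state (hypothetical location).
data "terraform_remote_state" "network" {
  backend = "s3"

  config = {
    bucket = "example-terraform-state"
    key    = "network/terraform.tfstate"
    region = "us-east-1"
  }
}

# Use one of those outputs here. The network configuration is assumed to
# contain: output "subnet_id" { value = ... }
resource "aws_instance" "app" {
  ami           = "ami-12345678" # placeholder AMI ID
  instance_type = "t3.micro"
  subnet_id     = data.terraform_remote_state.network.outputs.subnet_id
}
```

Only values that the other configuration explicitly exposes as outputs are available this way, which is what keeps the two sections loosely coupled.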

I see an advantage in sharing terraform.tfstate by means other than Git.

For example: S3, Dropbox, etc. (with versioning turned on).

Then it will be possible to roll back to a previous infrastructure state.

For example, suppose you roll the repository back from commit B to commit A. If terraform.tfstate is left unchanged, Terraform will work out how to remove everything you added in commit B, and the rollback will be easy.

If terraform.tfstate was also rolled back to commit A, Terraform will think the state is already in sync with the required configuration and will not apply the rollback to your infrastructure.
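
For the S3 option with versioning turned on, a minimal sketch of a versioned bucket might look like this. It assumes a recent AWS provider (v4 or later, where versioning is its own resource), and the bucket name is a hypothetical placeholder. The state for this small configuration itself has to live somewhere else, for example locally, since the bucket cannot hold its own state before it exists.

```hcl
# Bucket intended to hold terraform.tfstate files (hypothetical name).
resource "aws_s3_bucket" "state" {
  bucket = "example-terraform-state"
}

# Keep every previous version of each state object so that an older
# state file can be restored if needed.
resource "aws_s3_bucket_versioning" "state" {
  bucket = aws_s3_bucket.state.id

  versioning_configuration {
    status = "Enabled"
  }
}
```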