chore: Enable migration tests for clusters in legacy schema #2975

Merged · 17 commits · Jan 17, 2025
Changes from 13 commits
@@ -29,7 +29,6 @@ func TestMigAdvancedCluster_singleShardedMultiCloud(t *testing.T) {
}

func TestMigAdvancedCluster_symmetricGeoShardedOldSchema(t *testing.T) {
acc.SkipIfAdvancedClusterV2Schema(t) // unexpected update and then: error operation not permitted, nums_shards from 1 -> > 1
Member Author: the last mig test to enable for TPF

testCase := symmetricGeoShardedOldSchemaTestCase(t, false)
mig.CreateAndRunTest(t, &testCase)
}
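For reference, a migration test like the one above generally runs the same configuration twice: first against the last released provider, then against the local build while asserting an empty plan. Below is a rough, generic sketch of that shape using terraform-plugin-testing; the pinned version, the resource config, and the testAccProviderV6Factories helper are placeholders, not the repo's actual mig helpers.

```go
package migration_test // illustrative package name

import (
	"testing"

	"github.com/hashicorp/terraform-plugin-testing/helper/resource"
	"github.com/hashicorp/terraform-plugin-testing/plancheck"
)

func TestMigAdvancedCluster_exampleShape(t *testing.T) {
	config := `resource "mongodbatlas_advanced_cluster" "test" { /* placeholder config */ }`
	resource.Test(t, resource.TestCase{
		Steps: []resource.TestStep{
			{
				// Step 1: create with a previously released provider version.
				ExternalProviders: map[string]resource.ExternalProvider{
					"mongodbatlas": {Source: "mongodb/mongodbatlas", VersionConstraint: "1.24.0"}, // placeholder version
				},
				Config: config,
			},
			{
				// Step 2: same config against the local provider build; the upgrade must not produce plan changes.
				ProtoV6ProviderFactories: testAccProviderV6Factories, // assumed helper exposing the local provider
				Config:                   config,
				ConfigPlanChecks: resource.ConfigPlanChecks{
					PreApply: []plancheck.PlanCheck{plancheck.ExpectEmptyPlan()},
				},
			},
		},
	})
}
```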
138 changes: 94 additions & 44 deletions internal/service/advancedclustertpf/move_upgrade_state.go
@@ -3,12 +3,12 @@ package advancedclustertpf
import (
"context"
"fmt"
"math/big"
"strings"

"github.com/hashicorp/terraform-plugin-framework-timeouts/resource/timeouts"
"github.com/hashicorp/terraform-plugin-framework/attr"
"github.com/hashicorp/terraform-plugin-framework/diag"
"github.com/hashicorp/terraform-plugin-framework/path"
"github.com/hashicorp/terraform-plugin-framework/resource"
"github.com/hashicorp/terraform-plugin-framework/tfsdk"
"github.com/hashicorp/terraform-plugin-framework/types"
@@ -18,10 +18,12 @@ import (
"go.mongodb.org/atlas-sdk/v20241113004/admin"
)

// MoveState is used with moved block to upgrade from cluster to adv_cluster
func (r *rs) MoveState(context.Context) []resource.StateMover {
return []resource.StateMover{{StateMover: stateMover}}
}

// UpgradeState is used to upgrade from adv_cluster schema v1 (SDKv2) to v2 (TPF)
func (r *rs) UpgradeState(ctx context.Context) map[int64]resource.StateUpgrader {
return map[int64]resource.StateUpgrader{
1: {StateUpgrader: stateUpgraderFromV1},
Expand All @@ -39,101 +41,149 @@ func stateUpgraderFromV1(ctx context.Context, req resource.UpgradeStateRequest,
setStateResponse(ctx, &resp.Diagnostics, req.RawState, &resp.State)
}

func setStateResponse(ctx context.Context, diags *diag.Diagnostics, stateIn *tfprotov6.RawState, stateOut *tfsdk.State) {
rawStateValue, err := stateIn.UnmarshalWithOpts(tftypes.Object{
// Minimum attributes needed from source schema. Read will fill in the rest
Collaborator: Non-blocking comment: in the future, when new attributes are added, will there be some compile-time or test failure if that field needs to be specified here?

Member Author: In line 69 we're using IgnoreUndefinedAttributes: true, which means we're flexible about the schema. If you look at the code below, the only truly mandatory ones are project_id and the cluster name; the others are used on a best-effort basis, and it's fine if they aren't present (e.g. later, when moving from a flex cluster).

So the schema doesn't fail if some attribute doesn't exist; when the value is read later it will simply be null.

Collaborator:
> it's ok if they don't come

OK, then why are we even populating them? What is the advantage of populating them? Answering this question should also help me understand the consequence of:
> later when the value is tried to be read it will be null

Member Author (@lantoli), Jan 17, 2025: If the previous version/resource has them and we don't set them, there will be a plan change. For example, if timeouts or retain_backups_enabled is defined in SDKv2 and the user wants to migrate to TPF, they will get a plan change saying that timeouts/retain_backups_enabled will be deleted. (We don't want plan changes when upgrading from SDKv2 to TPF, or when using a moved block from cluster to the TPF adv_cluster.)

In the case that will come later, for example moving from flex cluster to adv_cluster, those attributes don't exist, so it's fine not to send them.

Member Author: It's to avoid plan changes, by filling in attributes that Read can't provide.

Collaborator: Got it!

Member Author: Clarified here: b67243f

var stateAttrs = map[string]tftypes.Type{
	"project_id":             tftypes.String, // project_id and name to identify the cluster
	"name":                   tftypes.String,
	"retain_backups_enabled": tftypes.Bool, // TF specific so can't be got in Read
	"mongo_db_major_version": tftypes.String, // Has special logic in overrideAttributesWithPrevStateValue that needs the previous state
	"timeouts": tftypes.Object{ // TF specific so can't be got in Read
		AttributeTypes: map[string]tftypes.Type{
			"create": tftypes.String,
			"update": tftypes.String,
			"delete": tftypes.String,
		},
	},
	"replication_specs": tftypes.List{ // Needed to check if some num_shards are > 1 so we need to force legacy schema
		ElementType: tftypes.Object{
			AttributeTypes: map[string]tftypes.Type{
				"num_shards": tftypes.Number,
			},
		},
	},
}
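To make the minimal schema above concrete, here is a small same-package sketch (not part of the PR) that pushes a fabricated v1 state fragment through the same UnmarshalWithOpts call used below. With IgnoreUndefinedAttributes set, attributes present in the JSON but absent from stateAttrs are skipped, and attributes listed in stateAttrs but missing from the JSON simply read back as nil, which is the behavior discussed in the thread above. The JSON values and the function name are illustrative only.

```go
func exampleUpgradeUnmarshal() {
	raw := tfprotov6.RawState{JSON: []byte(`{
		"project_id": "000000000000000000000000",
		"name": "legacy-cluster",
		"attr_not_in_state_attrs": "skipped thanks to IgnoreUndefinedAttributes",
		"replication_specs": [{"num_shards": 2}]
	}`)}
	val, err := raw.UnmarshalWithOpts(
		tftypes.Object{AttributeTypes: stateAttrs},
		tfprotov6.UnmarshalOpts{ValueFromJSONOpts: tftypes.ValueFromJSONOpts{IgnoreUndefinedAttributes: true}},
	)
	if err != nil {
		panic(err)
	}
	var stateObj map[string]tftypes.Value
	if err := val.As(&stateObj); err != nil {
		panic(err)
	}
	// Attributes present in the JSON are readable; absent ones come back as nil.
	fmt.Println(*getAttrFromStateObj[string](stateObj, "name"))                // legacy-cluster
	fmt.Println(getAttrFromStateObj[bool](stateObj, "retain_backups_enabled")) // <nil>
}
```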

func setStateResponse(ctx context.Context, diags *diag.Diagnostics, stateIn *tfprotov6.RawState, stateOut *tfsdk.State) {
rawStateValue, err := stateIn.UnmarshalWithOpts(tftypes.Object{
AttributeTypes: stateAttrs,
}, tfprotov6.UnmarshalOpts{ValueFromJSONOpts: tftypes.ValueFromJSONOpts{IgnoreUndefinedAttributes: true}})
if err != nil {
diags.AddError("Unable to Unmarshal state", err.Error())
return
}
var rawState map[string]tftypes.Value
if err := rawStateValue.As(&rawState); err != nil {
var stateObj map[string]tftypes.Value
if err := rawStateValue.As(&stateObj); err != nil {
diags.AddError("Unable to Parse state", err.Error())
return
}

projectID := getAttrFromRawState[string](diags, rawState, "project_id")
name := getAttrFromRawState[string](diags, rawState, "name")
projectID, name := getProjectIDNameFromStateObj(diags, stateObj)
if diags.HasError() {
return
}
if !conversion.IsStringPresent(projectID) || !conversion.IsStringPresent(name) {
diags.AddError("Unable to read project_id or name from state", fmt.Sprintf("project_id: %s, name: %s",
conversion.SafeString(projectID), conversion.SafeString(name)))
return
}

model := NewTFModel(ctx, &admin.ClusterDescription20240805{
GroupId: projectID,
Name: name,
}, getAttrTimeout(diags, rawState), diags, ExtraAPIInfo{})
if diags.HasError() {
return
}

if retainBackupsEnabled := getAttrFromRawState[bool](diags, rawState, "retain_backups_enabled"); retainBackupsEnabled != nil {
model.RetainBackupsEnabled = types.BoolPointerValue(retainBackupsEnabled)
}
if mongoDBMajorVersion := getAttrFromRawState[string](diags, rawState, "mongo_db_major_version"); mongoDBMajorVersion != nil {
model.MongoDBMajorVersion = types.StringPointerValue(mongoDBMajorVersion)
}
}, getTimeoutFromStateObj(stateObj), diags, ExtraAPIInfo{})
if diags.HasError() {
return
}

AddAdvancedConfig(ctx, model, nil, nil, diags)
if diags.HasError() {
return
}
setOptionalModelAttrs(stateObj, model)
diags.Append(stateOut.Set(ctx, model)...)
}

func getAttrFromRawState[T any](diags *diag.Diagnostics, rawState map[string]tftypes.Value, attrName string) *T {
func getAttrFromStateObj[T any](rawState map[string]tftypes.Value, attrName string) *T {
var ret *T
if err := rawState[attrName].As(&ret); err != nil {
diags.AddAttributeError(path.Root(attrName), fmt.Sprintf("Unable to read cluster %s", attrName), err.Error())
return nil
}
return ret
}
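A short aside on the generic helper above (a sketch, not code from the PR; it assumes the same package so the existing math/big and tftypes imports apply): tftypes.Value.As fills a pointer target, and passing a pointer-to-pointer via &ret is what lets null or absent attributes come back as a nil *T. Strings convert to *string, booleans to *bool and numbers to *big.Float, which is why isLegacySchemaState further down compares *big.Float values.

```go
func exampleAsConversions() {
	strVal := tftypes.NewValue(tftypes.String, "cluster-a")
	nullVal := tftypes.NewValue(tftypes.String, nil)

	var name *string
	_ = strVal.As(&name) // name points at "cluster-a"

	var missing *string
	_ = nullVal.As(&missing) // missing stays nil: null values convert to a nil pointer

	var shards *big.Float
	_ = tftypes.NewValue(tftypes.Number, big.NewFloat(3)).As(&shards) // shards holds 3
}
```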

func getAttrTimeout(diags *diag.Diagnostics, rawState map[string]tftypes.Value) timeouts.Value {
func getProjectIDNameFromStateObj(diags *diag.Diagnostics, stateObj map[string]tftypes.Value) (projectID, name *string) {
projectID = getAttrFromStateObj[string](stateObj, "project_id")
name = getAttrFromStateObj[string](stateObj, "name")
if !conversion.IsStringPresent(projectID) || !conversion.IsStringPresent(name) {
diags.AddError("Unable to read project_id or name from state", fmt.Sprintf("project_id: %s, name: %s",
conversion.SafeString(projectID), conversion.SafeString(name)))
return
}
return projectID, name
}

func getTimeoutFromStateObj(stateObj map[string]tftypes.Value) timeouts.Value {
attrTypes := map[string]attr.Type{
"create": types.StringType,
"update": types.StringType,
"delete": types.StringType,
}
nullObj := timeouts.Value{Object: types.ObjectNull(attrTypes)}
timeoutState := getAttrFromRawState[map[string]tftypes.Value](diags, rawState, "timeouts")
if diags.HasError() || timeoutState == nil {
timeoutState := getAttrFromStateObj[map[string]tftypes.Value](stateObj, "timeouts")
if timeoutState == nil {
return nullObj
}
timeoutMap := make(map[string]attr.Value)
for action := range attrTypes {
actionTimeout := getAttrFromRawState[string](diags, *timeoutState, action)
actionTimeout := getAttrFromStateObj[string](*timeoutState, action)
if actionTimeout == nil {
timeoutMap[action] = types.StringNull()
} else {
timeoutMap[action] = types.StringPointerValue(actionTimeout)
}
}
obj, d := types.ObjectValue(attrTypes, timeoutMap)
diags.Append(d...)
if diags.HasError() {
if d.HasError() {
return nullObj
}
return timeouts.Value{Object: obj}
}

func setOptionalModelAttrs(stateObj map[string]tftypes.Value, model *TFModel) {
if retainBackupsEnabled := getAttrFromStateObj[bool](stateObj, "retain_backups_enabled"); retainBackupsEnabled != nil {
model.RetainBackupsEnabled = types.BoolPointerValue(retainBackupsEnabled)
}
if mongoDBMajorVersion := getAttrFromStateObj[string](stateObj, "mongo_db_major_version"); mongoDBMajorVersion != nil {
model.MongoDBMajorVersion = types.StringPointerValue(mongoDBMajorVersion)
}
if isLegacySchemaState(stateObj) {
sendLegacySchemaRequestToRead(model)
}
}

func isLegacySchemaState(stateObj map[string]tftypes.Value) bool {
one := big.NewFloat(1.0)
specsVal := getAttrFromStateObj[[]tftypes.Value](stateObj, "replication_specs")
if specsVal == nil {
return false
}
for _, specVal := range *specsVal {
var specObj map[string]tftypes.Value
if err := specVal.As(&specObj); err != nil {
return false
}
numShardsVal := specObj["num_shards"]
var numShards *big.Float
if err := numShardsVal.As(&numShards); err != nil || numShards == nil {
return false
}
if numShards.Cmp(one) > 0 { // legacy schema if numShards > 1
return true
}
}
return false
}
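A quick illustration of the detection above (a same-package sketch, not a test from this PR; the function name and values are made up): building replication_specs by hand with tftypes.NewValue, a spec with num_shards = 2 flags the state as legacy, while num_shards = 1 does not.

```go
func exampleLegacyDetection() {
	specType := tftypes.Object{AttributeTypes: map[string]tftypes.Type{"num_shards": tftypes.Number}}
	listType := tftypes.List{ElementType: specType}
	spec := func(shards float64) tftypes.Value {
		return tftypes.NewValue(specType, map[string]tftypes.Value{
			"num_shards": tftypes.NewValue(tftypes.Number, big.NewFloat(shards)),
		})
	}

	legacy := map[string]tftypes.Value{
		"replication_specs": tftypes.NewValue(listType, []tftypes.Value{spec(2)}),
	}
	current := map[string]tftypes.Value{
		"replication_specs": tftypes.NewValue(listType, []tftypes.Value{spec(1)}),
	}
	fmt.Println(isLegacySchemaState(legacy))  // true
	fmt.Println(isLegacySchemaState(current)) // false
}
```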

// sendLegacySchemaRequestToRead sets ClusterID to a special value so Read can know whether it must use legacy schema.
// private state can't be used here because it's not available in Move Upgrader.
// ClusterID is computed (not optional) so the value will be overridden in Read and the special value won't ever appear in the state file.
Member Author: Let me know if there is any question about why we use ClusterID as a side channel to communicate between State Move/Upgrader and Read. Also, if you have a better idea, please let me know.

Collaborator: If we go with this option, do you think we can add a check of ClusterID != forceLegacySchema in one of the tests that test the upgrade?

Member Author: Not using ClusterID any more.

func sendLegacySchemaRequestToRead(model *TFModel) {
model.ClusterID = types.StringValue("forceLegacySchema")
}
Member: As an alternative, is it too complex to populate the replication spec list only with the bare minimum, so our existing logic detects the legacy sharding config (objects with num_shards)? With this approach we would avoid receivedLegacySchemaRequestInRead using cluster_id, which looks hacky.

Member Author: I was a bit reluctant in case we needed to pass a lot of structure, but in the end we only need to pass num_shards in the replication specs. Changed.

Member: Correct, looks good. If we can leave a comment giving context on why we add the replication specs, that would be good for future reference.

Member Author: I did, here: 8fe7cd6.

Would you add something else to the latest PR doc?


// receivedLegacySchemaRequestInRead checks if Read has to use the legacy schema because a State Move or Upgrader happened just before.
func receivedLegacySchemaRequestInRead(model *TFModel) bool {
return model.ClusterID.ValueString() == "forceLegacySchema"
}
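On the review suggestion above about asserting that the sentinel never reaches the state: if the sentinel approach in this revision were kept, such a check could look roughly like the sketch below, using terraform-plugin-testing's TestCheckResourceAttrWith. The function name and resource address are placeholders, and the usual fmt and helper/resource imports are assumed.

```go
// Sketch: a state check asserting the internal sentinel never leaks into cluster_id.
func checkClusterIDHasNoSentinel() resource.TestCheckFunc {
	return resource.TestCheckResourceAttrWith(
		"mongodbatlas_advanced_cluster.test", // placeholder resource address
		"cluster_id",
		func(value string) error {
			if value == "forceLegacySchema" {
				return fmt.Errorf("sentinel value leaked into state: %q", value)
			}
			return nil
		},
	)
}
```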
3 changes: 2 additions & 1 deletion internal/service/advancedclustertpf/resource.go
@@ -304,7 +304,8 @@ func (r *rs) readCluster(ctx context.Context, diags *diag.Diagnostics, state *TF
return nil
}
warningIfFCVExpiredOrUnpinnedExternally(diags, state, readResp)
modelOut, _ := getBasicClusterModel(ctx, diags, r.Client, readResp, state, false)
forceLegacySchema := receivedLegacySchemaRequestInRead(state)
modelOut, _ := getBasicClusterModel(ctx, diags, r.Client, readResp, state, forceLegacySchema)
if diags.HasError() {
return nil
}