Fix flaky test `TestAggregatedClusterSuccess_SwitchBetweenLeafAndAggregate` in aggregate_cluster_test.go. (#9009)

Fixes #8989 

This PR addresses the flakiness observed in the
`TestAggregatedClusterSuccess_SwitchBetweenLeafAndAggregate` test. The test
was using real DNS resolution, which introduced non-deterministic delays
and race conditions during testing.

### Changes:
* **Mocked DNS Resolver**: Introduced a `setupDNS` helper function in
`cdsbalancer_test.go` that unregisters the default DNS resolver and
registers a manual resolver for the `dns` scheme. This allows us to
intercept DNS targets and provide mock addresses immediately.

* **Updated Tests**: Updated the following tests in
`aggregate_cluster_test.go` to use the fake DNS resolver:
    *   `TestAggregatedClusterSuccess_SwitchBetweenLeafAndAggregate`
    *   `TestAggregateClusterSuccess_ThenUpdateChildClusters`
    *   `TestAggregateClusterSuccess_ThenChangeRootToEDS`
 
* **Adjusted Test Timeout**: Increased the `defaultTestTimeout` in
`cdsbalancer_test.go` from `5s` to `10s` to align with the standard test
timeout practices used across the `grpc-go` codebase.
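
The resolver swap described in the first bullet can be sketched with a toy registry. This is a minimal, standard-library-only illustration, not the actual `setupDNS` helper: the `Builder` interface, `Register`, `Get`, `swapDNS`, `fakeDNS`, and `defaultDNS` are hypothetical stand-ins for the real resolver registry and manual resolver, and the target and address values are made up.

```go
package main

import (
	"fmt"
	"sync"
)

// Builder is a hypothetical stand-in for a resolver builder: it turns a
// target string into a list of addresses.
type Builder interface {
	Scheme() string
	Resolve(target string) []string
}

// registry is a simplified scheme-to-builder table (illustrative only).
var (
	mu       sync.Mutex
	registry = map[string]Builder{}
)

// Register installs b for its scheme, replacing any previous builder.
func Register(b Builder) {
	mu.Lock()
	defer mu.Unlock()
	registry[b.Scheme()] = b
}

// Get returns the builder registered for scheme, or nil if there is none.
func Get(scheme string) Builder {
	mu.Lock()
	defer mu.Unlock()
	return registry[scheme]
}

// defaultDNS stands in for the real DNS resolver; it returns nothing here.
type defaultDNS struct{}

func (defaultDNS) Scheme() string                 { return "dns" }
func (defaultDNS) Resolve(target string) []string { return nil }

// fakeDNS returns fixed addresses immediately and records the targets it
// was asked to resolve so a test can inspect them.
type fakeDNS struct {
	addrs   []string
	targets chan string // should be buffered; extra targets are dropped
}

func (f *fakeDNS) Scheme() string { return "dns" }

func (f *fakeDNS) Resolve(target string) []string {
	select {
	case f.targets <- target:
	default:
	}
	return f.addrs
}

// swapDNS replaces whatever is registered for "dns" with fake and returns
// a function that restores the original, suitable for t.Cleanup.
func swapDNS(fake Builder) (restore func()) {
	orig := Get("dns")
	Register(fake)
	return func() {
		mu.Lock()
		defer mu.Unlock()
		if orig == nil {
			delete(registry, "dns")
			return
		}
		registry["dns"] = orig
	}
}

func main() {
	Register(defaultDNS{})

	fake := &fakeDNS{addrs: []string{"127.0.0.1:8080"}, targets: make(chan string, 1)}
	restore := swapDNS(fake)
	defer restore()

	addrs := Get("dns").Resolve("dns:///example.test:443")
	fmt.Println("resolved:", addrs, "intercepted target:", <-fake.targets)
}
```

In a test, the restore function would typically be passed to `t.Cleanup` so the default resolver is back in place before the next test runs.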

### Testing/Validation:
* Successfully reproduced the flakiness locally by manually injecting a
sleep delay to simulate a slow real DNS resolution update in
`TestAggregatedClusterSuccess_SwitchBetweenLeafAndAggregate`.
* Validated that applying the mock DNS resolver completely resolves the
flakiness.

RELEASE NOTES: N/A
2 files changed
tree: a084a36221c16559dc72a50fc0ac494d710612b1
README.md

gRPC-Go

GoDoc GoReportCard codecov

The Go implementation of gRPC: A high performance, open source, general RPC framework that puts mobile and HTTP/2 first. For more information see the Go gRPC docs, or jump directly into the quick start.

Prerequisites

Installation

Simply add the following import to your code, and then go [build|run|test] will automatically fetch the necessary dependencies:

import "google.golang.org/grpc"

Note: If you are trying to access grpc-go from China, see the FAQ below.

Learn more

FAQ

I/O Timeout Errors

The golang.org domain may be blocked in some countries. When this happens, go get usually produces an error like the following:

$ go get -u google.golang.org/grpc
package google.golang.org/grpc: unrecognized import path "google.golang.org/grpc" (https fetch: Get https://google.golang.org/grpc?go-get=1: dial tcp 216.239.37.1:443: i/o timeout)

To build Go code, there are several options:

  • Set up a VPN and access google.golang.org through that.

  • With Go module support: it is possible to use the replace feature of go mod to create aliases for golang.org packages. In your project's directory:

    go mod edit -replace=google.golang.org/grpc=github.com/grpc/grpc-go@latest
    go mod tidy
    go mod vendor
    go build -mod=vendor
    

    This will need to be done for all transitive dependencies hosted on golang.org as well. For details, refer to golang/go issue #28652.

Compiling error, undefined: grpc.SupportPackageIsVersion

Please update to the latest version of gRPC-Go using go get google.golang.org/grpc.

How to turn on logging

The default logger is controlled by environment variables. Turn everything on like this:

$ export GRPC_GO_LOG_VERBOSITY_LEVEL=99
$ export GRPC_GO_LOG_SEVERITY_LEVEL=info

The RPC failed with error "code = Unavailable desc = transport is closing"

This error means the connection the RPC is using was closed, and there are many possible reasons, including:

  1. Misconfigured transport credentials, causing the connection to fail during the handshake
  2. Bytes disrupted, possibly by a proxy in between
  3. Server shutdown
  4. Keepalive parameters that caused the connection to shut down, for example if you have configured your server to terminate connections regularly to trigger DNS lookups. If this is the case, you may want to increase your MaxConnectionAgeGrace to allow longer RPC calls to finish.

It can be tricky to debug this because the error happens on the client side but the root cause of the connection being closed is on the server side. Turn on logging on both client and server, and see if there are any transport errors.