stats/opentelemetry: restore the changes from #8342 and fix the flaky test. (#8923)

Fixes https://github.com/grpc/grpc-go/issues/8700

This PR re-lands the changes from #8342, which were reverted due to a
flaky test. It cherry-picks the original commits and adds a fix for the
underlying race condition that caused the test
`TestTraceSpan_WithRetriesAndNameResolutionDelay` to flake.

**Problem**

As mentioned
[here](https://github.com/grpc/grpc-go/pull/8715#issuecomment-3581229693),
the test was flaky because of a race condition between the load
balancing policy creating a ready picker and the RPC attempting to pick
a connection. If the picker became ready before the first Pick attempt,
the RPC would not be delayed, the "Delayed LB pick complete" event would
not be emitted, and the test would fail.

**Fix**

To solve the race condition, the test now uses a custom stub balancer
that initially returns a blocking picker, which guarantees the RPC will
wait for a connection. Once the RPC has attempted to Pick and is
confirmed to be in a waiting state, the balancer provides a valid,
"non-blocking" picker, allowing the RPC to succeed. This sequence
reliably triggers the `DelayedPickComplete` event.
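
The ordering described above can be illustrated with a minimal,
standard-library-only sketch. It is not the test code from this PR and
does not use the grpc-go balancer API; the names `picker`,
`blockingPicker`, `readyPicker`, `channel`, `runScenario`, and the
address `backend-1` are all hypothetical stand-ins chosen for
illustration. The point is only the synchronization: the ready picker is
published strictly after the first Pick attempt has been observed, so
the pick is always delayed.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
)

// errNoConn is a hypothetical stand-in for the "no connection yet" result
// that a blocking picker returns.
var errNoConn = errors.New("no connection available")

// picker is a simplified, hypothetical picker interface.
type picker interface {
	Pick() (string, error)
}

// blockingPicker never yields a connection and signals the first Pick attempt.
type blockingPicker struct {
	once   sync.Once
	picked chan struct{}
}

func (p *blockingPicker) Pick() (string, error) {
	p.once.Do(func() { close(p.picked) })
	return "", errNoConn
}

// readyPicker always yields the configured address.
type readyPicker struct{ addr string }

func (p readyPicker) Pick() (string, error) { return p.addr, nil }

// channel holds the current picker and wakes waiting RPCs when it changes.
type channel struct {
	mu      sync.Mutex
	cur     picker
	updated chan struct{} // closed and replaced on every picker update
}

func newChannel(p picker) *channel {
	return &channel{cur: p, updated: make(chan struct{})}
}

func (c *channel) updatePicker(p picker) {
	c.mu.Lock()
	defer c.mu.Unlock()
	c.cur = p
	close(c.updated)
	c.updated = make(chan struct{})
}

// pick retries until a picker yields a connection. The boolean reports
// whether the caller had to wait for a picker update at least once.
func (c *channel) pick() (string, bool) {
	delayed := false
	for {
		c.mu.Lock()
		p, updated := c.cur, c.updated
		c.mu.Unlock()
		addr, err := p.Pick()
		if err == nil {
			return addr, delayed
		}
		delayed = true
		<-updated // wait for the next picker
	}
}

// runScenario publishes the ready picker only after the RPC goroutine has
// attempted a Pick against the blocking picker.
func runScenario() (string, bool) {
	bp := &blockingPicker{picked: make(chan struct{})}
	ch := newChannel(bp)

	type result struct {
		addr    string
		delayed bool
	}
	done := make(chan result, 1)
	go func() {
		addr, delayed := ch.pick()
		done <- result{addr, delayed}
	}()

	<-bp.picked // the RPC has attempted Pick and must now wait
	ch.updatePicker(readyPicker{addr: "backend-1"})
	r := <-done
	return r.addr, r.delayed
}

func main() {
	addr, delayed := runScenario()
	fmt.Printf("picked %s, delayed=%v\n", addr, delayed)
}
```

Because the update is gated on the `picked` signal, the delayed path is
taken on every run rather than only when the scheduler happens to favor
it, which is the property the flaky test lacked.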

RELEASE NOTES: 
* stats/opentelemetry: Retry attempts (grpc.previous-rpc-attempts) are
now recorded as span attributes for non-transparent client retries.

---------

Co-authored-by: vinothkumarr227 <vinothkumarr@google.com>
4 files changed
tree: 9f8f1ff48d4b5927fbfc6f70c9e34e49873da2ed
  1. .gemini/
  2. .github/
  3. admin/
  4. attributes/
  5. authz/
  6. backoff/
  7. balancer/
  8. benchmark/
  9. binarylog/
  10. channelz/
  11. cmd/
  12. codes/
  13. connectivity/
  14. credentials/
  15. Documentation/
  16. encoding/
  17. examples/
  18. experimental/
  19. gcp/
  20. grpclog/
  21. health/
  22. internal/
  23. interop/
  24. keepalive/
  25. mem/
  26. metadata/
  27. orca/
  28. peer/
  29. profiling/
  30. reflection/
  31. resolver/
  32. scripts/
  33. security/
  34. serviceconfig/
  35. stats/
  36. status/
  37. tap/
  38. test/
  39. testdata/
  40. xds/
  41. AUTHORS
  42. backoff.go
  43. balancer_wrapper.go
  44. balancer_wrapper_test.go
  45. call.go
  46. clientconn.go
  47. clientconn_authority_test.go
  48. clientconn_parsed_target_test.go
  49. clientconn_test.go
  50. CODE-OF-CONDUCT.md
  51. codec.go
  52. codec_test.go
  53. CONTRIBUTING.md
  54. default_dial_option_server_option_test.go
  55. dial_test.go
  56. dialoptions.go
  57. doc.go
  58. go.mod
  59. go.sum
  60. GOVERNANCE.md
  61. grpc_test.go
  62. interceptor.go
  63. LICENSE
  64. MAINTAINERS.md
  65. Makefile
  66. NOTICE.txt
  67. picker_wrapper.go
  68. picker_wrapper_test.go
  69. preloader.go
  70. producer_ext_test.go
  71. README.md
  72. resolver_balancer_ext_test.go
  73. resolver_test.go
  74. resolver_wrapper.go
  75. rpc_util.go
  76. rpc_util_test.go
  77. SECURITY.md
  78. server.go
  79. server_ext_test.go
  80. server_test.go
  81. service_config.go
  82. service_config_test.go
  83. stream.go
  84. stream_interfaces.go
  85. stream_test.go
  86. trace.go
  87. trace_notrace.go
  88. trace_test.go
  89. trace_withtrace.go
  90. version.go
README.md

gRPC-Go


The Go implementation of gRPC: A high performance, open source, general RPC framework that puts mobile and HTTP/2 first. For more information see the Go gRPC docs, or jump directly into the quick start.

Prerequisites

Installation

Simply add the following import to your code, and then go [build|run|test] will automatically fetch the necessary dependencies:

import "google.golang.org/grpc"

Note: If you are trying to access grpc-go from China, see the FAQ below.

Learn more

FAQ

I/O Timeout Errors

The golang.org domain may be blocked in some countries. When this happens, go get usually produces an error like the following:

$ go get -u google.golang.org/grpc
package google.golang.org/grpc: unrecognized import path "google.golang.org/grpc" (https fetch: Get https://google.golang.org/grpc?go-get=1: dial tcp 216.239.37.1:443: i/o timeout)

To build Go code, there are several options:

  • Set up a VPN and access google.golang.org through that.

  • With Go module support: it is possible to use the replace feature of go mod to create aliases for golang.org packages. In your project's directory:

    go mod edit -replace=google.golang.org/grpc=github.com/grpc/grpc-go@latest
    go mod tidy
    go mod vendor
    go build -mod=vendor
    

    This needs to be done for all transitive dependencies hosted on golang.org as well. For details, refer to golang/go issue #28652.

Compiling error, undefined: grpc.SupportPackageIsVersion

Please update to the latest version of gRPC-Go using go get google.golang.org/grpc.

How to turn on logging

The default logger is controlled by environment variables. Turn everything on like this:

$ export GRPC_GO_LOG_VERBOSITY_LEVEL=99
$ export GRPC_GO_LOG_SEVERITY_LEVEL=info

The RPC failed with error "code = Unavailable desc = transport is closing"

This error means the connection the RPC is using was closed, and there are many possible reasons, including:

  1. Misconfigured transport credentials, causing the connection to fail during the handshake
  2. Bytes disrupted in transit, possibly by a proxy in between
  3. Server shutdown
  4. Keepalive parameters that caused the connection to shut down, for example if you have configured your server to terminate connections regularly to trigger DNS lookups. If this is the case, you may want to increase your MaxConnectionAgeGrace to allow longer RPC calls to finish.

It can be tricky to debug this because the error happens on the client side but the root cause of the connection being closed is on the server side. Turn on logging on both client and server, and see if there are any transport errors.