<html><body>
<style>
body, h1, h2, h3, div, span, p, pre, a {
margin: 0;
padding: 0;
border: 0;
font-weight: inherit;
font-style: inherit;
font-size: 100%;
font-family: inherit;
vertical-align: baseline;
}
body {
font-size: 13px;
padding: 1em;
}
h1 {
font-size: 26px;
margin-bottom: 1em;
}
h2 {
font-size: 24px;
margin-bottom: 1em;
}
h3 {
font-size: 20px;
margin-bottom: 1em;
margin-top: 1em;
}
pre, code {
line-height: 1.5;
font-family: Monaco, 'DejaVu Sans Mono', 'Bitstream Vera Sans Mono', 'Lucida Console', monospace;
}
pre {
margin-top: 0.5em;
}
h1, h2, h3, p {
font-family: Arial, sans-serif;
}
h1, h2, h3 {
border-bottom: solid #CCC 1px;
}
.toc_element {
margin-top: 0.5em;
}
.firstline {
margin-left: 2em;
}
.method {
margin-top: 1em;
border: solid 1px #CCC;
padding: 1em;
background: #EEE;
}
.details {
font-weight: bold;
font-size: 14px;
}
</style>
<h1><a href="dataflow_v1b3.html">Dataflow API</a> . <a href="dataflow_v1b3.projects.html">projects</a> . <a href="dataflow_v1b3.projects.locations.html">locations</a> . <a href="dataflow_v1b3.projects.locations.jobs.html">jobs</a></h1>
<h2>Instance Methods</h2>
<p class="toc_element">
<code><a href="dataflow_v1b3.projects.locations.jobs.debug.html">debug()</a></code>
</p>
<p class="firstline">Returns the debug Resource.</p>
<p class="toc_element">
<code><a href="dataflow_v1b3.projects.locations.jobs.messages.html">messages()</a></code>
</p>
<p class="firstline">Returns the messages Resource.</p>
<p class="toc_element">
<code><a href="dataflow_v1b3.projects.locations.jobs.snapshots.html">snapshots()</a></code>
</p>
<p class="firstline">Returns the snapshots Resource.</p>
<p class="toc_element">
<code><a href="dataflow_v1b3.projects.locations.jobs.workItems.html">workItems()</a></code>
</p>
<p class="firstline">Returns the workItems Resource.</p>
<p class="toc_element">
<code><a href="#create">create(projectId, location, body=None, view=None, replaceJobId=None, x__xgafv=None)</a></code></p>
<p class="firstline">Creates a Cloud Dataflow job.</p>
<p class="toc_element">
<code><a href="#get">get(projectId, location, jobId, view=None, x__xgafv=None)</a></code></p>
<p class="firstline">Gets the state of the specified Cloud Dataflow job.</p>
<p class="toc_element">
<code><a href="#getMetrics">getMetrics(projectId, location, jobId, startTime=None, x__xgafv=None)</a></code></p>
<p class="firstline">Request the job status.</p>
<p class="toc_element">
<code><a href="#list">list(projectId, location, pageToken=None, view=None, pageSize=None, filter=None, x__xgafv=None)</a></code></p>
<p class="firstline">List the jobs of a project.</p>
<p class="toc_element">
<code><a href="#list_next">list_next(previous_request, previous_response)</a></code></p>
<p class="firstline">Retrieves the next page of results.</p>
<p class="toc_element">
<code><a href="#snapshot">snapshot(projectId, location, jobId, body=None, x__xgafv=None)</a></code></p>
<p class="firstline">Snapshot the state of a streaming job.</p>
<p class="toc_element">
<code><a href="#update">update(projectId, location, jobId, body=None, x__xgafv=None)</a></code></p>
<p class="firstline">Updates the state of an existing Cloud Dataflow job.</p>
<h3>Method Details</h3>
<div class="method">
<code class="details" id="create">create(projectId, location, body=None, view=None, replaceJobId=None, x__xgafv=None)</code>
<pre>Creates a Cloud Dataflow job.
To create a job, we recommend using `projects.locations.jobs.create` with a
[regional endpoint]
(https://cloud.google.com/dataflow/docs/concepts/regional-endpoints). Using
`projects.jobs.create` is not recommended, as your job will always start
in `us-central1`.
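
A minimal sketch of assembling a regional `create` request, assuming the
`google-api-python-client` package is installed and default credentials are
configured. The helper names `build_create_kwargs` and `submit`, and the
project, region, and bucket values in the usage note, are illustrative
assumptions and not part of the API. A real job body also needs fields such
as `steps`, which are omitted here.

```python
def build_create_kwargs(project_id, location, temp_prefix, labels=None):
    """Assemble keyword arguments for projects.locations.jobs.create.

    Hypothetical helper: only fields documented for the request body are
    set, and the body is deliberately incomplete (no steps).
    """
    if not location:
        # The regional method requires an explicit location.
        raise ValueError("location (a regional endpoint) is required")
    body = {
        "projectId": project_id,
        "location": location,
        "labels": dict(labels or {}),
        "environment": {"tempStoragePrefix": temp_prefix},
    }
    return {"projectId": project_id, "location": location, "body": body}


def submit(kwargs):
    """Send the request. Needs network access and credentials."""
    # Imported lazily so the helper above works without the client library.
    from googleapiclient.discovery import build

    service = build("dataflow", "v1b3")
    jobs = service.projects().locations().jobs()
    return jobs.create(**kwargs).execute()
```

For example, `submit(build_create_kwargs("my-project", "us-west1",
"storage.googleapis.com/my-bucket/tmp"))` would target the `us-west1`
regional endpoint instead of relying on the `us-central1` default.
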
Args:
projectId: string, The ID of the Cloud Platform project that the job belongs to. (required)
location: string, The [regional endpoint]
(https://cloud.google.com/dataflow/docs/concepts/regional-endpoints) that
contains this job. (required)
body: object, The request body.
The object takes the form of:
{ # Defines a job to be run by the Cloud Dataflow service.
&quot;pipelineDescription&quot;: { # A descriptive representation of submitted pipeline as well as the executed
# form. This data is provided by the Dataflow service for ease of visualizing
# the pipeline and interpreting Dataflow provided metrics.
#
# Preliminary field: The format of this data may change at any time.
# A description of the user pipeline and stages through which it is executed.
# Created by Cloud Dataflow service. Only retrieved with
# JOB_VIEW_DESCRIPTION or JOB_VIEW_ALL.
&quot;displayData&quot;: [ # Pipeline level display data.
{ # Data provided with a pipeline or transform to provide descriptive info.
&quot;url&quot;: &quot;A String&quot;, # An optional full URL.
&quot;javaClassValue&quot;: &quot;A String&quot;, # Contains value if the data is of java class type.
&quot;timestampValue&quot;: &quot;A String&quot;, # Contains value if the data is of timestamp type.
&quot;durationValue&quot;: &quot;A String&quot;, # Contains value if the data is of duration type.
&quot;label&quot;: &quot;A String&quot;, # An optional label to display in a dax UI for the element.
&quot;key&quot;: &quot;A String&quot;, # The key identifying the display data.
# This is intended to be used as a label for the display data
# when viewed in a dax monitoring system.
&quot;namespace&quot;: &quot;A String&quot;, # The namespace for the key. This is usually a class name or programming
# language namespace (e.g. a Python module) which defines the display data.
# This allows a dax monitoring system to specially handle the data
# and perform custom rendering.
&quot;floatValue&quot;: 3.14, # Contains value if the data is of float type.
&quot;strValue&quot;: &quot;A String&quot;, # Contains value if the data is of string type.
&quot;int64Value&quot;: &quot;A String&quot;, # Contains value if the data is of int64 type.
&quot;boolValue&quot;: True or False, # Contains value if the data is of a boolean type.
&quot;shortStrValue&quot;: &quot;A String&quot;, # A possible additional shorter value to display.
# For example a java_class_name_value of com.mypackage.MyDoFn
# will be stored with MyDoFn as the short_str_value and
# com.mypackage.MyDoFn as the java_class_name value.
# short_str_value can be displayed and java_class_name_value
# will be displayed as a tooltip.
},
],
&quot;originalPipelineTransform&quot;: [ # Description of each transform in the pipeline and collections between them.
{ # Description of the type, names/ids, and input/outputs for a transform.
&quot;outputCollectionName&quot;: [ # User names for all collection outputs to this transform.
&quot;A String&quot;,
],
&quot;displayData&quot;: [ # Transform-specific display data.
{ # Data provided with a pipeline or transform to provide descriptive info.
&quot;url&quot;: &quot;A String&quot;, # An optional full URL.
&quot;javaClassValue&quot;: &quot;A String&quot;, # Contains value if the data is of java class type.
&quot;timestampValue&quot;: &quot;A String&quot;, # Contains value if the data is of timestamp type.
&quot;durationValue&quot;: &quot;A String&quot;, # Contains value if the data is of duration type.
&quot;label&quot;: &quot;A String&quot;, # An optional label to display in a dax UI for the element.
&quot;key&quot;: &quot;A String&quot;, # The key identifying the display data.
# This is intended to be used as a label for the display data
# when viewed in a dax monitoring system.
&quot;namespace&quot;: &quot;A String&quot;, # The namespace for the key. This is usually a class name or programming
# language namespace (e.g. a Python module) which defines the display data.
# This allows a dax monitoring system to specially handle the data
# and perform custom rendering.
&quot;floatValue&quot;: 3.14, # Contains value if the data is of float type.
&quot;strValue&quot;: &quot;A String&quot;, # Contains value if the data is of string type.
&quot;int64Value&quot;: &quot;A String&quot;, # Contains value if the data is of int64 type.
&quot;boolValue&quot;: True or False, # Contains value if the data is of a boolean type.
&quot;shortStrValue&quot;: &quot;A String&quot;, # A possible additional shorter value to display.
# For example a java_class_name_value of com.mypackage.MyDoFn
# will be stored with MyDoFn as the short_str_value and
# com.mypackage.MyDoFn as the java_class_name value.
# short_str_value can be displayed and java_class_name_value
# will be displayed as a tooltip.
},
],
&quot;id&quot;: &quot;A String&quot;, # SDK generated id of this transform instance.
&quot;inputCollectionName&quot;: [ # User names for all collection inputs to this transform.
&quot;A String&quot;,
],
&quot;name&quot;: &quot;A String&quot;, # User provided name for this transform instance.
&quot;kind&quot;: &quot;A String&quot;, # Type of transform.
},
],
&quot;executionPipelineStage&quot;: [ # Description of each stage of execution of the pipeline.
{ # Description of the composing transforms, names/ids, and input/outputs of a
# stage of execution. Some composing transforms and sources may have been
# generated by the Dataflow service during execution planning.
&quot;componentSource&quot;: [ # Collections produced and consumed by component transforms of this stage.
{ # Description of an interstitial value between transforms in an execution
# stage.
&quot;userName&quot;: &quot;A String&quot;, # Human-readable name for this transform; may be user or system generated.
&quot;name&quot;: &quot;A String&quot;, # Dataflow service generated name for this source.
&quot;originalTransformOrCollection&quot;: &quot;A String&quot;, # User name for the original user transform or collection with which this
# source is most closely associated.
},
],
&quot;inputSource&quot;: [ # Input sources for this stage.
{ # Description of an input or output of an execution stage.
&quot;userName&quot;: &quot;A String&quot;, # Human-readable name for this source; may be user or system generated.
&quot;originalTransformOrCollection&quot;: &quot;A String&quot;, # User name for the original user transform or collection with which this
# source is most closely associated.
&quot;sizeBytes&quot;: &quot;A String&quot;, # Size of the source, if measurable.
&quot;name&quot;: &quot;A String&quot;, # Dataflow service generated name for this source.
},
],
&quot;name&quot;: &quot;A String&quot;, # Dataflow service generated name for this stage.
&quot;componentTransform&quot;: [ # Transforms that comprise this execution stage.
{ # Description of a transform executed as part of an execution stage.
&quot;name&quot;: &quot;A String&quot;, # Dataflow service generated name for this source.
&quot;userName&quot;: &quot;A String&quot;, # Human-readable name for this transform; may be user or system generated.
&quot;originalTransform&quot;: &quot;A String&quot;, # User name for the original user transform with which this transform is
# most closely associated.
},
],
&quot;id&quot;: &quot;A String&quot;, # Dataflow service generated id for this stage.
&quot;outputSource&quot;: [ # Output sources for this stage.
{ # Description of an input or output of an execution stage.
&quot;userName&quot;: &quot;A String&quot;, # Human-readable name for this source; may be user or system generated.
&quot;originalTransformOrCollection&quot;: &quot;A String&quot;, # User name for the original user transform or collection with which this
# source is most closely associated.
&quot;sizeBytes&quot;: &quot;A String&quot;, # Size of the source, if measurable.
&quot;name&quot;: &quot;A String&quot;, # Dataflow service generated name for this source.
},
],
&quot;kind&quot;: &quot;A String&quot;, # Type of transform this stage is executing.
},
],
},
&quot;labels&quot;: { # User-defined labels for this job.
#
# The labels map can contain no more than 64 entries. Entries of the labels
# map are UTF8 strings that comply with the following restrictions:
#
# * Keys must conform to regexp: \p{Ll}\p{Lo}{0,62}
# * Values must conform to regexp: [\p{Ll}\p{Lo}\p{N}_-]{0,63}
# * Both keys and values are additionally constrained to be &lt;= 128 bytes in
# size.
&quot;a_key&quot;: &quot;A String&quot;,
},
&quot;projectId&quot;: &quot;A String&quot;, # The ID of the Cloud Platform project that the job belongs to.
&quot;environment&quot;: { # Describes the environment in which a Dataflow Job runs. # The environment for the job.
&quot;flexResourceSchedulingGoal&quot;: &quot;A String&quot;, # Which Flexible Resource Scheduling mode to run in.
&quot;workerRegion&quot;: &quot;A String&quot;, # The Compute Engine region
# (https://cloud.google.com/compute/docs/regions-zones/regions-zones) in
# which worker processing should occur, e.g. &quot;us-west1&quot;. Mutually exclusive
# with worker_zone. If neither worker_region nor worker_zone is specified,
# default to the control plane&#x27;s region.
&quot;userAgent&quot;: { # A description of the process that generated the request.
&quot;a_key&quot;: &quot;&quot;, # Properties of the object.
},
&quot;serviceAccountEmail&quot;: &quot;A String&quot;, # Identity to run virtual machines as. Defaults to the default account.
&quot;version&quot;: { # A structure describing which components and their versions of the service
# are required in order to run the job.
&quot;a_key&quot;: &quot;&quot;, # Properties of the object.
},
&quot;serviceKmsKeyName&quot;: &quot;A String&quot;, # If set, contains the Cloud KMS key identifier used to encrypt data
# at rest, AKA a Customer Managed Encryption Key (CMEK).
#
# Format:
# projects/PROJECT_ID/locations/LOCATION/keyRings/KEY_RING/cryptoKeys/KEY
&quot;experiments&quot;: [ # The list of experiments to enable.
&quot;A String&quot;,
],
&quot;workerZone&quot;: &quot;A String&quot;, # The Compute Engine zone
# (https://cloud.google.com/compute/docs/regions-zones/regions-zones) in
# which worker processing should occur, e.g. &quot;us-west1-a&quot;. Mutually exclusive
# with worker_region. If neither worker_region nor worker_zone is specified,
# a zone in the control plane&#x27;s region is chosen based on available capacity.
&quot;workerPools&quot;: [ # The worker pools. At least one &quot;harness&quot; worker pool must be
# specified in order for the job to have workers.
{ # Describes one particular pool of Cloud Dataflow workers to be
# instantiated by the Cloud Dataflow service in order to perform the
# computations required by a job. Note that a workflow job may use
# multiple pools, in order to match the various computational
# requirements of the various stages of the job.
&quot;onHostMaintenance&quot;: &quot;A String&quot;, # The action to take on host maintenance, as defined by the Google
# Compute Engine API.
&quot;sdkHarnessContainerImages&quot;: [ # Set of SDK harness containers needed to execute this pipeline. This will
# only be set in the Fn API path. For non-cross-language pipelines this
# should have only one entry. Cross-language pipelines will have two or more
# entries.
{ # Defines a SDK harness container for executing Dataflow pipelines.
&quot;containerImage&quot;: &quot;A String&quot;, # A docker container image that resides in Google Container Registry.
&quot;useSingleCorePerContainer&quot;: True or False, # If true, recommends that the Dataflow service use only one core per SDK
# container instance with this image. If false (or unset) recommends using
# more than one core per SDK container instance with this image for
# efficiency. Note that Dataflow service may choose to override this property
# if needed.
},
],
&quot;zone&quot;: &quot;A String&quot;, # Zone to run the worker pools in. If empty or unspecified, the service
# will attempt to choose a reasonable default.
&quot;kind&quot;: &quot;A String&quot;, # The kind of the worker pool; currently only `harness` and `shuffle`
# are supported.
&quot;metadata&quot;: { # Metadata to set on the Google Compute Engine VMs.
&quot;a_key&quot;: &quot;A String&quot;,
},
&quot;diskSourceImage&quot;: &quot;A String&quot;, # Fully qualified source image for disks.
&quot;dataDisks&quot;: [ # Data disks that are used by a VM in this workflow.
{ # Describes the data disk used by a workflow job.
&quot;sizeGb&quot;: 42, # Size of disk in GB. If zero or unspecified, the service will
# attempt to choose a reasonable default.
&quot;diskType&quot;: &quot;A String&quot;, # Disk storage type, as defined by Google Compute Engine. This
# must be a disk type appropriate to the project and zone in which
# the workers will run. If unknown or unspecified, the service
# will attempt to choose a reasonable default.
#
# For example, the standard persistent disk type is a resource name
# typically ending in &quot;pd-standard&quot;. If SSD persistent disks are
# available, the resource name typically ends with &quot;pd-ssd&quot;. The
# actual valid values are defined by the Google Compute Engine API,
# not by the Cloud Dataflow API; consult the Google Compute Engine
# documentation for more information about determining the set of
# available disk types for a particular project and zone.
#
# Google Compute Engine Disk types are local to a particular
# project in a particular zone, and so the resource name will
# typically look something like this:
#
# compute.googleapis.com/projects/project-id/zones/zone/diskTypes/pd-standard
&quot;mountPoint&quot;: &quot;A String&quot;, # Directory in a VM where disk is mounted.
},
],
&quot;packages&quot;: [ # Packages to be installed on workers.
{ # The packages that must be installed in order for a worker to run the
# steps of the Cloud Dataflow job that will be assigned to its worker
# pool.
#
# This is the mechanism by which the Cloud Dataflow SDK causes code to
# be loaded onto the workers. For example, the Cloud Dataflow Java SDK
# might use this to install jars containing the user&#x27;s code and all of the
# various dependencies (libraries, data files, etc.) required in order
# for that code to run.
&quot;name&quot;: &quot;A String&quot;, # The name of the package.
&quot;location&quot;: &quot;A String&quot;, # The resource to read the package from. The supported resource type is:
#
# Google Cloud Storage:
#
# storage.googleapis.com/{bucket}
# bucket.storage.googleapis.com/
},
],
&quot;teardownPolicy&quot;: &quot;A String&quot;, # Sets the policy for determining when to tear down the worker pool.
# Allowed values are: `TEARDOWN_ALWAYS`, `TEARDOWN_ON_SUCCESS`, and
# `TEARDOWN_NEVER`.
# `TEARDOWN_ALWAYS` means workers are always torn down regardless of whether
# the job succeeds. `TEARDOWN_ON_SUCCESS` means workers are torn down
# if the job succeeds. `TEARDOWN_NEVER` means the workers are never torn
# down.
#
# If the workers are not torn down by the service, they will
# continue to run and use Google Compute Engine VM resources in the
# user&#x27;s project until they are explicitly terminated by the user.
# Because of this, Google recommends using the `TEARDOWN_ALWAYS`
# policy except for small, manually supervised test jobs.
#
# If unknown or unspecified, the service will attempt to choose a reasonable
# default.
&quot;network&quot;: &quot;A String&quot;, # Network to which VMs will be assigned. If empty or unspecified,
# the service will use the network &quot;default&quot;.
&quot;ipConfiguration&quot;: &quot;A String&quot;, # Configuration for VM IPs.
&quot;diskSizeGb&quot;: 42, # Size of root disk for VMs, in GB. If zero or unspecified, the service will
# attempt to choose a reasonable default.
&quot;autoscalingSettings&quot;: { # Settings for WorkerPool autoscaling. # Settings for autoscaling of this WorkerPool.
&quot;maxNumWorkers&quot;: 42, # The maximum number of workers to cap scaling at.
&quot;algorithm&quot;: &quot;A String&quot;, # The algorithm to use for autoscaling.
},
&quot;poolArgs&quot;: { # Extra arguments for this worker pool.
&quot;a_key&quot;: &quot;&quot;, # Properties of the object. Contains field @type with type URL.
},
&quot;subnetwork&quot;: &quot;A String&quot;, # Subnetwork to which VMs will be assigned, if desired. Expected to be of
# the form &quot;regions/REGION/subnetworks/SUBNETWORK&quot;.
&quot;numWorkers&quot;: 42, # Number of Google Compute Engine workers in this pool needed to
# execute the job. If zero or unspecified, the service will
# attempt to choose a reasonable default.
&quot;numThreadsPerWorker&quot;: 42, # The number of threads per worker harness. If empty or unspecified, the
# service will choose a number of threads (according to the number of cores
# on the selected machine type for batch, or 1 by convention for streaming).
&quot;workerHarnessContainerImage&quot;: &quot;A String&quot;, # Required. Docker container image that executes the Cloud Dataflow worker
# harness, residing in Google Container Registry.
#
# Deprecated for the Fn API path. Use sdk_harness_container_images instead.
&quot;taskrunnerSettings&quot;: { # Taskrunner configuration settings. # Settings passed through to Google Compute Engine workers when
# using the standard Dataflow task runner. Users should ignore
# this field.
&quot;dataflowApiVersion&quot;: &quot;A String&quot;, # The API version of endpoint, e.g. &quot;v1b3&quot;
&quot;oauthScopes&quot;: [ # The OAuth2 scopes to be requested by the taskrunner in order to
# access the Cloud Dataflow API.
&quot;A String&quot;,
],
&quot;baseUrl&quot;: &quot;A String&quot;, # The base URL for the taskrunner to use when accessing Google Cloud APIs.
#
# When workers access Google Cloud APIs, they logically do so via
# relative URLs. If this field is specified, it supplies the base
# URL to use for resolving these relative URLs. The normative
# algorithm used is defined by RFC 1808, &quot;Relative Uniform Resource
# Locators&quot;.
#
# If not specified, the default value is &quot;http://www.googleapis.com/&quot;
&quot;workflowFileName&quot;: &quot;A String&quot;, # The file to store the workflow in.
&quot;logToSerialconsole&quot;: True or False, # Whether to send taskrunner log info to Google Compute Engine VM serial
# console.
&quot;baseTaskDir&quot;: &quot;A String&quot;, # The location on the worker for task-specific subdirectories.
&quot;taskUser&quot;: &quot;A String&quot;, # The UNIX user ID on the worker VM to use for tasks launched by
# taskrunner; e.g. &quot;root&quot;.
&quot;vmId&quot;: &quot;A String&quot;, # The ID string of the VM.
&quot;alsologtostderr&quot;: True or False, # Whether to also send taskrunner log info to stderr.
&quot;parallelWorkerSettings&quot;: { # Provides data to pass through to the worker harness. # The settings to pass to the parallel worker harness.
&quot;shuffleServicePath&quot;: &quot;A String&quot;, # The Shuffle service path relative to the root URL, for example,
# &quot;shuffle/v1beta1&quot;.
&quot;tempStoragePrefix&quot;: &quot;A String&quot;, # The prefix of the resources the system should use for temporary
# storage.
#
# The supported resource type is:
#
# Google Cloud Storage:
#
# storage.googleapis.com/{bucket}/{object}
# bucket.storage.googleapis.com/{object}
&quot;reportingEnabled&quot;: True or False, # Whether to send work progress updates to the service.
&quot;servicePath&quot;: &quot;A String&quot;, # The Cloud Dataflow service path relative to the root URL, for example,
# &quot;dataflow/v1b3/projects&quot;.
&quot;baseUrl&quot;: &quot;A String&quot;, # The base URL for accessing Google Cloud APIs.
#
# When workers access Google Cloud APIs, they logically do so via
# relative URLs. If this field is specified, it supplies the base
# URL to use for resolving these relative URLs. The normative
# algorithm used is defined by RFC 1808, &quot;Relative Uniform Resource
# Locators&quot;.
#
# If not specified, the default value is &quot;http://www.googleapis.com/&quot;
&quot;workerId&quot;: &quot;A String&quot;, # The ID of the worker running this pipeline.
},
&quot;harnessCommand&quot;: &quot;A String&quot;, # The command to launch the worker harness.
&quot;logDir&quot;: &quot;A String&quot;, # The directory on the VM to store logs.
&quot;streamingWorkerMainClass&quot;: &quot;A String&quot;, # The streaming worker main class name.
&quot;languageHint&quot;: &quot;A String&quot;, # The suggested backend language.
&quot;taskGroup&quot;: &quot;A String&quot;, # The UNIX group ID on the worker VM to use for tasks launched by
# taskrunner; e.g. &quot;wheel&quot;.
&quot;logUploadLocation&quot;: &quot;A String&quot;, # Indicates where to put logs. If this is not specified, the logs
# will not be uploaded.
#
# The supported resource type is:
#
# Google Cloud Storage:
# storage.googleapis.com/{bucket}/{object}
# bucket.storage.googleapis.com/{object}
&quot;commandlinesFileName&quot;: &quot;A String&quot;, # The file to store preprocessing commands in.
&quot;continueOnException&quot;: True or False, # Whether to continue taskrunner if an exception is hit.
&quot;tempStoragePrefix&quot;: &quot;A String&quot;, # The prefix of the resources the taskrunner should use for
# temporary storage.
#
# The supported resource type is:
#
# Google Cloud Storage:
# storage.googleapis.com/{bucket}/{object}
# bucket.storage.googleapis.com/{object}
},
&quot;diskType&quot;: &quot;A String&quot;, # Type of root disk for VMs. If empty or unspecified, the service will
# attempt to choose a reasonable default.
&quot;defaultPackageSet&quot;: &quot;A String&quot;, # The default package set to install. This allows the service to
# select a default set of packages which are useful to worker
# harnesses written in a particular language.
&quot;machineType&quot;: &quot;A String&quot;, # Machine type (e.g. &quot;n1-standard-1&quot;). If empty or unspecified, the
# service will attempt to choose a reasonable default.
},
],
&quot;tempStoragePrefix&quot;: &quot;A String&quot;, # The prefix of the resources the system should use for temporary
# storage. The system will append the suffix &quot;/temp-{JOBNAME}&quot; to
# this resource prefix, where {JOBNAME} is the value of the
# job_name field. The resulting bucket and object prefix is used
# as the prefix of the resources used to store temporary data
# needed during the job execution. NOTE: This will override the
# value in taskrunner_settings.
# The supported resource type is:
#
# Google Cloud Storage:
#
# storage.googleapis.com/{bucket}/{object}
# bucket.storage.googleapis.com/{object}
&quot;internalExperiments&quot;: { # Experimental settings.
&quot;a_key&quot;: &quot;&quot;, # Properties of the object. Contains field @type with type URL.
},
&quot;sdkPipelineOptions&quot;: { # The Cloud Dataflow SDK pipeline options specified by the user. These
# options are passed through the service and are used to recreate the
# SDK pipeline options on the worker in a language agnostic and platform
# independent way.
&quot;a_key&quot;: &quot;&quot;, # Properties of the object.
},
&quot;dataset&quot;: &quot;A String&quot;, # The dataset for the current project where various workflow
# related tables are stored.
#
# The supported resource type is:
#
# Google BigQuery:
# bigquery.googleapis.com/{dataset}
&quot;clusterManagerApiService&quot;: &quot;A String&quot;, # The type of cluster manager API to use. If unknown or
# unspecified, the service will attempt to choose a reasonable
# default. This should be in the form of the API service name,
# e.g. &quot;compute.googleapis.com&quot;.
},
&quot;stepsLocation&quot;: &quot;A String&quot;, # The GCS location where the steps are stored.
&quot;steps&quot;: [ # Exactly one of steps or steps_location should be specified.
#
# The top-level steps that constitute the entire job.
{ # Defines a particular step within a Cloud Dataflow job.
#
# A job consists of multiple steps, each of which performs some
# specific operation as part of the overall job. Data is typically
# passed from one step to another as part of the job.
#
# Here&#x27;s an example of a sequence of steps which together implement a
# Map-Reduce job:
#
# * Read a collection of data from some source, parsing the
# collection&#x27;s elements.
#
# * Validate the elements.
#
# * Apply a user-defined function to map each element to some value
# and extract an element-specific key value.
#
# * Group elements with the same key into a single element with
# that key, transforming a multiply-keyed collection into a
# uniquely-keyed collection.
#
# * Write the elements out to some data sink.
#
# Note that the Cloud Dataflow service may be used to run many different
# types of jobs, not just Map-Reduce.
&quot;kind&quot;: &quot;A String&quot;, # The kind of step in the Cloud Dataflow job.
&quot;properties&quot;: { # Named properties associated with the step. Each kind of
# predefined step has its own required set of properties.
# Must be provided on Create. Only retrieved with JOB_VIEW_ALL.
&quot;a_key&quot;: &quot;&quot;, # Properties of the object.
},
&quot;name&quot;: &quot;A String&quot;, # The name that identifies the step. This must be unique for each
# step with respect to all other steps in the Cloud Dataflow job.
},
],
&quot;stageStates&quot;: [ # This field may be mutated by the Cloud Dataflow service;
# callers cannot mutate it.
{ # A message describing the state of a particular execution stage.
&quot;executionStageState&quot;: &quot;A String&quot;, # Execution stage states allow the same set of values as JobState.
&quot;executionStageName&quot;: &quot;A String&quot;, # The name of the execution stage.
&quot;currentStateTime&quot;: &quot;A String&quot;, # The time at which the stage transitioned to this state.
},
],
&quot;replacedByJobId&quot;: &quot;A String&quot;, # If another job is an update of this job (and thus, this job is in
# `JOB_STATE_UPDATED`), this field contains the ID of that job.
&quot;jobMetadata&quot;: { # Metadata available primarily for filtering jobs. Will be included in the
# ListJob response and Job SUMMARY view.
#
# This field is populated by the Dataflow service to support filtering jobs
# by the metadata values provided here. Populated for ListJobs and all GetJob
# views SUMMARY and higher.
&quot;sdkVersion&quot;: { # The version of the SDK used to run the job. # The SDK version used to run the job.
&quot;sdkSupportStatus&quot;: &quot;A String&quot;, # The support status for this SDK version.
&quot;versionDisplayName&quot;: &quot;A String&quot;, # A readable string describing the version of the SDK.
&quot;version&quot;: &quot;A String&quot;, # The version of the SDK used to run the job.
},
&quot;bigTableDetails&quot;: [ # Identification of a BigTable source used in the Dataflow job.
{ # Metadata for a BigTable connector used by the job.
&quot;instanceId&quot;: &quot;A String&quot;, # InstanceId accessed in the connection.
&quot;tableId&quot;: &quot;A String&quot;, # TableId accessed in the connection.
&quot;projectId&quot;: &quot;A String&quot;, # ProjectId accessed in the connection.
},
],
&quot;pubsubDetails&quot;: [ # Identification of a PubSub source used in the Dataflow job.
{ # Metadata for a PubSub connector used by the job.
&quot;subscription&quot;: &quot;A String&quot;, # Subscription used in the connection.
&quot;topic&quot;: &quot;A String&quot;, # Topic accessed in the connection.
},
],
&quot;bigqueryDetails&quot;: [ # Identification of a BigQuery source used in the Dataflow job.
{ # Metadata for a BigQuery connector used by the job.
&quot;dataset&quot;: &quot;A String&quot;, # Dataset accessed in the connection.
&quot;projectId&quot;: &quot;A String&quot;, # Project accessed in the connection.
&quot;query&quot;: &quot;A String&quot;, # Query used to access data in the connection.
&quot;table&quot;: &quot;A String&quot;, # Table accessed in the connection.
},
],
&quot;fileDetails&quot;: [ # Identification of a File source used in the Dataflow job.
{ # Metadata for a File connector used by the job.
&quot;filePattern&quot;: &quot;A String&quot;, # File Pattern used to access files by the connector.
},
],
&quot;datastoreDetails&quot;: [ # Identification of a Datastore source used in the Dataflow job.
{ # Metadata for a Datastore connector used by the job.
&quot;namespace&quot;: &quot;A String&quot;, # Namespace used in the connection.
&quot;projectId&quot;: &quot;A String&quot;, # ProjectId accessed in the connection.
},
],
&quot;spannerDetails&quot;: [ # Identification of a Spanner source used in the Dataflow job.
{ # Metadata for a Spanner connector used by the job.
&quot;instanceId&quot;: &quot;A String&quot;, # InstanceId accessed in the connection.
&quot;databaseId&quot;: &quot;A String&quot;, # DatabaseId accessed in the connection.
&quot;projectId&quot;: &quot;A String&quot;, # ProjectId accessed in the connection.
},
],
},
&quot;location&quot;: &quot;A String&quot;, # The [regional endpoint]
# (https://cloud.google.com/dataflow/docs/concepts/regional-endpoints) that
# contains this job.
&quot;transformNameMapping&quot;: { # The map of transform name prefixes of the job to be replaced to the
# corresponding name prefixes of the new job.
&quot;a_key&quot;: &quot;A String&quot;,
},
&quot;startTime&quot;: &quot;A String&quot;, # The timestamp when the job was started (transitioned to JOB_STATE_PENDING).
# Flexible resource scheduling jobs are started with some delay after job
# creation, so start_time is unset before start and is updated when the
# job is started by the Cloud Dataflow service. For other jobs, start_time
# always equals create_time and is immutable and set by the Cloud Dataflow
# service.
&quot;clientRequestId&quot;: &quot;A String&quot;, # The client&#x27;s unique identifier of the job, re-used across retried attempts.
# If this field is set, the service will ensure its uniqueness.
# The request to create a job will fail if the service has knowledge of a
# previously submitted job with the same client&#x27;s ID and job name.
# The caller may use this field to ensure idempotence of job
# creation across retried attempts to create a job.
# By default, the field is empty and, in that case, the service ignores it.
&quot;executionInfo&quot;: { # Deprecated. Additional information about how a Cloud Dataflow job will be executed that
# isn&#x27;t contained in the submitted job.
&quot;stages&quot;: { # A mapping from each stage to the information about that stage.
&quot;a_key&quot;: { # Contains information about how a particular
# google.dataflow.v1beta3.Step will be executed.
&quot;stepName&quot;: [ # The steps associated with the execution stage.
# Note that stages may have several steps, and that a given step
# might be run by more than one stage.
&quot;A String&quot;,
],
},
},
},
&quot;type&quot;: &quot;A String&quot;, # The type of Cloud Dataflow job.
&quot;createTime&quot;: &quot;A String&quot;, # The timestamp when the job was initially created. Immutable and set by the
# Cloud Dataflow service.
&quot;tempFiles&quot;: [ # A set of files the system should be aware of that are used
# for temporary storage. These temporary files will be
# removed on job completion.
# No duplicates are allowed.
# No file patterns are supported.
#
# The supported files are:
#
# Google Cloud Storage:
#
# storage.googleapis.com/{bucket}/{object}
# bucket.storage.googleapis.com/{object}
&quot;A String&quot;,
],
&quot;id&quot;: &quot;A String&quot;, # The unique ID of this job.
#
# This field is set by the Cloud Dataflow service when the Job is
# created, and is immutable for the life of the job.
&quot;requestedState&quot;: &quot;A String&quot;, # The job&#x27;s requested state.
#
# `UpdateJob` may be used to switch between the `JOB_STATE_STOPPED` and
# `JOB_STATE_RUNNING` states, by setting requested_state. `UpdateJob` may
# also be used to directly set a job&#x27;s requested state to
# `JOB_STATE_CANCELLED` or `JOB_STATE_DONE`, irrevocably terminating the
# job if it has not already reached a terminal state.
&quot;replaceJobId&quot;: &quot;A String&quot;, # If this job is an update of an existing job, this field is the job ID
# of the job it replaced.
#
# When sending a `CreateJobRequest`, you can update a job by specifying it
# here. The job named here is stopped, and its intermediate state is
# transferred to this job.
&quot;createdFromSnapshotId&quot;: &quot;A String&quot;, # If this is specified, the job&#x27;s initial state is populated from the given
# snapshot.
&quot;currentState&quot;: &quot;A String&quot;, # The current state of the job.
#
# Jobs are created in the `JOB_STATE_STOPPED` state unless otherwise
# specified.
#
# A job in the `JOB_STATE_RUNNING` state may asynchronously enter a
# terminal state. After a job has reached a terminal state, no
# further state updates may be made.
#
# This field may be mutated by the Cloud Dataflow service;
# callers cannot mutate it.
&quot;name&quot;: &quot;A String&quot;, # The user-specified Cloud Dataflow job name.
#
# Only one Job with a given name may exist in a project at any
# given time. If a caller attempts to create a Job with the same
# name as an already-existing Job, the attempt returns the
# existing Job.
#
# The name must match the regular expression
# `[a-z]([-a-z0-9]{0,38}[a-z0-9])?`
&quot;currentStateTime&quot;: &quot;A String&quot;, # The timestamp associated with the current state.
}
view: string, The level of information requested in response.
replaceJobId: string, Deprecated. This field is now in the Job message.
x__xgafv: string, V1 error format.
Allowed values
1 - v1 error format
2 - v2 error format
Returns:
An object of the form:
{ # Defines a job to be run by the Cloud Dataflow service.
&quot;pipelineDescription&quot;: { # Preliminary field: The format of this data may change at any time.
# A description of the user pipeline and stages through which it is executed.
# Created by Cloud Dataflow service. Only retrieved with
# JOB_VIEW_DESCRIPTION or JOB_VIEW_ALL.
#
# A descriptive representation of the submitted pipeline as well as its executed
# form. This data is provided by the Dataflow service for ease of visualizing
# the pipeline and interpreting Dataflow provided metrics.
&quot;displayData&quot;: [ # Pipeline level display data.
{ # Data provided with a pipeline or transform to provide descriptive info.
&quot;url&quot;: &quot;A String&quot;, # An optional full URL.
&quot;javaClassValue&quot;: &quot;A String&quot;, # Contains value if the data is of java class type.
&quot;timestampValue&quot;: &quot;A String&quot;, # Contains value if the data is of timestamp type.
&quot;durationValue&quot;: &quot;A String&quot;, # Contains value if the data is of duration type.
&quot;label&quot;: &quot;A String&quot;, # An optional label to display in a dax UI for the element.
&quot;key&quot;: &quot;A String&quot;, # The key identifying the display data.
# This is intended to be used as a label for the display data
# when viewed in a dax monitoring system.
&quot;namespace&quot;: &quot;A String&quot;, # The namespace for the key. This is usually a class name or programming
# language namespace (e.g. a Python module) which defines the display data.
# This allows a dax monitoring system to specially handle the data
# and perform custom rendering.
&quot;floatValue&quot;: 3.14, # Contains value if the data is of float type.
&quot;strValue&quot;: &quot;A String&quot;, # Contains value if the data is of string type.
&quot;int64Value&quot;: &quot;A String&quot;, # Contains value if the data is of int64 type.
&quot;boolValue&quot;: True or False, # Contains value if the data is of a boolean type.
&quot;shortStrValue&quot;: &quot;A String&quot;, # A possible additional shorter value to display.
# For example a java_class_name_value of com.mypackage.MyDoFn
# will be stored with MyDoFn as the short_str_value and
# com.mypackage.MyDoFn as the java_class_name value.
# short_str_value can be displayed and java_class_name_value
# will be displayed as a tooltip.
},
],
&quot;originalPipelineTransform&quot;: [ # Description of each transform in the pipeline and collections between them.
{ # Description of the type, names/ids, and input/outputs for a transform.
&quot;outputCollectionName&quot;: [ # User names for all collection outputs to this transform.
&quot;A String&quot;,
],
&quot;displayData&quot;: [ # Transform-specific display data.
{ # Data provided with a pipeline or transform to provide descriptive info.
&quot;url&quot;: &quot;A String&quot;, # An optional full URL.
&quot;javaClassValue&quot;: &quot;A String&quot;, # Contains value if the data is of java class type.
&quot;timestampValue&quot;: &quot;A String&quot;, # Contains value if the data is of timestamp type.
&quot;durationValue&quot;: &quot;A String&quot;, # Contains value if the data is of duration type.
&quot;label&quot;: &quot;A String&quot;, # An optional label to display in a dax UI for the element.
&quot;key&quot;: &quot;A String&quot;, # The key identifying the display data.
# This is intended to be used as a label for the display data
# when viewed in a dax monitoring system.
&quot;namespace&quot;: &quot;A String&quot;, # The namespace for the key. This is usually a class name or programming
# language namespace (e.g. a Python module) which defines the display data.
# This allows a dax monitoring system to specially handle the data
# and perform custom rendering.
&quot;floatValue&quot;: 3.14, # Contains value if the data is of float type.
&quot;strValue&quot;: &quot;A String&quot;, # Contains value if the data is of string type.
&quot;int64Value&quot;: &quot;A String&quot;, # Contains value if the data is of int64 type.
&quot;boolValue&quot;: True or False, # Contains value if the data is of a boolean type.
&quot;shortStrValue&quot;: &quot;A String&quot;, # A possible additional shorter value to display.
# For example a java_class_name_value of com.mypackage.MyDoFn
# will be stored with MyDoFn as the short_str_value and
# com.mypackage.MyDoFn as the java_class_name value.
# short_str_value can be displayed and java_class_name_value
# will be displayed as a tooltip.
},
],
&quot;id&quot;: &quot;A String&quot;, # SDK generated id of this transform instance.
&quot;inputCollectionName&quot;: [ # User names for all collection inputs to this transform.
&quot;A String&quot;,
],
&quot;name&quot;: &quot;A String&quot;, # User provided name for this transform instance.
&quot;kind&quot;: &quot;A String&quot;, # Type of transform.
},
],
&quot;executionPipelineStage&quot;: [ # Description of each stage of execution of the pipeline.
{ # Description of the composing transforms, names/ids, and input/outputs of a
# stage of execution. Some composing transforms and sources may have been
# generated by the Dataflow service during execution planning.
&quot;componentSource&quot;: [ # Collections produced and consumed by component transforms of this stage.
{ # Description of an interstitial value between transforms in an execution
# stage.
&quot;userName&quot;: &quot;A String&quot;, # Human-readable name for this transform; may be user or system generated.
&quot;name&quot;: &quot;A String&quot;, # Dataflow service generated name for this source.
&quot;originalTransformOrCollection&quot;: &quot;A String&quot;, # User name for the original user transform or collection with which this
# source is most closely associated.
},
],
&quot;inputSource&quot;: [ # Input sources for this stage.
{ # Description of an input or output of an execution stage.
&quot;userName&quot;: &quot;A String&quot;, # Human-readable name for this source; may be user or system generated.
&quot;originalTransformOrCollection&quot;: &quot;A String&quot;, # User name for the original user transform or collection with which this
# source is most closely associated.
&quot;sizeBytes&quot;: &quot;A String&quot;, # Size of the source, if measurable.
&quot;name&quot;: &quot;A String&quot;, # Dataflow service generated name for this source.
},
],
&quot;name&quot;: &quot;A String&quot;, # Dataflow service generated name for this stage.
&quot;componentTransform&quot;: [ # Transforms that comprise this execution stage.
{ # Description of a transform executed as part of an execution stage.
&quot;name&quot;: &quot;A String&quot;, # Dataflow service generated name for this source.
&quot;userName&quot;: &quot;A String&quot;, # Human-readable name for this transform; may be user or system generated.
&quot;originalTransform&quot;: &quot;A String&quot;, # User name for the original user transform with which this transform is
# most closely associated.
},
],
&quot;id&quot;: &quot;A String&quot;, # Dataflow service generated id for this stage.
&quot;outputSource&quot;: [ # Output sources for this stage.
{ # Description of an input or output of an execution stage.
&quot;userName&quot;: &quot;A String&quot;, # Human-readable name for this source; may be user or system generated.
&quot;originalTransformOrCollection&quot;: &quot;A String&quot;, # User name for the original user transform or collection with which this
# source is most closely associated.
&quot;sizeBytes&quot;: &quot;A String&quot;, # Size of the source, if measurable.
&quot;name&quot;: &quot;A String&quot;, # Dataflow service generated name for this source.
},
],
&quot;kind&quot;: &quot;A String&quot;, # Type of transform this stage is executing.
},
],
},
&quot;labels&quot;: { # User-defined labels for this job.
#
# The labels map can contain no more than 64 entries. Entries of the labels
# map are UTF8 strings that comply with the following restrictions:
#
# * Keys must conform to regexp: \p{Ll}\p{Lo}{0,62}
# * Values must conform to regexp: [\p{Ll}\p{Lo}\p{N}_-]{0,63}
# * Both keys and values are additionally constrained to be &lt;= 128 bytes in
# size.
&quot;a_key&quot;: &quot;A String&quot;,
},
&quot;projectId&quot;: &quot;A String&quot;, # The ID of the Cloud Platform project that the job belongs to.
&quot;environment&quot;: { # Describes the environment in which a Dataflow Job runs. # The environment for the job.
&quot;flexResourceSchedulingGoal&quot;: &quot;A String&quot;, # Which Flexible Resource Scheduling mode to run in.
&quot;workerRegion&quot;: &quot;A String&quot;, # The Compute Engine region
# (https://cloud.google.com/compute/docs/regions-zones/regions-zones) in
# which worker processing should occur, e.g. &quot;us-west1&quot;. Mutually exclusive
# with worker_zone. If neither worker_region nor worker_zone is specified,
# default to the control plane&#x27;s region.
&quot;userAgent&quot;: { # A description of the process that generated the request.
&quot;a_key&quot;: &quot;&quot;, # Properties of the object.
},
&quot;serviceAccountEmail&quot;: &quot;A String&quot;, # Identity to run virtual machines as. Defaults to the default account.
&quot;version&quot;: { # A structure describing which components and their versions of the service
# are required in order to run the job.
&quot;a_key&quot;: &quot;&quot;, # Properties of the object.
},
&quot;serviceKmsKeyName&quot;: &quot;A String&quot;, # If set, contains the Cloud KMS key identifier used to encrypt data
# at rest, AKA a Customer Managed Encryption Key (CMEK).
#
# Format:
# projects/PROJECT_ID/locations/LOCATION/keyRings/KEY_RING/cryptoKeys/KEY
&quot;experiments&quot;: [ # The list of experiments to enable.
&quot;A String&quot;,
],
&quot;workerZone&quot;: &quot;A String&quot;, # The Compute Engine zone
# (https://cloud.google.com/compute/docs/regions-zones/regions-zones) in
# which worker processing should occur, e.g. &quot;us-west1-a&quot;. Mutually exclusive
# with worker_region. If neither worker_region nor worker_zone is specified,
# a zone in the control plane&#x27;s region is chosen based on available capacity.
&quot;workerPools&quot;: [ # The worker pools. At least one &quot;harness&quot; worker pool must be
# specified in order for the job to have workers.
{ # Describes one particular pool of Cloud Dataflow workers to be
# instantiated by the Cloud Dataflow service in order to perform the
# computations required by a job. Note that a workflow job may use
# multiple pools, in order to match the various computational
# requirements of the various stages of the job.
&quot;onHostMaintenance&quot;: &quot;A String&quot;, # The action to take on host maintenance, as defined by the Google
# Compute Engine API.
&quot;sdkHarnessContainerImages&quot;: [ # Set of SDK harness containers needed to execute this pipeline. This will
# only be set in the Fn API path. For non-cross-language pipelines this
# should have only one entry. Cross-language pipelines will have two or more
# entries.
{ # Defines an SDK harness container for executing Dataflow pipelines.
&quot;containerImage&quot;: &quot;A String&quot;, # A docker container image that resides in Google Container Registry.
&quot;useSingleCorePerContainer&quot;: True or False, # If true, recommends that the Dataflow service use only one core per SDK
# container instance with this image. If false (or unset), recommends using
# more than one core per SDK container instance with this image for
# efficiency. Note that the Dataflow service may choose to override this property
# if needed.
},
],
&quot;zone&quot;: &quot;A String&quot;, # Zone to run the worker pools in. If empty or unspecified, the service
# will attempt to choose a reasonable default.
&quot;kind&quot;: &quot;A String&quot;, # The kind of the worker pool; currently only `harness` and `shuffle`
# are supported.
&quot;metadata&quot;: { # Metadata to set on the Google Compute Engine VMs.
&quot;a_key&quot;: &quot;A String&quot;,
},
&quot;diskSourceImage&quot;: &quot;A String&quot;, # Fully qualified source image for disks.
&quot;dataDisks&quot;: [ # Data disks that are used by a VM in this workflow.
{ # Describes the data disk used by a workflow job.
&quot;sizeGb&quot;: 42, # Size of disk in GB. If zero or unspecified, the service will
# attempt to choose a reasonable default.
&quot;diskType&quot;: &quot;A String&quot;, # Disk storage type, as defined by Google Compute Engine. This
# must be a disk type appropriate to the project and zone in which
# the workers will run. If unknown or unspecified, the service
# will attempt to choose a reasonable default.
#
# For example, the standard persistent disk type is a resource name
# typically ending in &quot;pd-standard&quot;. If SSD persistent disks are
# available, the resource name typically ends with &quot;pd-ssd&quot;. The
# actual valid values are defined by the Google Compute Engine API,
# not by the Cloud Dataflow API; consult the Google Compute Engine
# documentation for more information about determining the set of
# available disk types for a particular project and zone.
#
# Google Compute Engine Disk types are local to a particular
# project in a particular zone, and so the resource name will
# typically look something like this:
#
# compute.googleapis.com/projects/project-id/zones/zone/diskTypes/pd-standard
&quot;mountPoint&quot;: &quot;A String&quot;, # Directory in a VM where disk is mounted.
},
],
&quot;packages&quot;: [ # Packages to be installed on workers.
{ # The packages that must be installed in order for a worker to run the
# steps of the Cloud Dataflow job that will be assigned to its worker
# pool.
#
# This is the mechanism by which the Cloud Dataflow SDK causes code to
# be loaded onto the workers. For example, the Cloud Dataflow Java SDK
# might use this to install jars containing the user&#x27;s code and all of the
# various dependencies (libraries, data files, etc.) required in order
# for that code to run.
&quot;name&quot;: &quot;A String&quot;, # The name of the package.
&quot;location&quot;: &quot;A String&quot;, # The resource to read the package from. The supported resource type is:
#
# Google Cloud Storage:
#
# storage.googleapis.com/{bucket}
# bucket.storage.googleapis.com/
},
],
&quot;teardownPolicy&quot;: &quot;A String&quot;, # Sets the policy for determining when to tear down the worker pool.
# Allowed values are: `TEARDOWN_ALWAYS`, `TEARDOWN_ON_SUCCESS`, and
# `TEARDOWN_NEVER`.
# `TEARDOWN_ALWAYS` means workers are always torn down regardless of whether
# the job succeeds. `TEARDOWN_ON_SUCCESS` means workers are torn down
# if the job succeeds. `TEARDOWN_NEVER` means the workers are never torn
# down.
#
# If the workers are not torn down by the service, they will
# continue to run and use Google Compute Engine VM resources in the
# user&#x27;s project until they are explicitly terminated by the user.
# Because of this, Google recommends using the `TEARDOWN_ALWAYS`
# policy except for small, manually supervised test jobs.
#
# If unknown or unspecified, the service will attempt to choose a reasonable
# default.
&quot;network&quot;: &quot;A String&quot;, # Network to which VMs will be assigned. If empty or unspecified,
# the service will use the network &quot;default&quot;.
&quot;ipConfiguration&quot;: &quot;A String&quot;, # Configuration for VM IPs.
&quot;diskSizeGb&quot;: 42, # Size of root disk for VMs, in GB. If zero or unspecified, the service will
# attempt to choose a reasonable default.
&quot;autoscalingSettings&quot;: { # Settings for WorkerPool autoscaling. # Settings for autoscaling of this WorkerPool.
&quot;maxNumWorkers&quot;: 42, # The maximum number of workers to cap scaling at.
&quot;algorithm&quot;: &quot;A String&quot;, # The algorithm to use for autoscaling.
},
&quot;poolArgs&quot;: { # Extra arguments for this worker pool.
&quot;a_key&quot;: &quot;&quot;, # Properties of the object. Contains field @type with type URL.
},
&quot;subnetwork&quot;: &quot;A String&quot;, # Subnetwork to which VMs will be assigned, if desired. Expected to be of
# the form &quot;regions/REGION/subnetworks/SUBNETWORK&quot;.
&quot;numWorkers&quot;: 42, # Number of Google Compute Engine workers in this pool needed to
# execute the job. If zero or unspecified, the service will
# attempt to choose a reasonable default.
&quot;numThreadsPerWorker&quot;: 42, # The number of threads per worker harness. If empty or unspecified, the
# service will choose a number of threads (according to the number of cores
# on the selected machine type for batch, or 1 by convention for streaming).
&quot;workerHarnessContainerImage&quot;: &quot;A String&quot;, # Required. Docker container image that executes the Cloud Dataflow worker
# harness, residing in Google Container Registry.
#
# Deprecated for the Fn API path. Use sdk_harness_container_images instead.
&quot;taskrunnerSettings&quot;: { # Taskrunner configuration settings. # Settings passed through to Google Compute Engine workers when
# using the standard Dataflow task runner. Users should ignore
# this field.
&quot;dataflowApiVersion&quot;: &quot;A String&quot;, # The API version of endpoint, e.g. &quot;v1b3&quot;
&quot;oauthScopes&quot;: [ # The OAuth2 scopes to be requested by the taskrunner in order to
# access the Cloud Dataflow API.
&quot;A String&quot;,
],
&quot;baseUrl&quot;: &quot;A String&quot;, # The base URL for the taskrunner to use when accessing Google Cloud APIs.
#
# When workers access Google Cloud APIs, they logically do so via
# relative URLs. If this field is specified, it supplies the base
# URL to use for resolving these relative URLs. The normative
# algorithm used is defined by RFC 1808, &quot;Relative Uniform Resource
# Locators&quot;.
#
# If not specified, the default value is &quot;http://www.googleapis.com/&quot;
&quot;workflowFileName&quot;: &quot;A String&quot;, # The file to store the workflow in.
&quot;logToSerialconsole&quot;: True or False, # Whether to send taskrunner log info to Google Compute Engine VM serial
# console.
&quot;baseTaskDir&quot;: &quot;A String&quot;, # The location on the worker for task-specific subdirectories.
&quot;taskUser&quot;: &quot;A String&quot;, # The UNIX user ID on the worker VM to use for tasks launched by
# taskrunner; e.g. &quot;root&quot;.
&quot;vmId&quot;: &quot;A String&quot;, # The ID string of the VM.
&quot;alsologtostderr&quot;: True or False, # Whether to also send taskrunner log info to stderr.
&quot;parallelWorkerSettings&quot;: { # Provides data to pass through to the worker harness. # The settings to pass to the parallel worker harness.
&quot;shuffleServicePath&quot;: &quot;A String&quot;, # The Shuffle service path relative to the root URL, for example,
# &quot;shuffle/v1beta1&quot;.
&quot;tempStoragePrefix&quot;: &quot;A String&quot;, # The prefix of the resources the system should use for temporary
# storage.
#
# The supported resource type is:
#
# Google Cloud Storage:
#
# storage.googleapis.com/{bucket}/{object}
# bucket.storage.googleapis.com/{object}
&quot;reportingEnabled&quot;: True or False, # Whether to send work progress updates to the service.
&quot;servicePath&quot;: &quot;A String&quot;, # The Cloud Dataflow service path relative to the root URL, for example,
# &quot;dataflow/v1b3/projects&quot;.
&quot;baseUrl&quot;: &quot;A String&quot;, # The base URL for accessing Google Cloud APIs.
#
# When workers access Google Cloud APIs, they logically do so via
# relative URLs. If this field is specified, it supplies the base
# URL to use for resolving these relative URLs. The normative
# algorithm used is defined by RFC 1808, &quot;Relative Uniform Resource
# Locators&quot;.
#
# If not specified, the default value is &quot;http://www.googleapis.com/&quot;
&quot;workerId&quot;: &quot;A String&quot;, # The ID of the worker running this pipeline.
},
&quot;harnessCommand&quot;: &quot;A String&quot;, # The command to launch the worker harness.
&quot;logDir&quot;: &quot;A String&quot;, # The directory on the VM to store logs.
&quot;streamingWorkerMainClass&quot;: &quot;A String&quot;, # The streaming worker main class name.
&quot;languageHint&quot;: &quot;A String&quot;, # The suggested backend language.
&quot;taskGroup&quot;: &quot;A String&quot;, # The UNIX group ID on the worker VM to use for tasks launched by
# taskrunner; e.g. &quot;wheel&quot;.
&quot;logUploadLocation&quot;: &quot;A String&quot;, # Indicates where to put logs. If this is not specified, the logs
# will not be uploaded.
#
# The supported resource type is:
#
# Google Cloud Storage:
# storage.googleapis.com/{bucket}/{object}
# bucket.storage.googleapis.com/{object}
&quot;commandlinesFileName&quot;: &quot;A String&quot;, # The file to store preprocessing commands in.
&quot;continueOnException&quot;: True or False, # Whether to continue taskrunner if an exception is hit.
&quot;tempStoragePrefix&quot;: &quot;A String&quot;, # The prefix of the resources the taskrunner should use for
# temporary storage.
#
# The supported resource type is:
#
# Google Cloud Storage:
# storage.googleapis.com/{bucket}/{object}
# bucket.storage.googleapis.com/{object}
},
&quot;diskType&quot;: &quot;A String&quot;, # Type of root disk for VMs. If empty or unspecified, the service will
# attempt to choose a reasonable default.
&quot;defaultPackageSet&quot;: &quot;A String&quot;, # The default package set to install. This allows the service to
# select a default set of packages which are useful to worker
# harnesses written in a particular language.
&quot;machineType&quot;: &quot;A String&quot;, # Machine type (e.g. &quot;n1-standard-1&quot;). If empty or unspecified, the
# service will attempt to choose a reasonable default.
},
],
&quot;tempStoragePrefix&quot;: &quot;A String&quot;, # The prefix of the resources the system should use for temporary
# storage. The system will append the suffix &quot;/temp-{JOBNAME}&quot; to
# this resource prefix, where {JOBNAME} is the value of the
# job_name field. The resulting bucket and object prefix is used
# as the prefix of the resources used to store temporary data
# needed during the job execution. NOTE: This will override the
# value in taskrunner_settings.
# The supported resource type is:
#
# Google Cloud Storage:
#
# storage.googleapis.com/{bucket}/{object}
# bucket.storage.googleapis.com/{object}
&quot;internalExperiments&quot;: { # Experimental settings.
&quot;a_key&quot;: &quot;&quot;, # Properties of the object. Contains field @type with type URL.
},
&quot;sdkPipelineOptions&quot;: { # The Cloud Dataflow SDK pipeline options specified by the user. These
# options are passed through the service and are used to recreate the
# SDK pipeline options on the worker in a language agnostic and platform
# independent way.
&quot;a_key&quot;: &quot;&quot;, # Properties of the object.
},
&quot;dataset&quot;: &quot;A String&quot;, # The dataset for the current project where various workflow
# related tables are stored.
#
# The supported resource type is:
#
# Google BigQuery:
# bigquery.googleapis.com/{dataset}
&quot;clusterManagerApiService&quot;: &quot;A String&quot;, # The type of cluster manager API to use. If unknown or
# unspecified, the service will attempt to choose a reasonable
# default. This should be in the form of the API service name,
# e.g. &quot;compute.googleapis.com&quot;.
},
&quot;stepsLocation&quot;: &quot;A String&quot;, # The GCS location where the steps are stored.
&quot;steps&quot;: [ # Exactly one of steps or steps_location should be specified.
#
# The top-level steps that constitute the entire job.
{ # Defines a particular step within a Cloud Dataflow job.
#
# A job consists of multiple steps, each of which performs some
# specific operation as part of the overall job. Data is typically
# passed from one step to another as part of the job.
#
# Here&#x27;s an example of a sequence of steps which together implement a
# Map-Reduce job:
#
# * Read a collection of data from some source, parsing the
# collection&#x27;s elements.
#
# * Validate the elements.
#
# * Apply a user-defined function to map each element to some value
# and extract an element-specific key value.
#
# * Group elements with the same key into a single element with
# that key, transforming a multiply-keyed collection into a
# uniquely-keyed collection.
#
# * Write the elements out to some data sink.
#
# Note that the Cloud Dataflow service may be used to run many different
# types of jobs, not just Map-Reduce.
&quot;kind&quot;: &quot;A String&quot;, # The kind of step in the Cloud Dataflow job.
&quot;properties&quot;: { # Named properties associated with the step. Each kind of
# predefined step has its own required set of properties.
# Must be provided on Create. Only retrieved with JOB_VIEW_ALL.
&quot;a_key&quot;: &quot;&quot;, # Properties of the object.
},
&quot;name&quot;: &quot;A String&quot;, # The name that identifies the step. This must be unique for each
# step with respect to all other steps in the Cloud Dataflow job.
},
],
&quot;stageStates&quot;: [ # This field may be mutated by the Cloud Dataflow service;
# callers cannot mutate it.
{ # A message describing the state of a particular execution stage.
&quot;executionStageState&quot;: &quot;A String&quot;, # Execution stage states allow the same set of values as JobState.
&quot;executionStageName&quot;: &quot;A String&quot;, # The name of the execution stage.
&quot;currentStateTime&quot;: &quot;A String&quot;, # The time at which the stage transitioned to this state.
},
],
&quot;replacedByJobId&quot;: &quot;A String&quot;, # If another job is an update of this job (and thus, this job is in
# `JOB_STATE_UPDATED`), this field contains the ID of that job.
&quot;jobMetadata&quot;: { # This field is populated by the Dataflow service to support filtering jobs
# by the metadata values provided here. Populated for ListJobs and all GetJob
# views SUMMARY and higher.
#
# Metadata available primarily for filtering jobs. Will be included in the
# ListJob response and Job SUMMARY view.
&quot;sdkVersion&quot;: { # The SDK version used to run the job.
&quot;sdkSupportStatus&quot;: &quot;A String&quot;, # The support status for this SDK version.
&quot;versionDisplayName&quot;: &quot;A String&quot;, # A readable string describing the version of the SDK.
&quot;version&quot;: &quot;A String&quot;, # The version of the SDK used to run the job.
},
&quot;bigTableDetails&quot;: [ # Identification of a BigTable source used in the Dataflow job.
{ # Metadata for a BigTable connector used by the job.
&quot;instanceId&quot;: &quot;A String&quot;, # InstanceId accessed in the connection.
&quot;tableId&quot;: &quot;A String&quot;, # TableId accessed in the connection.
&quot;projectId&quot;: &quot;A String&quot;, # ProjectId accessed in the connection.
},
],
&quot;pubsubDetails&quot;: [ # Identification of a PubSub source used in the Dataflow job.
{ # Metadata for a PubSub connector used by the job.
&quot;subscription&quot;: &quot;A String&quot;, # Subscription used in the connection.
&quot;topic&quot;: &quot;A String&quot;, # Topic accessed in the connection.
},
],
&quot;bigqueryDetails&quot;: [ # Identification of a BigQuery source used in the Dataflow job.
{ # Metadata for a BigQuery connector used by the job.
&quot;dataset&quot;: &quot;A String&quot;, # Dataset accessed in the connection.
&quot;projectId&quot;: &quot;A String&quot;, # Project accessed in the connection.
&quot;query&quot;: &quot;A String&quot;, # Query used to access data in the connection.
&quot;table&quot;: &quot;A String&quot;, # Table accessed in the connection.
},
],
&quot;fileDetails&quot;: [ # Identification of a File source used in the Dataflow job.
{ # Metadata for a File connector used by the job.
&quot;filePattern&quot;: &quot;A String&quot;, # File Pattern used to access files by the connector.
},
],
&quot;datastoreDetails&quot;: [ # Identification of a Datastore source used in the Dataflow job.
{ # Metadata for a Datastore connector used by the job.
&quot;namespace&quot;: &quot;A String&quot;, # Namespace used in the connection.
&quot;projectId&quot;: &quot;A String&quot;, # ProjectId accessed in the connection.
},
],
&quot;spannerDetails&quot;: [ # Identification of a Spanner source used in the Dataflow job.
{ # Metadata for a Spanner connector used by the job.
&quot;instanceId&quot;: &quot;A String&quot;, # InstanceId accessed in the connection.
&quot;databaseId&quot;: &quot;A String&quot;, # DatabaseId accessed in the connection.
&quot;projectId&quot;: &quot;A String&quot;, # ProjectId accessed in the connection.
},
],
},
&quot;location&quot;: &quot;A String&quot;, # The [regional endpoint]
# (https://cloud.google.com/dataflow/docs/concepts/regional-endpoints) that
# contains this job.
&quot;transformNameMapping&quot;: { # The map of transform name prefixes of the job to be replaced to the
# corresponding name prefixes of the new job.
&quot;a_key&quot;: &quot;A String&quot;,
},
&quot;startTime&quot;: &quot;A String&quot;, # The timestamp when the job was started (transitioned to JOB_STATE_PENDING).
# Flexible resource scheduling jobs are started with some delay after job
# creation, so start_time is unset before start and is updated when the
# job is started by the Cloud Dataflow service. For other jobs, start_time
# always equals create_time and is immutable and set by the Cloud Dataflow
# service.
&quot;clientRequestId&quot;: &quot;A String&quot;, # The client&#x27;s unique identifier of the job, re-used across retried attempts.
# If this field is set, the service will ensure its uniqueness.
# The request to create a job will fail if the service has knowledge of a
# previously submitted job with the same client&#x27;s ID and job name.
# The caller may use this field to ensure idempotence of job
# creation across retried attempts to create a job.
# By default, the field is empty and, in that case, the service ignores it.
&quot;executionInfo&quot;: { # Deprecated. Additional information about how a Cloud Dataflow job will be
# executed that isn&#x27;t contained in the submitted job.
&quot;stages&quot;: { # A mapping from each stage to the information about that stage.
&quot;a_key&quot;: { # Contains information about how a particular
# google.dataflow.v1beta3.Step will be executed.
&quot;stepName&quot;: [ # The steps associated with the execution stage.
# Note that stages may have several steps, and that a given step
# might be run by more than one stage.
&quot;A String&quot;,
],
},
},
},
&quot;type&quot;: &quot;A String&quot;, # The type of Cloud Dataflow job.
&quot;createTime&quot;: &quot;A String&quot;, # The timestamp when the job was initially created. Immutable and set by the
# Cloud Dataflow service.
&quot;tempFiles&quot;: [ # A set of files the system should be aware of that are used
# for temporary storage. These temporary files will be
# removed on job completion.
# No duplicates are allowed.
# No file patterns are supported.
#
# The supported files are:
#
# Google Cloud Storage:
#
# storage.googleapis.com/{bucket}/{object}
# bucket.storage.googleapis.com/{object}
&quot;A String&quot;,
],
&quot;id&quot;: &quot;A String&quot;, # The unique ID of this job.
#
# This field is set by the Cloud Dataflow service when the Job is
# created, and is immutable for the life of the job.
&quot;requestedState&quot;: &quot;A String&quot;, # The job&#x27;s requested state.
#
# `UpdateJob` may be used to switch between the `JOB_STATE_STOPPED` and
# `JOB_STATE_RUNNING` states, by setting requested_state. `UpdateJob` may
# also be used to directly set a job&#x27;s requested state to
# `JOB_STATE_CANCELLED` or `JOB_STATE_DONE`, irrevocably terminating the
# job if it has not already reached a terminal state.
&quot;replaceJobId&quot;: &quot;A String&quot;, # If this job is an update of an existing job, this field is the job ID
# of the job it replaced.
#
# When sending a `CreateJobRequest`, you can update a job by specifying it
# here. The job named here is stopped, and its intermediate state is
# transferred to this job.
&quot;createdFromSnapshotId&quot;: &quot;A String&quot;, # If this is specified, the job&#x27;s initial state is populated from the given
# snapshot.
&quot;currentState&quot;: &quot;A String&quot;, # The current state of the job.
#
# Jobs are created in the `JOB_STATE_STOPPED` state unless otherwise
# specified.
#
# A job in the `JOB_STATE_RUNNING` state may asynchronously enter a
# terminal state. After a job has reached a terminal state, no
# further state updates may be made.
#
# This field may be mutated by the Cloud Dataflow service;
# callers cannot mutate it.
&quot;name&quot;: &quot;A String&quot;, # The user-specified Cloud Dataflow job name.
#
# Only one Job with a given name may exist in a project at any
# given time. If a caller attempts to create a Job with the same
# name as an already-existing Job, the attempt returns the
# existing Job.
#
# The name must match the regular expression
# `[a-z]([-a-z0-9]{0,38}[a-z0-9])?`
&quot;currentStateTime&quot;: &quot;A String&quot;, # The timestamp associated with the current state.
}</pre>
</div>
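<div class="method">
<pre>Example (an illustrative sketch, not part of the generated API surface):
the `name` field above must match the regular expression
`[a-z]([-a-z0-9]{0,38}[a-z0-9])?`. A client can check a candidate name
locally before sending a create request. The helper name
`is_valid_job_name` is hypothetical, and matching the whole string (rather
than a prefix) is an assumption about how the service applies the pattern.

```python
import re

# Pattern quoted from the `name` field description in the Job resource.
JOB_NAME_PATTERN = re.compile(r"[a-z]([-a-z0-9]{0,38}[a-z0-9])?")


def is_valid_job_name(name):
    """Return True if `name` matches the documented pattern in full.

    Assumption: the service requires the entire name to match, so
    fullmatch() is used instead of match().
    """
    return JOB_NAME_PATTERN.fullmatch(name) is not None
```

Under this reading a name starts with a lowercase letter, may contain
lowercase letters, digits, and hyphens, cannot end with a hyphen, and is at
most 40 characters long.
</pre>
</div>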
<div class="method">
<code class="details" id="get">get(projectId, location, jobId, view=None, x__xgafv=None)</code>
<pre>Gets the state of the specified Cloud Dataflow job.
To get the state of a job, we recommend using `projects.locations.jobs.get`
with a [regional endpoint]
(https://cloud.google.com/dataflow/docs/concepts/regional-endpoints). Using
`projects.jobs.get` is not recommended, as you can only get the state of
jobs that are running in `us-central1`.
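
Example (a minimal sketch, not an official sample): the helper below wraps
the call chain documented on this page and returns the job&#x27;s `currentState`.
The function name `get_job_state` is hypothetical. `service` is assumed to be
a client object for this API, for example one created with
`googleapiclient.discovery.build('dataflow', 'v1b3')`.

```python
def get_job_state(service, project_id, location, job_id, view=None):
    """Fetch one job with projects.locations.jobs.get and return its state.

    `service` is assumed to expose projects().locations().jobs().get(...),
    as the generated client described on this page does.
    """
    kwargs = {"projectId": project_id, "location": location, "jobId": job_id}
    if view is not None:
        # Optional: the level of information requested, e.g. "JOB_VIEW_ALL".
        kwargs["view"] = view
    job = service.projects().locations().jobs().get(**kwargs).execute()
    # currentState is a string field of the returned Job object.
    return job.get("currentState")
```

Passing `location` explicitly keeps the request on the regional endpoint
that contains the job, as recommended above.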
Args:
projectId: string, The ID of the Cloud Platform project that the job belongs to. (required)
location: string, The [regional endpoint]
(https://cloud.google.com/dataflow/docs/concepts/regional-endpoints) that
contains this job. (required)
jobId: string, The job ID. (required)
view: string, The level of information requested in response.
x__xgafv: string, V1 error format.
Allowed values
1 - v1 error format
2 - v2 error format
Returns:
An object of the form:
{ # Defines a job to be run by the Cloud Dataflow service.
&quot;pipelineDescription&quot;: { # A descriptive representation of the submitted pipeline as well as the executed
# form. This data is provided by the Dataflow service for ease of visualizing
# the pipeline and interpreting Dataflow provided metrics.
# Preliminary field: The format of this data may change at any time.
# A description of the user pipeline and stages through which it is executed.
# Created by Cloud Dataflow service. Only retrieved with
# JOB_VIEW_DESCRIPTION or JOB_VIEW_ALL.
&quot;displayData&quot;: [ # Pipeline level display data.
{ # Data provided with a pipeline or transform to provide descriptive info.
&quot;url&quot;: &quot;A String&quot;, # An optional full URL.
&quot;javaClassValue&quot;: &quot;A String&quot;, # Contains value if the data is of java class type.
&quot;timestampValue&quot;: &quot;A String&quot;, # Contains value if the data is of timestamp type.
&quot;durationValue&quot;: &quot;A String&quot;, # Contains value if the data is of duration type.
&quot;label&quot;: &quot;A String&quot;, # An optional label to display in a dax UI for the element.
&quot;key&quot;: &quot;A String&quot;, # The key identifying the display data.
# This is intended to be used as a label for the display data
# when viewed in a dax monitoring system.
&quot;namespace&quot;: &quot;A String&quot;, # The namespace for the key. This is usually a class name or programming
# language namespace (e.g. a Python module) which defines the display data.
# This allows a dax monitoring system to specially handle the data
# and perform custom rendering.
&quot;floatValue&quot;: 3.14, # Contains value if the data is of float type.
&quot;strValue&quot;: &quot;A String&quot;, # Contains value if the data is of string type.
&quot;int64Value&quot;: &quot;A String&quot;, # Contains value if the data is of int64 type.
&quot;boolValue&quot;: True or False, # Contains value if the data is of a boolean type.
&quot;shortStrValue&quot;: &quot;A String&quot;, # A possible additional shorter value to display.
# For example a java_class_name_value of com.mypackage.MyDoFn
# will be stored with MyDoFn as the short_str_value and
# com.mypackage.MyDoFn as the java_class_name value.
# short_str_value can be displayed and java_class_name_value
# will be displayed as a tooltip.
},
],
&quot;originalPipelineTransform&quot;: [ # Description of each transform in the pipeline and collections between them.
{ # Description of the type, names/ids, and input/outputs for a transform.
&quot;outputCollectionName&quot;: [ # User names for all collection outputs to this transform.
&quot;A String&quot;,
],
&quot;displayData&quot;: [ # Transform-specific display data.
{ # Data provided with a pipeline or transform to provide descriptive info.
&quot;url&quot;: &quot;A String&quot;, # An optional full URL.
&quot;javaClassValue&quot;: &quot;A String&quot;, # Contains value if the data is of java class type.
&quot;timestampValue&quot;: &quot;A String&quot;, # Contains value if the data is of timestamp type.
&quot;durationValue&quot;: &quot;A String&quot;, # Contains value if the data is of duration type.
&quot;label&quot;: &quot;A String&quot;, # An optional label to display in a dax UI for the element.
&quot;key&quot;: &quot;A String&quot;, # The key identifying the display data.
# This is intended to be used as a label for the display data
# when viewed in a dax monitoring system.
&quot;namespace&quot;: &quot;A String&quot;, # The namespace for the key. This is usually a class name or programming
# language namespace (e.g. a Python module) which defines the display data.
# This allows a dax monitoring system to specially handle the data
# and perform custom rendering.
&quot;floatValue&quot;: 3.14, # Contains value if the data is of float type.
&quot;strValue&quot;: &quot;A String&quot;, # Contains value if the data is of string type.
&quot;int64Value&quot;: &quot;A String&quot;, # Contains value if the data is of int64 type.
&quot;boolValue&quot;: True or False, # Contains value if the data is of a boolean type.
&quot;shortStrValue&quot;: &quot;A String&quot;, # A possible additional shorter value to display.
# For example a java_class_name_value of com.mypackage.MyDoFn
# will be stored with MyDoFn as the short_str_value and
# com.mypackage.MyDoFn as the java_class_name value.
# short_str_value can be displayed and java_class_name_value
# will be displayed as a tooltip.
},
],
&quot;id&quot;: &quot;A String&quot;, # SDK generated id of this transform instance.
&quot;inputCollectionName&quot;: [ # User names for all collection inputs to this transform.
&quot;A String&quot;,
],
&quot;name&quot;: &quot;A String&quot;, # User provided name for this transform instance.
&quot;kind&quot;: &quot;A String&quot;, # Type of transform.
},
],
&quot;executionPipelineStage&quot;: [ # Description of each stage of execution of the pipeline.
{ # Description of the composing transforms, names/ids, and input/outputs of a
# stage of execution. Some composing transforms and sources may have been
# generated by the Dataflow service during execution planning.
&quot;componentSource&quot;: [ # Collections produced and consumed by component transforms of this stage.
{ # Description of an interstitial value between transforms in an execution
# stage.
&quot;userName&quot;: &quot;A String&quot;, # Human-readable name for this transform; may be user or system generated.
&quot;name&quot;: &quot;A String&quot;, # Dataflow service generated name for this source.
&quot;originalTransformOrCollection&quot;: &quot;A String&quot;, # User name for the original user transform or collection with which this
# source is most closely associated.
},
],
&quot;inputSource&quot;: [ # Input sources for this stage.
{ # Description of an input or output of an execution stage.
&quot;userName&quot;: &quot;A String&quot;, # Human-readable name for this source; may be user or system generated.
&quot;originalTransformOrCollection&quot;: &quot;A String&quot;, # User name for the original user transform or collection with which this
# source is most closely associated.
&quot;sizeBytes&quot;: &quot;A String&quot;, # Size of the source, if measurable.
&quot;name&quot;: &quot;A String&quot;, # Dataflow service generated name for this source.
},
],
&quot;name&quot;: &quot;A String&quot;, # Dataflow service generated name for this stage.
&quot;componentTransform&quot;: [ # Transforms that comprise this execution stage.
{ # Description of a transform executed as part of an execution stage.
&quot;name&quot;: &quot;A String&quot;, # Dataflow service generated name for this source.
&quot;userName&quot;: &quot;A String&quot;, # Human-readable name for this transform; may be user or system generated.
&quot;originalTransform&quot;: &quot;A String&quot;, # User name for the original user transform with which this transform is
# most closely associated.
},
],
&quot;id&quot;: &quot;A String&quot;, # Dataflow service generated id for this stage.
&quot;outputSource&quot;: [ # Output sources for this stage.
{ # Description of an input or output of an execution stage.
&quot;userName&quot;: &quot;A String&quot;, # Human-readable name for this source; may be user or system generated.
&quot;originalTransformOrCollection&quot;: &quot;A String&quot;, # User name for the original user transform or collection with which this
# source is most closely associated.
&quot;sizeBytes&quot;: &quot;A String&quot;, # Size of the source, if measurable.
&quot;name&quot;: &quot;A String&quot;, # Dataflow service generated name for this source.
},
],
&quot;kind&quot;: &quot;A String&quot;, # Type of transform this stage is executing.
},
],
},
&quot;labels&quot;: { # User-defined labels for this job.
#
# The labels map can contain no more than 64 entries. Entries of the labels
# map are UTF8 strings that comply with the following restrictions:
#
# * Keys must conform to regexp: \p{Ll}\p{Lo}{0,62}
# * Values must conform to regexp: [\p{Ll}\p{Lo}\p{N}_-]{0,63}
# * Both keys and values are additionally constrained to be &lt;= 128 bytes in
# size.
&quot;a_key&quot;: &quot;A String&quot;,
},
&quot;projectId&quot;: &quot;A String&quot;, # The ID of the Cloud Platform project that the job belongs to.
&quot;environment&quot;: { # Describes the environment in which a Dataflow Job runs. # The environment for the job.
&quot;flexResourceSchedulingGoal&quot;: &quot;A String&quot;, # Which Flexible Resource Scheduling mode to run in.
&quot;workerRegion&quot;: &quot;A String&quot;, # The Compute Engine region
# (https://cloud.google.com/compute/docs/regions-zones/regions-zones) in
# which worker processing should occur, e.g. &quot;us-west1&quot;. Mutually exclusive
# with worker_zone. If neither worker_region nor worker_zone is specified,
# default to the control plane&#x27;s region.
&quot;userAgent&quot;: { # A description of the process that generated the request.
&quot;a_key&quot;: &quot;&quot;, # Properties of the object.
},
&quot;serviceAccountEmail&quot;: &quot;A String&quot;, # Identity to run virtual machines as. Defaults to the default account.
&quot;version&quot;: { # A structure describing which components and their versions of the service
# are required in order to run the job.
&quot;a_key&quot;: &quot;&quot;, # Properties of the object.
},
&quot;serviceKmsKeyName&quot;: &quot;A String&quot;, # If set, contains the Cloud KMS key identifier used to encrypt data
# at rest, AKA a Customer Managed Encryption Key (CMEK).
#
# Format:
# projects/PROJECT_ID/locations/LOCATION/keyRings/KEY_RING/cryptoKeys/KEY
&quot;experiments&quot;: [ # The list of experiments to enable.
&quot;A String&quot;,
],
&quot;workerZone&quot;: &quot;A String&quot;, # The Compute Engine zone
# (https://cloud.google.com/compute/docs/regions-zones/regions-zones) in
# which worker processing should occur, e.g. &quot;us-west1-a&quot;. Mutually exclusive
# with worker_region. If neither worker_region nor worker_zone is specified,
# a zone in the control plane&#x27;s region is chosen based on available capacity.
&quot;workerPools&quot;: [ # The worker pools. At least one &quot;harness&quot; worker pool must be
# specified in order for the job to have workers.
{ # Describes one particular pool of Cloud Dataflow workers to be
# instantiated by the Cloud Dataflow service in order to perform the
# computations required by a job. Note that a workflow job may use
# multiple pools, in order to match the various computational
# requirements of the various stages of the job.
&quot;onHostMaintenance&quot;: &quot;A String&quot;, # The action to take on host maintenance, as defined by the Google
# Compute Engine API.
&quot;sdkHarnessContainerImages&quot;: [ # Set of SDK harness containers needed to execute this pipeline. This will
# only be set in the Fn API path. For non-cross-language pipelines this
# should have only one entry. Cross-language pipelines will have two or more
# entries.
{ # Defines an SDK harness container for executing Dataflow pipelines.
&quot;containerImage&quot;: &quot;A String&quot;, # A docker container image that resides in Google Container Registry.
&quot;useSingleCorePerContainer&quot;: True or False, # If true, recommends that the Dataflow service use only one core per SDK
# container instance with this image. If false (or unset) recommends using
# more than one core per SDK container instance with this image for
# efficiency. Note that Dataflow service may choose to override this property
# if needed.
},
],
&quot;zone&quot;: &quot;A String&quot;, # Zone to run the worker pools in. If empty or unspecified, the service
# will attempt to choose a reasonable default.
&quot;kind&quot;: &quot;A String&quot;, # The kind of the worker pool; currently only `harness` and `shuffle`
# are supported.
&quot;metadata&quot;: { # Metadata to set on the Google Compute Engine VMs.
&quot;a_key&quot;: &quot;A String&quot;,
},
&quot;diskSourceImage&quot;: &quot;A String&quot;, # Fully qualified source image for disks.
&quot;dataDisks&quot;: [ # Data disks that are used by a VM in this workflow.
{ # Describes the data disk used by a workflow job.
&quot;sizeGb&quot;: 42, # Size of disk in GB. If zero or unspecified, the service will
# attempt to choose a reasonable default.
&quot;diskType&quot;: &quot;A String&quot;, # Disk storage type, as defined by Google Compute Engine. This
# must be a disk type appropriate to the project and zone in which
# the workers will run. If unknown or unspecified, the service
# will attempt to choose a reasonable default.
#
# For example, the standard persistent disk type is a resource name
# typically ending in &quot;pd-standard&quot;. If SSD persistent disks are
# available, the resource name typically ends with &quot;pd-ssd&quot;. The
# actual valid values are defined by the Google Compute Engine API,
# not by the Cloud Dataflow API; consult the Google Compute Engine
# documentation for more information about determining the set of
# available disk types for a particular project and zone.
#
# Google Compute Engine Disk types are local to a particular
# project in a particular zone, and so the resource name will
# typically look something like this:
#
# compute.googleapis.com/projects/project-id/zones/zone/diskTypes/pd-standard
&quot;mountPoint&quot;: &quot;A String&quot;, # Directory in a VM where disk is mounted.
},
],
&quot;packages&quot;: [ # Packages to be installed on workers.
{ # The packages that must be installed in order for a worker to run the
# steps of the Cloud Dataflow job that will be assigned to its worker
# pool.
#
# This is the mechanism by which the Cloud Dataflow SDK causes code to
# be loaded onto the workers. For example, the Cloud Dataflow Java SDK
# might use this to install jars containing the user&#x27;s code and all of the
# various dependencies (libraries, data files, etc.) required in order
# for that code to run.
&quot;name&quot;: &quot;A String&quot;, # The name of the package.
&quot;location&quot;: &quot;A String&quot;, # The resource to read the package from. The supported resource type is:
#
# Google Cloud Storage:
#
# storage.googleapis.com/{bucket}
# bucket.storage.googleapis.com/
},
],
&quot;teardownPolicy&quot;: &quot;A String&quot;, # Sets the policy for determining when to tear down the worker pool.
# Allowed values are: `TEARDOWN_ALWAYS`, `TEARDOWN_ON_SUCCESS`, and
# `TEARDOWN_NEVER`.
# `TEARDOWN_ALWAYS` means workers are always torn down regardless of whether
# the job succeeds. `TEARDOWN_ON_SUCCESS` means workers are torn down
# if the job succeeds. `TEARDOWN_NEVER` means the workers are never torn
# down.
#
# If the workers are not torn down by the service, they will
# continue to run and use Google Compute Engine VM resources in the
# user&#x27;s project until they are explicitly terminated by the user.
# Because of this, Google recommends using the `TEARDOWN_ALWAYS`
# policy except for small, manually supervised test jobs.
#
# If unknown or unspecified, the service will attempt to choose a reasonable
# default.
&quot;network&quot;: &quot;A String&quot;, # Network to which VMs will be assigned. If empty or unspecified,
# the service will use the network &quot;default&quot;.
&quot;ipConfiguration&quot;: &quot;A String&quot;, # Configuration for VM IPs.
&quot;diskSizeGb&quot;: 42, # Size of root disk for VMs, in GB. If zero or unspecified, the service will
# attempt to choose a reasonable default.
&quot;autoscalingSettings&quot;: { # Settings for WorkerPool autoscaling. # Settings for autoscaling of this WorkerPool.
&quot;maxNumWorkers&quot;: 42, # The maximum number of workers to cap scaling at.
&quot;algorithm&quot;: &quot;A String&quot;, # The algorithm to use for autoscaling.
},
&quot;poolArgs&quot;: { # Extra arguments for this worker pool.
&quot;a_key&quot;: &quot;&quot;, # Properties of the object. Contains field @type with type URL.
},
&quot;subnetwork&quot;: &quot;A String&quot;, # Subnetwork to which VMs will be assigned, if desired. Expected to be of
# the form &quot;regions/REGION/subnetworks/SUBNETWORK&quot;.
&quot;numWorkers&quot;: 42, # Number of Google Compute Engine workers in this pool needed to
# execute the job. If zero or unspecified, the service will
# attempt to choose a reasonable default.
&quot;numThreadsPerWorker&quot;: 42, # The number of threads per worker harness. If empty or unspecified, the
# service will choose a number of threads (according to the number of cores
# on the selected machine type for batch, or 1 by convention for streaming).
&quot;workerHarnessContainerImage&quot;: &quot;A String&quot;, # Required. Docker container image that executes the Cloud Dataflow worker
# harness, residing in Google Container Registry.
#
# Deprecated for the Fn API path. Use sdk_harness_container_images instead.
&quot;taskrunnerSettings&quot;: { # Taskrunner configuration settings. # Settings passed through to Google Compute Engine workers when
# using the standard Dataflow task runner. Users should ignore
# this field.
&quot;dataflowApiVersion&quot;: &quot;A String&quot;, # The API version of endpoint, e.g. &quot;v1b3&quot;
&quot;oauthScopes&quot;: [ # The OAuth2 scopes to be requested by the taskrunner in order to
# access the Cloud Dataflow API.
&quot;A String&quot;,
],
&quot;baseUrl&quot;: &quot;A String&quot;, # The base URL for the taskrunner to use when accessing Google Cloud APIs.
#
# When workers access Google Cloud APIs, they logically do so via
# relative URLs. If this field is specified, it supplies the base
# URL to use for resolving these relative URLs. The normative
# algorithm used is defined by RFC 1808, &quot;Relative Uniform Resource
# Locators&quot;.
#
# If not specified, the default value is &quot;http://www.googleapis.com/&quot;
&quot;workflowFileName&quot;: &quot;A String&quot;, # The file to store the workflow in.
&quot;logToSerialconsole&quot;: True or False, # Whether to send taskrunner log info to Google Compute Engine VM serial
# console.
&quot;baseTaskDir&quot;: &quot;A String&quot;, # The location on the worker for task-specific subdirectories.
&quot;taskUser&quot;: &quot;A String&quot;, # The UNIX user ID on the worker VM to use for tasks launched by
# taskrunner; e.g. &quot;root&quot;.
&quot;vmId&quot;: &quot;A String&quot;, # The ID string of the VM.
&quot;alsologtostderr&quot;: True or False, # Whether to also send taskrunner log info to stderr.
&quot;parallelWorkerSettings&quot;: { # Provides data to pass through to the worker harness. # The settings to pass to the parallel worker harness.
&quot;shuffleServicePath&quot;: &quot;A String&quot;, # The Shuffle service path relative to the root URL, for example,
# &quot;shuffle/v1beta1&quot;.
&quot;tempStoragePrefix&quot;: &quot;A String&quot;, # The prefix of the resources the system should use for temporary
# storage.
#
# The supported resource type is:
#
# Google Cloud Storage:
#
# storage.googleapis.com/{bucket}/{object}
# bucket.storage.googleapis.com/{object}
&quot;reportingEnabled&quot;: True or False, # Whether to send work progress updates to the service.
&quot;servicePath&quot;: &quot;A String&quot;, # The Cloud Dataflow service path relative to the root URL, for example,
# &quot;dataflow/v1b3/projects&quot;.
&quot;baseUrl&quot;: &quot;A String&quot;, # The base URL for accessing Google Cloud APIs.
#
# When workers access Google Cloud APIs, they logically do so via
# relative URLs. If this field is specified, it supplies the base
# URL to use for resolving these relative URLs. The normative
# algorithm used is defined by RFC 1808, &quot;Relative Uniform Resource
# Locators&quot;.
#
# If not specified, the default value is &quot;http://www.googleapis.com/&quot;
&quot;workerId&quot;: &quot;A String&quot;, # The ID of the worker running this pipeline.
},
&quot;harnessCommand&quot;: &quot;A String&quot;, # The command to launch the worker harness.
&quot;logDir&quot;: &quot;A String&quot;, # The directory on the VM to store logs.
&quot;streamingWorkerMainClass&quot;: &quot;A String&quot;, # The streaming worker main class name.
&quot;languageHint&quot;: &quot;A String&quot;, # The suggested backend language.
&quot;taskGroup&quot;: &quot;A String&quot;, # The UNIX group ID on the worker VM to use for tasks launched by
# taskrunner; e.g. &quot;wheel&quot;.
&quot;logUploadLocation&quot;: &quot;A String&quot;, # Indicates where to put logs. If this is not specified, the logs
# will not be uploaded.
#
# The supported resource type is:
#
# Google Cloud Storage:
# storage.googleapis.com/{bucket}/{object}
# bucket.storage.googleapis.com/{object}
&quot;commandlinesFileName&quot;: &quot;A String&quot;, # The file to store preprocessing commands in.
&quot;continueOnException&quot;: True or False, # Whether to continue taskrunner if an exception is hit.
&quot;tempStoragePrefix&quot;: &quot;A String&quot;, # The prefix of the resources the taskrunner should use for
# temporary storage.
#
# The supported resource type is:
#
# Google Cloud Storage:
# storage.googleapis.com/{bucket}/{object}
# bucket.storage.googleapis.com/{object}
},
&quot;diskType&quot;: &quot;A String&quot;, # Type of root disk for VMs. If empty or unspecified, the service will
# attempt to choose a reasonable default.
&quot;defaultPackageSet&quot;: &quot;A String&quot;, # The default package set to install. This allows the service to
# select a default set of packages which are useful to worker
# harnesses written in a particular language.
&quot;machineType&quot;: &quot;A String&quot;, # Machine type (e.g. &quot;n1-standard-1&quot;). If empty or unspecified, the
# service will attempt to choose a reasonable default.
},
],
&quot;tempStoragePrefix&quot;: &quot;A String&quot;, # The prefix of the resources the system should use for temporary
# storage. The system will append the suffix &quot;/temp-{JOBNAME}&quot; to
# this resource prefix, where {JOBNAME} is the value of the
# job_name field. The resulting bucket and object prefix is used
# as the prefix of the resources used to store temporary data
# needed during the job execution. NOTE: This will override the
# value in taskrunner_settings.
# The supported resource type is:
#
# Google Cloud Storage:
#
# storage.googleapis.com/{bucket}/{object}
# bucket.storage.googleapis.com/{object}
&quot;internalExperiments&quot;: { # Experimental settings.
&quot;a_key&quot;: &quot;&quot;, # Properties of the object. Contains field @type with type URL.
},
&quot;sdkPipelineOptions&quot;: { # The Cloud Dataflow SDK pipeline options specified by the user. These
# options are passed through the service and are used to recreate the
# SDK pipeline options on the worker in a language agnostic and platform
# independent way.
&quot;a_key&quot;: &quot;&quot;, # Properties of the object.
},
&quot;dataset&quot;: &quot;A String&quot;, # The dataset for the current project where various workflow
# related tables are stored.
#
# The supported resource type is:
#
# Google BigQuery:
# bigquery.googleapis.com/{dataset}
&quot;clusterManagerApiService&quot;: &quot;A String&quot;, # The type of cluster manager API to use. If unknown or
# unspecified, the service will attempt to choose a reasonable
# default. This should be in the form of the API service name,
# e.g. &quot;compute.googleapis.com&quot;.
},
&quot;stepsLocation&quot;: &quot;A String&quot;, # The GCS location where the steps are stored.
&quot;steps&quot;: [ # Exactly one of steps or steps_location should be specified.
#
# The top-level steps that constitute the entire job.
{ # Defines a particular step within a Cloud Dataflow job.
#
# A job consists of multiple steps, each of which performs some
# specific operation as part of the overall job. Data is typically
# passed from one step to another as part of the job.
#
# Here&#x27;s an example of a sequence of steps which together implement a
# Map-Reduce job:
#
# * Read a collection of data from some source, parsing the
# collection&#x27;s elements.
#
# * Validate the elements.
#
# * Apply a user-defined function to map each element to some value
# and extract an element-specific key value.
#
# * Group elements with the same key into a single element with
# that key, transforming a multiply-keyed collection into a
# uniquely-keyed collection.
#
# * Write the elements out to some data sink.
#
# Note that the Cloud Dataflow service may be used to run many different
# types of jobs, not just Map-Reduce.
&quot;kind&quot;: &quot;A String&quot;, # The kind of step in the Cloud Dataflow job.
&quot;properties&quot;: { # Named properties associated with the step. Each kind of
# predefined step has its own required set of properties.
# Must be provided on Create. Only retrieved with JOB_VIEW_ALL.
&quot;a_key&quot;: &quot;&quot;, # Properties of the object.
},
&quot;name&quot;: &quot;A String&quot;, # The name that identifies the step. This must be unique for each
# step with respect to all other steps in the Cloud Dataflow job.
},
],
&quot;stageStates&quot;: [ # This field may be mutated by the Cloud Dataflow service;
# callers cannot mutate it.
{ # A message describing the state of a particular execution stage.
&quot;executionStageState&quot;: &quot;A String&quot;, # Execution stage states allow the same set of values as JobState.
&quot;executionStageName&quot;: &quot;A String&quot;, # The name of the execution stage.
&quot;currentStateTime&quot;: &quot;A String&quot;, # The time at which the stage transitioned to this state.
},
],
&quot;replacedByJobId&quot;: &quot;A String&quot;, # If another job is an update of this job (and thus, this job is in
# `JOB_STATE_UPDATED`), this field contains the ID of that job.
&quot;jobMetadata&quot;: { # Metadata available primarily for filtering jobs. Will be included in the
# ListJob response and Job SUMMARY view.
#
# This field is populated by the Dataflow service to support filtering jobs
# by the metadata values provided here. Populated for ListJobs and all GetJob
# views SUMMARY and higher.
&quot;sdkVersion&quot;: { # The version of the SDK used to run the job. # The SDK version used to run the job.
&quot;sdkSupportStatus&quot;: &quot;A String&quot;, # The support status for this SDK version.
&quot;versionDisplayName&quot;: &quot;A String&quot;, # A readable string describing the version of the SDK.
&quot;version&quot;: &quot;A String&quot;, # The version of the SDK used to run the job.
},
&quot;bigTableDetails&quot;: [ # Identification of a BigTable source used in the Dataflow job.
{ # Metadata for a BigTable connector used by the job.
&quot;instanceId&quot;: &quot;A String&quot;, # InstanceId accessed in the connection.
&quot;tableId&quot;: &quot;A String&quot;, # TableId accessed in the connection.
&quot;projectId&quot;: &quot;A String&quot;, # ProjectId accessed in the connection.
},
],
&quot;pubsubDetails&quot;: [ # Identification of a PubSub source used in the Dataflow job.
{ # Metadata for a PubSub connector used by the job.
&quot;subscription&quot;: &quot;A String&quot;, # Subscription used in the connection.
&quot;topic&quot;: &quot;A String&quot;, # Topic accessed in the connection.
},
],
&quot;bigqueryDetails&quot;: [ # Identification of a BigQuery source used in the Dataflow job.
{ # Metadata for a BigQuery connector used by the job.
&quot;dataset&quot;: &quot;A String&quot;, # Dataset accessed in the connection.
&quot;projectId&quot;: &quot;A String&quot;, # Project accessed in the connection.
&quot;query&quot;: &quot;A String&quot;, # Query used to access data in the connection.
&quot;table&quot;: &quot;A String&quot;, # Table accessed in the connection.
},
],
&quot;fileDetails&quot;: [ # Identification of a File source used in the Dataflow job.
{ # Metadata for a File connector used by the job.
&quot;filePattern&quot;: &quot;A String&quot;, # File Pattern used to access files by the connector.
},
],
&quot;datastoreDetails&quot;: [ # Identification of a Datastore source used in the Dataflow job.
{ # Metadata for a Datastore connector used by the job.
&quot;namespace&quot;: &quot;A String&quot;, # Namespace used in the connection.
&quot;projectId&quot;: &quot;A String&quot;, # ProjectId accessed in the connection.
},
],
&quot;spannerDetails&quot;: [ # Identification of a Spanner source used in the Dataflow job.
{ # Metadata for a Spanner connector used by the job.
&quot;instanceId&quot;: &quot;A String&quot;, # InstanceId accessed in the connection.
&quot;databaseId&quot;: &quot;A String&quot;, # DatabaseId accessed in the connection.
&quot;projectId&quot;: &quot;A String&quot;, # ProjectId accessed in the connection.
},
],
},
&quot;location&quot;: &quot;A String&quot;, # The [regional endpoint]
# (https://cloud.google.com/dataflow/docs/concepts/regional-endpoints) that
# contains this job.
&quot;transformNameMapping&quot;: { # The map of transform name prefixes of the job to be replaced to the
# corresponding name prefixes of the new job.
&quot;a_key&quot;: &quot;A String&quot;,
},
&quot;startTime&quot;: &quot;A String&quot;, # The timestamp when the job was started (transitioned to JOB_STATE_PENDING).
# Flexible resource scheduling jobs are started with some delay after job
# creation, so start_time is unset before start and is updated when the
# job is started by the Cloud Dataflow service. For other jobs, start_time
# is always equal to create_time and is immutable and set by the Cloud Dataflow
# service.
&quot;clientRequestId&quot;: &quot;A String&quot;, # The client&#x27;s unique identifier of the job, re-used across retried attempts.
# If this field is set, the service will ensure its uniqueness.
# The request to create a job will fail if the service has knowledge of a
# previously submitted job with the same client&#x27;s ID and job name.
# The caller may use this field to ensure idempotence of job
# creation across retried attempts to create a job.
# By default, the field is empty and, in that case, the service ignores it.
&quot;executionInfo&quot;: { # Deprecated. Additional information about how a Cloud Dataflow job will be executed that
# isn&#x27;t contained in the submitted job.
&quot;stages&quot;: { # A mapping from each stage to the information about that stage.
&quot;a_key&quot;: { # Contains information about how a particular
# google.dataflow.v1beta3.Step will be executed.
&quot;stepName&quot;: [ # The steps associated with the execution stage.
# Note that stages may have several steps, and that a given step
# might be run by more than one stage.
&quot;A String&quot;,
],
},
},
},
&quot;type&quot;: &quot;A String&quot;, # The type of Cloud Dataflow job.
&quot;createTime&quot;: &quot;A String&quot;, # The timestamp when the job was initially created. Immutable and set by the
# Cloud Dataflow service.
&quot;tempFiles&quot;: [ # A set of files the system should be aware of that are used
# for temporary storage. These temporary files will be
# removed on job completion.
# No duplicates are allowed.
# No file patterns are supported.
#
# The supported files are:
#
# Google Cloud Storage:
#
# storage.googleapis.com/{bucket}/{object}
# bucket.storage.googleapis.com/{object}
&quot;A String&quot;,
],
&quot;id&quot;: &quot;A String&quot;, # The unique ID of this job.
#
# This field is set by the Cloud Dataflow service when the Job is
# created, and is immutable for the life of the job.
&quot;requestedState&quot;: &quot;A String&quot;, # The job&#x27;s requested state.
#
# `UpdateJob` may be used to switch between the `JOB_STATE_STOPPED` and
# `JOB_STATE_RUNNING` states, by setting requested_state. `UpdateJob` may
# also be used to directly set a job&#x27;s requested state to
# `JOB_STATE_CANCELLED` or `JOB_STATE_DONE`, irrevocably terminating the
# job if it has not already reached a terminal state.
&quot;replaceJobId&quot;: &quot;A String&quot;, # If this job is an update of an existing job, this field is the job ID
# of the job it replaced.
#
# When sending a `CreateJobRequest`, you can update a job by specifying it
# here. The job named here is stopped, and its intermediate state is
# transferred to this job.
&quot;createdFromSnapshotId&quot;: &quot;A String&quot;, # If this is specified, the job&#x27;s initial state is populated from the given
# snapshot.
&quot;currentState&quot;: &quot;A String&quot;, # The current state of the job.
#
# Jobs are created in the `JOB_STATE_STOPPED` state unless otherwise
# specified.
#
# A job in the `JOB_STATE_RUNNING` state may asynchronously enter a
# terminal state. After a job has reached a terminal state, no
# further state updates may be made.
#
# This field may be mutated by the Cloud Dataflow service;
# callers cannot mutate it.
&quot;name&quot;: &quot;A String&quot;, # The user-specified Cloud Dataflow job name.
#
# Only one Job with a given name may exist in a project at any
# given time. If a caller attempts to create a Job with the same
# name as an already-existing Job, the attempt returns the
# existing Job.
#
# The name must match the regular expression
# `[a-z]([-a-z0-9]{0,38}[a-z0-9])?`
&quot;currentStateTime&quot;: &quot;A String&quot;, # The timestamp associated with the current state.
}</pre>
</div>
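<p>The <code>name</code> field of the Job resource above gives a regular expression that job names must match. A minimal sketch of a client-side check follows, assuming the whole name has to match the pattern. The helper <code>is_valid_job_name</code> is a hypothetical name used for illustration and is not part of the Dataflow API.</p>
<pre>
```python
import re

# Pattern quoted from the "name" field description of the Job resource.
JOB_NAME_PATTERN = re.compile(r"[a-z]([-a-z0-9]{0,38}[a-z0-9])?")


def is_valid_job_name(name):
    """Return True if `name` matches the documented job-name pattern.

    Assumption: the entire string must match, so fullmatch is used.
    """
    return JOB_NAME_PATTERN.fullmatch(name) is not None


if __name__ == "__main__":
    # Illustrative candidates only; none of these come from the API docs.
    for candidate in ("wordcount-2024", "WordCount", "etl-"):
        print(candidate, is_valid_job_name(candidate))
```
</pre>
<p>Under that assumption the pattern accepts names of 1 to 40 characters that start with a lowercase letter and do not end with a hyphen.</p>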
<div class="method">
<code class="details" id="getMetrics">getMetrics(projectId, location, jobId, startTime=None, x__xgafv=None)</code>
<pre>Request the job status.
To request the status of a job, we recommend using
`projects.locations.jobs.getMetrics` with a [regional endpoint]
(https://cloud.google.com/dataflow/docs/concepts/regional-endpoints). Using
`projects.jobs.getMetrics` is not recommended, as you can only request the
status of jobs that are running in `us-central1`.
Args:
projectId: string, A project id. (required)
location: string, The [regional endpoint]
(https://cloud.google.com/dataflow/docs/concepts/regional-endpoints) that
contains the job specified by job_id. (required)
jobId: string, The job to get metrics for. (required)
startTime: string, Return only metric data that has changed since this time.
Default is to return all information about all metrics for the job.
x__xgafv: string, V1 error format.
Allowed values
1 - v1 error format
2 - v2 error format
Returns:
An object of the form:
{ # JobMetrics contains a collection of metrics describing the detailed progress
# of a Dataflow job. Metrics correspond to user-defined and system-defined
# metrics in the job.
#
# This resource captures only the most recent values of each metric;
# time-series data can be queried for them (under the same metric names)
# from Cloud Monitoring.
&quot;metricTime&quot;: &quot;A String&quot;, # Timestamp as of which metric values are current.
&quot;metrics&quot;: [ # All metrics for this job.
{ # Describes the state of a metric.
&quot;distribution&quot;: &quot;&quot;, # A struct value describing properties of a distribution of numeric values.
&quot;kind&quot;: &quot;A String&quot;, # Metric aggregation kind. The possible metric aggregation kinds are
# &quot;Sum&quot;, &quot;Max&quot;, &quot;Min&quot;, &quot;Mean&quot;, &quot;Set&quot;, &quot;And&quot;, &quot;Or&quot;, and &quot;Distribution&quot;.
# The specified aggregation kind is case-insensitive.
#
# If omitted, this is not an aggregated value but instead
# a single metric sample value.
&quot;gauge&quot;: &quot;&quot;, # A struct value describing properties of a Gauge.
# Metrics of gauge type show the value of a metric across time, and are
# aggregated based on the newest value.
&quot;updateTime&quot;: &quot;A String&quot;, # Timestamp associated with the metric value. Optional when workers are
# reporting work progress; it will be filled in responses from the
# metrics API.
&quot;scalar&quot;: &quot;&quot;, # Worker-computed aggregate value for aggregation kinds &quot;Sum&quot;, &quot;Max&quot;, &quot;Min&quot;,
# &quot;And&quot;, and &quot;Or&quot;. The possible value types are Long, Double, and Boolean.
&quot;cumulative&quot;: True or False, # True if this metric is reported as the total cumulative aggregate
# value accumulated since the worker started working on this WorkItem.
# By default this is false, indicating that this metric is reported
# as a delta that is not associated with any WorkItem.
&quot;name&quot;: { # Name of the metric. Identifies a metric, by describing the source which generated the
# metric.
&quot;context&quot;: { # Zero or more labeled fields which identify the part of the job this
# metric is associated with, such as the name of a step or collection.
#
# For example, built-in counters associated with steps will have
# context[&#x27;step&#x27;] = &lt;step-name&gt;. Counters associated with PCollections
# in the SDK will have context[&#x27;pcollection&#x27;] = &lt;pcollection-name&gt;.
&quot;a_key&quot;: &quot;A String&quot;,
},
&quot;name&quot;: &quot;A String&quot;, # Worker-defined metric name.
&quot;origin&quot;: &quot;A String&quot;, # Origin (namespace) of metric name. May be blank for user-defined metrics;
# will be &quot;dataflow&quot; for metrics defined by the Dataflow service or SDK.
},
&quot;meanCount&quot;: &quot;&quot;, # Worker-computed aggregate value for the &quot;Mean&quot; aggregation kind.
# This holds the count of the aggregated values and is used in combination
# with mean_sum to obtain the actual mean aggregate value.
# The only possible value type is Long.
&quot;meanSum&quot;: &quot;&quot;, # Worker-computed aggregate value for the &quot;Mean&quot; aggregation kind.
# This holds the sum of the aggregated values and is used in combination
# with mean_count to obtain the actual mean aggregate value.
# The only possible value types are Long and Double.
&quot;set&quot;: &quot;&quot;, # Worker-computed aggregate value for the &quot;Set&quot; aggregation kind. The only
# possible value type is a list of Values whose type can be Long, Double,
# or String, according to the metric&#x27;s type. All Values in the list must
# be of the same type.
&quot;internal&quot;: &quot;&quot;, # Worker-computed aggregate value for internal use by the Dataflow
# service.
},
],
}</pre>
</div>
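<p>The getMetrics response above reports a &quot;Mean&quot; aggregation as two separate fields, <code>meanSum</code> and <code>meanCount</code>. The sketch below shows one way to issue the call and combine the two fields. It assumes <code>service</code> is a Dataflow v1b3 client object (for example one built with <code>googleapiclient.discovery.build</code>); the helper names <code>fetch_job_metrics</code> and <code>mean_value</code> are hypothetical and not part of the API.</p>
<pre>
```python
def fetch_job_metrics(service, project_id, location, job_id, start_time=None):
    """Call projects.locations.jobs.getMetrics and return the response dict.

    Assumption: `service` exposes the method chain documented on this page,
    e.g. a client from googleapiclient.discovery.build("dataflow", "v1b3").
    """
    kwargs = {"projectId": project_id, "location": location, "jobId": job_id}
    if start_time is not None:
        # Only metric data changed since this time is returned.
        kwargs["startTime"] = start_time
    request = service.projects().locations().jobs().getMetrics(**kwargs)
    return request.execute()


def mean_value(metric):
    """Return meanSum / meanCount for a "Mean" metric, otherwise None."""
    # The aggregation kind is documented as case-insensitive.
    if str(metric.get("kind", "")).lower() != "mean":
        return None
    total = metric.get("meanSum")
    count = metric.get("meanCount")
    if total is None or count is None:
        return None
    count = float(count)
    if count == 0:
        return None
    return float(total) / count
```
</pre>
<p>A caller could apply <code>mean_value</code> to each entry of the response&#x27;s <code>metrics</code> list and skip entries for which it returns <code>None</code>.</p>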
<div class="method">
<code class="details" id="list">list(projectId, location, pageToken=None, view=None, pageSize=None, filter=None, x__xgafv=None)</code>
<pre>List the jobs of a project.
To list the jobs of a project in a region, we recommend using
`projects.locations.jobs.list` with a [regional endpoint]
(https://cloud.google.com/dataflow/docs/concepts/regional-endpoints). To
list all jobs across all regions, use `projects.jobs.aggregated`. Using
`projects.jobs.list` is not recommended, as you can only get the list of
jobs that are running in `us-central1`.
Args:
projectId: string, The project which owns the jobs. (required)
location: string, The [regional endpoint]
(https://cloud.google.com/dataflow/docs/concepts/regional-endpoints) that
contains this job. (required)
pageToken: string, Set this to the &#x27;next_page_token&#x27; field of a previous response
to request additional results in a long list.
view: string, Level of information requested in response. Default is `JOB_VIEW_SUMMARY`.
pageSize: integer, If there are many jobs, limit response to at most this many.
The actual number of jobs returned will be the lesser of pageSize
and an unspecified server-defined limit.
filter: string, The kind of filter to use.
x__xgafv: string, V1 error format.
Allowed values
1 - v1 error format
2 - v2 error format
Returns:
An object of the form:
{ # Response to a request to list Cloud Dataflow jobs in a project. This might
# be a partial response, depending on the page size in the ListJobsRequest.
# However, if the project does not have any jobs, an instance of
# ListJobsResponse is not returned and the request&#x27;s response
# body is empty {}.
&quot;jobs&quot;: [ # A subset of the requested job information.
{ # Defines a job to be run by the Cloud Dataflow service.
&quot;pipelineDescription&quot;: { # A descriptive representation of submitted pipeline as well as the executed
# form. This data is provided by the Dataflow service for ease of visualizing
# the pipeline and interpreting Dataflow provided metrics.
#
# Preliminary field: The format of this data may change at any time.
# A description of the user pipeline and stages through which it is executed.
# Created by Cloud Dataflow service. Only retrieved with
# JOB_VIEW_DESCRIPTION or JOB_VIEW_ALL.
&quot;displayData&quot;: [ # Pipeline level display data.
{ # Data provided with a pipeline or transform to provide descriptive info.
&quot;url&quot;: &quot;A String&quot;, # An optional full URL.
&quot;javaClassValue&quot;: &quot;A String&quot;, # Contains value if the data is of java class type.
&quot;timestampValue&quot;: &quot;A String&quot;, # Contains value if the data is of timestamp type.
&quot;durationValue&quot;: &quot;A String&quot;, # Contains value if the data is of duration type.
&quot;label&quot;: &quot;A String&quot;, # An optional label to display in a dax UI for the element.
&quot;key&quot;: &quot;A String&quot;, # The key identifying the display data.
# This is intended to be used as a label for the display data
# when viewed in a dax monitoring system.
&quot;namespace&quot;: &quot;A String&quot;, # The namespace for the key. This is usually a class name or programming
# language namespace (e.g. Python module) which defines the display data.
# This allows a dax monitoring system to specially handle the data
# and perform custom rendering.
&quot;floatValue&quot;: 3.14, # Contains value if the data is of float type.
&quot;strValue&quot;: &quot;A String&quot;, # Contains value if the data is of string type.
&quot;int64Value&quot;: &quot;A String&quot;, # Contains value if the data is of int64 type.
&quot;boolValue&quot;: True or False, # Contains value if the data is of a boolean type.
&quot;shortStrValue&quot;: &quot;A String&quot;, # A possible additional shorter value to display.
# For example a java_class_name_value of com.mypackage.MyDoFn
# will be stored with MyDoFn as the short_str_value and
# com.mypackage.MyDoFn as the java_class_name value.
# short_str_value can be displayed and java_class_name_value
# will be displayed as a tooltip.
},
],
&quot;originalPipelineTransform&quot;: [ # Description of each transform in the pipeline and collections between them.
{ # Description of the type, names/ids, and input/outputs for a transform.
&quot;outputCollectionName&quot;: [ # User names for all collection outputs to this transform.
&quot;A String&quot;,
],
&quot;displayData&quot;: [ # Transform-specific display data.
{ # Data provided with a pipeline or transform to provide descriptive info.
&quot;url&quot;: &quot;A String&quot;, # An optional full URL.
&quot;javaClassValue&quot;: &quot;A String&quot;, # Contains value if the data is of java class type.
&quot;timestampValue&quot;: &quot;A String&quot;, # Contains value if the data is of timestamp type.
&quot;durationValue&quot;: &quot;A String&quot;, # Contains value if the data is of duration type.
&quot;label&quot;: &quot;A String&quot;, # An optional label to display in a dax UI for the element.
&quot;key&quot;: &quot;A String&quot;, # The key identifying the display data.
# This is intended to be used as a label for the display data
# when viewed in a dax monitoring system.
&quot;namespace&quot;: &quot;A String&quot;, # The namespace for the key. This is usually a class name or programming
# language namespace (e.g. Python module) which defines the display data.
# This allows a dax monitoring system to specially handle the data
# and perform custom rendering.
&quot;floatValue&quot;: 3.14, # Contains value if the data is of float type.
&quot;strValue&quot;: &quot;A String&quot;, # Contains value if the data is of string type.
&quot;int64Value&quot;: &quot;A String&quot;, # Contains value if the data is of int64 type.
&quot;boolValue&quot;: True or False, # Contains value if the data is of a boolean type.
&quot;shortStrValue&quot;: &quot;A String&quot;, # A possible additional shorter value to display.
# For example a java_class_name_value of com.mypackage.MyDoFn
# will be stored with MyDoFn as the short_str_value and
# com.mypackage.MyDoFn as the java_class_name value.
# short_str_value can be displayed and java_class_name_value
# will be displayed as a tooltip.
},
],
&quot;id&quot;: &quot;A String&quot;, # SDK generated id of this transform instance.
&quot;inputCollectionName&quot;: [ # User names for all collection inputs to this transform.
&quot;A String&quot;,
],
&quot;name&quot;: &quot;A String&quot;, # User provided name for this transform instance.
&quot;kind&quot;: &quot;A String&quot;, # Type of transform.
},
],
&quot;executionPipelineStage&quot;: [ # Description of each stage of execution of the pipeline.
{ # Description of the composing transforms, names/ids, and input/outputs of a
# stage of execution. Some composing transforms and sources may have been
# generated by the Dataflow service during execution planning.
&quot;componentSource&quot;: [ # Collections produced and consumed by component transforms of this stage.
{ # Description of an interstitial value between transforms in an execution
# stage.
&quot;userName&quot;: &quot;A String&quot;, # Human-readable name for this transform; may be user or system generated.
&quot;name&quot;: &quot;A String&quot;, # Dataflow service generated name for this source.
&quot;originalTransformOrCollection&quot;: &quot;A String&quot;, # User name for the original user transform or collection with which this
# source is most closely associated.
},
],
&quot;inputSource&quot;: [ # Input sources for this stage.
{ # Description of an input or output of an execution stage.
&quot;userName&quot;: &quot;A String&quot;, # Human-readable name for this source; may be user or system generated.
&quot;originalTransformOrCollection&quot;: &quot;A String&quot;, # User name for the original user transform or collection with which this
# source is most closely associated.
&quot;sizeBytes&quot;: &quot;A String&quot;, # Size of the source, if measurable.
&quot;name&quot;: &quot;A String&quot;, # Dataflow service generated name for this source.
},
],
&quot;name&quot;: &quot;A String&quot;, # Dataflow service generated name for this stage.
&quot;componentTransform&quot;: [ # Transforms that comprise this execution stage.
{ # Description of a transform executed as part of an execution stage.
&quot;name&quot;: &quot;A String&quot;, # Dataflow service generated name for this source.
&quot;userName&quot;: &quot;A String&quot;, # Human-readable name for this transform; may be user or system generated.
&quot;originalTransform&quot;: &quot;A String&quot;, # User name for the original user transform with which this transform is
# most closely associated.
},
],
&quot;id&quot;: &quot;A String&quot;, # Dataflow service generated id for this stage.
&quot;outputSource&quot;: [ # Output sources for this stage.
{ # Description of an input or output of an execution stage.
&quot;userName&quot;: &quot;A String&quot;, # Human-readable name for this source; may be user or system generated.
&quot;originalTransformOrCollection&quot;: &quot;A String&quot;, # User name for the original user transform or collection with which this
# source is most closely associated.
&quot;sizeBytes&quot;: &quot;A String&quot;, # Size of the source, if measurable.
&quot;name&quot;: &quot;A String&quot;, # Dataflow service generated name for this source.
},
],
&quot;kind&quot;: &quot;A String&quot;, # Type of transform this stage is executing.
},
],
},
&quot;labels&quot;: { # User-defined labels for this job.
#
# The labels map can contain no more than 64 entries. Entries of the labels
# map are UTF8 strings that comply with the following restrictions:
#
# * Keys must conform to regexp: \p{Ll}\p{Lo}{0,62}
# * Values must conform to regexp: [\p{Ll}\p{Lo}\p{N}_-]{0,63}
# * Both keys and values are additionally constrained to be &lt;= 128 bytes in
# size.
&quot;a_key&quot;: &quot;A String&quot;,
},
&quot;projectId&quot;: &quot;A String&quot;, # The ID of the Cloud Platform project that the job belongs to.
&quot;environment&quot;: { # Describes the environment in which a Dataflow Job runs. # The environment for the job.
&quot;flexResourceSchedulingGoal&quot;: &quot;A String&quot;, # Which Flexible Resource Scheduling mode to run in.
&quot;workerRegion&quot;: &quot;A String&quot;, # The Compute Engine region
# (https://cloud.google.com/compute/docs/regions-zones/regions-zones) in
# which worker processing should occur, e.g. &quot;us-west1&quot;. Mutually exclusive
# with worker_zone. If neither worker_region nor worker_zone is specified,
# default to the control plane&#x27;s region.
&quot;userAgent&quot;: { # A description of the process that generated the request.
&quot;a_key&quot;: &quot;&quot;, # Properties of the object.
},
&quot;serviceAccountEmail&quot;: &quot;A String&quot;, # Identity to run virtual machines as. Defaults to the default account.
&quot;version&quot;: { # A structure describing which components and their versions of the service
# are required in order to run the job.
&quot;a_key&quot;: &quot;&quot;, # Properties of the object.
},
&quot;serviceKmsKeyName&quot;: &quot;A String&quot;, # If set, contains the Cloud KMS key identifier used to encrypt data
# at rest, AKA a Customer Managed Encryption Key (CMEK).
#
# Format:
# projects/PROJECT_ID/locations/LOCATION/keyRings/KEY_RING/cryptoKeys/KEY
&quot;experiments&quot;: [ # The list of experiments to enable.
&quot;A String&quot;,
],
&quot;workerZone&quot;: &quot;A String&quot;, # The Compute Engine zone
# (https://cloud.google.com/compute/docs/regions-zones/regions-zones) in
# which worker processing should occur, e.g. &quot;us-west1-a&quot;. Mutually exclusive
# with worker_region. If neither worker_region nor worker_zone is specified,
# a zone in the control plane&#x27;s region is chosen based on available capacity.
&quot;workerPools&quot;: [ # The worker pools. At least one &quot;harness&quot; worker pool must be
# specified in order for the job to have workers.
{ # Describes one particular pool of Cloud Dataflow workers to be
# instantiated by the Cloud Dataflow service in order to perform the
# computations required by a job. Note that a workflow job may use
# multiple pools, in order to match the various computational
# requirements of the various stages of the job.
&quot;onHostMaintenance&quot;: &quot;A String&quot;, # The action to take on host maintenance, as defined by the Google
# Compute Engine API.
&quot;sdkHarnessContainerImages&quot;: [ # Set of SDK harness containers needed to execute this pipeline. This will
# only be set in the Fn API path. For non-cross-language pipelines this
# should have only one entry. Cross-language pipelines will have two or more
# entries.
{ # Defines a SDK harness container for executing Dataflow pipelines.
&quot;containerImage&quot;: &quot;A String&quot;, # A docker container image that resides in Google Container Registry.
&quot;useSingleCorePerContainer&quot;: True or False, # If true, recommends that the Dataflow service use only one core per SDK
# container instance with this image. If false (or unset) recommends using
# more than one core per SDK container instance with this image for
# efficiency. Note that Dataflow service may choose to override this property
# if needed.
},
],
&quot;zone&quot;: &quot;A String&quot;, # Zone to run the worker pools in. If empty or unspecified, the service
# will attempt to choose a reasonable default.
&quot;kind&quot;: &quot;A String&quot;, # The kind of the worker pool; currently only `harness` and `shuffle`
# are supported.
&quot;metadata&quot;: { # Metadata to set on the Google Compute Engine VMs.
&quot;a_key&quot;: &quot;A String&quot;,
},
&quot;diskSourceImage&quot;: &quot;A String&quot;, # Fully qualified source image for disks.
&quot;dataDisks&quot;: [ # Data disks that are used by a VM in this workflow.
{ # Describes the data disk used by a workflow job.
&quot;sizeGb&quot;: 42, # Size of disk in GB. If zero or unspecified, the service will
# attempt to choose a reasonable default.
&quot;diskType&quot;: &quot;A String&quot;, # Disk storage type, as defined by Google Compute Engine. This
# must be a disk type appropriate to the project and zone in which
# the workers will run. If unknown or unspecified, the service
# will attempt to choose a reasonable default.
#
# For example, the standard persistent disk type is a resource name
# typically ending in &quot;pd-standard&quot;. If SSD persistent disks are
# available, the resource name typically ends with &quot;pd-ssd&quot;. The
# actual valid values are defined by the Google Compute Engine API,
# not by the Cloud Dataflow API; consult the Google Compute Engine
# documentation for more information about determining the set of
# available disk types for a particular project and zone.
#
# Google Compute Engine Disk types are local to a particular
# project in a particular zone, and so the resource name will
# typically look something like this:
#
# compute.googleapis.com/projects/project-id/zones/zone/diskTypes/pd-standard
&quot;mountPoint&quot;: &quot;A String&quot;, # Directory in a VM where disk is mounted.
},
],
&quot;packages&quot;: [ # Packages to be installed on workers.
{ # The packages that must be installed in order for a worker to run the
# steps of the Cloud Dataflow job that will be assigned to its worker
# pool.
#
# This is the mechanism by which the Cloud Dataflow SDK causes code to
# be loaded onto the workers. For example, the Cloud Dataflow Java SDK
# might use this to install jars containing the user&#x27;s code and all of the
# various dependencies (libraries, data files, etc.) required in order
# for that code to run.
&quot;name&quot;: &quot;A String&quot;, # The name of the package.
&quot;location&quot;: &quot;A String&quot;, # The resource to read the package from. The supported resource type is:
#
# Google Cloud Storage:
#
# storage.googleapis.com/{bucket}
# bucket.storage.googleapis.com/
},
],
&quot;teardownPolicy&quot;: &quot;A String&quot;, # Sets the policy for determining when to tear down the worker pool.
# Allowed values are: `TEARDOWN_ALWAYS`, `TEARDOWN_ON_SUCCESS`, and
# `TEARDOWN_NEVER`.
# `TEARDOWN_ALWAYS` means workers are always torn down regardless of whether
# the job succeeds. `TEARDOWN_ON_SUCCESS` means workers are torn down
# if the job succeeds. `TEARDOWN_NEVER` means the workers are never torn
# down.
#
# If the workers are not torn down by the service, they will
# continue to run and use Google Compute Engine VM resources in the
# user&#x27;s project until they are explicitly terminated by the user.
# Because of this, Google recommends using the `TEARDOWN_ALWAYS`
# policy except for small, manually supervised test jobs.
#
# If unknown or unspecified, the service will attempt to choose a reasonable
# default.
&quot;network&quot;: &quot;A String&quot;, # Network to which VMs will be assigned. If empty or unspecified,
# the service will use the network &quot;default&quot;.
&quot;ipConfiguration&quot;: &quot;A String&quot;, # Configuration for VM IPs.
&quot;diskSizeGb&quot;: 42, # Size of root disk for VMs, in GB. If zero or unspecified, the service will
# attempt to choose a reasonable default.
&quot;autoscalingSettings&quot;: { # Settings for WorkerPool autoscaling. # Settings for autoscaling of this WorkerPool.
&quot;maxNumWorkers&quot;: 42, # The maximum number of workers to cap scaling at.
&quot;algorithm&quot;: &quot;A String&quot;, # The algorithm to use for autoscaling.
},
&quot;poolArgs&quot;: { # Extra arguments for this worker pool.
&quot;a_key&quot;: &quot;&quot;, # Properties of the object. Contains field @type with type URL.
},
&quot;subnetwork&quot;: &quot;A String&quot;, # Subnetwork to which VMs will be assigned, if desired. Expected to be of
# the form &quot;regions/REGION/subnetworks/SUBNETWORK&quot;.
&quot;numWorkers&quot;: 42, # Number of Google Compute Engine workers in this pool needed to
# execute the job. If zero or unspecified, the service will
# attempt to choose a reasonable default.
&quot;numThreadsPerWorker&quot;: 42, # The number of threads per worker harness. If empty or unspecified, the
# service will choose a number of threads (according to the number of cores
# on the selected machine type for batch, or 1 by convention for streaming).
&quot;workerHarnessContainerImage&quot;: &quot;A String&quot;, # Required. Docker container image that executes the Cloud Dataflow worker
# harness, residing in Google Container Registry.
#
# Deprecated for the Fn API path. Use sdk_harness_container_images instead.
&quot;taskrunnerSettings&quot;: { # Taskrunner configuration settings. # Settings passed through to Google Compute Engine workers when
# using the standard Dataflow task runner. Users should ignore
# this field.
&quot;dataflowApiVersion&quot;: &quot;A String&quot;, # The API version of endpoint, e.g. &quot;v1b3&quot;
&quot;oauthScopes&quot;: [ # The OAuth2 scopes to be requested by the taskrunner in order to
# access the Cloud Dataflow API.
&quot;A String&quot;,
],
&quot;baseUrl&quot;: &quot;A String&quot;, # The base URL for the taskrunner to use when accessing Google Cloud APIs.
#
# When workers access Google Cloud APIs, they logically do so via
# relative URLs. If this field is specified, it supplies the base
# URL to use for resolving these relative URLs. The normative
# algorithm used is defined by RFC 1808, &quot;Relative Uniform Resource
# Locators&quot;.
#
# If not specified, the default value is &quot;http://www.googleapis.com/&quot;
&quot;workflowFileName&quot;: &quot;A String&quot;, # The file to store the workflow in.
&quot;logToSerialconsole&quot;: True or False, # Whether to send taskrunner log info to Google Compute Engine VM serial
# console.
&quot;baseTaskDir&quot;: &quot;A String&quot;, # The location on the worker for task-specific subdirectories.
&quot;taskUser&quot;: &quot;A String&quot;, # The UNIX user ID on the worker VM to use for tasks launched by
# taskrunner; e.g. &quot;root&quot;.
&quot;vmId&quot;: &quot;A String&quot;, # The ID string of the VM.
&quot;alsologtostderr&quot;: True or False, # Whether to also send taskrunner log info to stderr.
&quot;parallelWorkerSettings&quot;: { # Provides data to pass through to the worker harness. # The settings to pass to the parallel worker harness.
&quot;shuffleServicePath&quot;: &quot;A String&quot;, # The Shuffle service path relative to the root URL, for example,
# &quot;shuffle/v1beta1&quot;.
&quot;tempStoragePrefix&quot;: &quot;A String&quot;, # The prefix of the resources the system should use for temporary
# storage.
#
# The supported resource type is:
#
# Google Cloud Storage:
#
# storage.googleapis.com/{bucket}/{object}
# bucket.storage.googleapis.com/{object}
&quot;reportingEnabled&quot;: True or False, # Whether to send work progress updates to the service.
&quot;servicePath&quot;: &quot;A String&quot;, # The Cloud Dataflow service path relative to the root URL, for example,
# &quot;dataflow/v1b3/projects&quot;.
&quot;baseUrl&quot;: &quot;A String&quot;, # The base URL for accessing Google Cloud APIs.
#
# When workers access Google Cloud APIs, they logically do so via
# relative URLs. If this field is specified, it supplies the base
# URL to use for resolving these relative URLs. The normative
# algorithm used is defined by RFC 1808, &quot;Relative Uniform Resource
# Locators&quot;.
#
# If not specified, the default value is &quot;http://www.googleapis.com/&quot;
&quot;workerId&quot;: &quot;A String&quot;, # The ID of the worker running this pipeline.
},
&quot;harnessCommand&quot;: &quot;A String&quot;, # The command to launch the worker harness.
&quot;logDir&quot;: &quot;A String&quot;, # The directory on the VM to store logs.
&quot;streamingWorkerMainClass&quot;: &quot;A String&quot;, # The streaming worker main class name.
&quot;languageHint&quot;: &quot;A String&quot;, # The suggested backend language.
&quot;taskGroup&quot;: &quot;A String&quot;, # The UNIX group ID on the worker VM to use for tasks launched by
# taskrunner; e.g. &quot;wheel&quot;.
&quot;logUploadLocation&quot;: &quot;A String&quot;, # Indicates where to put logs. If this is not specified, the logs
# will not be uploaded.
#
# The supported resource type is:
#
# Google Cloud Storage:
# storage.googleapis.com/{bucket}/{object}
# bucket.storage.googleapis.com/{object}
&quot;commandlinesFileName&quot;: &quot;A String&quot;, # The file to store preprocessing commands in.
&quot;continueOnException&quot;: True or False, # Whether to continue taskrunner if an exception is hit.
&quot;tempStoragePrefix&quot;: &quot;A String&quot;, # The prefix of the resources the taskrunner should use for
# temporary storage.
#
# The supported resource type is:
#
# Google Cloud Storage:
# storage.googleapis.com/{bucket}/{object}
# bucket.storage.googleapis.com/{object}
},
&quot;diskType&quot;: &quot;A String&quot;, # Type of root disk for VMs. If empty or unspecified, the service will
# attempt to choose a reasonable default.
&quot;defaultPackageSet&quot;: &quot;A String&quot;, # The default package set to install. This allows the service to
# select a default set of packages which are useful to worker
# harnesses written in a particular language.
&quot;machineType&quot;: &quot;A String&quot;, # Machine type (e.g. &quot;n1-standard-1&quot;). If empty or unspecified, the
# service will attempt to choose a reasonable default.
},
],
&quot;tempStoragePrefix&quot;: &quot;A String&quot;, # The prefix of the resources the system should use for temporary
# storage. The system will append the suffix &quot;/temp-{JOBNAME}&quot; to
# this resource prefix, where {JOBNAME} is the value of the
# job_name field. The resulting bucket and object prefix is used
# as the prefix of the resources used to store temporary data
# needed during the job execution. NOTE: This will override the
# value in taskrunner_settings.
# The supported resource type is:
#
# Google Cloud Storage:
#
# storage.googleapis.com/{bucket}/{object}
# bucket.storage.googleapis.com/{object}
&quot;internalExperiments&quot;: { # Experimental settings.
&quot;a_key&quot;: &quot;&quot;, # Properties of the object. Contains field @type with type URL.
},
&quot;sdkPipelineOptions&quot;: { # The Cloud Dataflow SDK pipeline options specified by the user. These
# options are passed through the service and are used to recreate the
# SDK pipeline options on the worker in a language agnostic and platform
# independent way.
&quot;a_key&quot;: &quot;&quot;, # Properties of the object.
},
&quot;dataset&quot;: &quot;A String&quot;, # The dataset for the current project where various workflow
# related tables are stored.
#
# The supported resource type is:
#
# Google BigQuery:
# bigquery.googleapis.com/{dataset}
&quot;clusterManagerApiService&quot;: &quot;A String&quot;, # The type of cluster manager API to use. If unknown or
# unspecified, the service will attempt to choose a reasonable
# default. This should be in the form of the API service name,
# e.g. &quot;compute.googleapis.com&quot;.
},
&quot;stepsLocation&quot;: &quot;A String&quot;, # The GCS location where the steps are stored.
&quot;steps&quot;: [ # Exactly one of steps or steps_location should be specified.
#
# The top-level steps that constitute the entire job.
{ # Defines a particular step within a Cloud Dataflow job.
#
# A job consists of multiple steps, each of which performs some
# specific operation as part of the overall job. Data is typically
# passed from one step to another as part of the job.
#
# Here&#x27;s an example of a sequence of steps which together implement a
# Map-Reduce job:
#
# * Read a collection of data from some source, parsing the
# collection&#x27;s elements.
#
# * Validate the elements.
#
# * Apply a user-defined function to map each element to some value
# and extract an element-specific key value.
#
# * Group elements with the same key into a single element with
# that key, transforming a multiply-keyed collection into a
# uniquely-keyed collection.
#
# * Write the elements out to some data sink.
#
# Note that the Cloud Dataflow service may be used to run many different
# types of jobs, not just Map-Reduce.
&quot;kind&quot;: &quot;A String&quot;, # The kind of step in the Cloud Dataflow job.
&quot;properties&quot;: { # Named properties associated with the step. Each kind of
# predefined step has its own required set of properties.
# Must be provided on Create. Only retrieved with JOB_VIEW_ALL.
&quot;a_key&quot;: &quot;&quot;, # Properties of the object.
},
&quot;name&quot;: &quot;A String&quot;, # The name that identifies the step. This must be unique for each
# step with respect to all other steps in the Cloud Dataflow job.
},
],
&quot;stageStates&quot;: [ # This field may be mutated by the Cloud Dataflow service;
# callers cannot mutate it.
{ # A message describing the state of a particular execution stage.
&quot;executionStageState&quot;: &quot;A String&quot;, # Execution stage states allow the same set of values as JobState.
&quot;executionStageName&quot;: &quot;A String&quot;, # The name of the execution stage.
&quot;currentStateTime&quot;: &quot;A String&quot;, # The time at which the stage transitioned to this state.
},
],
&quot;replacedByJobId&quot;: &quot;A String&quot;, # If another job is an update of this job (and thus, this job is in
# `JOB_STATE_UPDATED`), this field contains the ID of that job.
&quot;jobMetadata&quot;: { # Metadata available primarily for filtering jobs. Will be included in the
# ListJob response and Job SUMMARY view.
#
# This field is populated by the Dataflow service to support filtering jobs
# by the metadata values provided here. Populated for ListJobs and all GetJob
# views SUMMARY and higher.
&quot;sdkVersion&quot;: { # The version of the SDK used to run the job. # The SDK version used to run the job.
&quot;sdkSupportStatus&quot;: &quot;A String&quot;, # The support status for this SDK version.
&quot;versionDisplayName&quot;: &quot;A String&quot;, # A readable string describing the version of the SDK.
&quot;version&quot;: &quot;A String&quot;, # The version of the SDK used to run the job.
},
&quot;bigTableDetails&quot;: [ # Identification of a BigTable source used in the Dataflow job.
{ # Metadata for a BigTable connector used by the job.
&quot;instanceId&quot;: &quot;A String&quot;, # InstanceId accessed in the connection.
&quot;tableId&quot;: &quot;A String&quot;, # TableId accessed in the connection.
&quot;projectId&quot;: &quot;A String&quot;, # ProjectId accessed in the connection.
},
],
&quot;pubsubDetails&quot;: [ # Identification of a PubSub source used in the Dataflow job.
{ # Metadata for a PubSub connector used by the job.
&quot;subscription&quot;: &quot;A String&quot;, # Subscription used in the connection.
&quot;topic&quot;: &quot;A String&quot;, # Topic accessed in the connection.
},
],
&quot;bigqueryDetails&quot;: [ # Identification of a BigQuery source used in the Dataflow job.
{ # Metadata for a BigQuery connector used by the job.
&quot;dataset&quot;: &quot;A String&quot;, # Dataset accessed in the connection.
&quot;projectId&quot;: &quot;A String&quot;, # Project accessed in the connection.
&quot;query&quot;: &quot;A String&quot;, # Query used to access data in the connection.
&quot;table&quot;: &quot;A String&quot;, # Table accessed in the connection.
},
],
&quot;fileDetails&quot;: [ # Identification of a File source used in the Dataflow job.
{ # Metadata for a File connector used by the job.
&quot;filePattern&quot;: &quot;A String&quot;, # File Pattern used to access files by the connector.
},
],
&quot;datastoreDetails&quot;: [ # Identification of a Datastore source used in the Dataflow job.
{ # Metadata for a Datastore connector used by the job.
&quot;namespace&quot;: &quot;A String&quot;, # Namespace used in the connection.
&quot;projectId&quot;: &quot;A String&quot;, # ProjectId accessed in the connection.
},
],
&quot;spannerDetails&quot;: [ # Identification of a Spanner source used in the Dataflow job.
{ # Metadata for a Spanner connector used by the job.
&quot;instanceId&quot;: &quot;A String&quot;, # InstanceId accessed in the connection.
&quot;databaseId&quot;: &quot;A String&quot;, # DatabaseId accessed in the connection.
&quot;projectId&quot;: &quot;A String&quot;, # ProjectId accessed in the connection.
},
],
},
&quot;location&quot;: &quot;A String&quot;, # The [regional endpoint]
# (https://cloud.google.com/dataflow/docs/concepts/regional-endpoints) that
# contains this job.
&quot;transformNameMapping&quot;: { # The map of transform name prefixes of the job to be replaced to the
# corresponding name prefixes of the new job.
&quot;a_key&quot;: &quot;A String&quot;,
},
&quot;startTime&quot;: &quot;A String&quot;, # The timestamp when the job was started (transitioned to JOB_STATE_PENDING).
# Flexible resource scheduling jobs are started with some delay after job
# creation, so start_time is unset before start and is updated when the
# job is started by the Cloud Dataflow service. For other jobs, start_time
# always equals create_time and is immutable and set by the Cloud Dataflow
# service.
&quot;clientRequestId&quot;: &quot;A String&quot;, # The client&#x27;s unique identifier of the job, re-used across retried attempts.
# If this field is set, the service will ensure its uniqueness.
# The request to create a job will fail if the service has knowledge of a
# previously submitted job with the same client&#x27;s ID and job name.
# The caller may use this field to ensure idempotence of job
# creation across retried attempts to create a job.
# By default, the field is empty and, in that case, the service ignores it.
&quot;executionInfo&quot;: { # Deprecated. Additional information about how a Cloud Dataflow job will be
# executed that isn&#x27;t contained in the submitted job.
&quot;stages&quot;: { # A mapping from each stage to the information about that stage.
&quot;a_key&quot;: { # Contains information about how a particular
# google.dataflow.v1beta3.Step will be executed.
&quot;stepName&quot;: [ # The steps associated with the execution stage.
# Note that stages may have several steps, and that a given step
# might be run by more than one stage.
&quot;A String&quot;,
],
},
},
},
&quot;type&quot;: &quot;A String&quot;, # The type of Cloud Dataflow job.
&quot;createTime&quot;: &quot;A String&quot;, # The timestamp when the job was initially created. Immutable and set by the
# Cloud Dataflow service.
&quot;tempFiles&quot;: [ # A set of files the system should be aware of that are used
# for temporary storage. These temporary files will be
# removed on job completion.
# No duplicates are allowed.
# No file patterns are supported.
#
# The supported files are:
#
# Google Cloud Storage:
#
# storage.googleapis.com/{bucket}/{object}
# bucket.storage.googleapis.com/{object}
&quot;A String&quot;,
],
&quot;id&quot;: &quot;A String&quot;, # The unique ID of this job.
#
# This field is set by the Cloud Dataflow service when the Job is
# created, and is immutable for the life of the job.
&quot;requestedState&quot;: &quot;A String&quot;, # The job&#x27;s requested state.
#
# `UpdateJob` may be used to switch between the `JOB_STATE_STOPPED` and
# `JOB_STATE_RUNNING` states, by setting requested_state. `UpdateJob` may
# also be used to directly set a job&#x27;s requested state to
# `JOB_STATE_CANCELLED` or `JOB_STATE_DONE`, irrevocably terminating the
# job if it has not already reached a terminal state.
&quot;replaceJobId&quot;: &quot;A String&quot;, # If this job is an update of an existing job, this field is the job ID
# of the job it replaced.
#
# When sending a `CreateJobRequest`, you can update a job by specifying it
# here. The job named here is stopped, and its intermediate state is
# transferred to this job.
&quot;createdFromSnapshotId&quot;: &quot;A String&quot;, # If this is specified, the job&#x27;s initial state is populated from the given
# snapshot.
&quot;currentState&quot;: &quot;A String&quot;, # The current state of the job.
#
# Jobs are created in the `JOB_STATE_STOPPED` state unless otherwise
# specified.
#
# A job in the `JOB_STATE_RUNNING` state may asynchronously enter a
# terminal state. After a job has reached a terminal state, no
# further state updates may be made.
#
# This field may be mutated by the Cloud Dataflow service;
# callers cannot mutate it.
&quot;name&quot;: &quot;A String&quot;, # The user-specified Cloud Dataflow job name.
#
# Only one Job with a given name may exist in a project at any
# given time. If a caller attempts to create a Job with the same
# name as an already-existing Job, the attempt returns the
# existing Job.
#
# The name must match the regular expression
# `[a-z]([-a-z0-9]{0,38}[a-z0-9])?`
&quot;currentStateTime&quot;: &quot;A String&quot;, # The timestamp associated with the current state.
},
],
&quot;nextPageToken&quot;: &quot;A String&quot;, # Set if there may be more results than fit in this response.
&quot;failedLocation&quot;: [ # Zero or more messages describing the [regional endpoints]
# (https://cloud.google.com/dataflow/docs/concepts/regional-endpoints) that
# failed to respond.
{ # Indicates which [regional endpoint]
# (https://cloud.google.com/dataflow/docs/concepts/regional-endpoints) failed
# to respond to a request for data.
&quot;name&quot;: &quot;A String&quot;, # The name of the [regional endpoint]
# (https://cloud.google.com/dataflow/docs/concepts/regional-endpoints) that
# failed to respond.
},
],
}</pre>
</div>
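<p>The <code>name</code> field of a Job, as documented above, must match a fixed regular expression. A minimal sketch of a client-side check is shown here; the helper name <code>is_valid_job_name</code> is hypothetical and not part of the API, and the service remains the authority on which names it accepts.</p>

```python
import re

# Pattern copied from the "name" field description of the Job resource.
JOB_NAME_PATTERN = re.compile(r"[a-z]([-a-z0-9]{0,38}[a-z0-9])?")


def is_valid_job_name(name: str) -> bool:
    """Return True if the whole string matches the documented job-name pattern.

    Hypothetical helper for illustration; it only mirrors the documented
    regular expression and does not check uniqueness within a project.
    """
    return JOB_NAME_PATTERN.fullmatch(name) is not None
```

<p>Using <code>fullmatch</code> rather than <code>match</code> makes the check reject names with trailing characters outside the pattern, such as a name that ends in a hyphen.</p>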
<div class="method">
<code class="details" id="list_next">list_next(previous_request, previous_response)</code>
<pre>Retrieves the next page of results.
Args:
previous_request: The request for the previous page. (required)
previous_response: The response from the request for the previous page. (required)
Returns:
A request object that you can call &#x27;execute()&#x27; on to request the next
page. Returns None if there are no more items in the collection.
</pre>
</div>
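<p>The paging contract described above (call <code>list_next</code> until it returns <code>None</code>) can be sketched as a generator. This is an illustration under assumptions, not the client library's own helper: <code>iter_jobs</code> is a hypothetical name, <code>jobs_resource</code> is assumed to be the object returned by <code>service.projects().locations().jobs()</code>, and the response is assumed to carry its results under a <code>jobs</code> key.</p>

```python
from typing import Any, Dict, Iterator


def iter_jobs(jobs_resource: Any, project_id: str, location: str) -> Iterator[Dict[str, Any]]:
    """Yield every job across all result pages.

    Hypothetical helper: jobs_resource is assumed to expose list() and
    list_next() as documented for projects.locations.jobs.
    """
    request = jobs_resource.list(projectId=project_id, location=location)
    while request is not None:
        response = request.execute()
        # Assumption: jobs are returned under the "jobs" key; a page may omit it.
        for job in response.get("jobs", []):
            yield job
        # list_next returns None when there are no more pages.
        request = jobs_resource.list_next(
            previous_request=request, previous_response=response
        )
```

<p>A caller might also inspect <code>failedLocation</code> on each response, since a regional endpoint that failed to respond can leave the listing incomplete.</p>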
<div class="method">
<code class="details" id="snapshot">snapshot(projectId, location, jobId, body=None, x__xgafv=None)</code>
<pre>Snapshot the state of a streaming job.
Args:
projectId: string, The project which owns the job to be snapshotted. (required)
location: string, The location that contains this job. (required)
jobId: string, The job to be snapshotted. (required)
body: object, The request body.
The object takes the form of:
{ # Request to create a snapshot of a job.
&quot;snapshotSources&quot;: True or False, # If true, perform snapshots for sources which support this.
&quot;location&quot;: &quot;A String&quot;, # The location that contains this job.
&quot;description&quot;: &quot;A String&quot;, # User specified description of the snapshot. May be empty.
&quot;ttl&quot;: &quot;A String&quot;, # TTL for the snapshot.
}
x__xgafv: string, V1 error format.
Allowed values
1 - v1 error format
2 - v2 error format
Returns:
An object of the form:
{ # Represents a snapshot of a job.
&quot;ttl&quot;: &quot;A String&quot;, # The time after which this snapshot will be automatically deleted.
&quot;state&quot;: &quot;A String&quot;, # State of the snapshot.
&quot;id&quot;: &quot;A String&quot;, # The unique ID of this snapshot.
&quot;sourceJobId&quot;: &quot;A String&quot;, # The job this snapshot was created from.
&quot;creationTime&quot;: &quot;A String&quot;, # The time this snapshot was created.
&quot;description&quot;: &quot;A String&quot;, # User specified description of the snapshot. May be empty.
&quot;pubsubMetadata&quot;: [ # PubSub snapshot metadata.
{ # Represents a Pubsub snapshot.
&quot;snapshotName&quot;: &quot;A String&quot;, # The name of the Pubsub snapshot.
&quot;expireTime&quot;: &quot;A String&quot;, # The expire time of the Pubsub snapshot.
&quot;topicName&quot;: &quot;A String&quot;, # The name of the Pubsub topic.
},
],
&quot;projectId&quot;: &quot;A String&quot;, # The project this snapshot belongs to.
&quot;diskSizeBytes&quot;: &quot;A String&quot;, # The disk byte size of the snapshot. Only available for snapshots in READY
# state.
}</pre>
</div>
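<p>A snapshot request body uses the four fields listed above. The sketch below builds such a body and issues the call; both function names are hypothetical, <code>jobs_resource</code> is assumed to be the object returned by <code>service.projects().locations().jobs()</code>, and the <code>ttl</code> string is assumed to use the seconds-with-<code>s</code>-suffix duration encoding (for example <code>&quot;604800s&quot;</code>), which this page does not itself confirm.</p>

```python
from typing import Any, Dict


def build_snapshot_body(
    location: str,
    ttl_seconds: int,
    description: str = "",
    snapshot_sources: bool = False,
) -> Dict[str, Any]:
    """Build a request body for snapshot() from the documented fields.

    Hypothetical helper. Assumption: ttl is encoded as "<seconds>s".
    """
    return {
        "snapshotSources": snapshot_sources,
        "location": location,
        "description": description,
        "ttl": f"{ttl_seconds}s",
    }


def snapshot_job(
    jobs_resource: Any, project_id: str, location: str, job_id: str, body: Dict[str, Any]
) -> Dict[str, Any]:
    """Call snapshot() with the documented arguments and return the response."""
    request = jobs_resource.snapshot(
        projectId=project_id, location=location, jobId=job_id, body=body
    )
    return request.execute()
```

<p>The returned object carries the snapshot <code>id</code> and <code>state</code>; according to the field description, <code>diskSizeBytes</code> is only available once the snapshot is in the READY state.</p>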
<div class="method">
<code class="details" id="update">update(projectId, location, jobId, body=None, x__xgafv=None)</code>
<pre>Updates the state of an existing Cloud Dataflow job.
To update the state of an existing job, we recommend using
`projects.locations.jobs.update` with a [regional endpoint]
(https://cloud.google.com/dataflow/docs/concepts/regional-endpoints). Using
`projects.jobs.update` is not recommended, as you can only update the state
of jobs that are running in `us-central1`.
Args:
projectId: string, The ID of the Cloud Platform project that the job belongs to. (required)
location: string, The [regional endpoint]
(https://cloud.google.com/dataflow/docs/concepts/regional-endpoints) that
contains this job. (required)
jobId: string, The job ID. (required)
body: object, The request body.
The object takes the form of:
{ # Defines a job to be run by the Cloud Dataflow service.
&quot;pipelineDescription&quot;: { # A descriptive representation of submitted pipeline as well as the executed
# form. This data is provided by the Dataflow service for ease of visualizing
# the pipeline and interpreting Dataflow provided metrics.
#
# Preliminary field: The format of this data may change at any time.
# A description of the user pipeline and stages through which it is executed.
# Created by Cloud Dataflow service. Only retrieved with
# JOB_VIEW_DESCRIPTION or JOB_VIEW_ALL.
&quot;displayData&quot;: [ # Pipeline level display data.
{ # Data provided with a pipeline or transform to provide descriptive info.
&quot;url&quot;: &quot;A String&quot;, # An optional full URL.
&quot;javaClassValue&quot;: &quot;A String&quot;, # Contains value if the data is of java class type.
&quot;timestampValue&quot;: &quot;A String&quot;, # Contains value if the data is of timestamp type.
&quot;durationValue&quot;: &quot;A String&quot;, # Contains value if the data is of duration type.
&quot;label&quot;: &quot;A String&quot;, # An optional label to display in a dax UI for the element.
&quot;key&quot;: &quot;A String&quot;, # The key identifying the display data.
# This is intended to be used as a label for the display data
# when viewed in a dax monitoring system.
&quot;namespace&quot;: &quot;A String&quot;, # The namespace for the key. This is usually a class name or programming
# language namespace (e.g. a Python module) which defines the display data.
# This allows a dax monitoring system to specially handle the data
# and perform custom rendering.
&quot;floatValue&quot;: 3.14, # Contains value if the data is of float type.
&quot;strValue&quot;: &quot;A String&quot;, # Contains value if the data is of string type.
&quot;int64Value&quot;: &quot;A String&quot;, # Contains value if the data is of int64 type.
&quot;boolValue&quot;: True or False, # Contains value if the data is of a boolean type.
&quot;shortStrValue&quot;: &quot;A String&quot;, # A possible additional shorter value to display.
# For example a java_class_name_value of com.mypackage.MyDoFn
# will be stored with MyDoFn as the short_str_value and
# com.mypackage.MyDoFn as the java_class_name_value.
# short_str_value can be displayed and java_class_name_value
# will be displayed as a tooltip.
},
],
&quot;originalPipelineTransform&quot;: [ # Description of each transform in the pipeline and collections between them.
{ # Description of the type, names/ids, and input/outputs for a transform.
&quot;outputCollectionName&quot;: [ # User names for all collection outputs to this transform.
&quot;A String&quot;,
],
&quot;displayData&quot;: [ # Transform-specific display data.
{ # Data provided with a pipeline or transform to provide descriptive info.
&quot;url&quot;: &quot;A String&quot;, # An optional full URL.
&quot;javaClassValue&quot;: &quot;A String&quot;, # Contains value if the data is of java class type.
&quot;timestampValue&quot;: &quot;A String&quot;, # Contains value if the data is of timestamp type.
&quot;durationValue&quot;: &quot;A String&quot;, # Contains value if the data is of duration type.
&quot;label&quot;: &quot;A String&quot;, # An optional label to display in a dax UI for the element.
&quot;key&quot;: &quot;A String&quot;, # The key identifying the display data.
# This is intended to be used as a label for the display data
# when viewed in a dax monitoring system.
&quot;namespace&quot;: &quot;A String&quot;, # The namespace for the key. This is usually a class name or programming
# language namespace (e.g. a Python module) which defines the display data.
# This allows a dax monitoring system to specially handle the data
# and perform custom rendering.
&quot;floatValue&quot;: 3.14, # Contains value if the data is of float type.
&quot;strValue&quot;: &quot;A String&quot;, # Contains value if the data is of string type.
&quot;int64Value&quot;: &quot;A String&quot;, # Contains value if the data is of int64 type.
&quot;boolValue&quot;: True or False, # Contains value if the data is of a boolean type.
&quot;shortStrValue&quot;: &quot;A String&quot;, # A possible additional shorter value to display.
# For example a java_class_name_value of com.mypackage.MyDoFn
# will be stored with MyDoFn as the short_str_value and
# com.mypackage.MyDoFn as the java_class_name_value.
# short_str_value can be displayed and java_class_name_value
# will be displayed as a tooltip.
},
],
&quot;id&quot;: &quot;A String&quot;, # SDK generated id of this transform instance.
&quot;inputCollectionName&quot;: [ # User names for all collection inputs to this transform.
&quot;A String&quot;,
],
&quot;name&quot;: &quot;A String&quot;, # User provided name for this transform instance.
&quot;kind&quot;: &quot;A String&quot;, # Type of transform.
},
],
&quot;executionPipelineStage&quot;: [ # Description of each stage of execution of the pipeline.
{ # Description of the composing transforms, names/ids, and input/outputs of a
# stage of execution. Some composing transforms and sources may have been
# generated by the Dataflow service during execution planning.
&quot;componentSource&quot;: [ # Collections produced and consumed by component transforms of this stage.
{ # Description of an interstitial value between transforms in an execution
# stage.
&quot;userName&quot;: &quot;A String&quot;, # Human-readable name for this transform; may be user or system generated.
&quot;name&quot;: &quot;A String&quot;, # Dataflow service generated name for this source.
&quot;originalTransformOrCollection&quot;: &quot;A String&quot;, # User name for the original user transform or collection with which this
# source is most closely associated.
},
],
&quot;inputSource&quot;: [ # Input sources for this stage.
{ # Description of an input or output of an execution stage.
&quot;userName&quot;: &quot;A String&quot;, # Human-readable name for this source; may be user or system generated.
&quot;originalTransformOrCollection&quot;: &quot;A String&quot;, # User name for the original user transform or collection with which this
# source is most closely associated.
&quot;sizeBytes&quot;: &quot;A String&quot;, # Size of the source, if measurable.
&quot;name&quot;: &quot;A String&quot;, # Dataflow service generated name for this source.
},
],
&quot;name&quot;: &quot;A String&quot;, # Dataflow service generated name for this stage.
&quot;componentTransform&quot;: [ # Transforms that comprise this execution stage.
{ # Description of a transform executed as part of an execution stage.
&quot;name&quot;: &quot;A String&quot;, # Dataflow service generated name for this transform.
&quot;userName&quot;: &quot;A String&quot;, # Human-readable name for this transform; may be user or system generated.
&quot;originalTransform&quot;: &quot;A String&quot;, # User name for the original user transform with which this transform is
# most closely associated.
},
],
&quot;id&quot;: &quot;A String&quot;, # Dataflow service generated id for this stage.
&quot;outputSource&quot;: [ # Output sources for this stage.
{ # Description of an input or output of an execution stage.
&quot;userName&quot;: &quot;A String&quot;, # Human-readable name for this source; may be user or system generated.
&quot;originalTransformOrCollection&quot;: &quot;A String&quot;, # User name for the original user transform or collection with which this
# source is most closely associated.
&quot;sizeBytes&quot;: &quot;A String&quot;, # Size of the source, if measurable.
&quot;name&quot;: &quot;A String&quot;, # Dataflow service generated name for this source.
},
],
&quot;kind&quot;: &quot;A String&quot;, # Type of transform this stage is executing.
},
],
},
&quot;labels&quot;: { # User-defined labels for this job.
#
# The labels map can contain no more than 64 entries. Entries of the labels
# map are UTF8 strings that comply with the following restrictions:
#
# * Keys must conform to regexp: \p{Ll}\p{Lo}{0,62}
# * Values must conform to regexp: [\p{Ll}\p{Lo}\p{N}_-]{0,63}
# * Both keys and values are additionally constrained to be &lt;= 128 bytes in
# size.
&quot;a_key&quot;: &quot;A String&quot;,
},
&quot;projectId&quot;: &quot;A String&quot;, # The ID of the Cloud Platform project that the job belongs to.
&quot;environment&quot;: { # Describes the environment in which a Dataflow Job runs. # The environment for the job.
&quot;flexResourceSchedulingGoal&quot;: &quot;A String&quot;, # Which Flexible Resource Scheduling mode to run in.
&quot;workerRegion&quot;: &quot;A String&quot;, # The Compute Engine region
# (https://cloud.google.com/compute/docs/regions-zones/regions-zones) in
# which worker processing should occur, e.g. &quot;us-west1&quot;. Mutually exclusive
# with worker_zone. If neither worker_region nor worker_zone is specified,
# default to the control plane&#x27;s region.
&quot;userAgent&quot;: { # A description of the process that generated the request.
&quot;a_key&quot;: &quot;&quot;, # Properties of the object.
},
&quot;serviceAccountEmail&quot;: &quot;A String&quot;, # Identity to run virtual machines as. Defaults to the default account.
&quot;version&quot;: { # A structure describing which components and their versions of the service
# are required in order to run the job.
&quot;a_key&quot;: &quot;&quot;, # Properties of the object.
},
&quot;serviceKmsKeyName&quot;: &quot;A String&quot;, # If set, contains the Cloud KMS key identifier used to encrypt data
# at rest, AKA a Customer Managed Encryption Key (CMEK).
#
# Format:
# projects/PROJECT_ID/locations/LOCATION/keyRings/KEY_RING/cryptoKeys/KEY
&quot;experiments&quot;: [ # The list of experiments to enable.
&quot;A String&quot;,
],
&quot;workerZone&quot;: &quot;A String&quot;, # The Compute Engine zone
# (https://cloud.google.com/compute/docs/regions-zones/regions-zones) in
# which worker processing should occur, e.g. &quot;us-west1-a&quot;. Mutually exclusive
# with worker_region. If neither worker_region nor worker_zone is specified,
# a zone in the control plane&#x27;s region is chosen based on available capacity.
&quot;workerPools&quot;: [ # The worker pools. At least one &quot;harness&quot; worker pool must be
# specified in order for the job to have workers.
{ # Describes one particular pool of Cloud Dataflow workers to be
# instantiated by the Cloud Dataflow service in order to perform the
# computations required by a job. Note that a workflow job may use
# multiple pools, in order to match the various computational
# requirements of the various stages of the job.
&quot;onHostMaintenance&quot;: &quot;A String&quot;, # The action to take on host maintenance, as defined by the Google
# Compute Engine API.
&quot;sdkHarnessContainerImages&quot;: [ # Set of SDK harness containers needed to execute this pipeline. This will
# only be set in the Fn API path. For non-cross-language pipelines this
# should have only one entry. Cross-language pipelines will have two or more
# entries.
{ # Defines an SDK harness container for executing Dataflow pipelines.
&quot;containerImage&quot;: &quot;A String&quot;, # A Docker container image that resides in Google Container Registry.
&quot;useSingleCorePerContainer&quot;: True or False, # If true, recommends the Dataflow service to use only one core per SDK
# container instance with this image. If false (or unset) recommends using
# more than one core per SDK container instance with this image for
# efficiency. Note that Dataflow service may choose to override this property
# if needed.
},
],
&quot;zone&quot;: &quot;A String&quot;, # Zone to run the worker pools in. If empty or unspecified, the service
# will attempt to choose a reasonable default.
&quot;kind&quot;: &quot;A String&quot;, # The kind of the worker pool; currently only `harness` and `shuffle`
# are supported.
&quot;metadata&quot;: { # Metadata to set on the Google Compute Engine VMs.
&quot;a_key&quot;: &quot;A String&quot;,
},
&quot;diskSourceImage&quot;: &quot;A String&quot;, # Fully qualified source image for disks.
&quot;dataDisks&quot;: [ # Data disks that are used by a VM in this workflow.
{ # Describes the data disk used by a workflow job.
&quot;sizeGb&quot;: 42, # Size of disk in GB. If zero or unspecified, the service will
# attempt to choose a reasonable default.
&quot;diskType&quot;: &quot;A String&quot;, # Disk storage type, as defined by Google Compute Engine. This
# must be a disk type appropriate to the project and zone in which
# the workers will run. If unknown or unspecified, the service
# will attempt to choose a reasonable default.
#
# For example, the standard persistent disk type is a resource name
# typically ending in &quot;pd-standard&quot;. If SSD persistent disks are
# available, the resource name typically ends with &quot;pd-ssd&quot;. The
# actual valid values are defined by the Google Compute Engine API,
# not by the Cloud Dataflow API; consult the Google Compute Engine
# documentation for more information about determining the set of
# available disk types for a particular project and zone.
#
# Google Compute Engine Disk types are local to a particular
# project in a particular zone, and so the resource name will
# typically look something like this:
#
# compute.googleapis.com/projects/project-id/zones/zone/diskTypes/pd-standard
&quot;mountPoint&quot;: &quot;A String&quot;, # Directory in a VM where disk is mounted.
},
],
&quot;packages&quot;: [ # Packages to be installed on workers.
{ # The packages that must be installed in order for a worker to run the
# steps of the Cloud Dataflow job that will be assigned to its worker
# pool.
#
# This is the mechanism by which the Cloud Dataflow SDK causes code to
# be loaded onto the workers. For example, the Cloud Dataflow Java SDK
# might use this to install jars containing the user&#x27;s code and all of the
# various dependencies (libraries, data files, etc.) required in order
# for that code to run.
&quot;name&quot;: &quot;A String&quot;, # The name of the package.
&quot;location&quot;: &quot;A String&quot;, # The resource to read the package from. The supported resource type is:
#
# Google Cloud Storage:
#
# storage.googleapis.com/{bucket}
# bucket.storage.googleapis.com/
},
],
&quot;teardownPolicy&quot;: &quot;A String&quot;, # Sets the policy for determining when to tear down the worker pool.
# Allowed values are: `TEARDOWN_ALWAYS`, `TEARDOWN_ON_SUCCESS`, and
# `TEARDOWN_NEVER`.
# `TEARDOWN_ALWAYS` means workers are always torn down regardless of whether
# the job succeeds. `TEARDOWN_ON_SUCCESS` means workers are torn down
# if the job succeeds. `TEARDOWN_NEVER` means the workers are never torn
# down.
#
# If the workers are not torn down by the service, they will
# continue to run and use Google Compute Engine VM resources in the
# user&#x27;s project until they are explicitly terminated by the user.
# Because of this, Google recommends using the `TEARDOWN_ALWAYS`
# policy except for small, manually supervised test jobs.
#
# If unknown or unspecified, the service will attempt to choose a reasonable
# default.
&quot;network&quot;: &quot;A String&quot;, # Network to which VMs will be assigned. If empty or unspecified,
# the service will use the network &quot;default&quot;.
&quot;ipConfiguration&quot;: &quot;A String&quot;, # Configuration for VM IPs.
&quot;diskSizeGb&quot;: 42, # Size of root disk for VMs, in GB. If zero or unspecified, the service will
# attempt to choose a reasonable default.
&quot;autoscalingSettings&quot;: { # Settings for WorkerPool autoscaling. # Settings for autoscaling of this WorkerPool.
&quot;maxNumWorkers&quot;: 42, # The maximum number of workers to cap scaling at.
&quot;algorithm&quot;: &quot;A String&quot;, # The algorithm to use for autoscaling.
},
&quot;poolArgs&quot;: { # Extra arguments for this worker pool.
&quot;a_key&quot;: &quot;&quot;, # Properties of the object. Contains field @type with type URL.
},
&quot;subnetwork&quot;: &quot;A String&quot;, # Subnetwork to which VMs will be assigned, if desired. Expected to be of
# the form &quot;regions/REGION/subnetworks/SUBNETWORK&quot;.
&quot;numWorkers&quot;: 42, # Number of Google Compute Engine workers in this pool needed to
# execute the job. If zero or unspecified, the service will
# attempt to choose a reasonable default.
&quot;numThreadsPerWorker&quot;: 42, # The number of threads per worker harness. If empty or unspecified, the
# service will choose a number of threads (according to the number of cores
# on the selected machine type for batch, or 1 by convention for streaming).
&quot;workerHarnessContainerImage&quot;: &quot;A String&quot;, # Required. Docker container image that executes the Cloud Dataflow worker
# harness, residing in Google Container Registry.
#
# Deprecated for the Fn API path. Use sdk_harness_container_images instead.
&quot;taskrunnerSettings&quot;: { # Taskrunner configuration settings. # Settings passed through to Google Compute Engine workers when
# using the standard Dataflow task runner. Users should ignore
# this field.
&quot;dataflowApiVersion&quot;: &quot;A String&quot;, # The API version of endpoint, e.g. &quot;v1b3&quot;
&quot;oauthScopes&quot;: [ # The OAuth2 scopes to be requested by the taskrunner in order to
# access the Cloud Dataflow API.
&quot;A String&quot;,
],
&quot;baseUrl&quot;: &quot;A String&quot;, # The base URL for the taskrunner to use when accessing Google Cloud APIs.
#
# When workers access Google Cloud APIs, they logically do so via
# relative URLs. If this field is specified, it supplies the base
# URL to use for resolving these relative URLs. The normative
# algorithm used is defined by RFC 1808, &quot;Relative Uniform Resource
# Locators&quot;.
#
# If not specified, the default value is &quot;http://www.googleapis.com/&quot;
&quot;workflowFileName&quot;: &quot;A String&quot;, # The file to store the workflow in.
&quot;logToSerialconsole&quot;: True or False, # Whether to send taskrunner log info to Google Compute Engine VM serial
# console.
&quot;baseTaskDir&quot;: &quot;A String&quot;, # The location on the worker for task-specific subdirectories.
&quot;taskUser&quot;: &quot;A String&quot;, # The UNIX user ID on the worker VM to use for tasks launched by
# taskrunner; e.g. &quot;root&quot;.
&quot;vmId&quot;: &quot;A String&quot;, # The ID string of the VM.
&quot;alsologtostderr&quot;: True or False, # Whether to also send taskrunner log info to stderr.
&quot;parallelWorkerSettings&quot;: { # Provides data to pass through to the worker harness. # The settings to pass to the parallel worker harness.
&quot;shuffleServicePath&quot;: &quot;A String&quot;, # The Shuffle service path relative to the root URL, for example,
# &quot;shuffle/v1beta1&quot;.
&quot;tempStoragePrefix&quot;: &quot;A String&quot;, # The prefix of the resources the system should use for temporary
# storage.
#
# The supported resource type is:
#
# Google Cloud Storage:
#
# storage.googleapis.com/{bucket}/{object}
# bucket.storage.googleapis.com/{object}
&quot;reportingEnabled&quot;: True or False, # Whether to send work progress updates to the service.
&quot;servicePath&quot;: &quot;A String&quot;, # The Cloud Dataflow service path relative to the root URL, for example,
# &quot;dataflow/v1b3/projects&quot;.
&quot;baseUrl&quot;: &quot;A String&quot;, # The base URL for accessing Google Cloud APIs.
#
# When workers access Google Cloud APIs, they logically do so via
# relative URLs. If this field is specified, it supplies the base
# URL to use for resolving these relative URLs. The normative
# algorithm used is defined by RFC 1808, &quot;Relative Uniform Resource
# Locators&quot;.
#
# If not specified, the default value is &quot;http://www.googleapis.com/&quot;
&quot;workerId&quot;: &quot;A String&quot;, # The ID of the worker running this pipeline.
},
&quot;harnessCommand&quot;: &quot;A String&quot;, # The command to launch the worker harness.
&quot;logDir&quot;: &quot;A String&quot;, # The directory on the VM to store logs.
&quot;streamingWorkerMainClass&quot;: &quot;A String&quot;, # The streaming worker main class name.
&quot;languageHint&quot;: &quot;A String&quot;, # The suggested backend language.
&quot;taskGroup&quot;: &quot;A String&quot;, # The UNIX group ID on the worker VM to use for tasks launched by
# taskrunner; e.g. &quot;wheel&quot;.
&quot;logUploadLocation&quot;: &quot;A String&quot;, # Indicates where to put logs. If this is not specified, the logs
# will not be uploaded.
#
# The supported resource type is:
#
# Google Cloud Storage:
# storage.googleapis.com/{bucket}/{object}
# bucket.storage.googleapis.com/{object}
&quot;commandlinesFileName&quot;: &quot;A String&quot;, # The file to store preprocessing commands in.
&quot;continueOnException&quot;: True or False, # Whether to continue taskrunner if an exception is hit.
&quot;tempStoragePrefix&quot;: &quot;A String&quot;, # The prefix of the resources the taskrunner should use for
# temporary storage.
#
# The supported resource type is:
#
# Google Cloud Storage:
# storage.googleapis.com/{bucket}/{object}
# bucket.storage.googleapis.com/{object}
},
&quot;diskType&quot;: &quot;A String&quot;, # Type of root disk for VMs. If empty or unspecified, the service will
# attempt to choose a reasonable default.
&quot;defaultPackageSet&quot;: &quot;A String&quot;, # The default package set to install. This allows the service to
# select a default set of packages which are useful to worker
# harnesses written in a particular language.
&quot;machineType&quot;: &quot;A String&quot;, # Machine type (e.g. &quot;n1-standard-1&quot;). If empty or unspecified, the
# service will attempt to choose a reasonable default.
},
],
&quot;tempStoragePrefix&quot;: &quot;A String&quot;, # The prefix of the resources the system should use for temporary
# storage. The system will append the suffix &quot;/temp-{JOBNAME}&quot; to
# this resource prefix, where {JOBNAME} is the value of the
# job_name field. The resulting bucket and object prefix is used
# as the prefix of the resources used to store temporary data
# needed during the job execution. NOTE: This will override the
# value in taskrunner_settings.
# The supported resource type is:
#
# Google Cloud Storage:
#
# storage.googleapis.com/{bucket}/{object}
# bucket.storage.googleapis.com/{object}
&quot;internalExperiments&quot;: { # Experimental settings.
&quot;a_key&quot;: &quot;&quot;, # Properties of the object. Contains field @type with type URL.
},
&quot;sdkPipelineOptions&quot;: { # The Cloud Dataflow SDK pipeline options specified by the user. These
# options are passed through the service and are used to recreate the
# SDK pipeline options on the worker in a language agnostic and platform
# independent way.
&quot;a_key&quot;: &quot;&quot;, # Properties of the object.
},
&quot;dataset&quot;: &quot;A String&quot;, # The dataset for the current project where various workflow
# related tables are stored.
#
# The supported resource type is:
#
# Google BigQuery:
# bigquery.googleapis.com/{dataset}
&quot;clusterManagerApiService&quot;: &quot;A String&quot;, # The type of cluster manager API to use. If unknown or
# unspecified, the service will attempt to choose a reasonable
# default. This should be in the form of the API service name,
# e.g. &quot;compute.googleapis.com&quot;.
},
&quot;stepsLocation&quot;: &quot;A String&quot;, # The GCS location where the steps are stored.
&quot;steps&quot;: [ # Exactly one of steps or steps_location should be specified.
#
# The top-level steps that constitute the entire job.
{ # Defines a particular step within a Cloud Dataflow job.
#
# A job consists of multiple steps, each of which performs some
# specific operation as part of the overall job. Data is typically
# passed from one step to another as part of the job.
#
# Here&#x27;s an example of a sequence of steps which together implement a
# Map-Reduce job:
#
# * Read a collection of data from some source, parsing the
# collection&#x27;s elements.
#
# * Validate the elements.
#
# * Apply a user-defined function to map each element to some value
# and extract an element-specific key value.
#
# * Group elements with the same key into a single element with
# that key, transforming a multiply-keyed collection into a
# uniquely-keyed collection.
#
# * Write the elements out to some data sink.
#
# Note that the Cloud Dataflow service may be used to run many different
# types of jobs, not just Map-Reduce.
&quot;kind&quot;: &quot;A String&quot;, # The kind of step in the Cloud Dataflow job.
&quot;properties&quot;: { # Named properties associated with the step. Each kind of
# predefined step has its own required set of properties.
# Must be provided on Create. Only retrieved with JOB_VIEW_ALL.
&quot;a_key&quot;: &quot;&quot;, # Properties of the object.
},
&quot;name&quot;: &quot;A String&quot;, # The name that identifies the step. This must be unique for each
# step with respect to all other steps in the Cloud Dataflow job.
},
],
&quot;stageStates&quot;: [ # This field may be mutated by the Cloud Dataflow service;
# callers cannot mutate it.
{ # A message describing the state of a particular execution stage.
&quot;executionStageState&quot;: &quot;A String&quot;, # Execution stage states allow the same set of values as JobState.
&quot;executionStageName&quot;: &quot;A String&quot;, # The name of the execution stage.
&quot;currentStateTime&quot;: &quot;A String&quot;, # The time at which the stage transitioned to this state.
},
],
&quot;replacedByJobId&quot;: &quot;A String&quot;, # If another job is an update of this job (and thus, this job is in
# `JOB_STATE_UPDATED`), this field contains the ID of that job.
&quot;jobMetadata&quot;: { # Metadata available primarily for filtering jobs. Will be included in the
# ListJob response and Job SUMMARY view.
#
# This field is populated by the Dataflow service to support filtering jobs
# by the metadata values provided here. Populated for ListJobs and all GetJob
# views SUMMARY and higher.
&quot;sdkVersion&quot;: { # The version of the SDK used to run the job. # The SDK version used to run the job.
&quot;sdkSupportStatus&quot;: &quot;A String&quot;, # The support status for this SDK version.
&quot;versionDisplayName&quot;: &quot;A String&quot;, # A readable string describing the version of the SDK.
&quot;version&quot;: &quot;A String&quot;, # The version of the SDK used to run the job.
},
&quot;bigTableDetails&quot;: [ # Identification of a BigTable source used in the Dataflow job.
{ # Metadata for a BigTable connector used by the job.
&quot;instanceId&quot;: &quot;A String&quot;, # InstanceId accessed in the connection.
&quot;tableId&quot;: &quot;A String&quot;, # TableId accessed in the connection.
&quot;projectId&quot;: &quot;A String&quot;, # ProjectId accessed in the connection.
},
],
&quot;pubsubDetails&quot;: [ # Identification of a PubSub source used in the Dataflow job.
{ # Metadata for a PubSub connector used by the job.
&quot;subscription&quot;: &quot;A String&quot;, # Subscription used in the connection.
&quot;topic&quot;: &quot;A String&quot;, # Topic accessed in the connection.
},
],
&quot;bigqueryDetails&quot;: [ # Identification of a BigQuery source used in the Dataflow job.
{ # Metadata for a BigQuery connector used by the job.
&quot;dataset&quot;: &quot;A String&quot;, # Dataset accessed in the connection.
&quot;projectId&quot;: &quot;A String&quot;, # Project accessed in the connection.
&quot;query&quot;: &quot;A String&quot;, # Query used to access data in the connection.
&quot;table&quot;: &quot;A String&quot;, # Table accessed in the connection.
},
],
&quot;fileDetails&quot;: [ # Identification of a File source used in the Dataflow job.
{ # Metadata for a File connector used by the job.
&quot;filePattern&quot;: &quot;A String&quot;, # File Pattern used to access files by the connector.
},
],
&quot;datastoreDetails&quot;: [ # Identification of a Datastore source used in the Dataflow job.
{ # Metadata for a Datastore connector used by the job.
&quot;namespace&quot;: &quot;A String&quot;, # Namespace used in the connection.
&quot;projectId&quot;: &quot;A String&quot;, # ProjectId accessed in the connection.
},
],
&quot;spannerDetails&quot;: [ # Identification of a Spanner source used in the Dataflow job.
{ # Metadata for a Spanner connector used by the job.
&quot;instanceId&quot;: &quot;A String&quot;, # InstanceId accessed in the connection.
&quot;databaseId&quot;: &quot;A String&quot;, # DatabaseId accessed in the connection.
&quot;projectId&quot;: &quot;A String&quot;, # ProjectId accessed in the connection.
},
],
},
&quot;location&quot;: &quot;A String&quot;, # The [regional endpoint]
# (https://cloud.google.com/dataflow/docs/concepts/regional-endpoints) that
# contains this job.
&quot;transformNameMapping&quot;: { # The map of transform name prefixes of the job to be replaced to the
# corresponding name prefixes of the new job.
&quot;a_key&quot;: &quot;A String&quot;,
},
&quot;startTime&quot;: &quot;A String&quot;, # The timestamp when the job was started (transitioned to JOB_STATE_PENDING).
# Flexible resource scheduling jobs are started with some delay after job
# creation, so start_time is unset before start and is updated when the
# job is started by the Cloud Dataflow service. For other jobs, start_time
# always equals create_time and is immutable and set by the Cloud Dataflow
# service.
&quot;clientRequestId&quot;: &quot;A String&quot;, # The client&#x27;s unique identifier of the job, re-used across retried attempts.
# If this field is set, the service will ensure its uniqueness.
# The request to create a job will fail if the service has knowledge of a
# previously submitted job with the same client&#x27;s ID and job name.
# The caller may use this field to ensure idempotence of job
# creation across retried attempts to create a job.
# By default, the field is empty and, in that case, the service ignores it.
&quot;executionInfo&quot;: { # Additional information about how a Cloud Dataflow job will be executed that
# isn&#x27;t contained in the submitted job.
#
# Deprecated.
&quot;stages&quot;: { # A mapping from each stage to the information about that stage.
&quot;a_key&quot;: { # Contains information about how a particular
# google.dataflow.v1beta3.Step will be executed.
&quot;stepName&quot;: [ # The steps associated with the execution stage.
# Note that stages may have several steps, and that a given step
# might be run by more than one stage.
&quot;A String&quot;,
],
},
},
},
&quot;type&quot;: &quot;A String&quot;, # The type of Cloud Dataflow job.
&quot;createTime&quot;: &quot;A String&quot;, # The timestamp when the job was initially created. Immutable and set by the
# Cloud Dataflow service.
&quot;tempFiles&quot;: [ # A set of files the system should be aware of that are used
# for temporary storage. These temporary files will be
# removed on job completion.
# No duplicates are allowed.
# No file patterns are supported.
#
# The supported files are:
#
# Google Cloud Storage:
#
# storage.googleapis.com/{bucket}/{object}
# bucket.storage.googleapis.com/{object}
&quot;A String&quot;,
],
&quot;id&quot;: &quot;A String&quot;, # The unique ID of this job.
#
# This field is set by the Cloud Dataflow service when the Job is
# created, and is immutable for the life of the job.
&quot;requestedState&quot;: &quot;A String&quot;, # The job&#x27;s requested state.
#
# `UpdateJob` may be used to switch between the `JOB_STATE_STOPPED` and
# `JOB_STATE_RUNNING` states, by setting requested_state. `UpdateJob` may
# also be used to directly set a job&#x27;s requested state to
# `JOB_STATE_CANCELLED` or `JOB_STATE_DONE`, irrevocably terminating the
# job if it has not already reached a terminal state.
&quot;replaceJobId&quot;: &quot;A String&quot;, # If this job is an update of an existing job, this field is the job ID
# of the job it replaced.
#
# When sending a `CreateJobRequest`, you can update a job by specifying it
# here. The job named here is stopped, and its intermediate state is
# transferred to this job.
&quot;createdFromSnapshotId&quot;: &quot;A String&quot;, # If this is specified, the job&#x27;s initial state is populated from the given
# snapshot.
&quot;currentState&quot;: &quot;A String&quot;, # The current state of the job.
#
# Jobs are created in the `JOB_STATE_STOPPED` state unless otherwise
# specified.
#
# A job in the `JOB_STATE_RUNNING` state may asynchronously enter a
# terminal state. After a job has reached a terminal state, no
# further state updates may be made.
#
# This field may be mutated by the Cloud Dataflow service;
# callers cannot mutate it.
&quot;name&quot;: &quot;A String&quot;, # The user-specified Cloud Dataflow job name.
#
# Only one Job with a given name may exist in a project at any
# given time. If a caller attempts to create a Job with the same
# name as an already-existing Job, the attempt returns the
# existing Job.
#
# The name must match the regular expression
# `[a-z]([-a-z0-9]{0,38}[a-z0-9])?`
&quot;currentStateTime&quot;: &quot;A String&quot;, # The timestamp associated with the current state.
}
x__xgafv: string, V1 error format.
Allowed values
1 - v1 error format
2 - v2 error format
Returns:
An object of the form:
{ # Defines a job to be run by the Cloud Dataflow service.
&quot;pipelineDescription&quot;: { # A descriptive representation of the submitted pipeline as well as the executed
# form. This data is provided by the Dataflow service for ease of visualizing
# the pipeline and interpreting Dataflow provided metrics.
#
# Preliminary field: The format of this data may change at any time.
# A description of the user pipeline and stages through which it is executed.
# Created by Cloud Dataflow service. Only retrieved with
# JOB_VIEW_DESCRIPTION or JOB_VIEW_ALL.
&quot;displayData&quot;: [ # Pipeline level display data.
{ # Data provided with a pipeline or transform to provide descriptive info.
&quot;url&quot;: &quot;A String&quot;, # An optional full URL.
&quot;javaClassValue&quot;: &quot;A String&quot;, # Contains value if the data is of java class type.
&quot;timestampValue&quot;: &quot;A String&quot;, # Contains value if the data is of timestamp type.
&quot;durationValue&quot;: &quot;A String&quot;, # Contains value if the data is of duration type.
&quot;label&quot;: &quot;A String&quot;, # An optional label to display in a dax UI for the element.
&quot;key&quot;: &quot;A String&quot;, # The key identifying the display data.
# This is intended to be used as a label for the display data
# when viewed in a dax monitoring system.
&quot;namespace&quot;: &quot;A String&quot;, # The namespace for the key. This is usually a class name or programming
# language namespace (e.g. a Python module) which defines the display data.
# This allows a dax monitoring system to specially handle the data
# and perform custom rendering.
&quot;floatValue&quot;: 3.14, # Contains value if the data is of float type.
&quot;strValue&quot;: &quot;A String&quot;, # Contains value if the data is of string type.
&quot;int64Value&quot;: &quot;A String&quot;, # Contains value if the data is of int64 type.
&quot;boolValue&quot;: True or False, # Contains value if the data is of a boolean type.
&quot;shortStrValue&quot;: &quot;A String&quot;, # A possible additional shorter value to display.
# For example a java_class_name_value of com.mypackage.MyDoFn
# will be stored with MyDoFn as the short_str_value and
# com.mypackage.MyDoFn as the java_class_name value.
# short_str_value can be displayed and java_class_name_value
# will be displayed as a tooltip.
},
],
&quot;originalPipelineTransform&quot;: [ # Description of each transform in the pipeline and collections between them.
{ # Description of the type, names/ids, and input/outputs for a transform.
&quot;outputCollectionName&quot;: [ # User names for all collection outputs to this transform.
&quot;A String&quot;,
],
&quot;displayData&quot;: [ # Transform-specific display data.
{ # Data provided with a pipeline or transform to provide descriptive info.
&quot;url&quot;: &quot;A String&quot;, # An optional full URL.
&quot;javaClassValue&quot;: &quot;A String&quot;, # Contains value if the data is of java class type.
&quot;timestampValue&quot;: &quot;A String&quot;, # Contains value if the data is of timestamp type.
&quot;durationValue&quot;: &quot;A String&quot;, # Contains value if the data is of duration type.
&quot;label&quot;: &quot;A String&quot;, # An optional label to display in a dax UI for the element.
&quot;key&quot;: &quot;A String&quot;, # The key identifying the display data.
# This is intended to be used as a label for the display data
# when viewed in a dax monitoring system.
&quot;namespace&quot;: &quot;A String&quot;, # The namespace for the key. This is usually a class name or programming
# language namespace (e.g. a Python module) which defines the display data.
# This allows a dax monitoring system to specially handle the data
# and perform custom rendering.
&quot;floatValue&quot;: 3.14, # Contains value if the data is of float type.
&quot;strValue&quot;: &quot;A String&quot;, # Contains value if the data is of string type.
&quot;int64Value&quot;: &quot;A String&quot;, # Contains value if the data is of int64 type.
&quot;boolValue&quot;: True or False, # Contains value if the data is of a boolean type.
&quot;shortStrValue&quot;: &quot;A String&quot;, # A possible additional shorter value to display.
# For example a java_class_name_value of com.mypackage.MyDoFn
# will be stored with MyDoFn as the short_str_value and
# com.mypackage.MyDoFn as the java_class_name value.
# short_str_value can be displayed and java_class_name_value
# will be displayed as a tooltip.
},
],
&quot;id&quot;: &quot;A String&quot;, # SDK generated id of this transform instance.
&quot;inputCollectionName&quot;: [ # User names for all collection inputs to this transform.
&quot;A String&quot;,
],
&quot;name&quot;: &quot;A String&quot;, # User provided name for this transform instance.
&quot;kind&quot;: &quot;A String&quot;, # Type of transform.
},
],
&quot;executionPipelineStage&quot;: [ # Description of each stage of execution of the pipeline.
{ # Description of the composing transforms, names/ids, and input/outputs of a
# stage of execution. Some composing transforms and sources may have been
# generated by the Dataflow service during execution planning.
&quot;componentSource&quot;: [ # Collections produced and consumed by component transforms of this stage.
{ # Description of an interstitial value between transforms in an execution
# stage.
&quot;userName&quot;: &quot;A String&quot;, # Human-readable name for this source; may be user or system generated.
&quot;name&quot;: &quot;A String&quot;, # Dataflow service generated name for this source.
&quot;originalTransformOrCollection&quot;: &quot;A String&quot;, # User name for the original user transform or collection with which this
# source is most closely associated.
},
],
&quot;inputSource&quot;: [ # Input sources for this stage.
{ # Description of an input or output of an execution stage.
&quot;userName&quot;: &quot;A String&quot;, # Human-readable name for this source; may be user or system generated.
&quot;originalTransformOrCollection&quot;: &quot;A String&quot;, # User name for the original user transform or collection with which this
# source is most closely associated.
&quot;sizeBytes&quot;: &quot;A String&quot;, # Size of the source, if measurable.
&quot;name&quot;: &quot;A String&quot;, # Dataflow service generated name for this source.
},
],
&quot;name&quot;: &quot;A String&quot;, # Dataflow service generated name for this stage.
&quot;componentTransform&quot;: [ # Transforms that comprise this execution stage.
{ # Description of a transform executed as part of an execution stage.
&quot;name&quot;: &quot;A String&quot;, # Dataflow service generated name for this transform.
&quot;userName&quot;: &quot;A String&quot;, # Human-readable name for this transform; may be user or system generated.
&quot;originalTransform&quot;: &quot;A String&quot;, # User name for the original user transform with which this transform is
# most closely associated.
},
],
&quot;id&quot;: &quot;A String&quot;, # Dataflow service generated id for this stage.
&quot;outputSource&quot;: [ # Output sources for this stage.
{ # Description of an input or output of an execution stage.
&quot;userName&quot;: &quot;A String&quot;, # Human-readable name for this source; may be user or system generated.
&quot;originalTransformOrCollection&quot;: &quot;A String&quot;, # User name for the original user transform or collection with which this
# source is most closely associated.
&quot;sizeBytes&quot;: &quot;A String&quot;, # Size of the source, if measurable.
&quot;name&quot;: &quot;A String&quot;, # Dataflow service generated name for this source.
},
],
&quot;kind&quot;: &quot;A String&quot;, # Type of transform this stage is executing.
},
],
},
&quot;labels&quot;: { # User-defined labels for this job.
#
# The labels map can contain no more than 64 entries. Entries of the labels
# map are UTF8 strings that comply with the following restrictions:
#
# * Keys must conform to regexp: \p{Ll}\p{Lo}{0,62}
# * Values must conform to regexp: [\p{Ll}\p{Lo}\p{N}_-]{0,63}
# * Both keys and values are additionally constrained to be &lt;= 128 bytes in
# size.
&quot;a_key&quot;: &quot;A String&quot;,
},
&quot;projectId&quot;: &quot;A String&quot;, # The ID of the Cloud Platform project that the job belongs to.
&quot;environment&quot;: { # Describes the environment in which a Dataflow Job runs. # The environment for the job.
&quot;flexResourceSchedulingGoal&quot;: &quot;A String&quot;, # Which Flexible Resource Scheduling mode to run in.
&quot;workerRegion&quot;: &quot;A String&quot;, # The Compute Engine region
# (https://cloud.google.com/compute/docs/regions-zones/regions-zones) in
# which worker processing should occur, e.g. &quot;us-west1&quot;. Mutually exclusive
# with worker_zone. If neither worker_region nor worker_zone is specified,
# default to the control plane&#x27;s region.
&quot;userAgent&quot;: { # A description of the process that generated the request.
&quot;a_key&quot;: &quot;&quot;, # Properties of the object.
},
&quot;serviceAccountEmail&quot;: &quot;A String&quot;, # Identity to run virtual machines as. Defaults to the default account.
&quot;version&quot;: { # A structure describing which components and their versions of the service
# are required in order to run the job.
&quot;a_key&quot;: &quot;&quot;, # Properties of the object.
},
&quot;serviceKmsKeyName&quot;: &quot;A String&quot;, # If set, contains the Cloud KMS key identifier used to encrypt data
# at rest, AKA a Customer Managed Encryption Key (CMEK).
#
# Format:
# projects/PROJECT_ID/locations/LOCATION/keyRings/KEY_RING/cryptoKeys/KEY
&quot;experiments&quot;: [ # The list of experiments to enable.
&quot;A String&quot;,
],
&quot;workerZone&quot;: &quot;A String&quot;, # The Compute Engine zone
# (https://cloud.google.com/compute/docs/regions-zones/regions-zones) in
# which worker processing should occur, e.g. &quot;us-west1-a&quot;. Mutually exclusive
# with worker_region. If neither worker_region nor worker_zone is specified,
# a zone in the control plane&#x27;s region is chosen based on available capacity.
&quot;workerPools&quot;: [ # The worker pools. At least one &quot;harness&quot; worker pool must be
# specified in order for the job to have workers.
{ # Describes one particular pool of Cloud Dataflow workers to be
# instantiated by the Cloud Dataflow service in order to perform the
# computations required by a job. Note that a workflow job may use
# multiple pools, in order to match the various computational
# requirements of the various stages of the job.
&quot;onHostMaintenance&quot;: &quot;A String&quot;, # The action to take on host maintenance, as defined by the Google
# Compute Engine API.
&quot;sdkHarnessContainerImages&quot;: [ # Set of SDK harness containers needed to execute this pipeline. This will
# only be set in the Fn API path. For non-cross-language pipelines this
# should have only one entry. Cross-language pipelines will have two or more
# entries.
{ # Defines a SDK harness container for executing Dataflow pipelines.
&quot;containerImage&quot;: &quot;A String&quot;, # A docker container image that resides in Google Container Registry.
&quot;useSingleCorePerContainer&quot;: True or False, # If true, recommends that the Dataflow service use only one core per SDK
# container instance with this image. If false (or unset), recommends using
# more than one core per SDK container instance with this image for
# efficiency. Note that Dataflow service may choose to override this property
# if needed.
},
],
&quot;zone&quot;: &quot;A String&quot;, # Zone to run the worker pools in. If empty or unspecified, the service
# will attempt to choose a reasonable default.
&quot;kind&quot;: &quot;A String&quot;, # The kind of the worker pool; currently only `harness` and `shuffle`
# are supported.
&quot;metadata&quot;: { # Metadata to set on the Google Compute Engine VMs.
&quot;a_key&quot;: &quot;A String&quot;,
},
&quot;diskSourceImage&quot;: &quot;A String&quot;, # Fully qualified source image for disks.
&quot;dataDisks&quot;: [ # Data disks that are used by a VM in this workflow.
{ # Describes the data disk used by a workflow job.
&quot;sizeGb&quot;: 42, # Size of disk in GB. If zero or unspecified, the service will
# attempt to choose a reasonable default.
&quot;diskType&quot;: &quot;A String&quot;, # Disk storage type, as defined by Google Compute Engine. This
# must be a disk type appropriate to the project and zone in which
# the workers will run. If unknown or unspecified, the service
# will attempt to choose a reasonable default.
#
# For example, the standard persistent disk type is a resource name
# typically ending in &quot;pd-standard&quot;. If SSD persistent disks are
# available, the resource name typically ends with &quot;pd-ssd&quot;. The
# actual valid values are defined by the Google Compute Engine API,
# not by the Cloud Dataflow API; consult the Google Compute Engine
# documentation for more information about determining the set of
# available disk types for a particular project and zone.
#
# Google Compute Engine Disk types are local to a particular
# project in a particular zone, and so the resource name will
# typically look something like this:
#
# compute.googleapis.com/projects/project-id/zones/zone/diskTypes/pd-standard
&quot;mountPoint&quot;: &quot;A String&quot;, # Directory in a VM where disk is mounted.
},
],
&quot;packages&quot;: [ # Packages to be installed on workers.
{ # The packages that must be installed in order for a worker to run the
# steps of the Cloud Dataflow job that will be assigned to its worker
# pool.
#
# This is the mechanism by which the Cloud Dataflow SDK causes code to
# be loaded onto the workers. For example, the Cloud Dataflow Java SDK
# might use this to install jars containing the user&#x27;s code and all of the
# various dependencies (libraries, data files, etc.) required in order
# for that code to run.
&quot;name&quot;: &quot;A String&quot;, # The name of the package.
&quot;location&quot;: &quot;A String&quot;, # The resource to read the package from. The supported resource type is:
#
# Google Cloud Storage:
#
# storage.googleapis.com/{bucket}
# bucket.storage.googleapis.com/
},
],
&quot;teardownPolicy&quot;: &quot;A String&quot;, # Sets the policy for determining when to tear down the worker pool.
# Allowed values are: `TEARDOWN_ALWAYS`, `TEARDOWN_ON_SUCCESS`, and
# `TEARDOWN_NEVER`.
# `TEARDOWN_ALWAYS` means workers are always torn down regardless of whether
# the job succeeds. `TEARDOWN_ON_SUCCESS` means workers are torn down
# if the job succeeds. `TEARDOWN_NEVER` means the workers are never torn
# down.
#
# If the workers are not torn down by the service, they will
# continue to run and use Google Compute Engine VM resources in the
# user&#x27;s project until they are explicitly terminated by the user.
# Because of this, Google recommends using the `TEARDOWN_ALWAYS`
# policy except for small, manually supervised test jobs.
#
# If unknown or unspecified, the service will attempt to choose a reasonable
# default.
&quot;network&quot;: &quot;A String&quot;, # Network to which VMs will be assigned. If empty or unspecified,
# the service will use the network &quot;default&quot;.
&quot;ipConfiguration&quot;: &quot;A String&quot;, # Configuration for VM IPs.
&quot;diskSizeGb&quot;: 42, # Size of root disk for VMs, in GB. If zero or unspecified, the service will
# attempt to choose a reasonable default.
&quot;autoscalingSettings&quot;: { # Settings for WorkerPool autoscaling. # Settings for autoscaling of this WorkerPool.
&quot;maxNumWorkers&quot;: 42, # The maximum number of workers to cap scaling at.
&quot;algorithm&quot;: &quot;A String&quot;, # The algorithm to use for autoscaling.
},
&quot;poolArgs&quot;: { # Extra arguments for this worker pool.
&quot;a_key&quot;: &quot;&quot;, # Properties of the object. Contains field @type with type URL.
},
&quot;subnetwork&quot;: &quot;A String&quot;, # Subnetwork to which VMs will be assigned, if desired. Expected to be of
# the form &quot;regions/REGION/subnetworks/SUBNETWORK&quot;.
&quot;numWorkers&quot;: 42, # Number of Google Compute Engine workers in this pool needed to
# execute the job. If zero or unspecified, the service will
# attempt to choose a reasonable default.
&quot;numThreadsPerWorker&quot;: 42, # The number of threads per worker harness. If empty or unspecified, the
# service will choose a number of threads (according to the number of cores
# on the selected machine type for batch, or 1 by convention for streaming).
&quot;workerHarnessContainerImage&quot;: &quot;A String&quot;, # Required. Docker container image that executes the Cloud Dataflow worker
# harness, residing in Google Container Registry.
#
# Deprecated for the Fn API path. Use sdk_harness_container_images instead.
&quot;taskrunnerSettings&quot;: { # Taskrunner configuration settings. # Settings passed through to Google Compute Engine workers when
# using the standard Dataflow task runner. Users should ignore
# this field.
&quot;dataflowApiVersion&quot;: &quot;A String&quot;, # The API version of endpoint, e.g. &quot;v1b3&quot;
&quot;oauthScopes&quot;: [ # The OAuth2 scopes to be requested by the taskrunner in order to
# access the Cloud Dataflow API.
&quot;A String&quot;,
],
&quot;baseUrl&quot;: &quot;A String&quot;, # The base URL for the taskrunner to use when accessing Google Cloud APIs.
#
# When workers access Google Cloud APIs, they logically do so via
# relative URLs. If this field is specified, it supplies the base
# URL to use for resolving these relative URLs. The normative
# algorithm used is defined by RFC 1808, &quot;Relative Uniform Resource
# Locators&quot;.
#
# If not specified, the default value is &quot;http://www.googleapis.com/&quot;
&quot;workflowFileName&quot;: &quot;A String&quot;, # The file to store the workflow in.
&quot;logToSerialconsole&quot;: True or False, # Whether to send taskrunner log info to Google Compute Engine VM serial
# console.
&quot;baseTaskDir&quot;: &quot;A String&quot;, # The location on the worker for task-specific subdirectories.
&quot;taskUser&quot;: &quot;A String&quot;, # The UNIX user ID on the worker VM to use for tasks launched by
# taskrunner; e.g. &quot;root&quot;.
&quot;vmId&quot;: &quot;A String&quot;, # The ID string of the VM.
&quot;alsologtostderr&quot;: True or False, # Whether to also send taskrunner log info to stderr.
&quot;parallelWorkerSettings&quot;: { # Provides data to pass through to the worker harness. # The settings to pass to the parallel worker harness.
&quot;shuffleServicePath&quot;: &quot;A String&quot;, # The Shuffle service path relative to the root URL, for example,
# &quot;shuffle/v1beta1&quot;.
&quot;tempStoragePrefix&quot;: &quot;A String&quot;, # The prefix of the resources the system should use for temporary
# storage.
#
# The supported resource type is:
#
# Google Cloud Storage:
#
# storage.googleapis.com/{bucket}/{object}
# bucket.storage.googleapis.com/{object}
&quot;reportingEnabled&quot;: True or False, # Whether to send work progress updates to the service.
&quot;servicePath&quot;: &quot;A String&quot;, # The Cloud Dataflow service path relative to the root URL, for example,
# &quot;dataflow/v1b3/projects&quot;.
&quot;baseUrl&quot;: &quot;A String&quot;, # The base URL for accessing Google Cloud APIs.
#
# When workers access Google Cloud APIs, they logically do so via
# relative URLs. If this field is specified, it supplies the base
# URL to use for resolving these relative URLs. The normative
# algorithm used is defined by RFC 1808, &quot;Relative Uniform Resource
# Locators&quot;.
#
# If not specified, the default value is &quot;http://www.googleapis.com/&quot;
&quot;workerId&quot;: &quot;A String&quot;, # The ID of the worker running this pipeline.
},
&quot;harnessCommand&quot;: &quot;A String&quot;, # The command to launch the worker harness.
&quot;logDir&quot;: &quot;A String&quot;, # The directory on the VM to store logs.
&quot;streamingWorkerMainClass&quot;: &quot;A String&quot;, # The streaming worker main class name.
&quot;languageHint&quot;: &quot;A String&quot;, # The suggested backend language.
&quot;taskGroup&quot;: &quot;A String&quot;, # The UNIX group ID on the worker VM to use for tasks launched by
# taskrunner; e.g. &quot;wheel&quot;.
&quot;logUploadLocation&quot;: &quot;A String&quot;, # Indicates where to put logs. If this is not specified, the logs
# will not be uploaded.
#
# The supported resource type is:
#
# Google Cloud Storage:
# storage.googleapis.com/{bucket}/{object}
# bucket.storage.googleapis.com/{object}
&quot;commandlinesFileName&quot;: &quot;A String&quot;, # The file to store preprocessing commands in.
&quot;continueOnException&quot;: True or False, # Whether to continue taskrunner if an exception is hit.
&quot;tempStoragePrefix&quot;: &quot;A String&quot;, # The prefix of the resources the taskrunner should use for
# temporary storage.
#
# The supported resource type is:
#
# Google Cloud Storage:
# storage.googleapis.com/{bucket}/{object}
# bucket.storage.googleapis.com/{object}
},
&quot;diskType&quot;: &quot;A String&quot;, # Type of root disk for VMs. If empty or unspecified, the service will
# attempt to choose a reasonable default.
&quot;defaultPackageSet&quot;: &quot;A String&quot;, # The default package set to install. This allows the service to
# select a default set of packages which are useful to worker
# harnesses written in a particular language.
&quot;machineType&quot;: &quot;A String&quot;, # Machine type (e.g. &quot;n1-standard-1&quot;). If empty or unspecified, the
# service will attempt to choose a reasonable default.
},
],
&quot;tempStoragePrefix&quot;: &quot;A String&quot;, # The prefix of the resources the system should use for temporary
# storage. The system will append the suffix &quot;/temp-{JOBNAME}&quot; to
# this resource prefix, where {JOBNAME} is the value of the
# job_name field. The resulting bucket and object prefix is used
# as the prefix of the resources used to store temporary data
# needed during the job execution. NOTE: This will override the
# value in taskrunner_settings.
# The supported resource type is:
#
# Google Cloud Storage:
#
# storage.googleapis.com/{bucket}/{object}
# bucket.storage.googleapis.com/{object}
&quot;internalExperiments&quot;: { # Experimental settings.
&quot;a_key&quot;: &quot;&quot;, # Properties of the object. Contains field @type with type URL.
},
&quot;sdkPipelineOptions&quot;: { # The Cloud Dataflow SDK pipeline options specified by the user. These
# options are passed through the service and are used to recreate the
# SDK pipeline options on the worker in a language agnostic and platform
# independent way.
&quot;a_key&quot;: &quot;&quot;, # Properties of the object.
},
&quot;dataset&quot;: &quot;A String&quot;, # The dataset for the current project where various workflow
# related tables are stored.
#
# The supported resource type is:
#
# Google BigQuery:
# bigquery.googleapis.com/{dataset}
&quot;clusterManagerApiService&quot;: &quot;A String&quot;, # The type of cluster manager API to use. If unknown or
# unspecified, the service will attempt to choose a reasonable
# default. This should be in the form of the API service name,
# e.g. &quot;compute.googleapis.com&quot;.
},
&quot;stepsLocation&quot;: &quot;A String&quot;, # The GCS location where the steps are stored.
&quot;steps&quot;: [ # Exactly one of steps or steps_location should be specified.
#
# The top-level steps that constitute the entire job.
{ # Defines a particular step within a Cloud Dataflow job.
#
# A job consists of multiple steps, each of which performs some
# specific operation as part of the overall job. Data is typically
# passed from one step to another as part of the job.
#
# Here&#x27;s an example of a sequence of steps which together implement a
# Map-Reduce job:
#
# * Read a collection of data from some source, parsing the
# collection&#x27;s elements.
#
# * Validate the elements.
#
# * Apply a user-defined function to map each element to some value
# and extract an element-specific key value.
#
# * Group elements with the same key into a single element with
# that key, transforming a multiply-keyed collection into a
# uniquely-keyed collection.
#
# * Write the elements out to some data sink.
#
# Note that the Cloud Dataflow service may be used to run many different
# types of jobs, not just Map-Reduce.
&quot;kind&quot;: &quot;A String&quot;, # The kind of step in the Cloud Dataflow job.
&quot;properties&quot;: { # Named properties associated with the step. Each kind of
# predefined step has its own required set of properties.
# Must be provided on Create. Only retrieved with JOB_VIEW_ALL.
&quot;a_key&quot;: &quot;&quot;, # Properties of the object.
},
&quot;name&quot;: &quot;A String&quot;, # The name that identifies the step. This must be unique for each
# step with respect to all other steps in the Cloud Dataflow job.
},
],
&quot;stageStates&quot;: [ # This field may be mutated by the Cloud Dataflow service;
# callers cannot mutate it.
{ # A message describing the state of a particular execution stage.
&quot;executionStageState&quot;: &quot;A String&quot;, # Execution stage states allow the same set of values as JobState.
&quot;executionStageName&quot;: &quot;A String&quot;, # The name of the execution stage.
&quot;currentStateTime&quot;: &quot;A String&quot;, # The time at which the stage transitioned to this state.
},
],
&quot;replacedByJobId&quot;: &quot;A String&quot;, # If another job is an update of this job (and thus, this job is in
# `JOB_STATE_UPDATED`), this field contains the ID of that job.
&quot;jobMetadata&quot;: { # Metadata available primarily for filtering jobs. Will be included in the
# ListJob response and Job SUMMARY view.
# This field is populated by the Dataflow service to support filtering jobs
# by the metadata values provided here. Populated for ListJobs and all GetJob
# views SUMMARY and higher.
&quot;sdkVersion&quot;: { # The version of the SDK used to run the job. # The SDK version used to run the job.
&quot;sdkSupportStatus&quot;: &quot;A String&quot;, # The support status for this SDK version.
&quot;versionDisplayName&quot;: &quot;A String&quot;, # A readable string describing the version of the SDK.
&quot;version&quot;: &quot;A String&quot;, # The version of the SDK used to run the job.
},
&quot;bigTableDetails&quot;: [ # Identification of a BigTable source used in the Dataflow job.
{ # Metadata for a BigTable connector used by the job.
&quot;instanceId&quot;: &quot;A String&quot;, # InstanceId accessed in the connection.
&quot;tableId&quot;: &quot;A String&quot;, # TableId accessed in the connection.
&quot;projectId&quot;: &quot;A String&quot;, # ProjectId accessed in the connection.
},
],
&quot;pubsubDetails&quot;: [ # Identification of a PubSub source used in the Dataflow job.
{ # Metadata for a PubSub connector used by the job.
&quot;subscription&quot;: &quot;A String&quot;, # Subscription used in the connection.
&quot;topic&quot;: &quot;A String&quot;, # Topic accessed in the connection.
},
],
&quot;bigqueryDetails&quot;: [ # Identification of a BigQuery source used in the Dataflow job.
{ # Metadata for a BigQuery connector used by the job.
&quot;dataset&quot;: &quot;A String&quot;, # Dataset accessed in the connection.
&quot;projectId&quot;: &quot;A String&quot;, # Project accessed in the connection.
&quot;query&quot;: &quot;A String&quot;, # Query used to access data in the connection.
&quot;table&quot;: &quot;A String&quot;, # Table accessed in the connection.
},
],
&quot;fileDetails&quot;: [ # Identification of a File source used in the Dataflow job.
{ # Metadata for a File connector used by the job.
&quot;filePattern&quot;: &quot;A String&quot;, # File Pattern used to access files by the connector.
},
],
&quot;datastoreDetails&quot;: [ # Identification of a Datastore source used in the Dataflow job.
{ # Metadata for a Datastore connector used by the job.
&quot;namespace&quot;: &quot;A String&quot;, # Namespace used in the connection.
&quot;projectId&quot;: &quot;A String&quot;, # ProjectId accessed in the connection.
},
],
&quot;spannerDetails&quot;: [ # Identification of a Spanner source used in the Dataflow job.
{ # Metadata for a Spanner connector used by the job.
&quot;instanceId&quot;: &quot;A String&quot;, # InstanceId accessed in the connection.
&quot;databaseId&quot;: &quot;A String&quot;, # DatabaseId accessed in the connection.
&quot;projectId&quot;: &quot;A String&quot;, # ProjectId accessed in the connection.
},
],
},
&quot;location&quot;: &quot;A String&quot;, # The [regional endpoint]
# (https://cloud.google.com/dataflow/docs/concepts/regional-endpoints) that
# contains this job.
&quot;transformNameMapping&quot;: { # The map of transform name prefixes of the job to be replaced to the
# corresponding name prefixes of the new job.
&quot;a_key&quot;: &quot;A String&quot;,
},
&quot;startTime&quot;: &quot;A String&quot;, # The timestamp when the job was started (transitioned to JOB_STATE_PENDING).
# Flexible resource scheduling jobs are started with some delay after job
# creation, so start_time is unset before start and is updated when the
# job is started by the Cloud Dataflow service. For other jobs, start_time
# always equals create_time and is immutable and set by the Cloud Dataflow
# service.
&quot;clientRequestId&quot;: &quot;A String&quot;, # The client&#x27;s unique identifier of the job, re-used across retried attempts.
# If this field is set, the service will ensure its uniqueness.
# The request to create a job will fail if the service has knowledge of a
# previously submitted job with the same client&#x27;s ID and job name.
# The caller may use this field to ensure idempotence of job
# creation across retried attempts to create a job.
# By default, the field is empty and, in that case, the service ignores it.
&quot;executionInfo&quot;: { # Additional information about how a Cloud Dataflow job will be executed that
# isn&#x27;t contained in the submitted job. # Deprecated.
&quot;stages&quot;: { # A mapping from each stage to the information about that stage.
&quot;a_key&quot;: { # Contains information about how a particular
# google.dataflow.v1beta3.Step will be executed.
&quot;stepName&quot;: [ # The steps associated with the execution stage.
# Note that stages may have several steps, and that a given step
# might be run by more than one stage.
&quot;A String&quot;,
],
},
},
},
&quot;type&quot;: &quot;A String&quot;, # The type of Cloud Dataflow job.
&quot;createTime&quot;: &quot;A String&quot;, # The timestamp when the job was initially created. Immutable and set by the
# Cloud Dataflow service.
&quot;tempFiles&quot;: [ # A set of files the system should be aware of that are used
# for temporary storage. These temporary files will be
# removed on job completion.
# No duplicates are allowed.
# No file patterns are supported.
#
# The supported files are:
#
# Google Cloud Storage:
#
# storage.googleapis.com/{bucket}/{object}
# bucket.storage.googleapis.com/{object}
&quot;A String&quot;,
],
&quot;id&quot;: &quot;A String&quot;, # The unique ID of this job.
#
# This field is set by the Cloud Dataflow service when the Job is
# created, and is immutable for the life of the job.
&quot;requestedState&quot;: &quot;A String&quot;, # The job&#x27;s requested state.
#
# `UpdateJob` may be used to switch between the `JOB_STATE_STOPPED` and
# `JOB_STATE_RUNNING` states, by setting requested_state. `UpdateJob` may
# also be used to directly set a job&#x27;s requested state to
# `JOB_STATE_CANCELLED` or `JOB_STATE_DONE`, irrevocably terminating the
# job if it has not already reached a terminal state.
&quot;replaceJobId&quot;: &quot;A String&quot;, # If this job is an update of an existing job, this field is the job ID
# of the job it replaced.
#
# When sending a `CreateJobRequest`, you can update a job by specifying it
# here. The job named here is stopped, and its intermediate state is
# transferred to this job.
&quot;createdFromSnapshotId&quot;: &quot;A String&quot;, # If this is specified, the job&#x27;s initial state is populated from the given
# snapshot.
&quot;currentState&quot;: &quot;A String&quot;, # The current state of the job.
#
# Jobs are created in the `JOB_STATE_STOPPED` state unless otherwise
# specified.
#
# A job in the `JOB_STATE_RUNNING` state may asynchronously enter a
# terminal state. After a job has reached a terminal state, no
# further state updates may be made.
#
# This field may be mutated by the Cloud Dataflow service;
# callers cannot mutate it.
&quot;name&quot;: &quot;A String&quot;, # The user-specified Cloud Dataflow job name.
#
# Only one Job with a given name may exist in a project at any
# given time. If a caller attempts to create a Job with the same
# name as an already-existing Job, the attempt returns the
# existing Job.
#
# The name must match the regular expression
# `[a-z]([-a-z0-9]{0,38}[a-z0-9])?`
&quot;currentStateTime&quot;: &quot;A String&quot;, # The timestamp associated with the current state.
}</pre>
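<p>The constraints documented above on the job name and on teardownPolicy can be checked on the client before a request body is sent. The following is a minimal sketch, not part of the generated client: the helper names is_valid_job_name and build_job_body are hypothetical, and it assumes the worker pool settings are nested under the job's environment field. It only builds and validates a dictionary; it makes no API call.</p>

```python
import re

# Pattern quoted from the "name" field description above.
JOB_NAME_PATTERN = re.compile(r"[a-z]([-a-z0-9]{0,38}[a-z0-9])?")

# Allowed values quoted from the "teardownPolicy" field description above.
ALLOWED_TEARDOWN_POLICIES = {
    "TEARDOWN_ALWAYS",
    "TEARDOWN_ON_SUCCESS",
    "TEARDOWN_NEVER",
}


def is_valid_job_name(name):
    """Return True if the whole name matches the documented pattern."""
    return JOB_NAME_PATTERN.fullmatch(name) is not None


def build_job_body(name, teardown_policy="TEARDOWN_ALWAYS"):
    """Build a minimal job body dict (hypothetical helper, illustration only).

    Assumes worker pools are nested under "environment" in the Job object.
    """
    if not is_valid_job_name(name):
        raise ValueError("job name does not match the documented pattern")
    if teardown_policy not in ALLOWED_TEARDOWN_POLICIES:
        raise ValueError("unsupported teardownPolicy: " + teardown_policy)
    return {
        "name": name,
        "environment": {
            # At least one "harness" pool is needed for the job to have workers.
            "workerPools": [
                {"kind": "harness", "teardownPolicy": teardown_policy},
            ],
        },
    }
```

<p>The default of TEARDOWN_ALWAYS follows the recommendation in the teardownPolicy description; the other fields of the body are left for the service to default.</p>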
</div>
</body></html>