CreateClusterAndSubmitSteps
yaml
type: "io.kestra.plugin.aws.emr.CreateClusterAndSubmitSteps"
Examples
yaml
id: aws_emr_create_cluster
namespace: company.team

tasks:
  - id: create_cluster
    type: io.kestra.plugin.aws.emr.CreateClusterAndSubmitSteps
    accessKeyId: "{{ secret('AWS_ACCESS_KEY_ID') }}"
    secretKeyId: "{{ secret('AWS_SECRET_KEY_ID') }}"
    region: eu-west-3
    clusterName: "Spark job cluster"
    logUri: "s3://my-bucket/test-emr-logs"
    keepJobFlowAliveWhenNoSteps: true
    applications:
      - Spark
    masterInstanceType: m5.xlarge
    slaveInstanceType: m5.xlarge
    instanceCount: 3
    ec2KeyName: my-ec2-ssh-key-pair-name
    steps:
      - name: Spark_job_test
        jar: "command-runner.jar"
        actionOnFailure: CONTINUE
        commands:
          - spark-submit s3://mybucket/health_violations.py --data_source s3://mybucket/food_establishment_data.csv --output_uri s3://mybucket/test-emr-output
    wait: true
Properties
clusterName *Required string
instanceCount *Required integer | string
masterInstanceType *Required string
slaveInstanceType *Required string
accessKeyId string
applications array
SubType: string
compatibilityMode boolean | string
completionCheckInterval string
Default: PT10S
Format: duration
ec2KeyName string
ec2SubnetId string
endpointOverride string
forcePathStyle boolean | string
jobFlowRole string
Default: EMR_EC2_DefaultRole
keepJobFlowAliveWhenNoSteps boolean | string
Default: false
logUri string
region string
releaseLabel string
Default: emr-5.20.0
secretKeyId string
serviceRole string
Default: EMR_DefaultRole
sessionToken string
stsEndpointOverride string
stsRoleArn string
stsRoleExternalId string
stsRoleSessionDuration string
Default: PT15M
Format: duration
stsRoleSessionName string
visibleToAllUsers boolean | string
Default: true
wait boolean | string
Default: false
waitUntilCompletion string
Default: PT1H
Format: duration
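The duration-typed properties above take ISO 8601 values such as PT10S or PT1H. The following is a minimal sketch of how `wait`, `waitUntilCompletion`, and `completionCheckInterval` might be combined. It assumes, based on the property names only, that `waitUntilCompletion` is the maximum time the task waits and `completionCheckInterval` is the polling interval; the flow id, script path, and chosen values are illustrative.

```yaml
id: aws_emr_wait_for_steps   # hypothetical flow id
namespace: company.team

tasks:
  - id: create_cluster
    type: io.kestra.plugin.aws.emr.CreateClusterAndSubmitSteps
    accessKeyId: "{{ secret('AWS_ACCESS_KEY_ID') }}"
    secretKeyId: "{{ secret('AWS_SECRET_KEY_ID') }}"
    region: eu-west-3
    clusterName: "Spark job cluster"
    masterInstanceType: m5.xlarge
    slaveInstanceType: m5.xlarge
    instanceCount: 3
    steps:
      - name: Spark_job_test
        jar: "command-runner.jar"
        actionOnFailure: CONTINUE
        commands:
          - spark-submit s3://mybucket/health_violations.py
    # Assumption: block until the steps finish instead of returning immediately
    wait: true
    # Assumption: give up waiting after two hours (default is PT1H)
    waitUntilCompletion: PT2H
    # Assumption: check status every 30 seconds (default is PT10S)
    completionCheckInterval: PT30S
```

With `wait` left at its default of false, the two duration properties would presumably have no effect.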
Outputs
jobFlowId string
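As a sketch of how this output could be consumed, a later task in the same flow can reference it with Kestra's `outputs` expression. The `Log` task type used here is an assumption about the core plugin available in your Kestra version, and the flow id is hypothetical.

```yaml
id: aws_emr_use_job_flow_id   # hypothetical flow id
namespace: company.team

tasks:
  - id: create_cluster
    type: io.kestra.plugin.aws.emr.CreateClusterAndSubmitSteps
    accessKeyId: "{{ secret('AWS_ACCESS_KEY_ID') }}"
    secretKeyId: "{{ secret('AWS_SECRET_KEY_ID') }}"
    region: eu-west-3
    clusterName: "Spark job cluster"
    masterInstanceType: m5.xlarge
    slaveInstanceType: m5.xlarge
    instanceCount: 3

  # Assumption: the core Log task is available under this type name
  - id: log_job_flow_id
    type: io.kestra.plugin.core.log.Log
    message: "EMR job flow created: {{ outputs.create_cluster.jobFlowId }}"
```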
Definitions
io.kestra.plugin.aws.emr.models.StepConfig
actionOnFailure *Required string
Possible Values: TERMINATE_CLUSTER, CANCEL_AND_WAIT, CONTINUE, TERMINATE_JOB_FLOW
jar *Required string
name *Required string
commands array
SubType: string
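The fragment below sketches a `steps` list with two StepConfig entries that use different `actionOnFailure` values. It assumes these values behave like the identically named options of the Amazon EMR API, where TERMINATE_JOB_FLOW is the legacy equivalent of TERMINATE_CLUSTER. The step names and script paths are illustrative, and the fragment belongs under a CreateClusterAndSubmitSteps task.

```yaml
steps:
  # Assumption: a failure here is tolerated and later steps still run
  - name: Prepare_data
    jar: "command-runner.jar"
    actionOnFailure: CONTINUE
    commands:
      - spark-submit s3://mybucket/prepare_data.py

  # Assumption: a failure here shuts the cluster down
  - name: Run_analysis
    jar: "command-runner.jar"
    actionOnFailure: TERMINATE_CLUSTER
    commands:
      - spark-submit s3://mybucket/health_violations.py --output_uri s3://mybucket/test-emr-output
```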