Input and Output Files
Manage input and output files used by your script tasks.
Input Files
You can pass additional files to any script or CLI task using the `inputFiles` property:
```yaml
id: ansible
namespace: company.team

tasks:
  - id: ansible_task
    type: io.kestra.plugin.ansible.cli.AnsibleCLI
    inputFiles:
      inventory.ini: |
        localhost ansible_connection=local
      myplaybook.yml: |
        ---
        - hosts: localhost
          tasks:
            - name: Print Hello World
              debug:
                msg: "Hello, World!"
    containerImage: cytopia/ansible:latest-tools
    commands:
      - ansible-playbook -i inventory.ini myplaybook.yml
```
You can also leverage Namespace Files as follows:
```yaml
id: ansible
namespace: company.team

tasks:
  - id: ansible_task
    type: io.kestra.plugin.ansible.cli.AnsibleCLI
    namespaceFiles:
      enabled: true
    inputFiles:
      inventory.ini: "{{ read('inventory.ini') }}"
      myplaybook.yml: "{{ read('myplaybook.yml') }}"
    containerImage: cytopia/ansible:latest-tools
    commands:
      - ansible-playbook -i inventory.ini myplaybook.yml
```
Using input files to pass data from a trigger to a script task
Another use case for input files is when your custom scripts need input coming from other tasks or triggers.
Consider the following example flow, which runs whenever a new object with the prefix `raw/` arrives in the S3 bucket `declarative-orchestration`:
```yaml
id: s3TriggerCommands
namespace: company.team
description: process CSV file from S3 trigger

tasks:
  - id: wdir
    type: io.kestra.plugin.core.flow.WorkingDirectory
    inputFiles:
      data.csv: "{{ trigger.objects | jq('.[].uri') | first }}"
    outputFiles:
      - "*.csv"
      - "*.parquet"
    tasks:
      - id: cloneRepo
        type: io.kestra.plugin.git.Clone
        url: https://github.com/kestra-io/examples
        branch: main

      - id: python
        type: io.kestra.plugin.scripts.python.Commands
        description: this script reads a file `data.csv` from the S3 trigger
        taskRunner:
          type: io.kestra.plugin.scripts.runner.docker.Docker
        containerImage: ghcr.io/kestra-io/pydata:latest
        commands:
          - python scripts/clean_messy_dataset.py

triggers:
  - id: waitForS3object
    type: io.kestra.plugin.aws.s3.Trigger
    bucket: declarative-orchestration
    maxKeys: 1
    interval: PT1S
    filter: FILES
    action: MOVE
    prefix: raw/
    moveTo:
      key: archive/raw/
    accessKeyId: "{{ secret('AWS_ACCESS_KEY_ID') }}"
    secretKeyId: "{{ secret('AWS_SECRET_ACCESS_KEY') }}"
    region: "{{ secret('AWS_DEFAULT_REGION') }}"
```
Note that nothing is hardcoded specifically to Kestra in the Python script from GitHub. That script remains pure Python that you can run anywhere. Kestra’s trigger logic is stored along with orchestration and infrastructure configuration in the YAML flow definition.
This separation of concerns (i.e., not mixing orchestration and business logic) makes your code easier to test and keeps your business logic vendor-agnostic.
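To illustrate this vendor-agnostic pattern, here is a minimal sketch of what such a cleaning script could look like. The function name and the cleaning logic (trimming whitespace from every cell) are hypothetical and are not taken from the actual `clean_messy_dataset.py` in the linked repository; the point is that the script uses only the standard library and assumes nothing about the orchestrator beyond a `data.csv` file in its working directory.

```python
import csv


def clean_dataset(input_path: str, output_path: str) -> int:
    """Trim whitespace from every cell of a CSV and write a cleaned copy.

    Pure Python with no Kestra-specific code: the same script runs locally,
    in CI, or inside a flow's working directory. Returns the number of rows
    written (including the header).
    """
    with open(input_path, newline="") as src, \
         open(output_path, "w", newline="") as dst:
        writer = csv.writer(dst)
        rows = 0
        for row in csv.reader(src):
            # Strip stray whitespace from each cell before writing it back.
            writer.writerow(cell.strip() for cell in row)
            rows += 1
    return rows
```

Because the script touches only files in its working directory, you can test it with any local CSV before wiring it into a flow.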
Output Files
If you want to generate files in your script to make them available for download and usable in downstream tasks, you can leverage the `outputFiles` property.
Generating outputs from a script task using outputFiles
From 0.17.0, `outputDir` has been deprecated. Use the `outputFiles` property instead.
The `outputFiles` property allows you to specify a list of files to be persisted in Kestra's internal storage. Here is an example:
```yaml
id: output_text_files
namespace: company.team

tasks:
  - id: python_output
    type: io.kestra.plugin.scripts.python.Script
    taskRunner:
      type: io.kestra.plugin.core.runner.Process
    outputFiles:
      - "*.txt"
    script: |
      f = open("myfile.txt", "a")
      f.write("Hi, this is output from a script 👋")
      f.close()

  - id: read_output
    type: io.kestra.plugin.scripts.shell.Commands
    taskRunner:
      type: io.kestra.plugin.core.runner.Process
    commands:
      - cat {{ outputs.python_output.outputFiles['myfile.txt'] }}
```
Note how the `outputFiles` property specifies the list of files to persist in Kestra's internal storage. The property supports glob patterns.
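As a quick illustration of how shell-style glob patterns match file names, the following sketch uses Python's stdlib `fnmatch` module, which follows the same conventions (`*` matches any run of characters, `?` a single character). This is only an analogy for the matching rules, not Kestra's internal implementation.

```python
from fnmatch import fnmatch

# Candidate files a script might leave in its working directory.
candidates = ["myfile.txt", "report.csv", "notes.txt", "data.parquet"]

# Only the .txt files match the "*.txt" pattern from the example above.
matched = [name for name in candidates if fnmatch(name, "*.txt")]
print(matched)  # ['myfile.txt', 'notes.txt']
```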
The subsequent task can access the output file by leveraging the syntax `{{ outputs.yourTaskId.outputFiles['yourFileName.fileExtension'] }}`.