Module Variables
When calling the terraglue module from GitHub, users can set variables to apply custom configurations. Because many variables are available, their definitions are split into separate tables according to their context.
About default values and required variables
Although terraglue has many variables, almost every one of them has a default value. The idea is to open up a wide range of configuration possibilities while keeping the project's complexity low.
To make things clear, the labels used for the variables have the following meanings:
- the variable is required regardless of which operation mode is chosen
- the variable is required only in some situations and conditions
- you may want to change the variable when using the production mode
General Variables
These variables apply in a generic context or affect the whole project.
Variable | Type | Description | Default |
---|---|---|---|
aws_provider_config | map | References a local file where AWS credentials are stored | Check the source |
mode | string | Defines an operation mode that lets users choose between using the module for learning or for production/development purposes | production |
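As an illustration of how these variables are passed, the block below sketches a minimal module call in the default production mode. The source value is a placeholder (the repository address is not given in this section), and the bucket name and ARNs are hypothetical values chosen only for the example.

```hcl
# A minimal sketch of a terraglue call; all values are illustrative.
module "terraglue" {
  # Placeholder: replace with the actual GitHub address of the terraglue module
  source = "git::https://github.com/<owner>/terraglue?ref=<version>"

  # Operation mode ("production" is the documented default)
  mode = "production"

  # Required: bucket that stores the Glue application files (hypothetical name)
  glue_scripts_bucket_name = "my-glue-scripts-bucket"

  # Required while flag_create_iam_role and flag_create_kms_key keep their
  # default value of false (hypothetical ARNs)
  glue_role_arn = "arn:aws:iam::123456789012:role/my-glue-job-role"
  kms_key_arn   = "arn:aws:kms:us-east-1:123456789012:key/11111111-2222-3333-4444-555555555555"
}
```

Every variable not listed in the call keeps the default shown in the tables.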
IAM Configuration
The following variables affect the creation of IAM roles and policies.
Variable | Type | Description | Default |
---|---|---|---|
flag_create_iam_role | bool | Flag that enables the creation of an IAM role within the module | false |
glue_policies_path | string | Folder containing the JSON files used to create IAM policies for the Glue IAM role. Users should pass this variable when var.flag_create_iam_role is true | policy/glue |
glue_role_name | string | Name of the IAM role to be assumed by the Glue job. This variable is used only when var.flag_create_iam_role is true or when var.mode = learning | terraglue-glue-job-role |
glue_role_arn | string | ARN of the IAM role to be assumed by the Glue job. Users must pass this variable when var.flag_create_iam_role is false | "" (required if flag_create_iam_role is false) |
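The two IAM paths described in the table can be sketched as follows. This example takes the path where the module creates the role, so glue_role_arn is left out. The source value is a placeholder, the bucket name and KMS ARN are hypothetical, and how the module resolves glue_policies_path (assumed here to be relative to the calling root module) is an assumption not confirmed by this section.

```hcl
# Sketch: let terraglue create the Glue IAM role from local JSON policies.
module "terraglue" {
  # Placeholder: replace with the actual GitHub address of the terraglue module
  source = "git::https://github.com/<owner>/terraglue?ref=<version>"

  # Create the role inside the module instead of passing glue_role_arn
  flag_create_iam_role = true
  glue_policies_path   = "policy/glue"           # folder with the JSON policy files
  glue_role_name       = "my-team-glue-job-role" # hypothetical role name

  # Other required inputs (hypothetical values)
  glue_scripts_bucket_name = "my-glue-scripts-bucket"
  kms_key_arn              = "arn:aws:kms:us-east-1:123456789012:key/11111111-2222-3333-4444-555555555555"
}
```

To reuse an existing role instead, keep flag_create_iam_role at its default of false and pass glue_role_arn.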
KMS Key Configuration
The following variables help to take specific actions related to KMS keys.
Variable | Type | Description | Default |
---|---|---|---|
flag_create_kms_key | bool | Flag that enables the creation of a KMS key to be used in the Glue job security configuration | false |
kms_policies_path | string | Folder containing the JSON files used to create IAM policies for the KMS key. Users should pass this variable when var.flag_create_kms_key is true | policy/kms |
kms_key_alias | string | Alias for the KMS key created. Users should pass this variable when var.flag_create_kms_key is true | alias/kms-glue-s3 |
kms_key_arn | string | ARN of the KMS key used to encrypt data generated by the Glue job. Users must pass this variable when var.flag_create_kms_key is false | "" (required if flag_create_kms_key is false) |
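A possible way to combine the KMS variables is sketched below, with the module creating the key so that kms_key_arn is not needed. The source value is a placeholder, the alias, bucket name, and role ARN are hypothetical, and the interpretation of kms_policies_path as a path relative to the calling root module is an assumption.

```hcl
# Sketch: let terraglue create the KMS key used by the job security configuration.
module "terraglue" {
  # Placeholder: replace with the actual GitHub address of the terraglue module
  source = "git::https://github.com/<owner>/terraglue?ref=<version>"

  # Create the key inside the module instead of passing kms_key_arn
  flag_create_kms_key = true
  kms_policies_path   = "policy/kms"            # folder with the JSON key policy files
  kms_key_alias       = "alias/my-glue-s3-key"  # hypothetical alias

  # Other required inputs (hypothetical values)
  glue_scripts_bucket_name = "my-glue-scripts-bucket"
  glue_role_arn            = "arn:aws:iam::123456789012:role/my-glue-job-role"
}
```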
Glue Configuration
Finally, this set of variables makes it possible to customize the deployment of a Glue job in detail. Because many variables are available for deploying a Glue job, they are split into further sections.
S3 Files
Variable | Type | Description | Default |
---|---|---|---|
glue_app_dir | string | Application directory containing the Glue subfolders and files to be uploaded to S3. The path is relative to the root Terraform module from which terraglue is called | app |
subfolders_to_upload | list | A list of the valid subfolders located in var.glue_app_dir that will be uploaded to S3 | ["src", "sql", "utils"] |
file_extensions_to_upload | list | A list of the valid file extensions for files in the glue_scripts_local_dir variable to be uploaded to S3 | [".py", ".json", ".sql"] |
glue_scripts_bucket_name | string | Name of the bucket where Glue application files will be stored | Required |
glue_scripts_bucket_prefix | string | An optional S3 prefix to organize Glue application files | jobs/ |
glue_main_script_path | string | Location of the Python file used as the main Spark application script for the Glue job. The path is relative to the root Terraform module from which terraglue is called | app/src/main.py |
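To make the relationship between these variables concrete, the sketch below assumes a root module that holds an app/ directory with src/ and sql/ subfolders. That layout, the bucket name, and the ARNs are hypothetical, and the source value is a placeholder.

```hcl
# Sketch: upload only Python and SQL files from app/src and app/sql.
# Assumed layout in the calling root module (hypothetical):
#   app/src/main.py
#   app/sql/query.sql
module "terraglue" {
  # Placeholder: replace with the actual GitHub address of the terraglue module
  source = "git::https://github.com/<owner>/terraglue?ref=<version>"

  # Local files to upload
  glue_app_dir              = "app"
  subfolders_to_upload      = ["src", "sql"]
  file_extensions_to_upload = [".py", ".sql"]
  glue_main_script_path     = "app/src/main.py"

  # Destination in S3 (hypothetical bucket name)
  glue_scripts_bucket_name   = "my-glue-scripts-bucket"
  glue_scripts_bucket_prefix = "jobs/"

  # Required with the default flags (hypothetical ARNs)
  glue_role_arn = "arn:aws:iam::123456789012:role/my-glue-job-role"
  kms_key_arn   = "arn:aws:kms:us-east-1:123456789012:key/11111111-2222-3333-4444-555555555555"
}
```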
Security Configuration
Variable | Type | Description | Default |
---|---|---|---|
glue_apply_security_configuration | bool | Flag that controls whether a security configuration is applied to the Glue job | true |
glue_cloudwatch_encryption_mode | string | Encryption mode for CloudWatch logs generated by the Glue job, used in the job security configuration | SSE-KMS |
glue_job_bookmark_encryption_mode | string | Encryption mode for the job bookmarks of the Glue job, used in the job security configuration | DISABLED |
glue_s3_encryption_mode | string | Encryption mode for S3 data generated by the Glue job, used in the job security configuration | SSE-KMS |
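The sketch below turns on bookmark encryption in addition to the defaults. It assumes the module passes these strings through to an AWS Glue security configuration, which accepts DISABLED or SSE-KMS for CloudWatch logs, DISABLED or CSE-KMS for job bookmarks, and DISABLED, SSE-KMS, or SSE-S3 for S3 data. Whether terraglue validates or restricts these values is not stated in this section. The source value is a placeholder and the bucket name and ARNs are hypothetical.

```hcl
# Sketch: apply a security configuration with all three encryption modes set.
module "terraglue" {
  # Placeholder: replace with the actual GitHub address of the terraglue module
  source = "git::https://github.com/<owner>/terraglue?ref=<version>"

  glue_apply_security_configuration = true
  glue_cloudwatch_encryption_mode   = "SSE-KMS"
  glue_job_bookmark_encryption_mode = "CSE-KMS" # the documented default is DISABLED
  glue_s3_encryption_mode           = "SSE-KMS"

  # Required with the default flags (hypothetical values)
  glue_scripts_bucket_name = "my-glue-scripts-bucket"
  glue_role_arn            = "arn:aws:iam::123456789012:role/my-glue-job-role"
  kms_key_arn              = "arn:aws:kms:us-east-1:123456789012:key/11111111-2222-3333-4444-555555555555"
}
```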
Job
Variable | Type | Description | Default |
---|---|---|---|
glue_job_name | string | A name for the Glue job to be created | terraglue-sample-job |
glue_job_description | string | A short description of the Glue job | An example of a Glue job from the terraglue source Terraform module |
glue_job_version | string | Glue version for the job to be created | "4.0" |
glue_job_max_retries | string | Maximum number of retries when a job run fails | "0" |
glue_job_timeout | number | Timeout (in minutes) for job execution | 10 |
glue_job_worker_type | string | Node/worker type used to process data in the AWS-managed Glue cluster | "G.1X" |
glue_job_number_of_workers | number | Number of workers used to process data in the AWS-managed Glue cluster | 3 |
glue_job_python_version | string | Python version to be used in the job | "3" |
glue_job_max_concurrent_runs | number | Maximum number of concurrent runs for the job | 2 |
glue_job_args | map | A map of all job arguments to be deployed with the Glue job | Check the source |
job_output_bucket_name | string | Name of the S3 output bucket for the Glue job when calling the module in learning mode | "" (required in learning mode) |
job_output_database | string | Name of the Glue database for the Glue job when calling the module in learning mode | "" (required in learning mode) |
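As a rough illustration of the job-level variables, the sketch below sizes a job and passes a few arguments. The job name, description, bucket name, and ARNs are hypothetical, and the source value is a placeholder. The two keys starting with "--" in glue_job_args that are not upper case are standard AWS Glue job parameters, while --OUTPUT_TABLE is a made-up custom argument. Whether a custom glue_job_args map replaces or extends the module default is not stated in this section, so it is treated here as a full replacement.

```hcl
# Sketch: customize the Glue job sizing, retries, and arguments.
module "terraglue" {
  # Placeholder: replace with the actual GitHub address of the terraglue module
  source = "git::https://github.com/<owner>/terraglue?ref=<version>"

  glue_job_name        = "sales-daily-etl" # hypothetical job name
  glue_job_description = "Daily ETL for the sales dataset"
  glue_job_version     = "4.0"
  glue_job_max_retries = "1" # declared as a string in the table above
  glue_job_timeout     = 30  # minutes

  glue_job_worker_type         = "G.2X"
  glue_job_number_of_workers   = 5
  glue_job_python_version      = "3"
  glue_job_max_concurrent_runs = 1

  glue_job_args = {
    "--job-language"   = "python"
    "--enable-metrics" = "true"
    "--OUTPUT_TABLE"   = "tb_sales_daily" # hypothetical custom argument
  }

  # Required with the default flags (hypothetical values)
  glue_scripts_bucket_name = "my-glue-scripts-bucket"
  glue_role_arn            = "arn:aws:iam::123456789012:role/my-glue-job-role"
  kms_key_arn              = "arn:aws:kms:us-east-1:123456789012:key/11111111-2222-3333-4444-555555555555"
}
```

When the module is called with mode set to learning, job_output_bucket_name and job_output_database would also need to be set, as noted in the table.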