Module Variables

When calling the terraglue module from GitHub, users can set variables to apply custom configurations. Because there are many variables available, their definitions are split into separate tables according to their context.

About default values and required variables

Although terraglue has many variables, almost every one of them has a default value. The idea is to open up a wide range of configuration possibilities while keeping the project complexity low.

So, to make things clear, note the following labels:

  • marks a variable that is required regardless of which operation mode was chosen
  • ⭐ marks a variable that is required only in some situations and conditions
  • 🚨 marks a variable that you will probably want to change when using the production mode

General Variables

These variables apply in a generic context or affect the whole project.

| Variable | Type | Description | Default |
| --- | --- | --- | --- |
| aws_provider_config | map | References a local file where AWS credentials are stored | Check the source |
| mode | string | Defines an operation mode that enables users to choose to use the module for learning or production/development purposes | production |
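
As a minimal sketch of how the operation mode might be selected, the module call below switches from the default production mode to learning mode. The source address is a placeholder, and the bucket and database names are hypothetical. The two job_output_* variables are included because the Job table marks them as required in learning mode.

```hcl
module "terraglue" {
  # Placeholder source: replace <owner> with the account hosting the terraglue repository
  source = "git::https://github.com/<owner>/terraglue"

  # "production" is the default; "learning" is the other mode named on this page
  mode = "learning"

  # Hypothetical names, shown only to illustrate the variables required in this mode
  glue_scripts_bucket_name = "my-glue-scripts-bucket"
  job_output_bucket_name   = "my-glue-output-bucket"
  job_output_database      = "my_glue_database"
}
```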

IAM Configuration

The following variables affect the creation of IAM roles and policies.

| Variable | Type | Description | Default |
| --- | --- | --- | --- |
| flag_create_iam_role | bool | Flag that enables the creation of an IAM role within the module | false |
| glue_policies_path | string | Folder containing the JSON files used to create IAM policies for the Glue IAM role. Users should pass this variable when var.flag_create_iam_role is true | policy/glue |
| glue_role_name | string | Name of the IAM role to be assumed by the Glue job. Used only when var.flag_create_iam_role is true or when var.mode = learning | terraglue-glue-job-role |
| ⭐ glue_role_arn | string | ARN of the IAM role to be assumed by the Glue job. Users must pass this variable when var.flag_create_iam_role is false | "" but required if flag_create_iam_role is false |
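
The two IAM alternatives described in this table can be sketched as follows. This is an illustration only: the module labels, the source placeholder, the role name, and the ARN (with its account ID) are hypothetical, and the other variables each call needs are omitted.

```hcl
# Alternative 1: reuse an existing role (flag_create_iam_role keeps its default of false)
module "terraglue_existing_role" {
  source = "git::https://github.com/<owner>/terraglue" # placeholder source

  # Hypothetical ARN; required because the module does not create the role here
  glue_role_arn = "arn:aws:iam::123456789012:role/my-existing-glue-role"

  # ... other required variables omitted
}

# Alternative 2: let the module create the role from local JSON policy files
module "terraglue_new_role" {
  source = "git::https://github.com/<owner>/terraglue" # placeholder source

  flag_create_iam_role = true
  glue_policies_path   = "policy/glue"      # documented default, shown explicitly
  glue_role_name       = "my-glue-job-role" # hypothetical name

  # ... other required variables omitted
}
```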

KMS Key Configuration

The following variables help to take specific actions related to KMS keys.

| Variable | Type | Description | Default |
| --- | --- | --- | --- |
| flag_create_kms_key | bool | Flag that enables the creation of a KMS key to be used in the Glue job security configuration | false |
| kms_policies_path | string | Folder containing the JSON files used to create IAM policies for the KMS key. Users should pass this variable when var.flag_create_kms_key is true | policy/kms |
| kms_key_alias | string | Alias for the KMS key created. Users should pass this variable when var.flag_create_kms_key is true | alias/kms-glue-s3 |
| ⭐ kms_key_arn | string | ARN of the KMS key used to encrypt data generated by the Glue job. Users must pass this variable when var.flag_create_kms_key is false | "" but required if flag_create_kms_key is false |
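
As a hedged example, the call below asks the module to create the KMS key instead of receiving an existing kms_key_arn. The source placeholder and the alias value are hypothetical. The policy folder is the documented default, written out only for clarity.

```hcl
module "terraglue" {
  source = "git::https://github.com/<owner>/terraglue" # placeholder source

  # With this flag set to true, kms_key_arn does not need to be passed
  flag_create_kms_key = true
  kms_policies_path   = "policy/kms"        # documented default
  kms_key_alias       = "alias/my-glue-key" # hypothetical alias

  # ... other required variables omitted
}
```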

Glue Configuration

Finally, we reach the set of variables that customize the deployment of a Glue job at the deepest level. There are many variables in this context, so it's worth splitting them into further sections.

S3 Files

| Variable | Type | Description | Default |
| --- | --- | --- | --- |
| glue_app_dir | string | Application directory containing the Glue subfolders and files to be uploaded to S3. The path is relative to the root Terraform module from which terraglue is called | app |
| subfolders_to_upload | list | A list of all valid subfolders inside var.glue_app_dir that will be uploaded to S3 | ["src", "sql", "utils"] |
| file_extensions_to_upload | list | A list of all valid file extensions for files in the glue_scripts_local_dir variable to be uploaded to S3 | [".py", ".json", ".sql"] |
| glue_scripts_bucket_name | string | Name of the bucket where Glue application files will be stored | Required |
| glue_scripts_bucket_prefix | string | An optional S3 prefix to organize Glue application files | jobs/ |
| glue_main_script_path | string | Location of the Python file used as the main Spark application script for the Glue job. The path is relative to the root Terraform module from which terraglue is called | app/src/main.py |
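
To illustrate how these variables relate to each other, the sketch below assumes a hypothetical project whose Glue application lives in an etl/ folder rather than the default app/ folder. The folder layout, bucket name, prefix, and source placeholder are assumptions made for the example.

```hcl
module "terraglue" {
  source = "git::https://github.com/<owner>/terraglue" # placeholder source

  # Hypothetical layout, relative to the root module:
  #   etl/src/main.py
  #   etl/sql/*.sql
  glue_app_dir              = "etl"
  subfolders_to_upload      = ["src", "sql"]
  file_extensions_to_upload = [".py", ".sql"]
  glue_main_script_path     = "etl/src/main.py"

  glue_scripts_bucket_name   = "my-glue-scripts-bucket" # hypothetical, always required
  glue_scripts_bucket_prefix = "jobs/etl/"              # hypothetical prefix

  # ... other required variables omitted
}
```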

Security Configuration

| Variable | Type | Description | Default |
| --- | --- | --- | --- |
| glue_apply_security_configuration | bool | Flag that controls whether the Security Configuration is applied to the Glue job | true |
| glue_cloudwatch_encryption_mode | string | Encryption mode for CloudWatch logs generated by the Glue job, used in the job security configuration | SSE-KMS |
| glue_job_bookmark_encryption_mode | string | Encryption mode for job bookmarks of the Glue job, used in the job security configuration | DISABLED |
| glue_s3_encryption_mode | string | Encryption mode for S3 data generated by the Glue job, used in the job security configuration | SSE-KMS |
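
As a sketch, the call below turns on job bookmark encryption while keeping the other defaults. CSE-KMS is the value AWS Glue security configurations accept for job bookmark encryption. Whether the module forwards the string unchanged is an assumption, and the source placeholder and key ARN are hypothetical.

```hcl
module "terraglue" {
  source = "git::https://github.com/<owner>/terraglue" # placeholder source

  glue_apply_security_configuration = true      # documented default
  glue_cloudwatch_encryption_mode   = "SSE-KMS" # documented default
  glue_s3_encryption_mode           = "SSE-KMS" # documented default

  # Changed from the default "DISABLED"; assumes the value is passed through as-is
  glue_job_bookmark_encryption_mode = "CSE-KMS"

  # Hypothetical key ARN, needed because flag_create_kms_key defaults to false
  kms_key_arn = "arn:aws:kms:us-east-1:123456789012:key/00000000-0000-0000-0000-000000000000"

  # ... other required variables omitted
}
```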

Job

| Variable | Type | Description | Default |
| --- | --- | --- | --- |
| 🚨 glue_job_name | string | A name reference for the Glue job to be created | terraglue-sample-job |
| 🚨 glue_job_description | string | A short description for the Glue job | An example of a Glue job from the terraglue source Terraform module |
| glue_job_version | string | Glue version for the job to be created | "4.0" |
| 🚨 glue_job_max_retries | string | Maximum number of retries when a job run fails | "0" |
| 🚨 glue_job_timeout | number | Timeout (in minutes) for job execution | 10 |
| 🚨 glue_job_worker_type | string | Node/worker type used to process data in the AWS managed Glue cluster | "G.1X" |
| 🚨 glue_job_number_of_workers | number | Number of workers used to process data in the AWS managed Glue cluster | 3 |
| glue_job_python_version | string | Python version to be used in the job | "3" |
| 🚨 glue_job_max_concurrent_runs | number | Maximum number of concurrent runs for the job | 2 |
| 🚨 glue_job_args | map | A map of all job arguments to be deployed with the Glue job | Check the source |
| ⭐ job_output_bucket_name | string | The name of the S3 output bucket for the Glue job when calling the module in learning mode | "" but required in learning mode |
| ⭐ job_output_database | string | The name of the Glue database for the Glue job when calling the module in learning mode | "" but required in learning mode |
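
Putting the 🚨 variables together, a production-mode call might look roughly like the sketch below. All values are hypothetical choices for an imaginary job, and the source placeholder, bucket name, and ARNs are assumptions. glue_job_args is left at its default because the default map is not listed on this page.

```hcl
module "terraglue" {
  source = "git::https://github.com/<owner>/terraglue" # placeholder source

  mode = "production" # documented default, shown explicitly

  # Hypothetical values for the required inputs when no role or key is created
  glue_scripts_bucket_name = "my-glue-scripts-bucket"
  glue_role_arn            = "arn:aws:iam::123456789012:role/my-existing-glue-role"
  kms_key_arn              = "arn:aws:kms:us-east-1:123456789012:key/00000000-0000-0000-0000-000000000000"

  # Job settings that usually differ from the sample defaults
  glue_job_name                = "sales-daily-etl" # hypothetical
  glue_job_description         = "Daily ETL job for sales data"
  glue_job_max_retries         = "1" # declared as string in the table above
  glue_job_timeout             = 60  # minutes
  glue_job_worker_type         = "G.2X"
  glue_job_number_of_workers   = 10
  glue_job_max_concurrent_runs = 1
}
```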