Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Automatic expansion of environment variables in config can break config #554

Open
marvin-roesch opened this issue Nov 20, 2024 · 1 comment

Comments

@marvin-roesch
Copy link

This affects version 12 with the introduction of environment variable expansion in the config file (#483). A simple regex replacement is not a sufficient mechanism for handling these substitutions, as the values of environment variables may contain special characters that can escape and break the structure of the YAML file when not handled carefully.

For instance, assume a password contains a colon and is provided as environment variable, e.g. DB_PASSWORD=myp@ssword:. Furthermore, assume the trino-gateway config contains a section like this:

dataStore:
  driver: org.postgresql.Driver
  jdbcUrl: jdbc:postgresql://database:5432/postgres
  user: postgres
  password: ${ENV:DB_PASSWORD}

This is a valid YAML file and the user would expect this to be correctly expanded such that myp@assword: is the password used for connecting to the Postgres database. However, due to the simple replacement logic and since the variable is not contained in quotes in the config file, this leads to a YAML file where an empty dictionary with key myp@assword is added to the password "dictionary".

A further confusing factor is that the serverConfig section already supported "correct" environment variable replacements through Airlift's configuration system. Maybe it makes sense to tap into that for the rest of the config rather than building in fragile and error-prone replacement logic that operates on YAML files?

@oneonestar
Copy link
Member

Airlift (ie. serverConfig) works because it only support simple value and doesn't support nested types. It uses Map<String, String> for internal representation. It traverses all the values and substitutes the environment variables.

Other parts in the config file are real YAML which supports arbitrary nesting. It uses Map<String, Object> for internal representation and the above method doesn't work. We might need to hack into the YAML parsing / deserialization in order to make a perfect implementation. Or we might need to introduce some complex YAML template language.

We took an another approach which is to do a string replacement before parsing the YAML. Quoted string or block scalars can mitigate the problem. We should add a section in documentation to mention this limitation.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Development

No branches or pull requests

2 participants