Redshift

Amazon's Redshift is a cloud-based data warehouse designed to support analytical queries. This driver receives less testing than our BigQuery driver because the cheapest possible Redshift test system costs over $100/month. Sponsors are welcome!

Example locators

These are identical to PostgreSQL locators, except that postgres is replaced by redshift:

  • redshift://postgres:$PASSWORD@127.0.0.1:5432/postgres#my_table
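
As a minimal sketch, the locator can be assembled from its parts in a shell script. The variable names and values below are hypothetical placeholders, not names that dbcrossbar reads itself; only the overall redshift://user:pass@host:port/database#table shape comes from the example above. As with any URL, a password containing characters such as @ or / would probably need to be percent-encoded first.

```shell
# Hypothetical connection details; substitute your own.
REDSHIFT_USER="postgres"
REDSHIFT_PASSWORD="example-password"   # placeholder, not a real credential
REDSHIFT_HOST="127.0.0.1"
REDSHIFT_PORT="5432"
REDSHIFT_DATABASE="postgres"
REDSHIFT_TABLE="my_table"

# Same shape as a PostgreSQL locator, with the scheme changed to redshift.
LOCATOR="redshift://${REDSHIFT_USER}:${REDSHIFT_PASSWORD}@${REDSHIFT_HOST}:${REDSHIFT_PORT}/${REDSHIFT_DATABASE}#${REDSHIFT_TABLE}"

# Note: this prints the password along with the rest of the locator.
printf '%s\n' "$LOCATOR"
```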

Configuration & authentication

Authentication is currently handled using the redshift://user:pass@... syntax. We may add alternative mechanisms at some point to avoid passing credentials on the command line.

The following environment variables are required, except where marked optional:

  • AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY: Set these to your AWS credentials.
  • AWS_SESSION_TOKEN (optional): This should work, but it hasn't been tested.
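
A script that wraps the tool might check for the two required variables before doing anything else. The helper below is a hypothetical sketch, not part of dbcrossbar; it only tests whether the variables are set and non-empty, not whether the credentials are valid.

```shell
# Hypothetical helper: succeed only if the required AWS variables are set.
check_aws_env() {
    missing=""
    for name in AWS_ACCESS_KEY_ID AWS_SECRET_ACCESS_KEY; do
        # Look up the variable whose name is stored in $name.
        eval "value=\${$name:-}"
        if [ -z "$value" ]; then
            missing="$missing $name"
        fi
    done
    if [ -n "$missing" ]; then
        echo "missing required variables:$missing" >&2
        return 1
    fi
    return 0
}
```

It could then guard a later command, for example `check_aws_env && echo "AWS credentials present"`.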

The following --temporary flag is required:

  • --temporary=s3://$S3_TEMP_BUCKET: Specify where to stage files for loading or unloading data.

Authentication credentials for COPY may be passed using --to-arg. For example:

  • --to-arg=iam_role=$ROLE
  • --to-arg=region=$REGION

Finding the right arguments for your setup may require some experimentation.
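
Putting the pieces together, a load into Redshift might look roughly like the sketch below. The bucket name, role ARN, region, password, and the csv:my_table.csv source locator are all assumed placeholders, and the script only prints the assembled command so it can be reviewed before running.

```shell
# Hypothetical placeholder values; substitute your own.
S3_TEMP_BUCKET="${S3_TEMP_BUCKET:-example-temp-bucket}"
ROLE="${ROLE:-arn:aws:iam::123456789012:role/example-redshift-copy}"
REGION="${REGION:-us-east-1}"
PASSWORD="${PASSWORD:-example-password}"

# Store the arguments as positional parameters so each one stays a single word.
set -- cp \
    --if-exists=overwrite \
    "--temporary=s3://$S3_TEMP_BUCKET" \
    "--to-arg=iam_role=$ROLE" \
    "--to-arg=region=$REGION" \
    "csv:my_table.csv" \
    "redshift://postgres:$PASSWORD@127.0.0.1:5432/postgres#my_table"

# Print the command for review (this includes the password).
# To actually run it, replace the three printf lines with: dbcrossbar "$@"
printf 'dbcrossbar'
printf ' %s' "$@"
printf '\n'
```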

If you need to generate "-- partner:" SQL comments for an AWS Redshift partner program, you can do so as follows:

  • --to-arg=partner="myapp v1.0"
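
The quotes matter because the value contains a space. The shell removes them and passes the whole thing as a single argument, as this small sketch shows (show_args is a hypothetical helper used only for illustration):

```shell
# Hypothetical helper: print each argument it receives in brackets, one per line.
show_args() {
    for arg in "$@"; do
        printf '[%s]\n' "$arg"
    done
}

# With quotes, the shell passes one argument: [--to-arg=partner=myapp v1.0]
show_args --to-arg=partner="myapp v1.0"
```

Without the quotes, v1.0 would arrive as a separate argument.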

Supported features

redshift features:
- conv FROM
- cp FROM:
  --from-arg=$NAME=$VALUE --where=$SQL_EXPR
- cp TO:
  --to-arg=$NAME=$VALUE
  --if-exists=error --if-exists=append --if-exists=overwrite --if-exists=upsert-on:col