Meltano
Meltano is an ingestion tool that makes it easy to ingest from hundreds of data sources using pre-built connectors. Essentially, it handles the EL (“Extract, Load”) part of the ELT process.
We favor Meltano over other managed connectors because it is open source and CLI-based rather than UI-based. Tools that are primarily UI-based are hard to automate, and we believe automation is an important part of a robust data pipeline.
Meltano CLI
In Sidetrek, Meltano is already integrated with Dagster, so you can trigger Meltano ingestion jobs from Dagster (or schedule them).
This is great for production automation, but you will often want to run Meltano directly from the CLI, for example when testing and debugging.
It’s faster, and when something goes wrong it’s easier to debug because you don’t have to work out whether the error comes from Meltano or from Dagster.
You can run any meltano CLI commands using sidetrek run in the project root:
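As a minimal example, assuming the command being run is Meltano's standard version check:

```shell
# Run from the Sidetrek project root; `sidetrek run` forwards the rest to the Meltano CLI
sidetrek run meltano --version
```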
It should show you the version of meltano installed.
meltano, version 3.4.2
All sidetrek run does underneath is run the Meltano CLI within the right directory (the meltano command must be run inside the Meltano project directory) and wrap the command with poetry run so it runs in the right virtual env.
So it’s identical to running:
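A rough sketch of the equivalent manual steps. The directory name `meltano` is an assumption used for illustration; substitute the actual path to the Meltano project directory inside your Sidetrek project:

```shell
# Hypothetical path: change into the directory that contains meltano.yml
cd meltano
# Run the Meltano CLI inside the project's Poetry-managed virtual env
poetry run meltano --version
```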
You can run any meltano command using sidetrek run meltano:
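For instance, assuming a tap-csv extractor has already been added to the project (an assumption made only for this illustration), listing its settings might look like:

```shell
# Read-only: prints the current settings for the tap-csv plugin
sidetrek run meltano config tap-csv list
```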
Installing Meltano Taps and Targets
To use Meltano to ingest data, you need to set up two things: 1) an extractor (“tap”) and 2) a loader (“target”).
For example, to ingest data from a CSV file into Postgres, you can use tap-csv to extract data from CSV files and target-postgres to load the extracted data into Postgres tables.
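A minimal sketch of installing both plugins with `meltano add`, assuming the default variants are acceptable:

```shell
# Run inside the Meltano project directory (or prefix each command with `sidetrek run` from the project root)
meltano add extractor tap-csv
meltano add loader target-postgres
```

Each command adds the plugin to meltano.yml and installs it into the project.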
If multiple variants of a discoverable plugin are available, you can choose a specific (non-default) variant using the --variant option on meltano add:
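As an illustration, assuming you want the transferwise variant of tap-postgres (the variant that appears later in this page) rather than the default one:

```shell
# Pick a specific variant instead of the default
meltano add extractor tap-postgres --variant transferwise
```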
Installing Custom Taps or Targets
Custom plugins can be added to your project using the --custom option on meltano add.
For example, our example project uses a custom loader for Iceberg called target-iceberg because there’s no official loader for Iceberg yet. You can install a custom target like this:
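A sketch of the command, assuming the loader is named target-iceberg as above. Meltano is expected to prompt interactively for details such as the namespace, pip URL, and executable; the values to enter depend on where the custom loader is hosted and are not shown here:

```shell
# Add a custom (non-discoverable) loader; answer the interactive prompts that follow
meltano add --custom loader target-iceberg
```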
Configuring Taps or Targets
Once the taps and targets are installed, we need to configure them. You can set the configuration settings for a plugin using the meltano config command.
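For example, a minimal sketch that sets the tap-csv setting used elsewhere on this page, assuming the `meltano config <plugin> set` syntax of Meltano 3.4:

```shell
# Set a single setting on the tap-csv extractor
meltano config tap-csv set csv_files_definition extract/example_csv_files_def.json
```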
You can also use --interactive flag instead:
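A minimal example, again assuming tap-csv is the plugin being configured:

```shell
# Walk through the plugin's settings interactively
meltano config tap-csv set --interactive
```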
Running this will show you a list of all configurations.
Alternatively, you can add the configuration settings directly in the meltano.yml file in your project directory.
...
plugins:
  extractors:
    - name: tap-csv
      variant: meltanolabs
      pip_url: git+https://github.com/MeltanoLabs/tap-csv.git
      config:
        csv_files_definition: extract/example_csv_files_def.json
  loaders:
    ...
If you change anything directly in the meltano.yml file, you need to run meltano install to apply the changes.
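In its simplest form, this is just:

```shell
# Install every plugin declared in meltano.yml
meltano install
```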
This will install all the taps and targets specified in the meltano.yml file. If you’ve installed them already, it’ll simply use the cached version.
Sometimes you want to make sure you re-install them from scratch. You can do that using the --clean flag:
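A minimal example:

```shell
# Discard the existing plugin environments and reinstall everything from scratch
meltano install --clean
```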
If you have a lot of taps and targets, you probably don’t want to reinstall everything, so you can install a specific tap like this:
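For example, assuming the plugin in question is tap-csv and the `meltano install <type> <name>` form of Meltano 3.4 (newer releases may prefer a different syntax):

```shell
# Install only the tap-csv extractor
meltano install extractor tap-csv
```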
Or for a target:
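Under the same assumptions, with target-postgres as the example loader:

```shell
# Install only the target-postgres loader
meltano install loader target-postgres
```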
Explicit Inheritance
Sometimes you want to use the same plugin with multiple different configurations.
You can do so using explicit inheritance. Essentially, you’re creating a new plugin that inherits from an existing plugin.
For example, to create a variation of tap-postgres called tap-postgres--billing, you can use the --inherit-from flag:
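A minimal sketch, assuming tap-postgres is already part of the project:

```shell
# Create tap-postgres--billing as a plugin that inherits from tap-postgres
meltano add extractor tap-postgres--billing --inherit-from tap-postgres
```

The inheriting plugin can then be given its own configuration with meltano config, independently of its parent.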
The corresponding inheriting plugin definition in your meltano.yml project file will have an inherit_from field to specify the parent plugin.
...
plugins:
  extractors:
    - name: tap-postgres--billing
      inherit_from: tap-postgres
      variant: transferwise
      pip_url: pipelinewise-tap-postgres
...
Updating Plugins
You can update a plugin in your project using the --update option. Updating a plugin will re-add it to your project. This will do two things:
- Update the plugin lock file, the same as meltano lock --update would
- Update the plugin entry in meltano.yml, without overwriting any user-defined config or extras
For example, this will update the tap-gitlab extractor without changing any of the existing configurations:
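A sketch of the command, assuming --update is passed to meltano add as in recent Meltano 3.x releases:

```shell
# Re-add tap-gitlab: refreshes its lock file and meltano.yml entry, keeping existing config
meltano add --update extractor tap-gitlab
```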