Meltano
Meltano is an ingestion tool that makes it easy to ingest from hundreds of data sources using pre-built connectors. Essentially, it handles the EL (“Extract, Load”) part of the ELT process.
We favor Meltano over other managed connectors because it is open source and CLI-based rather than UI-based. Tools that are primarily UI-based are not easy to automate, and we believe automation is an important part of a robust data pipeline.
Meltano CLI
In Sidetrek, Meltano is already integrated with Dagster, so you can trigger Meltano ingestion jobs from Dagster (or schedule them).
This is great for production automation, but you will often want to use Meltano directly from the CLI, for example for testing and debugging.
It's faster and, when something goes wrong, easier to debug because you don't have to figure out whether the error comes from Meltano or from Dagster.
You can run any meltano CLI command using sidetrek run in the project root:
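As a minimal sketch, checking the installed version could look like this (it assumes the sidetrek CLI is installed and that you are in the root of a Sidetrek project):

```shell
# Run from the root of your Sidetrek project.
# Everything after "sidetrek run" is handed to the Meltano CLI.
sidetrek run meltano --version
```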
It should show you the version of meltano installed.
Under the hood, sidetrek run simply runs the Meltano CLI within the right directory (the meltano command must be run inside the Meltano project directory) and wraps the command with poetry run so that it uses the right virtual env.
So it’s identical to running:
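Roughly, the manual equivalent is sketched below. The directory path is a placeholder, not the real layout of your project, so replace it with wherever the Meltano project lives inside your Sidetrek project:

```shell
# Hypothetical path: point this at the Meltano project directory
# inside your own Sidetrek project.
(
  cd path/to/your/meltano/project &&
  # Run the Meltano CLI inside the Poetry-managed virtual env.
  poetry run meltano --version
)
```

The subshell (the parentheses) keeps your current working directory unchanged after the command finishes.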
You can run any meltano command using sidetrek run meltano:
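For instance, listing the available Meltano subcommands might look like this. It is only a sketch: --help is a standard Meltano CLI flag, and we assume here that sidetrek run passes it through unchanged, the same way it passes --version:

```shell
# Show the Meltano CLI help, including the list of subcommands.
sidetrek run meltano --help
```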
Installing Meltano Taps and Targets
To use Meltano to ingest data, you need to set up two things: 1) an extractor (“tap”) and 2) a loader (“target”).
For example, to ingest data from a CSV file to Postgres, you can use tap-csv to extract data from any CSV files and target-postgres to load the extracted data into Postgres tables.
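A minimal sketch of installing that pair with meltano add, run through sidetrek run from the project root (both plugin names come from the example above):

```shell
# Add the CSV extractor (tap) to the Meltano project.
sidetrek run meltano add extractor tap-csv

# Add the Postgres loader (target) to the Meltano project.
sidetrek run meltano add loader target-postgres
```

Each command adds the plugin to meltano.yml and installs it into the project.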
If multiple variants of a discoverable plugin are available, you can choose a specific (non-default) variant using the --variant option on meltano add:
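As an illustration, the command below picks a non-default variant of target-postgres. The variant name transferwise is only an example; check Meltano Hub for the variants that are actually available for your plugin:

```shell
# Install a specific variant instead of the default one.
# "transferwise" is an illustrative variant name; verify it on Meltano Hub.
sidetrek run meltano add loader target-postgres --variant transferwise
```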
Installing Custom Taps or Targets
Plugins that are not discoverable can be added to your project as custom plugins using the --custom option:
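A sketch of the general shape, using tap-covid-19 purely as an illustrative plugin name (it is not part of this project):

```shell
# Add a custom extractor. "tap-covid-19" is an illustrative name only.
# Meltano prompts interactively for details such as the namespace,
# pip URL, and executable name of the package.
sidetrek run meltano add --custom extractor tap-covid-19
```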
For example, our example project uses a custom loader for Iceberg called target-iceberg because there's no official loader for Iceberg yet. You can install a custom target like this:
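A minimal sketch, assuming you answer the interactive prompts with the details of your own target-iceberg package (those details, such as the pip URL, are not shown here):

```shell
# Add target-iceberg as a custom loader.
# Meltano asks for the plugin details (namespace, pip URL, executable, ...)
# interactively; supply the values for your target-iceberg package.
sidetrek run meltano add --custom loader target-iceberg
```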
Configuring Taps or Targets
Once the taps and targets are installed, we need to configure them. You can set the configuration settings for a plugin using the meltano config command.
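For example, setting connection details on target-postgres could look like the sketch below. The setting names (host, port) and their values are assumptions that depend on the plugin variant you installed, so list the available settings first:

```shell
# Show the settings the plugin supports and their current values.
sidetrek run meltano config target-postgres list

# Set individual settings. Names and values here are illustrative;
# use the ones reported by the "list" command above.
sidetrek run meltano config target-postgres set host localhost
sidetrek run meltano config target-postgres set port 5432
```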
You can also use the --interactive flag instead:
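A sketch of the interactive form, using target-postgres as the example plugin:

```shell
# Walk through the plugin's settings interactively.
sidetrek run meltano config target-postgres set --interactive
```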
Running this will show you a list of all configurations.
Finally, you can add the configuration settings directly in the meltano.yml file in your project directory.
If you change anything directly in the meltano.yml file, you need to run meltano install to apply the changes.
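Through sidetrek run, that would look something like this:

```shell
# Install every plugin declared in meltano.yml.
sidetrek run meltano install
```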
This will install all the taps and targets specified in the meltano.yml file. If you've already installed them, it will simply use the cached version.
Sometimes you want to make sure you re-install them from scratch. You can do that using the --clean flag:
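A minimal sketch of a clean re-install:

```shell
# Discard the existing plugin installations and re-install from scratch.
sidetrek run meltano install --clean
```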
If you have a lot of taps and targets, you probably don't want to have to reinstall everything, so you can update a specific tap or target like this:
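A sketch for a single tap, using tap-csv as the example. The plugin-type-then-name form shown here is the one long used in the Meltano docs; treat the exact syntax as an assumption and check meltano install --help for your Meltano version:

```shell
# Install (or re-install) only the tap-csv extractor.
# Add --clean if you want this one plugin rebuilt from scratch.
sidetrek run meltano install extractor tap-csv
```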
Or for a target:
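The same sketch for a loader, again assuming the plugin-type-then-name form is accepted by your Meltano version:

```shell
# Install (or re-install) only the target-postgres loader.
sidetrek run meltano install loader target-postgres
```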
Explicit Inheritance
Sometimes you want to use the same plugin with multiple different configurations.
You can do so using explicit inheritance. Essentially, you’re creating a new plugin that inherits from an existing plugin.
For example, to create a variation of tap-postgres called tap-postgres--billing, you can use the --inherit-from flag:
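A minimal sketch of that command, run through sidetrek run and assuming tap-postgres has already been added to the project:

```shell
# Create tap-postgres--billing as a plugin that inherits from tap-postgres.
sidetrek run meltano add extractor tap-postgres--billing --inherit-from tap-postgres
```

The new plugin can then be configured independently, for example with meltano config tap-postgres--billing set.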
The corresponding inheriting plugin definition in your meltano.yml project file will have an inherit_from field to specify the parent plugin.
Updating Plugins
You can update a plugin in your project using the --update option. Updating a plugin will re-add it to your project. This will do two things:
- Update the plugin lock file, the same as meltano lock --update would
- Update the plugin entry in meltano.yml, without overwriting any user-defined config or extras
For example, this will update the tap-gitlab extractor without changing any of the existing configurations:
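A sketch of the update command, assuming tap-gitlab is already part of your project:

```shell
# Re-add tap-gitlab, refreshing its lock file and meltano.yml entry
# while keeping user-defined config and extras.
sidetrek run meltano add --update extractor tap-gitlab
```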