Run Your Data Project

Let’s use the Sidetrek CLI to run your newly created data project.

Run Your Project

First, go into the newly created project directory.

cd your_project

Run the following command to start the end-to-end data pipeline you’ve just created.

sidetrek start

If you’re running this command for the first time, it will take a few minutes to download all the necessary Docker images. Superset in particular is quite heavy and can take a while to start. Please be patient! 🧘🏽‍♀️

There are two places you can check the status of your services:

Dagster server - you’ll immediately see the logs in the terminal once you run sidetrek start.
Docker Desktop - you can check the status of all the other services in the Docker Desktop app.

Once everything is running, you can access the Dagster dashboard at http://localhost:3000 and the Superset dashboard at http://localhost:8088.
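If you would rather not keep refreshing the browser while the images download, a small polling helper can tell you when the two dashboards start responding. This is only a sketch: `wait_for_url` is a hypothetical helper name (not part of Sidetrek), it assumes `curl` is installed, and it treats any HTTP response as "up".

```shell
#!/usr/bin/env bash
# wait_for_url URL [ATTEMPTS] [DELAY_SECONDS]
# Hypothetical helper: polls URL with curl until it responds or the attempts run out.
wait_for_url() {
  local url=$1
  local attempts=${2:-30}   # default: 30 tries
  local delay=${3:-10}      # default: 10 seconds between tries
  local i=1
  while [ "$i" -le "$attempts" ]; do
    # -s hides progress output, -o /dev/null discards the body,
    # --max-time caps how long a single try may take.
    if curl -s -o /dev/null --max-time 5 "$url"; then
      echo "up: $url"
      return 0
    fi
    # Only sleep if another attempt is coming.
    if [ "$i" -lt "$attempts" ]; then
      sleep "$delay"
    fi
    i=$((i + 1))
  done
  echo "not reachable: $url" >&2
  return 1
}
```

After sourcing the snippet, `wait_for_url http://localhost:3000` waits for Dagster and `wait_for_url http://localhost:8088` waits for Superset. The defaults (30 tries, 10 seconds apart) are arbitrary, so raise them on a slow connection.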

What’s Happening Underneath?

Under the hood, sidetrek start simply runs three separate commands:

1. dagster dev to start the Dagster server - this runs the user code (Dagster, Meltano, and dbt).
2. docker-compose up -d in the project root to run MinIO, Iceberg, and Trino in the background.
3. docker-compose up -d inside the superset directory to run Superset in the background.

Step 1 runs all your user code. Step 2 runs all the other core services for your data pipeline.

Step 3 runs Superset separately because it’s not part of the core data pipeline. This way you can easily swap it out for another data visualization solution.
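For reference, the three commands above can be strung together by hand roughly as follows. This is a sketch under stated assumptions, not Sidetrek's actual implementation: `run`, `start_stack`, and the `DRY_RUN` variable are hypothetical names, the script is assumed to be called from the project root, and Dagster is started last here only because `dagster dev` stays in the foreground.

```shell
#!/usr/bin/env bash
# Hypothetical manual equivalent of `sidetrek start` (not the real implementation).

# run CMD...: print the command when DRY_RUN=1, otherwise execute it.
run() {
  if [ "${DRY_RUN:-0}" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

start_stack() {
  # Core services from the project root, detached (-d) so they run in the background.
  run docker-compose up -d || return 1
  # Superset from its own directory; the subshell keeps the cd from leaking out.
  (run cd superset && run docker-compose up -d) || return 1
  # Dagster last: `dagster dev` keeps running and streams its logs to this terminal.
  run dagster dev
}
```

Calling `DRY_RUN=1 start_stack` only prints the commands, which is a safe way to see what would run. Calling `start_stack` on its own executes them.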

Common Issues

Port collisions

Because Sidetrek runs multiple services on multiple ports, you may get a bind: address already in use error if another process is already using one of those ports.

If you encounter this error, you’ll have to first free the occupied port to proceed. Run sudo lsof -i :<port-number> to retrieve the PID for the process that’s occupying the port and then run sudo kill <PID> to free up that port.
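To check ports before starting, the lsof lookup can be wrapped in a small helper. This is a minimal sketch: `pids_on_port` and `report_ports` are hypothetical names, and only ports 3000 and 8088 are taken from this page, so add any other ports named in the error message. Without sudo, lsof may not see processes owned by other users, so a port reported as free can still be taken.

```shell
#!/usr/bin/env bash
# pids_on_port PORT: print the PIDs of processes listening on a TCP port, one per line.
pids_on_port() {
  # -t prints bare PIDs, -iTCP:PORT filters by port, -sTCP:LISTEN keeps only listeners.
  lsof -t -iTCP:"$1" -sTCP:LISTEN 2>/dev/null
}

# report_ports PORT...: say whether each given port appears free or taken.
report_ports() {
  local port pids
  for port in "$@"; do
    # Join multiple PIDs onto one line for the message.
    pids=$(pids_on_port "$port" | tr '\n' ' ')
    if [ -n "$pids" ]; then
      echo "port $port is in use by PID(s): $pids"
    else
      echo "port $port looks free"
    fi
  done
}
```

For example, `report_ports 3000 8088` checks the Dagster and Superset ports. Any PID it reports can then be stopped with sudo kill <PID> as described above.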

Questions or Problems?

If you have any questions or problems, please don’t hesitate to reach out to us on Slack or on GitHub.

We’ll do our best to help!

Next Steps

Awesome job! You’ve successfully created and run your first data project.

You can now start using this data pipeline to work with your own data or explore the example project we’ve included.

If you’re interested in exploring the example project, let’s move on to the next step.