GitLab has announced that it is spinning out Meltano, an ELT (extract, load, transform) platform. Meltano now operates as an independent business with financial backing from several prominent VC and angel investors, including Alphabet’s GV.
Meltano is an open-source platform that relies on several other open-source tools, including Singer. With hundreds of pre-built connectors, Singer aims to be an “open-source standard” for writing data integration scripts. Soon, Meltano will also rely on Apache Superset for data visualization.
Meltano CEO Douwe Maan said, “A big challenge with [the old ETL way] is that if you need to change your business logic or transformations, you’ll have to re-extract all your data, which will slow down the time to value. With the advent of cheaper storage solutions and big data, the ELT pattern has become more common.”
Maan added, “Most of today’s solutions are pay-to-play, limiting the number of companies that have access to high-quality tools. Proprietary also means that you need to rely on vendors to add extraction and loading capabilities for all the sources of interest. There can be dozens of them. Being open-source means that a large community can better provide a long tail of integrations, as [a proprietary vendor] typically supports only about 150.”
Meltano debuted inside GitLab, the DevOps powerhouse, and went through various iterations before becoming an open-source platform for data integration and transformation.
A modern data stack typically includes a variety of tools, ranging from data ingestion to data warehousing, that allow businesses to capture raw data, move it between systems, transform it into a more usable form, or simply query it. The data can be transformed before it enters the data warehouse, a process called “extract, transform, load” (ETL). Most people now consider this approach “old school”: it is expensive and can make data transformation terribly slow.
The newer alternative, ELT, loads the raw data first and transforms it on demand directly in the warehouse. ELT is fast but requires more processing power, which cloud-based data warehouses such as Databricks, Snowflake, Google’s BigQuery, and Amazon’s Redshift provide.
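
To make the ELT pattern concrete, here is a minimal sketch in Python. It is not Meltano's or Singer's actual API. It assumes an in-memory SQLite database as a stand-in for a cloud warehouse, and the `RAW_ORDERS` records, the `raw_orders` table, and the function names are all hypothetical. The point it illustrates is the one Maan makes above: because raw data is loaded untransformed, a change in business logic is just a new query, with no re-extraction.

```python
import sqlite3

# Hypothetical raw records, standing in for what an extractor
# would pull from a source system.
RAW_ORDERS = [
    ("o-1", "alice", "2021-06-01", 1250),
    ("o-2", "bob", "2021-06-01", 4000),
    ("o-3", "alice", "2021-06-02", 775),
]


def load_raw(conn, rows):
    """Load step: store the records untransformed in a raw table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS raw_orders "
        "(order_id TEXT, customer TEXT, order_date TEXT, amount_cents INTEGER)"
    )
    conn.executemany("INSERT INTO raw_orders VALUES (?, ?, ?, ?)", rows)


def revenue_by_customer(conn):
    """Transform step: runs inside the warehouse, on demand, over raw data."""
    return conn.execute(
        "SELECT customer, SUM(amount_cents) / 100.0 AS revenue "
        "FROM raw_orders GROUP BY customer ORDER BY customer"
    ).fetchall()


def revenue_by_day(conn):
    """A later change in business logic: a new query, no re-extraction."""
    return conn.execute(
        "SELECT order_date, SUM(amount_cents) / 100.0 AS revenue "
        "FROM raw_orders GROUP BY order_date ORDER BY order_date"
    ).fetchall()


# SQLite in memory is only a stand-in for a real warehouse.
conn = sqlite3.connect(":memory:")
load_raw(conn, RAW_ORDERS)
by_customer = revenue_by_customer(conn)
by_day = revenue_by_day(conn)
```

In an ETL pipeline, by contrast, the aggregation would happen before loading, so switching from per-customer to per-day revenue would mean going back to the source system.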