Poorly Configured Apache Airflow Instances Leak Credentials for Popular Services

Cybersecurity researchers on Monday discovered misconfigurations in instances running older versions of Apache Airflow that belong to a number of high-profile companies across various sectors, resulting in the exposure of sensitive credentials for popular platforms and services such as Amazon Web Services (AWS), Binance, Google Cloud Platform (GCP), PayPal, Slack, and Stripe.

“These unsecured instances expose sensitive information of companies across the media, finance, manufacturing, information technology (IT), biotech, e-commerce, health, energy, cybersecurity, and transportation industries,” Intezer said in a report shared with The Hacker News.

Originally launched in June 2015, Apache Airflow is an open-source workflow management platform that enables programmatic scheduling and monitoring of workflows on AWS, GCP, Microsoft Azure, and other third-party services. It’s also one of the most popular task orchestration tools, followed by Luigi, Kubeflow, and MLflow.

It’s worth stressing that the findings from Intezer concern exposed Apache Airflow instances managed by individuals and organizations, which leaked credentials associated with different applications and services. They don’t indicate a weakness in the platforms themselves.

Some of the most common insecure coding practices uncovered by Intezer include hard-coded database passwords, both in the Python code that orchestrates workflows and in a feature called Variables; plaintext credentials in the “Extra” field of Connections; and cleartext keys in publicly accessible configuration files (airflow.cfg).
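To illustrate the first of these practices, the following is a minimal sketch of how hard-coded secrets in workflow code might be flagged before deployment. It is not Intezer's tooling: the function name `find_hardcoded_secrets` and the keyword list are assumptions made for this example, and real secret scanners use far richer rule sets.

```python
import re

# Hypothetical rule: a variable whose name suggests a secret is assigned
# a quoted string literal on the same line.
SECRET_ASSIGNMENT = re.compile(
    r"""
    \b(\w*(?:password|passwd|secret|token|api_key)\w*)  # suspicious variable name
    \s*=\s*                                             # assignment
    (["'])[^"']+\2                                      # non-empty quoted literal
    """,
    re.IGNORECASE | re.VERBOSE,
)


def find_hardcoded_secrets(source: str) -> list[tuple[int, str]]:
    """Return (line number, variable name) for each suspicious assignment."""
    findings = []
    for lineno, line in enumerate(source.splitlines(), start=1):
        match = SECRET_ASSIGNMENT.search(line)
        if match:
            findings.append((lineno, match.group(1)))
    return findings


if __name__ == "__main__":
    sample = (
        'db_password = "hunter2"\n'
        'password = os.environ["DB_PASSWORD"]\n'
        'name = "etl_job"\n'
    )
    for lineno, name in find_hardcoded_secrets(sample):
        print(f"line {lineno}: possible hard-coded secret in '{name}'")
```

Only the first line of the sample is flagged; the second reads the value from the environment rather than embedding it in the source.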

Chief among the concerns associated with misconfigured Airflow instances is the leakage of credentials, which makes them ideal candidates for exploitation by threat actors. Attackers can abuse the information to spread laterally and gain access to accounts and databases, leading to violations of data protection laws. The exposure also gives attackers insight into the organizations’ tools and packages, which could later be exploited to stage supply-chain attacks.

“If a large number of passwords are visible, a threat actor can also use this data to detect patterns and common words to infer other passwords,” Intezer researchers said. “These can be leveraged in dictionary or brute-force-style attacks against other platforms.”

Even more concerning is the possibility that malware can be launched on exposed production environments by using the Variables feature to modify container image variables so that they point to a different image laced with unauthorized code.
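One way to reason about this risk is that an image reference read from an editable variable is untrusted input. The sketch below shows a possible check a workflow could run before launching a container. It is an illustration only, not a mitigation described in the report: the registry prefix `registry.example.com/data-team/` and the function `is_trusted_image` are hypothetical.

```python
# Hypothetical allowlist of registry prefixes the organization controls.
TRUSTED_PREFIXES = ("registry.example.com/data-team/",)


def is_trusted_image(image: str, trusted_prefixes=TRUSTED_PREFIXES) -> bool:
    """Accept an image only if it comes from a trusted prefix and is pinned by digest."""
    from_trusted_registry = any(image.startswith(p) for p in trusted_prefixes)
    # A digest pin means the reference cannot silently resolve to new content.
    pinned_by_digest = "@sha256:" in image
    return from_trusted_registry and pinned_by_digest


if __name__ == "__main__":
    candidates = [
        "registry.example.com/data-team/etl@sha256:" + "a" * 64,
        "registry.example.com/data-team/etl:latest",
        "docker.io/attacker/etl:latest",
    ]
    for image in candidates:
        verdict = "allow" if is_trusted_image(image) else "reject"
        print(f"{verdict}: {image}")
```

Under these assumptions, a variable that had been changed to point at an outside registry, or at an unpinned tag, would be rejected before the task runs.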

Apache Airflow, for its part, remediated many security issues in version 2.0.0, which was released in December 2020, making it critical that users of the software update to the latest version and adopt secure coding practices to prevent passwords from being exposed.
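As a rough illustration of one such practice, the sketch below builds a database connection string from values supplied at runtime instead of embedding them in workflow code. The variable names (`DB_USER`, `DB_PASSWORD`, `DB_HOST`, `DB_NAME`) and the function `build_db_uri` are assumptions for this example. Airflow itself can also read connections from environment variables named `AIRFLOW_CONN_<CONN_ID>`, which follows the same idea.

```python
import os
from urllib.parse import quote


def build_db_uri(env=os.environ) -> str:
    """Assemble a PostgreSQL URI from environment values (hypothetical names)."""
    # A missing variable raises KeyError, so a misconfigured deployment
    # fails loudly instead of falling back to a default password.
    user = quote(env["DB_USER"], safe="")
    password = quote(env["DB_PASSWORD"], safe="")
    host = env.get("DB_HOST", "localhost")
    name = env["DB_NAME"]
    return f"postgresql://{user}:{password}@{host}/{name}"


if __name__ == "__main__":
    example_env = {
        "DB_USER": "etl",
        "DB_PASSWORD": "p@ss",
        "DB_HOST": "db.internal",
        "DB_NAME": "warehouse",
    }
    # The password is percent-encoded so special characters stay valid in a URI.
    print(build_db_uri(example_env))
```

The secret never appears in the repository or in the workflow definition; it lives only in the environment of the process that needs it.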
