

That is, in the most restrictive cases, unacceptable. Also, I have to admit that the documentation in this area is extremely poor.

The documentation and the GUI state that you need to prefix the Amazon S3 bucket name with airflow-, otherwise it won't work. Aaand it's gone… that's not true anymore, but the requirement still lingers in the default IAM policies and in the docs. Speaking of AWS IAM: default roles are generated with invalid IAM statements around the s3:ListAllMyBuckets permission.

Another limitation, which is strange and not acceptable in strict environments: at the moment, there is no way to attach a custom SSL certificate to the Apache Airflow cluster. You can assign a custom domain, but you will then run into a mismatch between the domain in the certificate and the one you assigned.

Did I say that the documentation sucks badly? It needs a lot of improvement. Having it on GitHub would already make it much better, but it's not there yet.
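On the IAM point: I don't spell out above what exactly is invalid, so treat the following as a rough sketch under one assumption. s3:ListAllMyBuckets does not support resource-level permissions, so a statement that scopes it to specific bucket ARNs will not behave the way it looks. The helper name, the example policy, and the bucket name airflow-example-bucket are all hypothetical and only there for illustration; this is not the policy the service actually generates.

```python
from typing import Any, Dict, List


def _as_list(value: Any) -> List[Any]:
    """IAM allows a single string or a list; normalize to a list."""
    if value is None:
        return []
    return value if isinstance(value, list) else [value]


def find_scoped_list_all_my_buckets(policy: Dict[str, Any]) -> List[Dict[str, Any]]:
    """Return statements that use s3:ListAllMyBuckets with a Resource other than "*"."""
    flagged = []
    for stmt in _as_list(policy.get("Statement")):
        actions = [str(a).lower() for a in _as_list(stmt.get("Action"))]
        resources = _as_list(stmt.get("Resource"))
        # Assumption: the action is only meaningful with Resource "*".
        if "s3:listallmybuckets" in actions and any(r != "*" for r in resources):
            flagged.append(stmt)
    return flagged


# Hypothetical policy document, written by hand for this example.
EXAMPLE_POLICY = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Deny",
            "Action": "s3:ListAllMyBuckets",
            "Resource": [
                "arn:aws:s3:::airflow-example-bucket",
                "arn:aws:s3:::airflow-example-bucket/*",
            ],
        },
        {
            "Effect": "Allow",
            "Action": ["s3:GetObject*", "s3:List*"],
            "Resource": [
                "arn:aws:s3:::airflow-example-bucket",
                "arn:aws:s3:::airflow-example-bucket/*",
            ],
        },
    ],
}

flagged = find_scoped_list_all_my_buckets(EXAMPLE_POLICY)
for stmt in flagged:
    print("Suspicious statement:", stmt["Effect"], stmt["Action"], stmt["Resource"])
```

You could run something like this against an exported role policy before attaching it, purely as a sanity check. It only covers this one action and says nothing about whether the rest of the policy is sound.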

Amazon managed airflow update
Well, I would like to start with something that is tough for me to understand: it's not Apache Airflow 2.0. I totally understand that this version was released only a few weeks before the announcement, but it means the service is starting out with a significant lag, and that will probably make updating environments later more of a drag.
Amazon managed airflow full
Recently, I had an opportunity to dive deeper into the newly released AWS service that allows us to provision and use a fully-managed version of Apache Airflow. This is one of the pre-re:Invent 2020 announcements.

Apache Airflow is a state-of-the-art workflow management platform for data analytics. By combining it with Kubernetes, many data teams have used it as a data infrastructure design pattern. Because of that, other cloud and SaaS providers already let us use it in a managed way, not to mention that Apache Airflow itself is very pesky to manage and operate reliably. In that sense, AWS did what they do best, very consistently: they monetized their operational knowledge by providing a fully-managed service.

The service's selling point is that you get the same Apache Airflow as the open source version. It means that you deal with a fully-managed service that supports well-known plugins and has full compatibility and integration with the AWS portfolio.

As a person who has worked with AWS Data Pipeline, AWS Glue Workflows, and AWS Step Functions, I am thrilled that we received an alternative that is fully compatible with the open source version, because that removes another item from the list of contraindications to diving deeper into the cloud.
