Around 2016, the term "serverless functions" started to take off in the tech industry. In short order, it was presented as the undeniable future of infrastructure. It's the ultimate solution to redundancy, geographic resilience, load balancing and autoscaling. Never again would we need to patch, tweak or monitor an application.
Hm. this is a very strange article for me to read because in my experience, only 1 or 2 things in the whole article have been true for our company (17k employee company with 300 people in the tech org).
We don’t use API Gateway. The best use for lambdas is as a direct call, using the ARN. You don’t need to worry about CORS, permissions, etc. You either have access to call the lambda or you don’t. You can directly control exactly who can call your service, and it’s all done through IAM.
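For anyone who hasn't called a lambda this way, here is a minimal Python sketch of a direct invoke by ARN. It assumes a boto3 Lambda client (for example `boto3.client("lambda")`) is passed in with whatever credentials the caller already has. The helper name `invoke_by_arn` and the payload shape are made up for illustration and are not our actual service.

```python
import json


def invoke_by_arn(lambda_client, function_arn, payload):
    """Synchronously invoke a lambda by ARN and return its decoded JSON result.

    `lambda_client` is assumed to be a boto3 Lambda client.
    """
    response = lambda_client.invoke(
        FunctionName=function_arn,         # a full ARN is accepted here
        InvocationType="RequestResponse",  # wait for the result
        Payload=json.dumps(payload).encode("utf-8"),
    )
    body = json.loads(response["Payload"].read())
    # A function that threw still comes back with StatusCode 200;
    # the failure is signalled by the FunctionError field instead.
    if response.get("FunctionError"):
        raise RuntimeError(f"lambda failed: {body.get('errorMessage', body)}")
    return body
```

There is no gateway, route, or CORS config anywhere in that path: the call either goes through or it is rejected.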
I think if you’re literally recreating your monolith in a lambda then you’re doing something fundamentally wrong. Our entire team only has a few lambdas (10-15) and they’re very easy to manage. But yes, testing locally is an issue. We’ve solved this with testcontainers, which would be the same solution if you were just deploying docker services to k8s or openshift or even directly to a VM. This is the first very large issue with lambdas that the article is correct about though.
I do not understand this. How are your resources changing like that? We’ve only had to touch the resources for our very large functions, and even then we’ve touched them only once or twice in 3 years. This is absolutely a non-issue. Set it to the lowest to start, then when it times out update it to the next level. It really isn’t difficult.
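As a rough sketch of the "start at the lowest, bump when it times out" approach, assuming the resource in question is memory: the ladder of sizes below is hypothetical (Lambda accepts other values too), and `lambda_client` is assumed to be a boto3 Lambda client. In practice you would more likely change the number in your CDK or CloudFormation definition than call the API by hand.

```python
# Hypothetical steps; pick whatever ladder suits your functions.
MEMORY_LADDER_MB = [128, 256, 512, 1024, 2048, 4096]


def next_memory_size(current_mb, ladder=MEMORY_LADDER_MB):
    """Return the next step above current_mb, or current_mb if already at the top."""
    for size in ladder:
        if size > current_mb:
            return size
    return current_mb


def bump_memory(lambda_client, function_name, current_mb):
    """Move a function one step up the ladder and return the new size."""
    new_mb = next_memory_size(current_mb)
    if new_mb != current_mb:
        lambda_client.update_function_configuration(
            FunctionName=function_name,
            MemorySize=new_mb,
        )
    return new_mb
```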
please. why in the world would you think a ‘medium sized application’ would have 100+ functions? That’s absolutely insane. there’s no way to manage that. That’s not using serverless properly. We have a medium sized ‘application’ (250k+ lines of kotlin, with tens of millions of lines of generated Java from Drools Rules) and it’s 10-15 lambdas. You should not have hundreds of lambdas for a medium sized app. That’s just idiotic, I’m sorry, but that was never what serverless was meant for.
I can agree with some of this partially, but I’m not sure why the author thinks that getting rid of uptime checks is a problem. I’ve never once had to worry about whether our lambdas are up. There’s no uptime! It either works every time or it doesn’t work at all. It’s pretty awesome actually. Of course you do need to test it when you deploy, but that’s a simple http call and boom you know whether the deploy worked or not.
Traces are also a great way of tracking where you have slowness in your system. I’m guessing a lot of this depends on which ‘ecosystem’ you choose, but with Quarkus and X-Ray, tracing is dead simple. Add a dependency, you’ve got tracing. Done.
Now, the big problem here is the error passing, which the author talks about later.
well sure, but if you have 100+ functions then you’re multiplying your instantiation costs by 100+. This is not a non-issue for fewer lambdas, but it’s much less of a problem.
correct. but it’s a server you don’t have to manage. I don’t know why the author calls this out this way, you have to manage autoscaling servers at a much finer grained level. Provisioned Concurrency is literally “how many functions do you want to be running at any point in time by default”. There’s not much else to it, besides the next point.
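To show how little there is to it, here is a minimal sketch of setting Provisioned Concurrency through the API. `lambda_client` is assumed to be a boto3 Lambda client, and the helper name, function name, and alias are placeholders; most teams would set this in their infrastructure code instead.

```python
def set_warm_instances(lambda_client, function_name, alias, count):
    """Ask Lambda to keep `count` instances of one alias initialized."""
    return lambda_client.put_provisioned_concurrency_config(
        FunctionName=function_name,
        Qualifier=alias,  # a published version or alias, not $LATEST
        ProvisionedConcurrentExecutions=count,
    )
```

A call such as `set_warm_instances(client, "rules-engine", "live", 5)` is the whole configuration: one number per function alias.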
this is a major issue. no other thing to say about it. I do not understand why this is the case, but yes it’s a huge problem, and makes scaling across an org very difficult. The one solution to this is to split your org into separate aws accounts (not sure how gcp manages it, but we do use gcp too), which helps with that, but it’s still a weird restriction.
Hm. maybe this depends on using RDS, because we’ve never seen this with Dynamo.
This has not been our experience. Lambdas have been simple set-it-and-forget-it, allowing our team (and company) to focus on the business rather than infra. We only spend time configuring lambdas when we are creating new ones, which isn’t too often. It’s been 3 years since we started using lambdas, and I would say we create maybe 3-5 a year and they’re all for new features, not new individual functions. The company’s business has grown more than 3x in that time and we have daily spikes and it all just works.
this is by far the most annoying thing about lambdas. they are http under the covers, but you can’t modify any http headers, response codes, etc. It’s either ‘throw an exception’ or ‘200’. nothing in between. very annoying.
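One common workaround, sketched here purely as an illustration and not as something the platform provides, is to return expected failures as data in the payload so the caller gets something "in between". The wrapper name, the status strings, and the example handler are all hypothetical.

```python
def with_envelope(handler):
    """Wrap a handler so expected failures come back as data instead of exceptions."""

    def wrapped(event, context):
        try:
            return {"status": "ok", "result": handler(event, context)}
        except ValueError as exc:
            # Treated here as a caller mistake, roughly what a 4xx would mean over HTTP.
            return {"status": "invalid_request", "message": str(exc)}
        # Anything else still propagates and shows up as a failed invocation.

    return wrapped


@with_envelope
def quote_handler(event, context):
    if "sku" not in event:
        raise ValueError("sku is required")
    return {"sku": event["sku"], "price": 10}
```

The cost is that every caller has to check the `status` field itself, which is exactly the kind of plumbing HTTP status codes would normally give you for free.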
continued
we have never encountered this. this is probably exacerbated by the fact that the author thinks having 100+ lambdas for a medium sized app is normal. you focus even more on the startup time, rather than solving business problems.
lambdas are saving us tens of thousands of dollars a month because we don’t need to worry about massive monoliths and the required ec2 autoscaling instances needed, nor the insane costs of RDS.
why do you need this? That’s not how most testing works. you mock what you need. Unless you’re using a monolith then this applies to any architecture.
Only if by ‘traditional’ you mean monoliths. The same is true of any sort of microservices, or even a slightly macroservices architecture.
but why? this isn’t explained. We haven’t seen this. Maybe it’s how we use lambdas, but we use versioned lambdas and we deploy and immediately forget about it. There’s nothing to maintain about old versions, rollbacks are automatic.
why are you ‘pushing out a ton of new lambdas’? The whole point is for things to be self contained. If you are needing to touch multiple things often then your lambdas should be a single thing, not multiple. This comes back to the ‘100+’ lambdas thing. That’s just bad design. Don’t blame lambdas for this.
We are able to build GraalVM Kotlin lambdas in less time than that, along with the deploy. The slowest part is literally the CDK synthesis. If we were using CF yaml then it would be half the time.
This is going to completely depend on your team, your languages, and the frameworks you’re using. For us, it’s dead simple to keep up to date. Snyk helps us, it’s one click deploy for each lambda, and we can send to prod immediately due to having a very mature CI/CD pipeline. We are getting even better at this as we’ll be switching to gradle’s version catalogs which means that all of the applications can use the exact same version catalog and it will then require a single change whenever we need to update stuff, instead of hundreds.
so now you have to maintain your linux security, your autoscaling on linux, your deployment pipelines for linux, your nginx configs for linux, or if you’re using k8s you have to learn two stacks, k8s and AWS. If you’re adding loadbalancers then you should be deploying those with CDK anyway so now you’re both using cdk and k8s along with maintaining security on your ec2 instances.
O___o we literally have an entire infra team that is unable to manage Jenkins to the level that devs need due to how difficult it is to maintain a Jenkins build pipeline. Not only that, but now you’re dependent on maintaining security for Jenkins which is, and has always been, a nightmare. Jenkins pipelines aren’t testable locally (with GitHub Actions you can use Act along with something like mock-github and act-js to even test your pipelines as part of CI/CD!). We’re currently switching the entire org to GitHub Actions due to how terrible Jenkins is. And then you’re writing more pipelines to do monitoring! The author claims that serverless has more monitoring, but then goes on to say that you can ‘simply set up’ all this other stuff which is wayyyy harder to maintain in the long run.
100+ lambdas once again.
I don’t know what the author is doing, but it’s so dead simple to run lambdas locally and test locally that I really really don’t understand this. There’s only a single entrypoint. You know what the request was going in and out. It takes me way less time to debug something in a lambda than it ever did in a monolith (I’ve worked in a lot of monoliths and we still maintain a monolith on my team). If you have 100+ lambdas then maybe you should start blaming your architecture, rather than lambdas. It would be the exact same if you had 100+ microservices…a nightmare.
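To make the "single entrypoint" point concrete, here is a minimal sketch in Python (used for brevity; our own lambdas are Kotlin). The handler and the event shape are invented for illustration. Running it locally is just calling a function with a parsed event.

```python
import json


def handler(event, context):
    """Single entrypoint: one event in, one JSON-serializable result out."""
    items = event.get("items", [])
    total = sum(item["qty"] * item["unit_price"] for item in items)
    return {"item_count": len(items), "total": total}


def run_locally(event_json):
    # Locally, the "invocation" is just a function call with a parsed event.
    # This handler never reads the context argument, so None is enough here.
    return handler(json.loads(event_json), None)
```

Saving a failing request's event as JSON and feeding it to `run_locally` reproduces the problem without deploying anything.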
I am very sorry, but I honestly read through the whole article and agreed with a lot of it, and then when I went to write this up I just got angrier and angrier because it’s very very clear that the author has a terrible architecture and is blaming it on lambdas. Lambdas don’t work for everything. In general don’t use them for web servers! But for small self-contained applications, or an architecture where one side might need to scale differently than the others, or step functions where you’re essentially implementing a state-flow diagram, the list goes on and on… they’re a fantastic solution.
Man, I have to agree. Your write-up reflects my experience with Azure Functions in a mid-to-large sized application way more than the post. Fantastic.
I’d even go further with Azure Functions and say that running them locally is really simple. Of all the issues I’ve had with them, running them locally was never an issue.