Cloud Academy’s CEO Stefano Bellasio sat down with Serverless Advocate Lee Gilmore to chat about the current and future state of the technology. Lee shares a ton of technical insights and passion for his work in this interview.
Lee, thank you for your time. Let’s start with a quick intro about you and why you have been spending so much time in the serverless world.
It’s great to chat with you! I’m Lee, a Global Serverless Architect working for City Electrical Factors in the UK and City Electric Supply in the US, supporting the business in its global adoption of Serverless. I’ve also recently been a technical advisor for Sedai, based in California, who are doing amazing things in the AI and autonomous world, and I’m an AWS Community Builder, active blogger, mentor and speaker on all things Serverless!
I first got into Serverless in 2014 when Lambda and FaaS were a new ‘thing’, and quickly started to use it in production solutions in enterprise organisations, starting with background processing tasks and expanding usage in line with new AWS Serverless services as they came out (Step Functions, AppSync, EventBridge etc). Jump forward to 2022 and Serverless, in my opinion, is both the present and the future of enterprise architectures.
We all remember AWS Lambda being the pioneer of this space but that was 2014. What happened in the last few years and what are the current trends for serverless?
The last few years have seen major focus and investment from AWS in the Serverless space, with many traditionally ‘serverful’ services like RDS Aurora, Redshift and MSK now offering Serverless versions! I only see this continuing over the coming years, as teams increase their agility and reduce their costs through a move to services requiring no maintenance or operation. I also see more teams integrating services directly, like Lego building blocks, reducing the need for Lambda ‘glue code’; many in the industry believe this approach to be the future of low-code and no-code solutions.
We do, however, have a way to go on the database front. In my opinion the only viable options for production workloads are DynamoDB and RDS/RDS Proxy, due to scaling and connection management concerns with the alternatives. I would love to see a Serverless version of DocumentDB with the connection management taken care of for us, and Aurora Serverless v2 is still too new to go all in on.
What’s your experience with Serverless in large organizations? Is it a normal component and framework inside their orgs, or is it still “new” for most of them?
I have been lucky to work for technically minded organisations like City (CEF/CES), AO.com and Sage PLC, all of which had an event-driven and Serverless-first mindset from the top down.
A great example of scale would be at Sage, where I architected the Sage Online Services (Online Payslips, Online Timesheets, Online Documents etc), which had 3.5m+ employees using the system and over 150K companies registered, each company typically with many users for managing the employees. The domain services which made up the overall solution were fully Serverless, and it was estimated to have saved the company around £80K per annum on AWS operational costs alone, whilst increasing agility to allow for deployments to other countries like France and Canada through internationalisation, which was baked into the design from the start.
What do you think of some of the newer features of something like function URLs? What would be its main advantages and challenges?
This is a great question. I can see this feature being used in a limited capacity in enterprise organisations, as we already have API Gateway and HTTP APIs, and they are simple to set up through IaC with no real overheads, as well as being fully featured.
An example where I might use this feature would be a very simple, isolated webhook, but the limitations on request and response validation, caching and custom domain names (which you can work around using CloudFront distributions), together with the limited authentication options, mean its use case is niche for me.
New patterns and reasons to use this feature will emerge over time though, as the Serverless community are great at finding new use cases and patterns for new features!
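As a rough illustration of the isolated webhook case described above, here is a minimal sketch of a Python handler sitting behind a function URL. Because function URLs do not validate requests for you, the handler checks the HTTP method, an HMAC signature and the body shape itself. The header name `x-webhook-signature`, the environment variable `WEBHOOK_SECRET` and the `eventType` field are hypothetical names chosen for the example; they do not come from the interview or from any particular webhook provider.

```python
import base64
import hashlib
import hmac
import json
import os

# Hypothetical names, used only for this illustration.
SIGNATURE_HEADER = "x-webhook-signature"
SECRET_ENV_VAR = "WEBHOOK_SECRET"


def _response(status, payload):
    """Build a response in the shape Lambda expects for HTTP-style invocations."""
    return {
        "statusCode": status,
        "headers": {"content-type": "application/json"},
        "body": json.dumps(payload),
    }


def handler(event, context):
    # Function URL events carry the HTTP method under requestContext.http.
    method = event.get("requestContext", {}).get("http", {}).get("method")
    if method != "POST":
        return _response(405, {"error": "method not allowed"})

    raw = event.get("body") or ""
    body = base64.b64decode(raw) if event.get("isBase64Encoded") else raw.encode("utf-8")

    # Manual authentication: compare an HMAC-SHA256 of the body with the supplied header.
    secret = os.environ.get(SECRET_ENV_VAR, "").encode("utf-8")
    expected = hmac.new(secret, body, hashlib.sha256).hexdigest()
    supplied = event.get("headers", {}).get(SIGNATURE_HEADER, "")
    if not secret or not hmac.compare_digest(expected.encode(), supplied.encode()):
        return _response(401, {"error": "invalid signature"})

    # Manual request validation: the body must be a JSON object with an eventType.
    try:
        payload = json.loads(body)
    except ValueError:
        return _response(400, {"error": "body must be JSON"})
    if not isinstance(payload, dict) or "eventType" not in payload:
        return _response(400, {"error": "missing eventType"})

    return _response(200, {"received": payload["eventType"]})
```

Everything API Gateway would otherwise handle through configuration (validation, authorisers, throttling) has to live in the function code here, which is part of why the use case stays narrow.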
AWS announced that Lambda will now support up to 10 GB Ephemeral Storage. What do you think the effect of this will be for users?
This is such a great feature, and opens up so many new use cases! In the past, the teams I worked with typically used the available 512 MB of storage for limited use cases such as storing small dynamic config files and email templates as HTML with placeholders. This allowed us to cache a small number of email templates so we didn’t need to keep reading from S3 on every invocation.
Now we can support use cases like dynamically pulling down a large number of sizeable images and large font files, which are then cached for generating different-sized marketing ads on the fly in different languages, without needing to read from S3 or EFS every time the Lambda is invoked. This is a use case I had a year ago which would be very simple now, yet back then I had to work around the limitation with a complex solution for speed and cost.
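The caching idea described above can be sketched roughly as follows. This is an assumption-laden illustration, not the solution Lee built: the `cached_asset` helper, the `asset-cache` directory and the injectable `fetch` callable are all hypothetical. In a real function, `fetch` would typically wrap an S3 read such as `boto3.client("s3").get_object(Bucket=..., Key=...)["Body"].read()`. Files written under `/tmp` survive only for the lifetime of a warm execution environment, so the cache is best-effort.

```python
import os
import tempfile
from typing import Callable

# Lambda exposes its ephemeral storage at /tmp; fall back to the OS temp dir elsewhere.
CACHE_DIR = os.path.join(
    "/tmp" if os.path.isdir("/tmp") else tempfile.gettempdir(), "asset-cache"
)


def cached_asset(key: str, fetch: Callable[[str], bytes], cache_dir: str = CACHE_DIR) -> bytes:
    """Return the asset for `key`, calling `fetch` only when it is not cached on disk."""
    path = os.path.join(cache_dir, key.replace("/", "_"))
    if os.path.exists(path):
        # Warm invocation: serve from ephemeral storage, no remote read.
        with open(path, "rb") as f:
            return f.read()

    # Cold path: fetch once, then write atomically so a partial file is never served.
    data = fetch(key)
    os.makedirs(cache_dir, exist_ok=True)
    partial = path + ".part"
    with open(partial, "wb") as f:
        f.write(data)
    os.replace(partial, path)
    return data
```

With the larger ephemeral storage limit, the same pattern stretches from a handful of HTML templates to large image and font sets, provided the function's configured storage size is raised to match.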
I was reading about AWS Lambda Power Tuning: at what stage of development/architecture would using this become more relevant?
So for me this should ideally be done autonomously over time through AIOps based on machine learning algorithms. This is where companies such as Sedai are excelling.
This to me is not something we can do once within the SDLC; this is something which is fluid over time based on changes in environment and user behaviour, and code and dependency updates. Ideally this would run in your CI/CD pipelines for constant validation.
That being said, historically I think this is something teams unfortunately forget about; they generally just over-provision memory, and then never come back to it unless they hit issues.
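To make the memory trade-off concrete: Lambda compute is billed in GB-seconds, so more memory costs more per millisecond but often shortens the run. The sketch below compares measured configurations and picks the cheapest, optionally within a latency budget. It is a simplified illustration of the kind of comparison a tool like AWS Lambda Power Tuning enables, not its implementation. The price constant is an assumed illustrative rate (check current pricing for your region and architecture), and the durations in the usage example are made-up numbers.

```python
# Assumed illustrative rate in USD per GB-second; real pricing varies by region and architecture.
PRICE_PER_GB_SECOND = 0.0000166667


def cost_per_invocation(memory_mb, avg_duration_ms, price=PRICE_PER_GB_SECOND):
    """Compute cost of one invocation (request charges are excluded)."""
    return (memory_mb / 1024) * (avg_duration_ms / 1000) * price


def cheapest_config(measurements, max_duration_ms=None):
    """Pick the memory size with the lowest cost per invocation.

    `measurements` maps memory size in MB to average duration in ms.
    If `max_duration_ms` is given, slower configurations are ignored.
    """
    candidates = {
        memory: duration
        for memory, duration in measurements.items()
        if max_duration_ms is None or duration <= max_duration_ms
    }
    if not candidates:
        raise ValueError("no configuration meets the latency budget")
    return min(candidates, key=lambda m: cost_per_invocation(m, candidates[m]))


# Hypothetical measurements: memory (MB) -> average duration (ms).
measured = {128: 2400, 512: 500, 1024: 260, 2048: 250}
print(cheapest_config(measured))                       # cheapest overall
print(cheapest_config(measured, max_duration_ms=300))  # cheapest within 300 ms
```

Because the measurements drift as code, dependencies and traffic change, the comparison is only as good as its most recent data, which is the argument for re-running it continuously rather than once.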
Are there issues specific to security that enterprises need to be aware of?
I am a big advocate of Serverless Threat Modelling, as serverless as a paradigm typically means a greater attack surface for bad actors due to the use of many more services compared to serverful solutions, each service with its own specific configurations and limits, extrapolated out across the full enterprise! Serverless Threat Modelling allows teams to look at the proposed architecture as a group to weed out and mitigate some of these potential threats as early as possible in the SDLC. I have written a detailed article on Medium explaining how to use this approach with your teams, and what the tangible benefits are.
Examples of common serverless threats could be denial-of-wallet attacks due to poor or non-existent rate limiting and authentication, lack of controls around malicious payloads or file uploads containing malware, or information disclosure due to privileges which are too open from a security perspective (for example publicly accessible S3 buckets or OpenSearch clusters).
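One small, concrete control for the malicious payload and upload threats listed above is a first-pass guard that rejects uploads before they are stored or processed. The sketch below is only an illustration under stated assumptions: the 5 MB limit, the allow-list and the `validate_upload` helper are hypothetical, and checking leading "magic bytes" is not a substitute for malware scanning.

```python
# Hypothetical limit and allow-list, chosen only for this illustration.
MAX_UPLOAD_BYTES = 5 * 1024 * 1024

# Declared content type -> leading bytes the file must start with.
ALLOWED_SIGNATURES = {
    "image/png": b"\x89PNG\r\n\x1a\n",
    "image/jpeg": b"\xff\xd8\xff",
    "application/pdf": b"%PDF-",
}


def validate_upload(content_type, data, max_bytes=MAX_UPLOAD_BYTES):
    """Return (accepted, reason) for an uploaded payload."""
    if len(data) == 0:
        return False, "empty upload"
    if len(data) > max_bytes:
        # Oversized payloads are rejected before any further processing cost is incurred.
        return False, "upload too large"
    signature = ALLOWED_SIGNATURES.get(content_type)
    if signature is None:
        return False, "content type not allowed"
    if not data.startswith(signature):
        # The declared type does not match the actual file contents.
        return False, "content does not match declared type"
    return True, "ok"
```

A guard like this reduces obvious abuse cheaply; rate limiting, authentication and least-privilege permissions on the storage itself still need to be addressed separately.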
What are some disadvantages to serverless architectures for companies already using ‘serverful’ – and what would be some good ways to overcome these challenges?
Serverless is complex when done at an enterprise level, and one of the biggest issues I typically see is around education and lack of experience across an organisation. There is a famous tweet from Elon Musk stating, “Prototypes are easy, production is hard”, and this is no different in the Serverless World!
I typically see teams starting out falling into what I call the ‘Serverless Dunning-Kruger’ effect, where they quickly create a basic serverless app and push it to the cloud, falling into the trap of thinking it’s that easy. In my experience this would be prototype quality at this stage. As teams get more experienced they soon realise that there are a lot more areas to consider when productionising a serverless solution, such as choosing from many services which have feature parity, huge amounts of service configuration to reason about, disaster recovery needs, compliance, caching, authorisation and authentication, load testing for downstream systems and so on. I could go on! This can lead to cognitive load and frustration at times, and it takes time for teams to become fully comfortable with using Serverless at an enterprise scale, compared to perhaps more legacy container-based and n-tier style applications.
There are undoubtedly huge advantages to serverless, but an organisation that is currently excelling with serverful solutions, and that has experts and experience in that field, should see a move to serverless as a marathon, not a sprint, in my opinion.
What have you found to be the state of serverless experts in the field, or at least practitioners that are experienced? Is there a knowledge gap?
I think there is a large gap currently when it comes to expert serverless experience at an enterprise level, as this requires more thought leadership around design patterns, reference architectures and governance at large scale. An example of somebody who is excelling in this field would be Sheen Brisals in my humble opinion. When I start working with an organisation at a global level I typically look at three key areas:
Firstly, I use the ‘Serverless Architecture Layers’ pattern to help distributed teams focus on how to design their solutions so that they can easily be consumed and integrated with each other, business logic is reusable, components are shareable, and teams don’t need to reinvent the wheel when it comes to cross-cutting concerns (such as logging, tracing, authentication, authorisation and event-driven communication, to name a few). Without this governance, organisations can get into a mess when they have many autonomous teams working in their own silos, as described by Conway’s Law.
Secondly, I look to implement a Thoughtworks Tech Radar to get some governance around how teams work day to day with serverless and the technologies and frameworks they use, with the main aim of allowing us to focus on reference architectures, reusable components and cross-cutting concerns, as well as ways of working. Again, without this governance, and with teams working in silos, reuse of both knowledge and solutions is near impossible unless it is tackled early.
Lastly, I help teams work through the ‘Serverless Dunning-Kruger’ effect we discussed earlier using a method I call TACTICAL DD(R), which prompts teams to think about non-functional requirements at the definition of ready and definition of done stages; again, it is a light framework covering the main areas which are typically missed when productionising serverless solutions. This alongside the AWS Well-Architected Framework is very beneficial in my experience.
Using these three strategic approaches helps ensure a more productive transition to Serverless across an organisation.
Last question, what’s a dream project you’d love to work on and solve by using serverless architecture?
It’s the project I am working on currently at City, which is taking our more monolithic solutions and transitioning them to Serverless over time globally! When I was approached by our Global Director of Software Engineering, Matthew Carr, about the role of Global Serverless Architect, I was compelled to apply due to his exciting serverless architectural vision for City, and the challenges that come with this for a hugely successful global organisation. The great thing about City is the people, with the development teams I work with having a real thirst and drive for serverless knowledge and experience, whilst pushing the boundaries of serverless innovation internally through well thought out POCs.
References
Serverless Architecture Layers: https://levelup.gitconnected.com/serverless-architecture-layers-a9dc50e9b342
Serverless Threat Modelling: https://leejamesgilmore.medium.com/serverless-threat-modelling-df8e4028ef6d
Serverless TACTICAL DD(R): https://levelup.gitconnected.com/serverless-tactical-dd-r-23d18d529fa1