Mistakes were Made, Lessons were Learned
Mark Nunnikhoven, AWS Community Hero and Trend Micro Vice President of Cloud Research, explores how to leverage the AWS Well-Architected Framework and its six core design principles to build in the AWS cloud with confidence.
At a glance:
Mark Nunnikhoven, AWS Community Hero and Trend Micro Vice President of Cloud Research, explores how you can leverage the AWS Well-Architected Framework to build in the cloud with confidence.
The 5 Pillars of the AWS Well-Architected Framework
Below are the 5 pillars and their focus areas. Consider these pillars a package deal—to properly evaluate your architecture, you must see how well it aligns with each pillar.
- Operational Excellence: Supporting your business objectives
- Cost Optimisation: Avoiding unnecessary costs
- Reliability: Ensuring your workload performs as intended
- Performance Efficiency: Using resources efficiently
- Security: Protecting information and systems
The 6 Core Principles of the AWS Well-Architected Framework
- Use on-demand resources whenever possible: Idle resources waste time and money
- Automate absolutely everything: Save money and time by quickly replicating workloads
- Test regularly and at scale: Regularly simulate a live environment for all consumption sizes
- Fact-based feedback loops: Collect data to inform architecture choices
- Evolve architectures: Take advantage of innovations that allow you to best meet business requirements
- Practise, practise, practise: Make sure you’re prepared for any possible hiccups on launch day by practising for them beforehand.
Mark Nunnikhoven [0:02]
When you're building in the AWS cloud, you're going to make mistakes. And you know what? That's OK. That's part of the process. Really, there are two kinds of mistakes: the ones you had no clue were coming, that came out of left field and knocked you senseless, and the ones that, in hindsight, you probably could have seen coming and done something about. Both are going to happen. Both are OK. And why is that? Well, because the AWS cloud is full of possibilities. We get a mountain of new services, features and functionality right here at AWS re:Invent, and over the last few years AWS has absolutely spoilt us with new functionality. There's just no way you can possibly keep up. Even all these great icons you're seeing right here are just a tiny fraction of what's available today in the AWS cloud. How could you ever know the ins and outs of all of these? You're not going to be able to. So you're going to fall into a natural IT pattern that's developed over the years, over the decades, one that's even more prominent in large enterprises: going looking for a solution, something that's going to solve these problems for you. And those are best practises.
Mark Nunnikhoven [1:17]
I've got to tell you, I've got to be honest: I don't like best practises. I don't like the term "best practises", because it implies there's only one way to do this. And with a multitude of possibilities in the AWS cloud, that's simply not true. If you think there's one way to do it, you're not asking the rest of the question. Any answer is based on a certain set of constraints you have right now, on the situation you're trying to deal with. There are a whole bunch of variables here that you can't just boil down to one thing to do for each solution. So let's give the BIG thumbs down to best practises and look for something better. Of course, that's a huge setup: there's absolutely something better out there, and it's called the AWS Well-Architected Framework. Now, you may be thinking, wow, this guy's selling it like it's unicorns and rainbows. It's not, I'll be perfectly honest with you. It's a series of white papers presented as PDF documents, at least that's the start of it. There's an overview paper, then some sub-papers, but it's what's contained within that's really amazing. This was built off hard-fought knowledge from thousands of AWS customers and thousands of AWS engineers and teams from around the world, who have poured their knowledge into this to help you understand how to build better, how to build well. There is actually a tooling component to this: if you go into your Management Console, you're going to see something called the AWS Well-Architected Tool. Here we are, in the middle of a review, drilling in on the security section, question four. And you'll see here something that looks an awful lot like a best practise: configure service and application logging. You can even see right in the example: ensure that AWS CloudTrail, Amazon CloudWatch Logs, Amazon GuardDuty and AWS Security Hub are enabled for all accounts in your organisation.
That's great advice. You should absolutely follow it, and it would be considered a best practise. And, you know, I don't like them, so that was a big setup. Because while you should turn this on, you need to go that extra step. What are you going to do when CloudWatch Logs raises an alert? When Security Hub presents you with a finding? How is your team going to react to that data? I'll tell you from experience, and I'm a security guy at heart, so I'll use a security example. If you go before the board and say, "We had no idea this attack had happened. It evaded all of our metrics and monitoring, and we're responding to it appropriately now that we know about it," that's one thing. To have that same conversation with the board and say, "Hey, we knew about it, because our systems alerted us, and we had no plan to take advantage of that alert," which conversation do you think is going to go better? So setting yourself up is absolutely a fantastic practise to follow: not a best practise, but a good practise. What you need to do is then figure out how you're going to use this information. That's the difference. It may seem like I'm splitting hairs here, but it's super, super important to understand. So the framework itself is divided across five pillars: operational excellence, cost optimisation, reliability, performance efficiency and security. Now, these five pillars are just an easy way to understand the questions you're discussing amongst your team. But the important thing is to understand that they weave together; you can't just look at one, you need to look at all of them in context. That may not line up with how your teams are set up today, especially on the security side of things. But for me, I'm a firm believer that security is just one part of building well, and you need to integrate it tightly into everything that you're doing.
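The "alert with a plan" point can be made concrete with a minimal sketch: pair every alert source you enable with a response playbook, so a finding never lands without an owner. The finding types and actions below are hypothetical, not real GuardDuty or Security Hub identifiers.

```python
# Map each kind of finding to a pre-agreed response plan (all names illustrative).
PLAYBOOKS = {
    "guardduty:UnauthorizedAccess": "page the on-call engineer, rotate credentials",
    "securityhub:ExposedBucket": "block public access, notify the data owner",
}

def handle_finding(finding_type):
    """Return the response plan for a finding, or flag it as unplanned."""
    plan = PLAYBOOKS.get(finding_type)
    if plan is None:
        # An alert with no plan is exactly the board conversation you don't want.
        return "unhandled: triage manually and add a playbook"
    return plan
```

The point isn't the dictionary; it's that enabling the alert sources is only half the practise, and the mapping forces you to write down the other half before launch day.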
Mark Nunnikhoven [4:43]
More important than the five pillars, and I know that's weird, because that's how the framework is presented, are the six core principles advocated by the framework. The first is: use on-demand resources wherever possible. Essentially, think of this one as "idle resources are your enemy". You don't want to pay for them, you don't want to have to maintain them; they're just a waste of everything. So use on-demand wherever possible. You also want to automate absolutely everything. If you're typing something into a command line, it had better be either to test it out and then save it into a script you can use for automation, or to type it directly into a script. You shouldn't be doing things by hand any more; long gone are those days. You also need to be testing regularly, and at scale; we're going to dive into this principle with our first example in a couple of minutes. And you need to be using fact-based feedback loops. If you've ever dealt directly or worked closely with a team at AWS, they have internalised this principle in an amazing way. Essentially, everything they do is based on a simple feedback loop: you have an idea, you test it out with some metrics around it so that you can gauge how well it went, you learn from that, you iterate on the idea, and you keep on going. A fact-based feedback loop is a critical concept that helps you drive the evolution of your architecture. Even if you are comfortable standing still, the AWS cloud is not. You're always getting new features and new functionality that can improve your builds, that can take something that used to be a whole chunk of code you wrote by hand and turn it into a checkbox. I guarantee there's at least one feature here during re:Invent, over the three weeks, that fits that bill for you.
This is why you need to keep evolving your architecture. And my personal favourite: practise, practise, practise. You have to work through situations in advance: if you have a failure, or an outage, or a bad code commit that gets deployed, how are you going to handle it? You want to practise that before you get in front of your customers, or before it gets to production. Think of it this way: if you're a professional football team, there's no way you're just going to go on the field to face your rival and hope that you win the match. That is not going to happen. You're going to run through practise, you're going to run through different scenarios, so that when it comes to the game, you're prepared. We need to do the same thing in IT. It's just, you know, a lot less sweaty, much easier and a little more organised. But we still have the same principle: practise, practise, practise.
Mark Nunnikhoven [7:05]
Now, before we go any further, let's just get this out of the way: no judgements. We're going to walk through a couple of stories from teams that I've had discussions with, that have shared their stories with me. I'm not going to use specifics about who they are, but we are going to walk through these examples, and we need to practise that core engineering principle of the blameless post mortem. Just remember: they made decisions that made sense at the time, given the constraints and knowledge they had. We're here to learn from it, they learnt from it, so let's share that knowledge. The first example we're going to use is a team that created a video streaming platform. This was at the start of lockdowns around the world, and the available commercial options just didn't fit the bill for them. They wanted something very niche: they could sell access to the streams to their targeted audience and then deliver customised content. Really straightforward, a great business model, and they knew their niche was underserved. They were a small team that wanted to deliver a quality solution. So the first thing they came up with, or what they were aiming for as a goal, is this very standard thing. We've seen it a million times; you're probably looking at it right now. We have a live video window streaming the content their customers have paid for. Then on the side there's a live chat where you can connect with other people who are watching the stream at the same time, as well as with the team delivering the content through the stream. Very simple, very straightforward. You could slap a number of different logos up on here; it's a very common interface. And this is what they were shooting for. Great, this is awesome, this is going to meet the customer needs, and it's going to be highly usable for their customers given that familiarity.
And this is the architecture they started off with. Again, it makes perfect sense: they've got an Application Load Balancer out front, and a pool of EC2 instances, running from one image in an auto scaling group, that runs their custom application. That custom application pulls data from Amazon S3, as well as from an Amazon DocumentDB instance. Behind the scenes, not pictured here, is the actual source of the streaming; it isn't really relevant to our lesson, but just keep in mind there is a source streaming to an EC2 instance, which has to spool that video out to the other instances in the scaling group. And of course, all of this is monitored by Amazon CloudWatch. Again, if you were setting this up with a small team, this makes sense. But you may have already spotted one potential issue that may come back to bite the team.
Mark Nunnikhoven [9:24]
You know, I say "may" like it's not a foregone conclusion, but you know where we're going with this. The video and the chat functionality are running on the same set of instances, right? They've tied those resources together. Again, that makes perfect sense: they started with a proof of concept, and as we all know, proofs of concept don't stay in the lab. It got rolled out to production with one easy-to-operate instance type. So here's what happened in their testing. On the left, on the y-axis, we have response time in seconds, end to end, what the user experiences. And on the x-axis we've got the number of concurrent users. The team ran some automated tests between 10 and 50 users; again, this data is normalised to make it easier to get the point across. Between 10 and 50 users, they saw a reasonable response time of one to two and a half seconds, which was great. That was adequate for their needs, and customers were pretty happy. If you can get below one second, that's ideal; the Nielsen Norman Group, who do a tonne of UX research, have repeatedly generated results saying that under a second just feels natural and fluid for customers. From one second to, I think, two, it feels like the customer is interacting with something that is responding in kind, as opposed to that fluidity. But anything over two and a half, and you start to get worried. So as they crept up to 100 concurrent users, they started to see these times increase. Now, you might be thinking: wait a minute, aren't they selling tickets to this streaming platform? Yes. Shouldn't they know how many people are coming to the stream? Yes, they should. But what happened was the technical team had set everything up, run their tests, and been happy with the performance, because they were testing with fewer than 50 users. And on the day of the stream, ticket sales spiked, very close to this chart.
And more and more people were coming onto the stream, which is, again, a great business success. The problem is the technical team wasn't aware of the spike in demand, so they weren't ready for what was coming. As this increased to about 1,000 users, the response time was up to 10 seconds. Now, I don't know about you, but the last time I waited 10 seconds for anything on the web was probably about 1997, right? This is not acceptable as a user experience. So the users were very, very unhappy, and they were leaving the stream. In fact, it crested closer to 10,000 users, which, again, is an amazing business success: they are delivering a streaming service into a very specific niche and seeing an amazing response to their offer. The challenge is they got into this really weird scaling loop: people would request access to the stream, and the app would scale as appropriate, but by the time that scaling capacity was in place, those requests had been abandoned. Then more people would come in, and so auto scaling was trying to figure out: should I scale up? Should I scale down? It just wasn't a good scene, because they didn't test regularly at scale. They were testing at a very small volume, which they thought was adequate for their needs. But the capacity is there in the AWS cloud. Yes, it's going to cost you a few dollars to scale out your tests, but wouldn't that be better than having egg on your face from crashing in front of customers?
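The "test regularly and at scale" lesson can be sketched as a tiny load-test harness: ramp concurrency well past what you expect on launch day and watch the latency percentiles. Here `request_fn` is a stand-in for whatever actually exercises your endpoint; this is an illustrative sketch, not the team's actual tooling.

```python
import concurrent.futures
import statistics
import time

def load_test(request_fn, concurrency_levels, requests_per_level=50):
    """Run request_fn at each concurrency level; report p50/p95 latency per level."""
    results = {}
    for level in concurrency_levels:
        with concurrent.futures.ThreadPoolExecutor(max_workers=level) as pool:
            def timed_call():
                start = time.perf_counter()
                request_fn()
                return time.perf_counter() - start
            futures = [pool.submit(timed_call) for _ in range(requests_per_level)]
            latencies = sorted(f.result() for f in futures)
        results[level] = {
            "p50": statistics.median(latencies),
            "p95": latencies[int(0.95 * (len(latencies) - 1))],
        }
    return results
```

Running the ramp through something like `[10, 100, 1000]` instead of stopping at 50 is exactly the kind of test that would have surfaced this team's 10-second response times before their customers did.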
Mark Nunnikhoven [12:26]
They also didn't practise: they didn't know what to do when things started to go wrong. Worse yet, because video and chat were tied together, they had no easy way to talk to their customers, and they had no backup plan. So they scrambled on social media and they scrambled through email, but it was not a good experience for the team responding, for the customers who had paid for the stream, or for the person they were showcasing in the stream. All around, a lot of this could have been avoided by testing at scale and, more importantly, by practising: OK, if this crashes, what do we do? Let's take a look at this solution through the lens of the Well-Architected Framework. If we look at the pillars: operationally it was OK, it wasn't that bad, because they had that one image the EC2 instances were all spawned off, so they kept the operations simple and everything was pretty much the same. Cost wasn't bad; they linked their scaling directly to paying customers. Of course, reliability and performance were not great. And security was solid, though they had to manage those instances, which is always going to weigh on this pillar. But, you know, not bad overall, pretty good for a first kick at the can, except for the fact that it failed utterly to meet their business requirements. Yikes. Very important to keep those in mind. So the team went back to the drawing board after that first disastrous day, because they had other events lined up. The first thing they did was break apart the application. They said: you know what, we can't have video and chat on the same thing, we need to separate them. We're going to put chat on its own pool of EC2 instances, and we're going to put video on its own separate pool as well. They didn't have to change the app to do this; they just rerouted things through the ALB. They routed chat to one pool, and they routed the video to the other.
But this raised a couple of questions as they were evolving the architecture. This is a great evolution: we can do it really quickly, and we can test it to see if it works. But are there other evolutions we can take to make our lives even easier? It turns out there were. Having separated out the video, they started to ask the logical question: do we even need EC2 instances to serve this video at all? That had been a sort of knee-jerk choice; it made sense for a proof of concept, and it made sense for a small team trying to get everything on one common instance type. But if we have to split it out and have two different pools of resources, maybe there's a better resource. And it turns out there was: Amazon CloudFront distributions were a better fit for the team. There's a lot of great streaming functionality in CloudFront. It will increase your performance, because it gets geographically closer to the user, and it's streamlined for delivering this type of content. So this is a strong evolution that made it a lot easier for the team to deliver a better quality product. It also, I would argue, follows the principle of using on-demand resources wherever possible. Now, I know what you're thinking: wait a minute, weren't they already using EC2 On-Demand Instances, because they hadn't figured out whether they wanted to make a reservation yet? Yes, they were. But that doesn't mean they were doing the best alignment of resources to cost that they could, and that's really what this principle is about. I know the phrase "on demand" captures people, because they think of EC2 On-Demand. But realistically, when it was EC2 instances, there was capacity that wasn't being used, whether that was CPU or network bandwidth. Moving to CloudFront, there's a direct relationship between a customer accessing that stream and CloudFront serving it up.
So every outbound cost they have for CloudFront is linked to income coming from a ticket. It's a much closer alignment, and they don't have CloudFront sitting there idle while they pay for it; CloudFront only charges you for the bandwidth you're consuming beyond the initial setup. This is a much better alignment with the principle, which really helped a small team. So operationally they're still about the same, because they still have a pool of EC2 instances, but they've got better cost optimisation, it's far more reliable, it's far more performant, and security is about the same, because, again, they still have EC2 instances.
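The cost-alignment argument is easy to see with back-of-the-envelope arithmetic. All rates below are made up for illustration; they are not real AWS prices.

```python
def monthly_ec2_cost(hourly_rate, instance_count, hours_per_month=730):
    # Always-on instances bill the same whether one viewer shows up or none.
    return hourly_rate * instance_count * hours_per_month

def monthly_cdn_cost(gb_delivered, price_per_gb):
    # CloudFront-style billing scales with the bandwidth viewers actually use.
    return gb_delivered * price_per_gb

# A quiet month with almost no viewers (illustrative rates, not real pricing):
idle_pool = monthly_ec2_cost(hourly_rate=0.17, instance_count=4)  # ~$496 regardless
idle_cdn = monthly_cdn_cost(gb_delivered=50, price_per_gb=0.085)  # ~$4 for 50 GB
```

In a busy month the CDN bill grows, but every dollar of it maps to a viewer who bought a ticket; the instance pool's bill is the same whether the stream is full or empty, which is exactly the idle capacity the principle targets.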
Mark Nunnikhoven [16:09]
So the team, having broken this apart and shifted the video delivery to CloudFront, is of course asking themselves: why is the chat app so big and so heavy? Well, it turns out it doesn't need to be. There's, in fact, an example project on the AWS blog that shows the exact solution the team started to lean towards: a serverless solution with AWS Lambda and AWS AppSync, with an Amazon DynamoDB table in the back end, because that was a much better fit for an online, real-time chat. Moving this to an entirely serverless design was a great evolution, because, again, they're lowering their operations and they're aligning their on-demand use of resources. And then they explored yet another evolution: they're actually looking at removing CloudFront, because it's too much of a burden. They kind of got hooked on the whole serverless thing, which is a great thing. They got hooked on asking: what else can we move up the stack to a better, lower-touch service? And they're investigating Amazon Interactive Video Service, which is a relatively new service. Essentially, it hides the CloudFront and S3 bucket usage in the background. You create an endpoint in the service, you push your video to that endpoint, and then you use the service's SDK to integrate it into your application, whether that's a web browser, a mobile app, or what have you. The team is just exploring this now, to roll it out for their customers.
And this goes to the core principle of automating all the things. If you've got things stable, if you're delivering good value to your customer, is there something you can do to make your life easier from an operational perspective? Customers won't really see a big difference with Amazon Interactive Video Service, but the team will, because it's one less set of things they have to worry about. They'll have a highly managed service where they can just push video through, and it scales out for them at a great rate and streamlines their back-end experience. So it ties in with automation: setting up the distributions, worrying about the cache, worrying about the S3 bucket, that kind of stuff, if you don't have to do it, don't do it. The CloudFront move was a much better solution than the instances, but this is an even better one: going to a fully managed service that's fit for purpose.
Mark Nunnikhoven [18:18]
So if we look at where we are with the solution overall: operations is through the roof, and cost optimisation, everything we've touched, has done better. If we look at where we started, we've made significant improvements; great work by the team. They've gone through multiple rounds based on the facts they're seeing, and each round uses those feedback loops. They're evolving the architecture through lessons learnt; a really great fit for them, and kudos to the team for making these improvements and moving forward. So let's use another example. This one was a team I spoke to about their legacy system's data storage. They had taken an application that was designed for an on-premises environment, and that they were serving out of their own data centres, and they were migrating it into the cloud. They forklifted it into the AWS cloud and started their experience from there. Again, we've removed some of the details just so it's not obvious who we're talking about, but there are some great lessons in this one.
Mark Nunnikhoven [19:12]
We're going to look at this from the point of view of a number of devices sitting out in labs around the world. Each of these is sending live data back; in this example we're just showing weather data, because it's easy to relate to, and there are a lot of different metrics coming in pretty much constantly. It's a real-time system that sends events to a central server system in order to analyse the data and cross-reference it. So there's a need to pull in real-time events, and then a secondary need for analysis across all of those events. This is a data-heavy application, and the good news is that all of the processing is done centrally. Now, you may be screaming in your head, "this is going to be an IoT solution", and it would be, for part of it. We're going to focus on the data storage piece, because I think this is where a lot of people are sitting, especially in large enterprises: you have something that was built, that is working, that is solving your business needs, and in this particular case was making a lot of money. But when they were moving into the AWS cloud, they went along a journey that I think we've all been on, so we want to highlight that. If we look at the architecture when they forklifted what they had into the AWS cloud: the devices were talking to an Elastic Load Balancer, an ELB, which went to a pool of Amazon EC2 instances in an auto scaling group running their custom app. In the back end, it was running an Amazon RDS Oracle instance, or a set of instances. And of course, they were monitoring it all with CloudWatch. This solution, and this is really important to call out, is a forklift of what was on premises, and it worked great. It was highly reliable, it fit the purpose, and they had enough income to justify the high Oracle licencing costs; they were an Oracle shop already.
So, you know, we said no judgements: no judgements. They were making a tonne of bank off this.
Mark Nunnikhoven [20:59]
You can probably figure out where we're going with this one as well. Having the data collection and analysis all in one instance pool may cause some challenges, right? That's an important thing to think about. But it's that Amazon RDS Oracle instance where our first issue is going to pop up. Here we're looking at a chart, again normalised to make it easier for you. On the y-axis is the database size, the amount of data we're storing in the database, in terabytes, and across the x-axis is our event volume, in millions. When we're looking at one to five million events, we don't have that much data being stored. When we hit 10 million, we're starting to creep towards a terabyte. When we get to 100 million events, we're at about eight or nine terabytes. And we quickly spike north of 65 to 70 terabytes when we're hitting 1,000 million events. The challenge, based on the facts we're seeing, is that as these events spike up, the database costs are going to increase massively; they're going to follow this chart, because storing data in a traditional RDBMS is expensive. But again, this application was designed on premises, and it was working fine there. The challenge specifically is that this team was seeing about eight or nine terabytes of data coming in every single day, and that was only going up as they became more successful. So why is that a problem?
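The chart described above is roughly linear, so it can be approximated with a simple model. The 0.07 TB-per-million-events slope is inferred from the normalised figures in the talk (about 70 TB at 1,000 million events), so treat it as illustrative rather than measured.

```python
import math

TB_PER_MILLION_EVENTS = 0.07  # assumed slope: ~70 TB at 1,000M events

def projected_storage_tb(events_millions):
    # Linear growth: a traditional RDBMS keeps every event on expensive storage.
    return events_millions * TB_PER_MILLION_EVENTS

def storage_nodes_needed(events_millions, node_capacity_tb=64):
    # How many maxed-out 64 TB nodes that event volume fills.
    return math.ceil(projected_storage_tb(events_millions) / node_capacity_tb)
```

A model like this is exactly the kind of fact-based feedback loop the framework asks for: project the curve forward before the bill arrives, rather than after.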
Mark Nunnikhoven [22:31]
Well, because it was all being stored in the database, they were spending a lot of money. If they're seeing about eight terabytes a day, one of the maxed-out 64-terabyte storage nodes, even with the savings of a three-year reservation, was costing them $15,500 a month, after a $32,500 reservation over the three years. And remember, they're filling one of these every eight days. So every eight days, they're spinning up another one of these nodes at $15,500 a month. Again, they're making enough money that it's not huge, but it's an area of concern. So right now, if we look at the solution status across the pillars of the framework: operationally it's fine, a little burdensome, but OK. Cost isn't really optimised, but it's not that bad, because they're making money. Rock-solid reliability, not great performance because of those Oracle instances in the back end, and security is OK.
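Using the figures from the talk, the run-rate arithmetic can be sketched directly: one 64 TB node fills roughly every eight days, and each node keeps billing $15,500 a month once it exists.

```python
def nodes_after(days_elapsed, fill_days=8):
    # One node is live from day zero; a fresh one comes online every fill_days.
    return days_elapsed // fill_days + 1

def monthly_run_rate(days_elapsed, monthly_per_node=15_500):
    # Every node keeps billing once it exists, so the run-rate only grows.
    return nodes_after(days_elapsed) * monthly_per_node
```

After a single 30-day month the team is already paying for four nodes, a $62,000 monthly run-rate, before counting the $32,500 reservation attached to each new node.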
Mark Nunnikhoven [23:30]
But of course, the team's looking at that going: that's a big bill. So they start there: they switch that Oracle instance over to Amazon Aurora on RDS. This requires a little bit of custom code change in their app, but not much, and it doesn't fundamentally change how they handle data. When they switched to an Amazon Aurora instance at a similar performance and storage tier, the cost dropped to $8,000 a month with a $5,000 reservation, so they actually ended up saving 49% through this change, which is an absolutely massive win for the team. They were ecstatic, because it was minimal effort on their part and a huge saving on the back end. That readjusts our pillars: cost optimisation is now rising steadily, because we've made one move to save 49% on our data storage costs. But of course they're not stopping there; there's a lot more to do. They realised that storing all the data in a relational database isn't really the way to go: let's start taking advantage of Amazon S3. So instead of filling up the database every eight days, we can keep some data live in the database but start pushing more and more events out to S3, and that's going to cut costs down significantly. Storing a similar amount in S3 was costing them around $1,000 a month: significantly cheaper, and way better for scalability. That was a great evolution in their architecture; they're taking advantage of the cloud. This is a very common pattern. We see people forklift what they have, because it was working, and move it as is. Great, step one, you should absolutely do that. But according to the Well-Architected Framework, we should continue to evolve, and as we're evolving, we're taking advantage of the other powers available in the cloud. So here we've cut costs by switching to Aurora.
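The roughly 49% saving checks out if you amortise the up-front reservations over the three-year term; a quick sketch using the numbers from the talk:

```python
def amortised_monthly(monthly_cost, reservation_cost, term_months=36):
    # Spread the one-time reservation across the three-year term.
    return monthly_cost + reservation_cost / term_months

oracle_node = amortised_monthly(15_500, 32_500)  # RDS Oracle, per node
aurora_node = amortised_monthly(8_000, 5_000)    # comparable Aurora tier, per node
savings = 1 - aurora_node / oracle_node          # roughly half, in line with the ~49% cited
```

Multiply that by a fleet that was growing by one node every eight days and the scale of the win is clear.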
And we're cutting costs even more by moving some of the data that isn't required to be instantly available out through their custom code, making it available in a different manner. In fact, they realised: wait a minute, we don't even really need the relational database in the way it exists. If we're willing to rewrite our data architecture and our code, we can start to leverage DynamoDB. Now, is that going to be cheaper than Aurora? Not necessarily, but it ended up being less expensive for them, and it was more performant. They got a lot more flexibility, and they were able to decrease the response time for requests in the back end, which was great. And this is all just changing their custom code, again evolving the architecture to see those results, whether in the back end for the team or the front end for their customers.
Mark Nunnikhoven [26:04]
But they didn't stop there; they went even further, because it's kind of addictive. They switched the ELB to an Application Load Balancer and started using AWS Lambda functions for data ingest. The functions would decide whether the data needed to go into the Amazon DynamoDB table to update it, do some aggregate work right there in Lambda and in DynamoDB, and then push a bunch of it into S3. And they moved a lot of their batch processing and their big analysis to Fargate: they took their custom code, wrapped it up in a new container, and dropped it into Fargate, in order to not only increase scalability but also decrease response time, which is great. It also made things less expensive and less operationally burdensome for the back-end team, making it easier to keep focusing on value as opposed to maintaining stuff. And they tested regularly at scale; this is what their tests were telling them. They said: hey, if we make these changes, we'll be able to scale better and continue to be successful. Because they had hit this beautiful business snowball, where labs started talking to each other, saying, "Hey, have you seen this great service?", they were getting more and more customers, and it was really critical for them to free up time to deliver more features for that broader customer base. It also better aligns with using on-demand resources: they've eliminated those instances, they're spinning up containers as required to do batch jobs, and they actually reduced the execution time of those batch jobs through this approach, which is great. And they automated everything; everything is becoming highly automated, directly tied to customer demand. This is just a really nice evolution of where they went.
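The ingest split can be sketched as a Lambda-style handler that keeps recent, queryable events hot and archives the rest. Everything here (the 24-hour window, the `write_hot`/`write_archive` sinks) is hypothetical; in the real system those sinks would be something like a DynamoDB `put_item` and an S3 `put_object`.

```python
import json
import time

HOT_WINDOW_SECONDS = 24 * 60 * 60  # assumed: keep the last day queryable

def route_event(event, write_hot, write_archive, now=None):
    """Send a reading to the hot store (DynamoDB-style) or the archive (S3-style)."""
    now = time.time() if now is None else now
    record = json.dumps(event)
    if now - event["timestamp"] <= HOT_WINDOW_SECONDS:
        write_hot(record)       # recent: needed for real-time queries
        return "hot"
    write_archive(record)       # older: cheap bulk storage for batch analysis
    return "archive"
```

With the sinks injected as plain callables, the routing logic is trivially testable without any AWS dependencies, which also makes the "automate everything" and "practise" principles cheaper to follow.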
So if we look at the overall solution status: reliability stayed the same throughout, even though the whole architecture changed, which I think is a key callout. Performance is now through the roof. Security is easier, because they've gotten rid of a lot of stuff. Cost is even more directly optimised to their revenue and their needs. And it's easier to operate overall. Just a huge win by following the principles of the Well-Architected Framework. So what do you need to take away here? What's
Mark Nunnikhoven [28:03]
the key? What do you need to understand? Well, the AWS Well-Architected Framework is a freely available resource from AWS; you can get started with it right out of the gate, today. And it's all about finding the best solution at the time. That's key: at the time. There's no one best solution, and you're never done. Either you're changing, the AWS cloud is changing, or your customer demand is changing. The beautiful thing is that you have the tools to meet that demand, to change with it, and to innovate with it; you just need to apply them in a logical manner, which is what the framework is all about. It does that through its six core principles, and they're really things you're hopefully doing now, things you instinctively felt were there; the framework codifies them and formalises them a little bit better: removing those idle resources, automating everything, making sure you're testing at scale, changing and evolving your architecture based on data, and practising for any possible scenario that could go wrong, or even could go right, so that you're not stuck at a loss if something happens in production. For your ease of use, the Well-Architected Framework is divided across five main pillars. There are individual deep-dive white papers on each of them, and the AWS Well-Architected Tool is aligned to these pillars as well. But remember, the pillars are an easy way to look at it, and they connect: you can't have one without the others, you have to pay attention to all five. Finally, there's the core idea of feedback loops: have an idea, try it out, learn from it, iterate, and repeat that process. Just keep on going. That's how you're going to move forward. That's how you're going to build better. That's how you're going to build well in the AWS cloud. Thank you very much. My name is Mark Nunnikhoven. I'm an AWS Community Hero.
I'm also the VP of Cloud Research at Trend Micro. You may have noticed from the "-s" on this session that it's actually a sponsored session. But you may have also noticed that we didn't talk about product. I think it's really important just to learn how to build better. Even though Trend Micro is a security company (we obviously sell security products, and you can learn more about those on our sponsor page or at trendmicro.com/cloud-one), we're firm believers that security is just part of building well, which is why I wanted to teach you today about the AWS Well-Architected Framework. If you have any questions, you can always hit me up online at markn.ca. Thanks a lot for attending. Enjoy the rest of AWS re:Invent.