Architecting for Sustainability in the cloud by Katja Philipp

Automatic Summary

Architecting for Sustainability in the Cloud: Understanding the Complexities and Best Practices

Welcome to this session, where we explore the concept of architecting for sustainability in the cloud. I'm Katja, a solutions architect at Amazon Web Services (AWS), specializing in sustainability. In this session, we look at how to optimize workloads and architectures for sustainability and at the best practices that exist in this area.

Ready? Let's dive in.

Understanding Sustainability

Defining Sustainability: Before we dive into architectural best practices for sustainability, let's briefly define the term. According to the United Nations, sustainability, or sustainable development, means meeting the needs of the present without compromising the ability of future generations to meet their own needs.

Sustainability at AWS

AWS pursues various sustainability initiatives. We will mainly discuss environmental sustainability initiatives that address decarbonization, and how we, as technologists, can optimize our applications for sustainability.

Sustainability Transformation

The digital era has transformed how we connect with our customers and conduct our businesses. Guess what's next on the horizon? Sustainability transformation. This transformation runs in parallel with digital transformation, and businesses that understand the intersection of digitalization and sustainability are the ones it reshapes most successfully. This shift has given rise to the term "Twin Transformers" or "Twin Transformation."

Cloud, Sustainability and AWS

In this era of sustainability transformation, the cloud plays an instrumental role. There are three general ways the cloud can support the sustainability transformation:

  1. Migrating workloads to the cloud for higher energy efficiencies
  2. Optimizing workloads for further conservation of energy (our focal point)
  3. Transforming other business areas using technology and data to reduce carbon emissions

The Greenhouse Gas Protocol and AWS Initiatives

Data centers are an integral part of the cloud infrastructure. They contribute to all three emission scopes of the Greenhouse Gas Protocol: Scope 1 (direct emissions, for example from backup generators), Scope 2 (indirect emissions from the electricity consumed), and Scope 3 (everything else, such as constructing the data center and producing, transporting, and retiring equipment). AWS is working to reduce its carbon footprint by investing heavily in renewable energy projects and by partnering with processor producers.

Sustainability Shared Responsibility

Sustainability is a shared responsibility. While AWS is responsible for the sustainability of the cloud, customers are responsible for sustainability in the cloud. This includes designing resource-efficient, sustainable applications and optimizing data usage patterns and software design.

Workload Optimization for Sustainability

The three primary proxy metrics to consider when optimizing workloads for sustainability are resource usage (compute, storage, data transfer), resources per unit of work, and utilization per resource. Reducing the resources needed per unit of work, increasing the utilization of each resource, and optimizing overall network utilization through metrics such as data size or the distance the data has to travel all contribute to sustainability.

Best Practices for workload optimization

Best practices for workload optimization can be organized under five areas:

  • User Behavior Patterns: How are your users using your workload? Where are they located?
  • Software and Architecture Patterns: How energy-efficient is the programming language that you're using? How have you architected your code?
  • Hardware Patterns: Have you picked the right hardware for your needs?
  • Data Patterns: Understanding and classifying your data and using the appropriate storage technology
  • Development and Deployment Processes: How are you developing your applications and utilizing your build environments?

A Practical Application: Machine Learning Workload for Weather Forecast

Let's consider optimizing a Machine Learning workload developed for forecasting extreme weather events. Throughout all parts of the Machine Learning lifecycle, we can apply the best practices of the sustainability pillar.

Conclusion

Creating sustainable architectures in the cloud is no longer a choice but a requirement in today's digital and increasingly eco-aware world. By setting sustainability goals, assessing your proxy metrics, identifying areas of improvement, and continuously reassessing the impact of those changes, we can create architectures that are not only technologically sound but are also environmentally friendly.

Have more queries? Feel free to connect on LinkedIn. Hope this session enlightened you about sustainability in the cloud. Remember, it's a shared responsibility. Let's create a sustainable future together!


Video Transcription

All right, it's half past, so let's start this session. Welcome, everyone, to my session on architecting for sustainability in the cloud. My name is Katja and I am a solutions architect with Amazon Web Services, focusing on the topic of sustainability, and within that I support my customers in their whole sustainability transformation. That means, for example, reporting, but also, because I'm a solutions architect, how can I actually optimize my workloads and my architectures for sustainability, and what kind of best practices exist here? Oh, sorry, I just have a small issue with my browser here.

OK, I'm back. All right. If you have any questions, feel free to put them in the chat and I'll be checking that out. OK, let's jump right into it. Before we dive into architectural best practices, I wanted to start with a brief overview of the term sustainability: what do we actually mean by that? Then I'll talk a bit about sustainability at AWS, just to give you an overview of the kind of initiatives we have here. And then, of course, I'll talk about the different best practices for optimizing workloads for sustainability. I'm sure all of you are aware of this term sustainability and how important it is for us. OK, the slides look OK here. I brought the definition of the United Nations with me anyway, just so we are all on the same page. The United Nations defines sustainability, or sustainable development, as meeting the needs of the present without compromising the ability of future generations to meet their own needs.

I think this is an abstract definition, but it gets to the point of what sustainability is actually about. Sustainability is a broad term. It means not only decarbonizing our business operations; it's also about water conservation, responsible employment, or the circular economy. It really is a very broad term. Today we will focus mainly on environmental sustainability and initiatives that address decarbonization, and on how we as technologists can bring our power to optimize our applications for sustainability. And sustainability is, of course, important to all of us.

And in this technical field, we probably also see a shift towards a sustainability transformation. I think that's always quite interesting, because as technologists we are so familiar with the term digital transformation. We've done that in the last decade: we've been able to connect with our customers, with everything that we make, and we've been able to transform our businesses.

But now we are embarking on the sustainability transformation, using the same tools as in the digital transformation but focusing this time on sustainability. And even though we see the shift, we should still see these two transformations together, because companies that have understood both digital and sustainability transformation, and are driving innovation at exactly the intersection of digitalization and sustainability, are much more successful in the market.

Here you may have heard the term Twin Transformers or Twin Transformation. But there definitely is a very important difference between sustainability and digital transformation, and you may have already thought about that: sustainability, of course, takes decades, while digital transformation happens in maybe months or years. So the really important challenge with sustainability is: how can we achieve an outcome that is decades away, while we know that in order to get there we need to accelerate the process today? This is really a big challenge in itself.

Here I brought some other challenges, some examples that our customers face, but that we as AWS are also facing. How can I actually identify carbon emission hotspots? How can I measure and reduce the environmental footprint, including energy, waste, or water? And how can I collaborate with others in my value chain to actually reduce emissions? As a solutions architect focusing on sustainability, I support my customers with these different challenges.

Now let's look specifically at the cloud. I will talk about AWS, of course, because that's where I work, but in general these concepts can be applied to different cloud infrastructures or cloud providers. Here are the three general ways the cloud can support the sustainability transformation. First of all, migrating workloads to the cloud in order to take advantage of the higher energy efficiency and higher utilization of cloud infrastructure.

Secondly, optimizing workloads for a further conservation of energy, and this is what we'll be talking about today. And lastly, transforming other areas of your business with the use of technology and data to reduce carbon emissions across all of your operations, and not just IT.

Oops, that was a bit too fast. Because of the limited time we have today, we'll focus on the first two parts. So let's first talk about migrating. Migrating workloads to the cloud can lower your carbon footprint compared to the typical on-prem data center. I think that is important to understand. I was at a conference a couple of weeks ago and someone asked me: we already did all these optimizations for resource efficiency 40 years ago, so what has actually changed now? With the cloud, we have more centralization into data centers that multiple companies are using and that the cloud providers are optimizing for energy efficiency. That by itself can reduce the carbon footprint, compared with having a lot of small data centers that are not as energy efficient and not as highly utilized. But I would like to give you some more insight into the different emissions that we actually have in a data center and then talk about how we are reducing these.

You may be familiar with the Greenhouse Gas Protocol and the different Scope 1, 2, and 3 emissions, but I'll explain these anyway, so no worries if you haven't heard of them yet. What kind of emissions do we have in a typical data center? We probably have a backup generator in case of power outages or to stabilize the grid, and these direct emissions contribute to our Scope 1 emissions. We are also consuming an electricity mix coming from the local grid as well as from our investments into renewable energy projects, and these carbon emissions contribute to our Scope 2 emissions. As you can imagine, these take up a significant portion of our overall greenhouse gas emissions. And Scope 3 is really everything else: the data center itself, which needed to be built from concrete and steel, and the equipment that we have inside the data center.

So electronic equipment, racks, and servers: all of these upstream emissions, meaning the production of this equipment and of the data center itself, or the transportation of the equipment. But also the downstream emissions, so things like how we retire our servers when they come to their end of life, in terms of recycling or refurbishing. As you see, all of these details have different carbon emissions attached to them. And all of this needs to get to net-zero carbon by 2040, which is our long-term target at AWS.

Here are just some initiatives we started to move closer to this goal. First of all, of course, Scope 2 emissions, so the energy that we're consuming. Here we are investing heavily in different renewable energy projects. Amazon is actually the largest corporate buyer of renewable energy in the world, and we are on a path to powering all of our operations with 100% renewable energy by 2025. That means not only AWS data centers, but also corporate offices and fulfillment centers. With regard to Scope 3, we can think about new initiatives and new innovations, for example in the construction of our data centers, like the use of low-carbon concrete. There are also several goals on water usage effectiveness and power usage effectiveness.

Another really interesting part, I think, is the chip development. Here we partner with Arm to develop a new family of instances that are more energy efficient. Just last year we brought out the Graviton3-powered instances, which are 60% more energy efficient than comparable instances. So just by picking this instance type, you can already improve the energy efficiency and the carbon footprint of your applications considerably. With all of these initiatives, we are reducing our carbon footprint in order to give you access to a cloud that is as sustainable as possible.

Just to quantify this a bit: a benchmark of typical on-prem data centers, colocation facilities, and cloud providers found that, for example, in Europe our AWS infrastructure is up to five times more energy efficient than the typical on-prem data center. That also means that just by migrating a workload to the cloud, a customer can reduce their carbon footprint by nearly 80%.

And here we haven't even talked yet about the architectural optimizations that you can take into account. So where does this come from? It comes from the higher energy efficiency of both servers and data center facilities, and also from the higher utilization of our servers.
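The relationship between the two figures quoted above ("up to five times more energy efficient" and "nearly 80%") can be sketched with a little arithmetic. This is a simplified illustration, not the benchmark's methodology: it assumes the same workload runs in both environments and that emissions scale directly with energy use. The function name is hypothetical.

```python
def footprint_reduction(efficiency_factor: float) -> float:
    """Fraction of energy saved when the same work runs on infrastructure
    that is `efficiency_factor` times more energy efficient.

    Assumption: emissions are proportional to energy consumed.
    """
    if efficiency_factor <= 0:
        raise ValueError("efficiency_factor must be positive")
    # Work that needed 1 unit of energy now needs 1 / efficiency_factor units.
    return 1.0 - 1.0 / efficiency_factor


# Five times more efficient means one fifth of the energy, i.e. 80% less.
print(f"{footprint_reduction(5.0):.0%}")  # prints "80%"
```

In practice the carbon reduction also depends on the electricity mix of each site, which this sketch deliberately ignores.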

All right. If you're familiar with AWS, you may have seen the shared responsibility model for security, and this is also applicable to sustainability. Sustainability is a shared responsibility between us as AWS and our customers. We as AWS are, of course, responsible for the sustainability of the cloud: our data centers, the global infrastructure, the use of renewable energy, or partnerships with processor producers. On the other hand, our customers are responsible for sustainability in the cloud.

So your workloads, essentially: designing your applications for sustainability and resource efficiency, and optimizing data usage patterns or software design. We'll now jump into the best practices on how you can actually optimize your workloads. First of all, let's talk a bit about what that actually means and based on what metrics I can optimize my workloads for sustainability. When we talk about optimizing for sustainability, we focus on energy reduction and the optimization of efficiency across all components of our workload, essentially using the least amount of resources needed and using these to the fullest. Now you may be thinking that your application teams are already optimizing their applications for different metrics such as cost or response time. So what kind of metric can I use for sustainability? Here I want to quickly talk about proxy metrics before going into the recommendations. Proxy metrics are metrics that serve as fine-grained substitutes for CO2 emissions. Of course, we can and should look at the total carbon footprint that, for example, our AWS usage causes, and you can do this with the customer carbon footprint tool.

But this number of total carbon emissions is not really tangible for an optimization, because we just want to reduce it but don't really know how. So we should also look at other metrics that make this a bit more tangible. You already heard that optimizing for sustainability is in large part about resource efficiency. So first of all, we can look at the resources that we are using in terms of compute, storage, and data transfer. In general, we want to reduce this usage, which also lowers cost, as in the cloud we have a pay-per-use model. In order to normalize the resource usage to a unit of work, we should also look at the resources per unit of work and reduce this number. That could be, for example, how many resources do we need to serve 10,000 customers, or how many vCPU hours do I need to handle 100 orders? So essentially we put our resource usage in proportion, in order to compare this KPI to previous months and to fluctuating usage. We can also look at increasing the utilization per resource. Now, if we talk about networking, utilization in terms of sustainability is a bit hard to measure. So we can optimize the overall network utilization by metrics like data size or the distance my data has to travel.

Here it's just important to keep in mind that there may be trade-offs with common non-functional requirements such as data retention, response time, or availability. If you think of storage, just ask yourself: do I need access to all of my data in milliseconds? OK.
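The proxy metrics described above can be sketched in a few lines. This is a minimal illustration rather than an AWS tool or API: the `MonthlyUsage` type, its fields, and the sample numbers are all hypothetical, and the vCPU hours and order counts would come from your own monitoring and business data.

```python
from dataclasses import dataclass


@dataclass
class MonthlyUsage:
    """Hypothetical per-application usage record for one month."""
    vcpu_hours: float           # compute provisioned during the month
    avg_cpu_utilization: float  # 0.0 to 1.0, averaged over the month
    orders_handled: int         # the unit of work for this application


def vcpu_hours_per_100_orders(usage: MonthlyUsage) -> float:
    """Resources per unit of work: lower is better."""
    if usage.orders_handled <= 0:
        raise ValueError("orders_handled must be positive")
    return 100 * usage.vcpu_hours / usage.orders_handled


# Invented sample data: total usage went up, but so did the work done.
last_month = MonthlyUsage(vcpu_hours=1200, avg_cpu_utilization=0.35, orders_handled=40_000)
this_month = MonthlyUsage(vcpu_hours=1500, avg_cpu_utilization=0.50, orders_handled=60_000)

for label, usage in (("last month", last_month), ("this month", this_month)):
    print(
        f"{label}: {vcpu_hours_per_100_orders(usage):.2f} vCPU hours per 100 orders, "
        f"{usage.avg_cpu_utilization:.0%} average utilization"
    )
```

Normalizing by unit of work is what makes the two months comparable: in this invented example, absolute vCPU hours rose while the resources needed per 100 orders fell and utilization improved.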

So now we have talked about the metrics. Let's continue on to the best practices. How can I actually optimize my workloads? Here again, you may be familiar with the Well-Architected Framework, which provides different best practices to evaluate and implement architectures along its different pillars. Last year we announced the new sustainability pillar, which includes best practices to design, implement, and operate workloads in a sustainable and resource-efficient way, to essentially further reduce your energy usage. This is an important step towards seeing sustainability as the same kind of non-functional requirement as cost or security when you're designing an application. The best practices comprise these five different areas. First of all, user behavior patterns: looking at how your users are actually using your workload, where they are located, and when they use it, in order to drive decisions on your architecture. In software and architecture patterns, we may look at how energy efficient the programming language is that you're using, because there are already a lot of differences here.

But also: how have you architected your code, and which areas of your code consume the most resources? Hardware patterns is essentially about whether you have picked the right hardware for what you are trying to achieve. So here again we are talking about Graviton, but also about right-sizing. Next to hardware patterns, we have data patterns, where I also often see a huge area for optimization: gaining an understanding of your data, classifying it, and using the right storage technology. And lastly, in development and deployment processes, we look at how we are developing our applications and how we are utilizing our build environments, for example.

Just to make this a bit more tangible, I brought an example. Let's imagine that we have a machine learning workload that we want to optimize. As you may know or can imagine, training artificial intelligence and machine learning workloads of course uses a lot of energy, but we can also use machine learning to optimize for resource efficiency or to fight the effects of climate change. So let's imagine we are building a machine learning model to forecast extreme weather events. Here on the right you can see the machine learning lifecycle, an iterative process with the six phases of developing a machine learning workload. It starts with the identification of a business goal, framing this as a machine learning problem, collecting and preparing data, developing and training a model, and then deploying and monitoring it. Throughout all parts of this lifecycle, we can of course apply the best practices of the sustainability pillar. So let's look at the data patterns as an example. Oops, sorry. Already at the data processing stage, we can evaluate whether we can avoid data duplication by using existing publicly available data sets.

Here, for example, the Amazon Sustainability Data Initiative provides access to different data sets for free, such as weather data or satellite data. We should also classify our data and implement data lifecycle policies in order to move data to energy-efficient storage and use appropriate storage tiers, essentially moving data to colder storage tiers the less frequently we use it. We should also define data retention periods and delete unused data. And lastly, we should look at data movement across networks. Here we should, of course, minimize the movement over the network by storing our data as close as possible to our producers and training our models as close as possible to our data. If we still need to move data over the network, we can apply compression in order to reduce the size of the data. These were just some examples to make it a bit more tangible. If you're interested in learning more about this, please check out the Well-Architected sustainability pillar and white paper. And yeah, I know I've gone a few minutes over.
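The data patterns above (storage tiers by access frequency, retention periods, and compression before transfer) can be sketched as follows. This is only an illustration under stated assumptions: the tier names, the 30-day and 180-day thresholds, and both function names are hypothetical and do not correspond to specific AWS storage classes or APIs. The compression uses Python's standard `gzip` module.

```python
import gzip


def choose_storage_action(age_days: int, days_since_last_access: int,
                          retention_days: int) -> str:
    """Pick a lifecycle action for a data object.

    Thresholds and tier names are illustrative assumptions only.
    """
    if age_days > retention_days:
        return "delete"   # past the defined retention period
    if days_since_last_access <= 30:
        return "hot"      # frequently used: keep on fast storage
    if days_since_last_access <= 180:
        return "warm"     # occasionally used: infrequent-access tier
    return "cold"         # rarely used: archival tier


def compress_for_transfer(payload: bytes) -> bytes:
    """Shrink a payload before it has to cross the network."""
    return gzip.compress(payload)


# Repetitive records, such as weather readings, tend to compress well.
readings = b"station=42,temperature_c=21.5,humidity=0.63\n" * 1_000
compressed = compress_for_transfer(readings)
print(len(readings), "bytes before,", len(compressed), "bytes after compression")
print(choose_storage_action(age_days=200, days_since_last_access=190, retention_days=365))
```

How much compression helps depends on the data; already-compressed formats gain little, so it is worth measuring before and after.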

So here, just as a summary, is what I would like you to take away from this session. First of all, of course, set sustainability goals and introduce mechanisms to assess these proxy metrics per team and per application, because of course we can't optimize what we can't measure. Then look at these different best practices and identify areas of improvement. And then prioritize your findings, make changes, and assess the impact again by looking at the proxy metrics. All right.

Thank you so much for listening in. If you would like to know more about this topic, you can definitely reach out on LinkedIn. I hope you have an amazing day and lots of interesting sessions by everyone. Thank you.