Sunday, 2 September 2018

Air v Liquid - Part 3 - Cooling v Heat rejection

Following on from the previous 2 articles, I'm now going to look at cooling v heat rejection in the data centre environment.

I stated with some authority that cooling per se is a misnomer; to date I've had no comments refuting that assertion, so I'll continue.

Cooling, by the strict technical definition, is "the transfer of thermal energy via thermal radiation, heat conduction or convection".
Heat rejection is a component part of these processes.
So, when we cool something we effectively remove heat, and that's precisely what we do in a data centre.

The question, and the topic of these articles is....

Air v Liquid

If we were to look at the data centre ecosystems on almost every continent on this planet, we would find that in 99.99% of cases the medium for heat transfer is air, good old air.

And the principal reason for this is that most computer equipment, servers, storage and networking equipment is designed around the use of air as a cooling medium.

Let's look at air.

So, the main concept here is the "air cycle", for the purposes of this article we're going to start the cycle at the exit of the CRAC/H unit, but we could easily start anywhere in the cycle.

Air is pushed by a fan into a floor void at positive pressure. The air escapes into the room through the prudent placement of floor tiles (in front of the rack, please refer to the EUCOC for further guidance!), but could easily escape from a whole host of gaps, holes and other routes (hence best practice recommends stopping all potential sources of leaks). From the tile, air is forced upwards and, hopefully, into the main inlet of the server. The air then passes over the heat-producing components, is exhausted through the rear of the server and moves upwards (hot air tends to rise, recall your physics lessons from school). It then rises to ceiling level and may, if coerced, find its way back to the top of the CRAC/H unit. What happens inside the CRAC/H unit is a mystery to me!
Nah, it's not, just jesting. The air is passed over a cooling coil and heat is transferred to the coil. Inside the coil is a liquid, which is pumped to an outside unit where the heat disappears into the ether by a host of different methods: dry coolers, evaporative cooling, cooling towers or a chiller. What is key is that the liquid transfers its heat to the outside source and thus becomes cooler; this liquid is then returned to the internal unit, and at the bottom of the coil it is considerably cooler than the air at the top. Thus the heat generated by the IT equipment is removed and cooler air is supplied.
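To put some rough numbers on this cycle, the sensible heat equation Q = ṁ·cp·ΔT links the IT load to the amount of air the CRAC/H must move. The sketch below is a back-of-envelope illustration; the 10kW rack load and the air properties are my own assumed values, not figures from any particular facility.

```python
# Back-of-envelope sketch of the air cycle's heat removal, using the
# sensible heat equation Q = m_dot * cp * delta_T.  All values are
# illustrative assumptions, not figures from a specific facility.

CP_AIR = 1.005   # specific heat of air, kJ/(kg*K), at around 20 degC
RHO_AIR = 1.2    # density of air, kg/m^3, at around 20 degC

def airflow_for_load(it_load_kw, delta_t_k):
    """Return (mass flow in kg/s, volumetric flow in m^3/s) needed
    to carry away it_load_kw of heat at the given delta T."""
    m_dot = it_load_kw / (CP_AIR * delta_t_k)   # kg/s
    return m_dot, m_dot / RHO_AIR               # m^3/s

# A hypothetical 10 kW rack with a 15 degC supply/return delta:
m_dot, v_dot = airflow_for_load(10, 15)
print(f"{m_dot:.2f} kg/s, {v_dot:.2f} m^3/s")  # about 0.66 kg/s
```

Note that halving the delta T doubles the required airflow, which is one reason wide supply/return deltas are prized: less fan energy per kW of heat removed.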

[NOTE: Some systems will differ in the approach and method of heat rejection but the principle is the same]

The air temperatures will differ depending on the desired room temperature but, if you recall, the target room temperatures quoted by my students average between 18-21℃.

Let's introduce the concept of supply and return temperatures, and the control of them. Supply is the temperature of the air from the bottom of the CRAC/Hs being pushed into the room; return is the temperature of the air as it enters the top of the CRAC/Hs. So when my students speak of a range of 18-21℃, this may be controlled either by a supply setpoint of 18-21℃ or by a return setpoint of between 30-35℃ which equates to a supply of 18-21℃. The key is something called the delta T, the difference between the cold air and the hot air.

The optimum delta T is 15℃, so a 33℃ return will provide an 18℃ supply.

Understanding this key concept is important. Many facilities unfortunately have AC systems with no user controls, the control points having been factory-set to a standard range. Some facilities operate on return temperatures, and sometimes these are set quite low, which in turn means that the delta T forces the supply temperature lower. In some cases (anecdotally, from colleagues) supply temperatures reach levels where meat could be stored, around 5-8℃, causing a considerable amount of energy to be consumed and potentially causing problems for IT kit at the lower end of the operational range (5℃).
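The supply/return relationship can be captured in a couple of lines. This is a toy model, not any vendor's controller logic; the setpoints are illustrative only.

```python
# Toy model of supply-vs-return control: with a fixed delta T,
# controlling on return temperature implicitly fixes the supply.
# Setpoints here are illustrative, not from any real controller.

DELTA_T = 15  # degC, the optimum delta discussed above

def supply_from_return(return_temp_c, delta_t=DELTA_T):
    """Supply temperature implied by a given return setpoint."""
    return return_temp_c - delta_t

print(supply_from_return(33))  # 18 degC at the floor tiles
# A low return setpoint drags the supply down to "meat store" levels:
print(supply_from_return(21))  # 6 degC
```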

Got it? Good, now to liquid...

There are 4 main liquid-cooled solutions in use today:

1. Rear door cooling: the cooling loop for the CRAC/Hs is extended to the rack, where it meets heat exchangers in the door frames. The hot air from the servers is thus cooled immediately before it leaves the rack footprint; normally the room itself is not cooled.

2. Cold plate: the heat-producing components have copper piping to remove the heat at source. This is then treated similarly to rear door cooling and the heat taken away using conventional methods; again, the room is not cooled.

3. Immersion (1): server motherboards are actually immersed in baths filled with a non-conductive fluid, with a heat exchanger situated near the bath to remove the heat; natural convection moves the heated liquid to the heat exchanger. Power and connectivity are provided by a common bar. In this and the following scenario, the rooms are not cooled.

4. Immersion (2): individual motherboards are encased in a cartridge which contains the non-conductive fluid; these slot into a rack with an integrated cooling loop, and valves and other connections provide power and connectivity to the board.

The key thing here is that the liquid in 2, 3 and 4 above is hotter than the air that leaves the rear of the server, around 50℃, and is potentially more useful: it can be used for other processes, such as heating office areas, or transferred to adjacent buildings for heating swimming pools or greenhouses.


There is a current EU funded project that is looking into heat re-use from data centres (both air and liquid) called Catalyst, more information on this link

In the next article, I'm going to look at the relative costs of the ecosystems: air cooled, non-immersed liquid and immersed liquid.

Friday, 1 June 2018

Air v Liquid - Part 2 Data Centre Basics

So, continuing on from the previous post, we're going to look at the basics of a data centre, and for this I'm going to cover all types for all sorts of organisations.

As a user of digital services (as covered in the previous post) I don't really care about how my digital services get to me, as long as they do. But, just because I don't care doesn't mean that someone else doesn't.

Let's go back to basics. In order for me to access a digital service, someone needs to provide 3 things: first, a network, physical or wireless, that transmits my outbound and any inbound signals to a server somewhere; second, that destination server or servers to process and deal with my request, be it a financial transaction, a look at my bank balance or a view of the latest news updates; and third, some power so both the server and the network can run.

At the most basic level, I could use domestic power points in my garage to power my server and connect it to the internet via my wireless connection or broadband. Is that a data centre? Well, technically it isn't, although some organisations back in the early days did precisely that!

The official definition of a data centre (for me) is the one contained within the EU Code of Conduct for Data Centres (Energy Efficiency), aka the EUCOC guidance document, which can be downloaded from this link

It states "For the purposes of the Code of Conduct, the term “data centres” includes all buildings, facilities and rooms which contain enterprise servers, server communication equipment, cooling equipment and power equipment, and provide some form of data service (e.g. large scale mission critical facilities all the way down to small server rooms located in office buildings)".

Very clear, insofar as the building, facility or room must contain servers, server communication equipment (network), power equipment and lastly cooling equipment, and provide some sort of data service; and those buildings, facilities or rooms can range from large scale mission critical facilities all the way down to small server rooms located in office buildings.

Thus the EUCOC covers all types of communication cupboards, server rooms, machine rooms, mini data centres, medium data centres, telco switch locations, hyperscale data centres etc and those belonging to all types of organisations, in fact any organisation that operates in this increasingly digital world. The only thing I didn't discuss was the cooling element.

Cooling is actually a bit of a misnomer. What we're actually doing when we "cool" is rejecting heat, that is carrying the heat away from the servers, cooling the air down and reinjecting that cooled air back into the loop. There is plenty of material available online if you really want to know about the airflow cycle BUT...

And, it is a BIG BUT...

What are we actually rejecting the heat from?
 

In essence, servers generate a lot of waste heat, and if we don't manage the air flow we can get thermal problems. But where is this heat coming from?

In a server, there are 2 heat sources: the first is the processor itself (more on that later) and the second is the power supply unit, the device that converts 220-240V AC power into the low DC voltages found on the motherboard and other components.

Server chipsets (the processors) undergo Computational Fluid Dynamics (CFD) modelling to channel the heat away from the core to the heatsinks; the surface of a chip can reach temperatures in excess of 140°C, but by the time the heat gets to the end of the heatsink fins it can be around 50-60°C.

Most servers themselves also undergo CFD modelling to determine the air flow across the heat-producing components listed above within the chassis, so the air flow is optimised as it enters through the front of the device, passes over the processor and power supply, and is exhausted through the rear of the unit. The fans in the device actually pull the air through the unit, assisted by any positive pressure provided by the AC units.

Most servers are actually designed to operate quite happily in ambient air, and as such can operate between 5°C and 40°C; thermal monitoring controls the fan speed. It is only when we cluster a great many servers together, for instance in a rack, that heat problems can appear.
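To illustrate how that thermal monitoring typically behaves, here is a minimal linear fan ramp. Real server BMCs use vendor-specific curves and PID loops; the temperature and duty-cycle thresholds below are purely illustrative assumptions.

```python
# Minimal sketch of temperature-driven fan control: a linear ramp
# between an idle duty cycle and flat out.  Thresholds are
# illustrative assumptions, not any vendor's actual fan curve.

MIN_TEMP, MAX_TEMP = 25.0, 80.0   # degC: idle and maximum sensor temps
MIN_DUTY, MAX_DUTY = 20.0, 100.0  # fan duty cycle, percent

def fan_duty(temp_c):
    """Map a sensor temperature onto a fan duty cycle (percent)."""
    if temp_c <= MIN_TEMP:
        return MIN_DUTY
    if temp_c >= MAX_TEMP:
        return MAX_DUTY
    frac = (temp_c - MIN_TEMP) / (MAX_TEMP - MIN_TEMP)
    return MIN_DUTY + frac * (MAX_DUTY - MIN_DUTY)

print(fan_duty(25))   # 20.0 -- quiet at idle
print(fan_duty(90))   # 100.0 -- flat out past the limit
```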

When I teach the EU Code of Conduct for Data Centres (Energy Efficiency) I ask the students 3 questions, as follows:

1. What is the target temperature in your facility (server room etc) ?
2. What is the target humidity range in your facility?
3. Why these numbers specifically?  

The answers (with some exceptions) are:

1. 18-21℃
2. 50%RH +/-5%
3. Er, we don't know!

Well, those specific numbers relate to the use of paper tapes and cards back in the 50's and 60's, and later to old magnetic tapes that needed cool and dry environments back in the 80's and 90's. (I could do a whole blog post on this!)

The thing is that these temperature and humidity ranges belong in the 20th Century; IT equipment can run at higher temperatures today. So the question is: do we need cooling, or simply heat rejection?

Well, we'll cover that in the next blog post. 

Before we do though, we'll just go through the absolute basics for a server room and by definition the rest of the data centre ecosystem types as the only real difference is scale and risk profile/appetite.

Scale is, clearly, the amount of IT equipment you are using. For smaller organisations it may not be too big an IT estate, whilst for the hyperscale search engines or social media platforms it may number tens of thousands, if not millions, of servers, storage units and network switches. This amount of equipment will create a great deal of heat which needs to be managed.

The risk profile is essentially your own appetite for the risk of the IT going down. If it is absolutely mission critical that your IT systems stay up all the time, or as we say in the sector 24 hours a day, 7 days a week, 365 days a year (24/7/365), then you will require some duplicate systems to deal with any failure, and not just in the IT but in your power, network and cooling solutions as well; this will add cost, complexity and an increased maintenance regime to your calculations. There are certain classifications that can be applied, such as the EN50600 (ISO22237) Classes and the Uptime Institute Tiers, as well as others. I don't want to go into much detail on these at the present time, but will cover them later in this series.

Let's take a quick look at the minimums:

Power, we'll need electricity to power the servers, networking equipment and storage solutions.
The power train is, at its most basic, a standard 13 Amp socket on the wall, but if you have more than one server and you're using a rack it may be prudent to consider other options.

Space: computer racks come in a number of different sizes and configurations. Most prevalent today are 800mm x 800mm footprints, approximately 1.8m high for a 42U rack, and you'll need to access the front and rear of the rack with the doors open, so allow at least 1m around the rack for access. You could just use a table, and I've seen that in a number of locations!

Technically, that's it, but it may be prudent to allow the hot air to escape the room, and a standard extract ventilator fan can do the job (bear in mind though that outside air can also come in through this fan unit, so some filtration equipment may be useful).

If you want cooling, then you may want a false floor to allow the cool air to surround your rack, and some tiles to direct it to where it needs to be (please refer to EUCOC section 5, cooling, for the myriad of best practices about air flow direction and the containment of that air to the right place). You can get away with not having any air flow management, as indeed most did until very recently (circa 2008) and some still do today, but this comes at the risk of hotspots, thermal overload failures and increased energy bills.

You'll also need some sort of cooling unit (if so, you don't really need the extract fan). The most basic of these is a standard domestic DX cooler: this unit provides cold air (it should have a control to specify the exact temperature) and rejects the heat via pipework and an external unit to the outside. Again, there are a multitude of cooling options available on the market today, some even optimised for data centres!

In the next post, I'll be looking at the cooling v heat rejection.




Thursday, 31 May 2018

Air v Liquid - Part 1 The direction of travel!

This is going to be the first in a series of articles on liquid cooling in the data centre environment, with a view to sparking some debate.
The use of liquid cooled technologies in the data centre has been akin to the debate about nuclear fusion, insofar as it's always 5-10 years from adoption.
We've been tracking the use of liquid cooled technologies for some time now and we are party to some developments that have been taking place recently, and it is my belief that we are going to see some serious disruption taking place in the sector sooner rather than later.

We'll be taking a unique look at how to implement liquid cooled solutions in the next few posts, but before that we need to understand the direction of travel outside of the DC space.

Customers are increasingly using cloud services, and it appears that organisations buying physical equipment are only doing so because "that's the way we've always done IT", or because they are using specialist applications that cannot be provided via cloud services, or because they are wedded to some out of date procurement process. This is supported by the fact that conventional off the shelf server sales are in decline, whilst specialist cloud enabled servers are on the up, being purchased by cloud vendors and the hyperscalers (albeit they are designing and building their own servers, which by and large are not suitable for conventional on premise deployments). There is also the Open Compute Project, which is mostly being used for pilots via development teams.
 
So, with cloud services, the customer, in fact any customer, has no say in the type of physical hardware deployed to provide that SaaS, PaaS or IaaS solution, and that is the right approach. IT is increasingly becoming a utility, just like electricity, water and gas. As a customer of these utilities I have no desire to know how my electricity reaches my house; all I want is that when I flick the light switch or turn on the TV, the light comes on and I can watch "Game of Thrones". I certainly do not care which power station generated the electricity, or how many pylons, transformers and cables the "juice" passed through to get to my plug/switch.

For digital services, I access a "service" on my smartphone, tablet, desktop etc and, via a broadband or wifi connection, access the "internet", then route through various buildings containing network transmission equipment to the physical server(s) that the "service" resides upon, which can be located anywhere on the planet. Everything apart from my own access device is not my asset; I merely "pay" for the use of it. And I don't pay directly: I pay my ISP (Internet Service Provider) a monthly fee for access, and the supplier directly for the service that I am accessing, and somehow they pay everybody else.

So, in essence, digital services are (to me) an app I select on my digital device; information is then sent and received over digital infrastructure, either fixed, such as a broadband connection, or via a wifi network. I have no knowledge of, nor do I care, how the information is sent or received, merely that it is.

What does this have to do with the use of air or liquid in a data centre?

Well, it's actually quite important, but before we cover that, we'll be covering the basics of data centres, and we'll be doing that in the next post.

Sunday, 25 March 2018

A very late 2017 Annual Review

2017 was an interesting year. We managed to keep going, which is always a result, as most SMEs fail within their first 18 months in business. I'll make no bones about it: 2017 was very tough.

The EURECA project continued all through 2017, although it should have finished in September. The reason for the 6 month extension was to include 2 late pilot projects: the first was a UK government department's IT consolidation project, and the second an EU Member State government ICT consolidation project which is ongoing at the time of writing.
Overall, the project has exceeded its intended target by almost triple in terms of GWh/y of energy saved.
The information relating to the project can be found on our website here

Many of you will remember the National Data Centre Academy (NDCA) project, which continues albeit after a radical rethink, we hope to have more information on that shortly.
We spent a considerable amount of time and treasure on 2 events and received some great feedback, we need to find some funding and continue discussions with various parties but prospects are good. We will report back on this, hopefully before the end of Q2 with an update.

SFL continues to be a challenge. We are now part of an EU wide initiative, Green IT Global, which brings together 4 EU organisations, namely Green IT Amsterdam, Swiss Informatics Group (Green IT SG), AGIT and ourselves. We have signed MOUs and hold regular meetings, and you should begin to see more activity over the coming months.

Travel wise, 2017 was a very busy year. We attended events in Stockholm, Barcelona and Dublin for EURECA, and Stockholm, São Paulo, Dublin, Gibraltar and Frankfurt for CEEDA assessments.
We also travelled to Monaco on behalf of the EU-JRC, to Milan for the annual EUCOC meeting and to Seville for a GPP meeting.
We also attended events in Manchester (DC North/Data Centre Transformation) and London (DCD/DCD Awards), as well as some local and regional events.

So, what are our plans for 2018? EURECA formally closed at the end of February, so we've been looking for new Horizon2020/FP9 projects and Innovate UK projects to get our teeth into, and continuing our commercial propositions. To that end, we have signed NDAs with one large global corporation, to provide white label energy efficiency and consultancy services to their UK/Ireland operations (which we hope to extend across the EU), and with one smaller organisation, to use their software products in our overall service wrap.

Over the next few weeks, we hope to wrap up CEEDAs in Brazil and Saudi Arabia for Q2, but have one external Data Centre Assessment for a university to complete next week.

We're looking at a project in North Africa with a view to providing energy efficiency and environmental support services.

Our work with the Data Centre Alliance continues with the development of "validation" services and we're very close to launching this product line. We'll keep you informed via our usual social media outlets and in conjunction with them.

That's about it. We hope to provide more regular updates in the future, but keep a lookout for articles published in Inside Networks and Data Centre Solutions magazine, and we'll let you know via Twitter about any other published articles.

Until next time

Wednesday, 13 December 2017

December 2017 Update

Just a brief update today.

There is a great amount of confusion about the energy consumption of ICT globally with estimates ranging from 2.5% (on a par with the aviation industry) to 10%. The aviation industry comparison is quite old now but still used by some organisations to highlight the potential environmental impact. The higher figure of 10% was published in 2013 in this report

The thing is this: the growth of ICT services (Cloud, IoT) MITIGATES energy consumption elsewhere. The move from on-site ICT (where the management may be suspect and the legacy kit quite inefficient) to hyperscale cloud data centres (AWS/Google/Salesforce) and colocation sites, where the management and designs are efficient, is a good thing. The use of IoT for monitoring and better management can reduce energy consumption used for maintenance and support services, changing from regular servicing to just in time servicing, which reduces overall costs.

The GeSI Smarter2030 report identified the opportunities that can arise from the use of ICT systems; this guidance can be found here

However, we consultants and those involved in policy are somewhat behind the curve and don't always know everything (although we would have you believe that we are on the bleeding edge). A case in point is quantifying the actual and projected energy use in this sector; we are always behind the curve on this, as we do not have current and accurate data to work with.
The two figures above are estimates, and unfortunately with ICT it is always a moving target because the ICT ecosystem is very dynamic: new ICT systems are installed every day and not much is ever removed, hence the drive to eliminate "zombie" servers, reputed to account for at least 30% of installed IT in the US. These zombie servers consume vast quantities of power and network capacity and do nothing, except cost money.

I had two articles published in the trade press yesterday. The first was this article for Inside Networks, which can be found here
This was my view on the most significant events for data centres in the UK this year, and I presented the 2nd target findings report from the Climate Change Agreement for Data Centres, where those in the scheme posted an average PUE of 1.8. 1.8! Think about that for a moment: essentially it means that for every 1W of IT power, we are using 0.8W to support it, mostly cooling, but also lights, management, security and building management etc. To put this into perspective, the most advanced data centre designs will now operate at sub 1.2, and the real cutting edge data centres operate at sub 1.1. This is simply not acceptable; operators are burning money and passing that cost onto consumers.
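For anyone who wants to play with the arithmetic: PUE is simply total facility energy divided by IT energy, so the supporting overhead per IT watt falls straight out. The 1.8 and sub-1.1 figures below are the ones quoted above.

```python
# PUE = total facility energy / IT energy, so the supporting load
# (cooling, lights, security, BMS etc.) per watt of IT is PUE - 1.

def overhead_per_it_watt(pue):
    """Watts of supporting infrastructure per watt of IT load."""
    return pue - 1.0

print(overhead_per_it_watt(1.8))            # 0.8 W overhead per IT watt
print(round(overhead_per_it_watt(1.1), 2))  # 0.1 W at the cutting edge
```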

The second article was published in the Data Centre Solutions Magazine and can be found here 
In this article I made an estimate of how much energy is being used by UK PLC: essentially all the data centres, server rooms, mobile phone towers, and railway/motorway traffic control systems that create, process or transmit data for use in and by government, academia and the commercial sector.

My first attempt was very alarming; the figures revealed were quite staggering. The energy consumption was in the TWhs and the cost anywhere between £7-9 billion, time to sharpen the pencil! My estimates were based on the number of businesses that employ over 50 people, which in the UK is 40,000, and the assumptions that an organisation employing over 50 people would have an IT estate/server room with approximately 50 devices (servers, storage and network) and be operating at a PUE of 2. We add to that total government, including blue light services, schools, universities etc, to reach 80,000 server rooms, each with an average electricity cost of just under £60,000 a year, and, depending on tariff, costing UK PLC some £4-7 billion and using just over 38.5TWh, or about 12.5% of UK generated capacity (2016).
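The estimate can be reconstructed as a back-of-envelope calculation. The 80,000 rooms, 50 devices per room and PUE of 2 are from the article; the 550W average draw per device and the 12p/kWh tariff are my own assumptions, chosen to show how the totals can be reached, not measured figures.

```python
# Back-of-envelope reconstruction of the UK PLC energy estimate.
# Rooms, devices-per-room and PUE are from the article; the per-device
# draw and tariff are assumed values for illustration only.

ROOMS = 80_000
DEVICES_PER_ROOM = 50
WATTS_PER_DEVICE = 550   # assumed average across servers/storage/network
PUE = 2.0
HOURS_PER_YEAR = 8_760
PRICE_PER_KWH = 0.12     # GBP, an assumed mid-range tariff

kw_per_room = DEVICES_PER_ROOM * WATTS_PER_DEVICE / 1000 * PUE  # 55 kW
kwh_per_room = kw_per_room * HOURS_PER_YEAR                     # ~482 MWh/yr
total_twh = ROOMS * kwh_per_room / 1e9
total_cost_bn = ROOMS * kwh_per_room * PRICE_PER_KWH / 1e9

print(f"{total_twh:.1f} TWh/yr")        # ~38.5 TWh
print(f"£{total_cost_bn:.1f} billion")  # within the £4-7bn band
```

At these assumptions each room costs about £57,800 a year in electricity, consistent with the "just under £60,000" figure above.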
Some commenters are wary of the figures, including one who said that turkey processors employ over 50 people and none of them use a computer (which is a fair comment), but all in all I think that my estimates are valid and probably a fair reflection of UK PLC ICT cost and consumption.
In summary, as I stated in the second article, we simply do not know the actual amount and probably never will; to find out we would have to undertake a deep study and survey, and make some assumptions.
If you fancy helping out, a) by assisting with the project, or b) funding the project, contact me directly via the website, social media feeds or email.
Over the Christmas break I'll be writing our 2017 Annual Review so until then, enjoy the holiday preparations and mull over the size of the energy efficiency opportunity.

Until next time......