Microsoft Exchange engineering and cloud-scale

The Exchange team (or at least Perry Clarke, its fearless leader) has been known to describe Exchange Online as “the gateway drug to the cloud.” But how did that come to pass?

This week at Ignite, I was lucky enough to have dinner with some folks from the Exchange product team and a very, very large customer where we discussed the various ways in which Exchange engineering has blazed a trail the rest of Microsoft’s server products have eventually followed. After a bracing Twitter discussion this afternoon with @swiftonsecurity and some of her other followers, I thought it would be fun to put together a partial list of some of the things we discussed to illustrate how the Exchange team has built a stairway to heaven, or an elevator to the cloud, or something like that.

Let’s start with PowerShell. Love it or hate it, it is here, so we all have to deal with it. In 2007, the idea that Exchange would be built on PS was both revolutionary and, to many, revolting, but it allowed Microsoft to do several important things (not all of which shipped in Exchange 2007, but all of which are critical to cloud operations):

  • Greatly improve testability, both for the developers themselves but also for administrators, who now got a suite of protocol and endpoint-related tests they could run as part of troubleshooting– critically important when you have to troubleshoot in a global network of data centers hosting tens of millions of mailboxes
  • Fully enable role-based access control, also critical for cloud deployments where customers want to control who can do what with their data
  • Finally decouple the presentation layer of the UI (EMC, EAC, etc) from business logic
  • Massively improve the tools for scripting, including enabling very large-scale bulk operations– an obvious requirement for a cloud-scale service

Requiring PowerShell was a bold move by the Exchange team but one which has both paid off hugely and one that’s been echoed by the Windows, SharePoint, SQL Server and Skype teams, all of whom depend on it for managing their own cloud services. (See also: the Microsoft Graph APIs.)

Then there’s storage performance. In ancient days, getting scale from Exchange pretty much required the use of SANs due to Exchange’s IO requirements. Now, thanks to the IOPS diet imposed by Exchange engineering, it doesn’t. Tony does his usual excellent job of summarizing the actual reductions. Summary: Exchange 2016 requires roughly 96% fewer IOPS than Exchange 2003 did. There have been a ton of storage performance improvements in Exchange’s sister products (notably SQL) but those have their own stories that I’m not competent to tell. The relentless drive to cut IOPS requirements was one of the biggest enablers for Exchange Online, since controlling storage provisioning costs is critical for any type of scaled cloud service.

Of course, data protection is critical too. Exchange moved from having a single monolithic database to one with separate property and MIME databases (Exchange 2000) then to having software-based database replication with clustering (Exchange 2007) to shared-nothing, fully-replicated active/passive database replication (Exchange 2010 and later). Keeping multiple separate database copies (including lagged copies) enables all sorts of DR and HA scenarios that previously had required SANs. The ability to reliably use cheap JBOD disks, which thanks to Moore’s Law have embiggened nicely during Exchange’s lifetime, has been a key enabler for Exchange Online.

Then there’s a bunch of other architectural changes and improvements that are really only interesting to Exchange nerds. For the latest example, I present “read from passive,” but there’s also all the stuff covered by the Preferred Architecture.

Oh, I almost forgot: managed availability gives ExO a fair degree of self-healing, although its behavior sometimes surprises on-prem admins who see it do things on their behalf unexpectedly.

Oh, and let’s not forget the conversion of all the Exchange codebase to managed code– that was an important accelerator for the move to the cloud, as well as serving as a lighthouse for other product groups with code of similar vintage.

There are more examples, I’m sure, but these should get the point across– there’s been a steady stream of architectural changes in the nearly 20 years since Exchange 4.0 shipped that have led directly to the capability, power, and reliability of Exchange Online– which really has been the gateway drug for getting Microsoft’s customers to Office 365.




Leave a comment

Filed under UC&C

Leave a Reply

Fill in your details below or click an icon to log in: Logo

You are commenting using your account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s