Next week I'll be in Russia, where I'm speaking about HTTP APIs and REST at the DotNext conference in Saint Petersburg. As part of this event, I've done an interview with the Russian tech site Хабрахабр about the history and future of API development on the web and in Microsoft.NET. The interview's available on their site habrahabr.ru (in Russian), but for readers who are interested but can't read Russian, here's the original English version.
Q: What kind of APIs are you designing? Where does API design fit into software development?
That’s kind of an interesting question, because I think one of the biggest misconceptions in software is that designing APIs is an activity that happens separately to everything else. Sure, there are certain kinds of API projects – particularly things like HTTP APIs which are going to be open to the public – where it might make sense to consider API design as a specific piece of work. But the truth is that most developers are actually creating APIs all the time – they just don’t realise they’re doing it. Every time you write a public method on one of your classes or choose a name for a database table, you’re creating an interface – in the everyday English sense of the word – that will end up being used by other developers at some point in the future. Other people on your team will use your classes and methods. Other teams will use your data schema or your message format.
What’s interesting, though, is that once developers realize that the code they’re working on will probably form part of an API, they tend to go too far in the other direction. They’ll implement edge cases and things that they don’t actually need, just in case somebody else might need it later. I think there’s a very fine balance, and I think the key to that balance is to be very strict about only building the features that you need right now, but to make those things as reusable and self-explanatory as you can. There’s a great essay by Pieter Hintjens, Ten Rules for Good API Design, that goes into more detail about these kinds of ideas.
The biggest API project I’m working on at the moment is a thing I’m building at Spotlight in the UK, where I work. It’s a hypermedia API exposing information about professional actors, acting jobs in film and television, and various other kinds of data used in the casting industry. We’re building it in the architectural style known as REST – and if you’re not sure what REST is, you should come to my talk at DotNext in Saint-Petersburg and learn all about it. There’s lots of different patterns for creating HTTP APIs – there’s REST, there’s GraphQL, there’s things like SOAP and RPC – but for me, the biggest appeal of REST is that I think the constraints of the RESTful style lead to a natural decoupling of the concepts and operations that your API needs to support, which makes it easier to change things and evolve the API design over time.
Q: One of the most famous applications that was "killed" by backward compatibility is IE. The problem of this browser was that it has too large number of applications for which it was required to have backward compatibility. Problem was solved by adding new application Edge, which is updatable and supports all new standards. Can you give an piece of advice on how not to get caught by that backward-compatibility trap? As an example could it be a modularity which doesn't have layers? May be there is a way to replace API with RESTful API, Service Oriented Architecture or something else?
I’ve been building web applications for a long, long time – I wrote my first HTML page a couple of years before Internet Explorer even existed, back when the only browsers were NCSA Mosaic and Erwise. It’s fascinating to look back at the history of the web, and how the web that exists today has been shaped and influenced by things like the evolution of Internet Explorer – and you’re absolutely right; one of the reasons why Microsoft has introduced a completely new browser, Edge, in the latest versions of Windows is that Internet Explorer’s commitment to backwards-compatibility has made it really difficult to implement support for modern web standards alongside the existing IE codebase.
Part of the reason why that backwards compatibility exists is that, around the year 2000, there was a massive shift in the way that corporate IT systems were developed. There are countless corporations who have bespoke applications for doing all sorts of business operations – stock control, inventory, HR, project management, all kinds of things. Way back in the 1980s and early 1990s, most of them used a central mainframe system and employees would have to use something like a terminal emulator to connect to that central server, but after the first wave of the dotcom boom hit in the late 1990s, companies realised that most of their PCs now had a web browser and an network connection, and so they could replace their old mainframe terminal applications with web applications. Windows had enormous market share at the time, and Internet Explorer was the default browser on most Windows PCs, so lots of organizations built intranet web applications that only had to work on a specific version of Internet Explorer. Sometimes they did this to take advantage of specific features, like ActiveX support; more often I think they just did it save money because it meant they didn’t have to do cross-browser testing. This happened with some pretty big commercial applications as well; as late as 2011, Microsoft Dynamics CRM still offered no support for any browser other than Internet Explorer.
So you’ve got all these companies who have invested lots of time and money in building applications that only work with Internet Explorer. Those applications aren’t built using web standards or progressive enhancement or with any notion of ‘forward compatibility’ – they’re explicitly targeting one version of one browser running on one operating system. And so when Microsoft releases a new version of Internet Explorer, those applications fail – and the companies don’t want to invest in upgrading their legacy intranet applications, so they blame the browser. So we end up with this weird situation where here in 2017, Microsoft are still shipping Internet Explorer 11, which has a compatibility mode where it switches back to the IE9 rendering engine but sends a user agent string claiming that it's IE7. Meanwhile, everyone I know uses Google Chrome or Safari for all their web browsing - but still has an IE shortcut on their desktop for when they have to log in to one of those legacy systems. .
So… to go back to the original question: is there anything Microsoft could have done to avoid this trap? I think there’s a lot of things they could have done. Building IE from the ground up with a modular rendering engine, so that later versions could selectively load the appropriate engine for rendering a particular website or application. They could have made more effort to embrace the web standards that existed at the time, instead of implementing ad-hoc support for things like the MARQUEE tag and ActiveX plugins, which would have avoided the headache of having to support these esoteric features in later versions. The point is, though, none of this mattered at the time. Their focus – the driving force behind the early versions of Internet Explorer – was not to create a great application with first-class support for web standards. They were trying to kill Netscape Navigator and win market share – and it worked.
Q: Let’s imagine someone is going to introduce an API. So they collect some requirements, propose a version and gets feedback. That’s rather simple and straightforward thing. But if there are any hidden obstacles there down the road?
Always! Requirements are going to change – in fact, one of the biggest mistakes you can make is to try and anticipate those changes and make your design ‘future-proof’. Sometimes that pays off, but what mostly happens is that you end up with a much more complicated design purely because you’re trying to anticipate those future changes. Those obstacles are often things outside your control. There’s a change to the law that means you need to expose certain data in a different way. There’s a change to one of the other systems in your organization, or one of your cloud hosting providers announces that they’re deprecating a particular feature that you were relying on.
The best thing to do is to identify something simple and usable, ship it, and get as quickly as you can to a point where your API is stable, there’s no outstanding technical debt, and your team is free to move on to the next thing. That way when you do encounter one of those ‘hidden obstacles’, you have a stable codebase to use as a basis for your solution, and you have a team who have the time and the bandwidth to deal with it. And if by some stroke of luck you don’t hit any hidden obstacles, then you just move on to the next thing on your backlog.
Q: Continue with API design. We’ve released the v1.0 of our API and now v1.1 is approaching. I believe many of us noticed http://example.com/v1/test and http://example.com/v1.1/test or something. What are the best practices (a couple of points) you can think of that can help a developer to design a good v1.1 API in respect to v1.0?
It’s worth reading up on the concept of semantic versioning, (SemVer) and taking the time to really understand the distinction between major, minor and patch versions. SemVer says that you shouldn’t introduce any breaking changes between version x.0 and version x.1, so the most important thing is to understand what would constitute a breaking change for your particular API.
If you’re working with HTTP APIs that return JSON, for example, a typical non-breaking change would be adding a new data field to one of your resources. Clients that are using version 1.1 and are expecting to see this additional field can take advantage of it, whereas clients that are still using version 1.0 can just discard the unrecognised property.
There’s a related question about how you should manage versioning in your APIs. One very common solution is to expose URLs via routing – api.example.com/v1/ as opposed to api.example.com/v1.1 – but if you’re adhering to the constraints of a RESTful system, you really need to understand whether the change in version represents a change in the underlying resource or just the representation. Remember that a URI is a Uniform Resource Identifier, and so we really shouldn’t be changing the URI that we’re using to refer to the same resource.
For example – if we have a resource api.example.com/images/monalisa. We could request that resource as a JPEG (Accept: image/jpeg), or as a PNG (Accept: image/png), or ask the server if it has a plain-text representation of the resource (Accept: text/plain) – but they’re just different representations of the same underlying resource, and so they should all use the same URI.
If – say – you’ve completely replaced the CRM system used by your organization, and so “version 1” of a customer represents a record used in the old CRM system and “version 2” represents that same customer after they’ve been migrated onto a completely new platform, then it probably makes sense to treat them as separate resources and give them different URIs.
Versioning is hard, though. The easiest thing to do is never change anything
Q: .NET Core - what do you think about its API?
When .NET Core was first announced in 2015, back when it was going to be be called .NET Core 5.0, it was going to be a really stripped-down, lightweight alternative to the .NET Framework and the Common Language Runtime. That was an excellent idea in terms of making it easier to port .NET Core to different platforms, but it also left a sizable gap between the API exposed by .NET Core, and the ‘standard’ .NET/CLR API that most applications are built against.
I believe – and this is just my interpretation based on what I’ve read and people I’ve talked to – that the idea was that .NET Core would provide the fundamental building blocks. It would provide things like threading, filesystem access, network access, and then a combination of platform vendors and the open source community would develop the modules and packages that would eventually match the level of functionality offered by something like the Java Class Library or the .NET Framework. That’s a great idea in principle, but it also creates a chicken-and-egg situation: people won’t build libraries for a platform with no users, but nobody wants to use a platform that doesn’t have any libraries.
So, the decision was made that cross-platform .NET needed a standard API specification that would provide the libraries that users and application developers expected to be available on the various supported platforms. This is .NET Standard 2.0, which is already fully supported by the .NET Framework 4.6.1 and will be supported in the next versions of .NET Core and Xamarin. Of course, .NET Core 1.1 is out, and works just fine, and you can use it right now to build web apps in C# regardless of whether you’re running Windows or Linux or macOS, which is pretty awesome – but I think the next release of .NET Core is going to be the trigger for a lot of framework and package developers to migrate their projects across to .NET Core, which in turn should make it easier for developers and organizations to migrate their own applications.
API flexibility VS. API precision. One can design a method API so it can accept many different types of values. It’s flexibility. We also can design a method API with lots of rules on input parameters. Both ways are correct. Where is the boundary across these approaches? When should I make a “strict” API and when should I make a more “flexible” design? Don’t forget that you should take backward-compatibility into account.
By implementing an API where the method signatures are flexible, all you’re doing is pushing the complexity to somewhere else in your stack. Say we’re building an API for finding skiing holidays, and we have a choice between
DoSearch(string resortName, string countryCode, int minAltitude, int maxDistanceToSkiList)
One of those methods is pretty easily extensible, because we can extend the definition of the SearchCriteria object without changing the method signature – but we’re still changing the behaviour of the system, we’re just not changing that particular method. By contrast, we could add new arguments to our DoSearch method signature, but if we’re working in a language like C# where you can provide default argument values, you won’t break anything by doing that as long as you provide sensible defaults for the new arguments.
At some point, though, you need to communicate to the API consumers what search parameters are accepted by your API, and there’s lots of ways to accomplish that. If you’re building a .NET API that’s installed as a NuGet package and used from within code, then using XML comments on your methods and properties is a great way to explain to your users what they need to specify when making calls to your API. If your API is an HTTP service, look at using hypermedia and formats like SIREN to define what parameter values and ranges are acceptable.
I should add that I think within the next decade, we’re going to start seeing a whole different category of APIs powered by machine learning systems, where a lot of the conventional rules of API design won’t apply. It wouldn’t surprise me if we got an API for finding skiing holidays where you just specify what you want in natural language, and so there’s not even a method signature – you just call something like DoSearch(“ski chalet, in France or Italy, 1400m or higher, that sleeps 12 people in 8 bedrooms, available from 18-25 January 2018”) – and the underlying system will work it all out for us. Those sorts of development in machine learning are hugely exciting, but they’re also going to create a lot of interesting challenges for the developers and designers trying to incorporate them into our products and applications.
Thanks to Alexej Sommer for taking the time to set this up (and for translating my answers into Russian – Спасибо!), and if you're at DotNext next week and want to chat about APIs, hypermedia or any of the stuff in the interview, please come and say hi!