Monday, 14 November 2016

IdentityServer, OpenID Connect and Microsoft CRM Portals

As readers of this blog will know, here at Spotlight we’re in the process of moving nine decades’ worth of legacy business process onto Microsoft Dynamics CRM, aka CRM Online, which I gather is now called Dynamics 365 (because hey, it’s not like naming things was hard enough already, right?)

We’re also investigating a couple of options for building customer-facing systems that integrate with Dynamics. Until last year, there were really three options for this – a product called Adxstudio, a free Microsoft component called the CRM Portal Accelerator, or rolling your own solution using the CRM SDK. Around this time last year, Microsoft quietly retired the Portal Accelerator component and acquired Adxstudio, and since then, they’ve been in the process of assimilating it into the Dynamics product family – which has meant it’s been something of a moving target, both in terms of the supported features and in terms of the quality of documentation and examples.

I’ve previously blogged about one way to integrate Adxstudio with your existing authentication system, but that approach relied completely on running Adxstudio on-premise so you could run your own code as part of the request lifecycle – and as you may have noticed, there’s a bit of a trend in IT at the moment away from running your own servers and towards using hosted managed services, so that patching and backups are somebody else’s problem. Since Microsoft acquired Adxstudio, there’s been a lot of churn around what’s supported and what’s not – I’m guessing that behind the scenes they’re going through the Adxstudio codebase feature-by-feature and making sure it lines up with their plans for the Dynamics 365 platform, but that’s just guesswork on my part.

One of the main integration points I’ve been waiting for is the ability for a Microsoft-hosted Portal solution to use a third-party OpenID Connect endpoint to authenticate users, and it appears in the latest update this is finally supported – albeit with a couple of bumps along the way. Here’s what I’ve had to do to get a proof-of-concept up and running.

Setting up Dynamics CRM Portals

First, you’ll need to set up a Dynamics Portal trial. You can get a 30-day hosted trial of Dynamics CRM Online by signing up here – this actually gives you a full Office 365 organization including things like hosted Active Directory, as well as the Dynamics CRM Online instance we’re using in this example. Next, you’ll need to ask nicely for a trial of the portal add-on – which you can do by filling out the form at crmmanagedtrials.dynamics.com.

Setting up IdentityServer and configuring an ngrok tunnel

Whilst you’re waiting for the nice Microsoft people to send you your trial license, get up and running with IdentityServer. For this prototype, I’m using the MVC Authentication example from the IdentityServer3.Samples project – clone it to your workstation, open the MVC Authentication solution, hit F5, verify you can get up and running on localhost.

Next – in order for Dynamics CRM Online to talk to your IdentityServer instance, you’ll need to make your IdentityServer endpoints visible to the internet. You could do this by deploying your IdentityServer sample to Azure or AWS, but for experiments like this, I like to use a tool called ngrok, which will create temporary, secure tunnels from the internet to your workstation. Download ngrok, unzip it somewhere sensible.

Pick a tunnel name. I’m using authdemo in this example but any valid DNS host name will do. Next, create a local IIS application pointing to the EmbeddedMvc folder in your samples directory, and set the host name to <your tunnel name>.ngrok.io

image

Now run ngrok.exe to create a tunnel from the internet to your new IIS application:

C:\tools\ngrok> ngrok.exe http –subdomain=authdemo 80

ngrok by @inconshreveable

Session Status        online
Version               2.1.18
Region                United States (us)
Web Interface         http://127.0.0.1:4040
Forwarding            http://authdemo.ngrok.io -> localhost:80
Forwarding            https://authdemo.ngrok.io -> localhost:80

Connections           ttl     opn     rt1     rt5     p50     p90
                      0       0       0.00    0.00    0.00    0.00

All being

If that’s worked, you should be able to fire up a browser, go to http://authdemo.ngrok.io/ – replacing ‘authdemo’ with your own tunnel name - and see the IdentityServer3 sample landing page:

image

Configuring IdentityServer

Right. Next thing we need to do is to make a couple of changes to the IdentityServer configuration, so that it’ll run happily on authdemo.ngrok.io instead of on localhost

First, enable logging. Just do it. Use the package manager console to install the Serilog.Sinks.Trace package. Then add this to the top of your Configuration() method inside Startup:

Log.Logger = new LoggerConfiguration()
                .MinimumLevel.Debug()
                .WriteTo.Trace()
                .CreateLogger();

and add this to your web.config, specifying a path that’s writable by the application pool:

<system.diagnostics>
  <trace autoflush="true"
         indentsize="4">
    <listeners>
      <add name="myListener"
           type="System.Diagnostics.TextWriterTraceListener"
           initializeData="C:\logfiles\identityserver.log" />
      <remove name="Default" />
    </listeners>
  </trace>
</system.diagnostics>

Next, do a global search and replace, replacing any occurrence of localhost:44319 with authdemo.ngrok.io – again, substituting your own tunnel name as required.

Next, add a new client to the static EmbeddedMvc.IdentityServer.Clients class the IdentityServer sample project – changing the highlighted values to your own client ID, client secret, and portal instance URL:


new Client {
    ClientName = "Dynamics CRM Online",
    ClientId = "crm",
    Flow = Flows.Hybrid,
    ClientSecrets = new List() { new Secret("secret01".Sha256()) },
    RedirectUris = new List { 
"https://my-portal-instance.microsoftcrmportals.com"
}, PostLogoutRedirectUris = new List {
"https://my-portal-instance.microsoftcrmportals.com"
}, AllowedScopes = new List { "openid" } },

 

Adding IdentityServer as an endpoint in CRM Portals

Finally, you need to add your new IdentityServer as an identity provider. CRM Portals uses the Dynamics CRM platform for all its configuration and data storage, so to add new settings you’ll need to log into your Dynamics CRM Online instance, go into Portals > Site Settings, and add the following values:

Name

Value

Website

Authentication/OpenIdConnect/AuthDemo/Authority

http://authdemo.ngrok.io/identity/

Customer Self-Service

Authentication/OpenIdConnect/AuthDemo/Caption

IdentityServer OpenID Connect Demo

Customer Self-Service

Authentication/OpenIdConnect/AuthDemo/ClientId

crm

Customer Self-Service

Authentication/OpenIdConnect/AuthDemo/ClientSecret

secret01

Customer Self-Service

Authentication/OpenIdConnect/AuthDemo/MetadataAddress

http://authdemo.ngrok.io/identity/.well-known/openid-configuration

Customer Self-Service

Finally, it looks like you’ll need to restart the portal instance to get it to pick up the updated values – which you can do by logging into the Office 365 Admin Center, Admin Centers, CRM, Applications, Portal Add-On, clicking ‘MANAGE’, and pressing the nice big RESTART button on the Portal Actions page:

image

And – assuming everything lines up exactly right – you should now see an additional login button on your CRM Portals instance:

image

Clicking on it will bounce you across to your ngrok-tunnelled IdentityServer MVC app running on localhost:

imageLog in as bob / secret, and you’ll get the OpenID permissions check:

image

…and when you hit ‘Yes, Allow’, you’ll be redirected back to the CRM Portals instance, which will create a new CRM Contact linked to your OpenID Connect identity, and log you in to the portal.

Conclusions

Of course, in the real world there’s a lot more to it than this – there is a huge difference between a proof of concept like this and a production system. These sorts of user journeys form such a key part of delivering great user experience, and integrating multiple systems into your login and authentication/authorization journeys only makes this harder. But it did work, and it wasn’t actually all that complicated to get it up and running. It’s also interesting to see how something like OpenID Connect can be used to integrate a powerful open-source solution like IdentityServer with a heavyweight hosted platform service like CRM Portals.

Whether we end up adopting a hosted solution like CRM Portals – as opposed to just building our own apps that connect to CRM via the SDK or the new OData API – remains to be seen, but it’s nice to see solutions from two radically different sources playing nicely together thanks to the joy of open protocols like OpenID Connect. Long may it continue.

Monday, 31 October 2016

The Laws of Distributed Systems

I’ve spent a lot of time over the last year reading, thinking, and speaking at conferences about distributed systems, organisational structures, and the eponymous laws of software development. Over the course of many conversations and countless blog posts and articles, something has crystallised from thinking about three laws in particular, which – if it’s right - could have substantial implications for all of us as software developers, and for the people who use the systems we build.

TL;DR: if we keep having meetings, the internet will stop working.

(There – that got your attention, didn’t it?)

Moore’s Law

So, let’s recap. Gordon Moore was the co-founder of Intel and Fairchild Semiconductor. Back in 1965, Moore wrote a paper predicting that the number of components per integrated circuit would double every year. In 1975, he revised his forecast to doubling every two years. His predictions have proved accurate for several decades, and will probably continue to do so until the 2020s, but what’s really interesting is what’s happened since 2000.

Here’s the average total transistor count of CPUs against their year of introduction since 1965. Plotted on a a logarithmic axis, it’s pretty close to a straight line.

image_thumb1
[Data source: Wikipedia] 

Now here’s the same graph, but showing transistors per core.

image_thumb3
[Data source: Wikipedia]

See how around 2005 the two series suddenly diverge sharply? That’s because in the early 2000s, we began hitting the physical limits of how many transistors could be integrated into a single CPU. Somewhere around the 4Ghz mark, we hit a wall in terms of raw clock speed, and so the semiconductor industry hit upon the bright idea of multicore CPUs – basically putting more than one CPU into the same physical package.

In the same time frame, we’ve seen an industry-wide shift away from monolithic powerhouse servers towards distributed systems. Modern web apps – which are really just big multiuser systems – run across clusters of dozens or hundreds of ephemeral worker nodes; a radical contrast to the timesharing mainframe systems of the 1970s and 1980s.

The amount of computing power available at a particular price point is still increasing exponentially, but we’re no longer scaling up, we’re scaling out. And the reason we’re scaling is that the load on our systems – our websites, APIs and servers – is also increasing. More people are getting online, people are using more devices and connected services, and those devices are delivering increasingly rich user experiences – which means more data, more power and more bandwidth. Here’s Business Insider’s analysis and forecast of the number of connected devices from 2015 to 2020:

unnamed

To cope with this ever-increasing level of expectation, we need to build systems that will scale out to cope with demand. Our code needs to parallelize. We need to decompose our problems into small, autonomous units of work that can be distributed across as many cores or nodes as we have available, and combine the outputs of those operations to deliver the results our users are expecting.

Amdahl’s Law

This brings us to Amdahl’s Law. Gene Amdahl started out designing mainframe systems for IBM. He was the chief architect of the IBM System/360, and he first presented his eponymous law back in 1967. Amdahl’s Law controls the theoretical performance improvements we can expect by parallelizing some given workload.

Amdahl’s Law is actually Slatency(s) = 1/((1-p) + (p/s)) – but the gist of it is nicely explained by thinking about Christmas dinner. Or Thanksgiving, if that’s your thing. If you’ve got one person with one cooker working alone, it’ll take a good 20 hours to prepare all the trimmings for a Christmas dinner. By adding more people and more cookers, you can parallelise this and so complete it faster – but you reach a point where everything is done and everyone’s stood around waiting for Jeff to finish roasting the turkey, because you can’t roast a turkey in under four hours, no matter how many chefs and ovens you’ve got. 

If you need to parallelize like this, eliminate the turkey. Have steaks instead – because you can cook steaks in parallel. If you suddenly get another 20 guests showing up half an hour before lunch, no problem – you don’t need to wait four more hours to roast another turkey; just get 20 more chefs to cook 20 more steaks and you’ll still be done on time. And because you’re using cloud infrastructure, you can spin up more chefs and griddles instantly to cope with the increased demand.

See, by designing systems to eliminate those non-parallelisable workloads, we create systems that scale smoothly with the available resources. The beauty of that is that, like all the best solutions in software, it turns “how fast is our website” into a pure business decision. You want faster pages? Pay for more servers. No need to rewrite your algorithms; just throw more power at it.

Conway’s Law

Finally, there’s Conway’s Law. First published in 1964, Conway’s Law is the observation that ‘any organization that designs a system (defined broadly) will produce a design whose structure is a copy of the organization's communication structure.’ As with Moore’s Law, it’s an observation that has proved remarkably prescient over the intervening decades. I’ve spoken at length about how Conway’s Law has affected the teams and projects I’ve worked on personally, but there’s also some interesting examples in the software industry at large. The relationship between the Linux kernel and the various distributions based on it has interesting parallels with Linus Torvalds’ role in the Linux ecosystem. Id Software’s genre-defining Doom and Quake games – tight, cohesive, focused engines, created by a bunch of coders camped out in a beach house with soda, pizza and no distractions, with less tightly-coupled elements like music and level design handled by less close-knit development efforts.  High-profile open source projects like Chromium – the underlying rendering engine, the browser itself, and the ecosystem of plugins and extensions closely reflecting the tight-knit Webkit project, the loosely-organised contributors and pull requests that shape the development of the browser application, and the community of plugin and extension developers who don’t engage with the project directly but rely on published contracts and protocols just as their plugins and extensions do.

And then there’s the really obvious examples, like this one:

image[92] 

Putting it all together…

OK, so let’s look at what happens when we interpret those three laws together. Moore’s Law has informed half a century of user expectations about technology. More people do more stuff on more devices, and they expect those experiences to keep on getting better, faster and more responsive. As we’ve seen above, the increase in raw computing power that’s going to deliver those improvements isn’t about clock speed any more, it’s about parallelism. Amdahl’s Law tells us whether systems will benefit from that parallelism or not – and that systems based around blocking, non-parallelisable, long-running operations will benefit the least from the next decade of computing innovation. And Conway’s Law says that if we don’t want our systems to contain these kinds of blocking, non-parallelisable operations, then we should be looking to eliminate them from our organisations.

Which brings us to the crux of the thing: what’s the organizational equivalent of a long-running non-parallelizable operation?

How about sitting around reading Hacker News because the person who’s asked you to build a “Summary Dashboard” hasn’t told you where to find the data, or what the dashboard should look like, and they’re out of the office right now, they didn’t leave any notes, and you can’t do anything until they get back?

How about a two-hour project update meeting where a series of people sit around telling each other things they could have emailed, or written down in a ticket or on a wiki page?

How about sitting on a train for an hour to get to the office in Canary Wharf where you’re expected to be at 09:00 every day, despite the fact that your source code is hosted in the US, your data centre is in Ireland, your issue tracking system is hosted in Frankfurt and your customers are online 24/7 all over the world?

One of the underlying principles of the agile manifesto is that ‘the most efficient and effective method of conveying information to and within a development team is face-to-face conversation’. I think that’s correct, but I think it might be optimising for the wrong metric. Sure, a conversation is a high-bandwidth, high-interaction discussion medium, and I find face-to-face great for bouncing ideas around and solving problems - but conversations are ephemeral. They’re not captured anywhere, nobody outside the conversation knows what was said, and there’s always the risk the people you’re talking to assure you they get it when they actually haven’t understood a single word you said. Perhaps we should be optimising our communication patterns for discoverability instead of raw bandwidth; trading a little temporary velocity for some long-term efficiency.

This isn’t just about cancelling a couple of meetings and letting people work from home on Fridays. It’s about changing the way we think about collaboration, so that the interaction patterns we want to see in our systems can emerge organically from the interaction patterns used by the people who created them. It’s about taking established architectural patterns and practises used in asynchronous distributed systems, and working out if we can apply those patterns to our teams and our projects. What if you applied event sourcing to your project backlogs, so you don’t keep having to ask people about the context behind a particular decision? Maybe you’re even doing this already – I know a lot of open-source projects that do an excellent job of capturing this history as part of their open issues and tasks so anybody who wants to pick up a particular ticket can see the complete history, the discussion, the arguments and hopefully the eventual consensus. What if you treat your documentation - wiki pages, GitHub pages, READMEs - like the query stores in a CQRS system? Rapid retrieval, read-only, optimised for consumption, and updated as necessary when processing commands (i.e. making changes) that affect the underlying systems that are being documented?

What I find remarkable is that Moore, Amdahl and Conway all published their eponymous laws almost exactly fifty years ago – Conway’s Law was published in 1964, Moore was 1965, Amdahl was 1967. Their observations hail from a decade of astonishing engineering achievements – Apollo, the Boeing 747, the Lockheed SR-71, the geodesic dome – in an era when computers were still highly specialist devices. Sure, you could argue that people working on timesharing systems in the 1960s couldn’t possibly have foreseen the long-term social implications of distributed systems engineering – but remember, this is the generation that landed on the moon using slide rules and No. 2 pencils. Do you really want to bet on them being wrong?

Tuesday, 25 October 2016

The Mystery of the Chinese Junk

You know it’s going to be one of those days when, just as you’re about to put on your headphones and get into ‘the zone’, you overhear somebody saying the fateful words ‘ok, then maybe we’ll need to get Dylan to look at it.’

See, amongst the many hats I wear in the course of a given week, there’s one that’s probably labelled ‘dungeon master’ I’m the one who remembers where all the bodies are buried, because – for all sorts of reasons that made very good sense at the time – I probably helped bury most of them. And on this particular day, the source of so much excitement was our venerable Microsoft Dynamics CRM v4 server. It started out with a sort of general grumbling on the support channel about CRM4 being slow… but by the time it was handed over to me to look into, it was beautifully summarised as ‘dude… there’s Chinese in the Windows event log’

And, sure enough, there is – complete with the lovely Courier typeface that Windows Event Viewer kicks into when you get errors so weird that good old Microsoft Sans Serif can’t even display them:

image

Now, whilst it’s been a while since I’ve done any serious work on our old CRM system, I’m pretty sure it’s not supposed to do that – so we start investigating. Working theory #1: some sort of vulnerability has resulted in attackers injecting Chinese characters into our database – whilst CRM4 is generally pretty well insulated from any public-facing code, there’s one or two places where signup forms would generate CRM Leads, that sort of thing. So we start grepping the entire database for one of the Chinese strings we’ve found in the event log.

Whilst this is going on – and trust me, it takes a while – I decide to share my excitement via the wonder of social media. This turns out to be a Really Good idea, because... well, here's what happened...

"The incoming tabular data stream TDS RPC protocol stream is incorrect. Parameter ("䐀攀氀攀琀椀漀渀匀琀愀琀攀..." Oh. It's gonna be one of THOSE days.

— Dylan Beattie (@dylanbeattie) October 18, 2016

@dwm @dylanbeattie The low bits are all null, so this probably UTF-16LE being mistaken for UTF-16BE (or vice versa...?). pic.twitter.com/D0IO4px9YE

— Fake Unicode ⁰ ⁧ (@FakeUnicode) October 18, 2016

You see in @FakeUnicode's screenshot there, the words ‘DeletionState’ appear quite clearly at the bottom of the message?

Whilst this is going on, our database search comes back reporting that there’s no mysterious Chinese characters in any of our CRM database tables. Which is good, since it means we probably haven’t been compromised. So, next step is to work through that Unicode lead, see if that gets us anywhere. Because .NET has a built-in encoding for big-endian Unicode, this is pretty simple:

var source = "䐀攀氀攀琀椀漀渀匀琀愀琀攀";
var bytes = Encoding.BigEndianUnicode.GetBytes(source);
var result = Encoding.Unicode.GetString(bytes);
Console.WriteLine(result);

Turns out – just as in FakeUnicode’s screenshot – that’s the text “DeletionState” with the byte order flipped. We grabbed a few examples of the ‘Chinese’ text from the event log, ran it through this – sure enough, in every single case it’s a valid CRM database query that’s somehow been flipped into wrong-endian Unicode. At this point we start suspecting some sort of latent bug – this is old software, running on an old operating system,talking to an old database server, and sure enough, a bit of googling turns up a couple of  likely-looking issues, most of which are addressed in various updates to SQL Server 2008. We take a VM snapshot in case everything goes horribly wrong, and one of the Ops gang volunteers to work late to get the server patched.

Next morning, turns out the server hasn’t been patched – because every single download of the relevant service pack has been corrupted. At which point all bets are off, because chances are the problem is actually network-related – which also explains where the ‘Chinese’ is coming from.

OK, let’s capture a stream of bytes from somewhere. Like, say, from the TDS data stream used by the MSCRMAsyncService

image

What does that say? If you think you know the answer, you’re wrong. Pop off and read The Absolute Minimum Every Software Developer Absolutely, Positively Must Know About Unicode and Character Sets (No Excuses!) – done? Awesome. NOW what do you think it says?

See, we have no idea. It’s a stream of bytes. Without some indication of how we’re supposed to interpret those bytes, it’s meaningless. OK, I’ll give you a clue – it’s UTF-16. Now can you tell what it says? No, you can’t – because (1) you don’t know whether it’s big-endian or little-endian, and (2) you don’t know where it started.

If we assume it’s big-endian, then the first byte pair – 00 48 – would encode the character ‘H’, the second byte pair – 00 65 – would encode ‘e’, and so on. If we assume it’s little-endian, then the first byte pair – 00 48 – encodes the character 䠀 – and suddenly the mysterious Chinese characters in the event log start to make sense.

image

Of course, the data stream between the MSCRMAsyncService and the SQL server hasn’t actually flipped from little-endian UTF-16 to big-endian – what’s happened is that the network connection between them is dropping bytes. And if you drop a single byte – or any odd number of bytes – from a little-endian Unicode stream, you get a sort of off-by-one error right along the rest of the data stream, resulting in all sorts of weirdness – including Chinese in the event logs.

Turns out there was a problem with the virtual network interface on the SQL Server box – which was causing poor performance, timeouts, bizarre query syntax errors, Chinese in the event logs, and corrupted service pack downloads. Fortunately the databases themselves were intact, so we offlined them, cloned the virtual disk they were sitting on, attached that to a different server and brought them back online.

Every once in a while, you get a weird problem like this. I’ve seen maybe half-a-dozen problems in my entire career that made absolutely no sense until they turned out to be a faulty network connection, at which point generally you not only solve the problem, but explain a whole load of other weirdness that you hadn’t got round to investigating yet. The only thing more fun than dodgy networks is dodgy memory – but that’s a post for another day.

Oh, and if you’re wondering about the title of this post, you clearly haven’t studied the classics.

Thursday, 15 September 2016

Upcoming Conferences and User Group Talks

I’m appearing at some conferences and user groups in October and November, talking about ReST, the history of the web, and how we can apply the scientific method to software development. Though not all at the same time.

October 19th I'll be speaking at FullStack Bytes at SkillsMatter here in London - one of a series of monthly events around the same themes and ideas as the annual FullStack conference - "JavaScript, NodeJS and the Internet of Things". On October 25th I’m heading up to Telford to give my Real World REST talk at the Shropshire Devs User Group.

November 5th, I’m one of the keynote speakers at Tampere Goes Agile, a free one-day conference in Tampere, Finland all about agile software development. The theme of the conference this year is ‘experimentation’, and I’ll be talking about the scientific method, the history of experimentation and how we can apply scientific principles to our own software projects and agile processes. It’ll be my first time visiting Finland – new country, new conference, and a new topic – and I’m looking forward to it immensely.

I’ll be speaking at Oredev in Malmo, Sweden on the 8/9/10th of November, where as well as a couple of technical talks, Rob Conery has signed me up for a ‘head to head’ session with Jimmy Bogard, which should be entertaining. Thanks, Rob... :)

Finally, I’ll be rounding out 2016 with a couple of talks at BuildStuff - in Vilnius on Nov 16-18 and then in Kyiv on the 21-22. If you’ve ever been, you’ll know that BuildStuff is an excellent conference with great people and great content – and if you haven’t, get yourself a ticket, come along and see for yourself.

Wednesday, 27 July 2016

Affordances, Signifiers, and Cartographobia

One of the teams here is putting the finishing touches on a new online version of Spotlight Contacts, our venerable and much-loved industry guide that started life as a printed handbook way back in 1947. Along the way, we’ve learned some very interesting things about data, and how people perceive that their data is being used.

image

One of the features of the new online version is that every listing includes a location map – a little embedded Google Map showing the business’ location. When we rolled this feature out as part of a recent beta, we got some very unhappy advertisers asking us to please remove the map from their listing immediately. Now, most of these were freelancers who work from home – so you can understand their concerns. But what’s really interesting is that in most cases, they were quite happy for their full street address to stay on the page – it was just the map that they were worried about.

Of course, this immediately resulted in a quite a lot of “what? they want to keep the address and remove the map? ha ha! that’s daft!” from developers – who, as you well know, are prone to occasional outbursts of apoplectic indignation when they have to let go of their abstractions and engage with reality for any length of time – but when you think about it, it actually makes quite a lot of sense.

See, street addresses are used for lots of things. They’re used on contracts and invoices, they’re used to post letters and deliver packages. Yes, you can also use somebody’s address to go and pay them a visit, but there are many, many reasons why you might need to know somebody’s address that have nothing to do with you turning up on their doorstep. In UX parlance, we’d say that the address affords all of these interactions – the presence of a street address enables us to post a letter, write a contract or plan a trip.

A map, on the other hand, only affords one kind of interaction; it tells you how to actually visit somewhere. But because of this, a map is also a signifier. It sends a message saying “come and visit us” – because if you weren’t actually planning to visit us, why would you need to know that Spotlight’s office at 7 Leicester Place is actually in between the cinema and the church, down one of the little alleys that run between Leicester Square and Chinatown? For posting a letter or writing a contract, you don’t care – the street address is enough. But by including a map, you’re sending a message that says “hey – stop round next time you’re in the neighbourhood”, and it’s easy to see why that’s not really something you want if you’re a freelancer working from your home.

It’s important to consider this distinction between affordances and signifiers when you’re designing your user interactions. Don’t just think about what your system can do – think about all the subtle and not-so-subtle messages that your UI is sending.

Here’s the classic Far Side cartoon “Midvale School for the Gifted”, which provides us with some great examples of affordances and signifiers. The fact you can pull the door is an affordance. The sign saying PULL is a signifier – but the handle is both. Looking at it gives you a clue “hey – I could probably pull that!” – and when you do, voila, the door swings open. If you’ve ever found a door where you have to grasp the handle and push,, then you’ve found a false affordance – a handle that’s sat there saying ‘pull me…’ and when you do, nothing happens. And, in software as in the Far Side, there’s going to be times when all the affordances and signifiers in the world are no match for your users’ astonishing capacity to ignore them all and persist in doing it wrong.

(Far Side © Gary Larson)

ASP.NET Authentication with Adxstudio

I’m looking into options for integrating our shiny new CRM system with our website, so we can provide all sorts of neat self-service capabilities and features. One of the applications I’m investigating is a thing called Adxstudio – now owned by Microsoft - which claims to “transform Dynamics CRM into powerful application platform with dozens of apps and starter portals.”

image

This is one of those situations where we really are dealing with ‘solved problems.’ Email campaigns. Customers updating their own contact details, potentially things like forums, helpdesk/ticketing systems – lots of things which are nice-to-have but really aren’t strategic differentiators, and so there’s a compelling argument to find an off-the-shelf solution and just plug it in. We already have a federated authentication system here at Spotlight – something we built a few years ago that provides basic identity and authentication capabilities on top of OAuth2. At the time we built it, OpenID Connect didn’t exist yet, so we’ve got a system that does basically the same thing but isn’t actually compatible with OpenID Connect – and consequently doesn’t work out-of-the-box with Adxstudio. So I’ve been poking around, trying to work out the best way to plug Adxstudio into our infrastructure so we can evaluate it as a solution.

One of the options on the table was to replace our existing authentication system with IdentityServer; another was to implement OpenID Connect support on top of our existing authentication system – both quite elegant solutions, but both of which involve quite a lot more work than is actually required for what we’re trying to do.

The core requirement here is:

  • We already have a CRM Contact record for every user of our system
  • We can look up a user’s CRM Contact GUID during authentication
  • We want to set up the Adxstudio MasterPortal demo so that our customers are seamlessly authenticated and can use Adxstudio features as though they had registered via the Adxstudio registration facility.

Now, one of the nice things about Adxstudio is that it’s built as OWIN middleware, and uses the ASP.NET Identity framework to handle authentication – so what we need to do is work out how to translate the CRM Contact GUID into an IPrincipal/IIdentity instance that we can assign to the HttpContext.Current.User property, and hope that Adxstudio then does the right thing once the HttpContext User is set correctly.

Adxstudio provides an implementation of ApplicationUserManager that’s already registered with the OWIN model, which accepts a CRM Contact GUID (as a string) and returns an instance of ApplicationUser that we can use to spin up a new ClaimsIdentity. So the simplest possible approach here is this snippet of code:

protected void Application_AuthenticateRequest(object sender, EventArgs e) {
  Guid userGuid;
  var cookie = Request.Cookies["crm_contact_guid"];
  if (cookie != null && Guid.TryParse(cookie.Value, out crmContactGuid)) {
    var http = HttpContext.Current;
    var owin = http.GetOwinContext();
    var userManager = owin.Get<ApplicationUserManager>();
    var user = userManager.FindById(crmContactGuid.ToString());
    var identity = user.GenerateUserIdentityAsync(userManager).Result;
    HttpContext.Current.User = new RolePrincipal(identity);
  }
}

Doing this with a Contact that’s been created via the Adxstudio registration thing works just fine – but trying to do it with a ‘vanilla’ contact blows up:

imageAs you can see from that stack trace, deep down buried under several layers of Adxstudio and ASP.NET Identity code, something is trying to construct a System.Security.Claims.Claim() instance and it’s blowing up because we’re passing in a null value for something that’s not allowed to be null. Unfortunately for us, because we don’t have the source for the thing that’s actually blowing up, we can’t see what the actual parameter values are that are causing the exception… so it’s time for a bit of hunch-driven development. :)

I’d already noticed that when you install Adxstudio into your CRM system, it adds a bunch of custom attributes to the Contact entity in Dynamics CRM; here’s a dump of those attributes for a working contact:

Attribute Key Value
adx_changepasswordatnextlogon False
adx_identity_emailaddress1confirmed False
adx_identity_lockoutenabled True
adx_identity_logonenabled True
adx_identity_mobilephoneconfirmed False
adx_identity_passwordhash (omitted for security reasons)
adx_identity_securitystamp ca49a664-0385-4eb4-90c0-6283c9e704ea
adx_identity_twofactorenabled False
adx_identity_username ali_baba
adx_lockedout False
adx_logonenabled False
adx_profilealert False
adx_profileisanonymous False
adx_profilemodifiedon 2016-07-20 09:18:43

The ASP.NET Identity model is generally pretty flexible, but I have a hunch that the username and the security stamp are both required fields because they’re fundamental to the way authentication works. So, let’s try inserting some code into the AuthenticateRequest handler that will check these fields exist, and update them directly in CRM if they don’t:

protected void Application_AuthenticateRequest(object sender, EventArgs e) {
  Guid userGuid;
  var cookie = Request.Cookies["crm_contact_guid"];
  if (cookie != null && Guid.TryParse(cookie.Value, out userGuid)) {
   
var http = HttpContext.Current;
    var owin = http.GetOwinContext();
    var userManager = owin.Get<ApplicationUserManager>();
    var user = userManager.FindById(userGuid.ToString());
    if (String.IsNullOrEmpty(user.UserName)
        ||
        String.IsNullOrEmpty(user.SecurityStamp)
     ) {
       // "Xrm" is the connection string name from web.config
       using (var crm = new OrganizationService("Xrm")) {
         var entity = crm.Retrieve("contact", userGuid, new ColumnSet(true));
         entity.Attributes["adx_identity_securitystamp"] =
           Guid.NewGuid().ToString();
         entity.Attributes["adx_identity_username"] =
           Guid.NewGuid().ToString().Substring(0, 8);
         crm.Update(entity);
      }
    }

    var identity = user.GenerateUserIdentityAsync(userManager).Result;
    HttpContext.Current.User = new RolePrincipal(identity);
  }
}

(For the sake of this demo, all we care about is making sure those values are no longer null. In reality, make sure you understand the significance of the username and security stamp fields in the identity model, and populate them with suitable values.)

OK, this now works sometimes – but only following an IISRESET. Turns out that Adxstudio is actually caching data from CRM locally, so although that new chunk of code is updating the Contact entity into a valid identity, the Adxstudio local cache doesn’t see those changes because it’s looking at an out-of-date copy of the Contact entity. So… time to configure some cache invalidation.

You can read about Adxstudio’s web notifications feature here. Adxstudio includes some code that will call a cache invalidation handler on your own site every time an entity is updated. Which works just fine IF CRM Online can see your Adxstudio portal site. And right now I’m running CRM Online as a 30-day trial and I’m running Adxstudio on localhost, and my workstation isn’t on the internet, so CRM Online can’t see it.

Time to fire up my favourite toolchain – Runscope and Ngrok. First, I’ve set up ngrok so that any requests to mytunnel.ngrok.io will be forwarded to my local machine – you’ll need a paid ngrok license to use custom tunnel names, but if you’re using the free version try this:

D:\tools\ngrok>ngrok http –host-header=adx.local 80 

image

Now, as long as that NGrok process is running, you can hit that URL – http://ef736c25.ngrok.io/ – from anywhere on the internet, and it’ll be tunneled to localhost on port 80 and have the host-header rewritten to be adx.local. This neatly solves the problem of CRM Online not being able to connect to my local Adxstudio instance.

Next, just to give us a bit of insight into what’s going on, I’m going to set up a Runscope bucket for that. Remember – we need to route requests to /cache.axd on our local Adxstudio portal instance, via ngrok, so here’s how to get the Runscope URL you’ll need:

image

So, last step – you see that big URL in the middle?  We need to tell the Adxstudio Web Notifications plugin to notify that URL every time something changes. The option is under CRM > Settings > Web Notification URLs.

Note that the Adxstudio documentation refers to a Configuration screen accessible from the Solutions > Adxstudio Portals Base. It appears this screen doesn’t exist any more – I certainly couldn’t find any trace of it in my CRM Online instance – but it also appears it isn’t necessary, because as soon as I’d created and active Web Notification URL things started happening.

So, now we have something that works – but it still fails on the first Portal request for a particular Contact, probably because the Adxstudio cache isn’t picking up those two new fields fast enough for the login to succeed. To work around this, I’ve put in a thread sleep and then an HTTP redirect, so the first time a user lands on the portal they’ll get a slight delay whilst we populate their Adxstudio attributes, and then they’ll get their personalised screen:

protected void Application_AuthenticateRequest(object sender, EventArgs e) {
  Guid userGuid;
  var cookie = Request.Cookies["crm_contact_guid"];
  if (cookie != null && Guid.TryParse(cookie.Value, out userGuid)) {
   
var http = HttpContext.Current;
    var owin = http.GetOwinContext();
    var userManager = owin.Get<ApplicationUserManager>();
    var user = userManager.FindById(userGuid.ToString());
   if (String.IsNullOrEmpty(user.UserName)
        ||
        String.IsNullOrEmpty(user.SecurityStamp)
     ) {
       using (var crm = new OrganizationService("Xrm")) {
         var entity = crm.Retrieve("contact", userGuid, new ColumnSet(true));
         entity.Attributes["adx_identity_securitystamp"] =
           Guid.NewGuid().ToString();
         entity.Attributes["adx_identity_username"] =
           Guid.NewGuid().ToString().Substring(0, 8);
         crm.Update(entity);
         Thread.Sleep(TimeSpan.FromSeconds(5));
         // Redirect back to the same page, so that Adxstudio will
         // retrieve a fresh copy of the cached data.
         http.Response.Redirect(http.Request.RawUrl);

      }
    }
    var identity = user.GenerateUserIdentityAsync(userManager).Result;
    HttpContext.Current.User = new RolePrincipal(identity);
  }
}

And it works. The final step for me was to spin up a separate web app that lists all the Contacts in the CRM system, with a login handler that puts the CRM Contact GUID into a cookie and redirects the browser to http://adx.local/ – and it works. No registration, no login, and any user with a valid CRM Contact GUID can now log directly into the Adxstudio MasterPortal example.

Wednesday, 20 July 2016

ASP.NET Core 1.0 High Performance

I have exciting news. My friend and erstwhile colleague James Singleton has just published his first book, ASP.NET Core 1.0 High Performance. James was gracious enough to invite me to contribute the foreword – and since the whole point of a foreword is to tell you all why the book is worth buying, I figured I’d just post the whole thing. Read the foreword, then read the book (or better still, buy it then read it.)

TL;DR: it’s a really good book aimed at .NET developers who want to improve application performance, it’s out now, and you can buy your copy direct from packtpub.com.

And that foreword in full, in case you’re not convinced:

"The most amazing achievement of the computer software industry is its continuing cancellation of the steady and staggering gains made by the computer hardware industry."
- Henry Petroski

We live in the age of distributed systems. Computers have shrunk from room-sized industrial mainframes to embedded devices smaller than a thumbnail. However, at the same time, the software applications that we build, maintain and use every day have grown beyond measure. We create distributed applications that run on clusters of virtual machines scattered all over the world, and billions of people rely on these systems, such as email, chat, social networks, productivity applications and banking, every day. We're online 24 hours a day, 7 days a week,  and we're hooked on instant gratification. A generation ago we'd happily wait until after the weekend for a cheque to clear, or allow 28 days for delivery; today, we expect instant feedback, and why shouldn't we? The modern web is real-time, immediate, on-demand, built on packets of data flashing round the world at the speed of light, and when it isn't, we notice. We've all had that sinking feeling... you know, when you've just put your credit card number into a page to buy some expensive concert tickets, and the site takes just a little too long to respond. Performance and responsiveness are a fundamental part of delivering great user experience in the distributed age. However, for a working developer trying to ship your next feature on time, performance is often one of the most challenging requirements. How do you find the bottlenecks in your application performance? How do you measure the impact of those problems? How do you analyse them, design and test solutions and workarounds, and monitor them in production so you can be confident they won’t happen again?

This book has the answers. Inside, James Singleton presents a pragmatic, in-depth and balanced discussion of modern performance optimization techniques, and how to apply them to your .NET and web applications. Starting from the premise that we should treat performance as a core feature of our systems, James shows how you can use profiling tools like Glimpse, MiniProfiler, Fiddler and Wireshark to track down the bottlenecks and bugs that are causing your performance problems. He addresses the scientific principles behind effective performance tuning - monitoring, instrumentation, and the importance of using accurate and repeatable measurements when you’re making changes to a running system to try and improve performance.

The book goes on to discuss almost every aspect of modern application development - database tuning, hardware optimisations, compression algorithms, network protocols, object-relational mappers. For each topic, James describes the symptoms of common performance problems, identifies the underlying causes of those symptoms, and then describes the patterns and tools you can use to measure and fix those underlying causes in your own applications. There’s in-depth discussion of high-performance software patterns like asynchronous methods and message queues, accompanied by real-world examples showing how to implement these patterns in the latest versions of the .NET framework. Finally, James shows how you can not only load test your applications as part of your release pipeline, but can continuously monitor and measure your systems in production, letting you find and fix potential problems long before they start upsetting your end users.

When I worked with James here at Spotlight, he consistently demonstrated a remarkable breadth of knowledge, from ASP.NET to Arduinos, from Resharper to resistors. One day he’d be building reactive front-end interfaces in ASP.NET and JavaScript, the next he’d be creating build monitors by wiring microcontrollers into Star Wars toys, or working out how to connect the bathroom door lock to the intranet so that our bicycling employees could see from their desks when the office shower was free. Since James moved on from Spotlight, I’ve been following his work with Cleanweb and Computing 4 Kids Education. He’s one of those rare developers who really understands the social and environmental implications of technology - that whether it’s delivering great user interactions or just saving electricity, improving your systems’ performance is a great way to delight your users. With this book, James has distilled years of hands-on lessons and experience into a truly excellent all-round reference for .NET developers who want to understand how to build responsive, scalable applications. It’s a great resource for new developers who want to develop a holistic understanding of application performance, but the coverage of cutting-edge techniques and patterns means it’s also ideal for more experienced developers who want to make sure they’re not getting left behind. Buy it, read it, share it with your team, and let’s make the web a better place.

Check it out. The chapter on caching & message queueing is particularly good :)