UM Call Router troubleshooting adventure

Yesterday I was in Redmond, teaching the UM portion of the Microsoft Certified Solutions Master: Messaging class. This was the first rotation for this particular class, which replaces the MCM Exchange 2010 class, so I had all new content and an eager group of 14 motivated, smart MCSM candidates, including fellow MVP Michael van Horenbeeck, several people who I knew from online interactions (hi, Hany and Jerrid!), plus candidates from Germany, Israel, the US, Australia, and probably a few other places.

The teaching session went well, although my slides have a few lingering rough spots that I’ll need to polish. In this rotation we had a brand-new, and much improved, lab environment, so when Michael called me over to have a look at something my first thought was that it was a lab setup issue.

He couldn’t get the UM Call Router service to run after he’d enabled a certificate for it and set its startup mode to “dual”. These steps are required if you want to integrate Exchange UM with Lync, but they must be done in a specific order: first you change the startup mode with Set-UMCallRouterService (which will complain that you can’t enable secure SIP without a certificate!) and then you use Enable-ExchangeCertificate to assign the certificate.After he did so, the UM Call Router service stopped answering requests. When he ran netstat, he saw that it was listening on the IPv4 and IPv6 loopback addresses, but not on the assigned IPs for that server. The call router service had logged event ID 1621, which said that the UMCR couldn’t start because “the Client Access service was disabled.” This didn’t make a bit of sense, so we started digging.

First, I verified that no one else was having this particular problem—and they weren’t, so it seemed to be localized to Michael’s environment. Next I spent some time researching event ID 1621 on the intertubes, but that didn’t take long; the only two mentions I found were on TechNet, and the suggested solution was to reinstall Exchange. Nope, not gonna happen.

Michael had the bright idea to check the service component availability.. and it came back as “inactive”. However, the service was still running and would respond to telnet requests on port 5060 on the loopback address. This seemed very odd.

We ued Set-ServerComponentState to force the UMCR back into normal state, and it started listening on 0.0.0.0 again! So clearly the problem was that Managed Availability had killed off the service—now we started investigating why.

After a number of experiments, our theory was that because the UMCR couldn’t start in dual mode without a certificate assigned, so Managed Availability decided that it was unhealthy and marked its state as “inactive.” To test this, we ran Set-ServerComponentState to put the UMCR in maintenance mode; sure enough, the next time the service was probed, it unbound from 0.0.0.0 but remained bound to both loopback addresses. Forcing the service state back to healthy caused it to rebind to 0.0.0.0.

This leads me to point out a couple of things:

  • It strikes me as very odd that after Managed Availability marked the service as inactive that it kept running. I assume that this is on purpose; the service stays up so that Managed Availability can continue to probe it and keep its state updated.
  • The description for event ID 1621 is so bad it isn’t even wrong—the service wasn’t running because it couldn’t start with an unassigned certificate (and in fact, there was a separate event indicating exactly that). The problem had nothing to do with the (non-existent) client access service being disabled.
  • I didn’t see any events logged indicating that the component was in an unhealthy state, although I might have missed them. Once we’d fixed the binding problem, as we transitioned UMCR into and out of maintenance mode, we saw event ID 1648, indicating that the UMCR was returning to healthy state.

Clearly I still have a lot to learn about Managed Availability! I recommend starting with this blog post, which explains in more depth how you can find out what sort of mischief it may have committed on your unsuspecting services…

Advertisements

7 Comments

Filed under UC&C

7 responses to “UM Call Router troubleshooting adventure

  1. Pingback: UM Call Router troubleshooting adventure | Paul’s Down-Home Page | JC's Blog-O-Gibberish

  2. You are the man !
    I had some strange behavior to enable the Service again. I did it for all the requesters and then it worked.
    Thanks you all and I hope you had as much fun as I had at the Lync Rotations 🙂
    – Philipp

    • Johann Spies

      Hi Guys,

      I am experiencing the exact same problem… Get-ServerComponentState shows the UMCallRouter as Inactive. I have tried the Set-ServerComponentState -State Active but it remains Inactive. How do you force it go Active or Healthy? I can see that my UMCallRouter ports is only bound to the loopback address… Any help would be appreciated…

      Cheers.

      • Mike

        Does anyone know why the UMCallRouter service dumps gigs of logs files to C:\Windows\Temp? If so, please explain and help see if this path can be changes. Thank in advance. I am sorry to post this here, but it appears y’all know stuff about UM..

      • robichaux

        Wow, I don’t know, Mike– that’s a new one on me. I’ve seen occasional reports of AV services spamming the temp directory with log files, but have never run into this problem before. What version of Exchange and Windows are you using, and when did this start? Can you identify what the log files are for (i.e. do they say anything that tells you which component is creating them?)

  3. Pingback: Exchange 2013 and Lync 2010 Auto-attendant not working | Jodrik.nl

  4. artofnoise

    i have the same issue with c:\windows\temp fills up the hard drive with random named .tmp files example: 68ed544f-c155-4cb1-adf4-722ab1eb5f65.tmp. contents of tmp file are over and over.
    — Dumping Process Log —
    — Starting new process log for Microsoft.Exchange.UM.CallRouter —
    05/11/2016 23:08:19 7812 Microsoft.Exchange.UM.CallRouter: UMCallRouter::Main
    05/11/2016 23:08:19 7812 Microsoft.Exchange.UM.CallRouter: UMAbstractService::Initialize
    05/11/2016 23:08:19 7812 Microsoft.Exchange.UM.CallRouter: UMAbstractService::Initialize: Removed unnecessary privileges
    05/11/2016 23:08:19 7812 Microsoft.Exchange.UM.CallRouter: UMAbstractService::Initialize: Initialized Watson.
    05/11/2016 23:08:19 7812 Microsoft.Exchange.UM.CallRouter: UMAbstractService::Initialize: Register for Unhandled Exceptions
    05/11/2016 23:08:19 7812 Microsoft.Exchange.UM.CallRouter: Initialize: Initialized single performance counters.
    05/11/2016 23:08:19 7812 Microsoft.Exchange.UM.CallRouter: UMAbstractService::C’tor
    05/11/2016 23:08:20 7812 Microsoft.Exchange.UM.CallRouter: UMServiceBase::OnStartInternal
    05/11/2016 23:08:26 7812 Microsoft.Exchange.UM.CallRouter: UMServiceBase::StartService
    05/11/2016 23:08:26 7812 Microsoft.Exchange.UM.CallRouter:

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s