Schannel Fatal Alert (70) on Exchange Server 2010

So I use Zenoss for centralized system monitoring, covering everything from network devices and ESXi hosts all the way to end servers such as Windows VMs using WMI.

As I receive a flood of events from SharePoint and its child service, a terrible workflow server add-on called K2 Blackpearl, I ignored my Zenoss server quite a bit. I did clean up my other servers pretty well, so when I noticed this alert on my Exchange server, I wasn't too happy. I like clean event logs on most of the servers I manage. (I've made an exception for SharePoint and K2, since they're a whole mixed bag of service accounts, permissions, and web parts... so many moving parts that I simply don't care about their events, given there are no issues.)

So I set out to figure out what was causing this event. The usual googling came up with the usual TechNet threads claiming the server is probably just renegotiating to another acceptable protocol and that you should accept it. And, as usual whenever people can't figure out why an event is triggering but it doesn't seem to affect production: "you can just ignore it, or disable SChannel events". This is not good enough for me, as it clearly indicates an issue going on in the back end.

Digging further, I came across this tidbit of info. Using it, I knew it was an SSL/TLS protocol version issue (TLS alert code 70 is protocol_version), and since it's on my Exchange server I had an itchy suspicion it was ActiveSync related.

Installing Wireshark on the server and running it with the SSL filter in place, I was sure enough able to pinpoint the device triggering the events: my boss's Note 4 running Android 5.0.1 using the native mail app. At first I simply went into his Exchange account settings (just to note, it would work externally but not internally), unchecked SSL (which caused auth to fail, as expected), then re-enabled SSL. This seemed to make his ActiveSync work and I figured the events would go away. They did not; checking Wireshark, they were still coming from his phone.

To paraphrase the solution:

1) Remove the corporate email account from the device. (Completely)
2) Re-add the account to the device.

So that's it! Since doing that I haven't received any other SChannel fatal alert (70). I hope this helps others who come across the same events in their Exchange environment. Just note this was on Exchange 2010 SP3 RU 10.

Jan 2018 Update

Got to love event logging. See so much, but sometimes, so much can drown you. Just have to take care of the ones you can when you can.

Remotely working with user profiles

Checking up on my daily emails, I see my usual AV report. I open it up to see who's the bad surfer, and it turns out there's only one system. It turned out to be the new temp we hired. I didn't expect him to actually go to any bad sites, he didn't seem to be the type, so I quickly viewed the infected file.

Turns out the file specified was from an old user profile, of someone who used the system before him... making me wonder how many old profiles are on his system. I'm a systems admin who prefers to get some things done without affecting other people's work. I manage this with different remote applications. Most of these applications share the user's screen and allow taking over the system. This still interrupts the user, so often I resort to Sysinternals PsExec.

So I wanted to find out how I could enumerate a list of local profile accounts on a system via command. Doing a bit of research showed this to not be as easy as I had expected (I was hoping for a simple wmic command). As it turns out, sometimes I forget I'm stuck in the past and should really get with the times: the old way I found is from 2005. Thankfully, since Vista there's a new class to handle this. 😀 Check out this post for more info.

Easy Peasy way using Win32_UserProfile class and PowerShell 😀

To paraphrase the solution:

1) Open PS in elevated mode.
2) gwmi win32_userprofile -co $REMOTESYSTEM | ft sid, localpath -a
3) Where gwmi = Get-WmiObject | Win32_UserProfile is the class | -co is short for -ComputerName (the remote system) | the | is the pipe | ft is Format-Table and -a is -AutoSize
4) Note that, as with all object-based tooling, this can very easily be used to manage user profiles as well, e.g.:
5) (gwmi win32_userprofile -co Server1 | where {$_.LocalPath -like '*\cjohn*'}).Delete()

So that’s it! Ever since Vista managing user profiles has become a breeze and no longer requires intensive scripting to be managed remotely! Thanks MS you finally did something right!

DC Demote fails due to ForestDNSZones

Scenario: You are about to remove the final physical Domain Controller from your infrastructure. As you've done this before, you figure it'll be a piece of cake and you'll go about your day... Instead you're presented with this!


After some googling, you'll probably come across this, which will tell you that you need to edit the object's attribute to point to an active DC role owner.

If you decided to look in the dcpromo log file, you probably noticed that the fSMORoleOwner attribute is pointing to an old server, which was probably the PDC at one point within the domain.

Even though it took a decent amount of time to troubleshoot, I'll keep this post short. Just check out MS tech guy Chris Davis's blog about the issue.

Grab FixFSMO.vbs from the MS support article or Davis's blog and run it against the object, in DN notation, on a PDC or the DC you wish to have as the FSMO owner.
ex. cscript fixfsmo.vbs DC=DomainDnsZones,DC=Contoso,DC=com

Jan 2018 Update

Funny I don’t recall this one all that well, but great blog post by Chris which covers the nitty gritty pretty well, considering it’s a direct MS technet blog unlikely to go down. Good job.

Switching between Skype UI and Lync UI

To change All Users to Skype for Business UI:
Set-CsClientPolicy -Identity Global -EnableSkypeUI $true

To change All Users to Lync 2013 UI:
Set-CsClientPolicy -Identity Global -EnableSkypeUI $false

What if you only want to change the UI for a certain group of users?
It’ll only take 2 extra cmdlets, in the same sphere.

First you create a new client policy by which to identify this group of users. Let’s call them “SkypeTesters”.
The cmdlet will look like this:
New-CsClientPolicy -Identity SkypeTesters -EnableSkypeUI $true

Then you collect users & assign them to this new SkypeTesters policy. You can collect users via department, AD group, etc. I’ll use a Marketing Department for this example.

To collect users:
Get-CsUser -LDAPFilter "Department=Marketing"
To grant them the new client policy & enable Skype for Business UI (this cmdlet needs the collected users as input):
Grant-CsClientPolicy -PolicyName SkypeTesters

(Of course you pipe these two cmdlets together: Get-CsUser -LDAPFilter "Department=Marketing" | Grant-CsClientPolicy -PolicyName SkypeTesters. I split them up just for clarity's sake.)

Information here was provided by The Lync Insider

If you wish to enable the Skype UI for all users after you are done with your initial test group, remove users from the test group with the following piped cmdlet:
Get-CsUser -Filter {ClientPolicy -eq "SkypeTesters"} | Grant-CsClientPolicy -PolicyName ""
The key is specifying a blank PolicyName; this took me a rather long time to figure out hahah.
Once that is completed you can run the initial cmdlet above to enable the Skype UI on the global policy.
This is a way better demo than my initial blog post, thanks Lync Insider for having a better write-up than MS answers! Cheers!

FYI, to check the global policy and what its attribute is set to, run the following cmdlet:
Get-CSClientPolicy -Identity Global | select Identity, EnableSkypeUI | fl


Jan 2018 Update

Good ol’ Lync/Skype; Seems MS can never get their marketing choices right and all the Devs suffer for it.

Repair a corrupted Windows boot

This blog is going to be an interesting one. How I ended up with a Windows machine with a bad boot is as interesting as how I managed to resolve the issue.

It started on a Friday, well it technically started before that, but long story short my colleague and I were planning to do a P2V of a physical SQL machine. I prepared the new ESXi host slowly during the week to prepare for this.

As I was destined to go camping that weekend and a bit of work still needed to be done to get ready for the process, I'll admit I was a bit rushed in telling my colleague the appropriate steps to take. At this point the host was ready, albeit a bit on the small side when it came to the local datastore: 3 x 6Gb/s SATA discs in RAID 5 (I really wanted RAID 1 with a hot spare, but this was the best suitable option this HBA/RAID controller provided me). Anyway, while I was away camping (go figure, in a location with no cell service), my colleague completed the P2V. It didn't go 100% as planned, as he emailed me a bunch of errors reported by the ESXi host. Turns out he had provisioned Thick Provision Lazy Zeroed discs. Normally I wouldn't deny it's a good choice under certain circumstances; in this case, however, with the limited datastore space it wasn't the greatest choice, because there wasn't really room to spare on empty "zeroed" data.

So after I was informed of the situation I began to attempt to fix the issue, which happened to be that the host was unable to remove/consolidate snapshots due to the limited space left on the datastore. I began by adding the host to our SAN network and connecting it to our SAN storage. I did a svMotion after I had initially shut down the guest OS. Shutting down the guest OS took over 20 minutes with the slowness the issue had brought it to; while it was still showing "shutting down" I got impatient and forced the system down.

After the initial svMotion, I checked the datastore and noticed that there was still a VM folder in there and all the space had not yet cleared, which I found a bit strange considering it should have migrated all the data to the new store.

I figured, well, let's see how the VM reacts now... and whomp whomp waaaaaaa, I was presented with this! (Image source was removed, I can't remember what it was lol)

At this point I wasn't sure if this was related to the snapshots and the VM folder set not all being in one place, so I decided to delete all the snapshots. Even after completion, and noting that all data had indeed migrated to the SAN, I was still presented with the error shown above.
At this point I was starting to worry that I might have ruined 20 hours of P2V work. I was too tired to carry on for the day and decided to boot the physical server back up to handle DB requests for the following work day.

The next day I jumped on fixing the issue at hand to recover this VM and save the 20 hours it took to P2V this thing. I started by mounting the Windows Server 2008 R2 installation DVD to the guest VM and adjusting the boot delay so I could get into the boot order and boot from the disc. Even though selecting "Repair your computer" did see all the local discs, including the installed OS, it would only give me the options of recovering from a system image (which I didn't have), running diag tools (doesn't help in this case) and a command prompt. So I loaded the command prompt. Everything I tried with the bootrec.exe options failed:

/FixMBR (didn’t work)
/FixBoot (didn’t work)
/ScanOS (Found 0 installed instances)
/RebuildBCD (Found 0 installed instances)

At this point I felt it was pretty shot and unrecoverable, but like usual I decided to give it one last google search on the issue of 0 found instances. That led me to this MS answers post with the same question, answered by Vijay B.

To paraphrase the solution:

1) bcdedit /export c:\bcdbackup (Backup the existing bcd)
2) attrib c:\boot\bcd -h -r -s  (Allow write/modify of the BCD file)
3) ren c:\boot\bcd bcd.old      (rename the BCD file, can also just be deleted, this is a backup solution)
4) bootrec /rebuildbcd

At this point bootrec will catch the Windows install and actually rebuild the BCD. Believe it or not, after that I was able to successfully boot the VM, and saved a 20 hour P2V. I can now freely vMotion and move this VM as required in my virtualized environment!
Thanks Vijay!

Jan 2018 Update

Another good post, but sadly I didn't write out the error message, and the externally hosted image has clearly been lost from the interwebs.

The User Profile Service failed the logon

It’s a beautiful Monday morning. I get up shower and get dressed for work. Hop on the bus, that happens to be crowded to the tits! As I stand silently enjoying the sun shine through the bus windows, I hear the annoying sounds of a child’s educational video game. Have you ever heard the sound FX from those things…. so repetitive it’ll drive you crazy!

I silently tough out my nightmarish bus ride, and walk into work. Pull up on my new standing desk, and begin to check on Backup status, and server updates. Then go to grab a coffee from the lunch room.

As I return to my office, I notice a Lync/Skype for Business communication from a colleague that works on another floor stating another user there can’t log into their system.

I quickly open a cmd prompt to verify the workstation has network connectivity; it sure does. I remote in using our remote software and watch the user's login attempt. Sure enough, it fails.

The User Profile Service failed the logon

As this was something I had not seen before, although I had a real good assumption it was user profile based, I quickly googled the error as seen on the screen. Where would we be without google!? My first investigation brought me to this MS support page, but its suggestions were a bit outlandish for me, considering it basically wanted me to manually re-create the profile and migrate the data. I don't think so. As I went to see how large the profile was via Advanced System Settings, I noticed the profile status was set to "backup". Googling this issue brought me to this awesome blog page.

 To paraphrase the solution:
	1) Open Regedit (HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Windows NT\CurrentVersion\ProfileList)
	2) Find the profile SID key ending in .bak
	3) Rename the SID key without .bak (the one with ProfileImagePath = c:\users\temp) to ***.ba
	4) Rename the SID key with .bak by removing the .bak
	5) Set the DWORD values State and RefCount to 0
	6) Rename ***.ba to ***.bak, or delete it (haven't tested deleting)
	7) Reboot and enjoy
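
The rename order in steps 2-4 and 6 is easy to get backwards, so here is a minimal Python sketch of it as a dry run. It does not touch the registry at all: the function name plan_profile_fix and the dict of key names are made up for illustration, and the SID in the usage note is a placeholder. Step 5 (State and RefCount) is not modeled.

```python
def plan_profile_fix(keys):
    """Return the ordered (old_name, new_name) renames for ProfileList subkeys.

    `keys` maps subkey name -> ProfileImagePath. This only models the
    rename order from the steps above; it never edits a real registry.
    """
    renames = []
    for name in sorted(keys):
        if not name.endswith(".bak"):
            continue
        sid = name[: -len(".bak")]
        temp_key_exists = sid in keys
        if temp_key_exists:
            # The temp-profile key currently owns the SID name: park it as .ba.
            renames.append((sid, sid + ".ba"))
        # Restore the real profile by dropping the .bak suffix.
        renames.append((name, sid))
        if temp_key_exists:
            # Keep the temp key around as a backup under the .bak name.
            renames.append((sid + ".ba", name))
    return renames
```

Calling plan_profile_fix({"S-1-5-21-1-1001": r"C:\Users\TEMP", "S-1-5-21-1-1001.bak": r"C:\Users\jdoe"}) gives the three renames in the order they have to happen: SID to .ba, .bak to SID, then .ba to .bak.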

Jan 2018 Update

Sourced links, paraphrased solutions in case links die, good story and to the point. Well done.

Dangers of Xcopy and Robocopy

I use my own blog once in a while to remind myself how to do a specific task, and sometimes I forget to document every step along the way.
I did my basic steps of using psexec with admin creds and elevated permissions, and came across the two usual problems I forgot to mention in my previous posts!

The first one being; Access Denied – unable to create directory.
That's funny considering I have full admin permissions; this is another one of those erroneous error messages. This one was really due to junctions (roughly the Windows counterpart of symbolic links in Linux).
After hurdling that issue using the switches /d /s /y /h /i /z, I discovered the next erroneous error message: insufficient memory. Gosh darn it, xcopy, can you do anything right? Like usual, google to the rescue, and you sure enough discover that this happens due to the path length limit. Strictly speaking it's the classic Windows MAX_PATH limit of 260 characters that xcopy is bound by, not NTFS itself, which allows much longer paths (unless you are on Win 8.1 or above and use the new file table, but that's a blog post for another day). So the total directory path turns out to be longer than xcopy can handle and causes this "insufficient memory" error. As you can imagine, google and other blogs pointed me to Robocopy!

Robocopy, what a nice command tool! Anyway, at first I attempted a normal robocopy without switches. Well, that didn't work. Then I saw /MIR. MASSIVE CAUTION!
Robocopy by default attempts to copy junction links, and by doing so (I'm not sure exactly how it attempts to accomplish this) it will spiral into the junction, creating folder parents of children deeply into itself, leaving you with a structure similar to:
c:\user\username\Application Data\Application Data\Application Data
to the point where it gets so deep it crashes, saying it can't read the file, and attempting to break out of it using ctrl+c won't really work.
If you did run Robocopy with /MIR on a user's profile or any parent directory with junction folders, you'll have to manually clean up the mess left behind. When you attempt to delete the parent folder it will prompt an error stating the file is too large for the file system it resides on, or something similar. These events are full of erroneous error messages. Go as far into the folder as you can, then cut the folder where you end up and paste it all the way up into its parent folder, shrinking the complete path name and allowing you to work with the deeper child directories. After a couple of cut-n-pastes it should be short enough for you to delete the folders and all files.
The proper way is to use "Robocopy SourceDir TargetDir /MIR /XJ". This way junctions will be skipped. Hope this helps someone else experiencing troubles with Xcopy and Robocopy.
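
If you want to know up front whether a tree will trip the path-length problem, a quick scan helps. The following is a minimal Python sketch, not part of the original fix: find_long_paths and the 260 default are my own choices for illustration. It walks the tree with followlinks=False so symbolic links are not descended into; how junctions are treated depends on the Python version, so treat that part as an assumption and test on a copy first.

```python
import os

MAX_PATH = 260  # classic Win32 path limit, used here as an assumed default


def find_long_paths(root, limit=MAX_PATH):
    """Return (length, path) pairs under root whose full path is >= limit.

    Symbolic links are listed but not followed, so a link that points
    back into its own parent cannot make the walk recurse forever.
    """
    hits = []
    for dirpath, dirnames, filenames in os.walk(root, followlinks=False):
        for name in dirnames + filenames:
            full = os.path.join(dirpath, name)
            if len(full) >= limit:
                hits.append((len(full), full))
    # Longest paths first, since those are the ones to shorten first.
    return sorted(hits, reverse=True)
```

Run it against the source directory before copying; an empty list means nothing under that root reaches the limit. Directories that cannot be read are silently skipped by os.walk, so an empty result on a locked-down profile is not a guarantee.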

Jan 2018 Update

I remember this. 🙂 2014 was a good year.

Locked out of iLO, on ESXi hosts!

Scenario: You’ve taken over a system admin position, with little documentation and you are locked out of a pre-configured iLO port and the default username and password has been changed!

Prerequisites:
iLO Resources

iLO drivers and tools for server

Info Source

software install ESX host SSH

Let’s say you have a handful of ESXi hypervisors. Sweet, but what are they running on? A couple of G7 servers… They may not be G8’s but hey, can’t have everything.
Sweet part is most of these big servers have remote management capabilities. HP offers iLO, this sweet little separate attached hardware can do many remote tasks, such as, but not limited to: hard resets/power cycle, send SNMP Traps, and a couple of other things.
Now you’ve found the iLO host name, you send and receive ICMP requests (Pings) and sure enough you can access the web interface; fantastic!

Now you attempt to log on and realize every attempt fails. A couple of questions might start to arise: does this accept domain credentials (directory-based authentication)? Is there a default user name and password?
By reading the user guide you discover that it does, with one exception... directory-based authentication requires an iLO license...
Now unless you're in a big corporate environment running a decent sized datacenter, chances are you don't have directory-based access on your iLO port.

Now you take the default admin account, Administrator, and type in the password as indicated on the info tab on the server... no luck...
As the guide suggests, it has been changed... now comes all the fun stuff: bringing iLO up to date.

Now if you're lucky, the iLO drivers and tools for the ESXi host are already installed, in which case you won't even need to reboot your hypervisor to get into iLO.
However, if they aren't, make sure you migrate any active production VMs to another host, or schedule a maintenance window.
At this point I'm assuming you have access to the host directly: admin access, if not by directory services then via a local root account.
SSH into the host directly. If you have the /opt/hp/tools directory, chances are you have the required iLO drivers and tools.

Otherwise ensure you follow these steps:
1) Log into vCenter (if managing a cluster) and migrate/shut down active VMs.
2) Right click the host that is about to get the iLO drivers installed and choose Maintenance Mode.
3) SCP hp-HPUtil-esxi5.0-bundle-1.4-15.zip to /tmp on the host.
4) In the SSH session (as admin/root) enter: esxcli software vib install -d /tmp/hp-HPUtil-esxi5.0-bundle-1.4-15.zip
5) Reboot the host (the install output tells you whether a reboot is required).
6) Wait for ICMP responses; after a couple of minutes of responses you can take the host out of Maintenance Mode in vCenter.

Congrats you can now manage the iLO port without needing to reboot the server ever again! Let’s get into that web interface!

You have a couple of options at this point: reset the settings to factory defaults, or reset the admin password. Since we don't want to reconfigure everything, let's reset that password.
There's a good chance the previous admin changed the name and password of the default admin, so how do we figure out what that account is?
All one has to do is export the existing configuration and display it.
In the SSH session on the ESXi host type: /opt/hp/tools/hponcfg -w /tmp/ilo-config.txt && cat /tmp/ilo-config.txt
This will spit out an XML-looking set of config options; at the bottom one should see an ADD_USER element with a USER_LOGIN field.

Now you build an XML file, if you’re puttying into the ESXi host just use vim to paste and edit this:

<RIBCL VERSION="2.0">
<LOGIN USER_LOGIN="USER_LOGIN" PASSWORD="NewPassword">
<USER_INFO MODE="write">
<MOD_USER USER_LOGIN="USER_LOGIN">
<PASSWORD value="NewPassword"/>
</MOD_USER>
</USER_INFO>
</LOGIN>
</RIBCL>

Replace USER_LOGIN with the account name found in the export; the PASSWORD value inside MOD_USER is the new password that gets set.

Save the file to /tmp/reset_admin_pw.xml and run: /opt/hp/tools/hponcfg -f /tmp/reset_admin_pw.xml
It should report that it completed; now just log into the web interface with the USER_LOGIN ID and NewPassword. While you're finally in there, create a new default admin and password and document it.
Create your own account with a private password, as the default account should only be used in an emergency.
Also update the firmware to the latest and renew your certs!
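
If you'd rather not hand-edit XML in vim, the lookup and the reset file can be generated. This is a minimal Python sketch using only the standard library, meant to be run on your workstation and not on the ESXi host; find_user_logins and build_reset_xml are names I made up, and the shape of the exported config is assumed from the ADD_USER / USER_LOGIN description above.

```python
import xml.etree.ElementTree as ET


def find_user_logins(config_xml):
    """Return the USER_LOGIN of every ADD_USER element in an exported config."""
    root = ET.fromstring(config_xml)
    return [user.get("USER_LOGIN") for user in root.iter("ADD_USER")]


def build_reset_xml(user_login, new_password):
    """Build the password-reset document as a string.

    Mirrors the hand-written file above: MOD_USER names the account and
    the nested PASSWORD element carries the new password.
    """
    ribcl = ET.Element("RIBCL", VERSION="2.0")
    login = ET.SubElement(ribcl, "LOGIN", USER_LOGIN=user_login, PASSWORD=new_password)
    info = ET.SubElement(login, "USER_INFO", MODE="write")
    mod = ET.SubElement(info, "MOD_USER", USER_LOGIN=user_login)
    ET.SubElement(mod, "PASSWORD", value=new_password)
    return ET.tostring(ribcl, encoding="unicode")
```

Copy the contents of /tmp/ilo-config.txt into find_user_logins to get the account name, write the output of build_reset_xml to reset_admin_pw.xml, then SCP it to /tmp on the host and feed it to hponcfg -f as described above.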

Jan 2018 Update

Updated one source link on the drivers source, as HP for some reason didn’t have a redirect to hpe since dividing their company. Again Well done.

Event ID 21054 error logged on Lync Server 2013

If you manage a Lync 2013 server, you have probably looked through Event Viewer and found this little bugger: Event 21054.

You'll notice this event is generated whenever the address book is updated with the Update-CsAddressBook cmdlet or after the maintenance that is run every day.

The following is a sample of the event log information details:

Users are not indexed in the database that should be.

Expected indexed user count: 0

Actual indexed user count: 136

Cause: User replication issue.

Resolution:

Run Update-CsAddressBook to synchronize all accounts.

How lovely, it didn't resolve the issue. All googling will tell you to run "Debug-CsAddressBookReplication", and if the not-indexed objects and abandoned objects are 0 then.....
You guessed it! MS favorite quote “You can safely ignore it.”
Better yet MS fix this already!

Jan 2018 Update

I have no clue if this is still a problem; I generally don't look in Event Viewer unless something's broken. However, if you are running Zenoss, PRTG or some other monitoring software that ties into a Windows server's event logs, you'd have to filter it there.

Dealing with Event ID 7000

This event could really be due to a couple things, mostly dependencies.
I came across this error while checking my workstation's event logs. I noticed these events coming from what should be Trend AV. As Trend is the active AV in use I was concerned about it; however, the active AV session was OK and showing green across the board.

Entering the exact info from the event into google turned up a nice forum topic about it. I had already done my due diligence by checking local services with an admin account, using both the "sc query" command and "Get-WmiObject win32_service | format-table displayname,name,startname".
This was enough to show me that tmcomm was not an active service installed on my system. Lucky for me, the user on this forum was experiencing a similar issue.
This led me to believe these were old services left behind by a previous version of Trend AV.
Following the advice there, I removed the service key from the registry. Browse to HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Services; under the Services key there are many subkeys (keys look like folders). Find the one named TmComm and delete it.
Once I had removed the key and rebooted I had a clean event log!

So those are the basic steps: check the log, see the event, and verify dependencies are starting. If the service name cannot be found using the commands listed above, then check the registry under HKLM\SYSTEM\CurrentControlSet\Services and remove the key for the listed service.
Hope this helps someone else experiencing Event ID 7000 in their eventlogs!

Jan 2018 update

Well done.