02 November 2009

Error Installing Service Level Dashboard 2.0 - Error while Adding Site Template

Sometimes you try to install Service Level Dashboard 2.0 (SLD) on your existing OpsMgr R2 infrastructure, only to end up frustrated by constant failures without any hints. It doesn't help that the installer leaves no log files for us to troubleshoot with.

I have never had problems with the installation until today. Somehow I encountered an installation error when I tried to install SLD v2.0 on my customer's POC environment. I kept getting an error when the installer reached the "Adding Site Template" step.

The bad part is that no installation log files were created that would let me do detailed troubleshooting. Fine !

Finally found a way to get one: from a command prompt, type the following command to launch the installer with verbose logging written to install.log

msiexec /i serviceleveldashboard_x64.msi /L*V install.log

The installation still failed, but this time I had an install.log file to TROUBLESHOOT !!!

I scrolled down to the potential error and found this !

Action 16:16:59: ADDSITETEMPLATE. Adding Site Template...
MSI (s) (3C:F0) [16:16:59:113]: Executing op: CustomActionSchedule(Action=ADDSITETEMPLATE,ActionType=1025,Source=BinaryData,Target=CAQuietExec,CustomActionData="C:\Program Files\Common Files\microsoft shared\Web Server Extensions\12\BIN\STSADM.EXE" -o addtemplate -filename "C:\Program Files\Service Level Dashboard 2.0\SLD_SiteTemplate.stp" -title ServiceLevelDashboard)
MSI (s) (3C:D0) [16:16:59:118]: Invoking remote custom action. DLL: C:\Windows\Installer\MSI69E8.tmp, Entrypoint: CAQuietExec
CAQuietExec:
CAQuietExec: Access denied.


Looks like I have permission errors. (Don't we all, duh ? lol) Though the account that I used was already a Domain Administrator, I found a workaround: I used the Run As Administrator option to execute the installer, and the installation was successful and I am less stressed !
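
Scrolling through a verbose MSI log by hand gets tedious, so here is a minimal sketch in Python of how the search could be automated. It is my own illustration and not part of the installer: the function names and the list of failure phrases are assumptions, and the file helper assumes the log is UTF-16 encoded, which is what msiexec typically writes.

```python
import re

# Phrases that usually mark a failure in a verbose MSI log.
# This short list is an assumption for illustration, not an exhaustive set.
FAILURE_HINTS = ("access denied", "return value 3")

# Matches lines such as "Action 16:16:59: ADDSITETEMPLATE. Adding Site Template..."
ACTION_RE = re.compile(r"^Action \d{1,2}:\d{2}:\d{2}: (\w+)\.")


def find_failures(log_text):
    """Return (action_name, line) pairs for lines that look like failures."""
    current_action = None
    hits = []
    for line in log_text.splitlines():
        match = ACTION_RE.match(line)
        if match:
            # Remember which action was running when a failure shows up later.
            current_action = match.group(1)
            continue
        lowered = line.lower()
        if any(hint in lowered for hint in FAILURE_HINTS):
            hits.append((current_action, line.strip()))
    return hits


def find_failures_in_file(path):
    # Assumption: the log is UTF-16; change the encoding if yours differs.
    with open(path, encoding="utf-16") as handle:
        return find_failures(handle.read())


sample = """Action 16:16:59: ADDSITETEMPLATE. Adding Site Template...
MSI (s) (3C:D0) [16:16:59:118]: Invoking remote custom action. DLL: C:\\Windows\\Installer\\MSI69E8.tmp, Entrypoint: CAQuietExec
CAQuietExec:
CAQuietExec: Access denied.
"""
print(find_failures(sample))
```

Run against the excerpt above, this reports the "Access denied" line together with the ADDSITETEMPLATE action that triggered it.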

Error ID 33333: Data Access Layer rejected retry on SqlError


Got hit with this problem when I came in to look at the Operations console and suddenly no monitoring data was being collected ! Performance Data went blank ! Reports blank !


Push the Panic Button >-> Not Yet :)


As usual, the good old trusted Event Viewer will surely give you something to look into if you bump into problems with OpsMgr. Sure enough, I got hit by a series of the following errors



Apparently it's my bad after all. I forgot to turn on agent proxying for these servers, since they are DCs & Exchange Servers.

Thanks to an entry from the product team, which helped me solve my problem.

http://blogs.technet.com/smsandmom/archive/2008/03/13/opsmgr-2007-dataaccesslayer-event-id-33333-with-should-not-generate-data-about-this-managed-object.aspx

Basically, we need to turn on agent proxying for each of the DCs & Exchange Servers. To be safe, restart the Health Service (aka System Center Management) on the agents. So far, my Event Viewer looks good and, fingers crossed, I hope that my monitoring data flows in as well ... gulp

(Oh yea, just checked a minute ago, they came in ... hurray ! I just love OpsMgr)

03 October 2009

Service Manager 2010 Beta2 is available

With great excitement, I wish to share with you folks out there that Service Manager 2010 Beta 2 is available for download.

So what am I excited about in this version compared to the previous Beta 1 and the CTP2 version that I have been playing with?
  • Self Service Portal
  • Problem Management
  • Self Service Software Provisioning

Additional information is available in the product team's blog site:

http://blogs.technet.com/servicemanager/archive/2009/10/02/service-manager-beta-2-has-shipped.aspx

On that site, you can get the link that will direct you to the Microsoft Connect site, which lets you download this exciting product for your evaluation.

Now enjoy ... I know I will enjoy my time with this product as I have been for the previous versions.


30 September 2009

DPM 2010 Beta is available

It has been announced that System Center Data Protection Manager 2010 (DPM 2010) Beta is available for download.

An overview of this exciting product is available on the product site
http://www.microsoft.com/systemcenter/dataprotectionmanager/en/us/2010beta-overview.aspx

From that page, you can also download it and try it out at:
http://go.microsoft.com/?linkid=9686964

My immediate interest is to see how DPM 2010 provides backup/restore of Hyper-V R2 virtual machines that run on Cluster Shared Volumes (CSV) ...

27 September 2009

Managing Applications with Service Maps in Service Manager 2010

This TechNet webcast demonstrates how Service Manager 2010 can integrate with Operations Manager 2007 R2 to import Service Maps (such as your Messaging Application, Directory Services, Web Portal application and more) into Service Manager, and lets you extend them with information like the Service Owner, the users who use the service/application, incidents & change requests for the service, and more.

This lets you, as the IT Service Manager, view the CI relationships of the infrastructure that supports the service, assess the business impact should the IT service fail, and perform activities like root cause analysis. In summary, Service Maps give you a visualization of your IT services and let you quickly assess the business impact of a service, so that you can perform quick remediation to bring the service back up ASAP.

Enough talk ... enjoy ....

http://edge.technet.com/Media/Managing-Applications-with-Service-Maps-in-System-Center-Service-Manager-2010/

25 September 2009

SPEAK YOUR MIND & BE HEARD BY THE PRODUCT TEAM

If you want your thoughts on OpsMgr to be heard by the product team directly, now you can do it.

The team has just launched a Connect portal that lets you give feedback (good or bad) to the team and have it heard ... a luxury that only TAP partners used to have ... but which has now been extended to the general public.

You can access the portal via the following link:
https://connect.microsoft.com/opsmgr

Also the announcement of the New Connect Site portal can be seen here

18 August 2009

Storage Server 2008 Default Password

Windows Storage Server 2008 does not let the administrator provide a password during installation. Hence, when the installation is done, you get this big question running through your head ... what's the password???

Well, for those who are too lazy to read the installation guide ... like me :P

The default password for Storage Server 2008 is (case sensitive):

wSS2008!

And don't forget to change the password once you are logged in :P

12 August 2009

18 July 2009

Bridgeways announced the release of MP for R2

Bridgeways recently announced the release of 4 new Management Packs for OpsMgr R2. Focused on extending the monitoring capabilities of OpsMgr R2 to non-Windows application & database servers, the Bridgeways Management Pack offerings include
  • JBoss Application Server
  • Oracle Database Server
  • MySQL Database Server
  • Apache HTTP Web Server
And guess what, they even have an MP to monitor VMware ESX without needing to go through Virtual Center !

As I am writing this post, I actually have their MP configured to monitor a customer's Apache Web Server...

For more information on the product, drop by their product info page here
And their technical blog here

24 June 2009

Can't install WinRM. Access Denied

One of the prerequisites that we need is the WS-Management 1.1 component, so that we can manage non-Windows machines. Most of the time, we just need to download the update, install it, and voilà, it works.

Not today ... I am working at a customer site where all the servers are hardened. So when I tried to install the update ... it gave me an error ... Access Denied ...

Never did I imagine that installing an update using a Domain Administrator account would give me Access Denied.

After an hour of troubleshooting, I found the culprit. The permission on the registry key

Local Machine\Software\Microsoft\WINNT\Svchost

only gave Read rights to administrators, hence I could not register the WinRM service on the server.

After getting permission from the customer, I set the permission to Full Control and the installation of the update worked without any problem.

02 June 2009

Extract monitors,rules and object discoveries using Powershell

Wrote a PowerShell script to extract the Monitors, Rules and Object Discoveries and export the names of the objects to a text file.

The method is pretty simple: I just use a couple of Get-* cmdlets and add a filter that matches certain strings, and the script does the rest.

For example, to extract monitors for Linux

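# Output file for the monitor names
$exportpath = "C:\DUMP\allLinuxmonitors.txt"
# Find every management pack whose display name mentions Linux
$mp = Get-ManagementPack | Where-Object {$_.DisplayName -match "linux"}
foreach ($mp1 in $mp)
{
    # Write the management pack name, then the names of its monitors
    $mp1 | Format-Table DisplayName | Out-File $exportpath -Append
    Get-Monitor -ManagementPack $mp1 | Format-Table DisplayName | Out-File $exportpath -Append
}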


To extract Rules for Linux

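# Output file for the rule names
$exportpath = "C:\DUMP\allLinuxrules.txt"
# Find every management pack whose display name mentions Linux
$mp = Get-ManagementPack | Where-Object {$_.DisplayName -match "linux"}
foreach ($mp1 in $mp)
{
    # Write the management pack name, then the names of its rules
    $mp1 | Format-Table DisplayName | Out-File $exportpath -Append
    Get-Rule -ManagementPack $mp1 | Format-Table DisplayName | Out-File $exportpath -Append
}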

To extract Object Discovery for Linux
$exportpath = "C:\DUMP\allLinuxdiscovery.txt"
Get-Discovery | WHERE {$_.DisplayName -match "linux"}| Format-Table DisplayName | Out-File $exportpath -Append


Ray just threw me a challenge ! Extract that information together with the threshold, Enabled By Default, Override values etc. Will work on it if I have time :)



16 May 2009

Updates not deployed ... even after 1 hr !!!!!

Was building a lab image for my training class next week and somehow things were not as smooth as I would have liked. :(

It was supposed to be a simple Software Update deployment lab, a very routine step (yes, I have done this hundreds of times and it works all the time), but somehow my client machines were still not getting the software update after 1 hour ! (I triggered the Machine Policy and Software Updates Deployment Evaluation Cycle actions countless times.)

Well, I took a look at the client's UpdatesDeployment.log and noticed that the advertisement was not activated yet ! Chaos mode !

Then I took a look at the Deployment Management for the update and noticed that the schedule's time setting was set to UTC, which is the default.

So the lesson learned here is: unless you are great at time zone and date arithmetic, better stick with the Client Local Time option !
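
To make the arithmetic concrete, here is a small sketch in Python of the difference between the two settings. The function name and the UTC+8 client offset are my own assumptions for illustration; the point is only that a time entered as UTC lands at a different wall-clock time on the client, while a Client Local Time schedule is used as typed.

```python
from datetime import datetime, timedelta, timezone


def local_activation_time(scheduled, utc_offset_hours, scheduled_in_utc):
    """Return the client's wall-clock time at which the deployment activates."""
    client_tz = timezone(timedelta(hours=utc_offset_hours))
    if scheduled_in_utc:
        # The schedule is one instant worldwide; shift it into the client's zone.
        return scheduled.replace(tzinfo=timezone.utc).astimezone(client_tz)
    # Client local time: the typed value is used as-is on every client.
    return scheduled.replace(tzinfo=client_tz)


scheduled = datetime(2009, 5, 16, 9, 0)  # the time typed into the schedule
as_utc = local_activation_time(scheduled, 8, scheduled_in_utc=True)
as_local = local_activation_time(scheduled, 8, scheduled_in_utc=False)
print(as_utc)    # 2009-05-16 17:00:00+08:00
print(as_local)  # 2009-05-16 09:00:00+08:00
```

With a client at UTC+8, a 09:00 schedule interpreted as UTC does not activate until 17:00 local time, which is exactly the kind of delay that had me waiting.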

05 May 2009

Test Drive System Center Service Manager

Eager to try and see what the upcoming System Center Service Manager looks like?

Well then, please go and try the Hands On Lab for Service Manager

http://www.microsoftservicemanagertestdrive.com/

The scenarios in this lab will demonstrate an overview of a Microsoft System Center Service Manager installation and initial configuration, covering the following topics:
  • Installing Service Manager
  • Importing data from Active Directory, System Center Configuration Manager, and data and alerts from Operations Manager 2007 SP1 and above
  • Configuring User Roles within Service Manager
  • Manually adding users that were not imported from Active Directory
  • Creating several templates, configuring initial parameters, creating queues, lists, and groups, and then creating a management pack to save any custom objects
  • Installing Service Manager in a production environment in a scenario where Service Manager is installed on four computers
Enjoy !

04 May 2009

Generating ACS Reports is slow ... what can I do about it ?

I have a friend who set his ACS data retention to 2 years to comply with SOX standards, without an archiving solution in place at that time.

It has been a year since he started collecting ACS data, and he contacted me recently complaining that the Audit reports take ages to generate. Well, it kinda makes sense for the reporting to be slow because of the sheer volume of data that was collected.

So what's next? Well, both of us have been toying around with the ACS DB and managed to find a workaround.

Well, as I have shared in my previous posts, the data source that the Audit Reports retrieve from actually points to the AdtServer.dvAll5 view, and this view will retrieve all the data collected for the past year.

We have 2 ways to do this:

Option 1: Modify the dvAll5 View
- Go to SQL Server Management Studio and modify the dvAll5 syntax to only include the partition views that you want to see.
- Execute the SQL syntax
- Note: Although this is simple and straightforward, you need to constantly go into SQL Server Management Studio to do this, and your settings will be overwritten every time ACS creates a new partition table, because DBCreatePartition.sql rebuilds the dvAll5 view


Option 2: Modify the DBCreatePartition.sql file
- This file resides in C:\WINDOWS\system32\Security\AdtServer
- Before you start playing with this, make a backup copy of this file.
- Edit this file and look for the statement select top 42 PartitionId from dtPartition order by PartitionCloseTime desc
- Change the TOP value to a lower value. For example, if I just want to generate reports for a range of 2 weeks, I just need to modify the statement to
select top 14 PartitionId from dtPartition order by PartitionCloseTime desc
- Save the SQL file
- What we just did is have ACS include only the 14 latest partition views in the data source (which is dvAll5), which means the result set is much smaller, hence report performance improves.
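
If you would rather not hand-edit the file, the change in Option 2 can be sketched in Python as a simple text substitution. This is only an illustration under my own assumptions: the function name is made up, it works on the text of the statement rather than on the real file, and you should still keep the backup copy mentioned above.

```python
import re

# Matches "select top <N> PartitionId from dtPartition", regardless of case
# and spacing, and captures the text on either side of the number.
TOP_RE = re.compile(
    r"(select\s+top\s+)\d+(\s+PartitionId\s+from\s+dtPartition)",
    re.IGNORECASE,
)


def limit_partitions(sql_text, days):
    """Return (new_text, replacements) with the TOP value set to `days`."""
    if days < 1:
        raise ValueError("days must be at least 1")
    # A function is used for the replacement so the digit that follows the
    # first group cannot be misread as part of a group reference.
    return TOP_RE.subn(lambda m: f"{m.group(1)}{days}{m.group(2)}", sql_text)


original = "select top 42 PartitionId from dtPartition order by PartitionCloseTime desc"
updated, count = limit_partitions(original, 14)
print(count, updated)
```

Checking that the replacement count is exactly 1 before saving is a cheap way to confirm the script found the statement it was expected to change.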

What if you want to keep ACS data for more than 256 days ? - Part 3

In my previous posts, I shared a little about the data structure of the ACS database and the potential issues we might face, such as the 256-table limitation of SQL Server. That being said, many organizations have a requirement to keep their security events/logs for a certain period of time to comply with certain standards.

This is the dilemma my friend is currently facing, now that he realizes he will potentially lose the ability to provide Security Audit reports for long-term data (longer than 256 days). But after some study, we managed to find a workaround, though it requires a lot of manual work and tweaking of a couple of SQL scripts in ACS.

But before we go into details, just as a disclaimer: you had better make sure that you have properly and adequately backed up your entire OpsMgr infrastructure before you embark on this "perilous" adventure.

Well, to start with, let us recap the objective: we need to be able to generate Audit reports for up to 2 years' worth of data.
  1. So the first thing is to set the retention range for ACS from 14 days (the default) to the range that you need, and let ACS do the rest in terms of collecting data and storing it in the ACS database.
  2. To generate a report for a range older than 256 days, go to SQL Server Management Studio and identify the IDs of the partition tables that store the collected events/data for that date range from the dtPartition table, using the following SQL statement: SELECT * FROM dtPartition ORDER BY PartitionStartTime
  3. Now this is the interesting part. Once you have identified the partition table IDs, open the dvAll5 view, edit the syntax and include the partition views in it. For example, if I have identified 2 partition tables with the IDs 12345_abcde and 67890_fghij in Step 2, the modified syntax for the dvAll5 view will look something like this: ALTER VIEW [AdtServer].[dvAll5] AS SELECT * FROM dvAll5_12345_abcde UNION ALL SELECT * FROM dvAll5_67890_fghij
  4. Execute the syntax, and now the dvAll5 view, which is the data source for reporting, will retrieve the collected events/data that fall within your date range.
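
Typing the UNION ALL statement by hand gets error-prone with more than a few partitions, so here is a minimal Python sketch that builds the same ALTER VIEW text from a list of partition IDs. The function name and the ID validation are my own assumptions, and the 256 check simply mirrors the limit discussed in these posts; it only produces the SQL text and does not connect to the database.

```python
import re

# Partition IDs become part of an object name, so only accept the characters
# seen in the examples (letters, digits and underscores).
PARTITION_ID_RE = re.compile(r"^[0-9A-Za-z_]+$")

MAX_PARTITIONS = 256  # the limit described in these posts


def build_dvall5_view(partition_ids):
    """Return an ALTER VIEW statement that unions the given partition views."""
    if not partition_ids:
        raise ValueError("at least one partition ID is required")
    if len(partition_ids) > MAX_PARTITIONS:
        raise ValueError("too many partitions for a single view")
    for pid in partition_ids:
        if not PARTITION_ID_RE.match(pid):
            raise ValueError(f"unexpected partition ID: {pid!r}")
    selects = [f"SELECT * FROM dvAll5_{pid}" for pid in partition_ids]
    return "ALTER VIEW [AdtServer].[dvAll5] AS\n" + "\nUNION ALL\n".join(selects)


statement = build_dvall5_view(["12345_abcde", "67890_fghij"])
print(statement)
```

The output can be pasted into SQL Server Management Studio and reviewed before you execute it.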
Of course, the steps above are rather tedious for some and quite complex for others.
Well, guess what? There is a tool out there that can do everything I mentioned above easily and reliably.

The solution that can cure an administrator's headache over managing long-term ACS data is the Audit Collection Archiver, brought to you by Secure Vantage.

This solution will archive your ACS data into a series of compressed binary files and provide you the ability to generate reports on the archived data.

If you would like to know more about this cool solution, have a look at this link
http://www.securevantage.com/Products/AuditCollectionArchiver/Default.aspx

02 May 2009

What if you want to keep ACS data for more than 256 days ? - Part 2

In the last post, I shared with you the story of a friend of mine who needs to keep his ACS data for 2 years, only to be faced with an issue when the collected data reached the 256th day and his ACS Collector service stopped.

Well, we eventually installed the hotfix available for KB 954948 from http://support.microsoft.com/kb/954948

Now, in order to know what causes this problem, we need to study how the ACS database structure works.

As many of us already know, what ACS does is collect security events from monitored servers and store them in the ACS database. The collected data is stored in what we call a partition table, a table created daily by ACS to store the event data collected on that particular day. To give you a clearer view of what I mean, take the following as an example.

I would like to ensure that the events collected on 17th April 2009 were successfully stored in my ACS DB, so how do I view that data? As I shared earlier, all the collected events are stored in partition tables. To identify which partition table was created to store the events for 17th April 2009, we query the dtPartition table in the ACS database to list all the partition tables, sorted by date. This can be achieved using the following SQL statement:

SELECT * FROM dtPartition ORDER BY PartitionStartTime


The returned result set lists all the partition tables and which one stores the data for which date.
In my case (as depicted in the following screenshot), I already have 258 partition tables created for the events that were collected over the past 258 days. I can also see the partition table that stores the events collected on 17th April 2009 by referring to the PartitionId column, which is 5705ab39_f297_4555_ad04_57f72603b941, with PartitionStartTime and PartitionCloseTime as the date indicators.





Once we have the partition ID identified, we should validate that the data stored was indeed collected on 17th April 2009. We will now select from the view that was created for the partition table, and to do that simply use the following syntax:

SELECT * FROM dbo.dvALL5_(your partition table id)

In our scenario the SQL syntax will look like this,
SELECT * FROM dbo.dvALL5_5705ab39_f297_4555_ad04_57f72603b941

We should be able to see the event data that was collected. In my screenshot, we went a bit further and verified that the events in the partition table are indeed those collected on 17th April 2009


Now that we understand a bit more about how the ACS database structure works, the next step is to understand how reports are retrieved. Well, to start with, ACS reports are retrieved from a Data Source which refers to the dvAll5 view. The interesting part about this view is that it is dynamically built to retrieve from all the views of the partition tables.

Yes, this means that if you have 20 days of events collected, you generally have 20 partition tables, which also means that you have 20 views created, and to top it all, the dvAll5 view will select from all 20 partition views ! I personally have some reservations about this approach, as it can lead to performance issues, especially if your data size is huge and your retention period is exceptionally long. (Guess this is why the default retention period is only 14 days)

So how is dvAll5 created then? Well, after studying the SQL syntax that builds the view, we found that it actually does a UNION on all the partition views ... and it looks something like this ~gulp~

create view [AdtServer].[dvAll5] as select * from dvAll5_8fe6b4cb_aab5_43ef_b3a7_4ae601e0be53
UNION all select * from dvAll5_0ff135f5_b14f_474c_a94f_c5a8249b3b82 UNION all select * from dvAll5_76c46145_961f_492e_a015_fad4c0c41046 UNION all select * from dvAll5_d9d39a78_66d1_48ee_8da3_c4685c6d9c59 UNION all select * from dvAll5_add3b5d7_48ee_4cce_a12e_8fa6f96577da UNION all select * from dvAll5_66975973_55ee_4db2_9c45_f655a2c03779 UNION ....... all select * from dvAll5_72e2f5be_bbdd_49af_bc79_8efba125bf78

Yes, if you have 20 partition tables, this means your dvAll5 view will really include all of them.

Now this is when things get interesting. Remember that my friend needs a data retention of 2 years ! ... and how his ACS died after the 256th day ?

The reason is that SQL Server has a limitation: you cannot SELECT from more than 256 tables in ONE view, hence the error that he faced.

After we installed the patch, ACS became a bit cleverer. If it detects that it has more than 256 partition tables, dvAll5 will only retrieve from the latest 256 tables, hence the earlier tables are left out of reporting.
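
To picture what this behaviour means for reporting, here is a small Python sketch of the "latest 256 only" selection. It is my own model of the behaviour as described, not the hotfix's actual code, and the partition names and dates are made up for illustration.

```python
from datetime import date, timedelta

MAX_TABLES = 256  # the limit described above


def reportable_partitions(partitions, limit=MAX_TABLES):
    """Split (partition_id, close_date) pairs into (kept, dropped) for reporting."""
    newest_first = sorted(partitions, key=lambda p: p[1], reverse=True)
    return newest_first[:limit], newest_first[limit:]


# Hypothetical data: one partition per day for 258 days.
start = date(2008, 8, 1)
partitions = [(f"p{n:03d}", start + timedelta(days=n)) for n in range(258)]

kept, dropped = reportable_partitions(partitions)
print(len(kept), len(dropped))       # 256 2
print([pid for pid, _ in dropped])   # ['p001', 'p000']
```

With 258 daily partitions, the two oldest days fall out of the view, and one more day drops off with every new partition.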

Now for my friend, who is the administrator and needs to ensure that ACS can report on data going back 2 years ... that simply freaked him out at first ! But after we studied the structure carefully, and with some creative thinking, we managed to find a workaround for this ... which I will share in my next post.


26 April 2009

What if you want to keep ACS data for more than 256 days ? - Part I

I have a good friend who has configured his OpsMgr settings (last year) to keep his collected events in ACS for 2 years because his company's policy requires them to save all security events data for 2 years. And so the story goes ...

Last week, he contacted me about a couple of errors that he encountered on his ACS Collector. Apparently he kept getting the following error in his Event Viewer, and eventually his ACS Collector service stopped and could not be started.

Event Type: Error
Event Source: AdtServer
Event Category: None
Event ID: 4618
Date: 04/12/2009
Time: 2:00:34 AM
User: N/A
Computer: SCOMAPP1
Description:
Error occured on database connection:
Status: 0x04080000
ODBC Error: 106
ODBC State: 42000
Message: [Microsoft][ODBC SQL Server Driver][SQL Server]Too many table names in the query. The maximum allowable is 256.
Database: SqlWriter
Connection: Maintenance
Statement:


We managed to find a hotfix to resolve this issue in the following link:
http://support.microsoft.com/kb/954948

and have the patch installed on the ACS Collector.

But then again if you read the documentation for the hotfix properly, it states that the system will only retain data of the latest 256 partitions/days of events. My friend needs to be able to pull reports for data of 2 years.

In my next blog post, I will share with you how we opened and studied the ACS database and eventually found a way, albeit a messy and cumbersome one, to achieve that. At the same time, I will also share with you a product out there which can resolve this problem without you getting your hands dirty. Stay tuned ...

03 April 2009

Installing of SCCM agent failed: WMI repository error

It's been a while since my last post. I have been extremely tied up with work and community events of late. Anyway, I have something interesting to share with folks out there who have difficulties deploying their SCCM agents.

Was helping my colleague deploy SCCM for a customer when we noticed that a couple of machines had problems when we tried to push the agents. A check of the log files showed that we had some problems with the WMI repository:

- Failed to open to WMI namespace '\\.\root\ccm' (80041014) CcmExec 8/30/2007 2:25:09 PM 2148 (0x0864)
- CCMDoCertificateMaintenance failed (0x80041014). CcmExec 8/30/2007 2:25:09 PM 2148 (0x0864)
- Phase 0 initialization failed (0x80041014). CcmExec 8/30/2007 2:25:09 PM 2148 (0x0864)
- Failed to connect to CCM namespace CcmExec 8/30/2007 8:13:56 AM 2160 (0x0870)

Managed to find the solution from http://www.myitforum.com/forums/m_165955/tm.htm

What needs to be done is:
- On the client machines, shut down the WMI service
- Go to C:\Windows\System32\WBEM and rename the repository folder to something like "oldrepository". Basically, what we do here is force WMI to rebuild its repository.
- Restart the WMI service and you should see a new repository being generated

27 February 2009

Jalasoft Xian Network Manager Io SP2 has RTMed !

Jalasoft Xian Network Manager Io SP2, a partner add-on which extends OpsMgr monitoring capabilities to non-Windows environments such as Linux, UNIX, network devices and even VMware Virtual Center, has RTMed.

What's new in SP2 ?
There are several improvements in Xian Io SP2. The ones that have had the most impact on functionality and performance are detailed below.
Improved Communication between Xian Io and Management Servers.
The Xian Data Server no longer uses the SDK to send counters and events to the Root Management Server; it can now communicate with secondary Management Servers using the Xian Connector Module. This feature increases the monitoring capacity of Xian while at the same time improving stability.
Improved Management Packs
Xian’s ‘Smart Management Packs’ have been redesigned to take up less memory and cause less congestion allowing those resources to be used to enhance Xian’s monitoring capacity.
Multiple Xian Environments
To increase the number of devices you can monitor with Xian you can add an independent Xian installation to your OpsMgr Management Group. The amount of additional devices you can monitor with a second installation of Xian depends on hardware configuration, number of active rules and rule execution intervals among others. You can do this as many times as necessary.
Automatic Health Reset
When a rule sends an alert to Operations Manager, it remains open until a success event is sent to close the alert. With Automatic Health Reset, once the rule is removed and is no longer monitoring that node, Xian sends a success event so that the remaining alerts in Operations Manager disappear.
Improved SNMP engine
In Xian Io SP1 the SNMP engine could, on rare occasions, send wrong counters. Through the use of a new and more robust SNMP module, this is no longer possible.
Enhanced Monitoring Capabilities
The number of devices a single installation of Xian Io SP2 can monitor has been increased to approximately 1000.
More reports

To give you more ways to analyze your performance counters, Xian Io now has even more reports. Every Management Pack comes with over 100 built-in reports!

Available Smart Management Packs
The Jalasoft Smart Management Packs provide the intelligence to organize, process, and analyze the information on monitored devices into the views, computer groups, and reporting capabilities of Ops Mgr 2007 using the Xian Connector for Ops Mgr 2007.
  • Cisco Switches
  • Cisco VPN Concentrators
  • Cisco Routers
  • Cisco PIX/ASA
  • Cisco Wireless
  • HP Procurve Switches
  • F5 Big Ip
  • Linux Servers
  • Solaris Servers
  • IBM AIX Server
  • VMWare Virtual Center
  • Linux MySQL
  • APC UPS
  • Generic Network Device
  • Availability (ICMP only)
For more information, take a look at:

http://www.jalasoft.com/Web/Product/Product.aspx?id=44

21 February 2009

Configuring Email Notification channel in Service Manager

Extremely straightforward and simple.


  • Log on to your Service Manager console
  • Go to the Administration space and expand the Administration -> Notification node
  • Click on the Channels node
  • In the middle pane, select Email Notification Channel and click the Edit link in the right pane
  • This will launch the Email Notification Settings window
  • Tick the Enable e-mail notifications checkbox. Click the Add button to launch the Add SMTP Server window
  • Enter your mail server settings and click OK.
  • Enter the Return Address and click OK again to save your settings. You have now added your email notification channel.