Jul 10

Of late, I have been looking at datacenter automation (DCA) and its impact on datacenter costs.

Various solutions exist (Emerson Aperture, NLyte and Rackwise, to name a few), all of which bring important things to the table. But what are the big 4 doing in this space? Do they have any solid solutions out there? Or are they just living with the situation?

Lack of investment in RFID and rack monitoring software has really driven up the cost of managing datacenters manually. The bottom line is that the datacenter is a piece you cannot just outsource, because it is a fundamental part of every company’s unique value chain.

The only proposed solutions are:

  1. Virtualization mgmt
  2. Workflow management – Cable mgmt, Physical and logical asset management
  3. Spare management
  4. Space, Power, Cooling mgmt
  5. On-boarding, offloading management
  6. For cloud – customer centric solution management

Unfortunately the quest for finding the right solution continues, but I am not giving up.

Dec 13

Some product/consulting companies charge up to 25K USD for integrating FM-FM/FM-PM products. Be careful with such offerings: on top of the one-time cost, they come with a recurring license fee for the gateway. BAD!! So let me save you some money by generalizing the process, using as an example the integration of two widely used NMS solutions: Tivoli Netcool [from IBM] and Nagios [open source offering]. Integration from Nagios to Netcool is simple [not sure why people pay tons of money for this] and can be done in a couple of different ways:

Overview

  1. Asynchronous uni-directional data flow [from Nagios SBI to Netcool]: In this method of integration, Netcool receives events as forwarded but does not acknowledge them back in Nagios. This is useful when operators do not use Nagios for real-time monitoring.
  2. Synchronous bi-directional data flow: An event in Nagios flows to Netcool and is confirmed back in Nagios as received by Netcool. On every update to the event in Netcool [such as a journal entry or an acknowledgement], the status is updated in Nagios.

Either option works, depending on the business/solution requirements. So without further ado:

Implementation:

  1. Asynchronous uni-directional data flow [from Nagios SBI to Netcool]

To understand the implementation, I shall divide the steps into southbound implementation and northbound implementation. Southbound implementation refers to the changes/configuration on the Nagios end, and northbound implementation refers to updates in Netcool.

Southbound updates [On Nagios]:

a) Create a script to send TCP socket messages, SNMP traps, or direct JDBC inserts to the NBI.

You can use the snmptrap command in the script. If you are not an SNMP person, a simple script that sends socket messages or does JDBC inserts into the ObjectServer works too. Test this script.

Sample SNMP script:

# Send trap
# Arguments:
# $1 = Management Station
# $2 = Community String
# $3 = host_name
# $4 = service_description (Description of the service)
# $5 = return_code (An integer that determines the state
#      of the service check, 0=OK, 1=WARNING, 2=CRITICAL,
#      3=UNKNOWN).
# $6 = plugin_output (A text string that should be used
#      as the plugin output for the service check)
#
# Sample (the empty '' argument lets snmptrap fill in the uptime itself)
/usr/bin/snmptrap -v 2c -c "$2" "$1" '' NAGIOS-NOTIFY-MIB::nSvcEvent nSvcHostname s "$3" nSvcDesc s "$4" nSvcStateID i "$5" nSvcOutput s "$6"
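If you would rather go the socket route than SNMP, the script could look roughly like the sketch below. Treat it as an illustration only: the '|' delimiter, the field order, the function names and the use of nc (netcat) are all my assumptions, not something the socket probe mandates.

```shell
#!/bin/sh
# Hypothetical sketch: send a Nagios service event as one delimited line over TCP.
# The '|' delimiter, the field order and the use of nc are assumptions.

# build_message HOST SERVICE STATE_ID OUTPUT
# Prints one pipe-delimited line for the northbound listener.
build_message() {
    # Replace any '|' or newline in the plugin output so it cannot break the framing.
    clean_output=$(printf '%s' "$4" | tr '|\n' '  ')
    printf '%s|%s|%s|%s\n' "$1" "$2" "$3" "$clean_output"
}

# send_message NBI_HOST NBI_PORT HOST SERVICE STATE_ID OUTPUT
# Pipes the formatted line to the listener, giving up after 5 seconds.
send_message() {
    build_message "$3" "$4" "$5" "$6" | nc -w 5 "$1" "$2"
}
```

Test build_message on its own first, then point send_message at whatever host and port your socket probe listens on.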

b) Define a global event handler in Nagios: The global event handler executes the script on every state change on the Nagios instance, so it communicates both the onset and the clearing of a problem. How to configure GEH: http://nagios.sourceforge.net/docs/2_0/eventhandlers.html
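As a rough sketch, the handler itself can be a thin wrapper around the trap script. In Nagios you would point global_service_event_handler at a command whose command_line passes macros such as $HOSTNAME$, $SERVICEDESC$, $SERVICESTATEID$, $SERVICESTATETYPE$ and $SERVICEOUTPUT$ as arguments. Everything else below (the argument order, the script path, the host name, the community string, and the choice to forward only HARD states) is my assumption for illustration.

```shell
#!/bin/sh
# Hypothetical global service event handler wrapper.
# Assumed argument order (set in the Nagios command definition):
#   $1 = host name, $2 = service description, $3 = state ID (0-3),
#   $4 = state type (SOFT or HARD), $5 = plugin output

# Command that forwards the event northbound. The default path is a placeholder
# for wherever you saved the trap script; it is overridable for dry runs.
FORWARD_CMD=${FORWARD_CMD:-/usr/local/nagios/libexec/send-service-trap}
NBI_HOST=${NBI_HOST:-netcool.example.com}
NBI_COMMUNITY=${NBI_COMMUNITY:-public}

handle_service_event() {
    # Skip SOFT states so checks that are still being retried do not flood the NBI.
    if [ "$4" != "HARD" ]; then
        return 0
    fi
    # Same argument order as the trap script: station, community, host, service, state, output.
    "$FORWARD_CMD" "$NBI_HOST" "$NBI_COMMUNITY" "$1" "$2" "$3" "$5"
}

# Run the handler only when Nagios actually passes arguments.
if [ "$#" -gt 0 ]; then
    handle_service_event "$@"
fi
```

Setting FORWARD_CMD=echo is a cheap way to see exactly what would be sent before wiring the handler into a live Nagios instance.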

Northbound updates [On Netcool]

If SNMP:

a) Download the Nagios MIB and compile with MIB2Rules

http://sourceforge.net/projects/nagiosplug/files/nagiosmib/

b) Update the rules file and include it in the mttrapd main ruleset

If socket:

a) Update the socket probe to parse the message based on delimiters

b) Ensure all mandatory ObjectServer fields are accounted for
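To make the parsing and field-check logic concrete, here is a small sketch. It is plain shell, not probe rules file syntax, so read it as a description of the logic you would express in the rules file. The message layout (host|service|state_id|output), the state-to-severity mapping and the choice of which fields to treat as mandatory are my assumptions; the field names are the usual alerts.status columns.

```shell
#!/bin/sh
# Illustration only: plain shell, not probe rules file syntax.
# Assumes the pipe-delimited layout host|service|state_id|output.

# nagios_to_severity STATE_ID
# Prints a severity on the 0-5 scale. The mapping is an assumption.
nagios_to_severity() {
    case "$1" in
        0) echo 0 ;;   # OK       -> Clear
        1) echo 2 ;;   # WARNING  -> Warning
        2) echo 5 ;;   # CRITICAL -> Critical
        *) echo 1 ;;   # UNKNOWN or anything else -> Indeterminate
    esac
}

# parse_event LINE
# Prints the fields that would be set on the event, or returns 1 if a
# field needed to build the event is missing.
parse_event() {
    # Split on the delimiter; the last variable absorbs the rest of the line.
    IFS='|' read -r node service state output <<EOF
$1
EOF
    if [ -z "$node" ] || [ -z "$service" ] || [ -z "$state" ]; then
        echo "rejected: missing mandatory field" >&2
        return 1
    fi
    printf 'Identifier=%s:%s Node=%s AlertGroup=%s Severity=%s Summary=%s\n' \
        "$node" "$service" "$node" "$service" "$(nagios_to_severity "$state")" "$output"
}
```

Building Identifier from host plus service means a recovery (state 0) lands on the same event key as the original problem, which is one simple way to let the clear find its fault.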

If JDBC:

a) Ensure all mandatory ObjectServer fields are accounted for

b) **CAUTION** Watch the ObjectServer profiler for IDUC consumption, as this is not really a conventional approach

DID YOU CATCH THE HEADFAKE?

Nagios and Netcool were just examples. You can integrate most FM-FM/FM-PM solutions using the aforementioned procedure; you just need to know the NBI data model, the SBI data model, the right triggers on the SBI system and the right listener on the NBI system. Made your life easy, didn’t I? So start saving your company some money now!!

In the next post, I will talk about method 2 {bidirectional data flow}. Keep visiting!!

Nov 11

For those in the Service and Network management industry who are not aware of what is going to hit us in the next 5 years, I would like to give an overview of what LTE and SAE are and then talk about the effects of these technology evolutions on our ways of working. I am a software solutions expert, not a network scientist, and had to get in touch with a lot of folks, do a lot of research and dig through a lot of books to find this data. Below are very high-level, abstract explanations of LTE and SAE networks and the purposes they serve.

LTE [Long Term Evolution] is one of the proposed 4th generation radio access network technologies, and if all goes as planned, a successful implementation will make the world wireless with much higher data rates. Recent field tests have been successful, and the investments planned in the US telecom market indicate that this is definitely going to be the future of access networks. The main node of this network will be the eNodeB, which will encompass the functional behavior of multiple nodes of our current network paradigm. Architecturally, the end goal is a flat architecture for 4G networks. From the user’s perspective, the end goal is increased data rate and quality, along with reduced cost and access anytime/anywhere.

SAE [System Architecture Evolution], on the other hand, will be the core for 4G networks. It is focused on an all-IP, flat architecture, improved data rates and reduced CAPEX/OPEX. The Evolved Packet Core [EPC], to which the eNodeB will connect, serves as the central functional unit of the core architecture.

Now, what does all of the above mean for Service and Network management as it is known today, and what will it become in the coming years of 4G networks? I will get to this in my next post. Stay tuned!

References:

http://www.radio-electronics.com/info/cellulartelecomms/lte-long-term-evolution/sae-system-architecture-evolution-network.php

“Self-configuring and self-optimizing network use cases and solutions: Release 9”; 3GPP TR 36.902; Sept, 2009

Oct 12

The history of enterprise management paradigms is very seldom given the importance it deserves. For an industry that is nearly 90 years old, should we not reflect on its level of maturity? Very seldom do we realize that we have been stuck with one protocol for over 22 years. These are some of the questions that almost never come up. So, today I looked through the archives for the NMS, Service Assurance, BSM and SQM landscapes and put together a brief history of all the aforementioned paradigms. Looking back through the archives, here are some of the key milestones:

1920: the birth of the term “Network Management”. AT&T coined the term, back when supervisors used to roll around on skates to manage the network incidents requiring attention.

1962: the first “Network Control Center” is born at AT&T.

1977: the first “Network Operations Center” is born at AT&T.

1987-88 was one of the most important periods for Network management: the birth of the SGMP and SNMP Version 1 protocols at the IETF, and the birth of ITSM at the CCTA [birth of the Service Assurance concept].

1987: the first “Topology driven NOC”

1991: the birth of “Network Management as we know it”. Monitoring/surveillance/Operations & Administration assumed by NOC.

1993: birth of SNMP V2

1997: birth of SNMP V3

2001: Telecom bubble bursts, bringing the highest emphasis on “doing more with less”/“Lean”. Birth of Business Service Management [Searching for hard citations, will update soon]

2003-05: The era of multiple mergers and acquisitions starts, in an effort to consolidate the enterprise management market

2005: 3GPP goes big with the planning for LTE, SAE etc for defining the 4G networks and industry focus on Virtualization starts

2007: Forrester puts forward a study stating ManagedObjects and BMC as leaders of BSM. Lights a fire under IBM, HP and others to put forward better offerings.

References:

http://www.corp.att.com/history/nethistory/management.html

http://140.134.26.20/wbem/eng/ch2.html

http://www.ir.bbn.com/~craig/

http://www.interesting-people.org/archives/interesting-people/200603/msg00182.html

http://www.forrester.com/rb/Research/wave%26trade;_business_service_management,_q1_2007/q/id/38931/t/2

Sep 11

Today I will make 3 improvement recommendations. Why? Because I strongly believe that, if implemented right, these improvements will create yet more successful products for the Tivoli portfolio. So here we go:

1) Make WEBTOP and ISM Web 2.0 >> “WebISM”: WebISM for BlackBerry, WebISM for iPhone, AJAX-enabled WebISM, dynamic maps, name-face associations, etc. No more JSP pages; instead, Groovy-based rich web clients that are more flexible, intuitive and highly accessible. TRADITIONAL CENTRALIZED EVENT MANAGEMENT WITH PEOPLE SITTING IN THE NOC IS GROWING OLD. NOW THE TECHNICIANS WANT TO ACKNOWLEDGE ALARMS WHILE DRIVING THE CAR WITH ONE HAND AND A LATTE IN THE OTHER. AGILE INCIDENT MANAGEMENT IS THE ONLY WAY FORWARD. PLAIN AND SIMPLE: THERE IS NO OTHER OPTION.

2) Integrate Webtop and ISM: Think about this: THERE IS NOT ONE COMPANY DOING FAULT MANAGEMENT THAT DOES NOT NEED AN HTTP/S INTERFACE AND INTERNET SERVICE CHECKS [ICMP, NTP, DHCP, SMTP etc]. And yet far more clients have WEBTOP than have ISM. This is only because of a lack of awareness of the features ISM provides, such as the DATABRIDGE, SLA profiles and wizard-driven rules.

ISM will only reach the masses when it is integrated where it fits best in the Netcool architecture.

3) May not sound good: Open the source for WebTop and ISM to a registered development community. Let’s accept the fact that products like Nagios and CACTI provide Internet service monitoring at no cost with similar reliability. So opening the source would help achieve the 1st improvement easily and at lower cost. If the community works for free, no-one pays :) Still, no-one in the industry could use the product without professional licenses.

Ahead of time? Not really! These are fundamental features of the next generation of fault management, in my humble opinion. History has shown that architectural improvements recommended by end customers prove to be the most beneficial and successful for software products. Looking at these improvements from a revenue model perspective, revenues might consolidate, but the profitability attained by the end solution will be phenomenal, in my opinion.

So shout out to the wonderful Tivoli team @ IBM to consider these suggestions with further ATAM, CBAM and CBEs.

Sep 07

For a BSM/SQM/Service Assurance solution, the initial solution architecture document is one of the most crucial artifacts. It not only details the strategic objectives of the solution but also provides a competitive analysis and an alignment with the existing capabilities of the organization. Furthermore, it provides insight into the driving requirements, architectural background and key organizational context, to ensure that the solution is built for the organization and is not something off-the-shelf rammed down its throat.

Detailed below is the template:

1 Executive Summary

CONTENTS OF THIS SECTION: This section gives an overview of the content of the rest of the report, giving key facts that management would like to know about its contents.  The executive summary should give the most important aspects of the report while omitting details and some supporting information.  Generally speaking, the summary should be no longer than 1 page and preferably as short as possible while conveying the required information.

1B  [Optional, for mature organizations] Strategic Capability Network

Analysis of how the strategy aligns with the organization’s capabilities and resources. You can safely skip this section if you already have a defined BSM strategy and a competitive analysis document detailing the value propositions that drive the business. For details refer to the patent here and my analysis with an example here.

2 Introduction

CONTENTS OF THIS SECTION: This section gives the name of the system and describes its high-level functions.  This is expanded upon by the history and stakeholders sections.

2.1 History

CONTENTS OF THIS SECTION: This section provides the historical context for the system.  It answers how the system was developed and by whom.

2.2 Stakeholders

CONTENTS OF THIS SECTION: This section provides a list of the stakeholder roles important to the system.  For each, the section lists the concerns that the stakeholder has that can be addressed by the system.

3 Architecture & Problem Background

CONTENTS OF THIS SECTION: The sub-parts of Section 3 explain the constraints that had a significant influence on the architecture.

3.1 System Overview

CONTENTS OF THIS SECTION: This section describes the general function and purpose for the system or subsystem whose architecture is described in this SAD.  Include a high-level context diagram of the system and summarize major inputs and outputs.

If you don’t know how to build an accurate context diagram, look here.

3.2 Goals and Context

CONTENTS OF THIS SECTION: This section gives the name of the system and describes the high-level functions that the BSM solution offers and, more importantly, how the solution would fit into the current value chain of the organization.

3.3 Significant Driving Requirements

CONTENTS OF THIS SECTION: This section describes behavioral and quality attribute requirements (original or derived) that shaped the software architecture. Included are any scenarios that express driving behavioral and quality attribute goals.

This section should only list the key driving requirements and not detailed requirements for the solution.

4 Competitive Landscape

CONTENTS OF THIS SECTION: This section lists and briefly describes the major competitors of the system.  Competitors are those systems that do the same thing as the system or those systems that could otherwise be used in place of the system.  It also gives a high level overview of the strengths, weaknesses, opportunities and threats of the system explained in more detail in the following sections.

4.1 Strengths

CONTENTS OF THIS SECTION: This section describes the functions that the system does well either in comparison with its competition or in absolute terms.

4.2 Weaknesses

CONTENTS OF THIS SECTION: This section describes the functions that the system does poorly in relation to its competitors or in absolute terms.  Also included could be features that competitors have but the system does not, or features that the system should have but does not given the stakeholders and high-level requirements described in the previous section.

4.3 Opportunities

CONTENTS OF THIS SECTION: This section describes what the opportunities are for the system.  Opportunities are factors external to the system (e.g., in the overall environment) such as general trends or actions of competitors that enable the system to increase its market share or usefulness to stakeholders.

4.4 Threats

CONTENTS OF THIS SECTION: This section describes the threats that the system is likely to experience.  Threats are factors external to the system such as general trends or actions of competitors that decrease the market share of the system or its usefulness to stakeholders; in the extreme case, threats might render the system obsolete.

5 Referenced Materials

CONTENTS OF THIS SECTION: This section provides citations for each reference document.  Provide enough information so that a reader of the SAD can be reasonably expected to locate the document.

6 Directory

6.1 Glossary

CONTENTS OF THIS SECTION: This section provides a list of definitions of special terms and acronyms used in the SAD. If terms are used in the SAD that are also used in a parent solution description document and the definition is different, this section explains why.

6.2 Acronym List

If you work in the telecom or finance world, you know as I do that TLAs [Three Letter Acronyms] vary annoyingly from organization to organization. So, don’t assume: take 10 minutes and add value to your BSM document.

Acknowledgements:

SEI Architecture documentation

Professor Jeff Thompson

Professor J Vayghan
