Kartic's Musings on Corporate Information and Knowledge Management

August 20, 2015

Idiot’s guide to developing IT Strategy

Filed under: SharePoint — kartickapur @ 6:26 am

All too often, CIOs fall into the trap of formulating an IT strategy merely to justify next year’s budget. The result is a weak plan that leaves IT staff constantly playing catch-up. Increasingly, business leaders, including CEOs, have started to embrace the notion that IT planning needs to go beyond simply including the IT budget in the current year’s budget process.

Here is an interesting article from Forbes to ponder: ‘GE and GM: Old companies become a new breed of software companies’. Here is an excerpt worth considering:

In a recent interview, GE’s Jeff Immelt said that every company has to be a software company. “It’s our belief that every industrial company in the coming age is also going to have to be a software and analytics company,” he told Charlie Rose. “The people that deny that digitization is going to impact every corner of the economy is going to be left behind. So we’re investing massively in that, but really around industrial assets and not around things like the consumer Internet.”

From <http://www.forbes.com/sites/joemckendrick/2015/06/22/ge-and-gm-old-companies-become-a-new-breed-of-software-companies/>

Three German car makers, for instance, just bought Nokia’s mapping business for $3.1 billion. They have realised that a car manufacturing business can only beat the competition by staying ahead of the curve in driverless car technology.

Source: http://www.businessinsider.com.au/german-automakers-buy-nokia-here-2015-8

Now that we have settled that every traditional business will have to include technology among its key growth and survival goals, let’s look at the essential attributes of a well-defined IT strategy that will ensure continued support from the CEO for technical initiatives:

  • Align back to corporate goals.
  • Prioritise strategic outcomes so that the most important business priorities are addressed first.
  • In addition to the strategy document, create a succinct one-page version that you can show to stakeholders regularly.
  • Ensure agility and flexibility to support the organisation’s success.
  • Enable sustainability, so that the organisation’s current and future needs are both met.
  • Balance future initiatives with manageability, maintainability and security.
  • Plan metrics to measure real benefits/value derived from technology.
  • Lastly, refresh regularly to incorporate changes in business focus/direction/priorities.

Serge Thorn (CIO of Architecting the Enterprise) has come up with a pyramid theory that illustrates the alignment between IT and business strategies. He demonstrates how the IT vision (which supports the company’s vision) is underpinned by three pillars: Integration, Improvement and Innovation. Overall, he breaks IT strategy down into the following sub-elements:

  • Infrastructure strategy
  • Application strategy
  • Integration strategy
  • Service strategy
  • Sourcing strategy
  • Innovation strategy

From <http://blog.opengroup.org/2011/11/22/what-does-developing-an-it-strategy-mean/>

Here is the pyramid from Serge’s blog:


From <https://opengroupblog.files.wordpress.com/2011/11/it_strategy_and_vision_pyramid.jpg>

Here is a step-by-step guide to formulating your strategy bible:

1. Review the current strategic and tactical business goals and anticipate future needs.

If a business strategy is available, start by reviewing the organisation’s strategic plan. Often, after listing these plans, the areas where technology could either enhance or enable them become apparent. This may differ by industry, but as a general rule, if a strategic plan does not exist, organisational units may have their own:

  • Growth plans
  • Performance related plans
  • Spending plans and budgets
  • Sales targets
  • Upcoming acquisitions and partnerships
  • Plans for cost reductions

In order to nut them out, interview key stakeholders, from top executives to operational staff, within the key functions of your business. Techniques can include questionnaires, surveys and SWOT analysis, but I prefer individual interviews. An interview could include pointed questions around:

  • Pain points that keep them up at night
  • Top 5 business objectives
  • Any plan to achieve these objectives
  • What issues do you need to resolve around efficiency, processes, security, personnel etc.
  • Key frustrations from previous technology initiatives
  • What would you like to see in the future
  • Key business risks and plans for managing them. Can technology play a part?
  • Any competitive pressures

In short, get a good understanding of the key drivers within your business. Once you list them, you will start to get clarity about the areas where technology can play a part. Some common themes will start to appear as well.

2. Review industry trends (IT trends for the industry the business operates in, and general IT trends).

This step is often forgotten when formulating a plan, and overlooking it could spell disaster. Examples of current IT trends include:

  • Big data analytics
  • Cloud computing
  • Mobility
  • Internet of Things (IoT)
  • 3D Printing

Gartner defines the following top ten IT trends for 2015


From <http://www.forbes.com/sites/peterhigh/2014/10/07/gartner-top-10-strategic-it-trends-for-2015/>

Once you have listed these, start analysing where one or more of them can be leveraged or aligned to deliver on the strategic outcomes listed in step 1 above.

Going further, there will be emerging trends in your own industry that you should be aware of. The car industry (in my previous example) is going through a tectonic shift. Driverless technology is fast becoming a reality, and those manufacturers who fail to include it in their strategy are setting themselves up for failure.

3. Identify the capabilities that need to be developed and their priority.

The strategic goals, combined with industry trends, will lead to a list of capabilities and key themes that need to be developed in your organisation.

4. Review your current IT capabilities, applications & systems. Benchmark your organisation on these capabilities.

This may be difficult to achieve internally without bringing in consultants, but given below are the important criteria against which to benchmark yourself in order to evaluate your current maturity level:

  • Size, scope and IT spending compared to peers.
  • IT leadership – is leadership proactive or reactive? Does the CIO have a seat on the executive decision-making group?
  • Rank the capabilities of your internal IT staff
  • Rank satisfaction of your business users
  • Is IT considered a cost of doing business or an enabler? Where does your company stand?
  • System upgrade levels
  • Vendor management capability
  • Operational capability

5. Review gaps between strategic goals and current operations and set targets. Include plans to measure success.

I suggest that you create a matrix that lists all the capabilities identified in step 3. Score each capability on its current status, where you want it to be, and by when. You can get creative with how you represent them, but a simple spreadsheet works well. The example below could be a subset of the capabilities that your strategy needs to address (generally there are about 30 to 50 capabilities in an organisation).


It is also important to come up with metrics on measuring the maturity over time.
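The scoring matrix from step 5 can also be sketched in a few lines of Python. This is only an illustration: the 1 to 5 maturity scale, the scores and the target years are assumptions I have made up for the example, and the `Capability` class and `rank_by_gap` function are hypothetical names, not part of any tool.

```python
from dataclasses import dataclass


@dataclass
class Capability:
    name: str
    current: int      # maturity today, on an assumed 1-5 scale
    target: int       # maturity you want to reach
    target_year: int  # year by which the target should be met

    @property
    def gap(self) -> int:
        """Distance between where you are and where you want to be."""
        return self.target - self.current


def rank_by_gap(capabilities):
    """Largest gap first; ties go to the earlier deadline, then to the name."""
    return sorted(capabilities, key=lambda c: (-c.gap, c.target_year, c.name))


# Hypothetical scores, for illustration only.
capabilities = [
    Capability("System integration", current=1, target=4, target_year=2016),
    Capability("Resource management", current=2, target=4, target_year=2016),
    Capability("Business intelligence", current=2, target=4, target_year=2017),
    Capability("Vendor management", current=3, target=4, target_year=2017),
]

ranked = rank_by_gap(capabilities)
for cap in ranked:
    print(f"{cap.name:<24} current={cap.current} target={cap.target} "
          f"gap={cap.gap} by {cap.target_year}")
```

Re-scoring the same list every quarter and comparing the gaps is one simple way to track maturity over time.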

6. Develop short term and long term project plan/timeline – Project Roadmap.

By now it should be clear that when it is time to allocate budgets for the next financial year, management is not dreaming up projects to grab the biggest chunk of the budget. A cohesive IT management team would collaborate and prioritise the areas of IT where capability uplift is critical to business success. In my example above, a quick look makes it apparent that system integration is the number one priority for 2016. Resource management and business intelligence could be the other two capabilities. Here is what your project list could look like in the system integration bucket:



August 10, 2015

Big Data Analytics with Hadoop – Case Study

Filed under: SharePoint — kartickapur @ 1:51 am

In my previous post I talked about Big Data at a high level and the merits of extending your BI strategy into a Big Data Analytics roadmap. I thought it would be useful to take a deep dive into a case study of the Hadoop platform to understand the concept and its capabilities better.

I found a handy example in a detailed case study done by Hortonworks for the finance sector.

What’s Hadoop?


Hadoop is an open source platform where key technology powerhouses contribute to enhance the capabilities, each bringing their unique use cases.

The following companies contribute to Hadoop open source technology:

  • Microsoft
  • SAP
  • Teradata
  • Yahoo
  • Facebook
  • Twitter
  • LinkedIn
  • Many more

Use cases and data types across industries


Given below are the use cases and data types that can be captured in the big data landscape:


Data architecture with Apache Hadoop on Windows

Reference: Hortonworks 2014, ‘Modern Data Architecture for financial services with Apache Hadoop on Windows’, The journey to a financial services data lake.

Business case

High-level business case categories:

  • maximise opportunity
  • minimise risk
  • better serve customers
  • enhance financial management
  • develop innovative new business model
  • keep pace with competition

Data sources:

  • Web and connected devices
  • Social media
  • Partners
  • CRM systems
  • Marketing and advertising databases
  • Order management systems

Challenges with the existing data warehouse and BI architecture:

  • Exponential growth – from 2.8 ZB in 2012 to an estimated 40 ZB in 2020.
  • Varied nature – incoming data can have little or no structure, or a structure that changes too frequently for reliable schema creation at the time of ingest.
  • Value at high volumes – incoming data can have little or no value as individual records or small groups of records, but with high volumes and a longer historical perspective, the data can be inspected for patterns and used for advanced analytic applications.
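To put the growth figures above in perspective, here is a back-of-the-envelope calculation of the annual growth rate they imply. It assumes smooth compound growth between the two data points, which is my simplification and not something the Hortonworks paper states; `implied_annual_growth` is a hypothetical helper name.

```python
import math


def implied_annual_growth(start, end, years):
    """Compound annual growth rate implied by growing from start to end."""
    return (end / start) ** (1 / years) - 1


# 2.8 ZB in 2012 growing to an estimated 40 ZB in 2020.
rate = implied_annual_growth(2.8, 40.0, 2020 - 2012)

# Time for the data volume to double at that rate.
doubling_years = math.log(2) / math.log(1 + rate)

print(f"Implied growth: {rate:.1%} per year")
print(f"Volume doubles roughly every {doubling_years:.1f} years")
```

On these figures the implied growth is a little under 40% a year, which means the volume doubles about every two years.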

What gaps does Hadoop fill?

Technology (high level)

  • Apache Hadoop collects and manages diverse volumes of unstructured and semi-structured data alongside traditional repositories like the enterprise data warehouse.
  • Hadoop also fulfils the vision of an enterprise-wide repository for big data, frequently known as a ‘data lake’. This provides a scalable and flexible storage system that can accept data in any format.
  • It offers an application framework that allows different types of processing workloads to interact with a common pool of stored data.

Business (high level)

  • New efficiencies: through significantly lower cost of storage and the optimisation of data processing workloads such as data transformation and integration.
  • New opportunities: through accelerated analytical applications, able to access all enterprise data in both batch and real-time modes.
  • New insights: through allowing data from traditional and emerging data sources to be retained, combined and mined in new and unforeseen ways.

Modern Architecture with Apache Hadoop Integrated into existing data systems:


New opportunity for Analytics

  • Schema on read – unlike a data warehouse, where data is transformed into a specified schema when it is loaded (schema on write), Hadoop lets users store data in its raw format. Analysts can then create a schema to suit the needs of their application or analysis at the time of use.
    • Example: combining CRM data with clickstream data (server logs from the web site, sentiment data from social media etc.). It is hard to format and structure this data at the time of entry. With Hadoop, you apply the structure and make sense of the data at read time.
  • Multi-use, multi-workload data processing – multiple access methods (batch, real-time, streaming, in-memory etc.) allow analysts to transform and view data in multiple ways (across various schemas).
    • For example, a credit issuer may choose to run an advanced fraud prevention application against incoming transactions in real time, and run a series of batch reporting and analysis processes overnight. With Hadoop, both can happen on a single cluster of shared resources and a single version of the data.
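To make schema on read concrete, here is a minimal sketch in plain Python. It does not use Hadoop or Hive at all; the sample events, the field names and the `read_with_schema` function are all hypothetical and exist only to show the idea of storing raw records untouched and applying a schema when the data is read.

```python
import json

# Raw events are stored exactly as they arrive; nothing is validated
# or reshaped at write time.
RAW_EVENTS = [
    '{"source": "crm", "customer_id": 42, "segment": "retail"}',
    '{"source": "clickstream", "customer_id": 42, "url": "/loans", "ms_on_page": 5300}',
    '{"source": "clickstream", "customer_id": 7, "url": "/cards"}',
    'corrupted line that is kept in storage but skipped at read time',
]


def read_with_schema(raw_lines, schema):
    """Apply a schema at read time.

    Keeps only the records that parse as JSON objects and contain every
    field named in the schema, casting each field to the requested type.
    """
    for line in raw_lines:
        try:
            record = json.loads(line)
        except json.JSONDecodeError:
            continue  # the raw line stays in storage; this reader ignores it
        if not isinstance(record, dict):
            continue
        if all(field in record for field in schema):
            yield {field: cast(record[field]) for field, cast in schema.items()}


# Two analyses, two schemas, one copy of the raw data.
page_views = list(read_with_schema(RAW_EVENTS, {"customer_id": int, "url": str}))
segments = list(read_with_schema(RAW_EVENTS, {"customer_id": int, "segment": str}))

print(page_views)
print(segments)
```

The point of the design is that a new question only needs a new schema dictionary, not a reload of the data.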

 New Opportunities for Data Architecture

  • Lower cost of storage – compared to SANs (high-end storage area networks), Hadoop allows the user to reduce CAPEX (capital expenditure) because it runs on commodity hardware, and because it allows users to invest in “just enough” hardware to meet immediate needs and easily expand later as needs grow.
  • Data warehouse workload optimisation – compared to a traditional Enterprise Data Warehouse (EDW), the ETL function (which is a lower-value computing workload) can be offloaded to Hadoop: data is extracted and transformed on the Hadoop cluster, and only the results are loaded into the data warehouse.
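The ETL offload pattern described above can be sketched as follows. This is a toy stand-in, not Hadoop code: the `transform` function plays the part of the job that would run on the cluster, an in-memory SQLite database plays the part of the warehouse, and the transaction rows and table name are invented for the example.

```python
import sqlite3
from collections import defaultdict

# Raw extract: date, account, amount. One row is deliberately malformed.
raw_transactions = [
    "2015-08-01,ACC-1,120.50",
    "2015-08-01,ACC-2,80.00",
    "2015-08-02,ACC-1,19.50",
    "bad,row",
]


def transform(lines):
    """Stand-in for the heavy transform step that would run on the cluster."""
    totals = defaultdict(float)
    for line in lines:
        parts = line.split(",")
        if len(parts) != 3:
            continue  # skip rows that do not have exactly three fields
        _date, account, amount = parts
        totals[account] += float(amount)
    return sorted(totals.items())


# Only the small, transformed result is loaded into the warehouse.
summary = transform(raw_transactions)

warehouse = sqlite3.connect(":memory:")
warehouse.execute(
    "CREATE TABLE account_totals (account TEXT PRIMARY KEY, total REAL)"
)
warehouse.executemany("INSERT INTO account_totals VALUES (?, ?)", summary)
warehouse.commit()

rows = warehouse.execute(
    "SELECT account, total FROM account_totals ORDER BY account"
).fetchall()
print(rows)
```

The warehouse never sees the raw rows, which is where the saving in warehouse compute and storage comes from.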

  Hadoop Enterprise Capabilities


  • Data management – store and process vast quantities of data in a scale-out storage layer.
    • HDFS provides Hadoop’s efficient scale-out storage layer. YARN enables Hadoop to serve broad enterprise use cases, allowing a wide variety of data access methods to operate on data stored in Hadoop.
  • Data access – access and interact with data in a wide variety of ways, spanning batch, interactive, streaming and real-time use cases.
    • Apache Hive offers direct data connections to Microsoft Excel and Power BI.
  • Data governance and integration – quickly and easily load data, and manage it according to policy.
    • Apache Falcon provides policy-based workflows for governance.
    • Apache Flume and Sqoop enable easy data ingestion, as do the NFS and WebHDFS interfaces to HDFS.
  • Security – address requirements for authentication, authorisation, accounting and data protection.
    • Security is provided at every layer of the Hadoop stack, from HDFS to YARN to Hive.
  • Operations – provision, manage, monitor and operate Hadoop clusters at scale.
    • Apache Ambari offers the necessary interface and APIs to provision, manage and monitor Hadoop clusters and integrate with other management console software.


Tying it back to Finance sector example


Hadoop Architecture with Microsoft Windows


Benefit Realisation in Finance Sector

  • Improving underwriting efficiency for usage-based insurance – advanced GPS and telemetry technologies have reduced the cost of capturing driving data for insurance companies issuing pay as you drive (PAYD) policies, but for one company, storing the streaming vehicle data was cost prohibitive. It is now possible for this company to retain 100% of its policy holders’ geolocation data. This has allowed the company to align premiums with empirical risk and in turn reward safer drivers.
  • Screening new account applications for risk – managing the risk of fraud by identifying patterns of fraud.
  • Achieving sub-second SLAs with a Hadoop “ticker plant” – a ticker plant collects and processes massive data streams, displaying prices for traders and feeding computerised trading systems fast enough to capture opportunities in seconds. For one customer, gigabytes of data flow in from thousands of server logs per day. This data is queried more than 30,000 times per second, and Apache HBase enables super-fast queries that meet their client SLAs.

Reference: Hortonworks 2014, ‘Modern Data Architecture for financial services with Apache Hadoop on Windows’, The journey to a financial services data lake.

Nine Vendors offering Hadoop Services


  1. Amazon Web Services (AWS): The company’s Hadoop product is named Elastic MapReduce (EMR), which AWS says uses Hadoop to offer big data management services. It is not pure open source Hadoop though; it has been modified to run specifically on AWS’s cloud.
  2. Microsoft and its partners: Windows Azure’s HDInsight product is a Hadoop as a service offering based on Hortonworks’ distribution of the platform but specifically designed to run on Azure. Microsoft has some other nifty projects too, including a production-ready feature named Polybase that allows information on SQL Server to also be searched during Hadoop queries. “Microsoft’s significant presence in the database, data warehouse, cloud, OLAP, BI, spreadsheet (PowerPivot), collaboration, and development tools markets offers an advantage when it comes to delivering a growing Hadoop stack to Microsoft customers”
  3. Cloudera: Cloudera uses open source Hadoop as the basis of its distribution, but it is not a pure open source product. When Cloudera’s customers need something that open source Hadoop doesn’t have, they build it, or they find a partner who has it.
  4. Hortonworks: Unlike Cloudera, Hortonworks sticks to the open source Hadoop code more closely than perhaps any other vendor. Hortonworks’ goal is to build up the Hadoop ecosystem and its user base, and to advance the open source code.
  5. IBM
  6. Intel
  7. MapR Technologies
  8. Pivotal Software
  9. Teradata

August 4, 2015

Building Information Architecture Bridges

Filed under: SharePoint — kartickapur @ 3:10 am

Having been involved in numerous enterprise content management (ECM) and eDRMS implementations over the last decade, if there is one thing I have noticed as being the single most important bridge between technology and a successful business outcome, it is a well-defined Information Architecture (IA).

To start with, I found this great definition on the usability.gov website in an article titled ‘Information Architecture Basics’:

Information architecture (IA) focuses on organizing, structuring, and labelling content in an effective and sustainable way.  The goal is to help users find information and complete tasks.  To do this, you need to understand how the pieces fit together to create the larger picture, how items relate to each other within the system [from ‘Information Architecture Basics’, http://www.usability.gov/what-and-why/information-architecture.html]

Any implementation of technology for managing information (whether it is on SharePoint 2013 on premises, Office 365 or any other vendor platform) cannot and must not start before thorough consideration and planning of Information Architecture principles. Most industries have well-defined standards for managing the information lifecycle (creating, storing, accessing, presenting and disposing of content). The National Archives of Australia, for example, provides information on some generic standards used in Australia and internationally: http://www.naa.gov.au/records-management/strategic-information/standards/international-standards/index.aspx. Before starting on your IA journey, familiarise yourself with your industry’s standards or partner with someone who is familiar with them.

In order to create a successful IA, it is helpful to consider the following Venn diagram, which encapsulates the ‘information ecology’ theory proposed by Rosenfeld and Morville. It describes the interdependent nature of users, content and context:

From <http://www.usability.gov/what-and-why/information-architecture.html>

Understanding basic considerations

Based on the three parameters of user, content and context, let me break down some further considerations:


  • Software boundaries
    • Database limits
    • Limits for site collections (in SharePoint, for example)
    • List view threshold limits (in SharePoint)
  • Records management requirements
    • Compliance-related records for legal purposes
    • Retention and disposal policies
      Records management is a whole different topic that could take a few articles to describe. If you are interested, this link from the National Archives of Australia provides a comprehensive guide and overview of classification tools for records management: http://www.naa.gov.au/Images/classifcation%20tools_tcm16-49550.pdf
  • Business process – almost all content is tied to a process. Map the process and you will know how to manage the content.
    • Is there a requirement to apply automated workflow to certain types of content?
    • Base IA on activities rather than organisational structure
      • Think about possible future organisational structures
      • Having said that, there will always be some information that relies on org structure, such as team administration and management
    • Consider a business classification scheme relevant to the industry (the resources sector will be completely different from finance, for instance)
  • Culture – any technology implementation that doesn’t align with corporate culture will most likely fail.


  • Search
    • Search refiners
    • Build central search
    • Build local search centre
      (Microsoft defines some technical parameters around planning content search in SharePoint here)
  • Metadata
    • Again from search perspective
    • Views
    • Business classification
    • Data flow (consider workflow)
  • Considerations around type of content or file types
    • Media content is a classic example
    • Document
    • Lists
    • Forms
    • Web content


  • Mobility
    • A mobile sales driven team would have different requirements to staff with desk jobs
  • Usability
    Remember not to overwhelm users with too many options. The last thing you want is users resorting to saving critical business information on a pen drive because they are overwhelmed with choices of where to store content. What I mean is, don’t get carried away while defining:
    • Number of containers or repositories
    • Number of content types to choose from
    • Number of metadata columns that users have to fill in (technologies like SharePoint allow certain columns to be defaulted).

 Get started

Here are a few things to consider in every phase of implementing a successful system based on strong IA principles:


  • Get your hands on industry standards if possible.
  • Get an understanding of compliance related requirements
  • Run workshops to solicit requirements from a cross-section of your organisation (finance, procurement, HR etc). This will not only help you understand high-level requirements but will also give the business a sense of ownership throughout the process.
  • Scalability requirements are handy if known – a historical view of information sprawl will, in most cases, be useful when planning for future growth.

 Planning (defining an information architecture framework)

  • Plan for infrastructure (based on scalability, disaster recovery, mobility etc.) alongside IA requirements and not in isolation. Infrastructure planning does not form part of the IA exercise, but in my experience a disconnect between the two can be disastrous.
  • Map information containers around logical information categories. Examples could be:
    • Team
    • Project
    • Records
    • External partners
    • Industry specific information with examples being:
      • Seismic information (for resources sector)
      • Insurance claims (for insurance)
  • Plan metadata
    • Define business classification scheme (BCS) based on industry standards (this is especially relevant for implementing records management). Again, refer to this link for comprehensive overview of records management related classification tools: http://www.naa.gov.au/Images/classifcation%20tools_tcm16-49550.pdf
    • Define a common vocabulary (or metadata), with examples being locations, business functions (based on the BCS), asset tags etc.
    • Think about retention and disposal schedules based on compliance requirements.
    • Define Global content types (relevant for SharePoint) – remember to keep this list small (typically less than 50 global content types for a large organisation and less than 20 for a small to medium size). If you want to know more about content types, go to office support article for detailed overview here.
  • Plan how users will navigate for information
    • Increasingly organisations are resorting to mobile app based navigation as a primary navigation mechanism
    • Traditionally speaking, navigation should not reflect organisational structure
  • Plan how users will search for information (Microsoft defines some technical parameters around planning content search in SharePoint here). As a general practice plan for some high level principles around:
    • Content sources to search (here is a Microsoft TechNet article on result sources)
    • What will be available in global search vs local search. As an example, some specific repositories, like claims in insurance, need to have a local search centre (and search page) rather than having their results appear in global search.
    • Search refiners that users may look for to narrow down search results (it helps to take a cue from car or house sale websites to visualise what I am talking about here). (Here is a Microsoft TechNet article to help you understand the technical background behind defining search schemas.)
    • Plan how search results will be ranked (here is a Microsoft TechNet article on search result ordering in SharePoint 2013)
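If search refiners are new to you, the idea can be sketched in plain Python. This is not SharePoint’s search API; the result records, the metadata columns and the `refiner_counts` and `refine` functions are hypothetical, and they only show how good metadata turns into the "narrow by" panels you see on car or house sale websites.

```python
from collections import Counter

# Hypothetical search results with hypothetical metadata columns.
results = [
    {"title": "Claim 1001", "content_type": "Claim", "location": "Perth", "year": 2014},
    {"title": "Claim 1002", "content_type": "Claim", "location": "Sydney", "year": 2015},
    {"title": "Policy wording", "content_type": "Policy", "location": "Perth", "year": 2015},
    {"title": "Claim 1003", "content_type": "Claim", "location": "Perth", "year": 2015},
]


def refiner_counts(items, fields):
    """Count how many results fall under each value of each refiner field."""
    return {field: Counter(item[field] for item in items) for field in fields}


def refine(items, **selected):
    """Keep only the results that match every selected refiner value."""
    return [
        item for item in items
        if all(item[key] == value for key, value in selected.items())
    ]


counts = refiner_counts(results, ["content_type", "location", "year"])
perth_claims = refine(results, content_type="Claim", location="Perth")

print(counts["content_type"])
print([item["title"] for item in perth_claims])
```

A refiner is only as useful as the metadata column behind it, which is why the metadata planning above has to come first.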


All I will say is for a successful implementation, follow some of these guidelines (this is from my experience):

  • Roll out iteratively rather than one big bang approach
  • Measure success and information management maturity over time
  • Slight modifications in approach can be necessary if results are not evident over time
  • Treat the project as a change management exercise rather than technology implementation

Change Management and Training

  • This should not start once project is closed, it’s a continuous process
  • Develop a framework for selecting your power users who will champion the change from within the business
  • Train the power users and get them to train the end users
  • Develop training videos or single page training modules to reduce training budget
  • Measuring information management maturity over time can be time consuming but a simple survey can be useful tool to develop ongoing strategies


  • Develop the following 4 tiered approach to avoid having to rely on staffing up big support team:
    • Tier 1: End users themselves:
      • Continuously train end users so that they are self-reliant
      • Create a self-help FAQ portal for end users with links to training manuals/videos etc
    • Tier 2: Power Users within the business: these champions should act as first point of contact if there is a requirement or an issue.
    • Tier 3: Service centre: the IT service centre needs to be trained to handle calls that do not require an extensive time commitment to resolve
    • Tier 4: Functional BA/System specialists


Although I have left it for the end, in the absence of an ongoing governance structure none of the above will be sustainable beyond six months of rollout. Governance, in a nutshell, is the process that ensures that whatever framework and policies have been defined are adhered to on a regular basis. Governance can be implemented with a combination of:

  • Ensuring content ownership within the business
  • Creating executive governance group which meets regularly (at least once a month) to discuss changes and issues
  • Process which will ensure right level of ownership and approval for actions
  • A clearly defined Responsible, Accountable, Consulted and Informed (RACI) model
  • Technology, which can enforce adherence to governance to a certain extent
  • Communication of governance principles and educating users on benefits on regular basis

Use Social to tie it all together

Technologies like Yammer provide the glue that holds together the ever-evolving, user-entered context around information. Conversations centred on information can drive user adoption and discoverability while pushing up innovation in your organisation. Check out this Yammer use case catalogue.


If you have made it this far, congratulations! Although I have blabbered on for quite some time, and this may sound contradictory, please don’t complicate things unnecessarily. Spend a lot of time on planning, but put yourself in the user’s shoes and keep it simple for that end user.
