The Data-Information Hierarchy, Part 3

August 31, 2011

The Data-Information Hierarchy is frequently represented as
Data –> Information –> Knowledge –> Understanding –> Wisdom.

Or it is sometimes shortened to four steps, omitting Understanding. In fact, there are two predecessor steps: chaos and symbol. These concepts have been discussed in prior posts (https://profreynolds.wordpress.com/2011/01/31/the-data-information-hierarcy/ and
https://profreynolds.wordpress.com/2011/02/11/the-data-information-hierarcy-part-2/).

Chaos is the state of lacking understanding, best compared to a baby first perceiving the world around it. There is no comprehension of quantities or values, only a perception of large and small.

Symbol (or symbolic representation) represents the first stages of quantification. As such, symbolic representation and quantification concepts form the predecessor to Data.

So the expanded Data-Information hierarchy is represented in the seven steps:

Chaos –>
          Symbol –>
                    Data –>
                              Information –>
                                        Knowledge –>
                                                  Understanding –>
                                                            Wisdom

Continuing with this Data-Hierarchy paradigm, we can represent the five primary steps with the simple explanation:

  • Data and Information : ‘Know What’
  • Knowledge : ‘Know How’
  • Understanding : ‘Know Why’
  • Wisdom : ‘Use It’

The Big Crew Change

May 17, 2011

“The Big Crew Change” is an approaching event within the oil and gas industry, when the mantle of leadership will move from the “calculators and memos” generation to the “connected and Skype” generation. In a blog post four years ago, Rembrandt observed:

“The retirement of the workforce in the industry is normally referred to as “the big crew change”. People in this sector normally retire at the age of 55. Since the average age of an employee working at a major oil company or service company is 46 to 49 years old, there will be a huge change in personnel in the coming ten years, hence the “big crew change”. This age distribution is a result of the oil crises in ‘70s and ‘80s as shown in chart 1 & 2 below. The rising oil price led to a significant increase in the inflow of petroleum geology students which waned as prices decreased.”

Furthermore, a Society of Petroleum Engineers study found:

“There are insufficient personnel or ‘mid-careers’ between 30 and 45 with the experience to make autonomous decisions on critical projects across the key areas of our business: exploration, development and production. This fact slows the potential for a safe increase in production considerably.”

A study undertaken by Texas Tech University makes several points about the state of education and the employability of graduates during this crew change:

  • Employment levels at historic lows
  • 50% of current workers will retire in 6 years
  • Job prospects: ~100% placement for the past 12 years
  • Salaries: Highest major in engineering for new hires

The big challenge: Knowledge Harvesting. “The loss of experienced personnel combined with the influx of young employees is creating unprecedented knowledge retention and transfer problems that threaten companies’ capabilities for operational excellence, growth, and innovation.” (Case Study: Knowledge Harvesting During the Big Crew Change).

In a blog by Otto Plowman, “Retaining knowledge through the Big Crew Change”, we see that

“Finding a way to capture the knowledge of experienced employees is critical, to prevent “terminal leakage” of insight into decisions about operational processes, best practices, and so on. Using of optimization technology is one way that producers can capture and apply this knowledge.”

When the retiring workforce fails to convey the critical lessons learned, the gap is filled by data warehouses, knowledge systems, adaptive intelligence, and innovation. Perhaps the biggest challenge is innovation, which will drive the industry through the next several years. Proactive intelligence, coupled with terabyte upon terabyte of data, will form its basis.

The future: the nerds will take over from the wildcatters.

Multi-Nodal, Multi-Variable, Spatio-Temporal Datasets

April 21, 2011

Multi-Nodal, Multi-Variable, Spatio-Temporal Datasets are large-scale datasets encountered in real-world data-intensive environments.

Example Dataset #1

A basic example would be the heat distribution within a chimney at a factory. Heat sensors are distributed throughout the chimney and readings are taken at periodic intervals. Since the laws of thermodynamics within a chimney are well understood, the interaction between the monitoring devices can be modeled. Predictive analysis could, conceivably, be performed on the dataset, and chimney cracks could be detected, or even predicted, in real time.

In this scenario, data points consist of 1) multiple sensors or data acquisition devices, 2) multiple spatial locations, and 3) temporally separated samples. When a sensor fails, it is simply removed from the processing and kept out until it is repaired (during plant maintenance).
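A minimal sketch of how such data points might be represented, with failed sensors left out of the processing. The `Reading` fields, the sensor names, and the per-time averaging are all assumptions made for illustration; the post does not prescribe a layout or an analysis.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass(frozen=True)
class Reading:
    sensor_id: str   # which device took the sample
    height_m: float  # spatial location (hypothetical 1-D position in the chimney)
    time_s: float    # sample time in seconds
    temp_c: float    # measured temperature


def mean_temp_by_time(readings, failed_sensors):
    """Average temperature per sample time, skipping sensors marked as failed."""
    by_time = {}
    for r in readings:
        if r.sensor_id in failed_sensors:
            continue  # a failed sensor stays out until it is repaired
        by_time.setdefault(r.time_s, []).append(r.temp_c)
    return {t: mean(vals) for t, vals in sorted(by_time.items())}


readings = [
    Reading("s1", 5.0, 0.0, 210.0),
    Reading("s2", 15.0, 0.0, 190.0),
    Reading("s3", 25.0, 0.0, 999.0),  # s3 has failed and reports garbage
    Reading("s1", 5.0, 60.0, 212.0),
    Reading("s2", 15.0, 60.0, 192.0),
]
result = mean_temp_by_time(readings, failed_sensors={"s3"})
print(result)
```

Each reading carries its sensor, location, and time, so the three dimensions stay explicit and a sensor can be dropped or restored by changing only the `failed_sensors` set.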

Example Dataset #2

A second example would be the interconnected river and lake levels within a single geographic area. Distinct monitoring points are located at specific geo-spatial locations, and these points are related through interconnected transfer functions and models. Each of the monitoring points consists of multiple data acquisitions, and each data acquisition is sampled at random (or predetermined) intervals.

As a result, data points consist of 1) multiple sensors, 2) multiple spatial locations, and 3) temporally separated samples. In this scenario, sensors may fail, or go temporarily offline, in a random, unpredictable manner. Sensors must be taken out of the processing until data validity returns. Due to the interconnectedness of the sensor locations, and the interrelationships between the sensors, sufficient redundant data could be present to permit suitable analytical processing even when some data is missing.
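As a rough illustration of that redundancy, the sketch below estimates the level at an offline station from the stations linked to it. The linear `(gain, offset)` transfer model, the station names, and the numbers are hypothetical; real hydrological transfer functions would be considerably more involved.

```python
def estimate_level(target, levels, links):
    """Return the measured level at `target`, or an estimate from linked stations.

    levels: dict of station -> latest level in metres, or None if offline
    links:  dict of (source, target) -> (gain, offset), an assumed linear model
            target_level ~= gain * source_level + offset
    """
    if levels.get(target) is not None:
        return levels[target]  # valid measurement, no estimate needed
    estimates = [
        gain * levels[src] + offset
        for (src, dst), (gain, offset) in links.items()
        if dst == target and levels.get(src) is not None
    ]
    if not estimates:
        return None  # no redundant data available
    return sum(estimates) / len(estimates)


levels = {"lake": 10.0, "river_up": 4.0, "river_down": None}  # river_down is offline
links = {
    ("lake", "river_down"): (0.5, -1.0),
    ("river_up", "river_down"): (1.0, 0.5),
}
print(estimate_level("river_down", levels, links))
```

Averaging the per-link estimates is the simplest possible combination; weighting each link by how well its model has fit historically would be a natural refinement.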

Example Dataset #3

The most complex example could be aerial chemical contamination sampling. In this scenario, the chemical distribution is continuously changing as a result of understood, but not fully predictable, weather behavior. Sampling devices would consist of 1) airborne sampling devices (balloons) providing specific, limited sample sets, 2) ground-based mobile sampling units (trucks) providing extensive sample sets, and 3) fixed-base (pole-mounted) sampling units whose data is downloaded at relatively long intervals (hours or days).

In this scenario, multiple non-uniform sampling devices sit at non-uniformly spaced (and sometimes moving) positions, with data collection performed in a fully asynchronous fashion. This data cannot be stored in flat-table structures, and the storage structure must retain enough relevant information to fill in the gaps in the data.
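One way to picture a non-flat structure is a per-device time series in which every sample carries its own position and its own set of measured values. The sketch below is only an assumption about what such a layout could look like; the class names, the analyte names, and the choice of linear interpolation in time are illustrative, and the interpolation ignores the fact that a mobile unit moves between samples.

```python
from bisect import bisect_left
from dataclasses import dataclass, field


@dataclass
class Sample:
    time_s: float
    lat: float
    lon: float
    values: dict  # analyte name -> concentration; keys may differ per sample


@dataclass
class Source:
    kind: str  # "balloon", "truck", or "pole"
    samples: list = field(default_factory=list)  # kept sorted by time

    def add(self, sample):
        """Insert a sample in time order; samples may arrive out of order."""
        times = [s.time_s for s in self.samples]
        self.samples.insert(bisect_left(times, sample.time_s), sample)

    def value_at(self, analyte, t):
        """Interpolate `analyte` at time t between the samples that measured it."""
        pts = [(s.time_s, s.values[analyte]) for s in self.samples if analyte in s.values]
        times = [p[0] for p in pts]
        i = bisect_left(times, t)
        if i < len(pts) and times[i] == t:
            return pts[i][1]  # exact sample exists
        if i == 0 or i == len(pts):
            return None  # t is outside the sampled range; no extrapolation
        (t0, v0), (t1, v1) = pts[i - 1], pts[i]
        return v0 + (v1 - v0) * (t - t0) / (t1 - t0)


truck = Source("truck")
truck.add(Sample(100.0, 33.58, -101.85, {"so2": 4.0, "no2": 1.0}))
truck.add(Sample(0.0, 33.57, -101.86, {"so2": 2.0}))  # arrives late, measured less
print(truck.value_at("so2", 50.0))
print(truck.value_at("no2", 50.0))
```

Because each sample keeps its own value dictionary, a balloon that measures one analyte and a truck that measures many can share the same structure without padding a flat table with empty columns.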

