Starting With the Basics



image credit: CowGummy

The first step in mastering anything is to get a firm grasp of what the thing is. For this reason, I felt it would be fitting to begin this blog, which primarily deals with data, with a few definitions related to data.

Data

What is this thing we call data? We deal with it every day, but how do we define it? Wikipedia defines data as follows:

The term data means groups of information that represent the qualitative or quantitative attributes of a variable or set of variables. Data (plural of “datum”, which is seldom used) are typically the results of measurements and can be the basis of graphs, images, or observations of a set of variables. Data are often viewed as the lowest level of abstraction from which information and knowledge are derived.

For me, this definition is a good start, but I crave something simpler for a working definition. Data are facts about things that can be observed or measured, recorded outside of the context where they naturally occur. In other words, data are facts without context. Examples of data would be 120 pounds, $33.57, blue, or the answer to the ultimate question of life, the universe, and everything when the ultimate question is unknown (if it is even knowable at all)!

One of the things I do like about the Wikipedia definition is that it states data are the basis from which information and knowledge are derived. This leads to another question. What is information?

Information

If data are facts taken out of context, and information is derived from data, then a definition of information is: data placed into context. But wait, the facts were already in context before we messed with them in the first place! Aren’t we just going in circles? Not really. By taking the facts out of context and then putting them into context we can assemble the data in many different ways to reveal relationships and interactions that may not have been visible in the original context.
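To make the distinction concrete, here is a minimal Python sketch. It is only an illustration under stated assumptions: the purchase records, their field names, and the `total_by` helper are all invented for this example. It shows bare facts first, then the same kind of facts placed into context, and finally how that context lets us reassemble the data in more than one way.

```python
from collections import defaultdict

# Data: facts recorded without context (values taken from the examples above).
data = ["120 pounds", "$33.57", "blue"]

# Information: the same kind of facts placed into context.
# Every field name and value below is hypothetical, invented for illustration.
purchases = [
    {"who": "Ann", "what": "paint", "color": "blue", "price": 33.57, "when": "2024-03-01"},
    {"who": "Bob", "what": "paint", "color": "red", "price": 12.00, "when": "2024-03-01"},
    {"who": "Ann", "what": "brush", "color": "blue", "price": 4.25, "when": "2024-03-02"},
]


def total_by(records, key):
    """Reassemble the records around one contextual field and sum the prices."""
    totals = defaultdict(float)
    for record in records:
        totals[record[key]] += record["price"]
    return {k: round(v, 2) for k, v in totals.items()}


# The same facts, assembled three different ways.
spend_by_person = total_by(purchases, "who")
spend_by_color = total_by(purchases, "color")
spend_by_day = total_by(purchases, "when")
```

Each grouping answers a different question (who spent the most, which color sells, which day was busiest), even though the underlying facts never change.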

Knowledge

Now that we have working definitions of data and information, what is knowledge? How is knowledge different from information? I found this concept a bit trickier, and it required a little more thought to come up with something satisfying. Information puts facts into context: when, where, how, what, why, and by whom. How does information become knowledge? Since knowledge is something you know, information has to get into someone’s head before it can become knowledge. So perhaps we need to understand what it means to know something, and then we can figure out how to structure information so it helps people achieve this state of knowing.

A friend of mine once made this statement to me about knowledge. He said, “If you really know something, you can draw a picture of it.” As I contemplated this statement, my mind went to the whiteboard in my office and the many others like it in my coworkers’ offices. Each one had scribblings and visual representations of various concepts. I saw a lot of truth in what my friend was saying, but I would still refine the definition a bit. I would replace “draw a picture” with “model”. A drawing is only one type of model. While knowledge can be visually modeled in a diagram, map, or chart, it can also be modeled by a document, a recipe, or a set of actions. A model – any type of model – serves to communicate knowledge from person to person. So, if information is to be transformed into knowledge, it must be assembled into a communicable model of something knowable.

Wisdom

If you’re fortunate enough to derive knowledge from data, the knowledge you gain is only as good as what you do with it. To me, wisdom is simply knowing and doing the right thing at the right time.

Summary

Data are facts out of context.
Information is data placed into context.
Information cannot be assembled directly into knowledge, but it can be assembled into a model to communicate something knowable.
Wisdom is knowing and doing the right thing at the right time.

Now it’s your turn. Which definition will be most useful to you in your life and work? Which definitions can be improved and how? Where did I totally miss the boat?

About Julius Campbell

Hi, I'm Julius Campbell and The Data Whisperer is my blog. I'm a software engineer who is passionate about helping people make smarter decisions by extracting the wisdom hidden in data. Career-wise, I focus on developing the data tier of enterprise applications. If you ever meet me and stand next to me long enough, you will probably hear me sing or hum a tune. I not only enjoy listening to a variety of music, but I also enjoy singing and songwriting.

  • http://twitter.com/laneydoug Doug Laney

    I hear this “data->info->knowledge->whatever” discussion often, but struggle to understand its value or application in the real world. It’s certainly an interesting framework and a fun over-a-beer argument, but what’s the practicality? Where’s the beef? I have found that using an actual information supply chain framework, e.g. acquisition->administration->application (and all the functions therein), to discuss how data flows from capture through consumption is much more helpful to organizations in planning info management activities.

  • http://www.juliuscampbell.com/datawhisperer Julius Campbell

    Hi Doug,
    I have worked with data on the back end for many years, and most of my days went like this: “Julius, I need a new feature (column, table, query, report, etc.). Can you provide this feature for me?” In most cases, if the feature was well-defined enough, I could provide it. As I progressed in my career, the complexity of the features I was able to provide increased. What I began to realize, however, was that the business value of the features I provided was not necessarily proportional to the complexity or effort I put into producing them.

    This realization disturbed me. If I’m going to spend a major portion of my life cranking out complex data operations and features, I would like to think I’m producing some sort of value. That’s what got me thinking about “why data matters” so to speak. That was ultimately the reason for this blog post.

    “What is the value of this discussion?” This discussion helps me personally to think about my job from a different perspective. It is fun for me to work on complex data features. I get to use my knowledge, skill, and creativity; that in itself is rewarding. Forcing myself to think about what happens to data after it leaves my hands gives me a language to push back against producing features that add little to no business value, even if producing them would provide me with temporary gratification. In other words, it helps me feel comfortable that I’m “doing the right thing” and not just “doing things right”.

    I agree that an information supply chain framework is a practical way for organizations to plan information management activities. However, not every organization uses or will ever use such a framework. I have never worked anywhere that used one. Meanwhile, every organization has to deal with data. I found it useful for me to develop a discipline of thinking about data from a different perspective and thought it would be helpful for other people in my situation. That was the reason for this blog post.

    Thanks for the comment!
