1.07.2010

The Concepts of Data and Information

This week, while preparing for a class in information systems, I discovered an integration of the concepts "data" and "information" in terms of Ayn Rand's epistemological framework as defined in Introduction to Objectivist Epistemology.  As a check on my thinking, I am posting my thoughts here for feedback.

I begin with Rand's definition of a concept:
A concept is a mental integration of two or more units possessing the same distinguishing characteristic(s), with their particular measurements omitted. - ITOE
In my own words, a concept represents an abstraction.  By omitting measurements except for essential characteristics, a concept economizes our thinking process by allowing us to think about a vast number of things at the same time, without having to mentally refer to perceptual concretes.  But when it comes to thinking about the world, it is necessary for the mind to selectively re-introduce measurements to concepts in order to build context around one's thoughts.  For example, if I were to say to myself "I want to buy a new chair for my living room," I have re-introduced the measurement of location (and perhaps function) to my thinking.  Any additional thoughts about this purchase would necessarily introduce more measurements.  Is the chair a recliner?  Do I want it to rock?  What color?  Do I want it overstuffed?  Leather or fabric?  Quality of construction?  Metal or wood legs?  Etc.  With these measurements re-introduced, I am prepared to evaluate real chairs for the possibility of purchase.


But what are measurements, other than facts about reality?  Data are nothing more than specific facts about some thing.  They are measurements of a real-world entity.  As Rand defines it:
Measurement is the identification of ... a quantitative relationship established by means of a standard that serves as a unit. - ITOE p. 7
Data are the result.  Data are the quantitative relationships recorded about a specific entity.

The next question I delved into was how information relates to these concepts.  Unfortunately, I could find no definition of information in a dictionary that adequately captured the essential characteristics of this concept.  The definition I offer serves as a working definition.  With that disclaimer, I define information as a set of measurements that denote a perceptual concrete of a concept or set of concepts within a specific context.  If I say "Please provide me with all relevant information," I am requesting data about a specific context.  Not just any data though.  Information is data with meaning.  How do we establish meaning?  Meaning comes from concepts.  So in essence, information represents data re-introduced to relevant concepts within a specific context.  A shorthand notation I used to help think about this was information = data + concepts.  The assumption is that the concepts define the context.  Since language serves as a set of symbols to denote concepts, a variation of the above notation could be written information = data + language.  Again, the assumption is that the language defines the context.

In computer terminology, programming languages are used to manipulate data in logically constructed ways in order to add meaning to numbers.  I believe the term programming language is very apt, in that the language is used to define the context in terms of a computer program.  This led to my last integration, where information = data + computer program.  With this integration, I believe it is possible to better understand the role of information systems and computer programs in helping us think and act.  While computers cannot replace our thoughts, they can add meaning to the vast abundance of data.  But the meaning is limited to the context established by the program itself.
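
As a rough illustration of information = data + computer program, here is a minimal Python sketch.  The names Context and to_information, the sample readings, and the cutoff values are all my own hypothetical choices for illustration, not standard terms or authoritative figures.  The point is only that the same bare numbers yield different information depending on the context the program supplies.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass(frozen=True)
class Context:
    """Hypothetical stand-in for the concepts a program supplies."""
    entity: str                    # what kind of thing was measured
    unit: str                      # the standard that serves as a unit
    judge: Callable[[float], str]  # context-specific meaning of a value


def to_information(data: List[float], context: Context) -> List[str]:
    """Re-attach context to bare measurements (information = data + program)."""
    return [
        f"{context.entity}: {value} {context.unit} ({context.judge(value)})"
        for value in data
    ]


# Raw data: measurements with no context attached.
readings = [98.6, 101.3, 99.1]

# Two illustrative contexts; the cutoffs are assumptions chosen for the example.
body_temperature = Context(
    entity="patient temperature",
    unit="degrees F",
    judge=lambda v: "fever" if v >= 100.4 else "normal",
)
fm_dial = Context(
    entity="radio station",
    unit="MHz",
    judge=lambda v: "inside FM band" if 87.5 <= v <= 108.0 else "outside FM band",
)

if __name__ == "__main__":
    for line in to_information(readings, body_temperature):
        print(line)
    for line in to_information(readings, fm_dial):
        print(line)
```

Taken alone, the list of readings says nothing.  Run through the first context it describes a patient; run through the second it describes a radio dial.  In both cases the meaning extends no further than what the program's context defines.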

So those are my thoughts.  I believe my overall approach to these concepts is sound, but I'm interested in hearing your thoughts.

3 comments:

  1. Thank you. Bravo for writing this post, as it stimulates my chewing of the concept "information." Below are my thoughts. I hope that you will find them useful. Disclaimer: I am studying philosophy; I am not an expert!

    I expand your idea by taking it one level of abstraction higher. In the spirit of your notation, I write

    information = the output of a measurement_function applied to data
    = measurement_function(data) (in computer science notation)

    In this context, measurement_function is an operator (a "verb") that extracts and/or builds new relationship(s) from data (a "noun") within a certain context. Data goes in; information goes out.

    Two comments:

    (1) Information and its corresponding function have a hierarchical order.
    More precisely, information_level_i = measurement_level_i_function(data).
    A "low level" measurement function deals with relationships at a low cognitive context, e.g. performing a histogram on the data. A "high level" measurement function deals with relationships at a higher cognitive context, e.g. performing pattern recognition on the data: is it a car or a face of a human?

    (2) The above framework applies to both humans and computers. When applied to humans, the level of measurement function that can be performed increases with age; a baby can only perform low-level measurement functions (compared to adults). When applied to computers, the level of measurement function that can be performed depends, analogously, on the hardware and software technology.

  2. Thanks Thanh! I appreciate the feedback.

    Going to one higher level of abstraction is a great idea, especially if I were ever to publish the idea. I had not considered it before, as I was primarily focusing on a sophomore-level class. But the notation and ideas could be taken to far more sophisticated levels of abstraction.

  3. I got here today via the Roundup. Just a brief comment: I'm an IT professional. As a programmer years ago, I also thought of the relationship between data and information similarly, in terms of Objectivist epistemology. Data is the percept, information is the concept, and a program is the method of turning the former into the latter, namely, logic. Not much meat there, so to speak, but I always found it helpful.

    Interesting post. Thanks.
