A data discussion


What is data?

Put to the universe, compliments of Google:  

Data - Wikipedia

https://en.wikipedia.org/wiki/Data
Data is a set of values of qualitative or quantitative variables. An example of qualitative data would be an anthropologist's handwritten notes about her interviews ...

Data | Definition of Data by Merriam-Webster

www.merriam-webster.com/dictionary/data
noun plural but singular or plural in construction, often attributive da·ta \ˈdā-tə, ˈda- also ˈdä-\. Simple Definition of data. : facts or information used usually to ...



Data (/ˈdeɪtə/ day-tə, /ˈdætə/ da-tə, or /ˈdɑːtə/ dah-tə)[1] is a set of values of qualitative or quantitative variables. An example of qualitative data would be an anthropologist's handwritten notes about her interviews with people of an Indigenous tribe. Pieces of data are individual pieces of information. While the concept of data is commonly associated with scientific research, data is collected by a huge range of organizations and institutions, ranging from businesses (e.g., sales data, revenue, profits, stock price), governments (e.g., crime rates, unemployment rates, literacy rates) and non-governmental organizations (e.g., censuses of the number of homeless people by non-profit organizations).
Data is measured, collected and reported, and analyzed, whereupon it can be visualized using graphs, images or other analysis tools. Data as a general concept refers to the fact that some existing information or knowledge is represented or coded in some form suitable for better usage or processing.
Raw data ("unprocessed data") is a collection of numbers or characters before it has been "cleaned" and corrected by researchers. Raw data needs to be corrected to remove outliers or obvious instrument or data entry errors (e.g., a thermometer reading from an outdoor Arctic location recording a tropical temperature). Data processing commonly occurs by stages, and the "processed data" from one stage may be considered the "raw data" of the next stage. Field data is raw data that is collected in an uncontrolled "in situ" environment. Experimental data is data that is generated within the context of a scientific investigation by observation and recording.
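
For the hands-on types, here is a minimal Python sketch of that "cleaning" step, using the Arctic thermometer example from the passage above. The temperature bounds, the sample readings and the function name are all my own assumptions for illustration; none of them come from the Wikipedia text.

```python
# Hypothetical plausibility bounds for an outdoor Arctic station, in Celsius.
# These numbers are assumptions for illustration only.
LOW_C = -60.0
HIGH_C = 25.0


def clean_readings(raw_readings, low=LOW_C, high=HIGH_C):
    """Split raw readings into plausible values and rejected ones."""
    cleaned = []
    rejected = []
    for value in raw_readings:
        if low <= value <= high:
            cleaned.append(value)
        else:
            # Outside the plausible range: treat as an instrument or entry error.
            rejected.append(value)
    return cleaned, rejected


# Stage 1: raw data in, processed data out.
raw = [-31.5, -28.0, 35.2, -30.1]  # made-up readings; 35.2 is the "tropical" one
cleaned, rejected = clean_readings(raw)

# Stage 2: the "processed data" from stage 1 is the "raw data" of this stage.
average = sum(cleaned) / len(cleaned)
```

A fixed range check is about the simplest rule there is. Real researchers would likely lean on something statistical, but the idea of stages feeding stages stays the same.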

Information

Huh? Trust the technical marketers and sales types to overcomplicate a very simple concept.  The image sourced from Wikipedia informs us that the following are data:


  • Location (geographical)
  • Cultural
  • Scientific
  • Financial
  • Statistical
  • Meteorological
  • Natural
  • Transport

Sure sounds like information.  But it must mean that there is information within the pods of information.  It leads one to think that if there are lots and lots and lots of information, it all gets jumbled up and becomes data.



The metaphor that appears in my brain is my clothes hanging in my closet, all ready to be worn.  When I'm at my most organized, they are color coded.  Then you wear said outfit, and when you get home you change into chill-wear, where you feel most relaxed.  Those discarded clothes are tossed into your hamper and, as the week progresses, they pile up.  They become laundry.

Laundry is like data

Eventually you get around to doing laundry.  (Lucky you, show-off, if you have someone else to do it for you.  Or grow up, if your mom still does your laundry.)

Laundry is essentially clothes all lumped together

The best way to tackle the laundry is to separate it into piles based on color (whites are orphans) and sometimes material (set aside because it is "dry clean only").  BUT within that mass of clothing that becomes a chore called laundry, there are also underwear, t-shirts, jeans, socks, etc.

Everyone does laundry in their own way, possibly influenced by the laundry doer in their life.  It is like a classification system:

  • what gets dry cleaned
  • what is delicate
  • whites stay with whites
  • jeans
  • special instructions:  cold water, etc.
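
If you like seeing things spelled out, the list above can be sketched as a tiny sorting routine in Python. Everything here is made up for illustration: the pile names, the item fields and the order in which the rules are checked are my assumptions, and your own laundry doer may well rank them differently.

```python
from collections import defaultdict


def sort_laundry(heap):
    """Group a heap of clothing into named piles; the first matching rule wins."""
    piles = defaultdict(list)
    for item in heap:
        if item.get("dry_clean"):
            pile = "dry clean"
        elif item.get("delicate"):
            pile = "delicates"
        elif item.get("color") == "white":
            pile = "whites"
        elif item.get("kind") == "jeans":
            pile = "jeans"
        elif item.get("cold_water"):
            pile = "cold water"
        else:
            pile = "everything else"
        piles[pile].append(item["name"])
    return dict(piles)


# A hypothetical week's worth of laundry, all lumped together.
heap = [
    {"name": "wool blazer", "dry_clean": True},
    {"name": "white t-shirt", "color": "white"},
    {"name": "blue jeans", "kind": "jeans", "color": "blue"},
    {"name": "silk scarf", "delicate": True},
    {"name": "red hoodie", "color": "red", "cold_water": True},
    {"name": "gym socks", "color": "grey"},
]
piles = sort_laundry(heap)
```

The heap and the piles hold exactly the same clothes. The only thing that changed is that someone decided on the rules.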

Do you iron right after it comes out of the dryer?  Or do you fold right away, so creases appear where the iron could have been?

Do your clothes just end up in a heap?  

Are you catching on?  Well, data is information.  When it is heaped together, whether by a company, a government organization, or an information company like Google, the data is completely useless unless it serves a purpose.  You would hardly wash your clothes if you were going to discard them.  So laundry, or the act of doing laundry, serves a purpose too.

Very few organizations have been disciplined and thoughtful about their data.  Now there are companies out there trying to convince us that we have a lot of data, that we need to store it, how to store it, and now, with the CLOUD, that we should store it somewhere else.

If you have some small technical notion that the desktop computer sitting on your desk is your "personal computer," that only means it is designed for one user.  Huh?  Tricky hardware sellers, eh?  Well, the trickier ones are the ones that sell places to store the data.  Companies, from the surprisingly small to big mega corporations with global locations, can store a lot of information.

It is similar to your laundry.  At what point do you become exasperated and decide it is time to sort through it and donate some to charity?  In the world of data, companies have to decide what is no longer relevant and can be erased, deleted, gone.  The reason is that as people work, they produce data.  They may share it or send it to multiple people inside or outside the same company, and every single piece of communication contains data.  Then, think about how much space this takes up.  It all goes to the servers.

Often the decision only comes after systems start crashing and people start to get stressed out.  Many hide that they went berserk when the systems crashed and the computers wouldn't work.  Employees could be standing around waiting for systems to come back on.  Middle managers start mentally calculating the productive time being lost: there go those stats, and perhaps a meager bonus twiddled away.  Like watching Niagara Falls pounding through your work area.

Data is building and building.  As more people have access to computers, whether at their workplace or home, they are generating data.    How do you manage your data as it multiplies at a fast pace?  

Someone says business is slow, so let's get the sales force engaged and pounding the pavement.  Where do they store all that information?  Oh, a Rolodex with business cards stapled to it?  How 70s.  You think?

Bigger companies seem to want to police their people with greater scrutiny than small companies do.  It is probably no different in ratio, just more technology driven at bigger companies.

Policing people who work for you generates data.  Performance reviews, commission statements, bonus structures: all unique groups of information that turn the wheels of operation and keep the company going.



Who owns the data?  The idea generator or the company if the idea is generated while working for them?  How about if the idea is thought up when they are trying to fall asleep and are not even under the company roof?

Sounds like data isn't always rosy, doesn't it?

What about when a company decides to look at the face it presents to its employees, shareholders, investors, board of directors, the media, and, through its web presence, the outside world?

Someone may reach out to marketing, if there is such a department, or hire an agency that specializes in dissecting information.

It is like when a friend drops over and you have heaps of dirty clothing sitting there.  You may be a little embarrassed (or your mother would want you to be).

It is the same with a lot of companies' data.  When they bring in someone to help them with sales or marketing, the company can be just as embarrassed, because the data is just heaped onto a server, not all that organized.  It is cheaper to buy a few more gigabytes of capacity in equipment or software to host the information than it is to have someone keep it organized.

That is the challenge with data.  That's my humble opinion.  What do you have?  Heaps of it, unorganized?  Or a bursting Rolodex (instead of a CRM, or Customer Relationship Management, system)?  Files crammed with photocopies of contracts and communications?  Well, before you turf anything, you should probably have it scanned and stored somewhere, because the cost per square foot of storage is getting out of hand, and so is the cost of people to keep track of it all.  They are almost extinct dinosaurs, the ones who can get you anything you want, anytime you need it.

Think about how much data you have.
Where do you store it?  (In file cabinets or on your PC?  Oh, fancy you, you have a secondary hard drive.)

So, that's what Google, Apple and Microsoft are all fighting over.  Or big groups like IBM, Oracle, HP.  Your data!  Huh?  Why would my data matter?  Well, if you use a computer or software, buy anything online, download something, or look at something, that information is being tracked, and it eventually becomes a heap of data about you.  Where do you like to shop?  What do you like to wear?  You like that music?  Nice!

Now, that information has to be important somewhere, right?  Of course: the data is information about consumers.  Where are consumers buying?  Where do they live?  What do they do?  Career, hobbies, sports, travel?

That information then gets carved, crafted and wrapped in a manner that appeals to you.  Yes, after you willingly and openly gave that information.

That sounds like BIG BROTHER, or something for Ron Howard to make a movie out of.

The evil vixen in all that?  Marketers or advertisers or media?  I guess we're all back to square one.  They can't be evil, because they are just trying to do what their clients, the big corporations and brands, ask them to do:  make sense out of the data mess.

It is a great binge watch.  What are all the data marketers saying you should do?  What are they proposing, what are they saying consumers or companies should do?  Or confirm that what they did was a good decision?  Or forewarn them against making a bad decision?  No, that is probably left to the bloggers and some journalists.

The impression I get is that it's an important conversation.  Do you?