Part 1 – The Basics When Working With Data

Your analysis in Microsoft Excel will start with data which means that understanding your data is a key part to setting yourself up for success!

The following series of articles will cover what you need to know:

  1. The basics when working with data (covered in this article)
  2. Advanced understanding of your data
  3. Planning your analytical approach
  4. Does the data help you meet your objectives

Basic things you should be aware of when working with data:

  • Does your data have duplicate values?
  • What timeframe is your data capturing?
  • How is your data formatted?
  • Do you have numbers stored as text?
  • Do any of your data values have spaces (before or after)

These are some basic things that you should be paying attention to when you’re working with data in Microsoft Excel.

Now to get into a little more detail about each one!

1. Does your data have duplicate values?

Having duplicate values in any of your columns is not necessarily a bad thing and depends on how your data is set up and if it makes sense to you. This requires an understanding of the application of the data, or some open-minded thinking, or both.

However if your data is actually duplicated you will end up with wrong results. This is completely different than understanding how to work with duplicate values within certain columns in your data.

duplicate values in data

In this example we know we have 2 separate sales teams so it’s okay to have duplicates in our Location column. Some quick math gives us the following sales by location:

Toronto = $13,000
Vancouver = $28,000

The fact that George’s sales shows up twice could be an issue:

  • Are there two George’s on the team? Is it just a coincidence that they both sold $4,000?
  • Is the data showing 2 sales time periods for George and if so why only for George?
  • This could be an issue with how the data is formatted or how the data was collected.

You will have to figure out if you’re dealing with this kind of situation with your own data.

2. What timeframe is your data capturing?

Whenever you are working with data keep in mind the timeframe that your data is capturing. Is your data covering a full year? Only a week? It will impact the way you use your data to complete your analysis.

If there is nothing in your data to indicate a time period you should be asking the question to someone that can explain the data you are using.

3. How is your data formatted?

This part is a matter of being aware whether you need to clean up your data depending on the raw data format.

  • Does your data contain totals of certain rows in your data set? Do you need them or will you be double-counting if you leave them in?
  • Is your data completely filled in or do you have to populate the value into additional cells?
  • Do you have data formatted entirely as a text string? If so you might need to use the “Text to columns” feature to parse out your data into columns using a predefined set of criteria (spaces, commas, etc..)

4. Do you have numbers stored as text?

If you have numbers stored as text your numbers will show up with a green triangle in the top let corner of the cell. In the example below the first 5 numbers in the column are stored as text and the last 4 are numbers.

You can convert numbers stored as text into numbers with the following steps:

  1. Highlight (select) your data range containing the numbers stored as text
  2. A yellow warning symbol will automatically display
  3. Hover your mouse over the symbol and it will display the message shown below
  4. Click on the symbol with your mouse and it will display the menu shown below
  5. Click on ‘Convert to Number’
  6. Your data range is now converted to numbers

Numbers stored as text

5. Do any of your data values have spaces (before or after)?

You should validate whether you have spaces before or after any of your data values. If you are writing formulas to try and find a certain “lookup value” in your data, you might get errors from your formulas if the data that you’re searching in has spaces before or after your lookup value.

As an example we want to find “Joy” in our list of names using a VLOOKUP function but if her name in the data we are searching is spelled “Joy ” (with a space at the end) we will not be able to find a result.

The proper way to fix this is with the TRIM function.

The quick and easy way to fix this is by highlighting the range of values with the spaces and do a “Find / Replace All” by “finding” a space (actually hit the spacebar once) and “replace with” (leave this field empty – do not touch it). Hit Replace All and your spaces should disappear. Note: If you have spaces in between text this “quick and easy” way gets rid of those too which might not be what you want.


Go back to Excel Tutorials.