How to clean up imported data in excel using the data autofilter so that you can sort or pivot the clean data
Clean Up Imported Data in Excel
The Data AutoFilter, is an extremely useful tool, and if you have never used it before, I highly recommend that you look at our video clip on the Auto Filter basics.
In this video clip however, we are going to show you how to clean up imported data in Excel by using the Data AutoFilter tool.
In this example we have a report from an IT system, which shows a report of our sales people. We would like to summarize and graph this information. But the report has generated headers and footers which get in the way to using Excel features properly.
The AutoFilter feature will allow you to quickly cleanup the data, but before you do anything you follow the two cardinal rules of manipulating data.
You must establish and document a control total.
In this case at the bottom of the report we have a Grand Total amount. This will ensure that as you manipulate the data, you don’t add or loose any bits of information. You must make a copy of the data, and keep it safe from manipulation, because you never know when something can go wrong and you need to get back to the original data.
To activate the Data AutoFilter,
you highlight the data that is relevant to the exercise,
And you click Data
And the arrows will appear
You can now start the cleanup operation
Have a look at the report and just know in your mind what information you are looking to keep, and what information you are looking to get rid of,
Then when you click on the arrow you will be shown all the information that currently exists in that column and you can start working through it.
So these dotted lines;
if we highlight them all, you’ll see they are brought up,
you can highlight them,
delete the row,
and when you go back,
they will have been removed,
you can continue working through the items individually, or else
you can click on the Custom option,
you can show me the items where it equals, and we’ll take the first dotted lines, or,
where it equals lets say example, Company,
when you click ok,
those items will be pulled up
I advise you just page down and just make sure that data that is important to you is not included in here, because IT reports are not always consistent.
When you are happy that all those items can be deleted
you just highlight the section,
you delete the rows,
and when you go back to All
you’ll see that slowly but surely your information is being cleaned up.
Another clever use for the Custom function, is you’ll see that here we have page 1, 2, 3, 4, 5. If you had to go through and individually highlight each one it would take a bit of time, what you can rather do then is;
go to Custom
when you click on the options you’ll se that as you work down
you’ll find one the that says contains
and this case anything that contains the word Page, you want to see,
and there you’ll have all the items that contain page which appear to not contain any other data that we need,
so you can highlight them,
you delete the row
and now those items will also be removed
Looking through the report, you may have noticed that at the bottom of each page there seems to be a maximum and minimum row, which just repeats the information above. As it contains what could be information that is important, if you just looked in this column, what you could do is;
click in the other column
and highlight in this case all the maximums
and delete them,
and then all the minimums, or
you could have used the Customs feature
and highlighted maximum or minimum and that will remove all those items.
You may also want to delete these blank rows, this can easily be achieved if you
click on the arrows at the very bottom
you will see an option saying highlight the Blanks,
if you click on it, it brings up any cell in this column that is blank,
now you’ll see we’ve got subtotal here, we know that we don’t need these, however just an example if this was
important to you,
you could then go here,
click on that arrow
and say show me the Blanks here,
so now where ever the column this cell is blank, and that cell is blank is highlighted
you can then highlight selected rows,
and when you see them all again you’ll see all the blank rows have been removed.
If you continue with this process of highlighting what you don’t need
And deleting the rows,
you will eventually end up with something that looks like this,
and you can then finally eliminate the last few items
and when you look at all the information it will be nicely setup in a format that Excel handles well
and if you want, you can just delete this top row
As a final check;
you should go to the bottom,
and do a quick sum and just make sure your control total that you’ve come up with is equal to the control total of the original information,
and if that is correct you can then work on this data
generate your graphs, pivot tables or summaries that you require