Chapter 6 - Here’s the Plan
This week, I focused my efforts on the algorithm for extracting the out of range data points. However, I have yet to find a good method or finalize the steps in the process. It is proving a bit harder than I originally anticipated. I will think I have it all figured out, and then all of sudden I realize there is a problem with my idea. I am used to this however. Every time I have had to create new procedures for my job, I have found this to be the case. I will start on a method, have it all worked out in my head, only to find out later that it does not work or I have to rework it. In the end though, I usually find a solid procedure that gets the job done. So based on my prior experiences, I am confident that I will develop a good algorithm that I will soon be able to document. The need for urgency is starting to draw nearer however. My due date for all the work is March 28th and I can easily say March came far quicker than expected. But I have given myself enough time it seems. I have one month to determine an algorithm, document those procedures, and then run all of the procedures in full to make sure all of them work as intended.
With the sense of urgency kicking in, I felt I needed to wrap up some of the loose ends that I had at this point. So, I went ahead and sent Pam the documentation I developed last week so that she could review them. She was out of the office all week at a conference, so we have not had a chance to review them in full. Her preliminary responses sounded positive, so it would seem there will only need to be a few minor tweaks. After I make those tweaks, I will touch up the documentation and I will put an electronic copy on the server and I will also print a copy as well.
So this week was indeed quiet, with me mostly working on the algorithm. As such, this week, I am going to discuss a little bit about the project plan that I am writing. For the class, as assignment on top of the project, we have to a write a project plan that discusses the intricate details of our project. This week, I will discuss the alternative solutions and the proposed solution section of the project plan since it details some of my ideas and plans for the algorithm. In this section I proposed two alternative solutions for this algorithm. These solutions have been guiding my work on the algorithm. I will give details on both solutions so that you can get a feel for where I am going with my work on the algorithm. First, I must explain what will happen to the data in the six spreadsheets that are created from the documentation I developed last week. After completing the downloading procedures there will be a spreadsheet for each device, with each one having three columns, one for date and time, temperature, and relative humidity. Both solutions suggest combining the six spreadsheets into one master spreadsheet. This should make the work easier since it would put all of the data into one file. With that said, below are the details of each solution.
Solution 1
This solution begins by moving the six spreadsheets into a master spreadsheet. There will be one file with six worksheets, one for each device. You will start out by running a macro that sorts the data in each sheet by temperature. It will sort the column using descending order. This will show you the highest temperatures recorded for each device. Then the macro will go through each record to see if the temperature is above 70. If a record is above 70, it will copy out the date, time, and temperature and it will paste them to another location. Next, it will sort by relative humidity and then copy out the ones that are above 55%. Once that is completed, it will create two sections in that device’s sheet, one for temperature and one for relative humidity. Once you run the macro, everything is set up, requiring little effort from the user. From there, you now have the time and day that a particular device registered a high temperature or a high relative humidity.
Now this does come with a risk. Since it checks for temperature and relative humidity separately, there is a risk of duplicates. If the data point has a high temperature and a high relative humidity it will be pulled out twice. The risk of duplicates could be frequent to almost non-existent. It all depends on the data that is collected. There could be some data sets that have more duplicates than others and there may be some that have none at all. If there are duplicates, a way to get rid of them would be to combine the temperature and relative humidity columns and do a filter, or use Excel 2007’s new duplicate record function. So although this solution runs the risk of duplicates, it does fix the problem. It takes the data for each device and pulls out the out of range points and sets them aside. From here you can take out that data for any device and view its highest temperatures or its highest relative humidity.
Overall, this solution would get the job done, and relatively quickly. This solution is probably the most self automated. You would literally just have to put the device spreadsheets into the master spreadsheet and then just run the macro. However, in order to complete this quickly, you sacrifice some data integrity. Although this one does separate the out of range data points, duplicate records are not a good thing to have. So, I took my ideas from this solution and tried to revamp them a bit. I do this frequently when creating new procedures. As mentioned earlier, I do a lot of procedures for my job and this how it commonly goes. I will come up with an idea only to find an issue with it. Then I will build a way around that until I get to the finished product. However, I do know that if for some reason I cannot find a way around this problem with the duplicate records, I at least have a solution to fall back on that will correctly pull out the data points.
Solution 2
Like in solution one, you will combine all of the spreadsheets into one master spreadsheet. This time however, you will combine all of the data into just one sheet. So there would be one file with one sheet. When you copy the data into this one sheet, you have to mark which device the data comes from. For example, if you copy over data from device 1, you need to place L1 next to each record for that device. Then you would do the same for subsequent devices. This connects the records to its corresponding device. Then you would run another macro.
This macro will sort this one sheet by temperature and then by relative humidity. This macro will eliminate duplicates. It will first check to see if the data has a temperature above 70, if it does, it will then check to see if it has a relative humidity greater than 55%. If it has both of those, it will copy it out and put it into a new sheet, making sure that it copies out the device number. If its temperature is higher than 70 but the relative humidity is not 55%, it will still copy out to that sheet. If its temperature is not higher than 70 but its relative humidity is above 55%, it will also copy out. This will remove duplicate records. In solution one it checked to see if the temperature was above 70. If it was, it copied it out. Then later it did another check on those same records to see if its relative humidity was above 55%. If it was, it copied it out again. This led to the duplicates. This one will check both conditions at once. If it has both, it will be copied out only once. If it has one or the other, it will still be copied out, but it will never go back to that record again since it was coded to already check for the other condition. Once completed, all of the out of range data will be in one sheet, mapped by the device ID written in by the user prior to running the macro (all without duplicate records).
Solution 2 is the one I choose as my proposed solution. Like solution 1, it does get the job done. However, it will take longer to run since it has to go through many conditionals to pull out the out of range data points. This solution also depends heavily on the user placing the proper ID next to each record. If you don’t, you lose the connection between the points and the device. So there is an inherent risk with that. Although it may take more work and would take longer to run, solution 2 is far more accurate and it will give you a better picture of what is occurring. The macro will take a little while to create and will involve some low level visual basic programming, but I have worked with simple conditionals before, so I feel confident that I should get this solution to work.
That is pretty much it for the alternative and proposed solution section of the project plan that I have been writing. The project plan also includes sections for a work break down schedule, an executive summary, and assumptions and constraints. Next week, I am going to combine all of the pieces of the project plan into one document. This document will be a rough draft of the final project plan and it will be reviewed by the instructor. Next week, I will post this document so that you get a chance to see the whole project plan. For now though, you can click here to view the full alternative solutions and proposed solution section.
That wraps up this week. I covered a lot about the algorithm for extracting the out of range points. By going over the alternative solutions I have laid out my plans for the algorithm. Hopefully by next week I will have it completed and I can discuss my results in further detail. Until then, thanks for reading.