Since August 2020, I have been working as a data science researcher on the EDIFES project at Case Western Reserve University. The project’s goal is to transform the traditional, time-consuming, and costly in-person building energy audits into a virtual process using data science techniques. By doing so, we aim to reduce the time and cost associated with energy audits while improving their accuracy, ultimately leading to lower building carbon footprints and significant cost savings for companies. Our objective is to make it easier for companies to achieve both environmental and financial benefits, and techniques such as energy modelling can help them get there.
As my initial task on the project, I was provided with a small subset of building data and asked to perform a comprehensive exploratory data analysis (EDA). Conducting an EDA is an excellent way to develop an intuitive understanding of the data involved in the project. This understanding is crucial as it enables me to recognize expected patterns and quickly identify anomalies or interesting trends in the data. Additionally, a thorough EDA allows me to test some of the functions already developed by the EDIFES team and potentially create new models. Moreover, I find the process of EDA enjoyable. There is something almost magical about transforming millions of lines of data into a single chart that explains all the trends or developing a model that can predict future outcomes with remarkable accuracy.
This article is the first in a three-part series (with a bonus fourth article entirely focused on machine learning). Although promising multiple parts often leads to disappointment when the author fails to deliver (looking at you, George R.R. Martin), rest assured that part 2 is already complete, and part 3 is well underway. If you choose to follow along for the entire journey, you will be introduced to the wonders of the R programming language, Python, Tableau, machine learning with Scikit-learn, and perhaps even some deep learning with TensorFlow. All of these tools are essential for data scientists, and I hope this series sparks interest in anyone looking to enter this exciting and lucrative field. As always, I welcome any feedback (email wjk68@case.edu) and encourage everyone to check out the entire project on GitHub.
Project Overview
An energy audit involves inspecting and analyzing a building’s electricity consumption patterns by documenting incoming and outgoing energy flows. When performed correctly, energy audits can significantly enhance a building’s overall efficiency, thereby reducing electricity costs and greenhouse gas emissions. However, traditional physical building energy audits are resource-intensive in terms of both time and cost. In-person audits require a team of auditors to travel to the building and conduct a series of tests and inspections using specialized equipment to assess the building’s efficiency and identify areas for improvement. Beyond the resource costs, the assessments can vary significantly between auditing teams, raising questions about the economic benefits of in-person audits. Notably, the U.S. Department of Energy estimates that building efficiency could improve by 30% by 2030 through the implementation of existing technologies. Given their cost and inconsistency, however, physical energy audits are not the best way to find those opportunities.
EDIFES Approach
EDIFES, which stands for Energy Diagnostic Investigator for Efficiency Savings, is a project funded by ARPA-E and jointly undertaken by the Great Lakes Energy Institute and Case Western Reserve University. The project’s primary objective, under the guidance of Professors Alexis Abramson and Roger French, is to eliminate the need for physical audits by identifying areas for efficiency improvements using only overall building electricity consumption data and building metadata (such as location and square footage). A key component of this process involves developing a set of building markers derived from the electricity data, including building occupancy schedules and HVAC on/off cycle times, which can be used to characterize a building. In the initial phase of the project, EDIFES has partnered with several organizations to collect and analyze data, laying the groundwork for a software tool that can perform virtual energy audits effectively.
Exploratory Data Analysis
To begin the exploratory data analysis, I was provided with a dataset containing electricity consumption data for several buildings. The first step involved loading the data into a suitable environment for analysis. I chose to use Python for this task, given its robust data analysis libraries and widespread use in the data science community.
Once the data was loaded, I performed an initial examination to understand its structure and contents. This included checking for missing values, identifying the range of dates covered, and assessing the frequency of data collection. Understanding these aspects is crucial for ensuring the quality and reliability of any subsequent analyses.
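To make these checks concrete, here is a minimal sketch of what such a first pass might look like. It is not the EDIFES code: the readings, the 15-minute interval, and the `summarize` helper are all made up for illustration, and I have stuck to the standard library so it runs anywhere (pandas would be the more usual choice on real data).

```python
from collections import Counter
from datetime import datetime

# Hypothetical readings as (timestamp, kWh); None marks a missing value.
readings = [
    (datetime(2017, 1, 1, 0, 0), 12.4),
    (datetime(2017, 1, 1, 0, 15), 11.9),
    (datetime(2017, 1, 1, 0, 30), None),
    (datetime(2017, 1, 1, 0, 45), 12.1),
    (datetime(2017, 1, 1, 1, 15), 12.8),  # the 01:00 row is absent entirely
]


def summarize(readings):
    """Report date range, missing values, sampling interval, and dropped rows."""
    timestamps = sorted(ts for ts, _ in readings)
    pairs = list(zip(timestamps, timestamps[1:]))
    # The most common step between consecutive rows is taken as the interval.
    interval = Counter(b - a for a, b in pairs).most_common(1)[0][0]
    return {
        "start": timestamps[0],
        "end": timestamps[-1],
        "missing": sum(1 for _, kwh in readings if kwh is None),
        "interval": interval,
        # Any step longer than the usual interval means rows were dropped.
        "gaps": [(a, b) for a, b in pairs if b - a > interval],
    }


summary = summarize(readings)
for name, value in summary.items():
    print(name, value)
```

Note that the sketch separates two kinds of problems: rows that exist but carry no value, and rows that are missing from the file altogether. Both matter, but they usually call for different fixes.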
Next, I created visualizations to explore the data further. Time series plots of electricity consumption allowed me to observe patterns and trends over time. These plots revealed daily and weekly cycles in energy usage, which are typical in building operations due to occupancy patterns and HVAC system schedules.
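One simple way to expose daily and weekly cycles before plotting anything is to average consumption by hour of day and by day of week. The sketch below does that on synthetic data; the `synthetic_load` function and its 8:00 to 18:00 weekday occupancy schedule are assumptions I invented for the example, not properties of the EDIFES buildings.

```python
from collections import defaultdict
from datetime import datetime, timedelta
from statistics import mean


def synthetic_load(ts):
    """Made-up building: 30 kWh when occupied (weekdays 8:00-18:00), else 10 kWh."""
    occupied = ts.weekday() < 5 and 8 <= ts.hour < 18
    return 30.0 if occupied else 10.0


# Two weeks of hourly readings starting on a Monday.
start = datetime(2017, 1, 2)
timestamps = [start + timedelta(hours=h) for h in range(14 * 24)]
series = [(ts, synthetic_load(ts)) for ts in timestamps]


def profile(series, key):
    """Average consumption grouped by key(timestamp), e.g. hour or weekday."""
    groups = defaultdict(list)
    for ts, kwh in series:
        groups[key(ts)].append(kwh)
    return {k: mean(v) for k, v in sorted(groups.items())}


hourly = profile(series, lambda ts: ts.hour)        # keys 0..23
weekly = profile(series, lambda ts: ts.weekday())   # keys 0 (Mon) .. 6 (Sun)

for hour, kwh in hourly.items():
    print(f"{hour:02d}:00  {kwh:5.1f}")
```

Plotting `hourly` and `weekly` as line or bar charts gives the averaged version of the cycles that show up in the raw time series plot.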
I also examined the data for anomalies or irregularities. Identifying periods of unusually high or low energy consumption can indicate potential issues with building systems or changes in occupancy behavior. Detecting these anomalies is essential for diagnosing inefficiencies and targeting areas for improvement.
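There are many ways to flag unusual readings, and this article does not depend on any particular one. As an illustration, the sketch below uses a modified z-score based on the median absolute deviation (MAD), which is less sensitive to the outliers themselves than a mean and standard deviation would be. The sample values, the `flag_anomalies` name, and the 3.5 threshold are assumptions chosen for the example.

```python
from statistics import median


def flag_anomalies(values, threshold=3.5):
    """Return indices whose modified z-score (MAD-based) exceeds threshold."""
    med = median(values)
    mad = median(abs(v - med) for v in values)
    if mad == 0:
        return []  # no spread at all, so nothing can be scored
    # 0.6745 rescales the MAD so scores are comparable to ordinary z-scores.
    return [
        i for i, v in enumerate(values)
        if 0.6745 * abs(v - med) / mad > threshold
    ]


# Hypothetical kWh readings taken at the same hour on ten different days.
same_hour_kwh = [10.2, 9.8, 10.1, 10.4, 9.9, 10.0, 25.0, 10.3, 9.7, 0.5]
anomalies = flag_anomalies(same_hour_kwh)
print(anomalies)
```

Comparing like with like matters here: a reading that is normal at noon on a weekday may be an anomaly at 3 a.m. on a Sunday, so it makes sense to score each hour (or each hour and day type) separately.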
Additionally, I explored the relationship between electricity consumption and external factors such as weather conditions. By correlating energy usage with temperature data, I aimed to understand how environmental factors influence building energy performance. This analysis can inform strategies for optimizing HVAC operations and improving overall energy efficiency.
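A first check of the weather relationship can be as simple as a Pearson correlation between daily temperature and daily consumption. The numbers below are invented for illustration, and the `pearson` helper is written out by hand so the example needs nothing beyond the standard library.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)


# Hypothetical summer days: average outdoor temperature (C) and daily kWh.
temps_c = [18, 21, 24, 27, 30, 33]
daily_kwh = [410, 445, 500, 560, 640, 700]

r = pearson(temps_c, daily_kwh)
print(f"correlation: {r:.3f}")
```

One caveat: over a full year, consumption often rises in both hot and cold weather, so a single linear correlation can understate the relationship. Splitting the data into heating and cooling seasons before correlating is a common workaround.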
Throughout the exploratory data analysis process, I documented my findings and developed a set of preliminary insights. These insights serve as a foundation for more advanced modeling and analysis in subsequent phases of the project. By thoroughly understanding the data, we can develop more accurate and effective tools for virtual energy audits, ultimately contributing to the goal of enhancing building energy efficiency through data-driven approaches.