Data Science Application in Power System Operation & Planning
Data science has always been an essential part of power system operation and planning. Statistical analysis is a fundamental ingredient of data science. Statistical techniques such as multiple linear regression and general exponential smoothing, and statistical models such as ARIMA and SARIMA, have long been used for load forecasting. With the integration of renewable energy into the power system, statistical techniques have also been used extensively to evaluate reserve requirements. Another major component of data science is data visualization. For a power system analyst, data visualization is a very important tool for analyzing the system under study effectively. Python is a very powerful platform for data analysis and visualization. Libraries such as pandas, NumPy, Matplotlib and scikit-learn are very effective for these tasks.
PSS/E is a well-known power system analysis tool that is widely used in the industry for power system studies. PSS/E integration with Python gives power system analysts a unique tool: in simple terms, Python can be used to access the PSS/E engine through its extensive API routines. Here is an example that demonstrates how the combination of Python and PSS/E can be a powerful tool for power system analysis.
Generation in a power system is usually dispatched according to a set merit order, with the cheapest generation dispatched first and so on. Sometimes, however, expensive generation is dispatched first in order to avoid network congestion, voltage instability and similar problems in some areas. Such out-of-merit generation is usually termed Minimum Must Run (MMR) generation in SEC.
In a complex or large power system it is difficult to study every demand scenario, since the studies can be very time consuming. Most utilities therefore study only the maximum and minimum load demand scenarios, to make sure the power system can handle the two extremes. MMR can easily be calculated for these two scenarios. At minimum or maximum load demand, the power system operator can simply run the calculated amount of MMR generation. But what if the load demand is higher than the minimum and lower than the maximum? Running generation as per the minimum scenario may lead to reliability or stability issues, whereas running generation as per the maximum scenario may lead to additional operational cost.
This is where data analysis comes in handy. Instead of only the two extreme operating points, we can evaluate one or two additional operating points. For instance, if the maximum operating point is the peak load demand and the minimum operating point is 70% of the peak load demand, then we can calculate MMR at one or two additional operating points, such as 90% and 80% of the peak load demand. Once we have the load demand and MMR for these three or four points, we can use linear regression to estimate an equation that relates load demand to the MMR requirement. Since the load demand is known to us, we treat it as the independent variable. MMR depends on load demand, so it is the dependent variable.
Linear regression predicts the value of a dependent variable (Y) from a given independent variable (X). In other words, the technique finds a linear relationship between X (input) and Y (output), hence the name linear regression. The statistical model for linear regression is expressed as
Y = a + bX + e
Where
Y = Dependent Variable
X = Independent Variable
a = Intercept of Line
b = Slope of the line
e = Error term
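As a minimal sketch of how a and b could be estimated for the load demand and MMR case, the ordinary least-squares formulas can be applied directly with NumPy. The load and MMR values below are hypothetical and purely illustrative; they are not results from any real study, and the function name fit_line is chosen here only for the example.

```python
import numpy as np

# Hypothetical study points (illustrative values only, not from a real system):
# load demand at 70%, 80%, 90% and 100% of an assumed 10,000 MW peak,
# and an assumed MMR requirement found at each point.
load_mw = np.array([7000.0, 8000.0, 9000.0, 10000.0])  # X, independent variable
mmr_mw = np.array([1500.0, 1900.0, 2400.0, 2800.0])    # Y, dependent variable


def fit_line(x, y):
    """Ordinary least-squares estimates of a (intercept) and b (slope) in Y = a + bX + e."""
    x_mean, y_mean = x.mean(), y.mean()
    # Slope: covariance of X and Y divided by the variance of X.
    b = np.sum((x - x_mean) * (y - y_mean)) / np.sum((x - x_mean) ** 2)
    # Intercept: the fitted line passes through the point of means.
    a = y_mean - b * x_mean
    return float(a), float(b)


a, b = fit_line(load_mw, mmr_mw)

# Residuals are the estimates of the error term e at each study point.
residuals = mmr_mw - (a + b * load_mw)
# R-squared: share of the variation in MMR explained by the fitted line.
r_squared = 1.0 - np.sum(residuals ** 2) / np.sum((mmr_mw - mmr_mw.mean()) ** 2)

print(f"MMR = {a:.1f} + {b:.3f} * Load   (R^2 = {r_squared:.4f})")
```

With only three or four study points, the fit is a rough guide rather than a precise model, so it is worth checking the residuals before relying on the equation. The same coefficients can also be obtained with np.polyfit(load_mw, mmr_mw, 1), which returns the slope first and the intercept second.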
The attached figure shows the result of the linear regression. The calculated equation can then be implemented in SCADA so that, from the load demand measured in real time, it lets the system operator know how much MMR generation must be kept online to avoid instability.
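The real-time calculation can be sketched as a small function that evaluates the fitted line for a measured load demand. This is only an illustration of the arithmetic, not an actual SCADA implementation: the function name required_mmr, the coefficients and the load limits are all hypothetical. Clamping the load to the studied range is a design choice assumed here, so that the line is never extrapolated beyond the operating points that were actually studied.

```python
def required_mmr(load_mw, a, b, min_load_mw, max_load_mw):
    """Evaluate MMR = a + b * load for a measured load demand (all values in MW).

    The load is clamped to the studied range [min_load_mw, max_load_mw]
    (an assumption made for this sketch) and the result is never negative.
    """
    clamped_load = min(max(load_mw, min_load_mw), max_load_mw)
    return max(0.0, a + b * clamped_load)


# Hypothetical coefficients and study range, for illustration only.
A_INTERCEPT = -1590.0   # a, in MW
B_SLOPE = 0.44          # b, MW of MMR per MW of load
MIN_LOAD_MW = 7000.0    # lowest studied load demand
MAX_LOAD_MW = 10000.0   # highest studied load demand (peak)

for measured_load in (6500.0, 8500.0, 10500.0):
    mmr = required_mmr(measured_load, A_INTERCEPT, B_SLOPE, MIN_LOAD_MW, MAX_LOAD_MW)
    print(f"Load {measured_load:.0f} MW -> keep about {mmr:.0f} MW of MMR online")
```

Outside the studied range the function simply holds the MMR value of the nearest studied extreme; a utility might prefer to raise an alarm instead, since the regression says nothing about those conditions.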