Data

Project Data

Table 1. Sample collected average data for 1986-1996

Sub	LULC	AREA	PCP	PET	ET	SW_INIT	SW_END	DAILYCN	TMP_AVG	TMP_MX	TMP_MN	SOL_TMP	SOLAR	USLE	LAI	SNO
1	PAST	132	392.049	736.595	310.243	112.107	105.547	63	2.73	8.68	-3.23	2.00	12.33	0.48	0.37	20.86
2	AGRR	326	392.049	732.234	364.337	58.913	44.123	70	2.73	8.68	-3.23	2.56	12.36	0.37	2.31	20.86
3	AGRR	49.7	392.049	736.859	355.672	58.978	36.631	71	2.73	8.68	-3.23	2.58	12.33	0.78	2.32	20.86

Future precipitation and snow accumulation in North Saskatchewan River Basin

Figure 1. Rainfall Precipitation and Snow Accumulation Change within NSRB

The table above is an abbreviated version of the data table. The first column identifies the sub basin within the watershed. There are 174 sub basins, which are divided into different regions according to their average elevation. LULC is the dominant land use / land cover type in the sub basin. AREA is the area of the sub basin in km^2. PCP is the mean precipitation in mm, obtained from the NCAR database. PET and ET are the potential and actual evapotranspiration within the sub basin in mm. SW_INIT and SW_END are the soil water content at the beginning and the end of the model simulation in mm. DAILYCN is the dominant curve number, which reflects soil permeability. TMP_AVG, TMP_MX and TMP_MN are the average, maximum and minimum temperature across each sub basin in ℃, obtained from gridded data. SOL_TMP is the simulated soil temperature in ℃, and SOLAR is the average solar radiation in MJ/m^2. USLE is the simulated soil loss in tons per hectare. LAI is the unitless leaf area index, which indicates vegetation cover. SNO is the snow content within each sub basin in mm.
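As a minimal sketch of how the columns described above could be read into typed records, the following Python snippet parses the three sample rows of Table 1. The names `COLUMNS`, `SAMPLE_ROWS` and `parse_rows` are illustrative assumptions, not part of the original workflow, and the rows are written with spaces instead of tabs for readability.

```python
# Column names in the order used by Table 1.
COLUMNS = ["Sub", "LULC", "AREA", "PCP", "PET", "ET", "SW_INIT", "SW_END",
           "DAILYCN", "TMP_AVG", "TMP_MX", "TMP_MN", "SOL_TMP", "SOLAR",
           "USLE", "LAI", "SNO"]

# The three sample rows shown in Table 1 (whitespace-separated here).
SAMPLE_ROWS = [
    "1 PAST 132 392.049 736.595 310.243 112.107 105.547 63 2.73 8.68 -3.23 2.00 12.33 0.48 0.37 20.86",
    "2 AGRR 326 392.049 732.234 364.337 58.913 44.123 70 2.73 8.68 -3.23 2.56 12.36 0.37 2.31 20.86",
    "3 AGRR 49.7 392.049 736.859 355.672 58.978 36.631 71 2.73 8.68 -3.23 2.58 12.33 0.78 2.32 20.86",
]


def parse_rows(lines):
    """Turn whitespace- or tab-separated rows into dictionaries with typed values."""
    rows = []
    for line in lines:
        fields = line.split()  # splits on tabs as well as spaces
        if len(fields) != len(COLUMNS):
            raise ValueError(f"expected {len(COLUMNS)} fields, got {len(fields)}")
        row = {}
        for name, value in zip(COLUMNS, fields):
            if name == "Sub":
                row[name] = int(value)    # sub basin identifier
            elif name == "LULC":
                row[name] = value         # categorical land use / land cover code
            else:
                row[name] = float(value)  # all remaining columns are continuous
        rows.append(row)
    return rows


rows = parse_rows(SAMPLE_ROWS)
print(rows[0]["LULC"], rows[0]["SNO"])
```

Keeping LULC as a string and every other column as a float mirrors the distinction between categorical and continuous variables used in the analysis.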

In this study, we first focus on how snow (SNO) is affected by the variables above; SOL_TMP, USLE, LAI, PET and ET are also treated as response variables. The predictor variables are land use / land cover, precipitation, maximum, average and minimum temperature, area, daily curve number and solar radiation. The predictor variables were obtained directly from observations. Among them, only land use / land cover is categorical; the others are continuous.
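The split into predictor and response variables can be sketched as follows. The source does not state how the categorical LULC variable is handled; the one-hot encoding used here is an assumption, as are the names `PREDICTORS`, `RESPONSES` and `split_row`. The two records reuse values from Table 1.

```python
# Predictor and response variables as listed in the text.
PREDICTORS = ["LULC", "PCP", "TMP_MX", "TMP_AVG", "TMP_MN", "AREA", "DAILYCN", "SOLAR"]
RESPONSES = ["SNO", "SOL_TMP", "USLE", "LAI", "PET", "ET"]
CATEGORICAL = {"LULC"}  # the only categorical predictor


def split_row(row, categories):
    """Return (x, y): predictors with LULC one-hot encoded, and responses."""
    x = {}
    for name in PREDICTORS:
        if name in CATEGORICAL:
            # Assumption: one indicator column per observed category.
            for category in categories:
                x[f"{name}_{category}"] = 1.0 if row[name] == category else 0.0
        else:
            x[name] = float(row[name])
    y = {name: float(row[name]) for name in RESPONSES}
    return x, y


# Sub basins 1 and 2 from Table 1, restricted to the columns used here.
rows = [
    {"LULC": "PAST", "PCP": 392.049, "TMP_MX": 8.68, "TMP_AVG": 2.73, "TMP_MN": -3.23,
     "AREA": 132, "DAILYCN": 63, "SOLAR": 12.33, "SNO": 20.86, "SOL_TMP": 2.00,
     "USLE": 0.48, "LAI": 0.37, "PET": 736.595, "ET": 310.243},
    {"LULC": "AGRR", "PCP": 392.049, "TMP_MX": 8.68, "TMP_AVG": 2.73, "TMP_MN": -3.23,
     "AREA": 326, "DAILYCN": 70, "SOLAR": 12.36, "SNO": 20.86, "SOL_TMP": 2.56,
     "USLE": 0.37, "LAI": 2.31, "PET": 732.234, "ET": 364.337},
]

categories = sorted({row["LULC"] for row in rows})
x, y = split_row(rows[0], categories)
print(x)
print(y)
```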

Before the analysis, the data should be checked. The following figure shows histograms of the observed input data.

Figure 4. Input Predictor Variables Histogram

As Figure 4 shows, apart from DAILYCN and precipitation, the variables are approximately normally distributed, so we do not transform the data toward normality at this stage. However, the distribution is not the only property we need to examine. Outliers should be removed to improve the subsequent multivariate analysis. Figure 5 shows the outliers in our study.
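The distribution check above is made visually from histograms. As a complementary numerical sketch, sample skewness can indicate whether a variable is roughly symmetric. This is not the method used in the study: the function name `skewness`, the rule of thumb |skewness| < 1 and the two example series are all assumptions made for illustration.

```python
import statistics


def skewness(values):
    """Moment-based sample skewness: m3 / m2**1.5 (0 for a symmetric sample)."""
    n = len(values)
    mean = statistics.fmean(values)
    m2 = sum((v - mean) ** 2 for v in values) / n  # second central moment
    m3 = sum((v - mean) ** 3 for v in values) / n  # third central moment
    if m2 == 0:
        return 0.0  # constant series: no spread, treat as not skewed
    return m3 / m2 ** 1.5


# Hypothetical series, not taken from the study data.
symmetric = [1, 2, 3, 4, 5]
right_skewed = [1, 1, 1, 2, 10]

for name, series in [("symmetric", symmetric), ("right_skewed", right_skewed)]:
    s = skewness(series)
    # Assumed rule of thumb: |skewness| < 1 counts as roughly symmetric.
    print(name, round(s, 3), "roughly symmetric" if abs(s) < 1 else "skewed")
```

A variable flagged as skewed by such a check would be a candidate for transformation, which matches the role the histograms play in the text.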

Figure 5. Soil Water Content and Evapotranspiration Data Scatter Plot

The left panel shows an outlier at sub basin 8. This outlier is mainly caused by the dominant land use / land cover in this sub basin, which is water. Sub basin 8 is a very small sub basin located at the confluence of the North Saskatchewan River and Beaverhill Creek, near the intersection of Provincial Highway 38 and Range Road 204. Since it is mainly a wetland, a large amount of evapotranspiration is expected. Because this outlier drastically distorts the factor analysis output, it was removed.
The right panel shows two outliers, at sub basins 88 and 95. These are mainly caused by the initial and final soil water content. This could be a purely computational error that occurred randomly: the curve numbers for these two sub basins indicate that, although the soils have some water-retaining ability, the soil water content should not be as high as 600 mm. As a result, these two outliers were removed as well.
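The outliers above were identified from the scatter plots. A minimal sketch of an automated alternative is shown below, using the modified z-score based on the median and the median absolute deviation (MAD) with the commonly used cutoff of 3.5. This is a swapped-in technique, not the procedure used in the study. The soil water values for sub basins 1 to 3 are the SW_END values from Table 1; the values for sub basins 4, 5, 6, 88 and 95 are placeholders chosen only to illustrate the idea.

```python
import statistics


def flag_outliers(values_by_sub, threshold=3.5):
    """Return sub basin IDs whose modified z-score exceeds the threshold."""
    values = list(values_by_sub.values())
    median = statistics.median(values)
    mad = statistics.median([abs(v - median) for v in values])
    if mad == 0:
        return set()  # no spread around the median: nothing can be flagged
    return {
        sub for sub, v in values_by_sub.items()
        if abs(0.6745 * (v - median) / mad) > threshold
    }


# SW_END in mm; IDs 1-3 from Table 1, the rest are hypothetical placeholders.
sw_end = {1: 105.547, 2: 44.123, 3: 36.631, 4: 80.0, 5: 60.0, 6: 95.0,
          88: 600.0, 95: 610.0}

flagged = flag_outliers(sw_end)
print(sorted(flagged))

# Sub basins removed in the text: 8 (water-dominated), 88 and 95 (soil water).
REMOVED = {8, 88, 95}
kept = {sub: v for sub, v in sw_end.items() if sub not in REMOVED}
print(sorted(kept))
```

A rule of this kind only flags candidates. Whether to drop a sub basin still depends on a physical explanation, as with the wetland in sub basin 8.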