% Tutorial 3: Manipulating Data in R
% DPI R Bootcamp
% Jared Knowles
# Overview
In this lesson we hope to learn:
- Aggregating data
- Organizing our data
- Manipulating vectors
- Dealing with missing data
<p align="center"><img src="img/dpilogo.png" height="81" width="138"></p>
```{r setup, include=FALSE}
# set global chunk options
opts_chunk$set(fig.path = 'figure/slides3-', cache.path = 'cache/slides3-',
               fig.width = 8, fig.height = 6, message = FALSE, error = FALSE,
               warning = FALSE, echo = TRUE, dev = 'svg', comment = NA, size = 'tiny')
options(width=70)
library(plyr)
```
# Again, read in our dataset
```{r readdata}
# Set working directory to the tutorial directory
# In RStudio can do this in "Tools" tab
setwd('~/GitHub/r_tutorial_ed')
#Load some data
load("data/smalldata.rda")
# Note if we don't assign data to 'df'
# R just prints contents of table
```
# Aggregation
- Sometimes we need to do some basic checking for the number of observations or types of observations in our dataset
- To do this quickly and easily - the `table` function is our friend
- Let's look at our observations by year and grade
```{r table}
table(df$grade,df$year)
```
- The first argument gives the rows, the second gives the columns
- Ugly, but effective
# Aggregation can be more complex
- Let's aggregate by race and year
```{r table2}
table(df$year,df$race)
```
- Race is consistent across years, interesting
- What if we want to only look at 3rd graders that year?
# More complicated still
```{r table3}
with(df[df$grade==3,],{table(year,race)})
```
- `with` specifies a data object to work on, in this case all elements of `df` where `grade==3`
- `table` is the same command as above, but since we specified the data object in the `with` statement, we don't need the `df$` in front of the variables of interest
```{r table3.2}
df2<-subset(df,grade==3)
table(df2$year,df2$race)
rm(df2)
```
# Quick exercise
- Can you find the number of black students in each grade in each year?
- hint: **`with(df[df$___==___,]...)`**
- How many in year 2002, grade 6?
- 48
- How many in 2001, grade 7?
- 39
# Answer
```{r table4,tidy=FALSE}
with(df[df$race=="B",],{table(year,grade)})
```
- Quick question: what roles do the three types of brackets in this call play: **()**, **[]**, and **{}**?
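One way to see the difference is a tiny standalone example (the objects here are invented for illustration):

```r
f <- function(x) { x + 1 }   # {} groups expressions into a code block
f(2)                         # () calls a function with its arguments
v <- c(10, 20, 30)
v[2]                         # [] indexes into a data object
```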
# Tables cont.
- This is really powerful for looking at the descriptive dimensions of the data, we can ask questions like:
- How many students are at each proficiency level each year?
```{r proftable}
table(df$year,df$proflvl)
```
- How many students are at each proficiency level by race?
```{r raceproftable}
table(df$race,df$proflvl)
```
# Proportional Tables
- What if we aren't interested in counts?
- R makes it really easy to calculate proportions
```{r proptable1}
prop.table(table(df$race,df$proflvl))
```
- Hmmm, this is goofy. This tells us the proportion of each cell out of the total. Also, the digits are distracting. How can we fix this?
# Try number 2
```{r proptable2}
round(prop.table(table(df$race,df$proflvl),1),digits=3)
```
- The `1` tells R we want proportions row-wise; a `2` computes them column-wise
- `round` tells R to cut off some digits for us
- Proportions are just that, not in percentage terms (we need to multiply by 100 for this)
- Can you make this table express percentages instead of proportions? How might that code look?
- A few more problems arise - this pools all observations, including students across years
- To avoid these, we need to aggregate the data somehow
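To answer the percentage question above, one approach is to multiply the proportion table by 100 before rounding; a minimal sketch on invented stand-ins for `df$race` and `df$proflvl`:

```r
# Toy stand-ins for df$race and df$proflvl
race <- c("B", "B", "W", "W", "W", "H")
proflvl <- c("basic", "prof", "prof", "adv", "prof", "basic")
# Row-wise proportions, scaled to percentages, then rounded
pct <- round(prop.table(table(race, proflvl), 1) * 100, 1)
pct
```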
# Checking Understanding
- We have seen how to chain functions together
- We have also seen how to examine a dataframe by looking at the observations in it
- We are now going to move on to aggregating data so we can look at unique cases when we have more than one observation for each unit
# Aggregating Data
- One of the most common tasks is computing aggregate statistics from our data
- R has an `aggregate` function that can be used for this and helps us avoid the pooling problems above
- This works great for simple aggregation like scale score by race, we just need a `formula` (think I want variable X **by** grouping factor Y) and the statistic we want to compute
```{r aggregate1}
# Reading Scores by Race
aggregate(readSS~race,FUN=mean,data=df)
```
# Aggregate (II)
- `aggregate` can take us a little further, we can use aggregate multiple variables at a time
```{r aggregate2}
aggregate(cbind(readSS,mathSS)~race,data=df,mean)
```
- We can add multiple grouping variables using the `formula` syntax
```{r aggregate3}
head(aggregate(cbind(readSS,mathSS)~race+grade,data=df,mean),8)
```
# Crosstabs
- We can build a systematic cross-tab now
```{r crosstab}
ag<-aggregate(readSS~race+grade,data=df,mean)
xtabs(readSS~., data=ag)
```
- And prettier output
```{r crosstabpretty}
ftable(xtabs(readSS~.,data=ag))
```
# Check your work
- What is the mean reading score for 6th grade students with disabilities?
* __481.83__
- How many points is this from non-disabled students?
* __29.877__
# Answer II
```{r aggregatetest}
aggregate(cbind(readSS,mathSS)~disab+grade,data=df,mean)
```
# School Means
- Consider the case we want to turn our student level data into school level data
- Who hasn't had to do this?!?
- In `aggregate` we do:
```{r aggregate4}
z<-aggregate(readSS~dist,FUN=mean,data=df)
z
```
- But I want more! I want to aggregate multiple variables. I want to do it across multiple groups. I want the output to be a dataframe I can work on.
- Thank you `plyr`
# Aggregate Isn't Enough
- `aggregate` is cool, but it isn't very flexible
- We can only use aggregate output as a table, which we have to convert to a data frame
- There is a better way; the `plyr` package
- `plyr` is a set of routines/logical structure for transforming, summarizing, reshaping, and reorganizing data objects of one type in R into another type (or the same type)
- We will focus here on summarizing and aggregating a data frame, but later in the bootcamp we'll apply functions to lists and turn lists into data frames as well
- This is cool!
# The Logic of plyr
- In R this is known as "split, apply, and combine"
- Why? First, we **split** the data into groups by some factor or logical operator
- Then we **apply** some function or another to that group (i.e. count the unique values of a variable, take the mean of a variable, etc.)
- Then we **combine** the data back together
- This has some advantages - unlike other methods, the data does not have to be ordered by our ID variable for this to work
- The disadvantage is that this method is computationally expensive, even in R, and requires copying our data frame, which uses up RAM
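The split-apply-combine steps can be sketched in base R without `plyr`, which makes each step explicit (toy data, invented names):

```r
# Toy student-level data
scores <- data.frame(grade = c(3, 3, 4, 4, 4),
                     readSS = c(400, 420, 450, 470, 460))
# Split: break the score vector into one piece per grade
pieces <- split(scores$readSS, scores$grade)
# Apply: compute the mean within each piece
means <- sapply(pieces, mean)
# Combine: reassemble the results into a data frame
data.frame(grade = names(means), mean_read = unname(means))
```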
# An Aside about Split-Apply-Combine
- The `plyr` package has a number of utilities to help us split-apply-combine across data types for both input and output
- In R we can't just use `for` loops to iterate over groups of students, because in R `for` loops are [slow, inefficient, and impractical](http://stackoverflow.com/questions/7142767/why-are-loops-slow-in-r)
- `plyr` to the rescue, while not as fast as a compiled language, it is pretty dang good!
- And still readable
<p align="center"><img src="img/plyrfunctions.png" height="200" width="650"></p>
# The logic of plyr
- This shows how the dataframe is broken up into pieces and each piece then gets whatever functions, summaries, or transformations we apply to it
<p align="center"><img src="img/dataframesplit.png" height="300" width="650"></p>
# How plyr works on dataframes
- And this shows the output `ddply` has before it combines it back for us when we do the call `ddply(df,.(sex,age),"nrow")`
<p align="center"><img src="img/plyrddplyoutput.png" height="260" width="650"></p>
# Using plyr
- `plyr` has a straightforward syntax
- All `plyr` functions are named in the format **XX**ply: the first letter specifies the input data type, and the second specifies the output data type
- In `plyr`, d = dataframe, l = list, m = matrix, and a = array; by far the most common usage is `ddply`
- From a dataframe, to a dataframe
- We will see more of `plyr` in Tutorial 4 as well
# plyr in Action
```{r plyr1,tidy=FALSE}
library(plyr)
myag<-ddply(df, .(dist,grade),summarize,
mean_read=mean(readSS,na.rm=T),
mean_math=mean(mathSS,na.rm=T),
sd_read=sd(readSS,na.rm=T),
sd_math=sd(mathSS,na.rm=T),
count_read=length(readSS),
count_math=length(mathSS))
```
- This looks complex, but it only has a few components.
- The first argument is the dataframe we are working on, the next argument is the level of identification we want to aggregate to
- `summarize` tells `ddply` what we are doing to the data frame
- Then we make a list of new variable names, and how to calculate them on each of the subsets in our large data frame
- That's it!
# Results
```{r plyr1.1,tidy=FALSE}
head(myag)
```
# More plyr
- This is great, we can quickly build a summary dataset from individual records
- A few advanced tricks. How do we build counts and percentages into our dataset?
```{r plyr3,tidy=FALSE}
myag<-ddply(df, .(dist,grade),summarize,
mean_read=mean(readSS,na.rm=T),
mean_math=mean(mathSS,na.rm=T),
sd_read=sd(readSS,na.rm=T),
sd_math=sd(mathSS,na.rm=T),
count_read=length(readSS),
count_math=length(mathSS),
count_black=length(race[race=='B']),
per_black=length(race[race=='B'])/length(readSS))
summary(myag[,7:10])
```
# Note for SQL Junkies
- There is an alternative to `plyr` called `data.table`, which is really handy
- It allows SQL like querying of R data frames
- It is incredibly fast
- It will be incorporated into the next `plyr` version
- You can [read up on it online](http://datatable.r-forge.r-project.org/)
# Quick Exercises in ddply
- What if we want to compare how districts do on educating ELL students?
- What district ID has the highest mean score for 4th grade ELL students on reading? Math?
* 66 in reading, 105 in math
- How many students are in these classes?
* 12 and 7 respectively
# Answer III
```{r check2,tidy=FALSE}
myag2<-ddply(df, .(dist,grade,ell),summarize,
mean_read=mean(readSS,na.rm=T),
mean_math=mean(mathSS,na.rm=T),
sd_read=sd(readSS,na.rm=T),
sd_math=sd(mathSS,na.rm=T),
count_read=length(readSS),
count_math=length(mathSS),
count_black=length(race[race=='B']),
per_black=length(race[race=='B'])/length(readSS))
subset(myag2,ell==1&grade==4)
```
# Sorting
- A key way to explore data in tabular form is to sort data
- Sorting data in R can be dangerous because you can reorder a single vector independently of the rest of the dataframe
- We use the `order` function to sort data
```{r sortwrong}
df.badsort<-order(df$readSS,df$mathSS)
head(df.badsort)
```
- Why is this wrong? What is R giving us?
- Row indices, not a reordered dataframe...
# Correct Example
- To fix it, we need to tell R to reorder the dataframe using those row indices in the order we want
```{r sortright}
df.sort<-df[ order(df$readSS,df$mathSS,df$attday),]
head(df[,c(3,23,29,30)])
head(df.sort[,c(3,23,29,30)])
```
# Let's clean it up a bit more
```{r sortright2}
head(df[with(df,order(-readSS,-attday)),c(3,23,29,30)])
```
- Here we find the high-performing students; note that the `-` denotes descending order, while R's default is ascending
- This is easy to correct
# About sorting
- Sorting works differently on some data types like matrices
```{r matrixsort}
M<-matrix(c(1,2,2,2,3,6,4,5),4,2,byrow=FALSE,dimnames=list(NULL,c("a","b")))
M[order(M[,"a"],-M[,"b"]),] # can't use "with"
```
# About Sorting
- Sorting tables works in a similar way
```{r sorttable}
mytab<-table(df$grade,df$year)
mytab[order(mytab[,1]),]
mytab[order(mytab[,2]),]
```
# Filtering Data
- Filtering data is an incredibly powerful feature and we have already seen it used to do some interesting things
- Filtering data in R is loaded with trouble though, because the filtering arguments must be very carefully specified
- Filtering is like a mini-sort, and we've done it already
- Always, always, always check your work
- And remember, this is where NAs do the most damage
- Let's look at some examples
# Basic Filtering a Column
```{r columnfilter}
# Gives all rows that meet this requirement
df[df$readSS>800,]
df$grade[df$mathSS>800]
# Gives all values of grade that meet this requirement
```
- Before the brackets we specify what we want returned, and within the brackets we present the logical expression to evaluate
- Behind the scenes R does a logical test and gets the row numbers that match the logical expression
- It then combines them back with the object in front of the brackets to return the values
- This seems basic enough, let's filter on multiple dimensions
# Multiple filters
```{r filterlayers}
df$grade[df$black==1 & df$readSS>650]
```
- The **&** operator tells R we want rows where **both** of these are true
- How would we tell R we wanted rows where **either** were true?
- What happens if we type `df$black=1` or `black==1`?
- Why won't this work?
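To preview the "either" case, the `|` (OR) operator replaces `&`; a hedged sketch on invented vectors standing in for the columns of `df`:

```r
# Invented stand-ins for df$black, df$readSS, and df$grade
black  <- c(1, 0, 1, 0)
readSS <- c(600, 700, 660, 640)
grade  <- c(3, 4, 5, 6)
grade[black == 1 & readSS > 650]  # BOTH conditions must hold
grade[black == 1 | readSS > 650]  # EITHER condition may hold
```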
# Using filters to assign values
- We can also use filters to assign values as well
- This is how you recode variables and create new ones
- Let's create a variable `spread` indicating whether a district has high or low spread among its student scores
```{r filtersort}
myag$spread<-NA # create variable
myag$spread[myag$sd_read<75]<-"low"
myag$spread[myag$sd_read>75]<-"high"
myag$spread<-as.factor(myag$spread)
summary(myag$spread)
```
- How did we define **spread** in this block of code?
# How does it work?
- The previous block of code is a useful way to learn how to recode variables
```{r recodeexamp,eval=FALSE}
myag$spread<-NA # create variable
myag$spread[myag$sd_read<75]<-"low"
myag$spread[myag$sd_read>75]<-"high"
myag$spread<-as.factor(myag$spread)
```
- Create a new variable in `myag` called `schoolperf` for `mean_math` scores with the following coding scheme:
Grade  Score Range  Code
-----  -----------  ----
3      >425         "hi"
4      >450         "hi"
5      >475         "hi"
6      >500         "hi"
7      >525         "hi"
8      >575         "hi"
- All other values are coded as "lo"
- How many "hi" and "lo" observations do we have?
- By `dist`?
# Results
```{r answercoding}
myag$schoolperf<-"lo"
myag$schoolperf[myag$grade==3 & myag$mean_math>425]<-"hi"
myag$schoolperf[myag$grade==4 & myag$mean_math>450]<-"hi"
myag$schoolperf[myag$grade==5 & myag$mean_math>475]<-"hi"
myag$schoolperf[myag$grade==6 & myag$mean_math>500]<-"hi"
myag$schoolperf[myag$grade==7 & myag$mean_math>525]<-"hi"
myag$schoolperf[myag$grade==8 & myag$mean_math>575]<-"hi"
myag$schoolperf<-as.factor(myag$schoolperf)
summary(myag$schoolperf)
table(myag$dist,myag$schoolperf)
```
# Let's replace data
- For district 6, let's remove the grade 3 scores by replacing them with missing data
```{r replacedata}
myag$mean_read[myag$dist==6 & myag$grade==3]<-NA
head(myag[,1:4],2)
```
- Let's replace one data element with another
```{r replacedata2}
myag$mean_read[myag$dist==6 & myag$grade==3]<-myag$mean_read[myag$dist==6 & myag$grade==4]
head(myag[,1:4],2)
```
- Voila
# Why do NAs matter so much?
- Let's consider the case above but insert some NA values for all 3rd grade tests
```{r munge}
myag$mean_read[myag$grade==3]<-NA
head(myag[order(myag$grade),1:4])
```
# NAs II
- Now let's calculate a few statistics:
```{r means}
mean(myag$mean_math)
mean(myag$mean_read)
```
- Remember, NA values propagate: R assumes an NA could take literally any value, so it is impossible to know the `mean` of a vector containing NA
- We can override this though:
```{r meansna}
mean(myag$mean_math,na.rm=T)
mean(myag$mean_read,na.rm=T)
```
# Beyond the Mean
- But for other problems it is tricky
- What if we want to know the number of rows that have a `mean_read` of less than 500?
```{r length}
length(myag$dist[myag$mean_read<500])
head(myag$mean_read[myag$mean_read<500])
```
- And what if we want to add the standard deviation to these vectors?
```{r addtomissing}
badvar<-myag$mean_read+myag$sd_read
summary(badvar)
```
# So we need to filter NAs explicitly
- Consider the case where two sets of variables have different missing elements
```{r moremissing}
myag$sd_read[myag$count_read<100 & myag$mean_read<550]<-NA
length(myag$mean_read[myag$mean_read<550])
length(myag$mean_read[myag$mean_read<550 & !is.na(myag$mean_read)])
```
- What is `!is.na()` ?
* `is.na()` returns TRUE when a value is missing
* `!` is the negation (NOT) operator
* We are asking R whether each value is non-missing, and to give us only the non-missing values back
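A minimal standalone illustration of `is.na()` and its negation:

```r
x <- c(480, NA, 530, 510, NA)
is.na(x)                 # TRUE where a value is missing
!is.na(x)                # TRUE where a value is present
x[x < 520 & !is.na(x)]   # a filter that excludes NAs explicitly
```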
# Merging Data
- It is unlikely all the data we will want resides in a single dataset and often we have to combine data from several sources
- R makes this easy, but that simplicity comes at a cost - it can be easy to make mistakes if you don't specify things carefully
- Let's merge attributes about a student's school with the student row data
- We might want to do that if we want to evaluate the performance of students in different school climates, and school climate was measured in part by the mean performance
# Merging Data II
- We have two data objects `df` which has multiple rows per student and `myag` which has multiple rows per school
- What are the variables that **link** these two together?
```{r mymerge1}
names(myag)
names(df[,c(2,3,4,6)])
```
- It looks like `dist` and `grade` are in common. Is this ok?
- Why might we want to consider re-aggregating with `year` as well?
- For this example we won't just yet
# Merge Options
- We have a few options with `merge` we want to consider with `?merge`
- In the simple case we let `merge` **automagically** combine the data
```{r simplemerge}
simple_merge<-merge(df,myag)
names(simple_merge)
```
- It looks like it did a good job
# Merge Options
- In complicated cases, merge has some important options we should review
- First is the simple sounding 'by' argument:
- `merge(df1, df2, by = c("id1", "id2"))`
- We can also specify `merge(df1, df2, by.x = c("id1", "id2"), by.y = c("id1_a", "id2_a"))`
- This allows our ID variables to have different names in the two data frames
- Now, what if the two objects are different sizes and not every row has a match?
- `merge(df1, df2, all.x = TRUE, all.y = TRUE)`
- We can tell R whether we want to keep all of the `x` observations (df1), all the `y` observations (df2) or neither, or both
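These options can be demonstrated on two toy data frames with differently named keys and non-overlapping rows (all names here are invented):

```r
students <- data.frame(sid = c(1, 2, 3), score = c(450, 480, 500))
schools  <- data.frame(stuid = c(2, 3, 4), name = c("A", "B", "C"))
# by.x / by.y let the key columns have different names
inner <- merge(students, schools, by.x = "sid", by.y = "stuid")
# all.x = TRUE also keeps unmatched rows from the first data frame
left <- merge(students, schools, by.x = "sid", by.y = "stuid", all.x = TRUE)
nrow(inner)  # only the matching rows
nrow(left)   # all rows of students, with NA where schools has no match
```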
```{r createwidedata,echo=FALSE,results='hide'}
widedf<-reshape(df,timevar="year",idvar="stuid",direction="wide")
```
# Reshaping Data
- Reshaping data is a slightly different issue than aggregating data
- Let's review the two data types: long and wide
```{r longpreview}
head(df[,1:10],3)
```
- Now let's look at wide:
```{r widepreview}
head(widedf[,c(1,28:40)],3)
```
- How did we reshape this data?
# Wide Data v. Long Data
- The great debate
- Most econometrics, panel, and time series datasets come wide and so these seem familiar
- R for most cases prefers long data, including for most graphing and analysis functions
- So we have to learn both
# The reshape Function
- `reshape` is the way to move between long and wide
- The data stays the same, but the shape of it changes
- The long data had dimensions: `r dim(df)`
- The wide data has dimensions: `r dim(widedf)`
- How do we get to these numbers?
* The rows in the wide dataframe represent unique students
# Deconstructing reshape
```{r reshapeanaly, eval=FALSE}
widedf<-reshape(df,timevar="year",idvar="stuid",direction="wide")
```
- `idvar` represents the unit we want to represent a single row, in this case each unique student gets a single row
- In this simple case `timevar` is the variable that differentiates between two rows with the same student ID
- Note that `timevar` needn't always represent time!
- `direction` tells R we are going to move to wide data
- As written all data will move, but using the `varying` argument we can tell R explicitly which items we want to move wide
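The same call can be tried on a tiny two-student, two-year data frame (invented data), which makes the column renaming visible:

```r
long <- data.frame(stuid  = c(1, 1, 2, 2),
                   year   = c(2000, 2001, 2000, 2001),
                   readSS = c(400, 420, 450, 455))
# One row per student; readSS splits into readSS.2000 and readSS.2001
wide <- reshape(long, timevar = "year", idvar = "stuid", direction = "wide")
wide
```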
# What about Wide to Long?
- We often need to do this to plot data in R
- Luckily the `reshape` function works well in both directions
```{r widetolong}
longdf<-reshape(widedf,idvar='stuid',timevar='year',
varying=names(widedf[,2:91]),direction='long',sep=".")
```
- If our data is formatted nicely, R can do the guessing and identify the years for us by parsing the column names
# Subsetting Data
- We have already seen a lot of subsetting examples above (that is what filtering is), but R provides some great shortcuts
- Let's look at the `subset` function to get only 4th grade scores
```{r subset1}
g4<-subset(df,grade==4)
dim(g4)
```
- This is equivalent to:
```{r subset2}
g4_b<-df[df$grade==4,]
```
- These two elements are the same:
```{r testofequality}
identical(g4,g4_b)
```
# That's it
- Now you can filter, subset, sort, recode, and aggregate data!
- Let's look at a few exercises to test these skills
- Once these skills are mastered, we can begin to understand how to automate R to clean data with known errors, and to recode data in R so it is ready to be used for analysis
- Then we can really take off!
# Exercises
1. Say we are unhappy about attributing the school/grade mean score across years to student-year observations like we did in this lesson. Let's fix it by **first** aggregating our student data frame to a school/grade/year data frame, and **second** by merging that new data frame with our student level data.
2. Sort the student-level data frame on `attday` and `ability` in descending order.
3. Find the highest proportion of black students in any school/grade/year combination.
# Other References
- [Quick-R: Data Management](http://www.statmethods.net/management/index.html)
- [UCLA ATS: R FAQ on Data Management](http://www.ats.ucla.edu/stat/r/faq/default.htm)
- [Video Tutorials](http://www.twotorials.com/)
- [UCLA R Data Sorting Tutorial](http://www.ats.ucla.edu/stat/r/faq/sort.htm)
# Session Info
It is good to include the session info, e.g. this document is produced with **knitr** version `r packageVersion('knitr')`. Here is my session info:
```{r session-info}
print(sessionInfo(), locale=FALSE)
```
# Attribution and License
<p xmlns:dct="http://purl.org/dc/terms/">
<a rel="license" href="http://creativecommons.org/publicdomain/mark/1.0/">
<img src="http://i.creativecommons.org/p/mark/1.0/88x31.png"
style="border-style: none;" alt="Public Domain Mark" />
</a>
<br />
This work (<span property="dct:title">R Tutorial for Education</span>, by <a href="www.jaredknowles.com" rel="dct:creator"><span property="dct:title">Jared E. Knowles</span></a>), in service of the <a href="http://www.dpi.wi.gov" rel="dct:publisher"><span property="dct:title">Wisconsin Department of Public Instruction</span></a>, is free of known copyright restrictions.
</p>