The Batting data.frame contains only counting statistics, not the batting statistics that can be derived from them. This function calculates batting average (BA), plate appearances (PA), total bases (TB), slugging percentage (SlugPct), on-base percentage (OBP), on-base percentage + slugging (OPS), and batting average on balls in play (BABIP) for each record in a Batting-like data.frame.

Usage

battingStats(data = Lahman::Batting, 
             idvars = c("playerID", "yearID", "stint", "teamID", "lgID"), 
             cbind = TRUE)

Arguments

data

input data, typically Batting

idvars

ID variables to include in the output data.frame

cbind

If TRUE, the calculated statistics are appended to the input data as additional columns

Details

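Standard calculations, e.g., BA <- H/AB, are problematic because of the presence of NAs and zeros: a missing count propagates NA through the result, and a record with zero at-bats gives 0/0, which is NaN. This function tries to deal with those problems.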
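
To make the problem concrete, here is a minimal base-R sketch on a toy data.frame. The records and the names `toy`, `naive_BA`, and `safe_BA` are hypothetical, and the guard shown is only one plausible way to handle these cases, not necessarily what battingStats does internally.

```r
# Toy Batting-like records (hypothetical values, not taken from Lahman)
toy <- data.frame(
  playerID = c("a", "b", "c"),
  AB = c(468L, 0L, NA),
  H  = c(131L, 0L, 2L)
)

# Naive calculation: 0/0 yields NaN, and a missing AB yields NA
naive_BA <- toy$H / toy$AB

# Guarded calculation (an assumption, for illustration):
# report NA whenever AB is missing or not positive
safe_BA <- ifelse(!is.na(toy$AB) & toy$AB > 0,
                  round(toy$H / toy$AB, 3),
                  NA)
```

With the guard, records without a usable at-bat count are uniformly NA rather than a mix of NA and NaN, which keeps later summaries such as `mean(safe_BA, na.rm = TRUE)` predictable.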

Value

A data.frame with all the observations in data. If cbind is TRUE (the default), the calculated variables are appended to the columns of data; if cbind is FALSE, only the idvars and the calculated variables are returned.

Author

Michael Friendly, Dennis Murphy

Examples

  bstats <- battingStats()
  str(bstats)
#> 'data.frame':	113799 obs. of  29 variables:
#>  $ playerID: chr  "aardsda01" "aardsda01" "aardsda01" "aardsda01" ...
#>  $ yearID  : int  2004 2006 2007 2008 2009 2010 2012 2013 2015 1954 ...
#>  $ stint   : int  1 1 1 1 1 1 1 1 1 1 ...
#>  $ teamID  : Factor w/ 149 levels "ALT","ANA","ARI",..: 117 35 33 16 116 116 93 94 4 80 ...
#>  $ lgID    : Factor w/ 7 levels "AA","AL","FL",..: 5 5 2 2 2 2 2 5 5 5 ...
#>  $ G       : int  11 45 25 47 73 53 1 43 33 122 ...
#>  $ AB      : int  0 2 0 1 0 0 0 0 1 468 ...
#>  $ R       : int  0 0 0 0 0 0 0 0 0 58 ...
#>  $ H       : int  0 0 0 0 0 0 0 0 0 131 ...
#>  $ X2B     : int  0 0 0 0 0 0 0 0 0 27 ...
#>  $ X3B     : int  0 0 0 0 0 0 0 0 0 6 ...
#>  $ HR      : int  0 0 0 0 0 0 0 0 0 13 ...
#>  $ RBI     : int  0 0 0 0 0 0 0 0 0 69 ...
#>  $ SB      : int  0 0 0 0 0 0 0 0 0 2 ...
#>  $ CS      : int  0 0 0 0 0 0 0 0 0 2 ...
#>  $ BB      : int  0 0 0 0 0 0 0 0 0 28 ...
#>  $ SO      : int  0 0 0 1 0 0 0 0 1 39 ...
#>  $ IBB     : int  0 0 0 0 0 0 0 0 0 NA ...
#>  $ HBP     : int  0 0 0 0 0 0 0 0 0 3 ...
#>  $ SH      : int  0 1 0 0 0 0 0 0 0 6 ...
#>  $ SF      : int  0 0 0 0 0 0 0 0 0 4 ...
#>  $ GIDP    : int  0 0 0 0 0 0 0 0 0 13 ...
#>  $ BA      : num  NA 0 NA 0 NA NA NA NA 0 0.28 ...
#>  $ PA      : num  0 3 0 1 0 0 0 0 1 509 ...
#>  $ TB      : num  0 0 0 0 0 0 0 0 0 209 ...
#>  $ SlugPct : num  NA 0 NA 0 NA NA NA NA 0 0.447 ...
#>  $ OBP     : num  NA 0 NA 0 NA NA NA NA 0 0.322 ...
#>  $ OPS     : num  NA 0 NA 0 NA NA NA NA 0 0.769 ...
#>  $ BABIP   : num  NA 0 NA NaN NA NA NA NA NaN 0.281 ...
  bstats <- battingStats(cbind=FALSE)
  str(bstats)
#> 'data.frame':	113799 obs. of  12 variables:
#>  $ playerID: chr  "aardsda01" "aardsda01" "aardsda01" "aardsda01" ...
#>  $ yearID  : int  2004 2006 2007 2008 2009 2010 2012 2013 2015 1954 ...
#>  $ stint   : int  1 1 1 1 1 1 1 1 1 1 ...
#>  $ teamID  : Factor w/ 149 levels "ALT","ANA","ARI",..: 117 35 33 16 116 116 93 94 4 80 ...
#>  $ lgID    : Factor w/ 7 levels "AA","AL","FL",..: 5 5 2 2 2 2 2 5 5 5 ...
#>  $ BA      : num  NA 0 NA 0 NA NA NA NA 0 0.28 ...
#>  $ PA      : num  0 3 0 1 0 0 0 0 1 509 ...
#>  $ TB      : num  0 0 0 0 0 0 0 0 0 209 ...
#>  $ SlugPct : num  NA 0 NA 0 NA NA NA NA 0 0.447 ...
#>  $ OBP     : num  NA 0 NA 0 NA NA NA NA 0 0.322 ...
#>  $ OPS     : num  NA 0 NA 0 NA NA NA NA 0 0.769 ...
#>  $ BABIP   : num  NA 0 NA NaN NA NA NA NA NaN 0.281 ...
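
The calculated columns can be checked by hand against the 1954 record in the first str() output above (AB = 468, H = 131). The sketch below uses conventional definitions of each statistic; they are an assumption rather than a quotation of the function's source, chosen because they reproduce the values printed for that record.

```r
# Counts copied from the 1954 record in the str() output above
AB <- 468; H <- 131; X2B <- 27; X3B <- 6; HR <- 13
BB <- 28; SO <- 39; HBP <- 3; SH <- 6; SF <- 4

# Conventional formulas (assumed, not taken from the package source)
PA      <- AB + BB + HBP + SH + SF                          # 509
TB      <- H + X2B + 2 * X3B + 3 * HR                       # 209
BA      <- round(H / AB, 3)                                 # 0.28
SlugPct <- round(TB / AB, 3)                                # 0.447
OBP     <- round((H + BB + HBP) / (AB + BB + HBP + SF), 3)  # 0.322
OPS     <- round(OBP + SlugPct, 3)                          # 0.769
BABIP   <- round((H - HR) / (AB - SO - HR + SF), 3)         # 0.281
```

Note that X2B and X3B count doubles and triples, and H already counts every hit once, so TB adds one extra base per double, two per triple, and three per home run.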