St. Lawrence University

Mathematics, Computer Science, and Statistics Department

FOS 2006

Jeff Cluckey

Faculty Sponsor: Patti Frazer Lock, Mathematics

The Optimal Assignment Problem

The optimal assignment problem asks how to assign workers to jobs in the most effective way, given a measure of how effective each worker is at each job. We discuss a solution to this problem and some applications, using methods from graph theory.
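To make the problem statement concrete, here is a minimal sketch in Python that finds the best assignment by trying every possible one. This exhaustive search is not the graph-theoretic method the talk presents, and it is only practical for a handful of workers. The function name `best_assignment` and the score matrix are made up for illustration.

```python
from itertools import permutations

def best_assignment(effectiveness):
    """Return (total, assignment) with the largest total effectiveness.

    effectiveness[w][j] is the score of worker w on job j (square matrix).
    assignment[w] is the job given to worker w.
    """
    n = len(effectiveness)
    best_total, best_perm = None, None
    # Try every one-to-one matching of workers to jobs (n! of them).
    for perm in permutations(range(n)):
        total = sum(effectiveness[w][perm[w]] for w in range(n))
        if best_total is None or total > best_total:
            best_total, best_perm = total, perm
    return best_total, list(best_perm)

# Hypothetical scores: rows are workers, columns are jobs.
scores = [
    [9, 2, 7],
    [6, 4, 3],
    [5, 8, 1],
]
total, assignment = best_assignment(scores)
print(total, assignment)  # 21 [2, 0, 1]
```

Note that giving worker 0 the job they are best at (job 0) is not optimal here, which is why the problem needs more than a greedy choice.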

Jeffrey DiGeronimo

Faculty Sponsor: Dr. Robin Lock, Statistics

A Player’s Salary Through Statistical Analysis of Past Performance

What is a current Major League Baseball (MLB) player worth? Players’ salaries are a source of controversy every season. Teams must decide whether a player is worth keeping or should be traded or released. Many players also face contract arbitration with the organization (team) they play for, a process that requires a decision about the player’s worth. I have collected a large data set made up of each MLB hitter’s statistics for the previous three seasons, along with career statistics. These include batting average, runs batted in, on-base percentage, home runs, etc. I also collected other variables that could play a role in a player’s salary, such as age, years of MLB experience, and position. Then, using regression models with the natural logarithm of salary as the response variable, I identify which attributes or statistics of an MLB player are associated with salary. I formulated my own analysis of a player’s stats to derive what I find to be the salary that the individual player deserves. I also applied my model to some past arbitration cases and compared its predictions to the actual arbitration rulings.
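As an illustration of regressing the natural logarithm of salary on a performance statistic, here is a minimal Python sketch with a single predictor. The data are synthetic numbers generated from an exact log-linear rule, not real MLB figures. The helper `fit_line` and the variable names are assumptions for this example, and the project's actual models use many more variables.

```python
import math

def fit_line(xs, ys):
    """Ordinary least squares fit of y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope

# Synthetic toy data (NOT real salaries): salary = 500000 * exp(0.05 * HR).
home_runs = [5, 10, 20, 30, 40]
salaries = [500_000 * math.exp(0.05 * hr) for hr in home_runs]

# The response variable is the natural log of salary.
log_salaries = [math.log(s) for s in salaries]
intercept, slope = fit_line(home_runs, log_salaries)

# Back-transform a prediction for a hypothetical 25-home-run hitter.
predicted_salary = math.exp(intercept + slope * 25)
```

On the log scale, a slope of 0.05 means each additional home run multiplies the predicted salary by about exp(0.05), roughly a 5% increase.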

Raluca Dragusanu

Faculty Sponsor: Dr. Robin Lock, Mathematics

Minimizing Risks: Modeling Financial Time Series Data with ARCH and GARCH

Traditional time-series models such as Autoregressive (AR) and Moving Average (MA) models are based on the homoskedasticity assumption, which means that the errors of the model have constant variance. This assumption has been shown to be inappropriate for some economic and financial market data. A new class of models, conditional heteroskedastic models, was developed to deal with data whose errors do not exhibit constant variance. The best-known models in this class are the Autoregressive Conditional Heteroskedastic (ARCH) model and its generalized version (GARCH). Stock market volatility, the square root of the variance of stock returns, is a very good application of this type of model. In finance, volatility is the expression of risk. Since we must take risks to achieve rewards, finding appropriate methods to forecast volatility is necessary in order to optimize our behavior and, in particular, our portfolio. I will present the general properties of the ARCH and GARCH models and use both Monte Carlo simulations and known financial time series data to test their performance.
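A Monte Carlo simulation of this kind can be sketched in a few lines of Python. The example assumes the common GARCH(1,1) form, in which the conditional variance follows var_t = omega + alpha * r_{t-1}^2 + beta * var_{t-1}, with standard normal shocks. The parameter values and the function name `simulate_garch11` are illustrative choices, not taken from the project.

```python
import math
import random

def simulate_garch11(n, omega, alpha, beta, seed=0):
    """Simulate n returns from a GARCH(1,1) process with normal shocks.

    Returns (returns, variances), where variances[t] is the conditional
    variance used to generate returns[t].
    """
    if alpha + beta >= 1:
        raise ValueError("alpha + beta must be < 1 for a finite variance")
    rng = random.Random(seed)
    # Start at the unconditional (long-run) variance.
    var = omega / (1.0 - alpha - beta)
    returns, variances = [], []
    for _ in range(n):
        ret = math.sqrt(var) * rng.gauss(0.0, 1.0)
        returns.append(ret)
        variances.append(var)
        # Next period's variance reacts to the latest squared return.
        var = omega + alpha * ret ** 2 + beta * var
    return returns, variances

returns, variances = simulate_garch11(1000, omega=0.1, alpha=0.1, beta=0.8)
volatility = [math.sqrt(v) for v in variances]  # conditional std. deviation
```

Setting beta to 0 reduces the recursion to an ARCH(1) model, so the same sketch covers both cases.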

Travis Gingras

Faculty Sponsor: Robin Lock, Statistics

Applications of a Geographic Information System to Ice Hockey

Statistics and sports have been linked for many years, and recently the use of statistics to study players’ tendencies has become increasingly common among coaches. This project investigates patterns of shots taken by the St. Lawrence Men’s hockey team using a geographic information system. ArcGIS is a mapping program generally designed for geographical data, but in this project we have defined a database to store information about individual shots in multiple hockey games while placing them on a map of the offensive zone of a hockey rink. We can then study patterns and look for trends that might benefit individual players or the team as a whole.

James Hall

Faculty Sponsor: Robin Lock, Mathematics

Investigating the Effectiveness of the Bootstrap

The statistical procedure known as “bootstrapping” is used to approximate a sampling distribution for any statistic by resampling with replacement from an original sample, in order to draw conclusions about the shape, center and variability of the sample statistic. These methods avoid traditional assumptions, such as assuming that a population is normally distributed. We give a brief description of bootstrapping techniques and demonstrate via computer simulation (using the statistical software packages R and Fathom) the effectiveness, in terms of coverage and average width, of bootstrap confidence intervals compared to traditional confidence intervals in standard situations and in cases where standard assumptions fail.
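The resampling idea can be sketched as follows. The project used R and Fathom; this example uses Python instead and shows only one common variant, the percentile interval. The sample values and the function name `bootstrap_percentile_ci` are made up for illustration.

```python
import random
import statistics

def bootstrap_percentile_ci(sample, stat=statistics.mean,
                            n_boot=2000, level=0.95, seed=0):
    """Percentile bootstrap confidence interval for stat(sample)."""
    rng = random.Random(seed)
    n = len(sample)
    # Resample n values WITH replacement, n_boot times, and record the
    # statistic each time to approximate its sampling distribution.
    boot_stats = sorted(
        stat(rng.choices(sample, k=n)) for _ in range(n_boot)
    )
    # Cut off an equal share of resampled statistics in each tail.
    cut = round((1.0 - level) / 2.0 * n_boot)
    return boot_stats[cut], boot_stats[n_boot - cut - 1]

# Made-up sample of 10 measurements.
sample = [2.1, 3.4, 1.9, 5.6, 4.2, 3.8, 2.7, 6.1, 3.3, 4.9]
low, high = bootstrap_percentile_ci(sample)
```

Coverage could then be estimated by repeating this on many samples drawn from a known population and counting how often the interval contains the true parameter.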

Kristen MacMurray

Faculty Sponsor: Dr. Patti Frazer Lock, Mathematics

The Gossip Number and The E-mail Gossip Number

Assume every person in a group of people has a unique tidbit of gossip to share. How many conversations must occur before everyone in the group knows all the gossip? It depends on what we assume about the conversations. The gossip number assumes that conversations occur between two people who tell each other everything they know. The e-mail gossip number assumes that one person shares all the gossip that he or she knows with all his or her friends in a mass mailing. We discuss some interesting results about the gossip number and the e-mail gossip number of a graph.
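The two-person conversation model can be simulated directly. The Python sketch below assumes the simplest setting, in which any two people may talk (a complete graph) and there are at least four people. It builds one well-known schedule that uses 2n - 4 conversations, which matches the classical minimum for that setting, and checks that everyone ends up knowing everything. The function names are illustrative, and other graphs may need more conversations.

```python
def gossip_schedule(n):
    """A schedule of 2n - 4 two-person conversations for n >= 4 people."""
    if n < 4:
        raise ValueError("this schedule assumes at least four people")
    others = list(range(4, n))
    # Phase 1: everyone outside the hub {0, 1, 2, 3} tells person 0.
    calls = [(p, 0) for p in others]
    # Phase 2: four conversations spread all gossip within the hub.
    calls += [(0, 1), (2, 3), (0, 2), (1, 3)]
    # Phase 3: person 0 passes everything back to the others.
    calls += [(p, 0) for p in others]
    return calls

def everyone_knows_all(n, calls):
    """Simulate the conversations; both parties share all they know."""
    knows = [{p} for p in range(n)]
    for a, b in calls:
        merged = knows[a] | knows[b]
        knows[a] = merged
        knows[b] = set(merged)
    return all(len(k) == n for k in knows)

calls = gossip_schedule(6)
print(len(calls), everyone_knows_all(6, calls))  # 8 True
```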

Yordan Minev

Faculty Sponsor: Michael Schuckers, Statistics

ROC Confidence Regions Using Radial Sweep Methods

One methodology for evaluating the matching performance of biometric authentication systems is the receiver operating characteristic (ROC) curve. A biometric authentication system matches physiological characteristics to a database of such characteristics. The ROC curve graphically illustrates the relationship between type I and type II statistical errors as a threshold is varied across the genuine and imposter match distributions. In biometric authentication, genuine users are generally those that the system should accept and imposters are those that the system should reject. In this project ROC confidence regions are created using radial sweep methods. Radial sweep is based on converting the type I and type II errors to polar coordinates. The goal of the project is to estimate the performance of each biometric system via a confidence region and to identify the most effective method for computing such a confidence region for an ROC curve of that system’s performance.
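The following Python sketch shows how the two error rates arise from a threshold and how a point on the curve can be expressed in polar coordinates. It makes several assumptions not stated in the abstract: higher scores mean a better match, a match is accepted when its score meets the threshold, and the polar origin sits at (0, 0). The scores are invented, and the sketch does not build the confidence regions themselves.

```python
import math

def error_rates(genuine, imposter, threshold):
    """Return (false_accept_rate, false_reject_rate) at a threshold.

    Assumes a match is accepted when its score is >= threshold.
    """
    far = sum(s >= threshold for s in imposter) / len(imposter)
    frr = sum(s < threshold for s in genuine) / len(genuine)
    return far, frr

def to_polar(far, frr):
    """Convert an error-rate pair to (radius, angle) about the origin."""
    return math.hypot(far, frr), math.atan2(frr, far)

# Invented match scores for illustration only.
genuine = [0.9, 0.8, 0.75, 0.6, 0.55]
imposter = [0.1, 0.2, 0.35, 0.5, 0.65]

# Sweeping the threshold traces out points on the curve.
thresholds = [0.3, 0.6, 0.7]
curve = [error_rates(genuine, imposter, t) for t in thresholds]
polar = [to_polar(far, frr) for far, frr in curve]
```

In polar form each point on the curve is indexed by its angle, so variability can be examined along rays from the origin rather than along either error axis.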

Emily Sheldon

Faculty Sponsor: Michael Schuckers, Statistics

Sequential Analysis of the Beta Binomial

In this presentation we attempt to derive an equation from the Beta-binomial distribution that can be used to apply sequential probability ratio testing to biometric devices. We first examine sequential analysis testing methods and then apply them to examples of multiple independent Bernoulli trials. We use these examples to illustrate the decision of when to stop testing. Lastly we examine the Beta-binomial distribution and derive an equation that can be used in sequential analysis methodology.
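For independent Bernoulli trials, the stopping decision can be illustrated with Wald's sequential probability ratio test. The Python sketch below uses Wald's usual approximate boundaries. The hypothesized error rates `p0` and `p1`, the error levels, and the function name are illustrative assumptions. The Beta-binomial version derived in the presentation is not reproduced here.

```python
import math

def sprt_bernoulli(observations, p0, p1, alpha=0.05, beta=0.05):
    """Wald's SPRT for H0: p = p0 versus H1: p = p1 (with p1 > p0).

    Returns (decision, number_of_trials_used).
    """
    upper = math.log((1 - beta) / alpha)   # cross this: accept H1
    lower = math.log(beta / (1 - alpha))   # cross this: accept H0
    llr = 0.0  # running log-likelihood ratio
    for i, x in enumerate(observations, start=1):
        if x:
            llr += math.log(p1 / p0)
        else:
            llr += math.log((1 - p1) / (1 - p0))
        if llr >= upper:
            return "accept H1", i
        if llr <= lower:
            return "accept H0", i
    return "continue", len(observations)

# 1 = error observed on a trial, 0 = no error (made-up sequences).
print(sprt_bernoulli([1, 1, 1, 1], p0=0.1, p1=0.3))  # ('accept H1', 3)
print(sprt_bernoulli([0] * 20, p0=0.1, p1=0.3))      # ('accept H0', 12)
```

The test stops as soon as the evidence crosses either boundary, so the number of trials is not fixed in advance.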

Joshua White

Faculty Sponsor: Dr. Robin Lock, Statistics

Predicting a Pitcher’s Salary Using Statistical Techniques

When a baseball player’s performance is compared to his salary, there should be some kind of correlation: the better the player, the better the salary. However, a player’s salary does not always reflect his worth. Throughout the season, owners and general managers face decisions about a player’s salary and whether or not to keep him. This may create conflict between a player and his team, forcing the player into arbitration, a process in which an external person chooses a salary figure based on arguments presented by the team and the player. The main goal of this project is to examine models for predicting a pitcher’s salary based on past performance. I started by compiling a database of Major League Baseball pitchers’ statistics from a Microsoft Access file downloaded from the internet. This will include the obvious variables for a pitcher (ERA, walks, strikeouts, etc.) and will also include others such as age, years in the league, free agent status, and starting pitcher vs. relief. Once the database is established, I plan to perform statistical analysis, using techniques such as multiple regression, to create and assess models for predicting pitchers’ salaries. Once all of the statistical analysis is complete, individual case studies, such as actual arbitration cases for current pitchers, will be examined and tested using the models developed.






Created: May 2, 2006
Peg Barkley
Math, CS & Stats. Dept.