Week 8 August 16th - August 20th

Sites were defined by the latitude-longitude of each observation; if two observations shared the same latitude-longitude, they were assigned the same site ID. To do this, we used the unique() function to keep one row per latitude-longitude pair in the Weta-covariates data frame, numbered those unique sites, and then merged the numbered sites back into the Weta-covariates data frame. With that, we completed our second task of creating a site column.

Next we moved on to training our data. We randomly split the data 80%-20% and used the 80% portion to fit the model and generate predictions. Since we decided to only use unique sites, we removed all duplicated sites, which brought up one challenge: the csvToUMF() function from the unmarked package expects at least two observation columns, and we only had one. At first we thought we could merge the duplicated sites to get multiple observations per site, but each duplicated site varies in the number of duplicates it has; some repeat only twice, while others repeat more than 20 times. Our approach was to create an empty observation column instead, to avoid generating simulated data. We also realized that the function only needs the necessary variable columns: our training data originally had about 41 columns, of which we would only be using 10 to get predictions. So we deleted the 31 columns that were not needed, and finally we were able to get some estimates for our predictors.

Both my teammate and I are starting our next semester of college next week, so we are wrapping up our project and starting to write a final paper, which will be shared with one of our mentors, Mark Roth. We hope it gives him good insight into our findings and helps him continue with this project.
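As a rough illustration of the site-ID step, here is a minimal sketch in base R. The data frame and column names (weta_covariates, Latitude, Longitude, SiteID) are placeholders, not the exact names from our data:

```r
# Sketch of the site-ID step; weta_covariates, Latitude, Longitude,
# and SiteID are placeholder names.

# Keep one row per unique latitude-longitude pair and number those sites
sites <- unique(weta_covariates[, c("Latitude", "Longitude")])
sites$SiteID <- seq_len(nrow(sites))

# Merge the SiteID back onto the full covariate data frame, so every
# observation at the same coordinates gets the same site ID
weta_covariates <- merge(weta_covariates, sites,
                         by = c("Latitude", "Longitude"))
```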
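Continuing that sketch, the split and model fit could look something like the following. This version builds the unmarkedFrameOccu object directly in R rather than going through csvToUMF(), and the Detection column plus the three covariate names stand in for the ten covariates we actually used:

```r
library(unmarked)

set.seed(42)

# 80/20 random split; the model is fit on the training portion
n <- nrow(weta_covariates)
train_idx <- sample(n, size = floor(0.8 * n))
train <- weta_covariates[train_idx, ]
test  <- weta_covariates[-train_idx, ]

# One real detection column plus an empty (all-NA) second column, so the
# detection matrix has two observation columns as expected, without
# generating any simulated data
y <- cbind(train$Detection, NA)

# Keep only the covariate columns used in the model (placeholder names)
site_covs <- train[, c("Elevation", "Forest", "Temperature")]

umf <- unmarkedFrameOccu(y = y, siteCovs = site_covs)

# Single-season occupancy model: constant detection, covariates on occupancy
fit <- occu(~ 1 ~ Elevation + Forest + Temperature, data = umf)
summary(fit)
```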

Written on August 20, 2021