This week we decided to test how the estimates change as we increase the number of sites and reduce the number of visits. We started with 200 sites and 16 visits and ended with 1600 sites and 2 visits.
We found that detection seems to be best with 400 sites and 8 visits, and that it is not good with only two visits. For occupancy, 800 sites and 4 visits was the best-case scenario.
We realized that there isn't a pattern linking occupancy and detection: there is no single number of sites and visits at which both are at their lowest.
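As a rough sketch of the design described above: each configuration doubles the sites and halves the visits, so the total number of surveys stays at 3200. The Python snippet below simulates detection histories for each configuration. The function name simulate_histories and the values psi=0.5 and p=0.3 are assumptions chosen for illustration, not the code or parameters we actually used.

```python
import random

def simulate_histories(n_sites, n_visits, psi, p, seed=0):
    """Simulate a sites x visits matrix of 0/1 detections.

    psi: probability that a site is occupied (assumed value).
    p:   probability of detecting the species on one visit
         to an occupied site (assumed value).
    """
    rng = random.Random(seed)
    histories = []
    for _ in range(n_sites):
        occupied = rng.random() < psi
        # An unoccupied site can never produce a detection.
        row = [int(occupied and rng.random() < p) for _ in range(n_visits)]
        histories.append(row)
    return histories

# Sites double and visits halve, so sites * visits stays at 3200.
designs = [(200, 16), (400, 8), (800, 4), (1600, 2)]

for n_sites, n_visits in designs:
    h = simulate_histories(n_sites, n_visits, psi=0.5, p=0.3)
    # Naive occupancy: fraction of sites with at least one detection.
    naive = sum(1 for row in h if any(row)) / n_sites
    print(n_sites, n_visits, round(naive, 3))
```

The naive occupancy printed here is only a quick sanity check. With fewer visits, more occupied sites are never detected, which is one reason a two-visit design can be hard on the detection estimate.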
This week we moved on from simulated data and started using the eBird data. The first task was to merge two files into one: the WETA file and the covariates file. To do this, we tested which variable would be best for joining the two files. Our options were locality_id, checklist_id, and latitude and longitude.
With locality_id, when we merged the WETA file (size: 13010 observations after filtering) and the covariates file (size: 9170 observations), we ended up with a file of size 1,097,769 with 5256 unique locality_ids after filtering.
With checklist_id, we ended up with a file of size 9170 after filtering, so we decided to use this as our data.
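A merged file much larger than either input is what we would expect when the join key repeats in both files, since every matching pair of rows becomes one output row. The sketch below counts the rows an inner join would produce for a given key. The helper joined_row_count and the toy IDs are hypothetical, not taken from the real files.

```python
from collections import Counter

def joined_row_count(left_keys, right_keys):
    """Number of rows an inner join on this key would produce."""
    left = Counter(left_keys)
    right = Counter(right_keys)
    # Each key contributes (rows on the left) * (rows on the right).
    return sum(n * right[k] for k, n in left.items())

# Toy data: three checklists at one locality, two at another.
weta_locality = ["L1", "L1", "L1", "L2", "L2"]
cov_locality = ["L1", "L1", "L1", "L2", "L2"]
weta_checklist = ["S1", "S2", "S3", "S4", "S5"]
cov_checklist = ["S1", "S2", "S3", "S4", "S5"]

print(joined_row_count(weta_locality, cov_locality))    # 3*3 + 2*2 = 13
print(joined_row_count(weta_checklist, cov_checklist))  # 5
```

Running this kind of count on each candidate key before merging is a cheap way to see whether the key identifies rows uniquely.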
This week we want to focus on figuring out what we want to measure about the results. To do this, we first decided to explore the data by visualizing it in order to understand it.
Our next task is figuring out how we want to group observations into sites.
Do we want each observation to be its own site? Do we want to follow the eBird best practices paper and define sites based on observer ID, lat/long, and number of checklists? Or do we want to create sites based on latitude and longitude alone?
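To make the three options concrete, here is a minimal sketch that groups checklist records into sites by a chosen set of fields. The field names (checklist_id, observer_id, latitude, longitude), the coordinate rounding, and the min_checklists threshold of 2 are all assumptions for illustration. They are not confirmed details of our data or of the best practices paper.

```python
from collections import defaultdict

def group_into_sites(checklists, key_fields, decimals=3, min_checklists=1):
    """Group checklist dicts into sites keyed by the chosen fields.

    Coordinates are rounded to `decimals` places so that nearly
    identical points share a site. Sites with fewer than
    `min_checklists` checklists are dropped.
    """
    sites = defaultdict(list)
    for c in checklists:
        key = tuple(
            round(c[f], decimals) if f in ("latitude", "longitude") else c[f]
            for f in key_fields
        )
        sites[key].append(c["checklist_id"])
    return {k: v for k, v in sites.items() if len(v) >= min_checklists}

# Toy records with made-up values.
checklists = [
    {"checklist_id": "S1", "observer_id": "obsA", "latitude": 44.5601, "longitude": -123.2801},
    {"checklist_id": "S2", "observer_id": "obsA", "latitude": 44.5601, "longitude": -123.2801},
    {"checklist_id": "S3", "observer_id": "obsB", "latitude": 44.5601, "longitude": -123.2801},
]

# Option 1: every observation is its own site.
one_per_checklist = group_into_sites(checklists, ["checklist_id"])
# Option 2: observer plus location, keeping only repeat-visit sites.
by_observer_location = group_into_sites(
    checklists, ["observer_id", "latitude", "longitude"], min_checklists=2
)
# Option 3: location only.
by_location = group_into_sites(checklists, ["latitude", "longitude"])
```

On the toy records, option 1 gives three single-visit sites, option 2 keeps one site with two visits, and option 3 gives one site with three visits, which shows how much the choice changes the number of sites and visits available to the model.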