My main efforts lately have been to transform data from their raw form to an entirely unintended purpose: serving as parameters for a model to predict our future forests. Along those lines, I've developed a lot of different tools to deal with each section of the process. I wanted to zoom out a little this morning, so I took some time to make a flowchart. Hopefully, this will help conceptualize the process for people who aren't me. It might help others who pick up the reins after I'm gone, and it might inspire others to start modeling projects of their own. Maybe!
So as you can see, we've got quite a few steps. The first step is the data sources themselves. We use FIA data and semi-private USGS data. The FIA data are freely available from the USFS, and the USGS data are more… available on request. The FIA data deal with trees in permanent plots scattered across a given state (California), and the USGS data deal with every aspect of permanent plots also in California. The USGS people have measured every tree, sapling, and seedling, but have only measured diameter. The FIA people measure a lot of different characteristics, but at first glance their data are limited to adult trees.
I developed two packages, essentially one for each data type. I was able to get most allometric, growth, and mortality parameters from the FIA data initially, and I used the USGS data to calculate spatial seed dispersal, since we have seed trap data from the USGS. Both of these packages, MakeMyForests and disperseR, are available on my GitHub.
Depending on your inputs, you might also need to generate some tree maps. SortieTreeMaps is still in its infancy and not currently on GitHub, but that may change soon enough.
After that, you just need to do some general cleanup, and then manually enter the parameters into SORTIE-ND. SORTIE-ND is an open-source modeling program that currently only runs on Windows systems. After tinkering with the parameter files and setting up your runs, you can take the summary output files from SORTIE-ND and bring them back over to R to process using SortieOutputs. This is much easier than the workflow built into SORTIE-ND when you're dealing with bulk data. Finally, you can clean up, generate figures and stats, and publish when you're ready!
I have to admit, I'm extremely jealous of all of my friends at the Ecological Society of America's 100th meeting – so I'm going to cache the ESA100 tweets here. That way, I can browse them as needed ;)
I'm currently working on translating SORTIE-ND output files into a human-readable format, and a format that R can parse well. The files are standardized for their program, but not for every program. In general, they contain 5 lines of unstructured text, followed by other lines in tab-delimited format, without quotes.
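Here is a minimal sketch of that kind of import in base R. It assumes the five leading lines can simply be skipped and that the first tab-delimited line holds the column names; neither is confirmed above. The function name `read_sortie_output` and the sample file contents are hypothetical, made up purely for illustration.

```r
# Hypothetical helper: read a file with 5 unstructured lines followed by
# tab-delimited, unquoted data. ASSUMPTION: the first tabular line is a header.
read_sortie_output <- function(path, n_skip = 5) {
  read.table(path,
             skip = n_skip,          # jump past the unstructured lines
             sep = "\t",             # tab-delimited body
             quote = "",             # fields are not quoted
             header = TRUE,
             comment.char = "",      # don't treat '#' as a comment marker
             stringsAsFactors = FALSE)
}

# Made-up example file, written to a temporary location.
# The trailing "" makes the file end with a newline.
tmp <- tempfile(fileext = ".out")
cat("free text 1", "free text 2", "free text 3", "free text 4", "free text 5",
    "Step\tSpecies\tCount",
    "1\tABCO\t12",
    "2\tPILA\t7",
    "",
    file = tmp, sep = "\n")

out <- read_sortie_output(tmp)
str(out)     # inspect the parsed data frame
unlink(tmp)  # remove the temporary file
```

Setting `quote = ""` and `comment.char = ""` keeps `read.table` from reinterpreting apostrophes or `#` characters that might turn up in unquoted fields.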
In the spirit of openness and versatility, I've been developing packages like MakeMyForests and disperseR as I go. These packages, for R, are intended to help others replicate my work, especially after I've moved on. In making R packages, you realize quickly (the minute you read some documentation on them) that you need to test your functions to have any shot of getting them put in a public repository. I've been trying to construct tests both for functions that I've already written and for every function that I write in the future.
For data input and output, you need to give R a way to test that your import works correctly. You can do this with “dummy data” files that are written to and then removed from the hard drive during testing. R has a neat little function, called “cat”, that gives you the option to write your data to a file; you can then use “unlink” to get rid of the file when you're ready.
Writing a File with cat()
cat("Col1\tCol2", "text1\ttext2", file="ex.data", sep="\n")
Using cat() Written Files
read.table("ex.data", sep="\t", header=TRUE, stringsAsFactors=FALSE)
Destroying a File with unlink()
unlink("ex.data")
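Put together, the write/read/delete round trip can be sketched as one self-contained check. This uses only base R (`tempfile`, `cat`, `read.table`, `stopifnot`, `unlink`). The function name `test_import` and its dummy contents are hypothetical, and a real package would more likely wrap the same steps in whatever testing framework it uses.

```r
# Hypothetical round-trip test: write dummy data, read it back, clean up.
test_import <- function() {
  path <- tempfile(fileext = ".data")
  on.exit(unlink(path))  # remove the dummy file even if a check fails

  # The trailing "" makes the file end with a newline
  cat("Col1\tCol2", "text1\ttext2", "", file = path, sep = "\n")

  dat <- read.table(path, sep = "\t", header = TRUE,
                    stringsAsFactors = FALSE)

  # Stop with an error if the import did not give back what was written
  stopifnot(
    identical(names(dat), c("Col1", "Col2")),
    nrow(dat) == 1,
    dat$Col1 == "text1",
    dat$Col2 == "text2"
  )
  path  # return the path so the caller can confirm it was removed
}

p <- test_import()
file.exists(p)  # check whether the dummy file is still on disk
```

Writing to a `tempfile()` path instead of the working directory means a failed test can't leave stray files next to your package source.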
So I've been working on expanding my language set lately, and I came across a really great resource that is probably under-shared. Damien, the owner and operator of “beginnerscpp.com”, decided that he wanted to build some learn-how-to-code videos for C++.
Given that I already have a coding background, I'm not sure how great these are for newbies. But for people with some programming experience already, who aren't going to freak out at the mention of if/else statements, these videos are great.
They feature simple examples, and watching him type the code in (and correct his own mistakes) really helps demonstrate where the errors can be in C++, and how to catch them. He originally made the videos for University of Reddit, but they are freely available on YouTube and at his website. Check them out!
I read an interesting opinion piece the other day entitled, “Why America's National Parks Are So White.” The article discussed statistics about the park visitors, employees, and a recent incident that the author had witnessed while holding a research event at the park. Basically, white-ish colleagues were allowed to enter the park and attend the symposium without scrutiny, yet black colleagues were detained, questioned, and humiliated by park staff who were seemingly unable to believe that black people could be professional scholars.
There are many, many national efforts to include American minorities in the scientific process, whether it be elementary, secondary, or graduate training; attendance at conferences; or “diversity hires” at the professorial level. Yet, there is still a huge gap, even larger than the one that exists for women in STEM careers. It's easy to see, with these stats, how park officials would be distrustful of the idea that a non-white person could be an academic. Though the National Park Service has opened an investigation into that incident, the author goes on to highlight just how “white” our parks are.
It's an excellent article, and the best part: it was penned by a faculty member at my current home institution, the University of California, Merced. Check it out!