Tuesday, June 20, 2017

GIS Programming Module 5: Geoprocessing

This week we looked at geoprocessing using ArcGIS tools, as well as ArcGIS's ModelBuilder. I've encountered ModelBuilder before, but learning that you can use a model as the base for a standalone Python script (and then, a Python script as the base for a tool in ArcGIS) is a game-changer. I've already thought of a real-world application to make my life easier outside of class.

Anyway, for this week's lab assignment, we were given several datasets and asked to make a model and then a Python script. I liked doing a model first instead of diving right into scripting, because the models are set up like flow charts, and it's a good way to visualize all the steps you need to go through to get the desired result. From there, it's easy to finalize the model, then export it to a script and finalize that so it can run independently.

For this assignment, it went something like this: all map layers as variables > clip each variable to "basin" layer shape > from clip output for "soils" layer, select all that are not prime farmland > use selection result to erase areas of basin layer
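That chain of steps maps fairly directly onto a script. Here's a rough sketch of how it might look, not the actual lab script: the function name, the output file names, and the where clause are all placeholders I made up, and `gp` stands in for the arcpy module so the logic can be read without ArcGIS open. The three tools it calls (Clip, Select, and Erase) are standard ArcGIS analysis tools.

```python
import os


def prime_farmland_basin(gp, layers, basin, soils_key, where_not_prime, out_dir):
    """Clip every layer to the basin, then erase non-prime soils from the basin.

    gp is expected to be the arcpy module (or anything exposing the same
    Clip_analysis / Select_analysis / Erase_analysis tools).
    """
    # Step 1: clip each input layer to the basin shape.
    clipped = {}
    for name, path in layers.items():
        clipped[name] = os.path.join(out_dir, name + "_clip.shp")
        gp.Clip_analysis(path, basin, clipped[name])

    # Step 2: from the clipped soils, select everything that is NOT prime farmland.
    not_prime = os.path.join(out_dir, "soils_not_prime.shp")
    gp.Select_analysis(clipped[soils_key], not_prime, where_not_prime)

    # Step 3: erase those areas from the basin layer.
    result = os.path.join(out_dir, "basin_prime_farmland.shp")
    gp.Erase_analysis(basin, not_prime, result)
    return result
```

With ArcGIS installed, you would pass `arcpy` itself as `gp`, along with a where clause written for whatever field in the soils table records the farmland class.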

So the final output, of both the model and the Python script, is a shapefile of the basin layer minus parts that aren't prime farmland, like this:

Wednesday, June 14, 2017

GIS Programming: Peer Review 1

Article Citation:
Cerrillo-Cuenca, E. (2017). An approach to the automatic surveying of prehistoric barrows through LiDAR. Quaternary International, 435, 135-145. Retrieved June 14, 2017, from http://www.sciencedirect.com/science/article/pii/S1040618216002159

The author of this paper proposes a methodology for automated identification of prehistoric barrows (burial mounds) in western Europe using Python code to systematically analyze publicly available LiDAR data. Essentially, he has invented a method of creating a predictive model to help identify unknown barrow sites. 
Cerrillo-Cuenca argues for the significance of his methodology on the basis that understanding the spatial patterning of barrows, particularly their relationship to settlement sites, is essential for understanding other aspects of prehistoric societies. Remote sensing and systematic survey are an efficient way to gain a better understanding of large areas, especially understudied ones, but, according to Cerrillo-Cuenca, other methods used to assess LiDAR data rely on visual interpretation, which has many limitations.
In the method outlined in this paper, the imagery is first processed to create a bare-earth topography model, which is then analyzed by a series of algorithms to identify locations that match the characteristic topography and shape of barrows, and return coordinates of those locations so that they can be confirmed, either on the ground or from aerial imagery.
I appreciated that Cerrillo-Cuenca tested his method in a controlled way by comparing predicted locations to previously recorded sites, and it does appear that the process is fairly successful in identifying well-preserved barrows. It does not do well with poorly-preserved barrows, which is to be expected, but it also seems to consistently miss barrows that are smaller than average. The author acknowledges this shortcoming and suggests the use of higher-resolution imagery, with the possible trade-off of getting more false identifications, of which there are already quite a few based on this article. I would like to read more about how both of these issues might be mitigated, although, to be fair, it sounds as if the methodology is still being refined at this time. I’m also curious as to how the author would make the case that this is a significant improvement over other LiDAR survey methods (it’s likely faster, although I don’t think he explicitly says so), given that it also has clear limitations and still requires review and confirmation of predicted locations.
On a more immediately practical level, the paper could also benefit from a clearer (and dare I say simplified?) explanation of the process used to identify potential sites, which was a bit over my head as someone with only a rudimentary understanding of this kind of analysis.  
Overall, though, this is an innovative approach that reflects the trend in archaeology towards greater use of technologies like remote sensing and digital modeling, and any method—particularly one that can be automated—for identifying new sites has big implications not only for research but for risk assessment and preservation. No predictive model can be perfect, and this one does seem to need more work, but it sounds very promising.

GIS Programming Module 4: Debugging and Error Handling

This week's programming topic was debugging, or finding and fixing errors in code, and error handling, which means writing code that assumes certain kinds of errors may occur and "traps" them, allowing the code to complete anyway.
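As a bare-bones illustration of "trapping" an error, here's a generic try-except in plain Python. The function names and the deliberately broken step are invented for the example; the point is just that the error message gets reported and the rest of the script still runs.

```python
def risky_step():
    # Stand-in for a step that fails: looking up a key that doesn't exist.
    settings = {"visible": True}
    return settings["labels"]


def run_script():
    messages = []
    try:
        risky_step()
        messages.append("Part 1 finished")
    except Exception as e:
        # Generic handler: catch any ordinary error and report it instead of crashing.
        messages.append("Error: " + str(e))
    # Execution continues here whether or not part 1 failed.
    messages.append("Part 2 finished")
    return messages


for line in run_script():
    print(line)
```

A generic `except Exception` is the simplest option; catching specific error types separately gives more control over how each kind of failure gets handled.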

For the assignment, we were given three pre-written Python scripts containing errors, two of which we had to fix so that the code would run correctly, and the last of which was an exercise in error handling. All of the scripts this week use ArcGIS geoprocessing functions, so this was also my first official exposure to how Python and ArcGIS interact.

The corrected first script takes a shapefile and prints a list of the field names in the attribute table. It works something like this: import ArcGIS modules > assign workspace (map document) > assign feature class > make list of fields > for each item in fields list, print field name
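For reference, here's roughly what that flow could look like, as a sketch rather than the corrected lab script itself. `arcpy.ListFields` is the real arcpy function for this; the wrapper function and its `list_fields` argument are my own additions so the loop can be tried with a stand-in when ArcGIS isn't available. It also takes a full path to the shapefile instead of setting a workspace first.

```python
def print_field_names(feature_class, list_fields=None):
    """Print and return the field names of a feature class."""
    if list_fields is None:
        # Only import arcpy when no stand-in is supplied, so this file
        # can still be loaded on a machine without ArcGIS.
        import arcpy
        list_fields = arcpy.ListFields
    names = []
    for field in list_fields(feature_class):
        # Each item is a Field object; its name attribute holds the field name.
        print(field.name)
        names.append(field.name)
    return names
```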

Here are the results:

The second script is a bit more complex and was harder to fix, with more errors and more complicated ones. (I also ran into some added complications on this script and the third one because I'm working from my local computer rather than the UWF GIS remote desktop and thus have to change the file paths in the scripts, which introduces opportunities for new errors!) I'm not sure about everything this script is doing, because we haven't learned much about the geoprocessing functions yet, but in the end, it prints a list of the layers within a specified data frame:

I was nervous about the error handling part of the assignment, but it turned out to be very easy--a generic try-except statement rather than trying to deal with specific kinds of errors separately. The third script has two parts, the first of which is supposed to turn on the visibility and labels for a layer within an ArcMap document, but is written so that it raises an exception. I enclosed most of the code for this section of the script in the try-except statement, which means that the script now prints "Error:" and the error message for this first part, before continuing to run the second part, which prints the name, spatial reference, and scale for the specified data frame:

Another issue I had this week is that I was using IDLE as my Python editor instead of the recommended PythonWin, and I have two versions of it on my computer, a standalone copy and the one that installs along with ArcGIS. The standalone version doesn't recognize arcpy, the module containing the GIS-specific functions, but it is the one my scripts open in by default, so I kept getting error messages about arcpy that weren't part of the assignment. So frustrating! But at least I figured it out, and now know that if I want to use IDLE, I have to open the right version. Or maybe I should just stick to PythonWin.

Wednesday, June 7, 2017

GIS Programming Module 3: Python Fundamentals, Again

We're still going over the basics of using Python, and this week's topic was conditional statements and loops, which are some of the building blocks of more sophisticated code. For the lab assignment, we were asked to complete a partial script. The finished script prints out the results of a dice-throwing game. Then it generates a list of random numbers, removes any instances of a specific number from that list, and prints the final result. Here's the output:

The trickiest things for me this week were remembering the names of the functions and methods I want to use (I have a problem with knowing what Python can do but not how to do it) and remembering basic details like converting integers to strings in a print statement and adding spaces between things that are going to be printed. I also briefly forgot that the remove method only takes out the first instance, not all of them, and had to look that up to figure out the right code to finish the script. But overall, so far so good.
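Here's a small sketch of that remove gotcha, along with the string conversion and spacing details. The list length, the number range, and the "unlucky" number are arbitrary choices for the example, not the values from the lab.

```python
import random


def remove_all(values, target):
    """Remove every occurrence of target from the list, in place."""
    # list.remove() deletes only the first match, so keep calling it
    # until the target no longer appears.
    while target in values:
        values.remove(target)
    return values


numbers = [random.randint(0, 10) for _ in range(20)]
unlucky = 5

# Integers have to be converted with str() before being joined to other
# strings, and the spaces have to be typed into the string pieces.
print("Before: " + str(numbers))
print("Removed " + str(numbers.count(unlucky)) + " instance(s) of " + str(unlucky))
print("After: " + str(remove_all(numbers, unlucky)))
```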

Saturday, May 27, 2017

GIS Programming Module 2: Python Fundamentals

This week in GIS Programming was a crash course (for me, a welcome review) of the basics of Python syntax and how to manipulate different data types. The lab assignment was to write a short script that began with a string containing my full name, extracted the last name, calculated its length, and ultimately returned the last name and a number three times the number of characters it contains. This output is pictured below:

The steps in the coding process went something like this: assign full name to a variable --> split string into list --> print the last item in the list --> calculate the length of the last item in the list --> multiply it by 3 --> print that result.
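In code, those steps come out to just a few lines. This is a sketch with a placeholder name rather than my own, and the variable names are just what I'd pick:

```python
full_name = "Ada Maria Lovelace"   # placeholder name, not the one from the lab
name_parts = full_name.split()     # split the string into a list of words
last_name = name_parts[-1]         # the last item in the list
print(last_name)

name_length = len(last_name)       # number of characters in the last name
print(name_length * 3)
```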

Tuesday, May 23, 2017

GIS Programming Module 1: Introduction to Python

Not much to report from the first week of GIS Programming. This week was a very basic overview of the Python scripting language and how the course is going to work. For the first assignment, we were given a prepared Python script to automatically set up a folder tree to be used for file storage throughout the course. All we had to do was open it in PythonWin and run it, but it was pretty neat:

12 folders with more folders inside them, and all I had to do was click a button.
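I haven't picked apart the script we were given, but a folder tree like that can be built with a couple of loops. In this sketch the folder names and the temporary root directory are made up; only the count of 12 top-level folders comes from the assignment.

```python
import os
import tempfile


def make_folder_tree(root, modules, subfolders):
    """Create root/module/subfolder for every combination; return the paths."""
    created = []
    for module in modules:
        for sub in subfolders:
            path = os.path.join(root, module, sub)
            if not os.path.isdir(path):
                os.makedirs(path)  # also creates the parent module folder
            created.append(path)
    return created


root = tempfile.mkdtemp()  # throwaway location for the example
modules = ["Module" + str(n) for n in range(1, 13)]
created = make_folder_tree(root, modules, ["Data", "Scripts", "Results"])
print(str(len(os.listdir(root))) + " top-level folders created in " + root)
```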

I took some programming classes just for fun a few years ago, so I'm pretty excited to learn more about using Python this semester, and especially how to apply those skills to GIS. 

Wednesday, May 3, 2017

Cartographic Skills Final Project

For the very last Cartographic Skills assignment, we were tasked with making a publication-quality map using two thematic datasets and all of our skills from the past semester. I opted to find my own data and make a map with a political theme. Following in the footsteps of other mapmakers who've created election results maps that show politically divided states or counties as shades of purple rather than just the red and blue of the winning party, I mapped the county-level results of the 2012 presidential election using a red-purple-blue color scheme. Then I overlaid cities and towns symbolized according to their population size as of the 2010 census. The point of this was twofold: illustrating the variation that underlies the question of red or blue (in some counties the win was by a large margin; in others it was extremely close), and comparing voting patterns with population distribution in an attempt to investigate that so-called "urban-rural divide" plaguing modern politics.

I focused on Washington, DC and the adjacent states of Maryland and Virginia, partly because it's a region I know well and partly because DC is the capital, and is a major city in a region that has both other major cities and really rural areas, so what better place to start looking at how election results map out? If you're wondering why I chose the 2012 election rather than 2016, it's because 1) the data was easily available, already compiled and published by someone else, 2) it happened closer to the most recent census, so the population data seemed more relevant, and 3) 2016 had a lot of surprises and I wanted to start with something that might be a little more typical.

As you can see below, there's a lot of red in those "blue" states, especially in the west, and there's also a lot of purple, which indicates counties where there are a lot of Republican AND a lot of Democratic voters. The actual value being mapped is the difference (in percentage points) between the major candidates' shares of the total vote. You can also see that there's something going on with regard to urban areas trending blue and sparsely populated ones trending red, but there are also some areas that don't appear to fit that trend, which is interesting.
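To make the mapped value concrete, here's the arithmetic as a tiny sketch. The vote counts are made up, and the sign convention (positive means the Democratic candidate led) is just one I picked for the example.

```python
def margin_points(dem_votes, rep_votes, total_votes):
    """Democratic share minus Republican share, in percentage points."""
    dem_share = 100.0 * dem_votes / total_votes
    rep_share = 100.0 * rep_votes / total_votes
    return dem_share - rep_share


# Made-up county: 5,200 D votes and 4,500 R votes out of 10,000 cast.
# That's 52% vs. 45%, so the margin is 7 points.
print(round(margin_points(5200, 4500, 10000), 1))
```

Dividing by the total vote rather than the two-party vote means third-party ballots pull both shares down a little.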

Both the main map and the inset maps, which provide close-ups of the most crowded urbanized areas, were created in ArcMap, and the fine-tuning and final layout were done in Adobe Illustrator. I used a choropleth map for the election results, because color is the obvious way of illustrating election results and a choropleth map made a good base to overlay the population data on. The data was manually classified to help simplify a kind-of-confusing legend by using nice round numbers to group the data, although the classes are similar to what ArcMap came up with using natural breaks. Eight classes is kind of a lot, but I wanted to show as much variation as possible rather than just a few classes with huge data ranges (there's a big difference between a 5-point margin and a 30-point margin, after all), and eight classes seemed about the upper limit of what would keep those classes visually distinguishable. For city and town populations, my stand-in for population distribution, I used proportional symbols, which worked better than classed graduated symbols given the huge range in population sizes. I opted to only display cities and towns of at least 5,000 and only label a selection of those, lest the map get too complicated, and I also made the city symbols semi-transparent to help minimize the degree to which they obscure not only each other but the underlying choropleth layer. Finally, I made the city symbols a contrasting color, and made sure everything else in the layout was grey or black to ensure that both bright-colored datasets really stand out.
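For anyone curious what manual classification with round-number breaks amounts to, here's a sketch. The break values and labels below are hypothetical (they are not the ones on my map), and positive margins are assumed to mean a Democratic lead; seven breaks give eight classes.

```python
import bisect

# Hypothetical class breaks, in percentage points of margin.
BREAKS = [-30, -15, -5, 0, 5, 15, 30]
LABELS = [
    "R by more than 30", "R by 15 to 30", "R by 5 to 15", "R by less than 5",
    "D by less than 5", "D by 5 to 15", "D by 15 to 30", "D by 30 or more",
]


def classify(margin):
    """Return the label of the class that a margin falls into."""
    # bisect_right counts how many breaks are <= margin, which is
    # exactly the index of the matching class.
    return LABELS[bisect.bisect_right(BREAKS, margin)]


print(classify(7.0))
print(classify(-40))
```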

This project was a lot of work, but I'm pleased with the result. I've really enjoyed this class and have learned a lot about good map design.