Thursday, 7 January 2016

The Bayesian classifier function in the dissertation - Part 4: Constructing and evaluating the classifier

I made global histograms of the feature vector (NDR, DTM, slope, Asp, LgtCrv, CrsCrv) for all of the HRSC DTM tiles, and also histograms for the zonal statistics (I did this for 'head', 'extent' and 'context' areas).

Histograms for zonal statistics for 'extent' shapefile polygons:

Histograms for all of the HRSC tile area:

This is the area of study: I only used the HRSC tiles that I had already identified as having a Souness object. It would perhaps have been better to get all the HRSC tiles between 25° and 60° in both hemispheres, for a more 'global' background set.

I made a 4-component Gaussian fit to each of these, giving analytic functions of the feature variables, and then took the zonal/global ratios as functions of the feature vector.
The abundance of Souness GLFs, when expressed as a ratio, looks less bimodal than the absolute number, since the bimodality is also present in the underlying elevation distribution (though with the caveat that the 'background' excludes areas where there are no Souness objects).
Slopes of 10-20° are favoured.
Both longitudinal and cross-sectional curvature are favoured to be negative, i.e. concave slopes within hollows.
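The ratio evaluation can be sketched as follows; the component weights, means and widths below are made-up placeholders for a slope band in degrees, not the dissertation's fitted values, and the function names are mine:

```python
import math

def gaussian_mixture(x, components):
    """Evaluate a weighted sum of Gaussian components at x.
    components: list of (weight, mean, sigma) tuples."""
    return sum(w * math.exp(-0.5 * ((x - mu) / s) ** 2) / (s * math.sqrt(2 * math.pi))
               for w, mu, s in components)

def zonal_global_ratio(x, zonal_fit, global_fit):
    """Ratio of the zonal (GLF-area) to global fitted densities at x."""
    g = gaussian_mixture(x, global_fit)
    return gaussian_mixture(x, zonal_fit) / g if g > 0 else float('nan')

# Placeholder 4-component fits for the slope band (degrees)
global_fit = [(0.4, 2.0, 1.5), (0.3, 8.0, 4.0), (0.2, 18.0, 6.0), (0.1, 30.0, 8.0)]
zonal_fit  = [(0.1, 5.0, 2.0), (0.5, 14.0, 4.0), (0.3, 20.0, 5.0), (0.1, 28.0, 6.0)]

ratio_15 = zonal_global_ratio(15.0, zonal_fit, global_fit)
```

With fits like these, a slope of 15° gives a ratio above 1, i.e. slopes of that steepness are over-represented in the GLF areas relative to the background.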

I defined the classifier function K as the product of all of these ratios (excluding NDR, since it was locally normalised in the HRSC data), and used ln(K) in my interpretation. In the dissertation I made a basic z-test showing the classifier was better than random at predicting the glaciers' locations, but looking at a number of fields it was clear the classifier was not specific to the particular type of object Souness catalogued, though it could perhaps be useful as a search tool.
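Since K is a product of per-band ratios, ln(K) is just a sum of logs. A minimal sketch (the band names follow the feature vector; the function name and example values are mine):

```python
import math

def log_classifier(ratios):
    """ln(K) for one segment: K is the product of per-band zonal/global
    ratios, so ln(K) is the sum of their logs. `ratios` maps band name
    to ratio value; NDR is skipped because it is locally normalised."""
    return sum(math.log(r) for band, r in ratios.items()
               if band != 'NDR' and r > 0)

# Hypothetical ratios for one terrain segment
segment = {'NDR': 1.7, 'DTM': 0.8, 'slope': 2.5, 'Asp': 1.1,
           'LgtCrv': 1.6, 'CrsCrv': 1.4}
lnK = log_classifier(segment)
```

A segment with lnK well above 0 is more 'GLF-like' than the background in this scheme.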

Wednesday, 6 January 2016

Souness GLF elevation profiles

I have decided, despite the generally poor quality of the code, to put many of the Python scripts I used to process my dissertation data online. They are not likely to be very useful to anyone else, because they are built specifically around the dataset I amassed from HRSC Mars Express DTMs and written in a very ad-hoc way.

The scripts are at

One thing I have developed a little beyond my dissertation is the profile along the mid-line of the glacier. I have extended the script that generated the shapefiles of the glacier extents to place circles following the mid-line all the way from head to terminus.
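The circle centres amount to evenly spaced points along the mid-line polyline. A sketch of the interpolation (my reconstruction, not the actual script):

```python
import math

def points_along_line(vertices, n):
    """Return n evenly spaced points along a polyline of (x, y) vertices,
    from the first vertex (head) to the last (terminus)."""
    seg_lens = [math.hypot(x2 - x1, y2 - y1)
                for (x1, y1), (x2, y2) in zip(vertices, vertices[1:])]
    total = sum(seg_lens)
    points = []
    for i in range(n):
        remaining = total * i / (n - 1)  # distance along line for point i
        for (x1, y1), (x2, y2), length in zip(vertices, vertices[1:], seg_lens):
            if remaining <= length:
                t = remaining / length if length else 0.0
                points.append((x1 + t * (x2 - x1), y1 + t * (y2 - y1)))
                break
            remaining -= length
        else:
            points.append(vertices[-1])  # guard against floating-point overshoot
    return points

# Toy mid-line: 6 km east then 8 km north, sampled at 5 centres
centres = points_along_line([(0.0, 0.0), (6.0, 0.0), (6.0, 8.0)], 5)
```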

Souness 1273, 944, 945, 948, 946 and 947 in the north of Greg crater

I used gdallocationinfo, called from the command line via subprocess.check_output in Python, to retrieve the elevation from the HRSC data for all of the Souness GLFs that have HRSC data available.
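The call can be sketched like this; -valonly and -geoloc are real gdallocationinfo flags, but the wrapper functions are illustrative rather than my original script:

```python
import subprocess

def parse_valonly(raw):
    """Parse the bytes returned by gdallocationinfo -valonly:
    a single pixel value, or empty output if the point falls off the raster."""
    text = raw.decode().strip()
    return float(text) if text else None

def elevation_at(dtm_path, x, y):
    """Query a DTM for the elevation at map coordinates (x, y).
    -geoloc interprets the coordinates in the raster's georeferenced
    system; -valonly returns just the pixel value."""
    out = subprocess.check_output(
        ['gdallocationinfo', '-valonly', '-geoloc', dtm_path, str(x), str(y)])
    return parse_valonly(out)
```

Spawning a process per point is slow for thousands of samples; reading the band into memory once per tile would be faster, but the subprocess approach needed no GDAL Python bindings.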

All of the Souness GLFs plotted together on one crowded-looking PNG file

The same figure with distance along the mid-line and elevation difference each scaled to a dimensionless number between 0 and 1
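The rescaling to dimensionless numbers is straightforward; a sketch (the function name is mine):

```python
def normalise_profile(distances, elevations):
    """Rescale distance along the mid-line and elevation to [0, 1]
    so profiles of differently sized GLFs can be superposed."""
    d0, d1 = distances[0], distances[-1]
    lo, hi = min(elevations), max(elevations)
    d_scaled = [(d - d0) / (d1 - d0) for d in distances]
    z_scaled = [(z - lo) / (hi - lo) for z in elevations]
    return d_scaled, z_scaled

# Toy profile: 1 km long, descending 300 m
d, z = normalise_profile([0.0, 500.0, 1000.0], [-2000.0, -2150.0, -2300.0])
```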

Monday, 4 January 2016

The Bayesian classifier function in the dissertation - Part 3: Segmenting the layerstacked feature vector

In the previous post, I made an aside to mention what I did to create shapefiles of the Souness glacier-like-form extents.

Now I will mention the actual segmentation of the layerstacked feature vector.

A figure from my dissertation showing the process of segmenting the layerstacked feature vector with RSGISLib: gdal_polygonize then converts the result to a shapefile, and the Raster Attribute Table is extracted as a .csv file, which can be processed further before rejoining it with the shapefile in QGIS.
The RSGISLib function that performs the segmentation is described on the RSGISLib website at

I wrote a wrapper script around it to perform the segmentation, setting the smallest segment size at 0.2 km². I calculate the number of pixels this corresponds to from the DTM resolution, which can be extracted via the gdalinfo command-line tool.
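The area-to-pixels conversion is a one-liner; a sketch assuming square pixels with the resolution given in metres (the function name is mine):

```python
def min_clump_pixels(pixel_size_m, min_area_km2=0.2):
    """Convert the minimum segment area to a pixel count for the
    segmentation, given the DTM pixel size in metres (as reported
    by gdalinfo)."""
    pixel_area_km2 = (pixel_size_m / 1000.0) ** 2
    return max(1, round(min_area_km2 / pixel_area_km2))
```

For a 50 m HRSC DTM this gives 80 pixels; for a 75 m tile, 36.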

One detail is that I added 9999 to all fields before running the segmentation, since it didn't work properly with negative values, which are encountered in Martian DTM elevations and also in the curvature fields.
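As a trivial sketch of the offset (in practice this is applied to the raster bands, not Python lists):

```python
def offset_bands(feature_rows, offset=9999.0):
    """Add a constant offset to every band value so the segmentation
    never sees negatives (Martian elevations and curvatures can be < 0)."""
    return [[v + offset for v in row] for row in feature_rows]

# Toy rows of (elevation, curvature) values, including negatives
shifted = offset_bands([[-2500.0, -0.3], [-1800.0, 0.1]])
```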

There is a subtlety not mentioned in the above flowchart: I add the statistics from the original image file (i.e. without the +9999) to the RAT of the clumps file produced by the segmentation.
Clumps file in TuiView for tile h2224 covering Greg crater, after the addition of the clump statistics to the RAT.

One version of RSGISLib didn't include the objectID in the raster attribute table, so I added it back in manually with a Python script.

The gdal_polygonize command takes the segment clumps file and converts it into shapefile polygons.

The Bayesian classifier function in the dissertation results - Part 2: getting the Souness catalogue into shapefile form

In order to focus on the areas of the Souness et al. 2012 glacier-like forms, I first needed to get the actual areas of interest as shapefiles.

I wrote some really not very good Python code to get the information from the Souness catalogue (a .csv exported from a spreadsheet which is available as a supplement from the paper itself) and then convert the coordinates of the GLF head, centre, terminus and left/right mid-channel from lat/long to an equicylindrical coordinate system with a standard parallel of 40°.
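For an equicylindrical (equirectangular) projection with standard parallel φ₁, x = R·λ·cos φ₁ and y = R·φ. A sketch, assuming the IAU equatorial radius for Mars (the dissertation code may use a different radius, and the function name is mine):

```python
import math

MARS_RADIUS_M = 3396190.0  # IAU equatorial radius; an assumption here

def equicylindrical(lon_deg, lat_deg, std_parallel_deg=40.0):
    """Project lon/lat in degrees to equicylindrical x, y in metres,
    with true scale along the standard parallel (40° by default)."""
    k = math.cos(math.radians(std_parallel_deg))
    x = MARS_RADIUS_M * math.radians(lon_deg) * k
    y = MARS_RADIUS_M * math.radians(lat_deg)
    return x, y
```

The cos 40° factor keeps east-west distances honest at the latitudes where most Souness GLFs sit, at the cost of distortion nearer the equator and poles.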

The Nilosyrtis Mensae area is fairly densely populated with Souness GLFs.

I have uploaded the code itself to my Bitbucket repository:

A flowchart from my dissertation showing the usage of shapefiles of the Souness objects to create zonal statistics.
Later on in the process, these are intersected with the segmented DEM to find out zonal statistics on the subset of DEM segments that intersect with the Souness GLFs.

While digging around in the scripts I found that I had in fact lost the script that actually did the zonal statistics for each DTM tile. I will have to rewrite it from a related one in the backups.

I used one script to loop through each of the HRSC fields and do zonal statistics with the GLF shapefiles. This produced a .csv file in the directory of each tile, and I used a further script to gather them all together into a single .csv file using the manually specified HRSC tile for each field.
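The gathering step can be sketched with the csv module (the tile name, column names and in-memory files here are invented for illustration):

```python
import csv
import io

def gather_rows(tile_csvs):
    """Merge per-tile zonal-statistics CSVs into one table, tagging each
    row with its HRSC tile. tile_csvs maps tile name -> open text file."""
    merged = []
    header_done = False
    for tile, fh in tile_csvs.items():
        reader = csv.reader(fh)
        head = next(reader)
        if not header_done:
            merged.append(['tile'] + head)  # write the header once
            header_done = True
        for row in reader:
            merged.append([tile] + row)
    return merged

# Hypothetical in-memory stand-ins for the per-tile .csv files
tiles = {'h2224_0000': io.StringIO("segID,slope_mean\n1,14.2\n2,8.9\n")}
rows = gather_rows(tiles)
```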

Meanwhile, I recently adapted the script that made the Souness regions of interest to make a series of intermediate areas along the mid-line of the glacier:
A series of GLFs along the northern wall of Greg crater, with intermediate points along them, that could be useful for making approximate profiles, provided I can manage the rather opaque Python code involved in making the zonal statistics.

The Bayesian classifier function in the dissertation results - Part 1: acquiring and preprocessing data

I have mentioned before a little about having a Bayesian classifier for the Martian terrain segments, which measures how similar a segment is to the terrain hosting the Souness glacier-like forms.

This is detailed in section 3.4.2, on pages 36-43 of the dissertation (43-52 of the tablet-optimised version). I will try to recap it here.

The first step was to gather the data:
Beginning with the tabulated Souness et al. 2012 objects, I identified and downloaded the relevant High Resolution Stereo Camera (HRSC) DTM tiles, reprojected them to a common equicylindrical coordinate system with a standard parallel of 40°, and then created derived topographic variables.
I originally did this manually, inputting Souness GLF coordinates into the web interface of the PDS Geosciences Node Mars Orbital Data Explorer (ODE).
I retrieved both the nadir image files and the areoid elevation.

It is also possible to download shapefiles of coverage footprints of various Mars satellite data and identify coverage of Souness GLF footprints in a more automated fashion.

The actual 6-band feature vector was:
  1. Nadir image
  2. DTM elevation
  3. Slope (degrees)
  4. Aspect (in absolute degrees from north)
  5. Cross-sectional curvature
  6. Longitudinal curvature
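Folding aspect to an absolute deviation from north can be done like this (my reconstruction, not the dissertation script):

```python
def absolute_aspect(aspect_deg):
    """Fold compass aspect (0-360° clockwise from north) to an absolute
    deviation from north in 0-180°, so east- and west-facing slopes with
    equal deviation map to the same value."""
    a = aspect_deg % 360.0
    return min(a, 360.0 - a)
```

This throws away the east/west distinction, which is why I later reverted to the original aspect when reprocessing.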

The software I used for this was Jo Wood's LandSerf. In his PhD thesis there is a good explanation of the various types of topographic curvature. Unfortunately this is no longer easily accessible except via the Internet Archive; however, the various types of curvature are also explained on this website, part of an online textbook on Geospatial Analysis.

After processing in LandSerf, I used GDAL at the command line to generate a layerstacked .kea file for each tile.

The longitudinal and cross-sectional curvatures are defined in the slope's own plane, i.e. the plane containing the slope vector with normal pointing in the aspect direction.

The longitudinal curvature is a measure of convexity/concavity of the slope, and cross-sectional curvature the rate of change of the slope vector, when considered in the slope's own reference frame (as opposed to plan curvature which considers the same from vertically above).

The 10-band layerstacks, which I didn't process further due to the processing time, also included mean curvature, plan curvature, profile curvature and the LandSerf feature classification.

Later on, when reprocessing the results after the dissertation, I used the original aspect rather than the absolute aspect from N.

TuiView screenshot of the h0037 HRSC DTM tile. I have coded red:slope, green:nadir image and blue:elevation.