S2GLC - Phase 1

S2GLC (Sentinel-2 Global Land Cover) was a project funded by the European Space Agency and carried out by a consortium led by CBK PAN (Space Research Centre of the Polish Academy of Sciences).

The goal of the project was to develop a highly automated classification methodology, ready to be used for global land cover mapping based on Sentinel-2 imagery. In the developed methodology, each Sentinel-2 tile is classified separately using a set of multi-temporal images. A pixel-based approach was chosen to preserve the 10 m resolution of Sentinel-2 imagery. Each image in the multi-temporal set is classified separately with a random forest (RF) classifier, using training samples selected randomly from existing low-resolution databases. The final result is calculated with a dedicated aggregation procedure that analyses, for each individual pixel, all multi-temporal results together with the probability scores returned by the RF classifier. The last step of the classification workflow is post-processing, applied mostly to pixels classified with low probability.
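The per-pixel aggregation step can be illustrated with a minimal sketch. The actual S2GLC aggregation rule is not described here, so the example assumes a simple scheme: the RF class probabilities from all acquisition dates are summed, the class with the highest total wins, and the pixel is flagged for post-processing when its mean probability falls below a threshold. The function name, the class labels, the probability values and the 0.5 threshold are all hypothetical.

```python
from collections import defaultdict


def aggregate_pixel(per_date_probs, low_conf_threshold=0.5):
    """Combine per-date RF results for a single pixel.

    per_date_probs: list of dicts, one per acquisition date, mapping
    class label -> probability returned by the classifier.
    Returns (winning label, its mean probability, low-confidence flag).
    NOTE: summing probabilities is an assumed rule, not the S2GLC one.
    """
    if not per_date_probs:
        raise ValueError("at least one classification result is required")

    # Accumulate the probability mass assigned to each class over all dates.
    totals = defaultdict(float)
    for probs in per_date_probs:
        for label, p in probs.items():
            totals[label] += p

    # The class with the largest accumulated probability wins.
    best = max(totals, key=totals.get)
    mean_prob = totals[best] / len(per_date_probs)

    # Pixels with a weak winner are candidates for post-processing.
    return best, mean_prob, mean_prob < low_conf_threshold


# Hypothetical results for one pixel observed on three dates.
pixel_results = [
    {"tree cover": 0.6, "grasses": 0.4},
    {"tree cover": 0.3, "grasses": 0.7},
    {"tree cover": 0.8, "grasses": 0.2},
]
label, mean_prob, needs_postprocessing = aggregate_pixel(pixel_results)
print(label, round(mean_prob, 3), needs_postprocessing)
```

In this sketch a single date that disagrees with the others (the second one) does not change the final label, which is the practical benefit of combining multi-temporal results with their probability scores instead of relying on one image.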

Test sites

In the first phase of the project five test sites (Prototype Sites) were analysed. They are located on four continents and include the entire territory of two European countries, Germany (360 000 km²) and Italy (approx. 300 000 km²), as well as an area of 200 000 km² in China (Asia), an area of 200 000 km² in Colombia (South America) and an area of 220 000 km² in Namibia (Africa).

The pictures below show each Prototype Site, with the red boundary line marking the extent of the analysed area.














The classification workflow developed within the S2GLC project has been used operationally to produce land cover maps of the five Prototype Sites in Germany, Italy, China, Colombia and Namibia. The resulting maps are presented below.








The presented land cover maps have been prepared according to the legend shown in the table below.


The best classification results were achieved for the German (85.2% overall accuracy (OA)) and Italian (72.5%) Prototype Sites. These results are comparable with the quality of the German and Italian CORINE LC 2012 database validated with LUCAS points (82.8% and 76.0% OA, respectively), even though the S2GLC classification approach is completely different from the one applied for the CORINE LC database. The accuracy for the Chinese Prototype Site (72% OA) is also relatively high, and the resulting map shows a high level of detail.

The Prototype Sites in Namibia and Colombia reached lower OA values, 56.1% and 52.5%, respectively. High disagreement between the existing GLC databases in these areas lowers the quality of the reference data (compared with Europe) and hampers the collection of fully reliable training samples. This is considered the main reason for the lower classification accuracy. Additional problems at the Colombian site are the difficulty of obtaining cloud-free images, large variation in elevation within the test area (and therefore different lighting conditions) and the absence of seasonal changes in vegetation cover. Because the appearance of vegetation differs only minimally within the year, multi-temporal data are less useful there.

The lower results for the Namibian site could be explained by the severe drought that affected this part of Africa in 2014 - 2016. It contributed to an abnormal water regime, i.e. drying up of water bodies, and to changes in vegetation cover. In addition, in many areas it is difficult to draw clear boundaries between specific classes, e.g. un-consolidated vs. grasses, grasses vs. bush and shrubs, and bush and shrubs vs. tree cover. These areas are transitional zones between the classes and represent a mixture of their components. Moreover, non-irrigated agricultural areas closely resemble other classes such as shrubs or grassland.
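For readers unfamiliar with the metric, overall accuracy is the share of validation samples whose mapped class matches the reference class, i.e. the sum of the diagonal of the confusion matrix divided by the total number of samples. The sketch below shows the arithmetic; the matrix values and the function name are invented for illustration and are not S2GLC validation data.

```python
def overall_accuracy(confusion):
    """Overall accuracy from a square confusion matrix.

    confusion[i][j] = number of validation samples with reference
    class i that were mapped as class j.
    """
    total = sum(sum(row) for row in confusion)
    if total == 0:
        raise ValueError("confusion matrix contains no samples")

    # Correctly classified samples lie on the main diagonal.
    correct = sum(confusion[i][i] for i in range(len(confusion)))
    return correct / total


# Hypothetical 3-class confusion matrix (rows: reference, columns: map).
example_matrix = [
    [50, 5, 5],
    [4, 30, 6],
    [3, 7, 40],
]
oa = overall_accuracy(example_matrix)
print(f"OA = {oa:.1%}")  # 120 correct out of 150 samples -> OA = 80.0%
```

OA summarises the whole map in one number, so it can hide confusion between specific class pairs; the off-diagonal cells of the matrix are where problems such as the grasses vs. bush and shrubs transition would show up.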