
Sample Data Catalogue grows to 30,000 datasets

We have published an updated version of the Sample Data Catalogue. Around 70 municipalities and 5,000 open datasets have been added, bringing the catalogue to approximately 30,000 datasets.

Contact

Mario Wiedemann
Senior Project Manager

The Sample Data Catalogue of open data from Germany’s municipalities is updated twice a year. The most recent update was released in November 2023. Compared with the previous version, some 70 new communities have been added, including cities such as Erlangen, Göttingen and Ingolstadt, and towns such as Deggendorf, Sulingen and Varel.

The following graphic shows which municipalities publish open data and can be found on govdata.de – and are therefore included in the current version of the Sample Data Catalogue.

The number of communities publishing data openly in a given state says nothing about the quality of the data. For example, numerous cities and towns (especially in Rhineland-Palatinate) have published only one dataset, often a PDF of their zoning plan.

Most datasets assigned to the topic “Spatial Planning”

The taxonomy used for the Sample Data Catalogue comprises 25 topics, under which 241 labels are grouped hierarchically (e.g. “Tourism – Sights” or “Waste Disposal – Waste Disposal Fees”). Each combination of a topic and a label makes up one sample dataset. The following graphic shows the distribution of datasets across the 25 topics. As can be seen, the algorithm underlying the Sample Data Catalogue assigns more than half of the datasets to the topic “Spatial Planning”, since they often contain a zoning plan as a PDF (sample dataset “Spatial Planning – Zoning Plan”).
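To make the topic-and-label structure concrete, here is a minimal Python sketch. It uses only the three sample dataset names quoted above; the class name SampleDataset and its parse helper are hypothetical and are not part of the catalogue's own tooling.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class SampleDataset:
    """One sample dataset: a topic plus a label grouped under it (hypothetical class)."""
    topic: str
    label: str

    @classmethod
    def parse(cls, name: str) -> "SampleDataset":
        # Names in the article are written as "Topic – Label" with an en dash.
        topic, label = (part.strip() for part in name.split("–", 1))
        return cls(topic, label)


# The three sample dataset names quoted in the article.
names = [
    "Tourism – Sights",
    "Waste Disposal – Waste Disposal Fees",
    "Spatial Planning – Zoning Plan",
]
samples = [SampleDataset.parse(n) for n in names]

# Counting by topic is the kind of tally a per-topic distribution is based on.
per_topic = Counter(s.topic for s in samples)
print(per_topic)
```

Applied to the full catalogue, the same tally by topic would yield the distribution described above.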

According to the current complete list, there are 117,807 datasets on GovData (as of Aug. 20, 2024). With its roughly 30,000 datasets, the Sample Data Catalogue therefore contains approximately one-quarter of the data available on GovData. The remaining datasets are either non-municipal data or have been filtered out due to a lack of information about the organization providing the data.
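The "one-quarter" figure can be checked with a quick calculation. The sketch below uses the rounded catalogue size of 30,000 from this article, so the result is only approximate.

```python
catalogue_datasets = 30_000   # approximate size of the Sample Data Catalogue
govdata_datasets = 117_807    # GovData complete list, as of Aug. 20, 2024

# Share of GovData datasets that are covered by the catalogue.
share = catalogue_datasets / govdata_datasets
print(f"{share:.1%}")  # prints 25.5%
```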

How to find the sample dataset that matches your data

Do you want to publish new datasets as open data and wonder how your data would be classified in the Sample Data Catalogue? On the platform Hugging Face, you can enter the names of your datasets and receive suggestions about where they would be integrated into the Sample Data Catalogue. You can then identify the URI of the corresponding sample dataset and, if you are using DCAT-AP.de, insert the appropriate link in the dct:references field of the metadata. The best way to do this is explained in more detail in the DCAT-AP.de conventions manual. This procedure will make it easier to update the Sample Data Catalogue in the future. Until now, datasets have been classified by an algorithm that takes only the dataset’s name into account.
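As a rough illustration of what such a link might look like, the following Python sketch builds a minimal Turtle description in which a dataset points to a sample dataset via dct:references. The function name dataset_turtle and both example.org URIs are hypothetical placeholders; the exact form recommended for DCAT-AP.de is the one given in the conventions manual, not this sketch.

```python
DCT = "http://purl.org/dc/terms/"    # Dublin Core terms namespace
DCAT = "http://www.w3.org/ns/dcat#"  # DCAT vocabulary namespace


def dataset_turtle(dataset_uri: str, title: str, sample_dataset_uri: str) -> str:
    """Return a minimal Turtle snippet linking a dataset to a sample dataset.

    Note: the title is inserted as-is; quotes inside it are not escaped.
    """
    return "\n".join([
        f"@prefix dcat: <{DCAT}> .",
        f"@prefix dct: <{DCT}> .",
        "",
        f"<{dataset_uri}> a dcat:Dataset ;",
        f'    dct:title "{title}" ;',
        f"    dct:references <{sample_dataset_uri}> .",
    ])


# Hypothetical placeholder URIs; substitute your dataset's URI and the
# sample dataset URI you identified for it.
print(dataset_turtle(
    "https://example.org/dataset/zoning-plans",
    "Zoning plans",
    "https://example.org/sample-dataset/spatial-planning-zoning-plan",
))
```

In practice, the sample dataset URI would be the one identified for your data, and the statement would be added to the metadata record your portal already maintains.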

The new version of the Sample Data Catalogue can be accessed through musterdatenkatalog.de. That is where you will find the most comprehensive overview of open municipal data in Germany, i.e. information on which communities publish which kinds of data. The next update of the Sample Data Catalogue will take place in approximately six months and will include local-level data that have been added to govdata.de. The Sample Data Catalogue was initially developed in 2018 as a joint project with GovData, the Open Knowledge Foundation Deutschland, and KDZ – Centre for Public Administration Research.