Monday 19 January 2015

Filtering by Vocabulary Tags in CKAN

CKAN has great documentation on adding vocabulary tags to a dataset form, but the tutorial stops a bit short. Once you've added a vocabulary to a dataset, a common use case is to filter your search results by those tags, and the tutorial doesn't show you how. This blog post gives some advice on how to accomplish this, but it didn't work for me (at least on the version I'm using, v2.2.1).

What worked for me was to add my vocabulary to the dataset facets using the dataset_facets() method of the IFacets interface.

To do that, we first need to find the internal name of the vocabulary. I think the format rule is vocab_<vocabulary_name>, but to be sure you can check as I did (though I'm sure there are better ways to find out):

First I tagged one of my datasets with one of the tags in my vocabulary (in this case 'Finland'). I then went to the Solr admin page (at http://localhost:8983/solr/admin/) and queried for that string.
The XML results contained this snippet:
<arr name="vocab_country_codes">
<str>Finland</str>
</arr>
This shows that vocab_country_codes was the internal name used by CKAN. Once you've ascertained the internal name for your vocabulary, go to your extension's plugin.py and add it as a facet to your dataset.
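    # In plugin.py, inside your plugin class (with ckan.plugins imported as p):
    p.implements(p.IFacets, inherit=True)

    def dataset_facets(self, facets_dict, package_type):
        # Key: the vocabulary's internal field name; value: the label shown on the page.
        facets_dict['vocab_country_codes'] = p.toolkit._('Country Codes')
        return facets_dict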
Restart your server and you should find the filter option for your vocabulary on the dataset index page. The IFacets interface also has group_facets() and organization_facets() which can be used in the same way.
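
If you want the same vocabulary facet on the group and organization pages too, one way to avoid repeating yourself is a small mixin like the sketch below. The names VocabFacetsMixin, VOCAB_FACETS and _with_vocab_facets are my own inventions, not part of CKAN, and the sketch leaves out the p.toolkit._() translation call so that it runs without CKAN installed. The method signatures are the ones I understand IFacets to use, so check them against your CKAN version.

```python
from collections import OrderedDict

# Assumption: the field name and label used earlier in this post.
VOCAB_FACETS = [('vocab_country_codes', 'Country Codes')]


class VocabFacetsMixin(object):
    """Hypothetical mixin: combine it with a plugin class that declares
    p.implements(p.IFacets, inherit=True)."""

    def _with_vocab_facets(self, facets_dict):
        # Append each vocabulary facet after the facets CKAN already passed in.
        for field, label in VOCAB_FACETS:
            facets_dict[field] = label
        return facets_dict

    def dataset_facets(self, facets_dict, package_type):
        return self._with_vocab_facets(facets_dict)

    def group_facets(self, facets_dict, group_type, package_type):
        return self._with_vocab_facets(facets_dict)

    def organization_facets(self, facets_dict, organization_type, package_type):
        return self._with_vocab_facets(facets_dict)


# Quick check outside CKAN: the existing facet is kept, the new one is added.
example = VocabFacetsMixin().group_facets(
    OrderedDict([('tags', 'Tags')]), 'group', 'dataset')
print(list(example.keys()))
```

In a real plugin you would probably still wrap the label in p.toolkit._() so that it can be translated.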

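Another way to check the facet, without opening the Solr admin page, might be to ask CKAN's package_search API action for it. The sketch below only builds the request URL. It assumes the vocab_<vocabulary_name> rule described above holds, and that your CKAN site runs at http://localhost:5000 (my assumption, adjust as needed). The helper names vocab_field and search_url are made up for illustration, and I have not verified the behaviour on v2.2.1.

```python
import json

try:
    from urllib.parse import urlencode  # Python 3
except ImportError:
    from urllib import urlencode  # Python 2


def vocab_field(vocabulary_name):
    # Assumed naming rule: vocab_<vocabulary_name>
    return 'vocab_' + vocabulary_name


def search_url(site_url, vocabulary_name, tag=None):
    """Build a package_search URL that facets on a vocabulary field and,
    if a tag is given, filters the results by it."""
    field = vocab_field(vocabulary_name)
    # facet.field is passed as a JSON list; rows=0 skips the dataset bodies.
    params = [('facet.field', json.dumps([field])), ('rows', 0)]
    if tag is not None:
        params.append(('fq', '{0}:"{1}"'.format(field, tag)))
    return (site_url.rstrip('/') + '/api/3/action/package_search?'
            + urlencode(params))


# Hypothetical site URL; vocabulary and tag are the ones from this post.
print(search_url('http://localhost:5000', 'country_codes', 'Finland'))
```

Opening the printed URL in a browser should return JSON. If the vocabulary is indexed under that name, the search_facets part of the response should list the tag with a count.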
