3 underestimated points of attention for successful data catalogs
You will be disappointed in your data catalog when the search results do not come close to your expectations. You scroll quickly through the list of results and pause. Then you look for filters you recognize. Then, probably within five minutes, you close the catalog, never to return.
Lost in search results
The data catalogs I have come across return long lists of technical information. Enter a search term like ‘customer’ or ‘cashflow’ and you are flabbergasted by the sheer number of results.
Imagine you’re in the library:
Hugo de Gooijer
When you search, are you looking for chapters, paragraphs, words, font types and sizes? Or would you rather see the title, summary, author and source of a book?
Make the user journey as easy as possible
Data catalog vendors present their strength as collecting all the technical metadata from a variety of sources, putting it together in a big pile and perhaps applying algorithms that suggest what sort of data it is. Although I agree that complete and accurate technical metadata is the foundation, this approach does not lead to a data search function that is widely used across the organization, because it overlooks the needs of the user.
Here are 3 points of attention for the user experience:
- show relevant results – are the best hits shown first?
- show reliable results – are the results close to my search terms?
- organize the results – do I understand the result categories in business terms?
Below are some criteria to get you started:
Relevancy
- Presentation form: easy visualization
- Precision: result close to my search
- Timeliness: responsiveness and accessibility
- Reasonableness: result related to my search
Reliability
- Completeness: scope & source of datasets
- Accuracy: correctness of metadata
- Consistency: same results each search
- Currency: age, time of release, in sync with data
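To make the relevancy criteria a little more concrete, here is a minimal sketch of how a catalog could put the best hits first. It is an illustration under my own assumptions, not how any particular catalog product works: the `Dataset` fields, the `WEIGHTS` values, the sample `catalog` and the `score` and `search` functions are all hypothetical. The idea is that a match in the title or summary (what a business user recognizes) counts for more than a match buried in a technical column name.

```python
from dataclasses import dataclass


@dataclass
class Dataset:
    title: str
    summary: str
    columns: list[str]  # technical metadata, e.g. column names


# Hypothetical weights: business-facing fields outrank technical ones.
WEIGHTS = {"title": 3.0, "summary": 2.0, "columns": 1.0}


def score(dataset: Dataset, query: str) -> float:
    """Sum the weights of every field in which a search term occurs."""
    total = 0.0
    for term in query.lower().split():
        if term in dataset.title.lower():
            total += WEIGHTS["title"]
        if term in dataset.summary.lower():
            total += WEIGHTS["summary"]
        if any(term in col.lower() for col in dataset.columns):
            total += WEIGHTS["columns"]
    return total


def search(datasets: list[Dataset], query: str) -> list[Dataset]:
    """Return only matching datasets, best hits first."""
    scored = [(score(d, query), d) for d in datasets]
    hits = [(s, d) for s, d in scored if s > 0]
    # sort() is stable, so equal scores keep their input order and the
    # same query returns the same ordering every time (consistency).
    hits.sort(key=lambda pair: pair[0], reverse=True)
    return [d for _, d in hits]


# Made-up sample catalog for illustration only.
catalog = [
    Dataset("Invoice lines", "Raw billing export", ["customer_id", "amount"]),
    Dataset("Customer master data", "Golden record per customer", ["customer_id", "name"]),
    Dataset("Cashflow forecast", "Weekly cashflow projection", ["week", "amount"]),
]

for hit in search(catalog, "customer"):
    print(hit.title)
```

With this sample data, a search for ‘customer’ lists the dataset whose title and summary mention customers before the one that only has a `customer_id` column, and leaves out the cashflow forecast entirely.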
Result categories
The catalog must support the creation of custom categories. Because each organization has its own ‘language’, datasets are quickly understood when you can put them into those categories. My advice is to use stable labels such as business processes rather than department or business domain names, as these may change with a reorganization.
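As a rough sketch of what such custom categories could look like, the snippet below groups search results under business-process labels. Everything in it is assumed for illustration: the `CATEGORY_OF` mapping, the process names and the `group_by_category` function are hypothetical, and a real catalog would maintain this mapping in its own configuration.

```python
from collections import defaultdict

# Hypothetical mapping maintained by the organization:
# dataset title -> business process (a label that survives reorganizations).
CATEGORY_OF = {
    "Customer master data": "Order to cash",
    "Invoice lines": "Order to cash",
    "Supplier contracts": "Procure to pay",
}


def group_by_category(result_titles, category_of, fallback="Uncategorized"):
    """Group result titles by category, keeping the ranking order inside each group."""
    groups = defaultdict(list)
    for title in result_titles:
        groups[category_of.get(title, fallback)].append(title)
    return dict(groups)


results = ["Invoice lines", "Supplier contracts", "Customer master data", "Sensor log"]
for category, titles in group_by_category(results, CATEGORY_OF).items():
    print(category, "->", titles)
```

Datasets that nobody has categorized yet end up under a visible fallback label instead of disappearing, which also shows where the category mapping still needs work.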