The Amazing Power of Wikidata

We have recently switched Histropedia’s primary source for event dates from Wikipedia to Wikidata. We are still using Wikipedia as a date source and all the events remain linked to a Wikipedia article, but now whenever possible we are trying to use dates from Wikidata.

This is a major improvement for us because the two projects store dates very differently. Wikipedia has no set format for date entry, and the dates we extract from it are only ever accurate to the nearest year. Wikidata, on the other hand, stores dates as structured data that is easily machine-readable. Not only does this mean no more parsing errors, it also gives us much better date precision. This has recently become very important, as we have just extended our own date properties to include days and months.

See previous post: New feature – Zooming in further
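To give a feel for what "structured" means here, the sketch below reads a Wikidata time value, which carries an ISO-style timestamp together with a numeric precision (9 for year, 10 for month, 11 for day). The EventDate class and the parse_wikidata_time function are hypothetical names made up for this illustration; this is not Histropedia's actual code, and it assumes the value has already been pulled out of a Wikidata claim.

```python
from dataclasses import dataclass
from typing import Optional

# Precision codes used in Wikidata time values.
PRECISION_YEAR = 9
PRECISION_MONTH = 10
PRECISION_DAY = 11


@dataclass
class EventDate:
    """Hypothetical date record, invented for this example."""
    year: int
    month: Optional[int] = None
    day: Optional[int] = None


def parse_wikidata_time(value: dict) -> EventDate:
    """Turn a Wikidata time value into an EventDate, honouring its precision.

    Assumption: anything coarser than year precision (decade, century, ...)
    is simply treated as a year here.
    """
    time_str = value["time"]        # e.g. "+1969-07-20T00:00:00Z"
    precision = value["precision"]

    # A leading "-" marks a negative (BCE) year.
    sign = -1 if time_str.startswith("-") else 1
    date_part = time_str.lstrip("+-").split("T")[0]
    year_str, month_str, day_str = date_part.split("-")

    date = EventDate(year=sign * int(year_str))
    # Only trust the month and day fields when the precision says they are meaningful.
    if precision >= PRECISION_MONTH:
        date.month = int(month_str)
    if precision >= PRECISION_DAY:
        date.day = int(day_str)
    return date
```

A value with precision 11 yields a year, month and day, while the same value with precision 9 yields only the year, so there is no guessing about how exact a date really is.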

Wikidata (a project of the Wikimedia Foundation) is a free knowledge base that can be read and edited by humans and machines alike. The project is making rapid progress in collecting structured data to support Wikipedia, Wikimedia Commons, the other Wikimedia projects, and well beyond that. Its scope is huge, and as it grows it will have a major impact across the entire Wikimedia Foundation, allowing a new level of interconnectivity and accuracy for all projects within the foundation, as well as providing a resource for other data-driven applications such as Histropedia.

Visit Wikidata’s website to learn more about the project.

Now that Histrobot (our event-adding assistant) knows about Wikidata, he will always try to use Wikidata dates when he automatically creates new events. Histrobot will also check for new dates in both Wikidata and Wikipedia every time the event editing/creating window is opened. If dates are found, they can be added to the date input fields by clicking the “use” link. Whenever Wikidata dates are available, they should be used for Histropedia events.
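The "Wikidata first, Wikipedia as a fallback" rule can be sketched in a few lines. This is only an illustration of the preference order described above: choose_date is a hypothetical helper, and it assumes both lookups have already run and return None when nothing was found. It is not how Histrobot is actually implemented.

```python
from typing import Optional, Tuple


def choose_date(
    wikidata_date: Optional[str],
    wikipedia_date: Optional[str],
) -> Optional[Tuple[str, str]]:
    """Pick the preferred date and report which source it came from.

    Returns a (source, date) pair, or None when neither source has a date.
    """
    # Wikidata wins whenever it has a date at all.
    if wikidata_date is not None:
        return ("wikidata", wikidata_date)
    # Otherwise fall back to the date extracted from Wikipedia.
    if wikipedia_date is not None:
        return ("wikipedia", wikipedia_date)
    return None
```

Returning the source alongside the date makes it easy to show the user where a suggested date came from before they click “use”.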

There is also a link to the Wikidata item for the event that is being edited, so if no dates are found you can either add the dates directly to the event on Histropedia, or follow the link to go straight to the source and add the dates to the Wikidata item. The dates will automatically be available on Histropedia by simply clicking the ‘reload’ button. Users who edit Wikidata in this way will be lending support not only to Histropedia, but also to Wikidata and all of the other projects that use Wikidata as a source. By taking that extra step and editing Wikidata you will be helping a cause that is much bigger than Histropedia and we strongly encourage our users to do this.

We are starting with dates as they are the most important data property for our project, but that’s not the end of the story of our Wikidata integration. Dates make up only a fraction of the data Wikidata is collecting, and in the future we will be using more of the available data properties to unlock more of the amazing power of Wikidata.

We plan to create an advanced filtering system that can be used to build custom timelines. For example, we could use such a system to automatically create a timeline showing all buildings in London constructed before 1900 that are still standing. The possibilities for such a system are endless, and we are looking forward to doing whatever we can to help further the efforts of Wikidata.
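As a rough idea of what such a filter might look like once the data is available locally, here is a minimal sketch. The Building record, its field names and the standing_buildings_before function are all hypothetical and invented for this example; the planned filtering system does not exist yet, and real Wikidata items would need to be mapped into a shape like this first.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Building:
    """Hypothetical record for one building, invented for this example."""
    name: str
    city: str
    built: int                        # year construction finished
    demolished: Optional[int] = None  # None means still standing


def standing_buildings_before(
    buildings: List[Building], city: str, year: int
) -> List[Building]:
    """Return buildings in `city` built before `year` that are still standing,
    ordered by construction year so they can be laid out on a timeline."""
    matches = [
        b for b in buildings
        if b.city == city and b.built < year and b.demolished is None
    ]
    return sorted(matches, key=lambda b: b.built)


# Placeholder records, not real data.
sample = [
    Building("Building A", "London", 1850),
    Building("Building B", "London", 1700, demolished=1950),
    Building("Building C", "London", 1920),
    Building("Building D", "Paris", 1800),
    Building("Building E", "London", 1650),
]

timeline = standing_buildings_before(sample, "London", 1900)
print([b.name for b in timeline])
```

With the placeholder records above, only Building E and Building A pass the filter: the others were demolished, built too late, or are in another city.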


4 comments

  • I’m interested to know whether you’re using Wikidata dumps as your data source, or if you look up Wikidata over its web APIs. If you’re using the dumped data, are you using a third-party library to access it, or have you rolled your own code? If you’re using the web API, how is the performance of the Wikidata site for you?

    • Hi Jim,

      Thank you for commenting. I’m sorry for the late reply, we are having some issues with the comments at the moment.

      We use a combination of data dumps (rolled into our own code) and live API calls.
      For our use of the API we have had no performance issues; however, we try to keep our use of the API as light as possible. There is a limit on the API calls that can be made, so we only use the API to check the latest date data when a user enters our edit screen. Everything else is done via the data dumps.

  • It is nice that you are improving. Precise dates are really important for a source like yours. I have read some bad things about Wikidata, though; for instance, here they write that it has lots of vulnerabilities. Another issue is that it is still not quite ready for prime time: we have tried to use Wikidata data in our apps, which we build primarily for business use, but we were unable to find and retrieve facts to use, exactly as they write here. It’s cool that you are going to further Wikidata’s efforts; I think it will be huge in the future.

    • Ok Final one, again sorry for the late replies

      Great to hear you are taking an interest in using Wikidata as part of your development, and sorry to hear you were unsuccessful in getting the data you required. The database is growing at an incredible rate and most of what is in the smallbiztrends article is out of date. The project was launched in 2012, so it was very new at the time that article was written, but obviously there is still lots of missing data today. We find that there is potential for using Wikidata for building apps, but it depends heavily on the completeness of data in that area of Wikidata.

      I won’t go into my responses to the article on The Register, as it appears to be written from a very negative standpoint. I will just say that I would prefer an imperfect store of knowledge that is community driven and open over anything controlled by a single organisation.

      We have just released a JavaScript library (HistropediaJS) which renders timelines using our core engine, and we are already experimenting with a combination of client data and Wikidata to build new timeline applications. I’m really happy to discuss this further or answer any questions you have about Wikidata, so feel free to drop me an email anytime: sean@histropedia.com
