How to better share and find

With Topic Maps, Semantic Web and Linked Data moving onto the web, it is URLs that connect it all together, and http://subj3ct.com could be the place to find and manage the URLs that matter. Subj3ct.com was first announced at Topic Maps 2009 and has since grown to contain over 16 million subject identifiers and related resources. Graham Moore will tell us why and how, and even share different usage scenarios at Topic Maps in Oslo.

How long have you been working with topic maps?
I first got interested in Topic Maps at the Basilage Conference in Montreal in 1999. This is where I first met Steve Newcomb and Michel Biezunski and thought this thing they were talking about sounded really cool and super-powerful. Since then I've been involved as editor of XTM 1.0, produced by TopicMaps.org, and for many years as editor, with Lars Marius Garshol, on ISO 13250.

Graham Moore and his partner Khalil Ahmed decided to found Networked Planet in 2004 at the Topic Maps Conference here in Oslo. They have been selling their Topic Maps software for the .NET platform here since 2005, and their first customer was udir.no. Since then many government and commercial organizations have adopted the products. But what are the reasons for the new subj3ct.com service being such a success in just one year?

Subj3ct.com has been a great success, but we still have a lot of things we want to do with it. While we do have a lot of identifiers from some key sources, there is still work to do to get more people to use them to tag content and link to additional data for a given subject. The attraction of subj3ct to people is that it is a free, online service where they can find and reuse identifiers, which helps them to share and link related content on the web. It is also a way to gain benefit from existing investments in Topic Maps technology.

“The attraction of subj3ct to people is that it is a free, online service where they can find and reuse identifiers, which in turn helps them to share and link related content on the web”

Compared to other directories, subj3ct.com isn't a repository in the traditional sense. Most repositories require users to create the content, in this case the identifiers, in the repository. This tends not to scale well, and the data becomes stale. Subj3ct processes data feeds and data sources that are already being published by organizations. These feeds are already in use, and the organization has plenty of incentive to keep them up to date. Another key reason people choose to use subj3ct is that it has a built-in model of trust and provenance, Graham explains. We record and publish where all the data comes from and authenticate sources, so that while anyone can publish into the service, clients reading data can choose which data they want and, effectively, who to trust. Trust is a very big issue on the web, and this extends to the notion of identifiers for things.
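
The idea that readers choose whom to trust can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the record shape, the field names, the example.org URLs and the source names are all hypothetical, and nothing here reflects how subj3ct.com actually stores or serves its data.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class IdentifierRecord:
    """Hypothetical record: one published identifier plus its provenance."""
    identifier: str  # URL that identifies a subject
    label: str       # human-readable description
    source: str      # name of the publisher the record came from


def trusted_records(records, trusted_sources):
    """Keep only the records published by sources the reader trusts."""
    return [r for r in records if r.source in trusted_sources]


# Made-up sample data; anyone may publish, so both records are accepted.
RECORDS = [
    IdentifierRecord("http://example.org/id/school", "School", "authority-a"),
    IdentifierRecord("http://example.org/other/school", "School", "unknown-b"),
]

# The reading client decides which publishers it is willing to rely on.
chosen = trusted_records(RECORDS, {"authority-a"})
```

The point of the sketch is that filtering happens on the reading side: publishing stays open, while each client applies its own list of trusted sources.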

Subj3ct.com is a service that aims to be the glue for the Linked Web of Data in the same way that DNS connects together the web of today.

So in short, what are the basic concepts in a non technical way?
In general, up until now, most data on the web has been HTML pages. These are retrieved by a browser, and the HTML tells the browser what to show to the user. It’s basically publishing information for consumption by humans reading text. More recently, however, in a growing trend called Linked Data, led by Tim Berners-Lee (http://www.youtube.com/watch?v=3YcZ3Zqk0a8) and Nigel Shadbolt, organisations and governments are publishing data on the web. This data is about specific things, places, people, themes etc. Now, all of these things are to be identified by URLs. Here is the URL for the school 'Alpha Community School At Springfield Methodist Church', http://education.data.gov.uk/doc/school/100867 and here is the link for the Linked Data representation of this school: http://education.data.gov.uk/doc/school/100867.rdf. The .rdf link returns the data in RDF/XML format. This can be processed by machines and mashed up with other Linked Data from other sources.
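
To make "processed by machines" a little more concrete, here is a minimal Python sketch using only the standard library. It assumes the pattern shown in the school example, where appending ".rdf" to the page URL gives the RDF/XML link, and it parses a small hand-written RDF/XML sample rather than fetching anything. The sample document, its use of rdfs:label and the helper names are assumptions made for illustration; the real response from education.data.gov.uk may look quite different.

```python
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
RDFS_NS = "http://www.w3.org/2000/01/rdf-schema#"

# Hand-written sample, NOT the actual data returned by the service.
SAMPLE_RDF_XML = """
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:rdfs="http://www.w3.org/2000/01/rdf-schema#">
  <rdf:Description rdf:about="http://education.data.gov.uk/doc/school/100867">
    <rdfs:label>Alpha Community School At Springfield Methodist Church</rdfs:label>
  </rdf:Description>
</rdf:RDF>
"""


def rdf_url_for(page_url):
    """Derive the RDF/XML link, assuming the '.rdf' suffix pattern above."""
    return page_url + ".rdf"


def labels_by_subject(rdf_xml):
    """Map each described subject URL to its rdfs:label, if it has one."""
    root = ET.fromstring(rdf_xml.strip())
    labels = {}
    for description in root.findall(f"{{{RDF_NS}}}Description"):
        subject = description.get(f"{{{RDF_NS}}}about")
        label = description.find(f"{{{RDFS_NS}}}label")
        if subject is not None and label is not None:
            labels[subject] = label.text
    return labels


school_page = "http://education.data.gov.uk/doc/school/100867"
school_rdf = rdf_url_for(school_page)
labels = labels_by_subject(SAMPLE_RDF_XML)
```

A real client would download the document at the .rdf link and would more likely hand it to a dedicated RDF library, since RDF/XML allows several equivalent ways of writing the same statements.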

Now, with more and more organisations publishing identifiers for things, how are people meant to find out what identifiers exist and where on the web they can get the Linked Data about that thing?
And how can different organisations use common identifiers for common notions, e.g. School, Education? And how can authorities expose their identifiers to a wider community?

Subj3ct.com provides the place for organisations to publish what identifiers they have along with human readable descriptions. Users of subj3ct.com can then do two major things: if they are a publisher they can search subj3ct.com to see if there are common identifiers they want to reuse, or if they are searching the web they can search subj3ct first to find the thing they are looking for and then that will link them to all the structured linked data around the world that is about that subject.

Can you give a real world example of this in use?
Yes, udir.no publishes the National Curriculum for Norway as a Topic Map. We have taken that and created a feed into Subj3ct for all the things in the national curriculum. At the Topic Maps conference there is a talk that describes how, using the GREP identifiers in Subj3ct.com, content from NRK Skole and Naturfag can be linked together and shared.