How can data about third-party resources be published as part of a dataset?


Datasets are rarely completely self-contained. They will often contain facts or data about third-party resources. These may be entities from other systems or existing web documents that are part of an external system or website. It should be possible to surface this information alongside the data that originates in the dataset.


Simply publish RDF documents containing the statements about the external resources.


Linked Data retrieved from a given URI might contain the following data, which includes annotations about two additional resources:

<> a foaf:Person.

  dc:title "Ubuntu Tips";
  dc:creator <>.

  owl:sameAs <>.
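As a fuller illustration of the same idea, here is a minimal sketch of a complete Turtle document. Every URI under example.org, example.com, and example.net is a hypothetical placeholder chosen for illustration, not a value from the original dataset. Only the vocabulary prefixes (FOAF, Dublin Core, OWL) are real namespaces. The first statement describes a resource the publisher owns, and the other two annotate resources hosted elsewhere.

```turtle
@prefix foaf: <http://xmlns.com/foaf/0.1/> .
@prefix dc:   <http://purl.org/dc/elements/1.1/> .
@prefix owl:  <http://www.w3.org/2002/07/owl#> .

# Resource that originates in this dataset (hypothetical URI).
<http://www.example.org/people/alice> a foaf:Person .

# Annotation about an external web document (hypothetical URI):
# the dataset asserts a title and a creator for a page it does not host.
<http://blog.example.com/ubuntu-tips>
  dc:title "Ubuntu Tips" ;
  dc:creator <http://www.example.org/people/alice> .

# Annotation about an external entity (hypothetical URI):
# the dataset asserts that a third-party identifier denotes the same person.
<http://social.example.net/users/alice>
  owl:sameAs <http://www.example.org/people/alice> .
```

A client that dereferences the local URI receives all three descriptions together, so the annotations about the external resources surface alongside the data that originates in the dataset.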


With RDF, Anyone can say Anything Anywhere: there are no restrictions on who may make statements about a resource, although a processing application will clearly want to pick its sources carefully.

It is entirely consistent with the Linked Data principles to make statements about third-party resources. While in some cases it is useful to use the Proxy URIs pattern, in many cases it is sufficient simply to make statements about external resources, whether these are entities (e.g. a person) or documents (e.g. a web page).