
Engineering analysis and the Semantic Web

2008-03-19, version 1.1

author: David Leal - mailto:david.leal@caesarsystems.co.uk
CAESAR Systems Limited - http://www.caesarsystems.co.uk

Abstract

This document describes how the Semantic Web can be used for the management of engineering analysis data. This technology requires two initial steps:

  1. the creation of a vocabulary for engineering analysis by an authoritative organisation such as NAFEMS;
  2. the assignment of URIs to analysis codes and file formats by vendors.

The document also describes how NAFEMS could assist vendors to assign permanent URIs.

Acknowledgement

This document has been funded by the EU FP6 project DEPUIS - http://www.depuis.enea.it/ (Design of Environmentally friendly Products Using Information Standards). The support from the European Union is gratefully acknowledged.

1 What is the Semantic Web

1.1 Basics of the Semantic Web

The "Semantic Web" is a web of information on the Internet (or on an Intranet) which is annotated so that it can be accessed by precise queries, such as:

What activities used the results of the natural frequency analysis carried out on "MyTown by-pass bridge"?

Today you might use Google and search for human readable documents which contain the words or phrases "result", "natural frequency", "MyTown by-pass", "bridge". Perhaps you might get something relevant, but you may also get a lot of junk.

The "Semantic Web" is called "semantic" because a query can be formulated in terms of object types and properties with precise meaning - a vocabulary. In this example, the query relies upon a vocabulary which includes:

Each of these object types and properties could be picked from a menu relevant to engineering analysis. The menu can be presented using words in any language - English, Dutch, Mandarin or Farsi.

A precise query also relies upon a precise identification of the object of interest - "MyTown by-pass bridge" in this example. This identification requires a new business practice - each object of interest is identified as an Internet "resource", and given a URI (see What is a URI). If the bridge is a project of MyTown District Council, then it may be given a URI by the council, such as http://www.my.town.gov.uk/projects/by-pass/bridge.
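As a rough illustration of how such a query might be answered, the annotations can be held as (subject, property, object) statements and matched against a pattern. The Python sketch below is not a real Semantic Web implementation: the in-memory store, the namespace http://www.example.org/, the activity names and the property nafems:usesInput are all assumptions made for this example, and no NAFEMS vocabulary has yet been published.

```python
# Hypothetical statements, each a (subject, property, object) triple.
BRIDGE = "http://www.my.town.gov.uk/projects/by-pass/bridge"
EX = "http://www.example.org/"  # assumed namespace, for illustration only

TRIPLES = {
    (EX + "run1", "rdf:type", "nafems:NaturalFrequencyAnalysis"),
    (EX + "run1", "nafems:analyses", BRIDGE),
    (EX + "run1", "nafems:givesResult", EX + "run1/modes"),
    (EX + "fatigue-check", "nafems:usesInput", EX + "run1/modes"),
    (EX + "design-review", "nafems:usesInput", EX + "run1/modes"),
    (EX + "run2", "rdf:type", "nafems:StressAnalysis"),
    (EX + "run2", "nafems:analyses", BRIDGE),
    (EX + "run2", "nafems:givesResult", EX + "run2/stress"),
    (EX + "repaint", "nafems:usesInput", EX + "run2/stress"),
}


def match(s=None, p=None, o=None):
    """Return all statements matching the pattern; None acts as a wildcard."""
    return [t for t in TRIPLES
            if (s is None or t[0] == s)
            and (p is None or t[1] == p)
            and (o is None or t[2] == o)]


def activities_using_results(analysis_type, object_uri):
    """Activities that used a result of an analysis of the given type
    carried out on the given object."""
    users = set()
    for run, _, _ in match(p="nafems:analyses", o=object_uri):
        if not match(s=run, p="rdf:type", o=analysis_type):
            continue  # an analysis of the object, but of another type
        for _, _, result in match(s=run, p="nafems:givesResult"):
            for activity, _, _ in match(p="nafems:usesInput", o=result):
                users.add(activity)
    return sorted(users)


print(activities_using_results("nafems:NaturalFrequencyAnalysis", BRIDGE))
```

Because the object of interest is named by its URI and the analysis type by a vocabulary term, the query returns only the activities that used the natural frequency results, and not those that used the stress results for the same bridge.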

NOTE 1 Dereferencing the URI for an analysis type, such as heat diffusion, could obtain a representation of the governing equations:

$$\nabla \cdot (k \nabla T) + Q = D \, \frac{\partial T}{\partial t}$$

The equations could be represented using MathML. The definition of the analysis type could be explicit about whether non-linear behaviour, such as temperature dependence of the conductivity k, is taken into account.

NOTE 2 Many of the objects of interest of engineering analysis are fields, which can be described with respect to a mesh. An input loading for one simulation could be a result from another. Even the material properties could be a distribution resulting from a manufacturing process simulation.

Each of these fields could be represented in an open format, but initially the cost of moving from vendor formats to an open format may be too great. The Semantic Web approach works, and gives benefits, even if fields are still represented in vendor formats.

1.2 Why the Semantic Web

The Semantic Web allows you to record the information you have - whatever it is, however incomplete - precisely. The Semantic Web does not constrain what that information is, except that it must be recorded using a defined and public vocabulary.

The Semantic Web has advantages over traditional approaches to data warehouses, or PDM (Product Data Management) as follows:

1.3 More about the Semantic Web

The Semantic Web is a W3C (World Wide Web Consortium) activity.

A good overview is provided by W3C Semantic Web Activity - http://www.w3.org/2001/sw/


The first popular explanation of the Semantic Web was an article in the Scientific American by Tim Berners-Lee, James Hendler and Ora Lassila in May 2001, see The Semantic Web - http://www.sciam.com/article.cfm?articleID=00048144-10D2-1C70-84A9809EC588EF21.

A follow-up article in the Scientific American by Lee Feigenbaum, Ivan Herman, Tonya Hongsermeier, Eric Neumann and Susie Stephens was published in December 2007, see The Semantic Web in Action - http://www.sciamdigital.com/index.cfm?fa=Products.ViewIssuePreview&ARTICLEID_CHAR=3734452E-3048-8A5E-1068474BA8D770C8. (Access to this article on the Web requires a Scientific American subscription.)

1.4 Who is already using the Semantic Web

The development of the Semantic Web, just like the development of the Internet, was largely funded by the US DoD. The Semantic Web is currently in use for many military applications, including logistics. A presentation by John Gilligan, Chief Information Officer of the USAF, is The Semantic Web - Imagine the Possibilities - http://www.daml.org/meetings/2005/04/pi/DOD_Venues.pdf.

Early adopters of the Semantic Web are in health care and life sciences. This community has similar requirements to the engineering analysis community because:

For more information, see the W3C Semantic Web Health Care and Life Sciences Interest Group - http://www.w3.org/2001/sw/hcls/.

2 Semantic Web for engineering analysis

2.1 What a Semantic Web for engineering analysis can be

Engineering analysis involves lots of data sets in different formats:

These data sets are about different objects, and are outputs from and inputs to different engineering activities. If we identify the objects (which the data sets are about), and the activities (which use and create the data sets), then we have a "web" of information.

To make the web of information a "Semantic Web", it is necessary to annotate each data set, and specify:

It is necessary to ensure that there are no missing nodes in this web. If there are a number of different data sets about an object, then it is necessary to specify:

It is necessary to ensure that there are no missing links in this web. If one data set is linked to another by an activity, then it is necessary to specify:

The Semantic Web annotation is "glue" which joins existing data sets together. These data sets can be in an open computer interpretable format (such as ISO STEP), in a proprietary computer interpretable format defined by an analysis software vendor (such as Catia or Siemens PLM), or in a human readable document format (such as Microsoft .doc or Adobe .pdf).

NOTE In the long term, some of the data sets can be replaced by the "glue". For example, there may be no need to have a data set which describes an activity, if there are precise statements within the Semantic Web which specify its type, date, performer, inputs and outputs. Instead, all that is necessary is a URI to identify the activity.
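A simple check for "missing links" could confirm that each activity URI has a statement for each of the properties mentioned in the note (type, date, performer, inputs and outputs). The Python sketch below is an assumption-laden illustration: the property names, the activity URI and the sample values are hypothetical, and a real vocabulary would define its own terms.

```python
# Properties which the note suggests should be stated for every activity.
# The names are placeholders; a published vocabulary would supply real ones.
REQUIRED = ("type", "date", "performer", "input", "output")


def missing_properties(activity_uri, statements):
    """Return the required properties for which no statement exists
    about the given activity."""
    present = {prop for subj, prop, _ in statements if subj == activity_uri}
    return [prop for prop in REQUIRED if prop not in present]


# Hypothetical activity with four of the five properties recorded.
RUN = "http://www.example.org/projects/run1"
statements = [
    (RUN, "type", "StressAnalysis"),
    (RUN, "date", "2008-03-19"),
    (RUN, "input", "http://www.example.org/projects/run1/mesh"),
    (RUN, "output", "http://www.example.org/projects/run1/result"),
]

print(missing_properties(RUN, statements))
```

Here the check reports that no performer has been recorded for the activity, which is the kind of gap that is easy to miss when the description of an activity lives only in a free-text document.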

2.2 The benefits of a Semantic Web for engineering analysis

A Semantic Web for engineering analysis will give the following benefits:

One day, "due diligence" will require a Semantic Web for engineering analysis - or something like it.

Early adopters will get benefits, if only because they will be able to find:

3 How to create a Semantic Web for engineering analysis

3.1 Who will create a Semantic Web for engineering analysis

The Semantic Web requires that every thing of interest ("resource" in web jargon) has a unique identifier on the Web (a URI - Uniform Resource Identifier). The things are defined and identified by different people, as follows:

The essential first steps to create a Semantic Web for engineering analysis have to be taken by NAFEMS and the vendors. Problem owners and analysts can then modify their business practices to take advantage of what is available.

3.2 A NAFEMS core

NAFEMS can define a basic vocabulary for engineering analysis.

The Dublin Core - http://dublincore.org/documents/2008/01/14/dc-rdf/ is a basic vocabulary for document meta-data, defining terms such as title, author, publisher, language, subject. NAFEMS can do the same for engineering analysis.

NAFEMS could do this in liaison with ISO TC184/SC4. The ISO STEP standard for engineering analysis (ISO 10303-209) contains an activity model for engineering analysis which defines types of analysis activity and types of analysis information. This activity model is now nearly 20 years old and needs updating, but it is nonetheless a useful starting point.

3.3 The role of the vendors

The vendors control the versions of native file formats, analysis codes and analysis types. The vendors have an obligation to give unique identifiers:

For the Semantic Web, these identifiers need to be URIs.

One approach would be for vendors to allocate URIs within their own Internet domain. Hence if Fred Bloggs and Co. has the domain http://www.fred.bloggs.co.uk, it could allocate:

A drawback to this approach is that URIs are expected to persist unchanged (see Cool URIs don't change - http://www.w3.org/Provider/Style/URI). Unfortunately, the owners of analysis codes do change. It could be said that this doesn't matter very much because a URI is only an identifier. However, this is not the full story because:

Quite reasonably, if you go to a URI which identifies a code, you expect information about that code, such as:

NAFEMS could help by offering a registry service. Hence a file format or code could be given a NAFEMS URI, such as:

The NAFEMS site could host a brief description of the code or file format, which would remain unchanged. The NAFEMS site could provide a link to the web site of the current analysis code owner. This link could change from time to time.
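The registry idea can be sketched as a small data structure in which the URI and the brief description are fixed at registration, while the link to the current owner may be updated. The Python below is only an illustration: the class names, the base address http://registry.example.org/codes, the description text and the new owner's address are all hypothetical, and nothing here describes an actual NAFEMS service.

```python
from dataclasses import dataclass


@dataclass
class RegistryEntry:
    description: str  # brief description, fixed once registered
    owner_url: str    # link to the current owner, may change over time


class Registry:
    """Illustrative registry which issues permanent URIs under one base."""

    def __init__(self, base):
        self.base = base.rstrip("/") + "/"
        self._entries = {}

    def register(self, name, description, owner_url):
        """Issue a permanent URI for a code or file format."""
        uri = self.base + name
        if uri in self._entries:
            raise ValueError(f"{uri} is already registered")
        self._entries[uri] = RegistryEntry(description, owner_url)
        return uri

    def update_owner(self, uri, new_owner_url):
        """Record a change of owner; the URI and description stay the same."""
        self._entries[uri].owner_url = new_owner_url

    def lookup(self, uri):
        return self._entries[uri]


registry = Registry("http://registry.example.org/codes")
uri = registry.register(
    "SuperStruct/4.5.7",
    "Structural analysis code, version 4.5.7",  # hypothetical description
    "http://www.fred.bloggs.co.uk/",
)
# The code is later taken over by another (hypothetical) company.
registry.update_owner(uri, "http://www.new-owner.example.com/")
print(uri, registry.lookup(uri))
```

The point of the design is that archived meta-data which cites the registered URI remains valid after the takeover, because only the owner link has changed.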


A What is a URI

A.1 Use of a URI

A URI (Uniform Resource Identifier) is a unique identifier of a thing, for use by the Internet.

Anybody can assign a URI to any thing. Hence I can assign the URI http://www.caesarsystems.co.uk/animals/Babar to "Babar the Elephant". It does not matter that:

The first part of a URI, http://www.caesarsystems.co.uk in this case, determines whether or not you trust it. If you believe that CAESAR Systems Limited is an appropriate authority for identifying fictitious animals, then you are free to use this identifier.

It is good if HTTP access to an HTTP URI actually returns something. If access to http://www.caesarsystems.co.uk/animals/Babar obtains a file which is readable by your browser (an HTML file say), and which tells you what/who "Babar the Elephant" was/is, then this is useful. If HTTP access obtains a file which says "the CAESAR Systems dictionary of fictitious animals is available from all good bookshops", then this is useful too - but less so.

If NAFEMS were to assign the URI http://www.nafems.org/vocabulary/NaturalFrequencyAnalysis to the concept of natural frequency analysis, then many would trust that NAFEMS has provided an authoritative definition.
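The "first part" of a URI can be extracted mechanically, so that software can apply a policy about which naming authorities it trusts. The Python sketch below uses the standard urllib.parse module; the function names and the content of the trusted list are assumptions made for this example.

```python
from urllib.parse import urlsplit


def naming_authority(uri):
    """Return the host part of an HTTP URI, which identifies who assigned it."""
    return urlsplit(uri).hostname


def is_trusted(uri, trusted_hosts):
    """True if the URI was assigned within a domain on the trusted list."""
    return naming_authority(uri) in trusted_hosts


# Hypothetical policy: trust NAFEMS for engineering analysis concepts.
TRUSTED_FOR_ANALYSIS = {"www.nafems.org"}

print(naming_authority("http://www.caesarsystems.co.uk/animals/Babar"))
print(is_trusted("http://www.nafems.org/vocabulary/NaturalFrequencyAnalysis",
                 TRUSTED_FOR_ANALYSIS))
```

Whether a given authority is appropriate for a given kind of thing remains a human judgement; the code only makes that judgement explicit and repeatable.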

A.2 Types of URI

There are two principal types of URI:

Since the use of URNs is six orders of magnitude smaller than the use of HTTP URIs, we can safely forget about them.

A.3 What a URI identifies

A URI can identify anything. Sometimes a URI identifies an electronic document which can be downloaded over the Web. Sometimes a URI identifies something else.

A URI can identify Babar the Elephant or the Eiffel Tower. Neither can be downloaded - the first because it is a fictitious animal, and the second because it is about 7,300 tonnes of wrought iron.

Dereferencing an HTTP URI may cause a document to be downloaded to your browser. This does not mean that the URI identifies the document. The document is a "representation" of the object identified by the URI, which the owner of the domain has chosen to provide. The owner of the domain may choose not to provide a document at all, in which case you will get "HTTP error 404".

Sometimes an HTTP URI identifies an electronic document, and when you dereference the URI the document is what you get.

NOTE There is an ambiguity about what an HTTP URI identifies - is it a thing, where the document is merely a representation, or is it the document itself? In practice, the ambiguity is something which we can live with.


B Semantic Web technologies

B.1 Semantic Web standards and software

The Semantic Web relies upon two basic technologies:

RDF does what it says - it is a methodology for describing resources. The resources can be data sets, or other things. RDF statements are published on the Web. The statements can be queried using SPARQL Query Language for RDF - http://www.w3.org/TR/rdf-sparql-query/. A free implementation of SPARQL is provided by Jena - http://jena.sourceforge.net/, which was initially developed by HP Labs Semantic Web Research - http://www.hpl.hp.com/semweb/.

RDF is intended to be extended by vocabularies. OWL - http://www.w3.org/2004/OWL/ is a basic vocabulary for vocabularies, which is usually the first extension to RDF.

B.2 Telling a story with the Semantic Web

The story is about a bridge:

[Image: lorry on bridge]

Figure 1: My Town by-pass bridge

The story is:

The web of objects is as follows:

[Image: stress result graph]

Figure 2: About the analysis of My Town by-pass bridge

This web of objects can be thought of as just meta-data for the file http://www.a.d.vance.co.uk/projects/MT_BB/run3/result#HA_MidSpan.stress, but really it is much more - it is a record of the problem and of what was done.

Each object in Figure 2 is defined, and assigned a URI, by somebody. Each player has his or her own namespace (the front bit of the URI) as follows:

Unfortunately, computers cannot process the diagram shown in Figure 2. Hence there has to be a text representation of the diagram. A representation of RDF as XML is widely used. Using XML, we can represent the statements:

as:

<rdf:RDF
         xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:owl="http://www.w3.org/2002/07/owl#"
         xmlns:nafems="http://www.nafems.org/vocabulary/">

 <owl:Thing rdf:about="http://www.my.town.gov.uk/projects/by-pass/bridge">
  <rdf:type rdf:resource="http://www.ice.org.uk/vocabulary/Bridge"/>
  <nafems:hasState rdf:resource="http://www.a.d.vance.co.uk/projects/MT_BB/HA_MidSpan"/>
 </owl:Thing>

 <owl:Thing rdf:about="http://www.a.d.vance.co.uk/projects/MT_BB/HA_MidSpan">
  <rdf:type rdf:resource="http://www.nafems.org/vocabulary/State"/>
 </owl:Thing>

</rdf:RDF>
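To show that the XML form really is a set of statements, the sketch below reads a well-formed copy of the XML above with the Python standard library and lists the (subject, property, object) triples it contains. It is a minimal illustration which handles only this simple pattern of rdf:about and rdf:resource attributes; it is not a general RDF/XML parser, and it ignores the additional type implied by the owl:Thing element name. The function name triples is an assumption of this example.

```python
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"

DOC = """<rdf:RDF
         xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:owl="http://www.w3.org/2002/07/owl#"
         xmlns:nafems="http://www.nafems.org/vocabulary/">
 <owl:Thing rdf:about="http://www.my.town.gov.uk/projects/by-pass/bridge">
  <rdf:type rdf:resource="http://www.ice.org.uk/vocabulary/Bridge"/>
  <nafems:hasState rdf:resource="http://www.a.d.vance.co.uk/projects/MT_BB/HA_MidSpan"/>
 </owl:Thing>
 <owl:Thing rdf:about="http://www.a.d.vance.co.uk/projects/MT_BB/HA_MidSpan">
  <rdf:type rdf:resource="http://www.nafems.org/vocabulary/State"/>
 </owl:Thing>
</rdf:RDF>"""


def triples(xml_text):
    """Yield (subject, property, object) URIs from simple RDF/XML."""
    root = ET.fromstring(xml_text)
    for node in root:
        subject = node.get(f"{{{RDF_NS}}}about")
        for prop in node:
            # ElementTree writes tags as "{namespace}local";
            # joining the two parts gives the full property URI.
            predicate = prop.tag[1:].replace("}", "", 1)
            yield subject, predicate, prop.get(f"{{{RDF_NS}}}resource")


for statement in triples(DOC):
    print(statement)
```

Each printed line is one statement, with the prefixed element names expanded to full URIs.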

The XML representation is suitable for computers, but it is not easily readable by people. Fortunately, there is an alternative - Notation 3 "A readable language for data on the Web" - http://www.w3.org/DesignIssues/Notation3. The same statements can be represented in Notation 3 (or "N3") as:

@prefix ice:    <http://www.ice.org.uk/vocabulary/> .
@prefix nafems: <http://www.nafems.org/vocabulary/> .
@prefix myTown: <http://www.my.town.gov.uk/projects/by-pass/> .
@prefix adv:    <http://www.a.d.vance.co.uk/projects/MT_BB/> .

myTown:bridge   a               ice:Bridge ;
                nafems:hasState adv:HA_MidSpan .

adv:HA_MidSpan  a               nafems:State .

This is simpler and more readable (once you have got past the namespace specifications). The next statements are:

Using N3, these statements can be represented simply as follows:

@prefix nafems: <http://www.nafems.org/vocabulary/> .
@prefix adv:    <http://www.a.d.vance.co.uk/projects/MT_BB/> .
@prefix fbc:    <http://www.fred.bloggs.co.uk/application/SuperStruct/> .

# The result URI contains "/" and "#", so it is written in full.
adv:run_3  a                        nafems:StressAnalysis ;
           nafems:analyses          adv:HA_MidSpan ;
           nafems:runsAnalysisCode  fbc:4.5.7 ;
           nafems:givesResult       <http://www.a.d.vance.co.uk/projects/MT_BB/run3/result#HA_MidSpan.stress> .
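The namespace specifications are only a shorthand: each prefixed name stands for a full URI, obtained by joining the namespace to the local name. The Python sketch below illustrates this expansion for the ice and nafems prefixes used above. The function name expand and the dictionary are assumptions of this example; the treatment of "a" as an abbreviation for rdf:type follows the N3 convention.

```python
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"

# Namespace declarations, as in the @prefix lines.
PREFIXES = {
    "ice": "http://www.ice.org.uk/vocabulary/",
    "nafems": "http://www.nafems.org/vocabulary/",
}


def expand(name, prefixes=PREFIXES):
    """Expand a prefixed name such as 'nafems:State' to a full URI."""
    if name == "a":  # N3 shorthand for rdf:type
        return RDF_TYPE
    prefix, sep, local = name.partition(":")
    if not sep or prefix not in prefixes:
        raise KeyError(f"no namespace declared for {name!r}")
    return prefixes[prefix] + local


for name in ("ice:Bridge", "a", "nafems:StressAnalysis"):
    print(name, "->", expand(name))
```

Two parties who use different prefixes for the same namespace are therefore still making statements about the same URIs.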

Two presentations on the use of the Semantic Web for engineering analysis are:


C Questions for analysis system vendors

Do you already assign a unique identifier to each version of a file format, application code, or analysis type?

This will enable your users to record precise meta-data about file types and about analysis activities.

If you do, is information about the version of the file format, and the version of the creating software included within output data files?

Having the information within the file is good. The Semantic Web requires the information be available as meta-data outside the file as well.

Do you already assign a URI to each version of a file format, application code, or analysis type?

A URI makes the identification unique on the Web, and enables a Semantic Web approach.

Do you already provide information about a version of a file format, application code, or analysis type, on the web? If you do, is this information obtained by dereferencing the URI of the file format, application code or analysis type?

If the format of an archived file is specified by a URI, then one day your customer might want to access that URI to find out what it is.

Would you register a file format, application code, or analysis type with an outside body in order to obtain a permanent URI?

Perhaps your customers would feel happier if there were an access route to information about your file formats, analysis codes and analysis types using the Web, which would remain unchanged even if you were taken over by another company.