Engineering analysis and the Semantic Web
This document describes how the Semantic Web can be used for the management of engineering analysis data. This technology requires two initial steps:
The document also describes how NAFEMS could assist vendors to assign permanent URIs.
This document has been funded by the EU FP6 project
The "Semantic Web" is a web of information on the Internet (or on an intranet) which is annotated so that it can be accessed by precise queries, such as:
What activities used the results of the natural frequency analysis carried out on "MyTown by-pass bridge"?
Today you might use Google and search for human readable documents which contain the words or phrases "result", "natural frequency", "MyTown by-pass", "bridge". Perhaps you might get something relevant, but you may also get a lot of junk.
The "Semantic Web" is called "semantic" because a query can be formulated in terms of object types and properties with precise meaning - a vocabulary. In this example, the query relies upon a vocabulary which includes:
natural frequency analysis;
output (valid for most types of activity);
input (valid for most types of activity);
subject of analysis (valid for an analysis activity).
Each of these object types and properties could be picked from a menu relevant to engineering analysis. The menu can be presented using words in any language - English, Dutch, Mandarin or Farsi.
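As a rough illustration of such a query, the sketch below stores a few statements as (subject, property, object) triples in plain Python and answers the question above. The URIs, the "fatigue assessment" activity and the helper names are invented for illustration; only the property names are taken from the vocabulary listed above. A real system would use RDF and a query language rather than hand-written loops.

```python
# Hypothetical identifiers, invented for this sketch only.
BRIDGE = "http://www.example.org/projects/by-pass/bridge"
ANALYSIS = "http://www.example.org/activities/analysis_1"
RESULT = "http://www.example.org/data/frequencies_1"
FATIGUE_CHECK = "http://www.example.org/activities/fatigue_check_1"

# Each statement is a (subject, property, object) triple.
TRIPLES = {
    (ANALYSIS, "type", "natural frequency analysis"),
    (ANALYSIS, "subject of analysis", BRIDGE),
    (ANALYSIS, "output", RESULT),
    (FATIGUE_CHECK, "type", "fatigue assessment"),
    (FATIGUE_CHECK, "input", RESULT),
}


def subjects(triples, prop, obj):
    """Return every subject s for which (s, prop, obj) is stated."""
    return {s for (s, p, o) in triples if p == prop and o == obj}


def objects(triples, subj, prop):
    """Return every object o for which (subj, prop, o) is stated."""
    return {o for (s, p, o) in triples if s == subj and p == prop}


def activities_using_results_of(triples, analysis_type, subject):
    """Find activities whose input is an output of the given kind of analysis."""
    analyses = subjects(triples, "type", analysis_type) & subjects(
        triples, "subject of analysis", subject
    )
    users = set()
    for analysis in analyses:
        for result in objects(triples, analysis, "output"):
            users |= subjects(triples, "input", result)
    return users


print(activities_using_results_of(TRIPLES, "natural frequency analysis", BRIDGE))
```

The query is precise because it matches on identified objects and defined properties, not on words that happen to appear in a document.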
A precise query also relies upon a precise identification of the object of
interest - "MyTown by-pass bridge" in this example. This identification requires
a new business practice - each object of interest is identified as an Internet
"resource", and given a URI (see What is a URI).
If the bridge is a project of MyTown District Council, then it may be given a URI
by the council, such as
The Semantic Web allows you to record the information you have - whatever it is, however incomplete - precisely. The Semantic Web does not constrain what that information is, except that it must be recorded using a defined and public vocabulary.
The Semantic Web has advantages over traditional approaches to data warehouses, or PDM (Product Data Management) as follows:
you do not need a data base schema;
For a traditional approach, you have to define a data base schema, i.e. decide in advance what data can be recorded and how it is structured.
data can be distributed over the Web, if required;
The Semantic Web does not do security, but does not prevent it either. If some data is password protected, or behind a firewall, so be it.
it is cheap.
Companies such as SAP will create a bespoke system for you, at a price. The system will do what you specify and no more - it will define your business processes. If you don't want your business processes defined by a bespoke system, or don't have the money to buy one, then the Semantic Web may be for you.
There are free implementations of semantic web technologies, such as the query language SPARQL (see Semantic Web technologies).
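The first two advantages can be illustrated with a minimal sketch in plain Python, where each statement is a (subject, property, object) triple. The URI, the property names and the values are all invented for illustration. Recording a new kind of information is just one more triple, and statements held in different places can be combined by a set union because the shared URI ties them together.

```python
# Hypothetical URI for the object of interest.
BRIDGE = "http://www.example.org/projects/by-pass/bridge"

# Statements held by the problem owner (values invented for this sketch).
owner_statements = {
    (BRIDGE, "type", "bridge"),
    (BRIDGE, "span in metres", "42"),
}

# Statements held elsewhere by an analyst, using a property
# that the problem owner never planned for. No schema change is needed.
analyst_statements = {
    (BRIDGE, "lowest natural frequency in Hz", "2.7"),
}


def merge(*sources):
    """Combine statements from several sources; duplicates collapse automatically."""
    merged = set()
    for source in sources:
        merged |= source
    return merged


def describe(triples, subject):
    """Return a property -> set of values mapping of everything stated about one subject."""
    description = {}
    for s, p, o in triples:
        if s == subject:
            description.setdefault(p, set()).add(o)
    return description


everything = merge(owner_statements, analyst_statements)
for prop, values in sorted(describe(everything, BRIDGE).items()):
    print(prop, sorted(values))
```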
The Semantic Web is a W3C (World Wide Web Consortium) activity.
A good overview is provided by
W3C Semantic Web Activity
The first popular explanation of the Semantic Web was an article
in the Scientific American by Tim Berners-Lee, James Hendler and Ora Lassila
in May 2001, see The Semantic Web -
A follow-up article in the Scientific American
by Lee Feigenbaum, Ivan Herman, Tonya Hongsermeier, Eric Neumann and Susie Stephens
was published in December 2007,
see The Semantic Web in Action -
(Access to this article on the Web requires a Scientific American subscription.)
The development of the Semantic Web, just like the development of the Internet,
was largely funded by the US DoD. The Semantic Web is currently in use for many
military applications, including logistics. A presentation by John Gilligan,
Chief Information Officer of the USAF, is
The Semantic Web - Imagine the Possibilities -
Early adopters of the Semantic Web are in health care and life sciences. This community has similar requirements to the engineering analysis community because:
For more information, see the
W3C Semantic Web Health Care and Life Sciences Interest Group -
Engineering analysis involves lots of data sets in different formats:
These data sets are about different objects, and are outputs from and inputs to different engineering activities. If we identify the objects (which the data sets are about), and the activities (which use and create the data sets), then we have a "web" of information.
To make the web of information a "Semantic Web", it is necessary to annotate each data set, and specify:
natural frequency results;
SuperStruct version 4.5 results file.
It is necessary to ensure that there are no missing nodes in this web. If there are a number of different data sets about an object, then it is necessary to specify:
It is necessary to ensure that there are no missing links in this web. If one data set is linked to another by an activity, then it is necessary to specify:
create analysis model;
material specification, and
SuperMesh version 4.5.3.
The Semantic Web annotation is "glue" which joins existing data sets together. These
data sets can be in an open computer interpretable format (such as ISO STEP), in a
proprietary computer interpretable format defined by an analysis software vendor
(such as Catia or Siemens PLM), or in a human readable document format (such as
NOTE In the long term, some of the data sets can be replaced by the "glue". For example, there may be no need to have a data set which describes an activity, if there are precise statements within the Semantic Web which specify its type, date, performer, inputs and outputs. Instead, all that is necessary is a URI to identify the activity.
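A sketch of the idea in the note, in plain Python: the activity has no descriptive data set of its own, only a URI and a handful of statements giving its type, date, performer, inputs and outputs. All URIs and values below are hypothetical, and the property names are simply the ones listed in the note.

```python
from collections import defaultdict

# Hypothetical URI: this is all that is needed to identify the activity.
ACTIVITY = "http://www.example.org/activities/create_model_7"

# The "glue": precise statements about the activity (values invented).
STATEMENTS = [
    (ACTIVITY, "type", "create analysis model"),
    (ACTIVITY, "date", "2008-03-14"),
    (ACTIVITY, "performer", "http://www.example.org/people/analyst_1"),
    (ACTIVITY, "input", "http://www.example.org/data/design_geometry"),
    (ACTIVITY, "input", "http://www.example.org/data/material_specification"),
    (ACTIVITY, "output", "http://www.example.org/data/analysis_model"),
]


def activity_record(statements, activity_uri):
    """Collect everything stated about one activity into a property -> values mapping."""
    record = defaultdict(list)
    for subject, prop, value in statements:
        if subject == activity_uri:
            record[prop].append(value)
    return dict(record)


record = activity_record(STATEMENTS, ACTIVITY)
print(record["type"], record["date"])
print(len(record["input"]), "inputs,", len(record["output"]), "output")
```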
A Semantic Web for engineering analysis will give the following benefits:
One day, "due diligence" will require a Semantic Web for engineering analysis - or something like it.
Early adopters will get benefits, if only because they will be able to find:
The Semantic Web requires that every thing of interest ("resource" in web jargon) has a unique identifier on the Web (a URI - Uniform Resource Identifier). The things are defined and identified by different people, as follows:
The problem owners define all the important things - the products, the operating environments of the products, the loading cases, the manufacturing activities. It is up to the problem owners to identify these things.
The analysts define some things - the individual activities which they perform, and different models they create for different behaviours of the products, and lots and lots of data sets. It is up to the analysts to identify these things.
Data suppliers define some things - material product types, standard loading cases, standard assessment criteria. It is up to the suppliers of data about these things, such as ASTM, DoD, and regulatory authorities, to identify them.
The key information about a product is "what sort of thing is it" - a bridge, a building, a transmission tower, a pressure vessel. This is a classification of a product with respect to a standard class. This classification may determine what types of analysis are required, and what codes of practice are relevant. It is up to standardisation bodies, such as ISO, IEC, API, to identify these classes.
A basic vocabulary for engineering analysis is required, containing terms such as:
natural frequency analysis;
analysis boundary conditions;
subject of analysis.
NAFEMS can define and identify these terms.
analysis system vendors;
It is necessary to identify:
It is up to the vendors to identify codes, analysis types and file formats.
The essential first steps to create a Semantic Web for engineering analysis have to be taken by NAFEMS and the vendors. Problem owners and analysts can then modify their business practices to take advantage of what is available.
NAFEMS can define a basic vocabulary for engineering analysis.
The Dublin Core -
is a basic vocabulary for document meta-data, defining terms such as
subject. NAFEMS can do the same for engineering analysis.
NAFEMS could do this in liaison with ISO TC184/SC4. The ISO STEP standard for engineering analysis (ISO 10303-209) contains an activity model for engineering analysis which defines types of analysis activity and types of analysis information. This activity model is now nearly 20 years old and needs updating, but it is nonetheless a useful starting point.
The vendors control the versions of native file formats, analysis codes and analysis types. The vendors have an obligation to give unique identifiers:
For the Semantic Web, these identifiers need to be URIs.
One approach would be for vendors to allocate URIs within their own Internet
domain. Hence if Fred Bloggs and Co. has the domain
http://www.fred.bloggs.co.uk, it could allocate:
A drawback to this approach is that URIs are expected to persist
(see Cool URIs don't change -
Unfortunately, the owners of analysis codes do change. It could be said that this
doesn't matter very much because a URI is only an identifier. However, this is
not the full story because:
Quite reasonably, if you go to a URI which identifies a code, you expect information about that code, such as:
NAFEMS could help by offering a registry service. Hence a file format or code could be given a NAFEMS URI, such as:
The NAFEMS site could host a brief description of the code or file format, which would remain unchanged. The NAFEMS site could provide a link to the web site of the current analysis code owner. This link could change from time to time.
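The registry idea could be sketched as follows. The permanent URI and the description stay fixed, while the link to the current owner can be updated. The class names, the registry URI and the new owner's address are hypothetical, and this is only an illustration of the data involved, not a proposal for how NAFEMS would implement such a service.

```python
from dataclasses import dataclass


@dataclass
class RegistryEntry:
    description: str        # brief description, fixed once registered
    current_owner_url: str  # may change when the code changes hands


class Registry:
    """A toy registry mapping permanent URIs to entries."""

    def __init__(self):
        self._entries = {}

    def register(self, uri, description, owner_url):
        # A permanent URI is assigned once and never reused.
        if uri in self._entries:
            raise ValueError(f"{uri} is already registered")
        self._entries[uri] = RegistryEntry(description, owner_url)

    def change_owner(self, uri, new_owner_url):
        # Only the link changes; the URI and description stay as they were.
        self._entries[uri].current_owner_url = new_owner_url

    def lookup(self, uri):
        return self._entries[uri]


registry = Registry()
CODE = "http://registry.example.org/codes/SuperStruct/4.5"  # hypothetical permanent URI
registry.register(CODE, "SuperStruct version 4.5 analysis code", "http://www.fred.bloggs.co.uk/")

# Later, the code is taken over by another company (hypothetical address).
registry.change_owner(CODE, "http://www.new-owner.example.com/")
print(registry.lookup(CODE))
```

Archived meta-data that cites the permanent URI remains valid after the takeover, because nothing that was cited has changed.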
A URI (Uniform Resource Identifier) is a unique identifier of a thing, for use on the Internet.
Anybody can assign a URI to any thing. Hence I can assign the URI
http://www.caesarsystems.co.uk/animals/Babar to "Babar the
Elephant". It does not matter that:
if you access http://www.caesarsystems.co.uk/animals/Babar with your web browser, the only thing that happens is that you get "HTTP error 404" - i.e. the server found nothing at that address.
The first part of a URI,
in this case, determines whether or not you trust it. If you believe that
CAESAR Systems Limited is an appropriate authority for identifying fictitious
animals, then you are free to use this identifier.
It is good if HTTP access to an HTTP URI actually returns something. If
http://www.caesarsystems.co.uk/animals/Babar obtains a
file which is readable by your browser (an HTML file say), and which tells
you what/who "Babar the Elephant" was/is, then this is useful. If HTTP access obtains
a file which says "the CAESAR Systems dictionary of fictitious animals is available from
all good bookshops", then this is useful too - but less so.
If NAFEMS were to assign the URI
to the concept of natural frequency analysis, then many would trust that NAFEMS
has provided an authoritative definition.
There are two principal types of URI:
HTTP URI, formerly called URL (Uniform Resource Locator), which starts
http://, and which uses
/ as a field separator thereafter.
There are many billions of these URIs in use. All you need is an Internet domain, and you can assign them at will.
URN (Uniform Resource Name), which starts
urn:, and which uses
: as a field separator thereafter.
There are many thousands of these in use. You have to negotiate an agreement with IETF (Internet Engineering Task Force) in order to use them, and a few organisations such as ISBN and ISO have done so.
Since the use of URNs is six orders of magnitude smaller than the use of HTTP URIs, we can safely forget about them.
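As a small illustration, the two forms can be taken apart with Python's standard urllib.parse module. The HTTP URI is the one used above; the ISBN URN is an arbitrary example value chosen for this sketch.

```python
from urllib.parse import urlsplit

# An HTTP URI: the domain comes first, then "/"-separated fields.
http_uri = urlsplit("http://www.caesarsystems.co.uk/animals/Babar")
print(http_uri.scheme)           # http
print(http_uri.netloc)           # www.caesarsystems.co.uk
print(http_uri.path.split("/"))  # ['', 'animals', 'Babar']

# A URN: no domain, and ":" separates the fields.
urn = urlsplit("urn:isbn:0451450523")
print(urn.scheme)           # urn
print(urn.netloc == "")     # True
print(urn.path.split(":"))  # ['isbn', '0451450523']
```

The domain part of the HTTP URI is what tells a reader who assigned the identifier, which is the basis for deciding whether to trust it.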
A URI can identify anything. Sometimes a URI identifies an electronic document which can be downloaded over the Web. Sometimes a URI identifies something else.
A URI can identify Babar the Elephant or the Eiffel Tower. Neither can be downloaded - the first because it is a fictitious animal, and the second because it is more than 7000 tonnes of wrought iron.
Dereferencing an HTTP URI may cause a document to be downloaded to your browser. This does not mean that the URI identifies the document. The document is a "representation" of the object identified by the URI, which the owner of the domain has chosen to provide. The owner of the domain may choose not to provide a document at all, in which case you will get "HTTP error 404".
Sometimes an HTTP URI identifies an electronic document, and when you dereference the URI the document is what you get.
NOTE There is an ambiguity about what an HTTP URI identifies - is it a thing, where the document is merely a representation, or is it the document itself? In practice, the ambiguity is something which we can live with.
The semantic web relies upon two basic technologies:
RDF does what it says - it is a methodology for describing resources. The resources
can be data sets, or other things. RDF statements are published on the Web. They
can be queried using
SPARQL Query Language for RDF -
A free implementation of SPARQL is provided by
http://jena.sourceforge.net/, which was
initially developed by
HP Labs Semantic Web Research -
RDF is intended to be extended by vocabularies.
http://www.w3.org/2004/OWL/ is a basic vocabulary
for vocabularies, which is usually the first extension to RDF.
The story is about a bridge:
The story is:
The web of objects is as follows:
This web of objects can be thought of as just meta-data for the file
but really it is much more - it is a record of the problem and of what was done.
Each object in Figure 2 is defined, and assigned a URI, by somebody. Each player has his or her own namespace (the front bit of the URI) as follows:
Unfortunately, computers cannot process the diagram shown in Figure 2. Hence there has to be a text representation of the diagram. A representation of RDF as XML is widely used. Using XML, we can represent the statements:
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:owl="http://www.w3.org/2002/07/owl#"
         xmlns:nafems="http://www.nafems.org/vocabulary/">
  <!-- The bridge: classified as a Bridge, and having one state -->
  <owl:Thing rdf:about="http://www.my.town.gov.uk/projects/by-pass/bridge">
    <rdf:type rdf:resource="http://www.ice.org.uk/vocabulary/Bridge"/>
    <nafems:hasState rdf:resource="http://www.a.d.vance.co.uk/projects/MT_BB/HA_MidSpan"/>
  </owl:Thing>
  <!-- The state itself -->
  <owl:Thing rdf:about="http://www.a.d.vance.co.uk/projects/MT_BB/HA_MidSpan">
    <rdf:type rdf:resource="http://www.nafems.org/vocabulary/State"/>
  </owl:Thing>
</rdf:RDF>
This is OK for computers, but not easily readable by people. Fortunately, there is
an alternative -
Notation 3 "A readable language for data on the Web" -
The same statements can be represented in Notation 3 (or "N3") as:
# Strict N3/Turtle parsers reject "/" inside a prefixed name;
# for those, write such URIs in full inside <...>.
@prefix ice: <http://www.ice.org.uk/vocabulary/> .
@prefix nafems: <http://www.nafems.org/vocabulary/> .
@prefix myTown: <http://www.my.town.gov.uk/projects/> .
@prefix adv: <http://www.a.d.vance.co.uk/projects/> .

myTown:by-pass/bridge
    a ice:Bridge ;
    nafems:hasState adv:MT_BB/HA_MidSpan .

adv:MT_BB/HA_MidSpan
    a nafems:State .
This is simpler and more readable (once you have got past the namespace specifications). The next statements are:
Using N3, these statements can be represented simply as follows:
# Strict N3/Turtle parsers reject "/" and "#" inside a prefixed name;
# for those, write such URIs in full inside <...>.
@prefix nafems: <http://www.nafems.org/vocabulary/> .
@prefix adv: <http://www.a.d.vance.co.uk/projects/> .
@prefix fbc: <http://www.fred.bloggs.co.uk/> .

adv:MT_BB/run_3
    a nafems:StressAnalysis ;
    nafems:analyses adv:MT_BB/HA_MidSpan ;
    nafems:runsAnalysisCode fbc:application/SuperStruct/4.5.7 ;
    nafems:givesResult adv:MT_BB/run3/result#HA_MidSpan.stress .
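To show what software can do with these statements, the sketch below expands the prefixed names from the two N3 listings into full URIs and runs a simple pattern match over them in plain Python. It is a hand-rolled stand-in for an RDF store and a SPARQL query, not a use of any real RDF library; the helper names expand and match are invented here. The N3 keyword "a" is written out as rdf:type.

```python
# Namespaces from the listings above, plus the standard RDF namespace.
PREFIXES = {
    "rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
    "ice": "http://www.ice.org.uk/vocabulary/",
    "nafems": "http://www.nafems.org/vocabulary/",
    "myTown": "http://www.my.town.gov.uk/projects/",
    "adv": "http://www.a.d.vance.co.uk/projects/",
    "fbc": "http://www.fred.bloggs.co.uk/",
}


def expand(name):
    """Turn a prefixed name such as 'ice:Bridge' into a full URI."""
    prefix, local = name.split(":", 1)
    return PREFIXES[prefix] + local


# The statements from both N3 listings.
STATEMENTS = [
    ("myTown:by-pass/bridge", "rdf:type", "ice:Bridge"),
    ("myTown:by-pass/bridge", "nafems:hasState", "adv:MT_BB/HA_MidSpan"),
    ("adv:MT_BB/HA_MidSpan", "rdf:type", "nafems:State"),
    ("adv:MT_BB/run_3", "rdf:type", "nafems:StressAnalysis"),
    ("adv:MT_BB/run_3", "nafems:analyses", "adv:MT_BB/HA_MidSpan"),
    ("adv:MT_BB/run_3", "nafems:runsAnalysisCode", "fbc:application/SuperStruct/4.5.7"),
    ("adv:MT_BB/run_3", "nafems:givesResult", "adv:MT_BB/run3/result#HA_MidSpan.stress"),
]
TRIPLES = [tuple(expand(term) for term in statement) for statement in STATEMENTS]


def match(triples, s=None, p=None, o=None):
    """Return the triples that agree with every term given (None matches anything)."""
    return [
        t for t in triples
        if (s is None or t[0] == s)
        and (p is None or t[1] == p)
        and (o is None or t[2] == o)
    ]


# Query: which analysis codes were run on any state of the bridge?
bridge = expand("myTown:by-pass/bridge")
codes = set()
for _, _, state in match(TRIPLES, s=bridge, p=expand("nafems:hasState")):
    for run, _, _ in match(TRIPLES, p=expand("nafems:analyses"), o=state):
        for _, _, code in match(TRIPLES, s=run, p=expand("nafems:runsAnalysisCode")):
            codes.add(code)
print(codes)
```

The query follows the links bridge, state, analysis run, analysis code, which is the kind of path a SPARQL query would express declaratively.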
Two presentations on the use of the Semantic Web for engineering analysis are:
This was presented at the NAFEMS-ESA seminar on Engineering Analysis Quality, Verification and Validation, in December 2007.
This was presented at the Open Technical Forum of ISO TC184/SC4 in March 2008.
Do you already assign a unique identifier to each version of a file format, application code, or analysis type?
This will enable your users to record precise meta-data about file types and about analysis activities.
If you do, is information about the version of the file format, and the version of the creating software included within output data files?
Having the information within the file is good. The Semantic Web requires the information be available as meta-data outside the file as well.
Do you already assign a URI to each version of a file format, application code, or analysis type?
A URI makes the identification unique on the Web, and enables a Semantic Web approach.
Do you already provide information about a version of a file format, application code, or analysis type, on the web? If you do, is this information obtained by dereferencing the URI of the file format, application code or analysis type?
If the format of an archived file is specified by a URI, then one day your customer might want to access that URI to find out what it is.
Would you register a file format, application code, or analysis type with an outside body in order to obtain a permanent URI?
Perhaps your customers would feel happier if there was an access route to information about your file formats, analysis codes and analysis types using the Web, which would remain unchanged even if you were taken over by another company.