
Multimed Tools Appl

DOI 10.1007/s11042-015-2895-8

Design and implementation of a system for creating multimedia linked data and its applications in education
Jeongmin Chae 1 & Yoonah Cho 2 & Minkyung Lee 1 &
Sangmi Lee 1 & Munsuk Choi 1 & Seongbin Park 1

Received: 22 December 2014 / Revised: 25 July 2015 / Accepted: 17 August 2015


© Springer Science+Business Media New York 2015

Abstract The amount of multimedia data has been constantly increasing, and recently, thanks to
popular SNS services and applications that run on smartphones, almost anyone can easily post
video or audio files on the Web. There are many tools with which lay users who have no
technical background in multimedia data formats can create video or audio files. While most
existing tools for creating multimedia data have good user interfaces, so that non-expert users
can easily create different kinds of multimedia data, few tools allow users to semantically
connect multimedia data so that semantics-based searching can be supported. Linked data is a
concrete example of the Semantic Web, which aims to represent data in a form that machines can
understand. In this paper, we present an easy-to-use system that helps novice users create
multimedia linked data and show how the system can be used in education. The system has been
implemented using open-source software, and all that users have to do is prepare the data to be
published as linked data in a simple format. The system then automatically generates multimedia
linked data, and users can run a relation-based service that finds meaningful relations that
exist in the linked data inside the system.
Keywords Multimedia · Linked data · Semantic Web

* Seongbin Park
hyperspace@korea.ac.kr

1 Department of Computer Science Education, Korea University, Seoul, South Korea
2 Wolgye High School, Seoul, South Korea

1 Introduction
Recently, the amount of multimedia data on the Web has been increasing. People can easily
create multimedia data of various types, such as image, audio, and video files, thanks to
well-known tools for multimedia data creation. Once they have created multimedia data, it is
also relatively easy to post them on public sites so that friends can access them through the
Web. Especially since we are living in the Web 2.0 era [1], the amount of user-generated
content is constantly increasing. In this paper, we address the question of how users can
semantically link multimedia data on the Web. In addition, we show how our system can be used
in an educational scenario. Our research was motivated by the fact that much of the multimedia
data on the Web is fragmented in the sense that there are no hyperlinks that connect
semantically related multimedia data [8]. Our perspective is that an easy-to-use tool that does
not require a technical background should be provided so that almost anyone can easily
interconnect semantically related multimedia data. To this end, we adopt linked data
principles [2] and present a system that helps lay users easily create linked data with various
types of multimedia data. In fact, the importance of interlinking multimedia data using linked
data principles has been well understood [9].
One advantage of our system is that novice users who have no technical background in linked
data principles can easily create multimedia linked data by simply designing the structure of
the multimedia linked data that they expect to create. This is in contrast to existing systems
for creating linked data, such as Xturtle (http://aksw.org/Projects/Xturtle.html) and rdfEditor
(https://bitbucket.org/dotnetrdf/dotnetrdf/wiki/UserGuide/Tools/rdfEditor), which require at
least a basic understanding of a linked data format such as the Resource Description Framework
(http://www.w3.org/RDF/). We tested our system with college students who had no technical
background in linked data principles. After brief instruction on how to use the system, they
could easily create multimedia linked data with it.
This paper is structured as follows. Section 2 describes work related to our research, and
Section 3 introduces the proposed system using a motivating example about creating a Semantic
Web document (SWD). Section 4 explains the implementation details of the proposed system.
Sections 5 and 6 describe, respectively, experimental results on how the system can be used by
non-expert users and a possible application in education. Section 7 concludes the paper.

2 Related work
The Semantic Web is an extension of the current Web in which information is represented in a
way that machines such as computer programs can understand its meaning. Linked data can be
considered a concrete example of the Semantic Web in which data are represented in a
machine-friendly format and hyperlinks exist among the data.
Recently, the amount of linked data has been increasing, and there are well-known tools such as
Tabulator (http://www.w3.org/2005/ajar/tab) with which information in linked data can be
accessed and analyzed. However, it is not easy for lay users to create linked data because
doing so requires a technical background in the Semantic Web. Existing systems such as Xturtle
and rdfEditor are relatively easy to use, and in fact we used rdfEditor in our experiment to
measure how effective our system is compared to an existing system. Quantitative results that
show the difference between our system and rdfEditor are described in Section 5; in short, it
took the students who participated in the experiment more time to create linked data using
rdfEditor than using our system. In addition, our system supports additional functions such as
finding relations, autocompletion, and a preview service.


Fig. 1 Source image file (a), a document that a user prepares (b), and the result SWD (c)

Our system is similar to content management systems (CMSs) such as WordPress
(https://wordpress.org) and Moodle (https://moodle.org) in that they support documents which
contain multimedia data. However, unlike our system, a CMS does not make it easy to find
relations automatically inside the system. So, while a CMS can be useful for producing
documents that contain multimedia data, it is not easy for users to systematically find
information related to a given piece of information inside the system.
Table 1 Conversion rules

| Type | Conversion rule (input → URL) | Example (input → URL) |
|---|---|---|
| external resource | {URL} → {URL} | http://dbpedia.org/resource/Road → http://dbpedia.org/resource/Road |
| user resource | #{STR} → http://{H}/{U}/r#{STR} | #Road → http://ah.withcat.net/onacloud/r#Road |
| graph resource | {STR} → http://{H}/{U}/{G}/r#{STR} | Road → http://ah.withcat.net/onacloud/graph1/r#Road |
| external predicate | {URL} → {URL} | http://rdfs.org/sioc/ns#container_of → http://rdfs.org/sioc/ns#container_of |
| user predicate | {STR} → http://{H}/{U}/p#{STR} | wearing → http://ah.withcat.net/onacloud/p#wearing |


3 A motivating example
In this section, we explain how a user can create an SWD that can become part of linked data
using our system. Creating an SWD takes two steps in our system. First, parts of the input
document that the user starts with are assigned URLs so that they can be interlinked. Second,
hyperlinks among the parts that have URLs are created. Let us assume that a user wants to
create an SWD that describes the image shown in Fig. 1(a). Fig. 1(b) shows the text written by
the user according to the syntactic rules of our system, and Fig. 1(c) is the resulting SWD
that our system creates.

Fig. 2 The screenshots of the user interface (a) and possible functions (b), (c)


As can be seen in this example, all that a user needs to know is a set of syntactic rules that
are much simpler than the syntactic rules of the Resource Description Framework.
More precisely, what a user needs to prepare in order to create an SWD is an organized list of
lines, where each line is either a word that the user wants to use to describe something or a
URL that identifies some information. Words and URLs are used as components of triples (data in
subject-predicate-object format), and they are called resources if they can be used as subjects
and objects, or predicates if they are used as predicates. Both resources and predicates can be
defined by users inside the system, but externally defined resources and predicates can also be
used. The text in Fig. 1(b) is called a graph in our system, and a user can define multiple
graphs in the system. Table 1 shows how the system converts a string that a user types into the
corresponding URL. For example, if a user types http://dbpedia.org/resource/Road in a line, the
system converts it into http://dbpedia.org/resource/Road. In other words, if a user types a
URL, the system uses it directly without modifying it. The user can also simply type a word in
a line. For instance, by the rules in Table 1, #Road is converted automatically into
http://ah.withcat.net/onacloud/r#Road. As this table shows, users can simply organize lists
that describe the contents of documents.
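The conversion rules in Table 1 can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the system's actual implementation (which is written in C#): the function name `to_url`, its parameters, and the reading of {H}, {U}, and {G} as host, user, and graph name are assumptions inferred from the examples in the table.

```python
def to_url(line: str, host: str, user: str, graph: str, predicate: bool = False) -> str:
    """Expand one line of user input into a URL, following Table 1.

    Hypothetical helper: `host`, `user` and `graph` stand for the {H}, {U}
    and {G} placeholders; `predicate` says whether the line is a predicate.
    """
    text = line.strip()
    # External resources and predicates: a URL is used unchanged.
    if text.startswith(("http://", "https://")):
        return text
    base = f"http://{host}/{user}"
    if predicate:
        # User predicate: {STR} -> http://{H}/{U}/p#{STR}
        return f"{base}/p#{text}"
    if text.startswith("#"):
        # User resource: #{STR} -> http://{H}/{U}/r#{STR}
        return f"{base}/r#{text[1:]}"
    # Graph resource: {STR} -> http://{H}/{U}/{G}/r#{STR}
    return f"{base}/{graph}/r#{text}"


if __name__ == "__main__":
    args = ("ah.withcat.net", "onacloud", "graph1")
    for word, is_pred in [("#Road", False), ("Road", False), ("wearing", True)]:
        print(word, "->", to_url(word, *args, predicate=is_pred))
```

With the host, user, and graph names taken from the examples in Table 1, the three inputs above expand to the URLs listed in the table's example column.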
Fig. 3 The components of the proposed system



Table 2 Available endpoints

| Endpoint | Role |
|---|---|
| http://localhost:3030/ds/query | the SPARQL query endpoint |
| http://localhost:3030/ds/update | the SPARQL Update language endpoint |
| http://localhost:3030/ds/data | the SPARQL Graph Store Protocol endpoint |
| http://localhost:3030/ds/upload | the file upload endpoint |

Figure 2 shows the user interface of the proposed system. In the top part of the screen (a),
the left side shows a list of graphs that can be edited and the right side shows the graph that
is being edited. As can be seen, the graph that is currently being edited is essentially a list
of lines, where each line is either a word that starts with a mark or a URL. A user can always
select a graph from the list of editable graphs on the left side of the screen. The user can
also run RelFinder [5] against the graph being edited so that information about how lines are
interconnected is shown, as in (b). While editing a graph, the user can see how lines are
interconnected in real time, as shown in (c).

4 Implementation of the proposed system


In this section, we explain the structure of the proposed system and how it was implemented.
Figure 3 shows the components of the proposed system. The structure is divided into four
groups: (1) a local triple store, (2) a SPARQL endpoint, (3) linked data development tools, and
(4) linked data navigation tools. The role of the local triple store is to store triples
created by a user and to provide users with triples. The SPARQL endpoint is a REST-based
interface that helps the linked data development tools manage triples. The linked data
development tools provide detailed information about triples created by users and about
relations among triples.
The system was implemented using several open-source components. It was developed in C# with
Visual Studio 2012. To store the triples that are created, we used TDB and the Fuseki server,
which are available from the Apache Jena framework (https://jena.apache.org). TDB is a
high-performance triple store and Fuseki is a SPARQL/Update endpoint that supports REST-style
interaction. We used dotNetRDF (http://www.dotnetrdf.org) for manipulating triples and querying
in SPARQL (http://www.w3.org/TR/rdf-sparql-query/). In addition, we used RelFinder [5] so that
users can easily see relationships that exist among the multimedia linked data. The Content
Editor was implemented using FastColoredTextBox
(https://github.com/PavelTorgashov/FastColoredTextBox), and CefSharp
(https://github.com/cefsharp/CefSharp) was used for the HTML5 Preview Window.

Table 3 Summary of the first experiment

| Student ID | Number of completed images | Number of triples | Number of distinct subjects | Number of distinct predicates | Number of distinct objects | Most frequently used predicates (except for http://www.w3.org/2000/01/rdf-schema#label and http://rdfs.org/sioc/ns#container_of) |
|---|---|---|---|---|---|---|
| 1 | 20 | 177 | 85 | 29 | 149 | ride, in, on, ware, is |
| 2 | 17 | 118 | 61 | 19 | 103 | in, ride, play, on_the, see |
| 3 | 15 | 78 | 46 | 13 | 72 | do, play, ride |
| 4 | 20 | 142 | 71 | 25 | 121 | in, wearing, wait, ride, see |
| 5 | 20 | 136 | 74 | 23 | 127 | play, ride, in_the, touch |

Table 4 The precision of triple creation

| Student ID | Total number of triples | Number of system-generated triples | Number of container_of triples | Number of user predicates | Number of errors | Precision |
|---|---|---|---|---|---|---|
| 1 | 177 | 85 | 41 | 51 | 13 | 85.9 % |
| 2 | 118 | 60 | 29 | 29 | 3 | 94.8 % |
| 3 | 78 | 42 | 18 | 18 | 3 | 91.7 % |
| 4 | 142 | 71 | 29 | 42 | 2 | 97.2 % |
| 5 | 136 | 74 | 33 | 29 | 2 | 96.8 % |
| average | 130.2 | 66.4 | 30 | 33.8 | 4.6 | 93.3 % |

Table 5 Three linked data sets that students created

Graph: http://ah.withcat.net/onacloud/Ursidae
  Subject: animals that belong to the category Bear
  Triples: 148
  Top 2 subjects (total: 65)
    http://dbpedia.org/resource/Grizzly_bear: 10
    http://dbpedia.org/resource/Giant_panda: 9
  Top 2 predicates (total: 7)
    http://www.w3.org/2000/01/rdf-schema#label: 65
    http://ah.withcat.net/onacloud/p#Eat: 27
  Top 2 objects (total: 119)
    watch: 9
    http://ah.withcat.net/onacloud/Ursidae/r#Fruits: 6

Graph: http://ah.withcat.net/onacloud/Galaxy
  Subject: Galaxy where we live
  Triples: 309
  Top 2 subjects (total: 133)
    http://dbpedia.org/resource/Saturn_(astrology): 14
    http://dbpedia.org/resource/EartH: 11
  Top 2 predicates (total: 18)
    http://www.w3.org/2000/01/rdf-schema#label: 133
    http://ah.withcat.net/onacloud/p#atmosphere: 39
  Top 2 objects (total: 261)
    http://ah.withcat.net/onacloud/Galaxy/r#helium: 6
    http://ah.withcat.net/onacloud/Galaxy/r#hydrogen: 6

Graph: http://ah.withcat.net/onacloud/Greek_mythology
  Subject: Greek mythology
  Triples: 271
  Top 2 subjects (total: 111)
    http://dbpedia.org/resource/Zeus: 20
    http://dbpedia.org/resource/Athena: 18
  Top 2 predicates (total: 10)
    http://www.w3.org/2000/01/rdf-schema#label: 111
    http://ah.withcat.net/onacloud/p#personification: 67
  Top 2 objects (total: 216)
    http://dbpedia.org/resource/Olympians: 14
    http://dbpedia.org/resource/Zeus: 8

Table 6 Time spent for creating linked data (in seconds)

| Linked data | rdfEditor | Our system |
|---|---|---|
| Ursidae | 1009 | 439 |
| Galaxy | 1831 | 546 |
| Greek mythology | 1783 | 674 |

The current system is available under the Apache 2 license. To run the system, Microsoft .NET
Framework 4.5 (http://www.microsoft.com/download/details.aspx?id=30653) needs to be installed.
In addition, the system requires the Java JRE (https://java.com/en/download/). Once these are
correctly set up, users can download our system
(https://simple-semantic-web-editor.googlecode.com/svn/dist/SSWEditor.0.0.2.27.zip) and execute
it. The Fuseki server provides an HTTP-based default triple store and the endpoints shown in
Table 2.
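As a rough illustration of how a client might talk to the endpoints in Table 2, the following Python sketch builds (but does not send) HTTP requests in the style of the SPARQL 1.1 Protocol, in which a query or update is posted as a URL-encoded form. The helper names, the example query, and the inserted triple are assumptions made for illustration; the system itself is written in C# and uses dotNetRDF for querying.

```python
from urllib.parse import urlencode
from urllib.request import Request

# Endpoints listed in Table 2 (local Fuseki dataset "ds").
QUERY_ENDPOINT = "http://localhost:3030/ds/query"
UPDATE_ENDPOINT = "http://localhost:3030/ds/update"


def build_query_request(sparql: str) -> Request:
    """Build a SPARQL query request (query posted as a URL-encoded form)."""
    body = urlencode({"query": sparql}).encode("utf-8")
    return Request(
        QUERY_ENDPOINT,
        data=body,
        headers={
            "Content-Type": "application/x-www-form-urlencoded",
            "Accept": "application/sparql-results+json",
        },
        method="POST",
    )


def build_update_request(sparql_update: str) -> Request:
    """Build a SPARQL Update request (update posted as a URL-encoded form)."""
    body = urlencode({"update": sparql_update}).encode("utf-8")
    return Request(
        UPDATE_ENDPOINT,
        data=body,
        headers={"Content-Type": "application/x-www-form-urlencoded"},
        method="POST",
    )


# Hypothetical example data, reusing URLs that appear in Table 1.
insert = build_update_request(
    "INSERT DATA { <http://ah.withcat.net/onacloud/r#Road> "
    "<http://www.w3.org/2000/01/rdf-schema#label> \"Road\" . }"
)
select = build_query_request("SELECT ?s ?p ?o WHERE { ?s ?p ?o } LIMIT 10")

# To actually send a request, a Fuseki server must be running locally:
#   from urllib.request import urlopen
#   print(urlopen(select).read().decode("utf-8"))
```

Keeping request construction separate from sending makes the sketch usable without a running server; the commented lines show where the network call would go.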

5 Experiments
In this section, we describe two experiments that were conducted to test whether the proposed
system would be useful for lay users who have no technical background in the Semantic Web.
The first experiment was conducted with five undergraduate students who had not heard of the
Semantic Web or linked data before the experiment. More specifically, an instructor explained
to the students how the system could be downloaded and installed. The instructor also
demonstrated how to use the system with the image in Fig. 1 discussed in Section 3. All of this
took about 15 minutes, and the students were then given an hour to create their own SWDs by
annotating 20 images that were provided. The task was to annotate images, which required
searching for keywords and connecting them to images. This kind of task, which involves
selecting keywords from available words, has been used in the field of vision recognition
[11, 12]; we used 20 images randomly chosen from Flickr8k [7]. Table 3 shows a summary of what
the five students created using our system.
Table 4 shows that the triples created by the students using our system were largely valid in
the following sense. Each student created some triples by hand and the system generated some
triples automatically. More specifically, let us consider student ID 1. The total number of
triples for this student was 177, and of these, 85 triples were created automatically by the
system (i.e., triples with the predicate http://www.w3.org/2000/01/rdf-schema#label). So, 92
triples were essentially created by the student, and 13 of them had errors. Therefore, the
precision for the student is calculated as {(177 - 85) - 13} / (177 - 85) = 79 / 92, which is
approximately 85.9 %. The precision for the other students is calculated in the same way, and
the average precision is about 93.3 %.
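The precision calculation can be reproduced with a short script. This is only a sketch: the function name `precision` is made up for illustration, and the input values are those reported in the text and in Table 4 for students 1, 4, and 5.

```python
def precision(total: int, system_generated: int, errors: int) -> float:
    """Share of the user-created triples that contain no error."""
    user_created = total - system_generated
    return (user_created - errors) / user_created


# (total triples, system-generated triples, errors) as reported in Table 4.
students = {1: (177, 85, 13), 4: (142, 71, 2), 5: (136, 74, 2)}

for student_id, counts in students.items():
    print(f"student {student_id}: {precision(*counts):.1%}")
# student 1: 85.9%
# student 4: 97.2%
# student 5: 96.8%
```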
The second experiment was conducted to see how helpful the proposed system would be in creating
linked data compared to rdfEditor, which is a well-known tool for editing RDF and SPARQL. The
participants in this experiment were different from those who were involved in the first
experiment: three new undergraduate students who did not know about the Semantic Web or linked
data were asked to create linked data using both rdfEditor and our system. Table 5 shows the
linked data that the students created.
Table 6 shows how much time it took to create the three linked data sets listed in Table 5
using both systems. From this table, we see that it took less time to create linked data when
students used our system than when they used rdfEditor.

Table 7 Time spent for describing resources and relations (in seconds)

| Activity | rdfEditor | Our system |
|---|---|---|
| Description of resources | 2048 (44 %) | 400 (24 %) |
| Description of relations | 2575 (56 %) | 1259 (76 %) |

Table 8 Example relations that students found

Q1. What is the relation between Giant Panda and Red Panda?
Finding) Both Giant Panda and Red Panda eat Bamboo and they are endangered.
Q2. What are species in the family of Phocidae that eat Fish?
Finding) Harbor Seal, Ross Seal, and Bearded Seal eat Fish. Harbor Seal belongs to Phoca, Ross
Seal belongs to Ommatophoca, and Bearded Seal belongs to Erignathus.
Q3. What are common properties among M31, Andromeda Galaxy and Milky Way Galaxy?
Finding) Globular clusters can be found in M31, Andromeda Galaxy and Milky Way Galaxy.
We also analyzed how much time students spent describing resources and describing relations.
Table 7 shows that about a quarter of the total time was spent describing resources when
students used our system. In contrast, nearly half of the time was spent describing resources
when students used rdfEditor. One reason for this is that the syntactic rules for describing
resources in our system are simpler than the rules used in rdfEditor. In addition, the
autocompletion service provided by our system helped students describe resources, since
describing a resource involves finding an appropriate resource.
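The figures in Tables 6 and 7 can be cross-checked with a few lines of Python. The snippet is only an illustrative sketch; the variable names are made up, and the numbers are copied from the two tables.

```python
# Seconds spent per linked data set (Table 6): Ursidae, Galaxy, Greek mythology.
table6 = {"rdfEditor": [1009, 1831, 1783], "Our system": [439, 546, 674]}
# Seconds spent per activity (Table 7): resources, relations.
table7 = {"rdfEditor": (2048, 2575), "Our system": (400, 1259)}

for tool, (resources, relations) in table7.items():
    total = resources + relations
    # The per-activity times add up to the per-data-set times of Table 6.
    assert total == sum(table6[tool])
    print(f"{tool}: {total} s in total, {resources / total:.0%} spent on resources")

ratio = sum(table6["rdfEditor"]) / sum(table6["Our system"])
print(f"rdfEditor took about {ratio:.1f} times as long as our system")
```

The two tables are consistent with each other: both give 4623 s in total for rdfEditor and 1659 s for our system, so creating the three data sets took roughly 2.8 times as long with rdfEditor.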
In addition, students could run RelFinder in our system; Table 8 shows some of the relations
that students found in the linked data that they created using our system.
Fig. 4 Part of an SWD and the result of running RelFinder


Students mentioned that they had no difficulty using the system even though they did not know
what linked data or the Semantic Web was. They commented that our system could be used to
summarize learning materials in the form of linked data so that relevant materials could be
found easily. We believe that much of the available educational linked data [3, 6] can easily
be used as a search space for finding relations.

6 An application in education
In this section, we describe how the proposed system can be used in an educational scenario in
which a student is asked to write an essay about representative figures of romanticism as
homework. To do the homework, the student can use our system to create SWDs and, by running
RelFinder inside our system, find unexpected relations that are not apparent at first glance.
For example, Fig. 4 shows that Nietzsche commented favorably on Richard Wagner in the book The
Birth of Tragedy.
In addition, the student can find that Nietzsche liked Dionysus and that both Richard Wagner
and Dionysus are related to romanticism, as shown in Fig. 5. The figure also shows a reason why
Nietzsche commented favorably on Richard Wagner.
Based on these findings, the student can continue exploring the linked data and find more
meaningful relations to finish the homework.
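To give a feel for what such a relation search does, the following Python sketch looks for a connection between two resources in a small set of triples using breadth-first search. It is not RelFinder's actual algorithm (RelFinder itself queries a SPARQL endpoint [5]); the function name, the predicate names, and the toy triples, which paraphrase the example above, are assumptions made for illustration.

```python
from collections import deque

# Toy triples paraphrasing the example in the text; predicate names are hypothetical.
TRIPLES = [
    ("Nietzsche", "commented_favorably_on", "Richard_Wagner"),
    ("Nietzsche", "liked", "Dionysus"),
    ("Richard_Wagner", "related_to", "Romanticism"),
    ("Dionysus", "related_to", "Romanticism"),
]


def find_relation(triples, start, goal):
    """Return a shortest chain of triples linking start to goal, ignoring direction."""
    neighbours = {}
    for s, p, o in triples:
        # Index every triple under both of its ends so links can be followed either way.
        neighbours.setdefault(s, []).append((o, (s, p, o)))
        neighbours.setdefault(o, []).append((s, (s, p, o)))
    queue = deque([(start, [])])
    seen = {start}
    while queue:
        node, path = queue.popleft()
        if node == goal:
            return path
        for nxt, triple in neighbours.get(node, []):
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, path + [triple]))
    return None  # the two resources are not connected


if __name__ == "__main__":
    for triple in find_relation(TRIPLES, "Dionysus", "Richard_Wagner"):
        print(" ".join(triple))
```

Treating the triples as an undirected graph is a deliberate simplification: it lets the search report that two resources share a common neighbour, which is the kind of indirect relation the scenario relies on.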
Fig. 5 Part of an SWD and a connection between Nietzsche and Richard Wagner


7 Conclusions
In this paper, we presented a system with which non-expert users can easily create multimedia
linked data. The system also allows users to find relations among available linked data, which
makes it applicable to educational scenarios. The amount of available linked data is
ever-increasing, and it has been argued that Semantic Web technologies can be helpful in higher
education [10]. There are already many systems and tools that can be used for linked data, but
using those systems is not always easy for lay users who have no technical background in the
Semantic Web [4]. Experimental results indicate that our system is easy to use and has
potential applications in education. In particular, the system can be used to create and view
different types of multimedia files because it supports an HTML5 preview that can play many
multimedia formats available on the Web. Users can interlink different types of multimedia in
such a way that simple learning materials can be generated and stored in a linked data format,
which can be combined with existing linked data such as DBpedia (http://dbpedia.org).

References
1. Andriole SJ (2010) Business impact of Web 2.0 technologies. Commun ACM 53(12):67–79
2. Bizer C, Heath T, Berners-Lee T (2009) Linked data – the story so far. Int J Semant Web Inf
Syst 5(3):1–22
3. d'Aquin M, Adamou A, Dietze S (2013) Assessing the educational linked data landscape.
Proceedings of the WebSci 43–46
4. Dadzie AS, Rowe M (2011) Approaches to visualizing linked data: a survey. Semant Web
2(2):89–124
5. Heim P, Hellmann S, Lehmann J, Lohmann S, Stegemann T (2009) RelFinder: revealing
relationships in RDF knowledge bases. Proceedings of the 4th International Conference on
Semantic and Digital Media Technologies 182–187
6. Herder E, Dietze S, d'Aquin M (2013) LinkedUp – Linking Web data for adaptive education,
UMAP Extended Proceedings, http://ceur-ws.org/Vol-997/
7. Hodosh M, Young P, Hockenmaier J (2013) Framing image annotation as a ranking task: data,
models and evaluation metrics. J Artif Intell Res 47:853–899
8. Li Y, Wald M, Wills G (2012) Applying linked data in multimedia annotations. Int J Semant
Comput Spec Issue Semant Multimed 6(3):289–313
9. Schandl B, Haslhofer B, Burger T, Langegger A, Halb W (2012) Linked data and multimedia: the
state of affairs. Multimed Tools Appl 59(2):523–556
10. Tiropanis T, Davis H, Millard D, Weal M (2009) Semantic technologies for learning and
teaching in the Web 2.0 era. IEEE Intell Syst 24(6):49–53
11. Vinyals O, Toshev A, Bengio S, Erhan D (2014) Show and tell: a neural image caption
generator. arXiv preprint arXiv:1411.4555
12. Yao BZ, Yang X, Lin L, Lee MW, Zhu SC (2010) I2t: image parsing to text annotation. Proc
IEEE 98(8):1485–1508


Jeongmin Chae received the B.E., M.E., and Ph.D. degrees in computer science from Korea
University, Seoul, South Korea, in 2003, 2005, and 2012, respectively. Since 2012, he has been
with the Gifted Education Center, Korea University, where he is currently a research professor.
His main research interests are text mining and the Semantic Web.

Yoonah Cho received the B.E. degree in computer science education from Korea University, Seoul,
South Korea, in 2008. In 2010, she joined Wolgye High School, Seoul, where she has taught
mathematics as a high school teacher. Her main research interests are mathematics and the
Semantic Web.


Minkyung Lee is an undergraduate student in computer science at Korea University in Seoul,
Korea. Her research interests include Web-based applications and computer science education.

Sangmi Lee is an undergraduate student in computer science at Korea University in Seoul, Korea.
Her research interests include semantic computing and knowledge representation.


Munsuk Choi is an undergraduate student in computer science at Korea University in Seoul,
Korea. Her research interests include the Semantic Web and computer science education.

Seongbin Park is a professor in the Department of Computer Science Education at Korea
University in Seoul, Korea. His research interests include computer science education, adaptive
hypermedia, and the Semantic Web.
