
Application of Skeleton Data Analysis Extracted from the Kinect Camera for Grading Gestures of Vietnamese Martial Arts

Nguyen Tuong Thanh, Nguyen Dang Tuyen, Le Dung, Pham Thanh Cong
Hanoi University of Science and Technology
Vietnam Academy of Science and Technology
Ha Noi, Viet Nam
thanh1277@gmail.com, nguyendangtuyen@quangtrung.edu.vn, ledung1974@gmail.com, cong.phamthanh@hust.edu.vn

Abstract—This paper proposes adding image depth data to the Kinect-based gesture grading method of [1] and applying it to traditional Vietnamese martial arts movements. We have built a database of 36 sample 3D martial arts movements currently being taught in schools in Vietnam, together with software for grading static movements. This is the first step toward self-training and evaluation for traditional Vietnamese martial arts, contributing to the preservation and development of a national heritage. The program promises to become more expansive once online assessment is integrated later. Statistics show that the program initially gives a relatively accurate assessment of martial arts movements; the system works effectively with low computational complexity and strong resistance to interference.

Keywords—skeleton data; gesture grading; limb vectors; self-training; Vietnamese martial arts

I. INTRODUCTION
Recent advances in motion recognition with the Microsoft Kinect have fueled many new ideas in applying motion recognition and virtual reality. Over the years, extensive research has been done on human motion identification using various behavioral patterns, such as gait, ballet performance, or identifying people through facial expressions and voice, with the Kinect camera [2], [3].

Traditional Vietnamese martial arts are a characteristic of the Vietnamese people, formed through the struggle for national liberation. One characteristic is that, so far, there is no uniform consistency between the sects, so there is still no standardized modeling system around the world, unlike other martial arts such as karatedo or taekwondo.

In this paper, we propose to apply image depth data to the human performance scoring system based on skeleton data from the Kinect camera by Linwan Liu [1]. The movements of a traditional Vietnamese martial art are graded by acquiring the skeleton data, comparing it with the sample data set, and scoring according to a deviation comparison formula. The Kinect camera (Figure 1) is a Microsoft product sold at low cost to enable human interaction through gestures, based on two key characteristics: image depth and detection that clings to the characteristics of the human body. The main contributions of this article are given below:

- Implementing image depth data to grade the dynamics of a traditional martial art against a given 3D sample database. Throughout our survey, no author has previously identified traditional martial arts movements with the Kinect camera.
- Building a 3D sample database for grading the traditional martial arts included in the physical education programs of high school curricula since 2016.
- Building a software program in which users can select the grading difficulty level through input parameters.

Fig. 1. The Microsoft Kinect camera.

II. DESCRIPTION OF HUMAN MOVEMENT
Identifying human movements is not a simple matter of object identification [4]. We have to choose a suitable description to identify exactly one particular action: the description must contain all the features of the action so as to differentiate between actions, and it must remain constant when the position or the size of the person changes [5].
A. Position of the joints
The Kinect camera returns the coordinates of 20 tracked joints in real time, forming a skeleton map of the performer [6], [7] (Figure 2). The coordinates of the 20 joints are unique and can completely represent a movement, and the data set is relatively small for the computer. However, descriptions based on raw pixel geometry change with distance, which makes them difficult to use directly.

Fig. 2. Positions of the joints provided by the Kinect camera.

B. Limb Vectors
The Kinect camera acquires and gives us the coordinates of the joints, so we use limb vectors to represent the skeleton image of the performer [8]. A limb vector is defined as the segment between two adjacent joints (Figure 3). As geometric vectors, these descriptors represent a single gesture without shifting or changing scale. The body movement can also be perfectly simulated by the rotation of each vector around its root joint.

Considering the characteristics of human motion and body structure, as suggested by [1], we can reduce redundancy by analyzing the limb vectors in different groups.

Fig. 3. Kinect skeleton data and limb vectors.

1) Head vector: The head vector consists of the segments inside the head, shoulders, and hips (Figure 3, with red endpoints). This vector rarely shows a single, strong movement; the bending of this part of the body is associated with the limbs. Therefore, we remove these vectors from the representative group.

2) Level 1 vectors: Level 1 vectors include the upper arms and thighs (Figure 3, with orange endpoints). They contain a lot of information about movements and gestures, so we place them in the representative group.

3) Level 2 vectors: Level 2 vectors include the lower arms and legs (Figure 3, with blue endpoints). They extend beyond the level 1 vectors and make a strong visual impression, so we also classify them as representative.

4) Hands and feet: The Kinect can track the hand and foot joints (Figure 3, the black points), but this tracking is often unstable, and the movement of the wrists and ankles is usually negligible. So, in this initial phase, we remove the hands and feet from the representative group to ensure the robustness of the description.

In summary, the representative group consists of eight limb vectors (Table I). As we can see, the data size is significantly reduced while the necessary information about the movement is retained well.

TABLE I. LIMB VECTOR GROUP

Number  Limb vector       Number  Limb vector
1       Upper Arm Left    5       Lower Arm Left
2       Upper Arm Right   6       Lower Arm Right
3       Upper Leg Left    7       Lower Leg Left
4       Upper Leg Right   8       Lower Leg Right

C. Rating movements
The method in reference [1] is used for 2D static gestures, but martial arts movements always involve a change in depth as well, so we propose inserting the image depth data. The angle between an actual limb vector and the standard one is calculated with the cosine formula:

\cos\alpha = \frac{x_{st}x_{re} + y_{st}y_{re} + z_{st}z_{re}}{\sqrt{x_{st}^2 + y_{st}^2 + z_{st}^2}\,\sqrt{x_{re}^2 + y_{re}^2 + z_{re}^2}} \quad (1)

where (x_st, y_st, z_st) denotes the coordinates of a limb vector of the standard gesture and (x_re, y_re, z_re) denotes the coordinates of the corresponding limb vector in real time. We define a set of eight angles, AngleDiff = {α_1, α_2, α_3, α_4, α_5, α_6, α_7, α_8} (see Table I), which contains most of the distance information between the real-time action and the standard action [1].
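As an illustration, the following minimal C# sketch (ours, not the authors' code) shows how a limb vector of Table I and an angle of equation (1) could be computed from joint coordinates. The joint positions, names, and array layout are hypothetical.

```csharp
using System;

// A minimal sketch of limb vectors and equation (1). Joint coordinates
// are assumed to arrive as (x, y, z) triples from the Kinect skeleton
// stream; the values in Main are illustrative only.
class AngleDiffSketch
{
    // A limb vector is the difference between two adjacent joint positions.
    static double[] LimbVector(double[] from, double[] to) =>
        new[] { to[0] - from[0], to[1] - from[1], to[2] - from[2] };

    // Equation (1): angle between a standard limb vector and the
    // corresponding real-time limb vector, via the 3D cosine formula.
    static double AngleBetween(double[] st, double[] re)
    {
        double dot = st[0] * re[0] + st[1] * re[1] + st[2] * re[2];
        double normSt = Math.Sqrt(st[0] * st[0] + st[1] * st[1] + st[2] * st[2]);
        double normRe = Math.Sqrt(re[0] * re[0] + re[1] * re[1] + re[2] * re[2]);
        return Math.Acos(dot / (normSt * normRe)); // radians
    }

    static void Main()
    {
        // Hypothetical example: left upper arm, standard vs. real time.
        double[] shoulderSt = { 0.20, 1.40, 2.00 }, elbowSt = { 0.35, 1.15, 2.00 };
        double[] shoulderRe = { 0.21, 1.41, 2.05 }, elbowRe = { 0.38, 1.20, 1.95 };

        double alpha1 = AngleBetween(LimbVector(shoulderSt, elbowSt),
                                     LimbVector(shoulderRe, elbowRe));
        Console.WriteLine($"alpha_1 = {alpha1 * 180.0 / Math.PI:F2} degrees");
        // Repeating this for the eight limb vectors of Table I yields
        // AngleDiff = {alpha_1, ..., alpha_8}.
    }
}
```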
III. GRADING FORMULA
A. The basis of scoring dynamic movements
In each martial arts routine there are moments when the hands and feet stop before switching to the next movement; we therefore use the traverse velocity of the elbow and wrist joints to select the grading points. On the velocity tracking chart of the four elbow and left and right wrist joints, the velocity is near zero at the stopping moments, and there we can accurately mark the grading position (Figure 4); a small sketch of this stop detection follows the figure.
Fig. 4. Two-hand velocity tracking chart.
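The sketch below (ours, not the authors' implementation) illustrates the idea for a single joint: frames whose instantaneous speed stays below a small threshold are treated as candidate grading points. The frame rate, threshold value, and trajectory are assumed.

```csharp
using System;
using System.Collections.Generic;

// A minimal sketch of selecting grading points from joint velocities.
// It assumes per-frame 3D positions of one tracked joint (e.g., a wrist)
// sampled at the Kinect's 30 fps; frames where the speed stays below a
// small threshold are taken as stopping poses to grade.
class StopPointSketch
{
    const double Fps = 30.0;            // Kinect skeleton stream rate
    const double SpeedThreshold = 0.05; // m/s, illustrative value

    // Returns the indices of frames where the joint is nearly at rest.
    static List<int> FindStopFrames(double[][] positions)
    {
        var stops = new List<int>();
        for (int i = 1; i < positions.Length; i++)
        {
            double dx = positions[i][0] - positions[i - 1][0];
            double dy = positions[i][1] - positions[i - 1][1];
            double dz = positions[i][2] - positions[i - 1][2];
            double speed = Math.Sqrt(dx * dx + dy * dy + dz * dz) * Fps;
            if (speed < SpeedThreshold) stops.Add(i);
        }
        return stops;
    }

    static void Main()
    {
        // Hypothetical wrist trajectory: moving, then holding a pose.
        double[][] wrist =
        {
            new[] { 0.00, 1.00, 2.0 }, new[] { 0.05, 1.02, 2.0 },
            new[] { 0.10, 1.04, 2.0 }, new[] { 0.10, 1.04, 2.0 },
            new[] { 0.10, 1.04, 2.0 },
        };
        Console.WriteLine("Stop frames: " + string.Join(", ", FindStopFrames(wrist)));
        // In the full system this test would run on the four elbow and
        // wrist joints together, as in the velocity chart of Fig. 4.
    }
}
```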
B. Grading Formula
Based on what we observed in the experiments, we assign a different weight to each member of the distance array and use the weighted sum to compensate for differences in viewpoint. We use the scoring formula proposed in [1], with the image depth data added:
D=[𝑓1(1 + 2 ) + 𝑓2 (3 + 4 ) + 𝑓3 (5 + 6 ) + 𝑓4 (7 + 8 )] (2)
where 𝛼𝑖 denotes a member of AngleDiff, D denotes the
only distance parameters calculated from the array, 𝑓1 , 𝑓2 , 𝑓3 , 𝑓4 ,
representing the weight values of the upper arm, upper legs,
lower arms, and lower legs.
To calculate the weight values, the system automatically collects from the database the last 10 pairs of gestures whose scores lie within ±15% of the average, and computes the average deviation of each reference group. The weight values are calculated as

f_i = \frac{1}{Avg_i \cdot \sum_{j=1}^{4} \frac{1}{Avg_j}} \quad (3)

where Avg_1, Avg_2, Avg_3, Avg_4 denote the averages of the upper-arm (α_1 + α_2), upper-leg (α_3 + α_4), lower-arm (α_5 + α_6), and lower-leg (α_7 + α_8) deviations. This assigns greater weight to the limbs that are assessed more rigorously; a short sketch of (2) and (3) is given below.
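The following minimal C# sketch (ours, not the authors' code) evaluates the weights of equation (3) and the weighted distance of equation (2); the group averages and deviation sums are illustrative numbers.

```csharp
using System;
using System.Linq;

// A minimal sketch of equations (2) and (3). avg[i] is assumed to hold
// the average deviation of limb group i (upper arms, upper legs, lower
// arms, lower legs) over the last 10 collected gesture pairs.
class WeightedDistanceSketch
{
    static void Main()
    {
        // Pairwise sums of AngleDiff per group: (a1+a2), (a3+a4), (a5+a6), (a7+a8).
        double[] groupSums = { 0.30, 0.22, 0.41, 0.18 }; // hypothetical
        double[] avg = { 0.25, 0.20, 0.35, 0.15 };       // Avg_1..Avg_4, hypothetical

        // Equation (3): f_i = 1 / (Avg_i * sum_j (1 / Avg_j)), so the
        // weights are inversely proportional to Avg_i and sum to 1.
        double invSum = avg.Sum(a => 1.0 / a);
        double[] f = avg.Select(a => 1.0 / (a * invSum)).ToArray();

        // Equation (2): the single distance parameter D as a weighted sum.
        double d = 0.0;
        for (int i = 0; i < 4; i++) d += f[i] * groupSums[i];

        Console.WriteLine("weights: " + string.Join(", ", f.Select(x => x.ToString("F3"))));
        Console.WriteLine($"D = {d:F3}");
    }
}
```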
After obtaining the D value, our goal is to convert D into a percentage score. The task is relatively simple using a linear transformation. We allow users to enter a standard value D_st and its reference score S_st:

Score = f(\alpha_{max}) \cdot \left[ (D_{st} - D)\,\frac{100 - S_{st}}{D_{st}} + S_{st} \right] \quad (4)

where f(α_max) denotes the deflection limit function. D_st can be set manually by the user according to their needs; a smaller D_st indicates a higher classification standard. S_st provides the user an option to control the scores within the desired range.
When judging a gesture, a performer is usually not allowed to deviate too far from the normative gesture. We define a deflection limit function that allows the user to enter a threshold value M to reject gestures with unacceptable deviation:

f(\alpha_{max}) = 1 - \frac{0.4}{M^2}\,\alpha_{max}^2 \quad (5)

where α_max represents the maximum in AngleDiff. The function has several features:

• f(0) = 1: when the real-time gesture is identical to the standard gesture, the function value is 1.
• f′(α_max) < 0 and f″(α_max) < 0: as the maximum deviation angle grows, the value of the function drastically decreases.
• f(M) = 0: when the maximum deviation angle reaches or exceeds the threshold value, the function value falls to zero or below.

With the above features, the score is properly limited when the real-time gesture deviates from the standard gesture; that is, achieving a high score requires smooth execution of all limbs. A sketch combining equations (4) and (5) is given below.
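A minimal C# sketch (ours, not the authors' code) of the final scoring step, using equations (4) and (5) as reconstructed above; the AngleDiff values and distance D are hypothetical, while D_st, S_st, and M take the experimental settings quoted in Section IV.

```csharp
using System;
using System.Linq;

// A minimal sketch of equations (4) and (5) as reconstructed above:
// f(aMax) = 1 - 0.4 * aMax^2 / M^2 and
// Score = f(aMax) * [(Dst - D) * (100 - Sst) / Dst + Sst].
class ScoreSketch
{
    // Equation (5): deflection limit function with user threshold M.
    static double DeflectionLimit(double aMax, double m) =>
        1.0 - 0.4 * aMax * aMax / (m * m);

    // Equation (4): linear mapping of the distance D to a percentage
    // score, scaled down by the deflection limit function.
    static double Score(double d, double aMax, double dst, double sst, double m) =>
        DeflectionLimit(aMax, m) * ((dst - d) * (100.0 - sst) / dst + sst);

    static void Main()
    {
        double[] angleDiff = { 8, 5, 12, 7, 9, 6, 11, 4 }; // degrees, hypothetical
        double aMax = angleDiff.Max();

        // User inputs from the experiments: Dst = 50, Sst = 80, M = 35.
        // The distance D = 30 is a hypothetical weighted sum from (2).
        double score = Score(d: 30.0, aMax: aMax, dst: 50.0, sst: 80.0, m: 35.0);
        Console.WriteLine($"score = {score:F1}");
    }
}
```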
IV. EXPERIMENTAL RESULTS
We collected 36 movements performed by a martial artist who is considered to perform them all at the same level (Figure 5). We set the user inputs D_st = 50, S_st = 80, and M = 35, and graded the 36 gestures using (4).

Fig. 5. Receiving standard data from the master.

Fig. 6. Standard sample collection.
A. Grading result
Fig. 7 shows some of the results: movements in which the performer turns away from the camera do not score as high as movements facing it, and for heavily occluded movements the precision, as well as the score, decreases considerably.

Fig. 7. Statistics of the average scores of the 36 traditional martial arts movements.

Fig. 9. Final score results.

B. Implementation of the continuous scoring system
The dynamic grading system was developed in Visual Studio 2013 using C#. Users are required to enter parameters in the edit box and select a standard motion data set from the library before starting. Users then perform a calibration step before performing the movements according to the library set. The standard movements are displayed in turn on the screen as static images of the skeleton (Figure 8, at right).

The practitioner continuously performs his movements while the camera records them and compares them against the patterns displayed on the screen; if the practitioner forgets a movement, he can look at the displayed pattern to perform it properly. After the computer receives the performer's data from the camera, it compares and scores immediately, then shows the image of the next movement, continuing to the end of the routine. At the end, the program gives the performer's total score.

Fig. 8. Interface of the scoring program.

V. CONCLUSION
We have inserted image depth data into the static motion scoring formula of [1] to grade dynamic movements from the skeleton data obtained from the Kinect camera, and developed performance evaluation software through which traditional Vietnamese martial arts can be self-assessed. From there, the learner can practice the original martial arts and self-assess through the Kinect camera, which Microsoft developed for a variety of application programs and sells at low cost, contributing to the preservation and propagation of traditional Vietnamese martial arts.

The article has built a 3D sample database for grading the traditional martial arts included in the physical education programs of high school curricula since 2016. This opens many possibilities for self-training, and assessment can be performed automatically over the network with only a Kinect camera.

However, there are still many ways to refine this work further, such as correcting the grading under heavier occlusion. The next task is to increase the speed of acquiring and handling faster movements to meet the needs of actual martial arts, which is a more demanding requirement for development.

REFERENCES
[1] L. Liu, X. Wu, and L. Wu, "Static human gesture grading based on Kinect," in Proc. 5th International Congress on Image and Signal Processing, 2012.
[2] Z. Marquardt, S. Kox, I. Paiva, and J. Beira, "Super Mirror: A Kinect interface for ballet dancers," in CHI Extended Abstracts, Austin, Texas, USA, May 5-10, 2012.
[3] M. Raptis, D. Kirovski, and H. Hoppe, "Real-time classification of dance gestures from skeleton animation," in Proc. ACM SIGGRAPH/Eurographics Symposium on Computer Animation, 2011.
[4] D. Kim and J. Paik, "Gait recognition using active shape model and motion prediction," IET Computer Vision, vol. 4, pp. 25-36, 2010.
[5] J. Lazar, J. H. Feng, and H. Hochheiser, Research Methods in Human-Computer Interaction, John Wiley & Sons, 2010.
[6] J. Shotton, A. Fitzgibbon, M. Cook, T. Sharp, M. Finocchio, R. Moore, A. Kipman, and A. Blake, "Real-time human pose recognition in parts from single depth images," in Proc. IEEE Conference on Computer Vision and Pattern Recognition, 2011.
[7] Kinect Natural User Interface Overview, Microsoft Research, 2011.
[8] M. Raptis, D. Kirovski, and H. Hoppe, "Real-time classification of dance gestures from skeleton animation," in Proc. 2011 ACM SIGGRAPH/Eurographics Symposium on Computer Animation, pp. 147-156, ACM, New York, 2011.
