The implementation of SAX and random projection for motif discovery on the orbital elements and the resonance argument of asteroids

Document Type : Research Paper

Authors

1 Department of Computer Science Education, Faculty of Mathematics and Natural Science Education, Universitas Pendidikan Indonesia, Bandung, Indonesia.

2 Department of Physics Education, Faculty of Mathematics and Natural Science Education, Universitas Pendidikan Indonesia, Bandung, Indonesia.

3 Astronomy Research Division, Faculty of Mathematics and Natural Science, Institut Teknologi Bandung, Bandung, Indonesia.

4 Faculty of Computer and Mathematical Sciences, Universiti Teknologi MARA, Cawangan Melaka Kampus Jasin, Melaka, Malaysia.

Abstract

Motif discovery has emerged as one of the most useful techniques for processing time-series data. One application is the study of 1:1 mean motion resonance (MMR) in astronomy. This study aims to build and implement a computational model that processes time-series data and predicts 1:1 MMR from asteroid orbital elements given as time series. The model uses the Symbolic Aggregate approXimation (SAX) and Random Projection (RP) algorithms, implemented in the Python programming language. Experiments on the orbital-element data of ten asteroids were carried out to validate the program. From the results, we conclude that the model can predict where a motif is located and with which planet the 1:1 resonance occurs.
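To make the SAX and RP combination concrete, the following is a minimal sketch in Python of how the two steps can fit together. It is not the authors' program. The function names, the parameters (`window_len`, `word_len`, `mask_size`, `iterations`, `min_gap`), and the toy series are assumptions made for illustration, and the toy series merely stands in for an orbital-element time series.

```python
import random
from collections import defaultdict
from itertools import combinations
from statistics import NormalDist, mean, pstdev


def sax_word(window, word_len, alphabet="abcd"):
    """Convert one numeric window to a SAX word (z-normalise, PAA, discretise)."""
    mu, sigma = mean(window), pstdev(window)
    # A flat window has no shape; map it to zeros to avoid dividing by zero.
    z = [(x - mu) / sigma if sigma > 0 else 0.0 for x in window]
    # PAA: average over equal-size segments (assumes len(window) % word_len == 0).
    seg = len(z) // word_len
    paa = [mean(z[i * seg:(i + 1) * seg]) for i in range(word_len)]
    # Breakpoints split the standard normal into equiprobable regions.
    k = len(alphabet)
    cuts = [NormalDist().inv_cdf(j / k) for j in range(1, k)]
    return "".join(alphabet[sum(v > c for c in cuts)] for v in paa)


def sax_words(series, window_len, word_len, alphabet="abcd"):
    """SAX word for every sliding window of the series."""
    return [sax_word(series[i:i + window_len], word_len, alphabet)
            for i in range(len(series) - window_len + 1)]


def random_projection_collisions(words, mask_size, iterations, min_gap, seed=0):
    """Count how often each pair of windows falls into the same bucket.

    Each iteration keeps `mask_size` randomly chosen symbol positions and
    groups the words by the symbols at those positions. Pairs closer than
    `min_gap` are skipped so that overlapping windows are not counted.
    """
    rng = random.Random(seed)
    word_len = len(words[0])
    counts = defaultdict(int)
    for _ in range(iterations):
        cols = sorted(rng.sample(range(word_len), mask_size))
        buckets = defaultdict(list)
        for idx, word in enumerate(words):
            buckets["".join(word[c] for c in cols)].append(idx)
        for members in buckets.values():
            for i, j in combinations(members, 2):
                if j - i >= min_gap:
                    counts[(i, j)] += 1
    return counts


# Toy data (hypothetical): random background with one pattern planted twice.
pattern = [0.0, 1.0, 3.0, 7.0, 3.0, 1.0, 0.0, 0.0]
noise = random.Random(1)
series = [noise.random() for _ in range(40)]
series[4:12] = pattern
series[24:32] = pattern

words = sax_words(series, window_len=8, word_len=4)
counts = random_projection_collisions(words, mask_size=2, iterations=10, min_gap=8)
# Pairs with the highest collision counts are the motif candidates.
candidates = sorted(counts.items(), key=lambda kv: kv[1], reverse=True)[:5]
```

In this sketch the two planted windows (start indices 4 and 24) produce identical SAX words, so they collide in every iteration. Pairs with high collision counts are only candidates; a full pipeline would normally confirm them with a distance computed on the raw values.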

Keywords

Volume 12, Special Issue
December 2021
Pages 959-970
  • Receive Date: 06 June 2021
  • Revise Date: 24 August 2021
  • Accept Date: 08 September 2021